Introduction to PrestoDB | PrestoDB Tutorials

Introduction to PrestoDB

Presto is a distributed system that runs on a cluster of machines. A full installation includes a coordinator and multiple workers. Queries are submitted from a client such as the Presto CLI to the coordinator. The coordinator parses, analyzes and plans the query execution, then distributes the processing to the workers.
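
As a rough illustration of the client-to-coordinator step described above, the sketch below builds (but does not send) the HTTP request a client could use to submit a query. It assumes the coordinator exposes Presto's REST client protocol at `/v1/statement`. The coordinator address `http://localhost:8080`, the user name `demo` and the helper name `build_statement_request` are hypothetical and chosen only for illustration.

```python
import urllib.request


def build_statement_request(coordinator_url, sql, user):
    """Build the POST request that submits a query to the coordinator.

    Assumes Presto's REST client protocol: the SQL text is the request
    body and the submitting user is named in the X-Presto-User header.
    """
    return urllib.request.Request(
        url=coordinator_url.rstrip("/") + "/v1/statement",
        data=sql.encode("utf-8"),
        headers={"X-Presto-User": user},
        method="POST",
    )


# Hypothetical coordinator address and user; nothing is sent here.
request = build_statement_request("http://localhost:8080", "SELECT 1", "demo")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually submit the query. A real client would then poll the URI returned in the JSON response until the workers have produced all results, which is what the Presto CLI does on your behalf.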

Requirements

Presto has a few basic requirements:

Linux or Mac OS X
Java 8, 64-bit
Python 2.4+
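
The operating system and Python requirements above can be checked with a small pre-flight script. This is only a sketch: the helper name `check_host` is hypothetical, and it does not verify the Java requirement, which should be confirmed separately (for example with `java -version`).

```python
import platform
import sys


def check_host(system, python_version):
    """Return a list of unmet requirements (empty if the host qualifies)."""
    problems = []
    # Mac OS X reports itself as "Darwin" through platform.system().
    if system not in ("Linux", "Darwin"):
        problems.append("unsupported operating system: " + system)
    # Compare only the major and minor version numbers.
    if tuple(python_version[:2]) < (2, 4):
        problems.append("Python 2.4 or newer is required")
    return problems


# Inspect the machine running this script.
print(check_host(platform.system(), sys.version_info))
```

An empty list means the host passes both checks.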

Connectors

Presto supports pluggable connectors that provide data for queries. The requirements vary by connector.

HADOOP / HIVE

Presto supports reading Hive data from the following versions of Hadoop:

Apache Hadoop 1.x
Apache Hadoop 2.x
Cloudera CDH 4
Cloudera CDH 5

The following file formats are supported: Text, SequenceFile, RCFile, ORC and Parquet.

Additionally, a remote Hive metastore is required. Local or embedded mode is not supported. Presto does not use MapReduce and thus only requires HDFS.
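
To make the remote metastore requirement concrete, the sketch below generates the contents of a Hive catalog properties file. It assumes the connector names used by Presto releases that shipped a separate Hive connector per Hadoop version (`hive-hadoop1`, `hive-hadoop2`, `hive-cdh4`, `hive-cdh5`) and the `hive.metastore.uri` property. Check both against the documentation for your Presto version. The helper name `hive_catalog_properties` and the metastore address are hypothetical.

```python
# Connector names are an assumption based on Presto releases of this era.
HIVE_CONNECTORS = {
    "Apache Hadoop 1.x": "hive-hadoop1",
    "Apache Hadoop 2.x": "hive-hadoop2",
    "Cloudera CDH 4": "hive-cdh4",
    "Cloudera CDH 5": "hive-cdh5",
}


def hive_catalog_properties(distribution, metastore_uri):
    """Render catalog properties for a Hive connector.

    Only a remote metastore is accepted, so the URI must use the
    Thrift scheme rather than point at a local or embedded metastore.
    """
    if distribution not in HIVE_CONNECTORS:
        raise ValueError("unsupported Hadoop distribution: " + distribution)
    if not metastore_uri.startswith("thrift://"):
        raise ValueError("a remote metastore URI (thrift://host:port) is required")
    return "connector.name=%s\nhive.metastore.uri=%s\n" % (
        HIVE_CONNECTORS[distribution],
        metastore_uri,
    )


# Hypothetical metastore host and port.
print(hive_catalog_properties("Apache Hadoop 2.x", "thrift://example.net:9083"))
```

The generated text would typically be saved as a file such as `etc/catalog/hive.properties` on each Presto node.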

CASSANDRA

Cassandra 2.x is required. This connector is completely independent of the Hive connector and only requires an existing Cassandra installation.

About Mariano

I'm Ethan Mariano, a software engineer by profession and a reader and writer by passion. I have a good understanding of AngularJS, databases, JavaScript, web development and digital marketing, and I enjoy exploring other technologies related to software development.
