Apache Drill for On-the-Fly Analysis of Multiple Big Data Formats Training Course
Type
Course
Location
City of London
Description
Apache Drill is a schema-free, distributed, in-memory columnar SQL query engine for Hadoop, NoSQL and other cloud and file storage systems. Its main strength is the ability to join data from multiple data stores in a single query. Apache Drill supports numerous NoSQL databases and file systems, including HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS and local files. Apache Drill is an open-source project inspired by Google's Dremel system, which Google offers as an infrastructure service called Google BigQuery.
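As a rough illustration of joining two data stores in one query, the sketch below builds a single SQL statement and wraps it in a request for Drill's REST endpoint (`/query.json`). The storage plugin names (`mongo`, `s3`), the database, collection, file path and column names are hypothetical, and the URL assumes a Drillbit listening on the default web port on the local machine.

```python
import json
import urllib.request

# Assumed address of a local Drillbit's REST endpoint; adjust for your cluster.
DRILL_URL = "http://localhost:8047/query.json"

# Hypothetical example: a MongoDB collection joined with a Parquet file on S3.
# Plugin names, paths and columns are made up for illustration only.
JOIN_SQL = """
SELECT c.name, SUM(o.amount) AS total_spent
FROM mongo.shop.`customers` AS c
JOIN s3.`orders/2023.parquet` AS o
  ON c.customer_id = o.customer_id
GROUP BY c.name
ORDER BY total_spent DESC
LIMIT 10
"""


def build_request(sql, url=DRILL_URL):
    """Wrap a SQL string in the JSON body expected by Drill's REST API."""
    payload = json.dumps({"queryType": "SQL", "query": sql.strip()})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=DRILL_URL):
    """Send the query to a running Drillbit and return the result rows."""
    with urllib.request.urlopen(build_request(sql, url)) as response:
        return json.load(response)["rows"]
```

Calling `run_query(JOIN_SQL)` requires a running Drill instance with both storage plugins configured; `build_request` can be inspected without one.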
In this instructor-led, live training, participants will learn the fundamentals of Apache Drill, then leverage the power and convenience of SQL to interactively query big data across multiple data sources, without writing code. Participants will also learn how to optimize their Drill queries for distributed SQL execution.
By the end of this training, participants will be able to:
Perform "self-service" exploration on structured and semi-structured data on Hadoop
Query data with known as well as unknown schemas using SQL
Understand how Apache Drill receives and executes queries
Write SQL queries to analyze different types of data, including structured data in Hive, semi-structured data in HBase or MapR-DB tables, and data saved in files such as Parquet and JSON
Use Apache Drill to perform on-the-fly schema discovery, bypassing the need for complex ETL and schema operations
Integrate Apache Drill with BI (Business Intelligence) tools such as Tableau, QlikView, MicroStrategy and Excel
Audience
Data analysts
Data scientists
SQL programmers
Format of the course
Part lecture, part discussion, exercises and heavy hands-on practice
Subjects
- MS Excel
- Systems
- SQL
- Apache
- Excel
Course programme
Introduction to Apache Drill
How does Apache Drill compare to Spark SQL, Hive and Impala?
Overview of Apache Drill Features and Architecture
Apache Drill Components
Performing SQL Queries in Apache Drill
Understanding Data Types and Formats
Working with Schemas
Case Study and Exercise: Querying Sales Data for the Year
Performing Queries on JSON Data
Combining Data Types in SQL Queries
Creating and Dropping Tables and Views
Using Nested Data and Window Functions
Performing Data Analysis with Apache Drill
Case Study and Exercise: Analyzing the Results of a Marketing Campaign
Designing a Query Plan in Apache Drill
Optimizing Queries in Apache Drill
Integrating Apache Drill with MS Excel
Using Apache Drill ODBC/JDBC drivers to plug into Tableau, MicroStrategy, QlikView, etc.
Case Study and Exercise: Visualizing the Data and the Power of a Good Story
Understanding Apache Drill's Decentralized Security Model
Apache Drill Performance and Debugging
Summary and Conclusion