Apache Arrow for Data Analysis across Disparate Data Sources Training Course

Course

In City of London

Price on request

Description

  • Type

    Course

  • Location

    City of London

Apache Arrow is an open-source, in-memory data processing framework. It is often used together with other data science tools to access disparate data stores for analysis. It integrates well with other technologies such as GPU databases, machine learning libraries and tools, execution engines, and data visualization frameworks.

In this onsite, instructor-led, live training, participants will learn how to integrate Apache Arrow with various data science frameworks to access data from disparate data sources.

By the end of this training, participants will be able to:

  • Install and configure Apache Arrow in a distributed clustered environment
  • Use Apache Arrow to access data from disparate data sources
  • Use Apache Arrow to bypass the need for constructing and maintaining complex ETL pipelines
  • Analyze data across disparate data sources without having to consolidate it into a centralized repository

Audience

  • Data scientists
  • Data engineers

Format of the Course

  • Part lecture, part discussion, exercises and heavy hands-on practice

Note

  • To request customized training for this course, please contact us to arrange it.

Facilities

  • Location

    City of London (London)
    Token House, 11-12 Tokenhouse Yard, EC2R 7AS

  • Start date

    On request

Subjects

  • Data analysis
  • Apache
  • Access

Course programme

Introduction

  • Apache Arrow vs Parquet

Installing and Configuring Apache Arrow

Overview of Apache Arrow Features and Architecture

Exploring Data with Pandas and Apache Arrow

Exploring Data with Spark and Apache Arrow

Exploring Data with R and Apache Arrow

Exploring Data with MapD and Apache Arrow

Other Data Analysis Integrations

  • PySpark, Parquet files on S3, Oracle tables, and Elasticsearch indices

Troubleshooting

Summary and Conclusion
