SPARK TRAINING

Training

Online

£ 500.08 VAT inc.

*Indicative price

Original amount in INR:

₹ 52,714

Description

  • Type

    Training

  • Level

    Intermediate

  • Methodology

    Online

  • Duration

    1 Month

  • Start date

    Different dates available

  • Online campus

    Yes

  • Delivery of study materials

    Yes

  • Support service

    Yes

  • Virtual classes

    Yes

Learn the fundamentals of Spark, the technology that is revolutionizing the analytics and big data world!

Spark is an open-source processing engine built around speed, ease of use, and analytics. If you have large amounts of data that require low-latency processing that a typical MapReduce program cannot provide, Spark is the way to go.

Facilities

  • Location

    Online

  • Start date

    Different dates available (enrolment now open)

About this course

Learn how it performs at speeds up to 100 times faster than MapReduce for iterative algorithms or interactive data mining.
Learn how it provides in-memory cluster computing for lightning-fast speed and supports Java, Python, R, and Scala APIs for ease of development.
Learn how it can handle a wide range of data processing scenarios by combining SQL, streaming and complex analytics together seamlessly in the same application.
Learn how it runs on top of Hadoop, Mesos, standalone, or in the cloud. It can access diverse data sources such as HDFS, Cassandra, HBase, or S3.

Subjects

  • Components
  • Spark
  • Purpose
  • RDD
  • Distributed Dataset
  • Installing
  • Spark standalone
  • Python
  • Python shell
  • Dataframes
  • Parallelized
  • Datasets

Course programme

COURSE DETAILS & CURRICULUM

Module 1 - Introduction to Spark - Getting started
  1. What is Spark and what is its purpose?
  2. Components of the Spark unified stack
  3. Resilient Distributed Dataset (RDD)
  4. Downloading and installing Spark standalone
  5. Scala and Python overview
  6. Launching and using Spark’s Scala and Python shells
Module 2 - Resilient Distributed Dataset and DataFrames
  1. Understand how to create parallelized collections and external datasets
  2. Work with Resilient Distributed Dataset (RDD) operations
  3. Utilize shared variables and key-value pairs
Module 3 - Spark application programming
  1. Understand the purpose and usage of the SparkContext
  2. Initialize Spark with the various programming languages
  3. Describe and run some Spark examples
  4. Pass functions to Spark
  5. Create and run a Spark standalone application
  6. Submit applications to the cluster
Module 4 - Introduction to Spark libraries
  1. Understand and use the various Spark libraries
Module 5 - Spark configuration, monitoring and tuning
  1. Understand components of the Spark cluster
  2. Configure Spark by modifying Spark properties, environment variables, or logging properties
  3. Monitor Spark using the web UIs, metrics, and external instrumentation
  4. Understand performance tuning considerations
