Introduction to Spark - University of California

edX

Course

Online

Free

Description

  • Type

    Course

  • Methodology

    Online

  • Duration

    2 Weeks

  • Start date

    Different dates available

Learn the fundamentals and architecture of Spark, the leading cluster-computing framework among professionals. With this course you earn while you learn: you gain recognized qualifications and job-specific skills and knowledge that help you stand out in the job market.

Facilities

  • Location

    Online

  • Start date

    Different dates available. Enrolment now open.

About this course

Programming background and experience with Python required. All exercises will use PySpark (part of Apache Spark). Previous experience with Spark NOT required.

Reviews

This centre's achievements

2017

All courses are up to date

The average rating is higher than 3.7

More than 50 reviews in the last 12 months

This centre has featured on Emagister for 8 years

Subjects

  • Data analysis
  • Statistics
  • Spark
  • Computing Framework
  • Computer

Course programme

Spark is rapidly becoming the compute engine of choice for big data. Spark programs are more concise and often run 10 to 100 times faster than Hadoop MapReduce jobs. As companies realize this, Spark developers are becoming increasingly valued.

This statistics and data analysis course teaches the basics of working with Spark and provides the foundation needed for diving deeper into it. You’ll learn about Spark’s architecture and programming model, including commonly used APIs. After completing the course, you’ll be able to write and debug basic Spark applications. The course also explains how to use Spark’s web user interface (UI), how to recognize common coding errors, and how to proactively prevent them. The focus is on Spark Core and Spark SQL.

This course covers advanced undergraduate-level material. It requires a programming background and experience with Python (or the ability to learn it quickly). All exercises use PySpark (the Python API for Spark), but previous experience with Spark or distributed computing is NOT required. Students should take the Python mini-quiz before the course, and the Python mini-course if they need to learn Python or refresh their Python knowledge.
