Intro to Data Analysis
Type: Course
Methodology: Online
Start date: Different dates available
Description
Explore a variety of datasets, posing and answering your own questions about each. You'll be using the Python libraries NumPy, Pandas, and Matplotlib.
Subjects
- Data analysis
- NumPy
- Pandas
- Analysis Process
- Dataset
Course programme
Approx. 6 weeks, assuming 6 hours per week (work at your own pace)
Course Summary
This course will introduce you to the world of data analysis. You'll learn how to go through the entire data analysis process, which includes:
- Posing a question
- Wrangling your data into a format you can use and fixing any problems with it
- Exploring the data, finding patterns in it, and building your intuition about it
- Drawing conclusions and/or making predictions
- Communicating your findings
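As a rough illustration of these five steps, here is a minimal sketch using Pandas. The question, the records, and the column names are all invented for this example and are not taken from the course materials.

```python
import pandas as pd

# 1. Pose a question (hypothetical): do students who study more hours
#    complete more lessons?

# 2. Wrangle: load the raw records and fix problems with them. Here the
#    problems are a missing value and numbers stored as strings.
#    The data below is made up for this sketch.
raw = pd.DataFrame({
    "student": ["a", "b", "c", "d"],
    "hours": ["2", "5", None, "8"],
    "lessons_completed": [1, 3, 2, 6],
})
clean = raw.dropna(subset=["hours"]).copy()
clean["hours"] = pd.to_numeric(clean["hours"])

# 3. Explore: look at summary statistics to build intuition.
summary = clean[["hours", "lessons_completed"]].describe()

# 4. Draw a (tentative) conclusion: measure how the two columns move together.
corr = clean["hours"].corr(clean["lessons_completed"])

# 5. Communicate: report the finding in plain language.
print(f"Rows kept after cleaning: {len(clean)}")
print(f"Correlation between hours and lessons completed: {corr:.2f}")
```

With only three usable rows, the correlation here says very little; a real analysis would also state the limits of the data.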
You'll also learn how to use the Python libraries NumPy, Pandas, and Matplotlib to write code that's cleaner, more concise, and faster.
This course is part of the Data Analyst Nanodegree.
Why Take This Course?
This course is a good first step towards understanding the data analysis process as a whole. Before delving into each individual phase, it is important to learn how the phases differ and how they relate to each other. After taking this course, you will be better positioned to succeed in other courses in the Data Analyst Nanodegree program. For example, a student who started with Data Analysis with R, which covers the exploratory data analysis phase, might not yet understand the difference between data exploration and data wrangling. By taking this course first, you will learn what each phase accomplishes and how it fits into the larger process.
This course also covers the Python libraries NumPy, Pandas, and Matplotlib, which are indispensable tools for doing data analysis in Python. Their many convenient functions and high performance make writing data analysis code a lot easier!
Prerequisites and Requirements
To take this course, you need to be comfortable programming in Python.
- You should be familiar with if statements, loops, functions, lists, sets, and dictionaries. To learn about any of these topics, take the course Intro to Computer Science.
- You should also be familiar with classes, objects, and modules. To learn about these topics, take the course Programming Foundations with Python.
See the Technology Requirements for using Udacity.
What Will I Learn?
Projects
P2: Investigate a Dataset
Choose one of Udacity's curated datasets and investigate it using NumPy and Pandas. Go through the entire data analysis process, starting by posing a question and finishing by sharing your findings.
Syllabus
Lesson 1: Data Analysis Process
In this lesson, you will learn about the data analysis process, which includes posing a question, wrangling and exploring your data, drawing conclusions and/or making predictions, and communicating your findings. You will complete an analysis of Udacity student data using pure Python, with minimal reliance on additional libraries.
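To give a flavor of what a pure Python analysis can look like, the sketch below reads a few CSV rows with the standard library and totals a value per account. The records and field names are hypothetical stand-ins, not the actual Udacity student data.

```python
import csv
import io
from collections import defaultdict

# Invented sample standing in for a CSV file on disk.
csv_text = """account_key,minutes_visited
1,30.5
1,12.0
2,45.0
2,
3,10.0
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Wrangle: csv gives every field as a string, so convert to float
# and skip blank entries, which are treated as missing.
minutes_by_account = defaultdict(list)
for row in rows:
    value = row["minutes_visited"].strip()
    if value:
        minutes_by_account[row["account_key"]].append(float(value))

# Explore: total minutes per account, then the average of those totals.
totals = {key: sum(vals) for key, vals in minutes_by_account.items()}
average_total = sum(totals.values()) / len(totals)

print(totals)
print(average_total)
```

Doing the grouping and type conversion by hand like this is the kind of work that NumPy and Pandas shorten considerably.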
Lesson 2: NumPy and Pandas for 1D Data
In this lesson, you will start learning to use NumPy and Pandas to make the data analysis process easier. This lesson focuses on features that apply to one-dimensional data. You'll learn to use NumPy arrays, Pandas Series, and vectorized operations.
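For readers unfamiliar with the term, a vectorized operation applies to a whole array at once instead of looping over elements. The following is a small sketch with invented values; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Invented values for illustration.
prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([1, 2, 3])

# Vectorized arithmetic: element-wise multiplication with no explicit loop.
revenue = prices * quantities

# The equivalent pure Python loop, shown for comparison.
revenue_loop = [p * q for p, q in zip(prices, quantities)]

# Boolean indexing: keep only the prices above the mean price.
above_avg = prices[prices > prices.mean()]

# A Pandas Series is a one-dimensional array with an index of labels.
s = pd.Series(revenue, index=["apples", "pears", "plums"])
best = s.idxmax()      # label of the largest value
share = s / s.sum()    # vectorized division; index labels are preserved

print(revenue, above_avg, best)
print(share)
```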
Lesson 3: NumPy and Pandas for 2D Data
In this lesson, you'll continue learning about NumPy and Pandas, this time focusing on two-dimensional data. You'll learn to use two-dimensional NumPy arrays and Pandas DataFrames. You'll also learn to group your data and to combine data from multiple files.
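A brief sketch of these two-dimensional ideas follows. The tables are invented, and two in-memory DataFrames stand in for data that would normally be read from separate files, for example with pd.read_csv.

```python
import numpy as np
import pandas as pd

# A two-dimensional NumPy array: 2 rows, 3 columns.
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
col_means = grid.mean(axis=0)   # one mean per column
row_sums = grid.sum(axis=1)     # one sum per row

# Two invented tables standing in for two separate files.
rides = pd.DataFrame({
    "station": ["A", "A", "B"],
    "riders": [100, 150, 80],
})
stations = pd.DataFrame({
    "station": ["A", "B"],
    "city": ["Springfield", "Shelbyville"],
})

# Group: total riders per station.
riders_per_station = rides.groupby("station")["riders"].sum()

# Combine: attach each ride's city by matching on the shared column.
combined = rides.merge(stations, on="station", how="left")

print(col_means, row_sums)
print(riders_per_station)
print(combined)
```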
Final Project: Investigate a Dataset
In the project, you will use NumPy and Pandas to go through the data analysis process on a dataset chosen from a list of recommended datasets.