Postgraduate Diploma In Applied Data Science
Description
- Type: Postgraduate
- Methodology: Online
- Duration: 9 Months
- Start date: December
Managing a business or organization in today’s world is more science than art. At the core of that science is data, and the ability to unleash its power and extract value from it is critical to any company’s success. Data science and machine learning have transformed entire industries and continue to do so. The data revolution has led to a spike in the demand for data scientists and machine learning practitioners that shows no signs of slowing down.
The Postgraduate Diploma in Applied Data Science is designed to help participants master data science, from the critical foundations of statistics and probability to working hands-on with machine learning models using Python, one of the world's most popular programming languages. Analytical models are more powerful when they are built on sound statistics, and this comprehensive diploma can help you learn the key statistics and probability concepts needed to build effective models, enhance your data interpretation skills, and make well-informed decisions.
About this course
The diploma requires undergraduate-level knowledge of statistics (descriptive statistics, regression, sampling distributions, hypothesis testing, interval estimation, etc.), linear algebra, and probability. You are also required to know programming concepts such as variables, loops, functions, and object-oriented programming (OOP).
Some hands-on experience with the Python language and the Jupyter Notebook environment is necessary. All assignments and application projects will be done in Jupyter Notebooks using the Python programming language. Emeritus offers a complimentary Python for Data Analytics certificate course to meet
Subjects
- SQL
- Data analysis
- Statistics
- Probability
- Lists
- Functions
- Data Extraction
- Dictionaries
- Conditional Statements
- Assignment operations
- Mutability
Course programme
Course 1: Tools and Data Management
Python Basics - How to Translate Procedures into Code
Python data types (basic and Boolean), conditional statements, functions, assignment operations
Intermediate Python - Data Structures for Analysis
Lists, dictionaries, mutability, and iterations with examples on data structures
Relational Databases - Where Big Data Is Typically Stored
Basics of databases and normalization
SQL - Ubiquitous Database Format/Language
Using SQL with Python, SQL workbench, working with multiple tables
Data Extraction - Getting Data from the Internet - Part 1
Extracting data from the web using JSON, Google API, and XML
Data Extraction - Getting Data from the Internet - Part 2
Using the Beautiful Soup library to extract data, with the Epicurious example
Course 2: Statistics and Exploratory Data Analysis
Statistical Distributions - The Shape of Data
Types of distributions: Normal (examples), Poisson, Geometric, Exponential, Lognormal, and Bernoulli
Sampling - When You Can't or Won't Have ALL the Data
Size and sampling techniques, central limit theorem and motivation, sample size distribution (fixed sample size), polling techniques (given sample size, given target accuracy), estimating proportions
Hypothesis Testing - Answering Questions About Your Data
Calculating and interpreting confidence levels, t-tables and t-multipliers, determining p-values, and an A/B testing example
Data Analysis and Visualization - Using Python's NumPy for Analysis
Introduction to using NumPy and Pandas for data visualization, Pandas datareader, time-series analysis, risk return analysis, regression
Data Analysis and Visualization - Using Python's Pandas for Data Wrangling
Data cleaning and data visualization using Pandas, using the groupby function to organize data
Course 3: Fundamentals of Machine Learning
Machine Learning - Basic Regression and Classification
Machine learning using the wines dataset and the rocks and mines dataset, classification metrics, and applying classification metrics to the rocks and mines dataset
Linear Regression
Introduction to linear regression, using dummy variables in regression, measuring outputs of regression, making predictions with regression, collinearity, overfitting and how to prevent it
Logistic Regression
Introduction to regression, classification problems, building a logistic regression model, and practice
Machine Learning - Decision Trees and Clustering
Understanding decision trees – example and visualization, regression trees (using the wines dataset), classification trees (rocks and mines dataset)
Ensemble Methods
Decision trees, bagging and boosting concepts, feature importance, and hyperparameter tuning
Naïve Bayes Classifiers
Discrete and conditional probabilities, Bayes' theorem, spam filtering, and practice
Neural Networks
Neural networks in Keras, the perceptron, real-life examples: movie review classification and predicting housing prices
K-means Clustering
Unsupervised models, k-means clustering models and examples, Gaussian mixtures and examples
Dimensionality Reduction
Data projections, dimensionality reduction (DR), other DR techniques, principal component analysis
Text Mining - Automatic Understanding of Text
Text mining techniques: sentiment analysis, complexity analysis, named entity analysis, text summarization, and topic modelling
Time Series Analysis
Datetime and introduction to time series, exploring time series, descriptive statistics, partial autocorrelation, autoregressive models, the ARIMA model
Capstone Project
Acting in the role of a consultant, test the efficacy of an office supply company’s telemarketing campaigns for a select audience and help the company use the test results to its advantage.