A Practical Introduction to Data Analysis and Big Data Training Course

Course

In City Of London

Price on request

Description

  • Type

    Course

  • Location

    City of london

Participants who complete this training will gain a practical, real-world understanding of Big Data and its related technologies, methodologies and tools.
Participants will have the opportunity to put this knowledge into practice through hands-on exercises. Group interaction and instructor feedback make up an important component of the class.
The course starts with an introduction to elemental concepts of Big Data, then progresses into the programming languages and methodologies used to perform Data Analysis. Finally, we discuss the tools and infrastructure that enable Big Data storage, Distributed Processing, and Scalability.
Audience
Developers / programmers
IT consultants
Format of the course
Part lecture, part discussion, hands-on practice and implementation, occasional quizing to measure progress.

Facilities

Location

Start date

City Of London (London)
See map
Token House, 11-12 Tokenhouse Yard, EC2R 7AS

Start date

On request

Questions & Answers

Add your question

Our advisors and other users will be able to reply to you

Who would you like to address this question to?

Fill in your details to get a reply

We will only publish your name and question

Reviews

Subjects

  • Data analysis

Course programme

Introduction to Data Analysis and Big Data

  • What makes Big Data "big"?
    • Velocity, Volume, Variety, Veracity (VVVV)
  • Limits to traditional Data Processing
  • Distributed Processing
  • Statistical Analysis
  • Types of Machine Learning Analysis
  • Data Visualization

Big Data Roles and Responsibilities

  • Administrators
  • Developers
  • Data Analysts

Languages used for Data Analysis

  • R language
    • Why R for Data Analysis?
    • Data manipulation, calculation and graphical display
  • Python
    • Why Python for Data Analysis?
    • Manipulating, processing, cleaning, and crunching data

Approaches to Data Analysis

  • Statistical Analysis
    • Time Series analysis
    • Forecasting with Correlation and Regression models
    • Inferential Statistics (estimating)
    • Descriptive Statistics in Big Data sets (e.g. calculating mean)
  • Machine Learning
    • Supervised vs unsupervised learning
    • Classification and clustering
    • Estimating cost of specific methods
    • Filtering
  • Natural Language Processing
    • Processing text
    • Understaing meaning of the text
    • Automatic text generation
    • Sentiment analysis / topic analysis
  • Computer Vision
    • Acquiring, processing, analyzing, and understanding images
    • Reconstructing, interpreting and understanding 3D scenes
    • Using image data to make decisions

Big Data infrastructure

  • Data Storage
    • Relational databases (SQL)
      • MySQL
      • Postgres
      • Oracle
    • Non-relational databases (NoSQL)
      • Cassandra
      • MongoDB
      • Neo4js
    • Understanding the nuances
      • Hierarchical databases
      • Object-oriented databases
      • Document-oriented databases
      • Graph-oriented databases
      • Other
  • Distributed Processing
    • Hadoop
      • HDFS as a distributed filesystem
      • MapReduce for distributed processing
    • Spark
      • All-in-one in-memory cluster computing framework for large-scale data processing
      • Structured streaming
      • Spark SQL
      • Machine Learning libraries: MLlib
      • Graph processing with GraphX
  • Scalability
    • Public cloud
      • AWS, Google, Aliyun, etc.
    • Private cloud
      • OpenStack, Cloud Foundry, etc.
    • Auto-scalability
  • Choosing the right solution for the problem
  • The future of Big Data
  • Summary and Conclusion

A Practical Introduction to Data Analysis and Big Data Training Course

Price on request