Getting Started with Haskell Data Analysis
Description
Type: Course
Methodology: Online
Start date: Different dates available
Put your Haskell skills to work and generate publication-ready visualizations in no time at all.
Data analysis is part computer science and part statistics. An important part of data analysis is validating your assumptions against real-world data to see whether there is a pattern or a particular user behavior that you can confirm. This video course will help you get up to speed with the basics of data analysis in the Haskell language. You'll learn about statistical computing, file formats (CSV and SQLite3), descriptive statistics, and charts, then move on to more advanced concepts such as understanding the importance of the normal distribution. Whilst mathematics is a big part of data analysis, we’ve tried to keep this course simple and approachable so that you can apply what you learn to the real world.
About the Author
James Church lives in Clarksville, Tennessee, United States, where he enjoys teaching, programming, and playing board games with his wife, Michelle. He is an assistant professor of computer science at Austin Peay State University. He has consulted for various companies and a chemical laboratory, performing data analysis work. James is the author of Learning Haskell Data Analysis.
About this course
Learn to parse a CSV file and read data into the Haskell environment
Create Haskell functions for the common descriptive statistics functions that you already know about: range, mean, median, mode, and standard deviation
Learn to create a SQLite3 database using an existing CSV file
Learn the versatility of the SELECT query for slicing data into smaller chunks
Learn to craft regular expressions through simple examples
Learn to apply regular expressions in large-scale datasets using both CSV files and SQLite3 files
Understand the formula for normal distribution and how the parameters affect the shape of the distribution
Learn to create a kernel density estimator visualization, which is an application of normal distribution
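As a rough illustration of the last two points, the sketch below writes out the normal density and a Gaussian kernel density estimate using only the Haskell base library. The names normalPdf and kde and the bandwidth parameter h are assumptions made for this example, not code taken from the course.

```haskell
-- Normal probability density with mean mu and standard deviation sigma.
-- Assumes sigma > 0. A larger sigma gives a lower, wider curve; mu shifts it.
normalPdf :: Double -> Double -> Double -> Double
normalPdf mu sigma x =
  exp (negate ((x - mu) ^ (2 :: Int)) / (2 * sigma * sigma))
    / (sigma * sqrt (2 * pi))

-- Gaussian kernel density estimate at point x with bandwidth h.
-- Each sample contributes a normal curve centred on itself; the estimate is
-- the average of those curves. An empty sample list yields NaN.
kde :: Double -> [Double] -> Double -> Double
kde h samples x =
  sum [normalPdf xi h x | xi <- samples] / fromIntegral (length samples)
```

Evaluating kde over a grid of x values gives the points of a density curve that a charting library could then plot. The choice of bandwidth h controls how smooth that curve looks.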
Subjects
- Data analysis
- Statistics
- Install
Course programme
- Obtain a raw CSV file
- Install the Text.CSV library
- Use Jupyter and the Text.CSV library to obtain a column of the data
- Define what is meant by "range." The range is the smallest interval that contains all of the data
- We find the smallest value in a dataset using "minimum". We find the largest using "maximum". We write a quick function to combine these two values into a tuple
- We drop that function into a module file and show how it can be called from IHaskell
- Define what is meant by "mean" and "standard deviation." The mean is the sum of all values divided by the number of values. The standard deviation is the square root of the mean of the squared distances of each value from the mean
- We find the sum of the values in a dataset using "sum". We find the number of values in a dataset using "length". We write a quick function to return the mean, as well as the standard deviation
- We drop that function into a module file and show how it can be called from IHaskell
- Define what is meant by "median." The median is the center value of the sorted dataset provided that there are an odd number of values. If there are an even number of values, then the median is the mean of the two center-most sorted values
- We sort the values using "sortBy" and then determine whether the list has an even or odd number of values. Then we determine the median
- We drop that function into a module file and show how it can be called from IHaskell
- Define what is meant by "mode." The mode of a dataset is the value that appears most frequently in that dataset
- We find the mode of a list in a naïve way similar to that of the algorithm for run-length encoding. Are there faster ways? Yes
- We drop that function into a module file and show how it can be called from IHaskell
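The range, mean, standard deviation, and median steps in the programme can be sketched with the base library alone. This is one possible reading, not the course's own code: the function names are illustrative, the standard deviation is the population form (a sample estimate would divide by n - 1 instead), the programme's "sortBy" is replaced by the plain "sort" from Data.List, and empty lists are handled with Maybe as a design choice for this example.

```haskell
import Data.List (sort)

-- Smallest and largest value as a tuple; Nothing for an empty list.
range :: Ord a => [a] -> Maybe (a, a)
range [] = Nothing
range xs = Just (minimum xs, maximum xs)

-- Sum of the values divided by the number of values.
mean :: [Double] -> Maybe Double
mean [] = Nothing
mean xs = Just (sum xs / fromIntegral (length xs))

-- Population standard deviation: square root of the mean squared
-- distance of each value from the mean.
stdDev :: [Double] -> Maybe Double
stdDev xs = do
  m <- mean xs
  let squares = [(x - m) ^ (2 :: Int) | x <- xs]
  pure (sqrt (sum squares / fromIntegral (length xs)))

-- Middle value of the sorted list, or the mean of the two middle
-- values when the list has an even number of elements.
median :: [Double] -> Maybe Double
median [] = Nothing
median xs
  | odd n     = Just (sorted !! mid)
  | otherwise = Just ((sorted !! (mid - 1) + sorted !! mid) / 2)
  where
    sorted = sort xs
    n      = length xs
    mid    = n `div` 2
```

Saved in a module file, these definitions could be loaded from IHaskell or GHCi and applied to a column of numbers once it has been read from a CSV file and converted to Double.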
Data Mode
We have a collection of data. We would like to quickly identify the mode of the data.
- Define what...
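A naïve mode in the spirit of run-length encoding might look like the sketch below: sort the data so that equal values sit next to each other, count each run, and keep the longest run. The names runLengths and mode are assumptions for this example. When several values tie for most frequent, this sketch returns only one of them.

```haskell
import Data.List (group, maximumBy, sort)
import Data.Ord (comparing)

-- Sort the values, then collapse each run of equal values into
-- a (value, count) pair, much like run-length encoding.
runLengths :: Ord a => [a] -> [(a, Int)]
runLengths xs = [(x, length g) | g@(x : _) <- group (sort xs)]

-- The value with the highest count together with that count;
-- Nothing for an empty list.
mode :: Ord a => [a] -> Maybe (a, Int)
mode xs = case runLengths xs of
  []   -> Nothing
  runs -> Just (maximumBy (comparing snd) runs)
```

Sorting makes this O(n log n). A faster alternative for large datasets would be to count occurrences in a map in a single pass, which is one answer to the "are there faster ways?" question.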
