- Some programming experience (e.g. C, C++, Java, QBasic (!), etc.)
- At least high school level math skills will be required.
- Passion to learn
- Anaconda is one of the most popular Python distributions for data science. You can download and install it from here. Make sure you download the Python 3.7 distribution.
» I don’t have admin permission to install any software (Don’t worry!)
- Google Colab [if you already have a Google Account]
Module 1: Python – A Quick Review
In this module, you will get a quick review of the Python language. We will not go into depth, but we will discuss some important components of the language. Please note that this is not meant to be a comprehensive overview of Python or of programming in general.
Hands-on : Environment Setup and Jupyter Notebook Intro.
Hands-on : Python Code Along
Hands-on : Python Review Exercise
Module 2: Python for Data Analysis (Pandas)
Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.
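As a rough taste of what "data analysis and manipulation" looks like in practice, here is a minimal sketch. The table, its column names (`city`, `sales`) and its values are invented for illustration and are not taken from the course datasets.

```python
import pandas as pd

# Hypothetical toy data, made up purely for this example.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Lima", "Oslo", "Lima", "Pune"],
        "sales": [120, 80, 150, 95, 60],
    }
)

# Manipulation: keep only the rows where sales exceed 90.
high_sales = df[df["sales"] > 90]

# Analysis: total sales per city.
total_by_city = df.groupby("city")["sales"].sum()

print(high_sales)
print(total_by_city)
```

Filtering with a boolean condition and summarising with `groupby` are two of the operations you will likely use most often in the hands-on session.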
Hands-on : Using Python Pandas Library
Module 3: Data Visualization/EDA/Data Analysis (Seaborn)
In this part of the course we will discuss methods of descriptive statistics. You will learn what cases and variables are and how to compute measures of central tendency (mean, median and mode) and dispersion (standard deviation and variance). Next, we discuss how to assess relationships between variables, and we introduce the concepts of correlation and regression.
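The measures mentioned above can be sketched in a few lines. This example uses pandas and NumPy rather than Seaborn so that it runs without a plotting backend. The variables `hours_studied` and `exam_score` and all of their values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: each row is a case, each column is a variable.
data = pd.DataFrame(
    {
        "hours_studied": [1, 2, 2, 3, 4, 5],
        "exam_score": [52, 55, 60, 65, 70, 78],
    }
)
hours = data["hours_studied"]

# Central tendency.
mean_hours = hours.mean()
median_hours = hours.median()
mode_hours = hours.mode().iloc[0]  # mode() returns a Series; take the first value

# Dispersion (pandas uses the sample formulas, ddof=1, by default).
variance_hours = hours.var()
std_hours = hours.std()

# Relationship between two variables: Pearson correlation.
correlation = data["hours_studied"].corr(data["exam_score"])

# Simple linear regression: least-squares fit of exam_score = slope * hours + intercept.
slope, intercept = np.polyfit(
    data["hours_studied"].to_numpy(), data["exam_score"].to_numpy(), deg=1
)

print(mean_hours, median_hours, mode_hours)
print(variance_hours, std_hours)
print(correlation, slope, intercept)
```

Note that the standard deviation is simply the square root of the variance, and that a positive slope goes together with a positive correlation.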
Hands-on : Using Python Seaborn Visualization Library
Module 4: Data Analytics / Machine Learning
In this part of the course we will discuss Scikit-Learn, one of the best-known machine learning libraries and a package that provides efficient versions of a large number of common algorithms. Scikit-Learn is characterized by a clean, uniform, and streamlined API, as well as by very useful and complete online documentation. A benefit of this uniformity is that once you understand the basic use and syntax of Scikit-Learn for one type of model, switching to a new model or algorithm is very straightforward.
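The uniform API can be illustrated with a short sketch. The choice of the bundled Iris dataset and of these two particular classifiers is an assumption made for this example, not necessarily what the hands-on session uses.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Small dataset that ships with scikit-learn (no download needed).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Two different algorithms, chosen only for illustration.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)  # same call for every estimator
    scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out data
    print(f"{name}: {scores[name]:.2f}")
```

Swapping in another classifier only requires changing the entry in `models`; the `fit` and `score` calls stay the same.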
Hands-on : Using Python scikit-learn Library
Azure ML Cheat Sheet
- Data Science Mindmap : https://gitmind.com/app/doc/fd05946837
- Data Concept
- To learn more about Data Concept, you can click [this] link.
- ML Performance Metrics:
- AzureML End-to-End Lecture Series
Regression Performance Metrics
Classification Performance Metrics
Open Jupyter Notebook
Mindmap for Python
Google Colab Notebook (Python Intro)
Code Along for Python Pandas (Google Colab link)
Download Sample Excel File: Sample Alarm Data
Exercise: 1 for Pandas
- Download the .ipynb file from here [SF Salaries Exercise Salaries] (it’s a zip file; you need to unzip it before use)
- You can download the dataset from here [Salaries] (it’s a zip file; you need to unzip it before use)
- Solution Colab Link is here
Exercise: 2 for Pandas
Code Along for Python Seaborn (Google Colab link)
Exercise: 1 for Seaborn
- Download the .ipynb file from here [Seaborn Exercises] (it’s a zip file; you need to unzip it before use)
- Solution Colab Link is here
Exercise: 2 (Capstone)
Code Along for Python Machine Learning – Sklearn (Google Colab link)
I wanted to point out some helpful links for practice. Don’t worry about being able to do these exercises yet; I just want you to be aware of the links so you can visit them later.
More Mathematical (and Harder) Practice:
List of Practice Problems:
A SubReddit Devoted to Daily Practice Problems:
A very tricky website with very few hints and tough problems (not for beginners, but still interesting):