Presentation File:
Machine Learning – Introduction
Related Materials:
- Data Concept
- To learn more about Data Concept, follow [this] link.
- ML Performance Metrics:
Primary Requirements
- Some programming experience
- At least high school level math skills will be required.
- Passion to learn
IDE Requirements
- The most popular environment for Data Science is Anaconda. You can download and install it from here. Make sure you download the Python 3.7 distribution.
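After installing, one quick way to confirm which Python version a notebook is running is a check along these lines. This is only a sketch: the helper name `meets_minimum` is made up for illustration, and the 3.7 threshold simply mirrors the recommendation above.

```python
import sys

# Minimum version recommended in this guide (Python 3.7).
MINIMUM = (3, 7)


def meets_minimum(version_info=None, minimum=MINIMUM):
    """Return True if a (major, minor, ...) version is at least `minimum`.

    `meets_minimum` is a hypothetical helper, not part of any library.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


status = "OK" if meets_minimum() else "please upgrade"
print("Python", sys.version.split()[0], "-", status)
```

The same cell works in a hosted notebook, where the Python version is chosen by the service rather than by a local installation.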
I don’t have admin permission to install any software (don’t worry!)
- Google Colab (if you already have a Google account)
- Azure Notebook (if you already have a Microsoft account)
- Both are free to use!
Is there any way I can do Machine Learning analytics with less code or no code?
Yes, we can!
How?
Step 1: Go to https://studio.azureml.net/
Step 2: Use any Microsoft account (for example youremail@hotmail.com or youremail@outlook.com) to register and log in.
Azure ML Cheat Sheet
Algorithm Summary
Source: http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/
Predicting Used Car Prices
The Problem
The prices of new cars are fixed by the manufacturer, with some additional costs imposed by the government in the form of taxes. Customers buying a new car can therefore be reasonably sure that the money they invest is well spent. However, because new car prices have risen and many customers lack the funds to buy new, used car sales are increasing globally (Pal, Arora and Palakurthy, 2018). There is a need for a used car price prediction system that can effectively estimate what a car is worth from a variety of features. Although some websites already offer this service, their prediction methods may not be the best, and different models and systems may add predictive power when estimating a used car’s actual market value. Knowing that value matters both when buying and when selling.
The Client
Being able to predict a used car’s market value can help both buyers and sellers.
Used car sellers (dealers): They are one of the biggest target groups that may be interested in the results of this study. If used car sellers better understand what makes a car desirable and which features matter most for a used car, they can use this knowledge to offer a better service.
Online pricing services: Some websites offer an estimate of a car’s value. They may already have a good prediction model, but a second model may help them give their users a better prediction. The model developed in this study may therefore help online services that report a used car’s market value.
Individuals: Many individuals are interested in the used car market at some point in their lives because they want to sell their car or buy a used one. In this process, paying too much or selling for less than the market value is a big concern.
The Data
The data used in this project was downloaded from Kaggle, where it was uploaded by Kaggle user Austin Reese. Austin Reese scraped this data from Craigslist for non-profit purposes. It contains almost all the relevant information that Craigslist provides on car sales, including columns such as price, condition, manufacturer, latitude/longitude, and 22 other categories.
Dataset source: https://www.kaggle.com/austinreese/craigslist-carstrucks-data
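For readers taking the Python route, a first look at the downloaded file might resemble the sketch below. It uses only the standard library. The file name `vehicles.csv` and the helper `summarize_columns` are assumptions for illustration, and the three sample rows are made up so the block runs without the download. The column names `price`, `condition`, and `manufacturer` are the ones mentioned above.

```python
import csv
import io

# Made-up sample rows in CSV form so the sketch runs without the real file.
SAMPLE_CSV = """price,condition,manufacturer
4500,good,ford
12900,excellent,toyota
,fair,honda
"""


def summarize_columns(handle):
    """Count data rows and, per column, how many values are empty."""
    reader = csv.DictReader(handle)
    missing = {name: 0 for name in reader.fieldnames}
    total = 0
    for row in reader:
        total += 1
        for name in reader.fieldnames:
            # A short row yields None for the absent fields.
            if not (row[name] or "").strip():
                missing[name] += 1
    return total, missing


total, missing = summarize_columns(io.StringIO(SAMPLE_CSV))
print(total, missing)

# With the real download (the file name is an assumption):
# with open("vehicles.csv", newline="", encoding="utf-8") as f:
#     print(summarize_columns(f))
```

Counting empty values per column is a cheap way to decide which of the many scraped columns are complete enough to use as features.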
Solution
There are two ways to do this: with Azure ML Designer (the no-code way) or with a Python notebook.
- Let’s do this using Azure ML Designer (Azure ML Studio Classic).
- If you’re Python savvy, you can follow [this] link to get the ipynb files.
Heart Diseases Prediction
The Problem
The term “heart disease” is often used interchangeably with the term “cardiovascular disease”. Cardiovascular disease generally refers to conditions that involve narrowed or blocked blood vessels that can lead to a heart attack, chest pain (angina) or stroke. Other heart conditions, such as those that affect your heart’s muscle, valves or rhythm, also are considered forms of heart disease.
This makes heart disease a major concern that needs to be dealt with. However, it is difficult to identify because of several contributory risk factors such as diabetes, high blood pressure, high cholesterol, abnormal pulse rate, and many others. Because of these constraints, scientists have turned to modern approaches like Data Science and Machine Learning for predicting the disease.
The Data
In this exercise, we will apply Machine Learning approaches (and eventually compare them) to classify whether a person is suffering from heart disease, using one of the most widely used datasets: the Cleveland Heart Disease dataset from the UCI Repository.
Data Source URL : http://archive.ics.uci.edu/ml/machine-learning-databases/heart-disease/processed.cleveland.data
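In a Python notebook, the file at the URL above could be read with the standard library, roughly as sketched here. The function names `parse_rows` and `fetch_rows` are hypothetical, and the sketch assumes the file is plain comma-separated text with no header row. The two sample lines are invented to match that shape, with 14 fields each to line up with the 14 column names given in the hints.

```python
import csv
import io
import urllib.request

DATA_URL = ("http://archive.ics.uci.edu/ml/machine-learning-databases/"
            "heart-disease/processed.cleveland.data")


def parse_rows(text):
    """Split comma-separated text into rows (lists of strings), skipping blanks."""
    return [row for row in csv.reader(io.StringIO(text)) if row]


def fetch_rows(url=DATA_URL):
    """Download the file and parse it. Needs network access."""
    with urllib.request.urlopen(url) as response:
        return parse_rows(response.read().decode("utf-8"))


# Invented sample in the assumed format (14 comma-separated fields per line).
sample = (
    "50,0,2,120,200,0,0,160,0,0.0,1,0,3,0\n"
    "58,1,3,135,240,0,0,145,0,0.6,2,1,7,2\n"
)
rows = parse_rows(sample)
print(len(rows), "rows,", len(rows[0]), "columns")
```

Calling `fetch_rows()` in place of `parse_rows(sample)` would load the real file when the notebook has internet access.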
Solution
There are two ways to do this: with Azure ML Designer (the no-code way) or with a Python notebook.
- Let’s do this using Azure ML Designer (Azure ML Studio Classic).
- If you’re Python savvy, you can follow [this] link to get the ipynb files. To read the blog post about this problem, visit this [link].
Hints:
- Edit Metadata: set the new column names to: age,sex,chestpaintype,resting_blood_pressure,serum_cholestrol,fasting_blood_sugar,resting_ecg,max_heart_rate,exercise_induced_angina,st_depression_induced_by_exercise,slope_of_peak_exercise,number_of_major_vessel,thal,heart_disease_diag
- Edit Metadata: change the data type to Integer for the following columns: heart_disease_diag,age,sex
- Edit Metadata: make the following columns categorical: sex,chestpaintype,exercise_induced_angina,number_of_major_vessel,slope_of_peak_exercise,fasting_blood_sugar,thal,resting_ecg
- Clean Missing Value
- Apply SQL Transformation
SELECT *,
       CASE
           WHEN heart_disease_diag < 1 THEN 0
           ELSE 1
       END AS HeartDiseaseCat
FROM t1;
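The same preparation can be sketched in Python for readers following the notebook route. The block below names the columns as in the hints, drops rows with a missing value, and derives `HeartDiseaseCat` with the same rule as the SQL above. It is an illustration only: the helper names are hypothetical, dropping rows is just one possible way to handle the Clean Missing Value step, the `?` missing-value marker is an assumption about the raw file, and the sample rows are invented.

```python
COLUMNS = [
    "age", "sex", "chestpaintype", "resting_blood_pressure", "serum_cholestrol",
    "fasting_blood_sugar", "resting_ecg", "max_heart_rate",
    "exercise_induced_angina", "st_depression_induced_by_exercise",
    "slope_of_peak_exercise", "number_of_major_vessel", "thal",
    "heart_disease_diag",
]
MISSING = "?"  # assumed marker for missing values in the raw file


def to_records(rows):
    """Attach column names and drop rows that are malformed or have a missing value."""
    records = []
    for row in rows:
        if len(row) != len(COLUMNS) or MISSING in row:
            continue
        records.append(dict(zip(COLUMNS, (float(v) for v in row))))
    return records


def add_heart_disease_cat(record):
    """Mirror the SQL CASE: 0 when heart_disease_diag < 1, otherwise 1."""
    record = dict(record)  # copy so the input record is left unchanged
    record["HeartDiseaseCat"] = 0 if record["heart_disease_diag"] < 1 else 1
    return record


# Invented rows; the second one has a missing value and is dropped.
sample_rows = [
    ["50", "0", "2", "120", "200", "0", "0", "160", "0", "0.0", "1", "0", "3", "0"],
    ["61", "1", "4", "140", "250", "1", "2", "130", "1", "1.8", "2", "?", "7", "3"],
    ["58", "1", "3", "135", "240", "0", "0", "145", "0", "0.6", "2", "1", "7", "2"],
]
prepared = [add_heart_disease_cat(r) for r in to_records(sample_rows)]
print([(r["age"], r["HeartDiseaseCat"]) for r in prepared])
```

Collapsing every non-zero diagnosis value into 1 turns the task into a two-class problem, which is what the `HeartDiseaseCat` column in the SQL produces.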