# Machine Learning Course

A brief summary of the topics covered in this course is given below. This is a 150-hour course, and it is suggested that you complete it in 2 months. In addition to the classroom sessions, you will be given exercises, which will take roughly another 100 hours to complete within the course duration.

## Introduction to Machine Learning & Supervised Learning

• What is Machine Learning?
• Supervised vs Unsupervised Learning
• Types of ML problems
• High-level view of the ML project lifecycle

## Linear Regression

• Introduction to regression – equation, limitations
• Types of regression
• Simple linear regression – best-fit line, OLS, goodness of fit, assumptions
• Model building
• Model Evaluation (regression parameters), Residual analysis and prediction, model interpretation
• The Mathematics of regression (parameter estimation using OLS, the gradient descent algorithm, ANOVA)
• Transformation of variables: Scaling and Standardization
• Polynomial regression
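
The OLS idea above can be sketched in a few lines of NumPy. This is a minimal illustration on invented toy data, not course material:

```python
import numpy as np

# Toy data lying exactly on y = 2x + 1, so OLS should recover those parameters.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# Design matrix with an intercept column; solve the least-squares problem.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # → [1. 2.]  (intercept, slope)
```

The same estimate falls out of minimising the sum of squared residuals by gradient descent, which the mathematics portion of the module covers.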

## Multiple linear regression

• SLR vs MLR (simple vs multiple linear regression)
• Multicollinearity
• Dummy Variable
• Polynomial regression
• Feature Selection
• Model building: backward, forward, and stepwise selection
• R Square and Adjusted R Square
• Loss: comparing RMSE, MSE, and MAE
• Interpreting coefficients of MLR
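
Why adjusted R Square matters is easy to demonstrate numerically. A small sketch with invented numbers, showing that R Square can rise while adjusted R Square falls when weak predictors are added:

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n samples and p predictors; penalises extra predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Adding 5 weak features nudges R^2 from 0.80 to 0.81 ...
print(round(adjusted_r2(0.80, 100, 3), 3))  # → 0.794
# ... but adjusted R^2 drops, flagging the unjustified extra complexity.
print(round(adjusted_r2(0.81, 100, 8), 3))  # → 0.793
```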

## Regularization

• Introduction to Regularization
• Regularized linear models
• Ridge regression
• Lasso regression
• Elastic net
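
The contrast between ridge and lasso can be seen on synthetic data where only one of ten features matters. A hedged sketch using scikit-learn (the data and alpha values are my own choices):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: only the first of 10 features actually drives the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

# Ridge shrinks the noise coefficients; lasso drives most of them exactly to zero.
print(np.sum(np.abs(ridge.coef_) < 1e-6), np.sum(np.abs(lasso.coef_) < 1e-6))
```

Elastic net blends the two penalties and sits between these behaviours.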

## Classification

• Introduction: regression vs classification, types of classification, evaluating classification models
• Logistic Regression: best-fit sigmoid curve, odds & log odds, multivariate logistic regression
• Building Logistic Regression Model
• Model Evaluation: confusion matrix and accuracy, sensitivity & specificity, precision & recall, trade-offs, ROC-AUC, predictions
• Transformation of variables: Scaling and Standardization (optional)
• Decision Trees: descriptive vs discriminative classification, the decision tree algorithm, measuring purity (Gini index, entropy, information gain)
• Building Decision Trees Model
• K-Nearest Neighbor Model
• Telecom Churn Case Study
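
The model-building and evaluation steps above can be illustrated end to end in a few lines. This sketch uses synthetic data as a stand-in for a binary churn problem, not the actual case-study dataset:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (a stand-in for churn labels).
X, y = make_classification(n_samples=400, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = clf.predict(X_te)

print(confusion_matrix(y_te, pred))       # rows: actual, columns: predicted
print(classification_report(y_te, pred))  # precision, recall, F1 per class
```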

## Ensemble Models

• Introduction to Ensemble Modelling
• Bagging (Bootstrap Aggregation) Model Introduction
• Random Forest
• Boosting Model Introduction
• Stacking
• Blending
• Out-of-bag (OOB) evaluation
• Feature importance in random forests
• Building Random Forest Model
• Building Boost Based Model
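
Bagging, OOB evaluation, and feature importance come together in a random forest. A minimal sketch on synthetic data (sample sizes and hyperparameters are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with 5 informative features out of 20.
X, y = make_classification(n_samples=300, n_informative=5, random_state=0)

# oob_score=True evaluates each tree on the samples left out of its bootstrap.
rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X, y)

print(rf.oob_score_)                     # accuracy on out-of-bag samples
print(rf.feature_importances_.argmax())  # index of the most influential feature
```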

## Support Vector Machine (SVM)

• Linear SVM classification
• Mathematical/geometrical intuition
• In-depth geometrical intuition
• Soft margin classification
• Nonlinear SVM classification
• Polynomial kernel
• Gaussian, RBF kernel
• Data leakage
• SVM Regression
• Mathematical/geometrical intuition
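
The difference between linear and nonlinear (kernelised) SVM classification shows up clearly on the classic two-moons dataset, which no straight line can separate. A hedged sketch, with my own choice of kernel parameters:

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Two interleaved half-moons: not linearly separable.
X, y = make_moons(n_samples=200, noise=0.1, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf", gamma=2.0).fit(X, y)

# The RBF kernel bends the decision boundary around the moons.
print(linear.score(X, y), rbf.score(X, y))
```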

## Naïve Bayes

• Introduction to Bayes theorem
• Multinomial naïve Bayes
• Gaussian naïve Bayes
• Various types of naïve Bayes classifiers and their intuition
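
Gaussian naïve Bayes assumes each feature is normally distributed within each class and applies Bayes' theorem to get posterior class probabilities. A minimal sketch on the iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
nb = GaussianNB().fit(X, y)

print(round(nb.score(X, y), 2))           # training accuracy
print(nb.predict_proba(X[:1]).round(3))   # posterior probabilities, summing to 1
```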

## Clustering & Market Basket Analysis

• Introduction to clustering, types of clustering, Euclidean distance & centroid
• K-means clustering algorithm
• Transformation of variables: Scaling and Standardization (optional)
• Building K-means model
• Introduction to market basket analysis: cross-selling & upselling, bag vs basket of products, the Apriori algorithm
• Gaussian Mixture Model
• K-Means++ initialization
• Mini-batch K-Means
• Hierarchical Clustering
• DBSCAN
• Evaluation of clustering
• Homogeneity, completeness, and V-measure
• Silhouette coefficient
• Davies–Bouldin index
• Contingency matrix
• Confusion matrix
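
K-means and the silhouette coefficient can be tried together in a few lines. A sketch on synthetic blobs (cluster count and spread are my own choices):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Three well-separated synthetic blobs.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=0)

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Silhouette is close to 1 when clusters are compact and well separated.
print(round(silhouette_score(X, km.labels_), 2))
```

Running this for several values of `n_clusters` and comparing silhouette scores is a common way to pick the number of clusters.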

## Model Evaluation & Model Selection

• Principles of model selection – model & learning algorithm
• Simplicity, complexity & overfitting; the bias–variance trade-off
• Tuning Complexity and Regularization
• Regularization, hyperparameters, and cross validation
• Model building & Model evaluation
• Hyperparameter tuning using grid-search and randomized-search CV
• Handling class imbalance
• Model Selection
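
Hyperparameter tuning with cross-validated grid search, as listed above, looks like this in scikit-learn. The parameter grid here is illustrative only:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

# 5-fold cross-validated search over two SVM hyperparameters.
grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", 0.1, 1.0]}, cv=5)
grid.fit(X, y)

print(grid.best_params_)              # the winning combination
print(round(grid.best_score_, 3))     # its mean cross-validation accuracy
```

`RandomizedSearchCV` has the same interface but samples the grid instead of enumerating it, which scales better to large parameter spaces.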

## Feature Engineering

• Feature engineering – introduction
• Handling numeric features, handling categorical features, handling time-based features
• Feature selection using CV
• Feature selection
• Recursive feature elimination
• Backward elimination
• Forward elimination
• Handling missing data
• Handling outliers
• Filter method
• Wrapper method
• Embedded methods
• Feature scaling
• Standardization
• Mean normalization
• Min-max scaling
• Unit vector
• Feature extraction
• PCA (principal component analysis)
• Introduction to Data encoding
• Nominal encoding
• One hot encoding
• One hot encoding with multiple categories
• Mean encoding
• Ordinal encoding
• Label encoding
• Target guided ordinal encoding
• Covariance
• Correlation check
• Pearson correlation coefficient
• Spearman’s rank correlation
• VIF (variance inflation factor)
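
Standardization and min-max scaling, listed above, differ only in the statistics they normalise by. A tiny NumPy sketch on made-up values:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

standardized = (x - x.mean()) / x.std()        # zero mean, unit variance
min_max = (x - x.min()) / (x.max() - x.min())  # rescaled into [0, 1]

print(standardized.round(3))  # values: -1.414, -0.707, 0, 0.707, 1.414
print(min_max)                # values: 0, 0.25, 0.5, 0.75, 1
```

scikit-learn's `StandardScaler` and `MinMaxScaler` implement the same formulas while remembering the training-set statistics for reuse on new data.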

## Handling Imbalance Data

• Introduction to Data Imbalance
• Up-sampling
• Down-sampling
• K-Fold Cross Validation
• Stratified K-Fold
• Synthetic Minority Oversampling technique (SMOTE)
• Random Oversampling
• Data interpolation
• Choosing the right evaluation metric
• Treating the problem as anomaly detection
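
Random oversampling of the minority class, one of the techniques above, can be sketched with scikit-learn's `resample` utility on a toy imbalanced set (SMOTE goes further by interpolating synthetic minority samples instead of repeating existing ones):

```python
import numpy as np
from sklearn.utils import resample

# Imbalanced toy set: 8 majority samples, 2 minority samples.
X = np.arange(20).reshape(10, 2)
y = np.array([0] * 8 + [1] * 2)

# Up-sample the minority class with replacement to match the majority.
X_min, y_min = X[y == 1], y[y == 1]
X_up, y_up = resample(X_min, y_min, n_samples=8, replace=True, random_state=0)

X_bal = np.vstack([X[y == 0], X_up])
y_bal = np.concatenate([y[y == 0], y_up])
print(np.bincount(y_bal))  # → [8 8]
```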

## Model Evaluation Metrics

• Confusion Matrix
• Accuracy, Recall (Sensitivity/ TPR), Precision, F1, ROC, AUC
• Error Rate, Specificity, FPR, Prevalence
• RMSE, MAE, MSE
• R Square, Adjusted R Square
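
All of the classification metrics above derive from the four confusion-matrix counts. A worked sketch with hypothetical counts:

```python
# Hypothetical confusion-matrix counts.
tp, fp, fn, tn = 40, 10, 5, 45

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)          # a.k.a. sensitivity / TPR
specificity = tn / (tn + fp)
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, round(recall, 3), round(f1, 3))  # → 0.85 0.8 0.889 0.842
```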

## Loss Function

• Introduction to Regression and Classification Loss Function
• Root Mean Square Error (RMSE)
• Mean Square Error (MSE)
• Mean Absolute Error (MAE)
• Huber Loss
• Maximum Likelihood Estimation
• Binary Cross Entropy Loss
• Hinge Loss
• Multi Class Cross Entropy Loss
• KL (Kullback Leibler) Divergence Loss
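
Huber loss, listed above, combines the best of MSE and MAE: quadratic for small residuals, linear for large ones, so outliers are not squared. A minimal NumPy sketch:

```python
import numpy as np

def huber(residual, delta=1.0):
    """Huber loss: quadratic inside |r| <= delta, linear outside."""
    r = np.abs(residual)
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta))

# The small residual is squared; the outlier is only penalised linearly.
print(huber(np.array([0.5, 3.0])))  # → [0.125 2.5]
```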

## Model Monitoring

• Introduction to model monitoring
• Model Drifting
• What to monitor?
• How frequently to evaluate?
• How to act on the results?

## Model Retraining

• Introduction to model retraining
• Retraining the same algorithm on new data
• Trying new features
• Trying new algorithms

## Dimensionality reduction

• The curse of dimensionality
• Dimensionality reduction techniques
• PCA (principal component analysis) – introduction & maths
• Scree plots
• Eigen-decomposition approach
• t-SNE
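
PCA's explained-variance ratio (the quantity behind scree plots) shows how many components are worth keeping. A sketch on synthetic 3-D data that really lives on a 2-D plane:

```python
import numpy as np
from sklearn.decomposition import PCA

# Rank-2 data embedded in 3-D, plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 3))
X += 0.01 * rng.normal(size=X.shape)

pca = PCA(n_components=3).fit(X)

# The first two components capture almost all the variance; the third is noise.
print(pca.explained_variance_ratio_.round(4))
```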

## Decision-Tree-Based ML

• Decision Tree
• Definition of Ensemble techniques
• Bagging technique
• Bootstrap aggregation
• Random forest (bagging technique)
• Random forest regressor
• Random forest classifier
• Complete end-to-end project with deployment

## Recommendation Systems

• Introduction to Recommendation Systems
• Application of Recommendation Systems
• Collaborative Filtering
• Content Based Filtering
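
The core of memory-based collaborative filtering is a similarity matrix between items (or users). A sketch with a hypothetical user–item rating matrix and cosine similarity:

```python
import numpy as np

# Hypothetical user x item ratings (rows: users, columns: items; 0 = unrated).
R = np.array([[5, 4, 0, 1],
              [4, 5, 1, 0],
              [1, 0, 5, 4],
              [0, 1, 4, 5]], dtype=float)

# Item-item cosine similarity.
norms = np.linalg.norm(R, axis=0)
sim = (R.T @ R) / np.outer(norms, norms)

# Items 0 and 1 are liked by the same users, so they are far more similar
# than items 0 and 2; a recommender would suggest item 1 to fans of item 0.
print(round(sim[0, 1], 3), round(sim[0, 2], 3))  # → 0.952 0.214
```

Content-based filtering computes the same kind of similarity, but between item feature vectors (genre, description, etc.) rather than rating columns.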

## ML Libraries / Algorithm

• The SciPy ecosystem: numpy, scipy, pandas, matplotlib, sympy
• scikit-learn, scikit-image, statsmodels
