Feature Engineering for Machine Learning

Introduction :
Course curriculum overview
Course requirements
How to approach this course
Guide to setting up your computer
Installing XGBoost in Windows
Presentations covered in this course
Jupyter notebooks covered in the course
FAQ: Data science and Python programming

Types of variables :
Variables | Intro
Numerical variables | Intro
Numerical variables | Notebook demo
Categorical variables | Intro
Categorical variables | Notebook demo
Date and time variables | Intro
Dates | Notebook demo
Mixed variables | Intro
Mixed variables | Notebook demo
Bonus: Learn more about the Lending Club dataset

Types of problems in variables :
Problems in variables | Intro
Missing values
Rare values
Bonus: Machine Learning Algorithms Overview - Table
Bonus: Additional reading resources on variable problems
FAQ: How can I learn more about machine learning?

Machine learning model requirements :
Variable magnitude
Linear assumption
Variable distribution
Bonus: Additional reading resources

Engineering missing values (NA) in numerical variables :
Complete Case Analysis
Mean and median imputation
Random sample imputation (part 1)
Random sample imputation (part 2)
Adding a variable to capture NA
End of distribution imputation
Arbitrary value imputation

Engineering missing values (NA) in categorical variables :
Frequent category imputation
Random sample imputation
Adding a variable to capture NA
Adding a category to capture NA

Bonus: More on engineering missing values :
Overview of missing value imputation methods
Conclusion: when to use each NA imputation method

Engineering outliers in numerical variables :
Top-coding, bottom-coding and zero-coding (part 1)
Top-coding, bottom-coding and zero-coding (part 2)

Engineering rare values in categorical variables :
Engineering rare values (part 1)
Engineering rare values (part 2)
Engineering rare values (part 3)
Engineering rare values (part 4)

Engineering labels of categorical variables :
One-hot encoding - variables with many labels
Ordinal numbering encoding
Count or frequency encoding
Target guided ordinal encoding
Mean encoding
Probability ratio encoding
Weight of evidence (WOE)
Comparison of categorical variable encoding
Bonus: Additional reading resources

Engineering mixed variables :
Engineering mixed variables (part 1)
Engineering mixed variables (part 2)

Engineering dates :
Engineering dates

Feature Scaling :
Normalisation - Standardisation
Scaling to minimum and maximum values
Scaling to median and quantiles

Gaussian Transformation :
Transformation with functions
Transformation with functions - Fare
Box Cox transformation

Discretisation :
Equal frequency discretisation
Equal width discretisation
Domain knowledge discretisation
Discretisation with classification trees
Bonus: Additional reading resources

NEW: Engineering features with Feature_Engine :
Introduction to Feature Engine and downloading the package
Feature Engine: missing value imputation
Feature Engine: categorical variable encoding
Feature Engine: variable discretisation
Feature Engine: outlier handling
Feature Engine: variable transformation

Putting it all together :
New: Regression with Feature_Engine

Final section | Next steps :
BONUS: Discounts on my other courses!