Feature selection with Python
Discover what you'll learn in the course.
What you'll learn
👉 Build simpler, faster, and more robust machine learning models.
👉 Why feature selection matters.
👉 Filter, embedded and wrapper methods.
👉 Forward and backward search.
👉 Select features with Lasso and decision trees.
👉 Recursive feature selection.
👉 Apply feature selection with open source Python libraries.
What you'll get
Lifetime access
Instructor support
Certificate of completion
Access on mobile
💬 English subtitles
Pricing
Master Feature Selection for Machine Learning
$49.99
Create simpler, faster and more reliable machine learning models.
- Learn filter, wrapper and embedded methods
- Discover state-of-the-art feature selection methods
- Apply recursive feature elimination
- Learn MRMR, single feature classifiers and more
30-day money-back guarantee
If you're disappointed for whatever reason, you'll get a full refund.
So you can buy with confidence.
Instructor
Soledad Galli, PhD
Sole is a lead data scientist, instructor, and open source software developer. She created and maintains the Python library Feature-engine, which lets you impute data, encode categorical variables, and transform, create, and select features. Sole is also the author of the "Python Feature Engineering Cookbook", published by Packt.
More about Sole on LinkedIn.
Course description
Welcome to Feature Selection for Machine Learning, the most comprehensive course on feature selection available online.
In this course, you will learn multiple feature selection methods to select the best features in your data set and build simpler, faster, and more reliable machine learning models.
What is feature selection?
Feature selection is the process of identifying and selecting a subset of features from the original data set to use as inputs in a machine learning algorithm.
Data sets usually contain a large number of features. We can use various algorithms to quickly discard irrelevant features and identify the important ones in our data.
Feature selection algorithms fall into three categories: filter methods, wrapper methods, and embedded methods.
Filter methods comprise basic data preprocessing steps, such as removing constant and duplicated features, and statistical tests to assess feature importance. Wrapper methods wrap the search around the estimator: they use forward and backward selection to examine candidate feature subsets and identify the best one. Embedded methods combine feature selection with the fitting of the classification or regression model.
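The three families map neatly onto scikit-learn. Below is a minimal sketch showing one representative from each; the synthetic data, estimator choices, and parameters are illustrative, not the course's exact code.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import (
    VarianceThreshold,          # filter: drop constant features
    SequentialFeatureSelector,  # wrapper: forward/backward search
    SelectFromModel,            # embedded: selection during model fitting
)
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Filter: remove zero-variance (constant) features with a simple statistic.
X_filter = VarianceThreshold(threshold=0.0).fit_transform(X)

# Wrapper: forward search around an estimator, scored by cross-validation.
sfs = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000), n_features_to_select=3, direction="forward"
)
X_wrapper = sfs.fit_transform(X, y)

# Embedded: keep features whose L1-penalized coefficients are non-zero.
sfm = SelectFromModel(LogisticRegression(penalty="l1", solver="liblinear", C=0.5))
X_embedded = sfm.fit_transform(X, y)

print(X_wrapper.shape)  # (200, 3)
```

Note the trade-off: the filter step never fits a model, the wrapper refits the estimator at every search step, and the embedded approach gets selection for free from a single fit.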
Why do we select features?
Feature selection is key to creating models that are faster and easier to interpret, and to avoiding overfitting. When building machine learning models for the real world, feature selection is an integral part of the machine learning pipeline.
What will you learn in this online course?
In this course, you will learn multiple feature selection techniques, gathered from scientific articles, data science competitions and my experience as a data scientist, to identify relevant features in your data sets.
You will learn the following filter methods:
- Chi-square test for categorical variables
- ANOVA for continuous variables and binary or multiclass target variables
- Pearson’s correlation for continuous variables in regression
- Information gain
- Mutual information
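The univariate filter tests above can be applied through scikit-learn's `SelectKBest`. A minimal sketch on synthetic data (the dataset and `k` are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import (
    SelectKBest, chi2, f_classif, mutual_info_classif
)

X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           random_state=0)

# ANOVA F-test: continuous features vs. a binary or multiclass target.
X_anova = SelectKBest(f_classif, k=4).fit_transform(X, y)

# Mutual information: also captures non-linear dependence.
X_mi = SelectKBest(mutual_info_classif, k=4).fit_transform(X, y)

# Chi-square requires non-negative features (e.g. counts or scaled values),
# so here we take absolute values purely for illustration.
X_chi2 = SelectKBest(chi2, k=4).fit_transform(np.abs(X), y)

print(X_anova.shape)  # (300, 4)
```

Each scorer ranks features independently of any model, which makes filter methods fast but blind to feature interactions.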
You will learn the following wrapper methods:
- Forward selection of features
- Backward selection of variables
- Exhaustive search
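Forward and backward search can be sketched with scikit-learn's `SequentialFeatureSelector` (the course also covers the MLXtend implementation); the estimator and target subset size here are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=150, n_features=6, random_state=0)
model = DecisionTreeClassifier(random_state=0)

# Forward: start empty, add the feature that most improves the CV score
# at each step, until 3 features are selected.
forward = SequentialFeatureSelector(
    model, n_features_to_select=3, direction="forward"
).fit(X, y)

# Backward: start with all features, drop the least useful one each step.
backward = SequentialFeatureSelector(
    model, n_features_to_select=3, direction="backward"
).fit(X, y)

print(forward.get_support())  # boolean mask over the 6 original features
```

Exhaustive search evaluates every possible subset instead, which is optimal but combinatorially expensive, hence the greedy forward/backward shortcuts above.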
You will learn the following embedded methods:
- Lasso regularization
- Linear models coefficients
- Feature importance derived from decision trees and random forests
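Embedded selection can be sketched with `SelectFromModel`, which reads importances off a fitted estimator; the dataset, `alpha`, and forest size below are illustrative:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

# Lasso (L1) regularization drives the coefficients of irrelevant
# features to exactly zero, so selection falls out of the fit itself.
lasso_sel = SelectFromModel(Lasso(alpha=1.0)).fit(X, y)

# Random forests rank features by how much they reduce impurity on
# average; SelectFromModel keeps those above the mean importance.
rf_sel = SelectFromModel(
    RandomForestRegressor(n_estimators=50, random_state=0)
).fit(X, y)

print(lasso_sel.get_support())  # mask of features with non-zero coefficients
```

The same idea works for linear model coefficients generally, with the caveat that coefficients are only comparable across features if the features are on the same scale.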
You will learn the following hybrid methods:
- Recursive feature elimination or addition
- How to select features based on changes in model performance after feature shuffling
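Both hybrid ideas have off-the-shelf scikit-learn counterparts: `RFE` for recursive elimination and `permutation_importance` for feature shuffling. A minimal sketch, with illustrative data and parameters:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=8, n_informative=3,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=50, random_state=0)

# Recursive feature elimination: repeatedly fit the model and drop the
# least important feature until 3 remain.
rfe = RFE(model, n_features_to_select=3).fit(X_tr, y_tr)

# Feature shuffling: permute one feature at a time on held-out data and
# measure the drop in performance; large drops mark important features.
model.fit(X_tr, y_tr)
perm = permutation_importance(model, X_te, y_te, n_repeats=5, random_state=0)

print(rfe.get_support().sum())  # 3
```

These methods are "hybrid" because they combine a model fit (like embedded methods) with an iterative search over features (like wrapper methods), usually at a lower cost than a full wrapper search.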
Throughout the tutorials, you will implement the feature selection methods in an elegant, efficient, and professional manner, using Python, Scikit-learn, pandas, MLXtend and Feature-engine.
At the end of the course, you will have a variety of tools to select and compare different feature subsets and identify the ones that return the simplest, yet most predictive machine learning model. This will allow you to minimize the time it takes to put your predictive models into production.
Who is this course for?
You’ve taken your first steps into data science. You know the most commonly used machine learning models. You've probably trained a few linear regression or decision tree models. You are familiar with data preprocessing and feature engineering techniques like missing data imputation and encoding categorical variables. At this stage, you’ve probably realized that many data sets contain an enormous number of features, and some of them are identical or very similar. Some of them are not predictive at all, and for some others, it is harder to say.
You wonder how you can go about finding the most predictive features. Which ones are OK to keep, and which ones could you do without? You also wonder how to code the methods in a professional manner. You've probably searched online and found that there is not much out there about feature selection. So you start to wonder: how are things really done in tech companies?
This course will help you! This is the most comprehensive online course in variable selection. You will learn about a huge variety of feature selection procedures used worldwide in different organizations and in data science competitions, to select the most predictive features.
Course prerequisites
To get the most out of this course, you need basic knowledge of machine learning and familiarity with the most common predictive models, like linear and logistic regression, decision trees, and random forests, and the metrics used to evaluate model performance. You also need basic knowledge of Python and the open source libraries NumPy, pandas, and scikit-learn.
To wrap up
This comprehensive feature selection course contains approximately 70 lectures spread across 8 hours of video, and ALL topics include hands-on Python code examples that you can use for reference, practice, and re-use in your own projects.
Course Curriculum
- Correlation - Intro (2:41)
- Correlation Feature Selection (5:32)
- Correlation procedures to select features (3:37)
- Correlation | Notebook demo (11:49)
- Basic methods plus Correlation pipeline
- Correlation with Feature-engine (8:01)
- Feature Selection Pipeline with Feature-engine (2:19)
- Categorical variables and correlation (3:03)
- Additional reading resources
- Added Treat: A Movie We Recommend 🍿
- Statistical tests for feature selection – intro (3:25)
- Statistical tests for feature selection - characteristics (4:39)
- Mutual information (8:26)
- MI for continuous variables (6:23)
- Select features with MI (4:15)
- Mutual information demo (4:39)
- Chi-square test (16:15)
- Chi-square | Demo (5:54)
- Chi-square considerations (9:19)
- Chi2 - calculating the expected frequencies (optional) (3:51)
- Chi-square quiz
- ANOVA (12:59)
- ANOVA | Demo (3:51)
- Correlation with the target (5:32)
- Correlation with the target - demo (1:28)
- Select features based on p-values (10:32)
- Basic methods + Correlation + Filter with stats pipeline
- Reading resources
- Filter Methods with other metrics (3:04)
- Univariate model performance metrics (5:52)
- Univariate model performance metrics | Demo (4:23)
- Univariate model performance with Feature-engine (4:54)
- KDD 2009: Select features by target mean encoding (6:39)
- KDD 2009: Select features by mean encoding | Demo (6:59)
- Target Mean Encoding Selection with Feature-engine (5:20)
- Reading resources
- Extra Treat: Our Reading Suggestion 📕
- Wrapper methods – Intro (6:39)
- MLXtend
- Step forward feature selection (3:14)
- SFS - MLXtend vs Sklearn (4:06)
- Step forward feature selection | MLXtend (6:00)
- Step forward feature selection | sklearn
- Step backward feature selection (3:13)
- Step backward feature selection | MLXtend (5:50)
- Step backward feature selection | Sklearn
- Exhaustive search (2:45)
- Exhaustive search | Demo (3:37)
- Introduction to hybrid methods (1:50)
- Feature Shuffling - Intro (2:41)
- Shuffling features | Demo (8:41)
- Recursive feature elimination - Intro (2:21)
- Recursive feature elimination | Demo (5:42)
- Recursive feature addition - Intro (2:06)
- Recursive feature addition | Demo (2:55)
- Feature Shuffling with Feature-engine (5:39)
- Recursive feature elimination with Feature-engine (4:53)
- Recursive feature addition with Feature-engine (3:22)
Frequently Asked Questions
When does the course begin and end?
You can start taking the course from the moment you enroll. The course is self-paced, so you can watch the tutorials and apply what you learn whenever you find it most convenient.
For how long can I access the course?
The course has lifetime access. This means that once you enroll, you will have unlimited access to the course for as long as you like.
What if I don't like the course?
There is a 30-day money back guarantee. If you don't find the course useful, contact us within the first 30 days of purchase and you will get a full refund.
Will I get a certificate?
Yes, you'll get a certificate of completion after completing all lectures, quizzes and assignments.
Can I ask questions if I get stuck?
Absolutely! Under each video there is a comments section. Just pop your question in there, and the instructors will reply as soon as they can.
Is the course mobile-friendly?
It is indeed. Download Teachable's app on Google Play or the App Store, log in with your Train in Data credentials, and enjoy the courses from your mobile phone.
Can I get an invoice for my company?
Yes — we’re happy to help with company invoices 😊
Just send us an email at pricing@trainindata.com with your company’s name, tax ID, physical address (and company number, if applicable), the course or courses you’d like to purchase, and the email address(es) of the students to be enrolled. We’ll send you the invoice, and once everything looks good on your side, you can proceed with payment. As soon as we receive it, we’ll take care of enrolling the students.
Can I gift a course?
Yes, absolutely! 🎁
To gift a course, just email us at pricing@trainindata.com with the name and address of the person giving the gift, along with the email address of the person who will be enrolled. Don't forget to mention the course you want to gift. We’ll send you an invoice, and once payment is complete, we’ll take care of enrolling the recipient in the course.