Feature Selection for Machine Learning

Feature selection with Python

What you'll learn

👉 Build simpler, faster and more robust machine learning models.

👉 Why feature selection matters.

👉 Filter, embedded and wrapper methods.

👉 Forward and backward search.

👉 Select features with Lasso and decision trees.

👉 Recursive feature selection.

👉 Apply feature selection with open source Python libraries.

What you'll get

Hours of on-demand video

Jupyter Notebooks

Quizzes and Assignments

Lifetime access

Instructor support

Certificate of completion

Access on mobile

💬 English subtitles

Enroll Now

Pricing

Master Feature Selection for Machine Learning

$49.99

Create simpler, faster and more reliable machine learning models.

Learn filter, wrapper and embedded methods
Discover state-of-the-art feature selection methods
Apply recursive feature elimination
Learn MRMR, single feature classifiers and more

30-day money-back guarantee

If you're disappointed for whatever reason, you'll get a full refund.

So you can buy with confidence.

Instructor

Soledad Galli, PhD

Sole is a lead data scientist, instructor and developer of open source software. She created and maintains the Python library Feature-engine, which allows us to impute data, encode categorical variables, and transform, create and select features. Sole is also the author of the "Python Feature Engineering Cookbook", published by Packt.

Course description

Welcome to Feature Selection for Machine Learning, the most comprehensive course on feature selection available online.

In this course, you will learn multiple feature selection methods to select the best features in your data set and build simpler, faster, and more reliable machine learning models.

What is feature selection?

Feature selection is the process of identifying and selecting a subset of features from the original data set to use as inputs in a machine learning algorithm.

Data sets usually contain a large number of features. We can use multiple algorithms to quickly discard irrelevant features and identify the important features in our data.

Feature selection algorithms fall into one of three categories: filter methods, wrapper methods, and embedded methods.

Filter methods comprise basic data preprocessing steps to remove constant and duplicated features, and statistical tests to assess feature importance. Wrapper methods wrap the search around the estimator. They use backward and forward selection to examine and identify the best set of features. Embedded methods combine feature selection with the fitting of the classifier or regression model.
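As a rough illustration of the basic filter step described above, the sketch below drops constant and duplicated features using pandas only. The toy DataFrame and its column names are invented for this example and are not taken from the course material.

```python
import pandas as pd

# Toy data, invented for illustration only.
X = pd.DataFrame({
    "age": [25, 32, 47, 51],
    "age_copy": [25, 32, 47, 51],  # exact duplicate of "age"
    "flag": [1, 1, 1, 1],          # constant feature
    "income": [30, 45, 80, 62],
})

# Constant features: columns that hold a single unique value.
constant = [col for col in X.columns if X[col].nunique(dropna=False) == 1]

# Duplicated features: transpose so that columns become rows,
# then flag every row that repeats an earlier one.
transposed = X.T
duplicated = transposed[transposed.duplicated()].index.tolist()

# Keep everything that is neither constant nor duplicated.
X_selected = X.drop(columns=constant + duplicated)
print(list(X_selected.columns))
```

Transposing copies the whole data set, so on very wide or very large data a dedicated selector class may be a better fit than this one-liner.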

Why do we select features?

Feature selection is key to creating models that are faster and easier to interpret, as well as to avoiding overfitting. When creating machine learning models for use in the real world, feature selection is an integral part of the machine learning pipeline.

What will you learn in this online course?

In this course, you will learn multiple feature selection techniques, gathered from scientific articles, data science competitions and my experience as a data scientist, to identify relevant features in your data sets.

You will learn the following filter methods:

Chi-square test for categorical variables
ANOVA for continuous variables and binary or multiclass target variables
Pearson’s correlation for continuous variables in regression
Information gain
Mutual information
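To give a flavour of the statistical filter methods listed above, here is a minimal sketch that ranks features with ANOVA and with mutual information using scikit-learn. The synthetic data set and the choice of k=3 are assumptions made for the example, not settings from the course.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif

# Synthetic data: 10 continuous features, 3 of them informative (assumed setup).
X, y = make_classification(
    n_samples=300,
    n_features=10,
    n_informative=3,
    n_redundant=0,
    random_state=0,
)

# ANOVA F-test: score each feature against the target and keep the top 3.
anova = SelectKBest(score_func=f_classif, k=3).fit(X, y)
X_anova = anova.transform(X)

# Mutual information: one score per feature, higher means more dependence.
mi_scores = mutual_info_classif(X, y, random_state=0)

print("ANOVA keeps columns:", anova.get_support(indices=True))
print("Mutual information scores:", mi_scores.round(3))
```

The same `SelectKBest` wrapper accepts other scoring functions, such as `chi2` for non-negative categorical counts or `f_regression` for a continuous target.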

You will learn the following wrapper methods:

Forward selection of features
Backward selection of variables
Exhaustive search
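Forward and backward selection can be sketched with scikit-learn's `SequentialFeatureSelector`, used here as a stand-in for whichever library the course tutorials rely on. The estimator, the number of features to keep and the number of cross-validation folds are assumptions chosen to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

# Synthetic data: 8 features, 3 of them informative (assumed setup).
X, y = make_classification(
    n_samples=300,
    n_features=8,
    n_informative=3,
    n_redundant=0,
    random_state=0,
)

model = LogisticRegression(max_iter=1000)

# Forward selection: start with no features and add the one that
# improves the cross-validated score the most, until 3 are selected.
forward = SequentialFeatureSelector(
    model, n_features_to_select=3, direction="forward", cv=3
).fit(X, y)

# Backward selection: start with all features and remove the one
# whose removal hurts the cross-validated score the least.
backward = SequentialFeatureSelector(
    model, n_features_to_select=3, direction="backward", cv=3
).fit(X, y)

print("Forward keeps:", forward.get_support(indices=True))
print("Backward keeps:", backward.get_support(indices=True))
```

The two directions do not have to agree, and both retrain the model many times, which is why wrapper methods tend to be slower than filter methods.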

You will learn the following embedded methods:

Lasso regularization
Linear model coefficients
Feature importance derived from decision trees and random forests
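A minimal sketch of two of the embedded methods above, using scikit-learn's `SelectFromModel`. The synthetic regression data, the Lasso penalty `alpha=1.0` and the forest size are illustrative assumptions; in practice the penalty would be tuned and the features scaled before fitting Lasso.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

# Synthetic data: 10 features on a similar scale, 3 of them informative.
X, y = make_regression(
    n_samples=300,
    n_features=10,
    n_informative=3,
    noise=10.0,
    random_state=0,
)

# Lasso shrinks some coefficients to exactly zero; SelectFromModel
# keeps the features whose coefficients remain (effectively) non-zero.
lasso_selector = SelectFromModel(Lasso(alpha=1.0)).fit(X, y)
X_lasso = lasso_selector.transform(X)

# Tree-based importance: by default SelectFromModel keeps the features
# whose importance is at least the mean importance.
forest_selector = SelectFromModel(
    RandomForestRegressor(n_estimators=50, random_state=0)
).fit(X, y)
X_forest = forest_selector.transform(X)

print("Lasso keeps:", lasso_selector.get_support(indices=True))
print("Forest keeps:", forest_selector.get_support(indices=True))
```

In both cases the selection happens as a by-product of training the model, which is what distinguishes embedded methods from wrapper methods.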

You will learn the following hybrid methods:

Recursive feature elimination or addition
How to select features based on changes in model performance after feature shuffling
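The two hybrid ideas above can be approximated with scikit-learn's `RFE` and `permutation_importance`. This is a sketch under assumed settings (synthetic data, logistic regression, a keep-if-positive rule for the shuffling step), not necessarily how the course implements them.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data: 8 features, 3 of them informative (assumed setup).
X, y = make_classification(
    n_samples=400,
    n_features=8,
    n_informative=3,
    n_redundant=0,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Recursive feature elimination: fit the model, drop the weakest feature,
# and repeat until only 3 features are left.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=3)
rfe.fit(X_train, y_train)

# Feature shuffling: permute one feature at a time on held-out data and
# measure how much the model's score drops on average.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
result = permutation_importance(
    model, X_test, y_test, n_repeats=5, random_state=0
)

# Assumed rule for this sketch: keep features whose shuffling lowers the score.
keep = result.importances_mean > 0

print("RFE keeps:", rfe.get_support(indices=True))
print("Shuffling keeps:", keep.nonzero()[0])
```

The threshold on the performance drop is a design choice; a stricter cut-off gives a smaller feature set at the risk of discarding weakly predictive features.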

Throughout the tutorials, you will implement the feature selection methods in an elegant, efficient, and professional manner, using Python, scikit-learn, pandas, MLxtend and Feature-engine.

At the end of the course, you will have a variety of tools to select and compare different feature subsets and identify the ones that return the simplest, yet most predictive machine learning model. This will allow you to minimize the time it takes to put your predictive models into production.

Who is this course for?

You’ve taken your first steps into data science. You know the most commonly used machine learning models. You've probably trained a few linear regression models or decision trees. You are familiar with data preprocessing and feature engineering techniques like missing data imputation and encoding categorical variables. At this stage, you’ve probably realized that many data sets contain an enormous number of features, and some of them are identical or very similar. Some of them are not predictive at all, and for some others, it is harder to say.

You wonder how you can go about finding the most predictive features. Which ones are OK to keep and which ones could you do without? You also wonder how to code the methods in a professional manner. You probably searched online and found that there is not much out there about feature selection. So you start to wonder: how are things really done in tech companies?

This course will help you! This is the most comprehensive online course on variable selection. You will learn about a huge variety of feature selection procedures used worldwide in different organizations and in data science competitions to select the most predictive features.

Course prerequisites

To get the most out of this course, you need a basic knowledge of machine learning and familiarity with the most common predictive models, like linear and logistic regression, decision trees, and random forests, and the metrics used to evaluate model performance. You also need basic knowledge of Python and the open source libraries NumPy, pandas, and scikit-learn.

To wrap up

This comprehensive feature selection course contains approximately 70 lectures spread across 8 hours of video, and ALL topics include hands-on Python code examples that you can use for reference, practice, and re-use in your own projects.

Course Curriculum

Welcome
Course material
Feature selection
Filter methods | Basic
Filter methods | Correlation
Filter methods | Statistical tests
Filter methods | Other methods and metrics
Wrapper methods
Embedded methods | Linear models
Embedded methods | Lasso regularisation
Embedded methods | Trees
Hybrid feature selection methods
Congratulations! You did it!

Frequently Asked Questions

When does the course begin and end?

You can start taking the course from the moment you enroll. The course is self-paced, so you can watch the tutorials and apply what you learn whenever you find it most convenient.

For how long can I access the course?

The course has lifetime access. This means that once you enroll, you will have unlimited access to the course for as long as you like.

What if I don't like the course?

There is a 30-day money back guarantee. If you don't find the course useful, contact us within the first 30 days of purchase and you will get a full refund.

Will I get a certificate?

Yes, you'll get a certificate of completion after completing all lectures, quizzes and assignments.

Can I ask questions if I get stuck?

Absolutely! Under each video there is a comments section. Just pop your question in there, and the instructors will reply as soon as they can.

Is the course mobile-friendly?

It is indeed. Download Teachable's app from Google Play or the Apple App Store, log in with your Train in Data credentials and enjoy the courses on your mobile phone.

Can I get an invoice for my company?

Yes — we’re happy to help with company invoices 😊

Just send us an email at pricing@trainindata.com with your company’s name, tax ID, physical address (and company number, if applicable), the course or courses you’d like to purchase, and the email address(es) of the students to be enrolled. We’ll send you the invoice, and once everything looks good on your side, you can proceed with payment. As soon as we receive it, we’ll take care of enrolling the students.

Can I gift a course?

Yes, absolutely! 🎁

To gift a course, just email us at pricing@trainindata.com with the name and address of the person giving the gift, along with the email address of the person who will be enrolled. Don't forget to mention the course you want to gift. We’ll send you an invoice, and once payment is complete, we’ll take care of enrolling the recipient in the course.

Frequently bought together

Feature Engineering for Machine Learning

Learn imputation, variable encoding, discretization, feature extraction, how to work with datetime, outliers, and more.

Soledad Galli