Why Feature Selection?
Dealing with feature sets that contain thousands of features, or even more, is not always a good idea:
- The curse of dimensionality: As the dimensionality increases, the volume of the space increases so fast that the available data becomes sparse.
- High dimensionality tends to make models more complex and difficult to interpret.
- High-dimensional data often leads models to overfit the training data.
CAUTION: Feature selection is not feature extraction! Feature selection keeps a subset of the original features, whereas feature extraction (e.g., PCA) constructs new features by combining the original ones.
Strategies for Feature Selection
Feature selection returns a subset of the original set of features. There are three main strategies:
- Filter methods: they select features using statistics computed independently of any model, such as the correlation or mutual information between each feature and the target variable.
- Wrapper methods: they capture interactions between features by recursively building models on different feature subsets and keeping the subset that yields the best-performing model.
- Embedded methods: they rely on machine learning models that rank and score features as a by-product of training, based on their importance (e.g., L1-regularized linear models or tree ensembles).
Example of Filter Methods
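A minimal sketch using scikit-learn's SelectKBest with mutual information as the scoring function; the breast-cancer dataset and the choice of k=10 are illustrative assumptions, not prescriptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Toy dataset: 569 samples, 30 numeric features (illustrative choice).
X, y = load_breast_cancer(return_X_y=True)

# Score each feature against the target with mutual information,
# independently of any downstream model, and keep the top 10.
selector = SelectKBest(score_func=mutual_info_classif, k=10)
X_selected = selector.fit_transform(X, y)

print(X.shape, "->", X_selected.shape)  # (569, 30) -> (569, 10)
```

Because the scores are computed once, without training a predictive model, filter methods are cheap, but they cannot account for interactions between features.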
Example of Wrapper Methods
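A minimal sketch of Recursive Feature Elimination (RFE) in scikit-learn; the logistic-regression estimator and the target of 10 features are illustrative assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# RFE repeatedly fits the estimator, drops the weakest feature
# (lowest absolute coefficient), and refits until 10 remain.
estimator = LogisticRegression(max_iter=5000)
rfe = RFE(estimator=estimator, n_features_to_select=10, step=1)
rfe.fit(X, y)

print("Selected features:", rfe.support_)   # boolean mask
print("Ranking (1 = kept):", rfe.ranking_)
```

Because a model is retrained at every step, wrapper methods are far more expensive than filters, but they do capture how features perform together; scikit-learn's RFECV variant additionally cross-validates to choose the number of features automatically.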
Example of Embedded Methods
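A minimal sketch of an embedded approach using a random forest's built-in feature importances together with SelectFromModel; the forest hyperparameters and the median threshold are illustrative assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

X, y = load_breast_cancer(return_X_y=True)

# The forest's impurity-based feature_importances_ come for free
# as a by-product of training, which is what makes this "embedded".
forest = RandomForestClassifier(n_estimators=200, random_state=0)

# Keep only the features whose importance exceeds the median importance.
selector = SelectFromModel(forest, threshold="median")
X_selected = selector.fit_transform(X, y)

print(X.shape, "->", X_selected.shape)
```

An L1-regularized model (e.g., Lasso) used the same way is another common embedded choice, since the penalty drives the coefficients of uninformative features to exactly zero.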