10.3 Permutation Feature Importance
Although the feature selection tool is very cool, it doesn't do a good job of controlling for the intercorrelation among independent variables. For example, if you use the Pearson correlation technique like I do in the video above, you are calculating bivariate correlations between the dependent variable and each of the other variables, one pair at a time. Do you remember the weakness of that calculation from chapter 3.1? Basically, the Pearson correlation doesn't control for the "overlap" in variance among the independent variables like a regression coefficient does.
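To make that weakness concrete, here is a minimal sketch in Python using NumPy. The data are synthetic and made up purely for illustration (the names x1, x2, and y are hypothetical, not from the course dataset): x2 overlaps heavily with x1, but only x1 actually drives y.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Synthetic data (an assumption for illustration only):
# x2 is correlated with x1, but only x1 has a real effect on y.
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)
y = 2.0 * x1 + rng.normal(size=n)

# Bivariate Pearson correlations with y, one feature at a time.
r_x1 = np.corrcoef(x1, y)[0, 1]
r_x2 = np.corrcoef(x2, y)[0, 1]

# Multiple regression: each coefficient is estimated while holding
# the other feature constant, so shared variance is not double counted.
X = np.column_stack([np.ones(n), x1, x2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b_x1, b_x2 = coefs

print("Pearson r with y:  x1 =", round(r_x1, 2), " x2 =", round(r_x2, 2))
print("Regression coefs:  x1 =", round(b_x1, 2), " x2 =", round(b_x2, 2))
```

In a setup like this, x2 shows a sizable correlation with y only because it overlaps with x1. Once both features are in the same regression, the coefficient on x2 shrinks toward zero while the coefficient on x1 stays close to its true value.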
More recently, the permutation feature importance technique has become a more popular and reliable alternative. Rather than focusing on the correlation, or some other bivariate statistic, this technique randomly shuffles (permutes) the values of one feature at a time and then checks how much an evaluation measure (e.g. R squared) changes when the trained model is scored on the shuffled data. Shuffling breaks the relationship between that feature and the label while leaving every other feature untouched. Features whose shuffling causes the largest drop in performance are more likely to have bigger impacts on prediction and should be kept in your model. Follow along with this video to see how it works:
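For readers who prefer code, the idea can also be sketched in a few lines of Python with NumPy. This is an illustrative sketch under stated assumptions, not the tool's actual implementation: the data are synthetic, the model is a plain least-squares regression, and the helper names (fit_linear, r_squared, permutation_importance_sketch) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000

# Synthetic data (an assumption for illustration only):
# y depends strongly on x0, weakly on x1, and not at all on x2.
X = rng.normal(size=(n, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

# Hold out rows so importance is measured on data the model has not seen.
X_train, X_test = X[:1500], X[1500:]
y_train, y_test = y[:1500], y[1500:]

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coefs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coefs

def r_squared(coefs, X, y):
    """R squared of the fitted model on (X, y)."""
    pred = np.column_stack([np.ones(len(X)), X]) @ coefs
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def permutation_importance_sketch(coefs, X, y, n_repeats=10, rng=rng):
    """Mean drop in R squared when each column is shuffled, one at a time."""
    baseline = r_squared(coefs, X, y)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle only column j, breaking its link with y.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - r_squared(coefs, X_perm, y))
        importances[j] = np.mean(drops)
    return baseline, importances

coefs = fit_linear(X_train, y_train)
baseline, importances = permutation_importance_sketch(coefs, X_test, y_test)

print("baseline R squared:", round(baseline, 3))
for name, imp in zip(["x0", "x1", "x2"], importances):
    print(name, "importance:", round(imp, 3))
```

The model is trained once and never refit; only the scoring data change. Repeating the shuffle several times and averaging reduces the noise from any single random permutation. If scikit-learn is available, its `sklearn.inspection.permutation_importance` function performs the same kind of calculation for fitted estimators.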