Please tell us about feature selection.

Takeharu Kiso on 14 May 2024
Answered: Prasanna on 26 June 2024
One of the built-in feature selection algorithms, out-of-bag (OOB) predictor importance for random forests of classification trees, was used to select features for machine learning (see "Selecting predictors for random forests" in the MATLAB & Simulink documentation, MathWorks Japan); it indicates the importance of each predictor.
A histogram of the importance values was created and sorted, and from it we identified the point where the predictors' importance differed significantly, feeding into the machine learning model only the predictors at or above that cutoff.
Can this method, plotting the histogram, visually finding where the values differ significantly, and selecting a threshold there, be considered one of the valid options for narrowing down the number of features?
We would be very grateful if you could provide us with some guidance.
Thank you very much in advance.
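For reference, the procedure described above can be sketched roughly as follows. This is an illustrative Python/scikit-learn analogue on synthetic data, not our actual MATLAB workflow, and the "visual" step is replaced by taking the largest gap between consecutive sorted importance values:

```python
# Sketch of "sort the importance values and keep everything above the
# biggest visible drop" (synthetic data; illustrative only).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=8,
                           n_informative=3, n_redundant=0, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
imp = rf.feature_importances_

order = np.argsort(imp)[::-1]       # predictors sorted by importance, descending
sorted_imp = imp[order]

# Automate the visual step: put the cutoff at the largest gap between
# consecutive sorted importance values.
gaps = sorted_imp[:-1] - sorted_imp[1:]
cut = int(np.argmax(gaps)) + 1      # keep everything before the biggest drop
selected = order[:cut]
print("selected predictor indices:", selected)
```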

Answers (1)

Prasanna on 26 June 2024
Hi Takeharu,
Feature selection is a critical step in the machine learning pipeline that involves selecting a subset of relevant features (predictors) for use in model construction. One of the built-in feature selection algorithms in MATLAB is the out-of-bag (OOB) error estimation for random forests in classification trees. This method evaluates the importance of each predictor by measuring how much the prediction error increases when the values of that predictor are permuted while all others are left unchanged.
The method of using histograms and visual inspection to set a threshold for feature selection can indeed be considered a valid approach. It is a practical way to narrow down the number of features by leveraging the importance scores provided by the random forest algorithm. While it may not be as rigorous as some other statistical methods, it offers a straightforward and intuitive means of feature selection, especially when dealing with many predictors.
However, it is essential to note that this approach has its limitations. The choice of threshold may vary depending on the specific dataset and problem at hand. Additionally, visual inspection might not always capture subtle but significant differences in predictor importance.
To enhance the robustness of your feature selection process, you might consider combining this method with other techniques such as:
  • cross-validation
  • recursive feature elimination
  • regularization methods
By combining the visual inspection method with these additional techniques, you can create a more robust and reliable feature selection process that enhances the performance and interpretability of your machine learning models.
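For example, recursive feature elimination with cross-validation lets a validation score, rather than visual inspection, decide how many predictors to keep. The sketch below uses scikit-learn names on synthetic data, purely for illustration:

```python
# RFECV: repeatedly drop the weakest predictor and pick the subset size
# that maximizes the cross-validated score.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=10, n_informative=3,
                           n_redundant=0, random_state=2)

selector = RFECV(LogisticRegression(max_iter=1000), cv=5).fit(X, y)
print("features kept:", int(selector.n_features_))
print("mask:", selector.support_)
```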
Refer to the MATLAB documentation on feature selection and on improving classification trees for more information.
Hope this helps.
