MATLAB Answers

How to use the ReliefF algorithm for feature selection?

phdcomputer Eng on 15 Feb 2020
Answered: Jingwei Too on 23 Jul 2020
I want to use the ReliefF algorithm for a feature selection problem. I have a dataset (CNS.mat), and I want to apply ReliefF to it, obtain the top 30 features, and then apply a classifier to the selected features. I read how this algorithm is called in the MATLAB Help:
[RANKED,WEIGHT] = relieff(X,Y,K)
[RANKED,WEIGHT] = relieff(X,Y,K,'PARAM1',val1,'PARAM2',val2,...)
I also studied this ReliefF example in the MATLAB Help:
load fisheriris
[ranked,weight] = relieff(meas,species,10)
ranked =
4 3 1 2
weight =
0.1399 0.1226 0.3590 0.3754
But I don't know whether this code does what I described (select the top features and save them as the input for classification). My aim is to apply the ReliefF algorithm as feature selection on the CNS data and compare its results with other algorithms such as SVM-RFE and InfoGain.
I would be very grateful for your opinions on how to use ReliefF for feature selection.

Answers (2)

MeLearningProgramming on 23 Jul 2020
Edited: MeLearningProgramming on 23 Jul 2020
Hey,
I am using relieff as well. You have to be careful about how the outputs are organised.
weight = 0.1399 0.1226 0.3590 0.3754
means that the first parameter in meas got the weight 0.1399 (first entry = first parameter of meas).
ranked = 4 3 1 2 does not mean that the first parameter of meas has ranking number 4.
It means that the first parameter in meas got ranking position 3 (the number 1 sits at position 3), and that parameter 4 is ranked first.
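To make the relationship between the two outputs concrete, here is a minimal sketch using the fisheriris example from the question. The variable names sortedWeights and position are only illustrative and are not part of relieff.

```matlab
% Requires Statistics and Machine Learning Toolbox (relieff, fisheriris).
load fisheriris                        % meas: 150x4 predictors, species: labels
[ranked, weight] = relieff(meas, species, 10);

% weight(j) is the weight of predictor column j.
% ranked(i) is the column index of the i-th most important predictor,
% so indexing weight with ranked lists the weights in descending order.
sortedWeights = weight(ranked);

% Invert the ranking: position(j) is the rank position of column j.
% For ranked = 4 3 1 2 (as in the question) this gives position = 3 4 2 1.
position(ranked) = 1:numel(ranked);

disp(table((1:numel(weight))', weight', position', ...
    'VariableNames', {'Predictor', 'Weight', 'RankPosition'}))
```

Reading the table row by row avoids the common mistake of treating ranked as a list of rank positions.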
How to use relieff?
X should be a matrix of size datapoints x parameters (in my case, for example, 147510x10), and y should be a vector of size datapoints x 1 (147510x1).
First you should estimate the best k value, like this:
ParamLabels = {'P1','P2','P3','P4','P5','P6','P7','P8','P9','P10'};
for k=1:200 % or parfor
    [idx,weights] = relieff(X,y,k);
    RankImportanceIdx(:,k) = idx';        % ranking for this k
    RankImportanceWeight(:,k) = weights'; % weights for this k
end
A simple plot of RankImportanceWeight shows the k value from which the results stay the same => best k value.
In my case, for example, the best k value is 75. Afterwards you could plot the results like this:
plot(RankImportanceWeight') % one line per parameter, k on the x-axis
title('Relief algorithm weights vs. k-values','FontWeight','normal')
xlabel('size of k-nearest neighbor'); ylabel('weights');
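If reading the stable region off a plot is too subjective, the same idea can be checked numerically. This is only a sketch: it uses fisheriris as stand-in data, and kMax, tol and kStable are hypothetical names and values chosen for illustration, not something relieff provides.

```matlab
% Requires Statistics and Machine Learning Toolbox (relieff, fisheriris).
load fisheriris
kMax = 40;                               % illustrative upper bound for k
W = zeros(size(meas, 2), kMax);          % preallocate: parameters x k
for k = 1:kMax
    [~, w] = relieff(meas, species, k);
    W(:, k) = w(:);                      % store weights for this k as a column
end

% Largest change of any weight between consecutive k values.
% delta(i) compares k = i with k = i + 1.
delta = max(abs(diff(W, 1, 2)), [], 1);  % 1 x (kMax-1)

tol = 0.01;                              % hypothetical tolerance, tune for your data
lastBig = find(delta >= tol, 1, 'last'); % last step that still changed noticeably
if isempty(lastBig)
    kStable = 1;                         % weights never moved by tol or more
else
    kStable = lastBig + 1;               % first k after the last large change
end
fprintf('Weights change by less than %.2g from k = %d onwards\n', tol, kStable);
```

If kStable equals kMax, the weights have not settled within the tested range and a larger kMax may be needed.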
You could also create a table, like this:
for pidx=1:size(ParamLabels,2)
    % rank position of parameter pidx at k = 75 (single-output find returns the position)
    a = find(strcmp(ParamLabels(RankImportanceIdx(1:end,75)),ParamLabels{pidx}));
    RankImportanceTbl{pidx,1} = a;
end
With this you can choose the 30 parameters that best fit your y.
Hope this helps you adapt it to your problem.
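For the original goal (keep the top 30 features, then classify), a minimal sketch could look like the following. It assumes the ionosphere sample data as a stand-in for CNS.mat, whose variable names are not given in the question, and it uses a k-nearest-neighbour classifier purely as an example since the question does not name one. The values of k and the hold-out fraction are also assumptions.

```matlab
% Requires Statistics and Machine Learning Toolbox.
load ionosphere                          % stand-in data: X is 351x34, Y holds class labels
rng(1)                                   % reproducible split
cv  = cvpartition(Y, 'HoldOut', 0.3);    % 70% training, 30% test
Xtr = X(training(cv), :);  Ytr = Y(training(cv));
Xte = X(test(cv), :);      Yte = Y(test(cv));

k    = 10;                               % illustrative; use the k that is stable for your data
nTop = 30;                               % number of features to keep
ranked = relieff(Xtr, Ytr, k);           % rank on the training data only
topIdx = ranked(1:nTop);                 % column indices of the top features

% Example classifier (an assumption): 5-nearest-neighbour on the selected columns.
mdl  = fitcknn(Xtr(:, topIdx), Ytr, 'NumNeighbors', 5);
pred = predict(mdl, Xte(:, topIdx));
acc  = mean(strcmp(pred, Yte));
fprintf('Test accuracy with %d features: %.3f\n', nTop, acc);
```

Ranking on the training portion only keeps the test set untouched, which matters when the result is compared against other selection methods such as SVM-RFE or InfoGain.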
