KNN classifier with ROC Analysis
Hi Smart guys,
I wrote the following code to plot the ROC curve for my KNN classifier:
    load fisheriris;
    features                                = meas;
    featureSelcted                          = features;
    numFeatures                             = size(meas,2);   % columns are features, rows are observations
    %% Define ground truth
    groundTruthGroup                        = species;
    %% Construct a KNN classifier
    KNNClassifierObject                     = ClassificationKNN.fit(featureSelcted, groundTruthGroup, 'NumNeighbors', 3, 'Distance', 'euclidean');
    % Predict resubstitution response of k-nearest neighbor classifier
    [KNNLabel, KNNScore]                    = resubPredict(KNNClassifierObject);
    % Numeric labels: setosa = 1, versicolor = 0, virginica = -1
    groundTruthNumericalLable               = [ones(50,1); zeros(50,1); -1.*ones(50,1)];
    % ROC for class 1 (setosa), scored by the first column of KNNScore
    [FPR, TPR, Thr, AUC, OPTROCPT]          = perfcurve(groundTruthNumericalLable(:,1), KNNScore(:,1), 1);
Then we can plot TPR against FPR to get the ROC curve.
However, the FPR and TPR differ from what I get with my own implementation. The code above displays only three points on the ROC curve, whereas my implementation below displays 151 points, since the data set has 150 observations.
    patternsKNN                             = [KNNScore(:,1), groundTruthNumericalLable(:,1)];
    patternsKNN                             = sortrows(patternsKNN, -1);
    groundTruthPattern                      = patternsKNN(:,2);
    POS                                     = cumsum(groundTruthPattern==1);
    TPR                                     = POS/sum(groundTruthPattern==1);
    NEG                                     = cumsum(groundTruthPattern==0);
    FPR                                     = NEG/sum(groundTruthPattern==0);
    FPR                                     = [0; FPR];
    TPR                                     = [0; TPR];
May I ask how to tune `perfcurve` so that it outputs all the points of the ROC curve? Thanks a lot.
A.
1 Comment
  Alessandro on 20 Mar 2013
  Edited: Alessandro on 20 Mar 2013
Try adding 'xvals','all':
    [FPR, TPR, Thr, AUC, OPTROCPT] = perfcurve(groundTruthNumericalLable(:,1), KNNScore(:,1), 1, 'xvals', 'all');
Accepted Answer
  Ilya on 19 Mar 2013
For 3 neighbors, the posterior probability has at most 4 distinct values, namely (0:3)/3, and likely fewer for the Fisher iris data because the classes are well separated. With 4 distinct score values, you won't see more than 4 points on the ROC curve. Your implementation does not account for such ties: it treats every observation as its own threshold, even when many observations share the same score.
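To make the point about ties concrete, here is a minimal sketch in base MATLAB (no toolbox calls). The scores and labels are made-up toy data, not the iris results, and the variable names are only illustrative. It is not how `perfcurve` is implemented; it just shows one way to collapse tied scores so that each distinct score value contributes a single ROC point, with the trivial starting point (0,0) prepended as in the question's code.

```matlab
% Hypothetical toy data: scores of the kind a 3-NN classifier produces,
% i.e. multiples of 1/3. Labels: 1 = positive, 0 = negative.
scores = [1; 1; 1; 2/3; 2/3; 1/3; 1/3; 0; 0; 0];
labels = [1; 1; 1; 1;   0;   1;   0;   0; 0; 0];

% Sort observations by descending score.
[sortedScores, order] = sort(scores, 'descend');
sortedLabels = labels(order);

% Cumulative counts as if every observation were its own threshold.
cumTP = cumsum(sortedLabels == 1);
cumFP = cumsum(sortedLabels == 0);

% Observations with equal scores cross a threshold together, so keep only
% the last observation of each run of tied scores.
isLastOfTie = [diff(sortedScores) ~= 0; true];

% One ROC point per distinct score, plus the starting point (0,0).
TPR = [0; cumTP(isLastOfTie) / sum(labels == 1)];
FPR = [0; cumFP(isLastOfTie) / sum(labels == 0)];

numPoints = numel(FPR);   % number of distinct scores + 1
```

With 10 observations but only 4 distinct score values, the curve has 4 points after the origin rather than 10. A per-observation `cumsum` without the `isLastOfTie` step would instead report a point for every observation, and the points inside a tie would depend on the arbitrary order of the tied rows.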
2 Comments
  Ilya on 20 Mar 2013
Yes, it does mean that your implementation is wrong. As I said, you can't have more points on an ROC curve than distinct threshold values. This is actually quite simple once you think about it.
More Answers (0)


