Query about SVM classifier

Syed on 11 Mar 2014
Commented: Walter Roberson on 11 Mar 2014
Hi, I have learned that we need to give two kinds of parameters to the SVM, one for each class. I have a question. I referred to this link and will use it to explain my query.
In the above example, it says that the trainingLabels tell the classifier whether the digit belongs to a particular category or not. But I do not understand what the proportion of "belongs" to "does not belong" samples should be. That is, if I have 5 images of the digit '1', and the trainingLabel for them is [1 1 1 1 1], should I also give 5 digits which are not the digit '1'? Then the trainingLabel would be [1 1 1 1 1 0 0 0 0 0]. Is my understanding correct? If not, what percentage of the input should belong to the group, and what percentage should not belong to it?
Please clarify.

Answers (1)

Walter Roberson on 11 Mar 2014
The labels are not percentages; they are category numbers that have no mathematical meaning. You could use (say) 39 as the trainingLabel for your digit '1', 54 for the letter 'i', 27 for the digit '2', and whatever other arbitrary values are convenient. So if the order of the samples was '1', '1', '1', '1', '1', 'i', '2', '2', 'i', 'i' then the trainingLabel would be [39 39 39 39 39 54 27 27 54 54]
Use anything consistent that is convenient. You could probably even use characters such as
trainingLabel = '11111i22ii'
as long as the vector is one position per sample and the numbering is consistent.
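As a minimal sketch of this idea, the snippet below builds one label per sample for the ten-sample order given above. The codes 39, 54 and 27 are the arbitrary values from the answer; the variable names sampleNames, codeFor, categories and altLabel are made up for illustration. It uses only base MATLAB and does not train anything.

```matlab
% Sample order from the answer: '1','1','1','1','1','i','2','2','i','i'
sampleNames = {'1';'1';'1';'1';'1';'i';'2';'2';'i';'i'};

% Arbitrary but consistent code for each category (the numbers have no meaning).
codeFor = containers.Map({'1','i','2'}, {39, 54, 27});

% Look up the code of every sample; the result is a 10-by-1 numeric column.
trainingLabel = cellfun(@(name) codeFor(name), sampleNames);
disp(trainingLabel.')

% Any other consistent numbering works equally well, for example the index
% that unique() assigns to each distinct name.
[categories, ~, altLabel] = unique(sampleNames);
disp(altLabel.')
```

Note that svmtrain itself accepts only two distinct groups, as discussed in the comments below, so a three-category vector like this one would have to be reduced to two groups before it could be passed to that function.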
  3 Comments
Syed on 11 Mar 2014
Also, my main question is the following. Say I want my classifier to classify the digit '1' correctly, and I assigned its label as 1. If I have 5 different images of the digit '1', then my
trainingImages={'a11.jpg';'a22.jpg';'a33.jpg';'a44.jpg';'a55.jpg'}; trainingLabel='1 1 1 1 1 '
Is this the right method to train the SVM? Or should there also be training for digits other than '1'? For example,
trainingImages={'a11.jpg';'a22.jpg';'a33.jpg';'a44.jpg';'a55.jpg';'b11.jpg';'z22.jpg';'s55.jpg';'q66.jpg'}; in which case trainingLabel='1 1 1 1 1 0 0 0 0'
Which of the above two methods is better?
Thanks in advance.
Walter Roberson on 11 Mar 2014
You are right, svmtrain() only accepts two distinct (non-missing) values in the grouping variable. The documentation describes Group as:
Grouping variable, which can be a categorical, numeric, or logical vector, a cell vector of strings, or a character matrix with each row representing a class label. Each element of Group specifies the group of the corresponding row of Training. Group should divide Training into two groups. Group has the same number of elements as there are rows in Training. svmtrain treats each NaN, empty string, or 'undefined' in Group as a missing value, and ignores the corresponding row of Training.
The trainingLabel should be a column vector, such as ('11111').' or ['1';'1';'1';'1';'1']
Yes, you absolutely need to train with the different classes present in the input.
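A minimal sketch of the two-class setup from the question, assuming the first five file names show the digit '1' and the remaining four do not. The names isDigitOne, labelChars and features are hypothetical, and the training calls are left as comments because feature extraction is not covered in this thread.

```matlab
% File names from the question; the first five are images of the digit '1'.
trainingImages = {'a11.jpg';'a22.jpg';'a33.jpg';'a44.jpg';'a55.jpg'; ...
                  'b11.jpg';'z22.jpg';'s55.jpg';'q66.jpg'};

% One label per image, as a column vector: true = digit '1', false = anything else.
isDigitOne = [true(5,1); false(4,1)];

% Equivalent character form, one row per sample (a 9-by-1 char matrix).
labelChars = [repmat('1', 5, 1); repmat('0', 4, 1)];

% Sanity checks: one label per sample, and both groups are present.
assert(numel(isDigitOne) == numel(trainingImages));
assert(numel(unique(isDigitOne)) == 2);

% Assumption: features is an N-by-D numeric matrix with one row per image,
% in the same order as trainingImages. How it is computed is not shown here.
% svmStruct = svmtrain(features, isDigitOne);
% In newer releases, where svmtrain is no longer available:
% model = fitcsvm(features, isDigitOne);
```

The group sizes do not have to be equal; the point of the answer is only that both groups must appear in the training data and that the label vector has one entry per row of the training matrix.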
