フィルターのクリア

Sentimental Analysis using SVM

7 ビュー (過去 30 日間)
Sanguk
Sanguk 2023 年 4 月 13 日
編集済み: Sanguk 2023 年 4 月 14 日
Hi,
I am a newbie to Matlab and coding.
I am trying to do sentimental analysis with tweet text data extracted from twitter API using SVM.
I have managed to preprocess the text. But I am stuck there.
I need to extract features and want to see the word frequency and user's perception towards target companies and so forth. Lastly, I need to evaluate the performance too.
how should I extract features and train and test the data?
Thank you.

採用された回答

Drew
Drew 2023 年 4 月 13 日
You can also look at these doc pages:
If you find this answer helpful, please remember to "accept" the answer.
  1 件のコメント
Sanguk
Sanguk 2023 年 4 月 13 日
編集済み: Sanguk 2023 年 4 月 14 日
Hi,
I followed the step in the link, but I get errors saying "Error using horzcat
Dimensions of arrays being concatenated are not consistent.
Error in reun (line 44)
XTrain = [XTrain, sentimentScoresTrain];"
how should I fix it?
filename = "sentiment_irrelevantdrop_all";
data = readtable(filename,'TextType','string');
data.sentiment = categorical(data.sentiment);
% Split dataset into training and test sets using holdout
cvp = cvpartition(data.sentiment, 'Holdout', 0.1);
dataTrain = data(cvp.training, :);
dataTest = data(cvp.test, :);
% Extract review text and sentiment labels from training and test set
textDataTrain = dataTrain.text;
textDataTest = dataTest.text;
YTrain = dataTrain.sentiment;
YTest = dataTest.sentiment;
% Preprocess training set
documents = preprocessText(textDataTrain);
% Create bag of words and remove infrequent words
bag = bagOfWords(documents);
bag = removeInfrequentWords(bag,2);
[bag,idx] = removeEmptyDocuments(bag);
YTrain(idx) = [];
% Encode training set using bag of words
XTrain = bag.Counts;
% Train SVM classifier
mdl = fitcecoc(XTrain, YTrain, "Learners", "linear");
% Preprocess test set
documentsTest = preprocessText(textDataTest);
documentsTrain = preprocessText(textDataTrain);
% Encode test set using bag of words
XTest = encode(bag, documentsTest);
% Compute sentiment scores for training and test sets using VADER
sentimentScoresTrain = vaderSentimentScores(documentsTrain);
sentimentScoresTest = vaderSentimentScores(documentsTest);
% Concatenate sentiment scores with bag of words features
XTrain = [XTrain, sentimentScoresTrain];
XTest = [XTest, sentimentScoresTest];
% Build new svm model using both bag-of-words and vader sentiment scores as
% features
mdl2 = fitcecoc(XTrain, YTrain, "Learners", "linear");
% Predict sentiment labels for test set
YPred = predict(mdl, XTest);
% Evaluate performance
accuracy = sum(YPred == YTest) / numel(YTest);
fprintf("Accuracy: %.2f%%\n", accuracy * 100);
confusion = confusionmat(YTest, YPred);
truePositive = confusion(1, 1);
falsePositive = confusion(2, 1);
trueNegative = confusion(2, 2);
falseNegative = confusion(1, 2);
% Compute precision, recall, and F-measure
precision = truePositive / (truePositive + falsePositive);
recall = truePositive / (truePositive + falseNegative);
fMeasure = 2 * precision * recall / (precision + recall);
% Compute accuracy
accuracy2 = (truePositive + trueNegative) / numel(YTest);
% Display results
disp(['True positive: ' num2str(truePositive)]);
disp(['False positive: ' num2str(falsePositive)]);
disp(['True negative: ' num2str(trueNegative)]);
disp(['False negative: ' num2str(falseNegative)]);
disp(['Precision: ' num2str(precision)]);
disp(['Recall: ' num2str(recall)]);
disp(['F-measure: ' num2str(fMeasure)]);
function documents = preprocessText(textData)
documents = tokenizedDocument(textData);
documents = addPartOfSpeechDetails(documents);
documents = removeStopWords(documents);
documents = erasePunctuation(documents);
documents = removeShortWords(documents,2);
documents = removeLongWords(documents,15);
end
Thank you

サインインしてコメントする。

その他の回答 (0 件)

カテゴリ

Help Center および File ExchangeClassification についてさらに検索

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by