Word Count in a PDF file
Ahmed Alsaadi
20 Dec 2018
Edited: Omer Yasin Birey
21 Dec 2018
I have a PDF file, "EHP.pdf", and I want to count the total number of words in it. The file has many sections, and I want to exclude the last section from the count. Any suggestions?
Accepted Answer
Omer Yasin Birey
20 Dec 2018
Edited: Omer Yasin Birey
21 Dec 2018
Hi Ahmed, you can use extractFileText to read the PDF into a string. Choose a start word and an end word. The end word must be unique, because counting stops at the first place MATLAB encounters it. This way you count only the words between the start word and the end word.
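str = extractFileText("EHP.pdf"); % read the whole PDF into one string (Text Analytics Toolbox)
i = strfind(str,"firstWord"); % replace "firstWord" with the first word of your PDF
ii = strfind(str,"lastWord"); % replace "lastWord" with a distinctive word where counting should stop
start = i(1); % position of the first occurrence of the start word
fin = ii(1); % position of the first occurrence of the stop word
extracted = extractBetween(str,start,fin-1); % text from the start word up to, but not including, the stop word
uniqueWordNumbers = wordCloudCounts(extracted); % table of unique words and their counts
counter = uniqueWordNumbers(:,2); % the second column holds the counts
counterArray = table2array(counter); % convert the one-column table to a numeric array
totalWords = sum(counterArray); % total number of counted words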
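One caveat with the approach above: wordCloudCounts is meant for building word clouds and, as far as I know, preprocesses the text before counting (for example by dropping common stop words), so the sum may come out lower than a plain word count. If a plain count is what you need, the sketch below might help. It uses a made-up sample string instead of the real PDF text, and "References" is only an assumed heading for the last section, so both need to be replaced with your own values.

```matlab
% Sample text standing in for the output of extractFileText("EHP.pdf").
% (Hypothetical content, used only so the sketch runs without the PDF.)
str = "Introduction one two three. Methods four five. References six seven";

% Assumed heading of the last section; replace it with the real one.
marker = "References";

% Start positions of every occurrence of the marker in the text.
idx = strfind(str, marker);

% Keep everything before the last occurrence of the marker,
% which drops the final section from the count.
body = extractBefore(str, idx(end));

% Treat every run of non-whitespace characters as one word.
words = regexp(body, '\S+', 'match');
totalWords = numel(words);
disp(totalWords)
```

Note that idx(end) raises an error if the marker does not appear in the text at all, so it is worth checking isempty(idx) first on a real file. Punctuation attached to a word (such as "three.") is counted as part of that word, which does not change the total.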
3 Comments
Omer Yasin Birey
20 Dec 2018
Ah, you are right, Ahmed. I made a typo and also forgot a line there. Try this instead:
counter = uniqueWordNumbers(:,2);
counterArray = table2array(counter);
totalWords = sum(counterArray);
Add the table2array line, and pass its output (counterArray) to sum instead of the table.