Find frequency of words from different books

L on 25 Mar 2024
Commented: Voss on 25 Mar 2024
I have a cell array of data collected from 5 different books
This is one of the cell arrays. It gives me the count of each word in the book (I used count{ii} = tabulate(text{ii})).
I need to create a single combined count for all the words found across the 5 books. So, for example, for the word 'the', I have to sum the frequencies from all 5 cells.
I was thinking about using a table but I really can't get it done.
Any ideas?

Accepted Answer

Voss on 25 Mar 2024
Maybe this will help:
% example data:
counts = { ...
{'the' 464; 'project' 87; 'of' 253} ...
{'the' 300; 'of' 314; 'nothing' 17; 'project' 13} ...
{'the' 100; 'price' 99; 'of' 114; 'everything' 12; 'value' 88; 'nothing' 54} ...
}
counts = 1x3 cell array
{3x2 cell} {4x2 cell} {6x2 cell}
% concatenate the cell arrays in counts and convert into a table
T = cell2table(vertcat(counts{:}),'VariableNames',{'word','count'})
T = 13x2 table
         word         count
    ______________    _____
    {'the'       }     464
    {'project'   }      87
    {'of'        }     253
    {'the'       }     300
    {'of'        }     314
    {'nothing'   }      17
    {'project'   }      13
    {'the'       }     100
    {'price'     }      99
    {'of'        }     114
    {'everything'}      12
    {'value'     }      88
    {'nothing'   }      54
% use groupsummary to find the total counts
G = groupsummary(T,'word','sum')
G = 7x3 table
         word         GroupCount    sum_count
    ______________    __________    _________
    {'everything'}        1             12
    {'nothing'   }        2             71
    {'of'        }        3            681
    {'price'     }        1             99
    {'project'   }        2            100
    {'the'       }        3            864
    {'value'     }        1             88
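One detail that may matter with the real data: for a cell array of words, tabulate returns three columns (value, count, percent), not the two used in the example above. The following is a minimal sketch of trimming each book's tally to its first two columns before concatenating. The variable names count and wc and the hard-coded three-column cells are hypothetical stand-ins for actual tabulate output, and the percentages are only illustrative.

```matlab
% Hypothetical stand-ins for count{ii} = tabulate(text{ii}):
% columns are word, count, percent (percent values are illustrative only)
count = { ...
    {'the' 464 57.7; 'project' 87 10.8; 'of' 253 31.5} ...
    {'the' 300 46.6; 'of' 314 48.8; 'nothing' 17 2.6; 'project' 13 2.0} ...
    };

% keep only the word and count columns of each book's tally
wc = cellfun(@(c) c(:,1:2), count, 'UniformOutput', false);

% stack all books and sum the counts per word, as in the answer above
T = cell2table(vertcat(wc{:}), 'VariableNames', {'word','count'});
G = groupsummary(T, 'word', 'sum', 'count');
```

Dropping the percent column first avoids summing per-book percentages, which would not be meaningful across books of different lengths.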
2 Comments
L on 25 Mar 2024
That is exactly what I needed. What does the column GroupCount mean?
Voss on 25 Mar 2024
You're welcome!
GroupCount is the number of times each word appears in the table T, so that would correspond to the number of books each word appears in. I don't think you need that information (it's automatically included by groupsummary), and you can remove it.
% example data:
counts = { ...
{'the' 464; 'project' 87; 'of' 253} ...
{'the' 300; 'of' 314; 'nothing' 17; 'project' 13} ...
{'the' 100; 'price' 99; 'of' 114; 'everything' 12; 'value' 88; 'nothing' 54} ...
};
% concatenate the cell arrays in counts and convert into a table
T = cell2table(vertcat(counts{:}),'VariableNames',{'word','count'});
% use groupsummary to find the total counts
G = groupsummary(T,'word','sum');
% remove GroupCount
G = removevars(G,'GroupCount')
G = 7x2 table
         word         sum_count
    ______________    _________
    {'everything'}        12
    {'nothing'   }        71
    {'of'        }       681
    {'price'     }        99
    {'project'   }       100
    {'the'       }       864
    {'value'     }        88
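To make the meaning of GroupCount and sum_count more concrete, here is a rough sketch that reproduces both columns by hand with unique and accumarray. This is not what groupsummary does internally; it is only an equivalent calculation on the same example data. The names C, idx, totals, nBooks and G2 are made up for illustration.

```matlab
% same example data as above
counts = { ...
    {'the' 464; 'project' 87; 'of' 253} ...
    {'the' 300; 'of' 314; 'nothing' 17; 'project' 13} ...
    {'the' 100; 'price' 99; 'of' 114; 'everything' 12; 'value' 88; 'nothing' 54} ...
    };

C = vertcat(counts{:});                       % 13x2 cell: word, count
[words, ~, idx] = unique(C(:,1));             % idx maps each row to its unique word
totals = accumarray(idx, cell2mat(C(:,2)));   % summed count per word (sum_count)
nBooks = accumarray(idx, 1);                  % rows per word (GroupCount)

% assemble a table and list the most frequent words first
G2 = table(words, nBooks, totals, ...
    'VariableNames', {'word','GroupCount','sum_count'});
G2 = sortrows(G2, 'sum_count', 'descend');
```

Because each book contributes at most one row per word, the number of rows per word equals the number of books the word appears in.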
