How to compare strings and find most common values between them?

Question

Johny on 1 Dec 2013

https://jp.mathworks.com/matlabcentral/answers/108160-how-to-compare-strings-and-find-most-common-values-between-them

Commented: Walter Roberson on 1 Dec 2013

So I have a series of strings stored in a cell array, each occupying its own element; for now we will call the cell array "tagstrings". Some of the strings are exact repeats of each other, but they are not all identical or the same length. "tagstrings" looks something like this:

   I ate 34 slices of pizza
   I ate 34 slices of pizza
   I ate 34 slices of pizza
   I ate 35 slices of pizza
   I ate 89 slices of cheese pizza

I want to take all of those strings, find the words or terms that appear most often, and collect them into a single string that summarizes the most common information word by word. I would also like to limit the number of terms or characters in the resulting string. The order in which the terms appear in the result is not important to me, but it would be great if it could be preserved.

In the example above, the code I want would see that "35", "89", and "cheese", for example, are not very common, and would output a string that excludes these terms. I tried using strread to split each string into its own cell array of words, but I did not know what to do after that, and it only got more complicated.

Answer 1

Walter Roberson on 1 Dec 2013

https://jp.mathworks.com/matlabcentral/answers/108160-how-to-compare-strings-and-find-most-common-values-between-them#answer_116858

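% Split every string on whitespace; the result is one 1-by-N cell of words per string.
splitphrases_cells = regexp(YourCellStringArray, '\s+', 'split');
% Concatenate the per-string word lists into a single row cell array.
splitphrases = horzcat(splitphrases_cells{:});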

Now everything is word by word in one cell array of strings, ready for you to count.

Hint: read the documentation for unique(), including all of its outputs and options.
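One possible way to follow the hint is sketched below. It is not part of the original answer: the names uniqueWords, idx, counts, minCount, maxWords, and summary are made up for illustration, and the rule that a word is "common" when its total count exceeds half the number of strings is only an assumption about what the question wants. The third output of unique() maps every word to its distinct entry, and accumarray() turns that mapping into counts.

```matlab
% Example data from the question (variable name follows the question).
tagstrings = {'I ate 34 slices of pizza'; ...
              'I ate 34 slices of pizza'; ...
              'I ate 34 slices of pizza'; ...
              'I ate 35 slices of pizza'; ...
              'I ate 89 slices of cheese pizza'};

% Split on whitespace and pool all words into one row cell array.
splitphrases_cells = regexp(tagstrings, '\s+', 'split');
splitphrases = horzcat(splitphrases_cells{:});

% Distinct words, plus an index mapping each word to its distinct entry.
[uniqueWords, ~, idx] = unique(splitphrases);

% Number of occurrences of each distinct word.
counts = accumarray(idx(:), 1);

% ASSUMPTION: "common" means the total count exceeds half the number of strings.
minCount = numel(tagstrings) / 2;
isCommon = counts > minCount;

% Keep the common words in the order they appear in the first string.
firstWords = splitphrases_cells{1};
summaryWords = firstWords(ismember(firstWords, uniqueWords(isCommon)));

% ASSUMPTION: hypothetical cap on the number of words in the result.
maxWords = 6;
summaryWords = summaryWords(1:min(maxWords, numel(summaryWords)));

summary = strjoin(summaryWords, ' ');
disp(summary)   % I ate 34 slices of pizza
```

Taking the word order from the first string is a shortcut that works here because the first string contains every common word; other data may need a different ordering rule. A character limit could be applied in the same spirit by dropping trailing words until the joined string is short enough.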

2 Comments

Johny on 1 Dec 2013

Edited: Johny on 1 Dec 2013

Is there a way to perform the same operation as yours using the following code that I have? In this case, "YourCellStringArray" is a 7-by-1 cell array of strings, if that is significant:

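numoftimes = length(YourCellStringArray);
words = cell(numoftimes, 1);   % preallocate one cell per string
for i = 1:numoftimes
    % strread with '%s' returns the whitespace-separated words as an N-by-1 cell
    words{i} = strread(YourCellStringArray{i}, '%s');
end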

What exactly is the difference? With my method, "words" is a 7-by-1 cell array in which each cell just shows "9x1 cell". Clicking any of those gives me a vertical listing of the words in the string at that position, as opposed to the horizontal one that your code produces (yours makes 1x9 cells instead).

Walter Roberson on 1 Dec 2013

Splitting that way is fine.
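To illustrate why the orientation should not matter for counting, here is a small sketch under stated assumptions. The two-string input is made up, and the column cells are produced by reshaping the regexp output rather than by calling strread, purely to keep the example self-contained. Names such as rowCells, colCells, sameWords, and sameCounts are hypothetical.

```matlab
% Hypothetical two-string input; the variable name follows the thread.
YourCellStringArray = {'I ate 34 slices of pizza'; ...
                       'I ate 35 slices of pizza'};

% Row form: regexp with 'split' gives one 1-by-N cell of words per string.
rowCells = regexp(YourCellStringArray, '\s+', 'split');
rowWords = horzcat(rowCells{:});     % 1-by-12 cell array

% Column form: reshape each word list to N-by-1, mimicking the loop's output.
colCells = cellfun(@(c) c(:), rowCells, 'UniformOutput', false);
colWords = vertcat(colCells{:});     % 12-by-1 cell array

% Both poolings hold the same words in the same order.
sameWords = isequal(rowWords(:), colWords);

% Counting with unique/accumarray gives the same result either way.
[uRow, ~, jRow] = unique(rowWords);
[uCol, ~, jCol] = unique(colWords);
countsRow = accumarray(jRow(:), 1);
countsCol = accumarray(jCol(:), 1);
sameCounts = isequal(uRow(:), uCol(:)) && isequal(countsRow, countsCol);
```

The only practical difference is which concatenation function pools the words: horzcat for 1-by-N cells, vertcat for N-by-1 cells.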
