compare groups of items regarding overlaps

Ulrike Lohner on 23 Jun 2021
Commented: Ulrike Lohner on 24 Jun 2021
Short background: I have a number of texts that are grouped by their value for a number of variables (about 5 distinct values per variable), meaning that each text appears in one value group of each variable (group A might be text1, text7, text23, text38, etc.).
Goal: I want to compare each of these primary groups with regard to any overlap of the items they contain, using one group as a basis; i.e. I take group A and check which texts of this group also appear in any group of another variable (of course, I am not comparing groups that belong to the same variable, since there would obviously be no overlap). In the end, I'd like to be able to say that e.g. texts 1, 7, 23 and 38 all appear in groups A, F, J, K and so forth.
That means I do not want to compare the means or any values of the data groups, but want to know which groups share which items.
Since I am not that experienced yet, I can't seem to find the right code to start with; any ideas about how to tackle this task?
  3 Comments
Ulrike Lohner on 24 Jun 2021
Unfortunately, I am not allowed to post any original data due to data security issues (and the code I have so far only imports the data, so that wouldn't be any help). I can try to be more specific regarding my data, though:
Basically I have a large number of groups of strings that are organized in a table (each column is one group, each string is in its own cell); there are about 150 different strings in total and each string appears in a number of groups. However, no two groups are composed of the same combination of strings, and the groups do not have the same sizes.
I will probably need a loop that takes each column (i.e. each group) as a starting point once and checks which strings of this group are also contained in the other groups, giving me as output a new set of string clusters that only contain those strings included in the first group.
Anyway: thank you for the suggestions so far; I will dig deeper into the functions you mentioned already and will check if one of them serves my purpose.
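The column-by-column loop described in this comment could be sketched roughly as follows. Since the original data was not posted, the `groups` struct, its field names and the text names are hypothetical stand-ins for the real table, and `overlap` is just an illustrative variable name. The sketch uses `ismember` so that each result keeps the order of the base group; it does not know which groups belong to the same variable, so skipping those pairs would still have to be added.

```matlab
% Hypothetical stand-in for the real table: one cell array of text names per group.
groups = struct( ...
    'A', {{'text1','text7','text23','text38'}}, ...
    'F', {{'text7','text23','text50'}}, ...
    'J', {{'text1','text38','text99'}});

names = fieldnames(groups);
n = numel(names);

% overlap{i,j} holds the items of group i that also occur in group j
overlap = cell(n, n);

for i = 1:n
    base = groups.(names{i});          % group used as the starting point
    for j = 1:n
        if i == j
            continue                   % a group trivially overlaps with itself
        end
        other = groups.(names{j});
        % keep only those items of the base group that are found in the other group
        overlap{i,j} = base(ismember(base, other));
    end
end

% Print every non-empty overlap, one line per pair of groups
for i = 1:n
    for j = 1:n
        if i ~= j && ~isempty(overlap{i,j})
            fprintf('%s and %s share: %s\n', names{i}, names{j}, ...
                strjoin(overlap{i,j}, ', '));
        end
    end
end
```

If the groups are stored as table columns of unequal length, the padding entries (empty or missing cells) would probably need to be removed from each column before calling `ismember`.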



SALAH ALRABEEI on 23 Jun 2021
[val,ndxA,ndxB] = intersect(A,B)
This returns the overlapping values in val, together with their indices in both groups: ndxA for A and ndxB for B.
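As a small illustration of that call, here is a sketch with two made-up groups (the text names are hypothetical, since no data was posted). Note that `intersect` returns the shared items in sorted order unless the `'stable'` option is passed.

```matlab
% Hypothetical example groups, stored as cell arrays of character vectors
A = {'text1', 'text7', 'text23', 'text38'};
B = {'text7', 'text12', 'text38', 'text40'};

% val  : items present in both groups (sorted)
% ndxA : positions of those items in A
% ndxB : positions of those items in B
[val, ndxA, ndxB] = intersect(A, B);

disp(val)        % the shared items
disp(A(ndxA))    % the same items, looked up in A by index
disp(B(ndxB))    % the same items, looked up in B by index
```

Calling `intersect(A, B, 'stable')` instead keeps the items in the order in which they appear in A.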
  1 Comment
Ulrike Lohner on 24 Jun 2021
Thank you for this suggestion! I will have a closer look at that function and check whether it serves the right purpose.

