Comparing two cell arrays of strings of different sizes

Jasper Admiraal on 27 Oct 2015
Commented: Jos (10584) on 28 Oct 2015
Dear all,
I'm trying to compare two cell arrays of strings, a lot like the following:
A = {'X_ABCDE';'X_BCDEA';'X_BCED'}
B = {'X_A';'X_B'}
and the resulting array should be:
index = [1;2;2]
as I have a matrix D = [B, C] with C containing the cells I want to end up with.
I wanted to do it using a for-loop and a nested for-loop, but, as A is 8000x1 and D is 2000x2, this will take forever. I tried using strcmp and ismember, but these only work when the cells (or strings) are identical. strfind doesn't work either, because when I use strfind I need to have a for-loop over array A, and strfind only works if the smallest of both is a string.
for iCell = 1:length(A)
index = strfind(A(iCell), cell2mat(D(iCell,1)));
end
A string in B is ALWAYS shorter than in A and a string in B can occur more than once in A. I hope you can help me out :)
2 Comments
Stephen23 on 27 Oct 2015
Edited: Stephen23 on 27 Oct 2015
You state that "I have a matrix D = [B, C] with C containing the cells I want to end up with". But if the matrix D is an output (i.e. it is defined by what you "end up with"), then how can D be accessed within the loop as D(iCell,1)? How does D get defined if C is an output?
Jasper Admiraal on 27 Oct 2015
Sorry, I got it wrong. I want to end up with a certain order of the strings from C, indicated by the array index. In my code, a cell in B contains a code which corresponds to a specification in C (in the same row).


Accepted Answer

Stephen23 on 27 Oct 2015
Edited: Stephen23 on 27 Oct 2015
Use strncmp (not strcmp) to make the code simpler. Loop over the smaller cell array B, and pass the larger cell array A whole to strncmp:
A = {'X_ABCDE';'X_BCDEA';'X_BCED'};
B = {'X_A';'X_B'};
% X(i) is the row of B whose first 3 characters match A{i} (0 if none match)
X = zeros(size(A));
for k = 1:numel(B)
    X(strncmp(A,B{k},3)) = k;
end
produces:
>> X
X =
1
2
2
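The 3 in the loop above is hard-coded to the length of the codes in B. If the codes differ in length, one option is to compare numel(B{k}) characters instead, and the resulting index can then be used to pick rows out of C. A minimal sketch, assuming a made-up C (the question does not show its contents) and assuming no code in B is a prefix of another code:

```matlab
% Hypothetical data: the contents of C are an assumption for illustration.
A = {'X_ABCDE'; 'X_BCDEA'; 'X_BCED'};
B = {'X_A'; 'X_B'};
C = {'spec for A'; 'spec for B'};   % one specification per row of B

X = zeros(size(A));                 % 0 marks "no code in B matched"
for k = 1:numel(B)
    % Compare only as many leading characters as B{k} contains.
    X(strncmp(A, B{k}, numel(B{k}))) = k;
end

matched = X > 0;
result = cell(size(A));             % unmatched rows stay empty
result(matched) = C(X(matched));    % reorder C so that it follows A
```

If one code were a prefix of another (say 'X_A' and 'X_AB'), the later row of B would overwrite the earlier match, so sorting B by length first may be needed in that case.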
Do not use arrayfun or cellfun if you want fast code: they will be slower than a for-loop.
1 Comment
Jasper Admiraal on 27 Oct 2015
Thank you Stephen, works like a charm!


More Answers (1)

Jos (10584) on 27 Oct 2015
Assuming the entries of B are all unique:
A = {'X_ABCDE' ; 'X_BCDEA' ; 'X_BCE'}
B = {'X_A' ; 'X_B'}
index = arrayfun(@(k) find(strncmp(A{k},B,3)), 1:numel(A))
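The uniqueness assumption matters because arrayfun, by default, needs find to return exactly one value per element of A. If a string in A matches no row of B (or more than one), the call errors. A possible workaround, shown only as a sketch with a hypothetical non-matching entry 'Y_ZZZ' added to A, is to collect the hits in a cell array first:

```matlab
A = {'X_ABCDE'; 'X_BCDEA'; 'X_BCE'; 'Y_ZZZ'};   % 'Y_ZZZ' is a made-up non-match
B = {'X_A'; 'X_B'};

% 'UniformOutput', false lets each call return zero, one or several indices.
hits = arrayfun(@(k) find(strncmp(A{k}, B, 3)), (1:numel(A)).', ...
                'UniformOutput', false);

noMatch = cellfun(@isempty, hits);
index = zeros(numel(A), 1);                      % 0 marks "no match"
index(~noMatch) = cellfun(@(h) h(1), hits(~noMatch));   % keep the first hit
```

This keeps the arrayfun style of the answer while tolerating missing or repeated codes; it does nothing for speed.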
2 Comments
Stephen23 on 27 Oct 2015
Edited: Stephen23 on 27 Oct 2015
My solution is much faster than this one (timed over 100 iterations):
Elapsed time is 0.0370002 seconds. % my answer
Elapsed time is 1.623 seconds. % this answer
Using arrayfun is slower than using a loop.
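The script used for those timings is not shown, so the following is only a sketch of how such a comparison could be set up with tic and toc. The test data is an assumption: a made-up A of 8000 rows (the size mentioned in the question) and the two-row B from the answers.

```matlab
% Assumed test data sized like the question's A (8000x1); contents are made up.
A = repmat({'X_ABCDE'; 'X_BCDEA'; 'X_BCED'; 'X_ABXYZ'}, 2000, 1);
B = {'X_A'; 'X_B'};
nRuns = 100;

tic
for r = 1:nRuns
    X = zeros(size(A));
    for k = 1:numel(B)
        X(strncmp(A, B{k}, 3)) = k;   % loop over the smaller array B
    end
end
toc

tic
for r = 1:nRuns
    % one anonymous-function call per element of A
    Y = arrayfun(@(k) find(strncmp(A{k}, B, 3)), (1:numel(A)).');
end
toc

sameResult = isequal(X, Y);   % both approaches should give the same indices
```

Absolute times will depend on the machine, the MATLAB release and the size of B, so the numbers quoted above should not be expected to reproduce exactly.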
Jos (10584) on 28 Oct 2015
Of course it is. I just wanted to show a different answer ... :-)
