Vectorizing multiple string comparison
Is there a way to significantly speed up this loop, perhaps by vectorizing it? Inputs in attachment. I do not have a Matlab version with "string" functions.
d = a';
for i = 1:numel(a)
d{i} = c(strcmp(a{i}, b), :);
end
I tried working outward from the inner part with cellfun, but either I am not getting it right or it is not the right approach:
aux = cellfun(@strcmp, a, b); % does not work
2 Comments
Walter Roberson
27 Jan 2017
That file is an Octave file that would take a bunch of work to read in MATLAB.
This is the wrong resource to be asking about performance improvement for Octave.
Accepted Answer
Guillaume
26 Jan 2017
One obvious minor speed-up is to get rid of the find that serves absolutely no purpose. You can directly use the logical vector returned by strcmp:
d{i} = c(strcmp(a{i}, b), :);
For some reason, I cannot load your mat file. I'm going to assume that a is a cell array of strings, and so is b (otherwise the loop would not be needed). Assuming that there are no repeated strings in b:
assert(numel(unique(b)) == numel(b), 'This code does not work when there are duplicate values in b');
d = cell(size(a))';
[isfound, loc] = ismember(a, b);
d(isfound) = num2cell(c(loc(isfound), :), 2); % one row of c per cell (assumes c is a numeric matrix)
If it's guaranteed that all elements of a are found in b, then you can simplify even further to:
assert(numel(unique(b)) == numel(b), 'This code does not work when there are duplicate values in b');
[isfound, loc] = ismember(a, b);
assert(all(isfound), 'The next line only works if all elements of a are in b');
d = num2cell(c(loc, :), 2);
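As a quick sanity check, here is a minimal sketch on toy data that compares the ismember lookup with the original loop. The values of a, b and c below are made up for illustration (they are not the poster's inputs), and c is assumed to be a numeric matrix with one row per entry of b:

```matlab
% Hypothetical toy inputs: b contains no duplicates.
a = {'x', 'z', 'x', 'y'};      % strings to look up
b = {'x', 'y', 'z'};           % unique keys
c = [1 10; 2 20; 3 30];        % row k of c belongs to b{k}

% Reference result: the original loop.
dLoop = cell(size(a))';
for i = 1:numel(a)
    dLoop{i} = c(strcmp(a{i}, b), :);
end

% Vectorized lookup: loc(i) is the position of a{i} in b.
[isfound, loc] = ismember(a, b);
assert(all(isfound), 'Every element of a must be present in b');
dVec = num2cell(c(loc, :), 2); % split the looked-up rows into cells

% Both approaches should produce the same cell array.
sameResult = isequal(dLoop, dVec);
disp(sameResult);
```

If sameResult is false on your real data, the most likely causes are duplicates in b or elements of a that are missing from b.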
2 Comments
Guillaume
27 Jan 2017
Edited: Guillaume
27 Jan 2017
According to Walter, your mat file is an Octave file that MATLAB can't open.
If there are duplicate values in b, then you don't have a choice but to use a loop, either explicitly as you have done or with cellfun:
d = cellfun(@(aa) c(strcmp(aa, b), :), a, 'UniformOutput', false);
It's very possible that the cellfun may be slower than the explicit loop (due to the anonymous function call).
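Rather than guessing, it is easy to measure. The following is only a rough sketch with made-up inputs (the key names, sizes and the random c are assumptions, not the poster's data); it times both variants with tic/toc and checks that they agree:

```matlab
% Hypothetical inputs where every key occurs twice in b.
keys = arrayfun(@(k) sprintf('key%d', k), 1:200, 'UniformOutput', false);
b = [keys, keys];                         % duplicates on purpose
a = keys(randi(numel(keys), 1, 2000));    % random lookups into the keys
c = rand(numel(b), 3);                    % one row of c per entry of b

% Explicit loop.
tic;
dLoop = cell(size(a))';
for i = 1:numel(a)
    dLoop{i} = c(strcmp(a{i}, b), :);
end
tLoop = toc;

% cellfun with an anonymous function.
tic;
dFun = cellfun(@(aa) c(strcmp(aa, b), :), a, 'UniformOutput', false);
tFun = toc;

% The two results differ only in orientation, so compare as columns.
assert(isequal(dLoop(:), dFun(:)));
fprintf('loop: %.4f s, cellfun: %.4f s\n', tLoop, tFun);
```

A single tic/toc measurement is noisy, so repeat the run a few times (or use timeit, if your release has it) before drawing conclusions.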
edit: in MATLAB R2016b there is an extremely easy way to vectorise the string comparison, using the new string class:
string(a) == string(b)'
but you'd still need a loop or cellfun afterward to create the d cell array:
d = cellfun(@(r) c(r, :), num2cell(string(a) == string(b)', 1), 'UniformOutput', false)
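To make the shapes concrete, here is a small sketch with made-up data. It assumes R2016b or later and that a and b are both row cell arrays, so that the comparison expands to a numel(b)-by-numel(a) logical matrix whose columns correspond to the elements of a:

```matlab
% Hypothetical toy inputs: b contains a duplicate, a contains a missing key.
a = {'x', 'z', 'q'};
b = {'x', 'y', 'z', 'x'};
c = (1:4)';                          % one row of c per entry of b

% match(j, i) is true when b{j} equals a{i}.
match = string(a) == string(b)';     % 4-by-3 logical

% Each column of match selects the rows of c for one element of a.
d = cellfun(@(r) c(r, :), num2cell(match, 1), 'UniformOutput', false);
```

Here d{1} collects both rows that match the duplicated 'x', and d{3} is empty because 'q' does not occur in b. If a and b were column cell arrays instead, the transpose would have to move to string(a) to keep one column per element of a.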
More Answers (1)
Walter Roberson
27 Jan 2017
ismember can be used between cell arrays of strings. The two-output version can be used to find the indices, which you can then use to index into c.