How to split a huge string array efficiently
Hi everyone,
I'm trying to split a huge string array (~8.5 MB, ~11,500 rows x ~400 columns) efficiently, but I can only do it with a quite slow "for" loop that I haven't been able to remove.
The number of columns may change from one file to another, so I cannot determine a single file format up front and import every file according to it.

%% getting data from .txt => really fast
tic
disp('importing file');
a = string(textread([pwd '\test.txt'],'%s','headerlines',1)); %#ok<*DTXTRD>
toc
%% splitting each row into columns by delimiter ";" => slow
tic
disp('splitting each row by ";"');
b = strings(length(a),length(strsplit(a(1),';')));
for k=1:length(a)
b(k,:) = strsplit(a(k),';');
end
toc
%% date(str) to datenum => really fast
tic
disp('conv date to datenum');
dat1 = datenum(b(:,1),'yyyy-mm-dd');
toc
%% str to logical => really fast
tic
disp('converting data to logical array')
dat2 = logical(strcmp(b(:,2:end),'1')); %super fast
%dat2 = str2double(b(:,2:end)); %very slow
toc
% disp('converting data to logical array - 2'); %super fast as well
% tic
% dat2 = zeros(size(b));
% dat2(strcmp(b(:,2:end),'1')) = 1;
% toc
Thanks everyone! :)
Source file sample
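
For reference, the row-by-row strsplit loop above can usually be replaced by one vectorized call. This is a minimal sketch, assuming every row contains the same number of ";" delimiters (which the preallocation of b already implies). The three-row array a is a made-up stand-in for the imported data, not the real file.

```matlab
% Stand-in for the imported N-by-1 string array (hypothetical sample rows).
a = ["2020-07-22;1;0;1"; "2020-07-23;0;0;1"; "2020-07-24;1;1;0"];

% split on a column vector of strings returns an N-by-M string array,
% provided each row splits into the same number of pieces.
b = split(a, ';');

% Same conversions as in the question, applied to the whole array at once.
dat1 = datenum(b(:,1), 'yyyy-mm-dd');   % serial date numbers
dat2 = b(:,2:end) == "1";               % logical matrix, true where the field is "1"
```

Note that split returns a column vector when a has only one element, so a single-row file would need a transpose.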

3 Comments
Walter Roberson
24 Jul 2020
Why not use readtable() ?
I would also point out that textscan() can process character vectors in which the lines are separated by newlines.
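
A rough sketch of both suggestions, under some assumptions: the file has one header line holding column names, the first column is a date, and all other columns are numeric 0/1 flags. The temporary file and its contents are made up for illustration and are not the real test.txt.

```matlab
% Write a tiny stand-in file (hypothetical contents).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'date;s1;s2;s3\n2020-07-22;1;0;1\n2020-07-23;0;0;1\n');
fclose(fid);

% Option 1: readtable works out the number of columns from the file itself.
T = readtable(fname, 'Delimiter', ';', 'ReadVariableNames', true);
dat2 = T{:, 2:end} == 1;                 % logical matrix from the numeric columns

% Option 2: textscan on text already in memory, lines separated by newlines.
txt = fileread(fname);
firstLine = strtok(txt, newline);        % header line, used only to count columns
nCols = numel(strsplit(firstLine, ';'));
fmt = ['%s' repmat('%f', 1, nCols - 1)]; % one text column, then numeric columns
C = textscan(txt, fmt, 'Delimiter', ';', 'HeaderLines', 1, 'CollectOutput', true);
flags = logical(C{2});                   % C{1} holds the date strings

delete(fname);                           % remove the stand-in file
```

Building the format string from the first line is one way to cope with a column count that changes from file to file.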
endystrike
24 Jul 2020
endystrike
24 Jul 2020