Expand cells that are nested inside another cell

I have a cell array data like the one in the picture: each cell contains a cell array of strings.
I am unpacking all the cells this way:
counter=0;
for ind=1:length(data)
    tmp=cell2str(data{ind,1});
    for k=1:size(tmp,1)
        counter=counter+1;
        tmp2=textscan(tmp(k,:),'%s%s%s%s%s%[^\n\r]','Delimiter', ' ');
        for j=1:6
            if isempty(tmp2{j})==0
                Raw(counter,j)=tmp2{j};
            end
        end
        clear tmp2 j
    end
    clear k tmp
end
The result is correct, but is there a better/faster way to do it?
Using parfor, or other techniques?
Thank you in advance.

9 comments

dpb on 9 Jan 2019
Attach a small sample dataset, and what do you want the result to be?
NirE on 9 Jan 2019
Edited: NirE on 9 Jan 2019
I am attaching a small piece of the data: 10 rows, but in fact it has 38956 rows.
The length of the cells that are inside the main cell can differ. I hope that I am clear enough.
You can just run the script that I wrote; it works, but very slowly.
Is there a way to use parfor or something else to speed up the process?
Luna on 9 Jan 2019
cell2str does not work for me; which version are you using?
NirE on 9 Jan 2019
MATLAB R2017b
dpb on 9 Jan 2019
Edited: Stephen23 on 9 Jan 2019
It must be in some toolbox, then; it's not in base R2017b...
Again, what's the desired output? That doesn't seem to make sense from reading the code; your format statement has 5 strings but some "records" have many more fields than that...
Well, a toolbox isn't it either; maybe a FEX (File Exchange) submission? A search of the online help doesn't find it, either.
Not knowing precisely what the cell2str function actually returns, it's hard to guess exactly what the result that "works" really is without more effort than I have time to spend... help us help you.
Luna on 9 Jan 2019
What size should your output be? Are you expecting a 64x1 cell array?
NirE on 9 Jan 2019
The cell2str function just returns a string (char) array with one row for each line that we had in the cell.
As for the format, it is exactly what I want: 6 parts with different lengths.
I hope that this helps.
Jan on 9 Jan 2019
Note: Omit the useless clear commands. They only waste time here.
dpb on 9 Jan 2019
Well, no... helping would be to show us what you really, really want instead of just describing it in a way that we can't reproduce.
Where did you find the function? SHOW us!!!


Accepted Answer

Jan on 9 Jan 2019

1 vote

Start with a pre-allocation:
Len = cellfun('prodofsize', data);
Raw = cell(sum(Len), 6);
c = 0;
for ind = 1:numel(data)
    tmp = data{ind};
    for k = 1:numel(tmp)
        c = c + 1;
        tmp2 = strsplit(tmp{k}, ' ');
        for j = 1:numel(tmp2)
            Raw{c, j} = tmp2{j};
        end
    end
end
I cannot open your MAT file at the moment, so I am guessing what it might contain. I also guessed that cell2str can be avoided by scanning the cell elements directly, and I assume that Raw should be a cell array. All of these assumptions could be wrong. If you post a small input as code together with the wanted output, less guessing is required.
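One point worth checking with this approach: splitting on ' ' produces more than six tokens for the formatted-date records, whereas the textscan format in the question keeps the rest of the line in the sixth field. The following is a minimal sketch of how the loop could fold the surplus tokens back into column 6. The sample strings are hypothetical, modeled on records shown elsewhere in this thread, and the variable nCol is an assumption introduced for illustration.

```matlab
% Hypothetical sample input; the real layout of the MAT file is an assumption.
data = { ...
    {'1 EventDataLogNewFile DataEventTime TypeSecondsSinceEpoch 1546725641'; ...
     '1 EventDataLogNewFile DataEventTime TypeFormattedDate Sun Jan 6 00:00:41 2019'}; ...
    {'4 EventREAD DataReading TypeUnitLessNumber 34729'}};

nCol = 6;                               % number of output fields (assumed)
Len  = cellfun('prodofsize', data);     % number of strings per inner cell
Raw  = cell(sum(Len), nCol);            % pre-allocated result
c    = 0;
for ind = 1:numel(data)
    tmp = data{ind};
    for k = 1:Len(ind)
        c   = c + 1;
        tok = strsplit(tmp{k}, ' ');
        if numel(tok) > nCol
            % Fold everything from token nCol onward back into one field
            tok = [tok(1:nCol-1), {strjoin(tok(nCol:end), ' ')}];
        end
        Raw(c, 1:numel(tok)) = tok;     % shorter records leave trailing cells empty
    end
end
```

Records with fewer than six tokens leave the remaining cells as empty arrays, so Raw stays at six columns regardless of how long a record is.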

5 comments

NirE on 10 Jan 2019
Your code is about 3 times faster than mine, thanks a lot.
Can you just explain what 'prodofsize' does?
Jan on 10 Jan 2019
Edited: Jan on 10 Jan 2019
@Nir Eliezer: 'prodofsize' is explained in the documentation: doc cellfun. It is equivalent to:
Len = cellfun(@numel, data);
but much faster. If you provide a function handle to cellfun, it calls the MATLAB engine for each element of the cell, while using strings like 'length', 'prodofsize' and 'isclass' accesses the cell elements directly inside the cellfun core. Although the speed of cellfun might be negligible in your case, it is good programming practice to use the most efficient code.
By the way, 'numel' would have been a much nicer name for this option than 'prodofsize'.
Maybe this is slightly faster:
Len = cellfun('prodofsize', data);
Raw = cell(sum(Len), 6);
c = 0;
for ind = 1:numel(data)
    tmp = data{ind};
    for k = 1:Len(ind)
        c = c + 1;
        tmp2 = strsplit(tmp{k}, ' ');
        Raw(c, 1:numel(tmp2)) = tmp2;
    end
end
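The speed difference between the legacy string option and a function handle, as described in this comment, can be measured directly. This is only a sketch: the test data is hypothetical, and the actual timings depend on the machine and the MATLAB release.

```matlab
% Hypothetical test data: 1e5 cells, each holding a small cell array
data = repmat({{'a'; 'b'; 'c'}}, 1e5, 1);

tLegacy = timeit(@() cellfun('prodofsize', data));  % legacy string option
tHandle = timeit(@() cellfun(@numel, data));        % function handle

% Both calls return the same counts; only the run time differs
lenLegacy = cellfun('prodofsize', data);
lenHandle = cellfun(@numel, data);
fprintf('prodofsize: %.4f s, @numel: %.4f s\n', tLegacy, tHandle);
```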
NirE on 21 Jan 2019
Jan, one more question: is there a way to parallelize your piece of code so that I could use parfor?
Jan on 22 Jan 2019
Yes, a parallelization should be very straightforward. Did you try it?
NirE on 22 Jan 2019
I will try it and tell you whether it speeds things up or not.
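No parfor version was posted in this thread, so the following is only a sketch of one way it might look. The running counter c cannot be shared between parfor iterations, so each iteration builds its own block of rows and the blocks are stacked afterwards. The sample data, the names parts, block and nCol, and the folding of surplus tokens into the last column are all assumptions made for illustration.

```matlab
% Hypothetical sample input; the real data layout is an assumption.
data = { {'1 A B C D'; '1 A B C D E F'}; {'4 G H I J'} };

nCol  = 6;                              % number of output fields (assumed)
parts = cell(numel(data), 1);           % sliced output: one block per outer cell
parfor ind = 1:numel(data)
    tmp   = data{ind};
    block = cell(numel(tmp), nCol);     % rows produced by this iteration only
    for k = 1:numel(tmp)
        tok = strsplit(tmp{k}, ' ');
        if numel(tok) > nCol
            % Keep every block at nCol columns so the blocks can be stacked
            tok = [tok(1:nCol-1), {strjoin(tok(nCol:end), ' ')}];
        end
        block(k, 1:numel(tok)) = tok;
    end
    parts{ind} = block;
end
Raw = vertcat(parts{:});                % stack the blocks in the original order
```

Whether this is actually faster than the serial loop has to be measured: the work per iteration is small, so the cost of distributing the data to the workers may outweigh the gain. Without the Parallel Computing Toolbox, parfor simply runs as an ordinary loop.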


More answers (2)

dpb on 9 Jan 2019
Edited: dpb on 9 Jan 2019

1 vote

OK, I overlooked the regular expression in the format string that sucks up all of those extra blanks at the end of the odd-man-out records...
Dereferencing the cell content in each cell requires two levels since textscan isn't cell-string aware. split doesn't cut it here because there's no unique delimiter that defines the desired fields; hence the above...
You can try the following and see if the lack of preallocation shows up as a performance hit at this size; oftentimes it'll fool you and not be too bad...
fnTS=@(s) textscan(s,'%s%s%s%s%s%[^\n\r]','Delimiter', ' ');
res=[];
for i=1:length(data)
    res=[res;cellfun(fnTS,data{i},'uni',0)];
end
res=cat(1,res{:});
The above yields a 64x6 cell array...
I'd have to think of the bestest way to be able to build the array directly w/o the intermediary second cell array to not be dynamically catenating the output.
ADDENDUM:
res(cellfun(@isempty,res))={''};
>> string(res)
ans =
  11×6 string array
    "1"    "EventDataLogNewFile"    "DataEventTime"             "TypeSecondsSinceEpoch"    "1546725641"                 ""
    "1"    "EventDataLogNewFile"    "DataEventTime"             "TypeFormattedDate"        "Sun"                        "Jan 6 00:00:41 2019"
    "1"    "EventDataLogNewFile"    "DataReportingSubsystem"    "TypeString"               "datalogger"                 ""
    "1"    "EventDataLogNewFile"    "DataInstrumentID"          "TypeString"               "00:01:05:19:CF:30"          ""
    "1"    "EventDataLogNewFile"    "DataEntityName"            "TypeString"               "mc16"                       ""
    "4"    "EventREAD"              "DataEventTime"             "TypeSecondsSinceEpoch"    "1546725657"                 ""
    "4"    "EventREAD"              "DataEventTime"             "TypeFormattedDate"        "Sun"                        "Jan 6 00:00:57 2019"
    "4"    "EventREAD"              "DataReportingSubsystem"    "TypeString"               "pc"                         ""
    "4"    "EventREAD"              "DataEntityID"              "TypeString"               "Dev_CLPC_PressureGauge1"    ""
    "4"    "EventREAD"              "DataReading"               "TypeUnitLessNumber"       "34729"                      ""
    "4"    "EventREAD"              "DataEventDuration"         "TypeSec"                  "0.000686859"                ""
>>
The output above comes from running only the first two elements in the for...end loop instead of all of them, for brevity.
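The empty-cell replacement used in the addendum can be tried on a small cell array on its own. This is a minimal sketch with hypothetical values: the empty entries are replaced by '' so that every cell holds a char vector before the conversion to a string array.

```matlab
% Hypothetical 2x2 cell array with one empty entry
C = {'a', []; 'b', 'c'};

C(cellfun(@isempty, C)) = {''};   % replace every empty cell with an empty char
S = string(C);                    % 2x2 string array; the former empty is ""
```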
Luna on 9 Jan 2019

0 votes

I was assuming the same 64x9 cell. Here is my solution, which gives the same result as Jan's:
cellArray = cellfun(@(x) strsplit(x(:,:),' '), vertcat(data{:}), 'UniformOutput',false);
for i = 1:numel(cellArray)
    for j = 1:numel(cellArray{i})
        raw{i,j} = cellArray{i}{j} ;
    end
end

Release: R2017b
Asked: 9 Jan 2019
Commented: 22 Jan 2019