Building tall table from tall arrays generates error
3 ビュー (過去 30 日間)
古いコメントを表示
clear
dataFile = 'data.csv';
ds = tabularTextDatastore(dataFile, FileExtensions='.csv');
ds.ReadVariableNames = true;
ds.Delimiter = ',';
ds.SelectedVariableNames = ["hash", "count"];
ds.SelectedFormats = {'%s', '%f'};
data = tall(ds);
[g, THash] = findgroups(data.hash);
TCount = splitapply(@(x) {x}, data.count, g);
%% This works but cannot use it because actual data file is far larger than memory
hash = gather(THash);
count = gather(TCount);
T1 = table(hash, count);
%% This is the intended code but doesn't work
TT = table(THash,TCount);
write(fullfile(pwd,'data'),TT,FileType="parquet");
0 件のコメント
回答 (1 件)
Oguz Kaan Hancioglu
2023 年 3 月 15 日
Your code wasn't work because "gather(TCount)" returns cell array for each element. Therefore you are trying to write double array in to one single cell. You can find the length of each array into the cell. I hope this solves your problem.
%% This works but cannot use it because actual data file is far larger than memory
hash = gather(THash);
count = gather(TCount);
cellsz = cellfun(@size,count,'uni',false);
newCount = cellfun(@(x) x(1),cellsz,'UniformOutput',false)
T1 = table(hash, newCount);
参考
カテゴリ
Help Center および File Exchange で Matrices and Arrays についてさらに検索
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!