Big data percentile calculation

David Santos on 8 Aug 2019
Commented: David Santos on 8 Aug 2019
Hi,
I have a large set (30,000) of MAT-files, each containing a 4x1 cell array of 1483x2824 double matrices, i.e. 4 matrices per file (~30-40 MB).
These are time series files representing simulations over 3 months.
I want to calculate percentiles over all of these time series files, but loading all the files at once needs more memory than my machine has. Any clue on how to solve this? I'm working on a server with 20 cores/40 threads and 256 GB of memory.
I have heard about the P-square algorithm, but I couldn't find anything similar in MATLAB.
All the best

Answers (1)

Steven Lord on 8 Aug 2019
See some of the tools and techniques available in MATLAB for working with Big Data, data that's too big to fit in memory. Many functions are supported on tall arrays.
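To illustrate the deferred-evaluation model that tall arrays use, here is a minimal sketch. It is only an illustration: the small in-memory vector x stands in for data that would normally come from a datastore, and prctile is assumed to be available through Statistics and Machine Learning Toolbox.

```matlab
% Minimal sketch of deferred evaluation with a tall array.
% x is a synthetic stand-in; real data would come from a datastore.
x = (1:1000)';            % small in-memory column vector
t = tall(x);              % tall array: operations on it are queued, not run
m = mean(t);              % unevaluated tall scalar
p = prctile(t, 90);       % unevaluated; needs Statistics and Machine Learning Toolbox
[m, p] = gather(m, p);    % both results are computed here, in one evaluation
```

Nothing is computed until gather is called, so several queued results can share the same pass over the data.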
2 Comments
David Santos on 8 Aug 2019
Edited: David Santos on 8 Aug 2019
Thanks!
What would you recommend if I want to convert the 4x1 cell array in each of my files into just one array?
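If the aim is a single numeric array instead of a 4x1 cell, one option for in-memory data is to concatenate the cell contents along the third dimension. A minimal sketch, with tiny synthetic matrices standing in for the real 1483x2824 ones:

```matlab
% Stand-in for the 4x1 cell array stored in one file (sizes reduced on purpose).
c = {rand(3,2); rand(3,2); rand(3,2); rand(3,2)};

% c{:} expands to four separate arguments, so cat stacks them as pages.
A = cat(3, c{:});         % 3x2x4 numeric array
sz = size(A);
```

With the real data the result would be 1483x2824x4, which is the layout that prctile(A,90,3) expects for an ordinary (non-tall) array.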
David Santos on 8 Aug 2019
OK, I'm now trying a fileDatastore with tall arrays.
With the definitions below I get the tall array t:
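function data = loadPrc(filename)
% Read one MAT-file and return the first matrix of its 4x1 cell array.
s = load(filename);
[~, name] = fileparts(filename);   % base name, without folder or '.mat'
varName = ['l' name(1:end-7)];     % variable is 'l' + base name minus its last 7 characters
c = s.(varName);
data = c{1};
end
ds = fileDatastore('matBorrame', 'ReadFcn', @loadPrc, 'FileExtensions', '.mat')
t = tall(ds)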
t =
4×1 tall cell array
{1483×2824 double}
{1483×2824 double}
{1483×2824 double}
{1483×2824 double}
My problem is that the prctile calculation now fails with a data type error:
gather(prctile(t,90,3))
Evaluating tall expression using the Parallel Pool 'local':
- Pass 1 of 1: 0% complete
Evaluation 0% complete
Error using tall/prctile (line 48)
Argument 1 to PRCTILE must be one of the following data types: numeric.
Learn more about errors encountered during GATHER.
That's because t should be a numeric array of size 1483x2824x4 rather than a cell array, but I can't reshape or permute a tall array. Any clue on how to solve this?
All the best
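One possible workaround that avoids tall arrays altogether is to compute the percentile block by block: keep only a few grid rows in memory, but for every time step, and call the ordinary prctile along the third dimension. The sketch below is only an illustration under stated assumptions: it writes a few tiny synthetic files to a temporary folder, the variable name c and the file names sim001.mat etc. are hypothetical, and it assumes the percentile is wanted per grid cell over time.

```matlab
% Synthetic stand-in for the real files: each holds a 4x1 cell array 'c'
% (hypothetical name) of equally sized matrices. Sizes are tiny on purpose.
rng(0);
folder = tempname;  mkdir(folder);
nFiles = 5;  nRows = 6;  nCols = 4;  nPerFile = 4;
for k = 1:nFiles
    c = cell(nPerFile, 1);
    for j = 1:nPerFile
        c{j} = randn(nRows, nCols);
    end
    save(fullfile(folder, sprintf('sim%03d.mat', k)), 'c');
end

files     = dir(fullfile(folder, '*.mat'));
nSteps    = numel(files) * nPerFile;   % total number of time steps
blockRows = 2;                         % grid rows per pass; tune to fit memory
p90       = zeros(nRows, nCols);

for r0 = 1:blockRows:nRows
    rows  = r0:min(r0 + blockRows - 1, nRows);
    stack = zeros(numel(rows), nCols, nSteps);   % this block only, all time steps
    step  = 0;
    for k = 1:numel(files)
        s = load(fullfile(files(k).folder, files(k).name), 'c');
        for j = 1:nPerFile
            step = step + 1;
            stack(:, :, step) = s.c{j}(rows, :);
        end
    end
    p90(rows, :) = prctile(stack, 90, 3);        % exact percentile over time
end
```

The trade-off is that every file is read once per block. At the sizes in this thread, one grid row across all time steps is 30,000 x 4 x 2824 x 8 bytes, about 2.7 GB, so the block size has to be chosen against the available memory and the number of passes over the files grows accordingly.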
