Big data percentile calculation

Hi,
I have a large set (30,000) of MAT-files, each containing a 4x1 cell array of 1483x2824 double matrices (4 matrices per file, roughly 30-40 MB).
These are time-series files representing simulations over 3 months.
I want to calculate percentiles across all of these time-series files, but loading every file at once takes too much memory for my computer. Any clue on how to solve this? I'm working on a server with 20 cores/40 threads and 256 GB of memory.
I have heard about the P-square algorithm, but I couldn't find anything similar in MATLAB.
All the best

Answers (1)

Steven Lord on 8 Aug 2019

2 votes

See some of the tools and techniques available in MATLAB for working with Big Data, data that's too big to fit in memory. Many functions are supported on tall arrays.
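
If a per-pixel percentile over time is what is needed, a single-pass alternative to loading everything is to keep a small histogram per pixel and update it file by file. The following is a minimal sketch of that idea, not the P-square algorithm mentioned in the question: it uses fixed bins, so the estimate is only as fine as the bin width. The value range `lo`/`hi`, the bin count, and the `rand` frames standing in for loaded matrices are all assumptions made for illustration.

```matlab
% Sketch: approximate per-pixel percentile from streaming fixed-bin histograms.
% ASSUMPTIONS (not from the original post): the value range [lo, hi] is known
% in advance, and rand() frames stand in for matrices loaded from the MAT-files.
rng(0);                                   % reproducible stand-in data
nRows = 3; nCols = 4; nFrames = 2000;     % tiny grid for illustration
lo = 0; hi = 1; nBins = 200;              % assumed range and resolution
p = 90;                                   % percentile to estimate

edges  = linspace(lo, hi, nBins + 1);
counts = zeros(nRows*nCols, nBins, 'uint32');   % one histogram per pixel
pix    = (1:nRows*nCols)';

for k = 1:nFrames
    frame = rand(nRows, nCols);           % replace with one loaded matrix
    v   = frame(:);
    ok  = ~isnan(v);                      % skip missing values
    b   = discretize(min(max(v(ok), lo), hi), edges);   % clamp, then bin
    idx = sub2ind(size(counts), pix(ok), b);
    counts(idx) = counts(idx) + 1;        % each pixel appears once per frame
end

% Percentile estimate: center of the first bin whose cumulative count
% reaches p percent of that pixel's total count.
cumCounts = cumsum(double(counts), 2);
total     = cumCounts(:, end);
binIdx    = sum(cumCounts < (p/100) * total, 2) + 1;
centers   = (edges(1:end-1) + edges(2:end)) / 2;
est       = reshape(centers(binIdx), nRows, nCols);
est(total == 0) = NaN;                    % pixels that never had data
```

For the full 1483x2824 grid, 200 `uint32` bins per pixel come to about 3.4 GB, which fits comfortably in 256 GB. Values outside the assumed range are clamped into the first or last bin, so the range has to bracket the percentile of interest.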

2 Comments

David Santos on 8 Aug 2019
Edited: David Santos on 8 Aug 2019
Thanks!
What would you recommend for converting my 4x1 cell array files into a single one?
David Santos on 8 Aug 2019
OK, I'm trying a fileDatastore with tall arrays.
After the following definitions I get the tall array t:
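function data = loadPrc(filename)
    % Load one MAT-file; load returns a struct with one field per saved variable.
    data = load(filename);
    % The variable name is 'l' plus the file's base name minus its last 7 characters.
    ind = strfind(filename, '/');
    data = data.(strcat('l', filename(ind(end)+1:end-4-7)));
    % Keep only the first of the four matrices in the cell array.
    data = data{1};
end
ds = fileDatastore('matBorrame', 'ReadFcn', @loadPrc, 'FileExtensions', '.mat')
t = tall(ds)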
t =
4×1 tall cell array
{1483×2824 double}
{1483×2824 double}
{1483×2824 double}
{1483×2824 double}
My problem is that the prctile calculation now fails with a data type error:
gather(prctile(t,90,3))
Evaluating tall expression using the Parallel Pool 'local':
- Pass 1 of 1: 0% complete
Evaluation 0% complete
Error using tall/prctile (line 48)
Argument 1 to PRCTILE must be one of the following data types: numeric.
Learn more about errors encountered during GATHER.
I think that's because t should have the shape 1483x2824x4, but I can't reshape or permute a tall array. Any clue on how to solve this?
All the best
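
One possible way around the tall cell array, sketched under stated assumptions: read the files in an ordinary `hasdata`/`read` loop, stack each file's four cells along the third dimension with `cat`, and keep only one spatial tile per pass so that the complete time series for that tile fits in memory. `prctile` (Statistics and Machine Learning Toolbox) can then be applied along dimension 3 of a regular numeric array. The variable name `c`, the file names, and the tile bounds below are hypothetical; the small MAT-files are created only so the sketch is self-contained.

```matlab
% Stand-in data: 3 small MAT-files, each holding a 4x1 cell array named 'c'
% (hypothetical variable and file names, not the ones from the post).
folder = tempname;
mkdir(folder);
nFiles = 3; nRows = 2; nCols = 3;
for k = 1:nFiles
    c = cell(4, 1);
    for j = 1:4
        c{j} = (4*(k-1) + j) * ones(nRows, nCols);   % value = time-step index
    end
    save(fullfile(folder, sprintf('sim%02d.mat', k)), 'c');
end

% ReadFcn: load one file and stack its four matrices into rows x cols x 4.
stackCells = @(s) cat(3, s.c{:});
readFcn    = @(f) stackCells(load(f, 'c'));
ds = fileDatastore(folder, 'ReadFcn', readFcn, 'FileExtensions', '.mat');

% Keep one spatial tile from every file; repeat with other tiles to cover the grid.
tileRows = 1:2; tileCols = 1:2;                      % assumed tile bounds
series = zeros(numel(tileRows), numel(tileCols), 4*nFiles);
n = 0;
while hasdata(ds)
    block = read(ds);                                % numeric, rows x cols x 4
    series(:, :, n + (1:4)) = block(tileRows, tileCols, :);
    n = n + 4;
end

p90 = prctile(series, 90, 3);                        % per-pixel percentile over time
rmdir(folder, 's');                                  % remove the stand-in files
```

The trade-off is that every tile needs its own pass over all the files, so the tile should be as large as memory allows.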


Asked: 8 Aug 2019

Commented: 8 Aug 2019
