Big 3D gridded data set

Belinda Finlay, 1 Sep 2020
Edited: Adam Wyatt, 9 Sep 2020
I have 20 years of 3-hourly gridded ocean temperature data in 3 dimensions (latitude, longitude and depth). To make the data easier to use I am creating daily averages, each of which is a 563*1001*40 matrix. I was then creating a cell array for each year, holding 365 daily averages. The cell arrays quickly become very large.
I have read about datastore and tall arrays; however, all the examples I can find are for tabular data. Given that I am going to end up with ~7300 matrices of size 563*1001*40 (one for each day over the 20 years), what is the best tool for managing such a large data set?
I will be extracting sections of the data based on latitude, longitude and time to do some composite analysis, but not until I can get the data into a workable format.
Thanks in advance,
  3 Comments
Belinda Finlay, 1 Sep 2020
I would like to be able to run a script that accesses areas of the grid over the 20 years to develop composite plots. Does that make sense?
Madhav Thakker, 9 Sep 2020
  1. Read data for 1 day (or smaller duration).
  2. Do some analysis.
  3. Remove the variable from RAM.
  4. Read for next day.
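
As a rough illustration of the loop above, here is a minimal sketch. It assumes one MAT-file per day holding a variable called temp; the file names, the variable name and the tiny grid are made up for the demo, and the "analysis" is just a running sum used to form a time mean.

```matlab
% Sketch only: synthetic stand-in data, hypothetical file and variable names.
dataDir = tempname;            % scratch folder for the demo files
mkdir(dataDir);
nDays = 4;                     % stand-in for ~7300 days
gridSize = [6 5 3];            % stand-in for 563 x 1001 x 40

% Create small fake daily files so the loop below has something to read.
for d = 1:nDays
    temp = d * ones(gridSize); % fake daily-mean temperature
    save(fullfile(dataDir, sprintf('day_%04d.mat', d)), 'temp');
end

% Steps 1-4: read one day, analyse it, free the memory, move to the next day.
runningSum = zeros(gridSize);
for d = 1:nDays
    s = load(fullfile(dataDir, sprintf('day_%04d.mat', d)), 'temp'); % 1. read
    runningSum = runningSum + s.temp;                                 % 2. analyse
    clear s                                                           % 3. release
end
meanField = runningSum / nDays; % time mean over all days
```

Only one day's grid plus the accumulator is in memory at any time, so the memory footprint does not grow with the number of days.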


Answers (1)

Adam Wyatt, 9 Sep 2020
Edited: Adam Wyatt, 9 Sep 2020
If you really do need all the data, then you can also use "matfile". You basically then have a cell array of matfile objects and can access the variables within each file programmatically. I recommend using v7.3 files.
  1. Load daily data
  2. Process daily data
  3. Save daily data as *.mat file
  4. Repeat 1-3 for each day, saving to a new *.mat file - I recommend using numeric suffixes for filenames
  5. Create cell array of matfile objects - call that m
  6. Access data via m{indx}.variablename
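
The steps above could look roughly like the sketch below. It is not the exact code I used: the data are synthetic, and the file names and the variable name dailyMean are placeholders chosen for the example.

```matlab
% Sketch only: synthetic data; file names and the variable name are hypothetical.
outDir = tempname;
mkdir(outDir);
nDays = 3;
gridSize = [8 10 4];   % lat x lon x depth, stand-in for 563 x 1001 x 40

fileNames = cell(1, nDays);
for d = 1:nDays
    % Steps 1-2: load and process one day (replaced here by synthetic values).
    dailyMean = d + zeros(gridSize);
    % Steps 3-4: one v7.3 MAT-file per day, with a numeric suffix.
    fileNames{d} = fullfile(outDir, sprintf('ocean_%04d.mat', d));
    save(fileNames{d}, 'dailyMean', '-v7.3');
end

% Step 5: cell array of matfile objects; no array data is loaded here.
m = cell(1, nDays);
for d = 1:nDays
    m{d} = matfile(fileNames{d});
end

% Step 6: take one object out of the cell and read part of its variable.
mf = m{2};
block = mf.dailyMean(1:4, 3:7, 1);   % 4 x 5 slab at the first depth level
```

The numeric suffix makes it easy to map a date to a file name, so a composite over selected days only touches the files it needs.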
You can use index notation to access part of the array within the file (with some restrictions).
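
A small sketch of that partial access, and of two restrictions I am sure about: linear indexing is not supported, and using "end" makes MATLAB load the whole variable first. The variable name temp and the index ranges are made up for the example.

```matlab
% Sketch only: synthetic data; the variable name and index ranges are hypothetical.
f = [tempname '.mat'];
temp = reshape(1:(6*5*3), [6 5 3]);   % small stand-in for 563 x 1001 x 40
save(f, 'temp', '-v7.3');
clear temp

mf = matfile(f);
sz = size(mf, 'temp');                % dimensions queried without loading the data

% Give one subscript per dimension; linear indexing such as mf.temp(7) is not supported.
% sz(3) is used instead of "end", because "end" loads the entire variable into memory.
box = mf.temp(2:4, 1:2, sz(3));       % rows 2-4, columns 1-2, last depth level
```

For a lat/long box you would first convert the coordinate limits to index ranges and then pass those ranges in the same way.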
I've successfully used this method to access and process tens of GB of mat file data (that is the compressed file size; the actual data size was of a similar order to yours).
I even created a class that enabled me to access the data more easily and perform other operations.
Do you really have a different number of data points for each year, i.e. do you really need cell arrays? Try to avoid cells if possible.

