
How to plot billions of points efficiently?

Eli4ph on 4 Jul 2018
Commented: Eli4ph on 6 Jul 2018
I have 3 billion 2D points to plot, which requires a lot of memory, so I can only do this on a server with about 1 TB of memory. The server does not have a decent graphics card, so figure export is rendered on the CPU and takes more than 5 hours. I have to repeat this procedure many times, because I need to adjust the axis scales according to the shape of the scatter diagrams. My desktop has a decent graphics card. Could I make use of the desktop's graphics capability even though the data cannot fit into its memory?
Here is an example of my figures.
6 Comments
Stephen23 on 4 Jul 2018
Edited: Stephen23 on 4 Jul 2018
" In linear scale, it can be easily done by using round(). "
Yes, I also thought of using round, or some kind of tolerance.
"In linear scale, it can be easily done by using round(). But in log scale, I have no clean way to do this. Do you have any idea?"
Convert to linear scale, round to whatever precision, get the unique X-Y pairs, use the indices to plot a subset of the data. I think with a few billion points this might be possible with the memory that you have available, but you would have to try.
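For plain linear axes, the rounding idea might look like the minimal sketch below. The test data and the spacing `tol` are assumptions chosen for illustration, not values from this thread.

```matlab
% Illustrative data (assumption): 2e4 normally distributed points
X = randn(2e4,1);
Y = randn(2e4,1);
tol = 0.05;                        % hypothetical grid spacing, in data units
% Snap each point to a grid of spacing tol and keep one point per grid cell
[~,idx] = unique(round([X,Y]/tol),'rows');
figure()
scatter(X(idx),Y(idx),'filled')    % plot the original coordinates of the kept points
title('SubData (linear axes)')
```

A larger `tol` merges more points; a spacing close to the on-screen marker size is a reasonable starting point.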
Eli4ph on 4 Jul 2018
@Cobeldick: Thanks. Let me try it.


Accepted Answer

Stephen23 on 4 Jul 2018
Edited: Stephen23 on 4 Jul 2018
Here is one way to subsample the data to produce almost identical plots:
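% Fake data: 2e4 log-normally distributed points
X = 10.^randn(2e4,1);
Y = 10.^randn(2e4,1);
figure()
scatter(X,Y,'filled')
set(gca,'xscale','log','yscale','log')
title('AllData') % the axes 'title' property expects a text object, so use title()
% Merge data points: round in log10 space so the grid is uniform on log axes
Xb = log10(X);
Yb = log10(Y);
Xf = 0.05; % grid spacing in decades, adjust factor to suit
Yf = 0.05; % grid spacing in decades, adjust factor to suit
Xb = Xf*round(Xb/Xf);
Yb = Yf*round(Yb/Yf);
% Keep the index of one original point per occupied grid cell
[~,idx] = unique([Xb,Yb],'rows');
figure()
scatter(X(idx),Y(idx),'filled')
set(gca,'xscale','log','yscale','log')
title('SubData')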
The number of points plotted (the exact SubData count varies with the random data):
>> numel(X) % AllData
ans = 20000
>> numel(idx) % SubData
ans = 6653
You can also see that all extrema are still clearly visible.
PS: you might be able to save some memory by putting the merging onto one line:
[~,idx] = unique([Xf*round(log10(X)/Xf),Yf*round(log10(Y)/Yf)],'rows');
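If even the rounded copy of the data is too large to hold at once, the same merge could in principle be applied chunk by chunk. The following is only a sketch under stated assumptions: the chunk size `chunk` and the accumulator names `keepB` and `keepXY` are hypothetical, and the data is kept in memory here purely for illustration (in practice each chunk would be read from disk).

```matlab
% Illustrative data (assumption): same fake data as above
X = 10.^randn(2e4,1);
Y = 10.^randn(2e4,1);
Xf = 0.05;                 % grid spacing in decades
Yf = 0.05;
chunk = 5000;              % hypothetical chunk size
n = numel(X);
keepB  = zeros(0,2);       % rounded log10 coordinates of occupied grid cells
keepXY = zeros(0,2);       % one original point per occupied grid cell
for k = 1:chunk:n
    r = (k:min(k+chunk-1,n)).';
    B = [Xf*round(log10(X(r))/Xf), Yf*round(log10(Y(r))/Yf)];
    % Merge this chunk with the cells found so far, keeping rows aligned
    [keepB,ia] = unique([keepB; B],'rows');
    XY = [keepXY; X(r), Y(r)];
    keepXY = XY(ia,:);
end
figure()
scatter(keepXY(:,1),keepXY(:,2),'filled')
set(gca,'xscale','log','yscale','log')
title('SubData (chunked)')
```

Peak memory then scales with the chunk size plus the number of occupied grid cells rather than with the total number of points, at the cost of one `unique` call per chunk.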

More Answers (2)

Steven Lord on 5 Jul 2018
Consider storing your data as a tall array and using the tall visualization capabilities introduced in release R2017b.
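A minimal sketch of this suggestion, assuming the data is first available as ordinary in-memory vectors (for data on disk, the tall array would typically be backed by a datastore instead). Plotting the `log10` values with `binscatter` is an illustrative choice here, not something prescribed in this thread.

```matlab
% Illustrative data (assumption): small in-memory vectors converted to tall arrays
X = 10.^randn(2e4,1);
Y = 10.^randn(2e4,1);
tX = tall(X);
tY = tall(Y);
% binscatter bins the points and colors each bin by its count,
% so the number of graphics objects no longer grows with the number of points
figure()
binscatter(log10(tX),log10(tY))
xlabel('log10(X)')
ylabel('log10(Y)')
```

Because the binning works on linear axes, the sketch takes `log10` of the data first instead of switching the axes to log scale.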
1 Comment
Eli4ph on 6 Jul 2018
Thanks. I have used the round solution and gained a considerable speedup.



James Tursa on 5 Jul 2018
1 Comment
Eli4ph on 6 Jul 2018
Thanks. I have used the round solution and gained a considerable speedup.
