Speeding up fread for random position in big file

Jonas on 23 Jun 2022

https://jp.mathworks.com/matlabcentral/answers/1746370-speeding-up-fread-for-random-position-in-big-file

Commented: Jonas on 30 Jun 2022

Dear community,

At the moment I need to read one or multiple images from a big data file. Since I am building a live data viewer and the file is too big (20-80 GB) to read all at once, I read it frame by frame. At the moment I use the following commands:

% this only once
fid = fopen('myfile.mraw','r');
% and this each time I need a specific or multiple image(s)
first_frame = 251674;  % exemplary frame number
numOfFrames = 1;       % can be 1 or 3, user dependent (3 only if I apply a temporal median filter of length 3)
bitOrder = 'b';
color = 1;             % just gray in my case and not color
N = [320 384];
pixels = 320*384;
ColorBit = 12;
I = zeros(pixels*numOfFrames,1,'uint16');
start = (first_frame-1)*pixels*ColorBit/8;  % byte offset of the first requested frame
fseek(fid,start,'bof');
I = fread(fid,pixels*numOfFrames,'ubit12=>uint16',bitOrder);
% convert to image matrix
A(:,:,:) = permute(reshape(I,[N numOfFrames]),[2 1 3 4]);

Is there an easy way to speed up the fread call? Compared to the subsequent displaying etc., this line takes about 97% of the time.

When reading one frame per call I don't get more than 18 to 20 frames per second. At least 50 fps would be nice.

Unfortunately I cannot supply the mraw file due to its size.

Reading more frames at once is not possible here, since first_frame can change by e.g. 1000 between calls.

The file contains high speed recordings at 20k fps, together with other data.
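The byte offset passed to fseek in the question can be checked with a little arithmetic. The sketch below is only an illustration: the variable names are made up here, and it assumes (as the question's code does) that frame 1 starts at byte 0 with no header in front of it.

```matlab
% Frame geometry taken from the question; variable names are illustrative.
pixelsPerFrame = 320*384;                        % 122880 pixels
bitsPerPixel   = 12;
bytesPerFrame  = pixelsPerFrame*bitsPerPixel/8;  % 184320 bytes

% A whole number of bytes per frame means every frame starts on a byte
% boundary, so fseek can jump straight to it.
assert(mod(bytesPerFrame,1) == 0, 'frames are not byte aligned');

% Zero-based byte offset of a one-based frame number
% (assumption: no header before the first frame).
frameOffset = @(frame) (frame-1)*bytesPerFrame;

startByte = frameOffset(251674);                 % roughly 46.4e9 bytes
fprintf('frame 251674 starts at byte %d\n', startByte);
```

With an odd number of pixels per frame the product would not be a whole number of bytes, and a plain byte-wise fseek would no longer land on a frame boundary.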

best regards

Jonas

10 Comments (8 older comments hidden)

dpb on 23 Jun 2022

Edited: dpb on 23 Jun 2022

"I certainly did not mean that you would replace the original files with the preprocessed ones,..."

One wouldn't replace the original files, of course not. It might, however, well be worth preprocessing them and building sets of files that contain the data in a more I/O-efficient rather than storage-efficient format. That could be done offline/overnight in batch processes so the end user never sees it. Just keep a database of the auxiliary files associated with the vendor file.

Can the vendor-supplied codes process images much more rapidly? If so, then they're playing some sort of tricks; at the root level MATLAB fread devolves to the system library i/o routines and they're all the same or essentially the same for everybody's compiler; the MATLAB version is somewhat different in that TMW has vectorized it for MATLAB, but under the hood the disk transfer is still what it is; and almost all applications are built with the same vendor toolsets which use the same standard libraries.
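A minimal sketch of the offline preprocessing idea, under stated assumptions: the file names are hypothetical, a tiny synthetic file stands in for the vendor file, and the packed layout is assumed to be "3 bytes hold 2 pixels, first pixel in the high bits", as discussed in the accepted answer.

```matlab
% Hypothetical file names; a tiny synthetic file stands in for the vendor file.
packedFile   = [tempname '.mraw'];
unpackedFile = [tempname '.u16'];
framePixels  = 4;                      % toy frame size (must be even)
nFrames      = 3;

% Write toy frames of packed 12-bit data: bytes AB CD EF encode ABC and DEF.
fid = fopen(packedFile,'w');
fwrite(fid, repmat(uint8([171 205 239]), 1, framePixels/2*nFrames), 'uint8');
fclose(fid);

% Offline pass: unpack frame by frame and store as native uint16.
fin  = fopen(packedFile,'r');
fout = fopen(unpackedFile,'w');
for k = 1:nFrames
    raw    = uint16(fread(fin, [3 framePixels/2], '*uint8'));
    first  = raw(1,:)*16 + bitshift(raw(2,:),-4);   % byte 1 + upper nibble of byte 2
    second = bitand(raw(2,:),15)*256 + raw(3,:);    % lower nibble of byte 2 + byte 3
    px     = [first; second];                       % column-major order interleaves
    fwrite(fout, px(:), 'uint16');
end
fclose(fin);
fclose(fout);

% Check: frame 2 can now be read back with a plain uint16 read.
fid = fopen(unpackedFile,'r');
fseek(fid, (2-1)*framePixels*2, 'bof');             % 2 bytes per pixel
img = fread(fid, framePixels, '*uint16');
fclose(fid);
delete(packedFile);
delete(unpackedFile);
```

The unpacked file is one third larger than the packed one (16 instead of 12 bits per pixel), which is the storage-versus-I/O trade-off described above.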

dpb on 23 Jun 2022

I = fread(fid,pixels*numOfFrames,'ubit12=>uint16',bitOrder);

has a syntax error, too. The bitOrder parameter is in the position of the skip parameter; this probably is another typo as it likely wouldn't run at all as is, but if fread is actually interpreting the 'b' as a number, that's a problem.

I = fread(fid,pixels*numOfFrames,'ubit12=>uint16',0,bitOrder);

would appear to be what is wanted/needed.

Jonas on 23 Jun 2022

The syntax is fine; the fread documentation uses this form as well. The command works as intended and the resulting images are correct.
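The two call forms discussed in these comments can be compared directly on a small temporary file. This is only a sketch with made-up test bytes; it relies on the documented `fread(___,machinefmt)` form, where the machine format may follow the precision without an explicit skip value.

```matlab
% Write a few bytes to a temporary file (hypothetical test data).
fname = [tempname '.bin'];
fid = fopen(fname,'w');
fwrite(fid, uint8([171 205 239 18 52 86]), 'uint8');
fclose(fid);

fid = fopen(fname,'r');
% Form used in the question: machinefmt given directly after precision.
a = fread(fid, 4, 'ubit12=>uint16', 'b');
frewind(fid);
% Form suggested in the comment: explicit skip of 0, then machinefmt.
b = fread(fid, 4, 'ubit12=>uint16', 0, 'b');
fclose(fid);
delete(fname);

% Logical true if both forms decode the same values.
sameResult = isequal(a, b);
disp(sameResult)
```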

Accepted Answer

Walter Roberson on 24 Jun 2022

https://jp.mathworks.com/matlabcentral/answers/1746370-speeding-up-fread-for-random-position-in-big-file#answer_992990

In my experience, you can be more efficient on conversion of 12 bit data than what is done by fread()

If memory serves

raw = uint16(fread(fid, [3 size], '*uint8'));
firsts = First12(raw(1,:),raw(2,:));
seconds = Second12(raw(2,:),raw(3,:));

where First12 and Second12 are lookup tables such that First12(A,B) = A*256 + bitand(B, 240)/16) and Second12(A,B) = bitand(A,15)*256 + B

One might think that it is more efficient to just do those numeric computations for the whole array of raw data, without using lookup tables -- and if you were to write a bit of C code to do the conversion, it might well be more efficient to do that. But my memory is that in MATLAB, it turns out to be more efficient to pre-compute the values and use array lookup -- since the pre-computation is over a relatively small array compared to the input values, and table lookup is comparatively fast since it only involves address arithmetic and pulling values out of memory. Applying those bitand() and multiplications over the whole large array is, if my memory is correct, slower at the MATLAB level (but again, some C code might well rip through it.)
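For reference, the "numeric computations for the whole array" mentioned above could look roughly like the sketch below. It assumes the parenthesis in the First12 formula is meant as (A*256 + bitand(B,240))/16, i.e. the first byte supplies the upper 8 bits and the high nibble of the second byte the lower 4 bits. Whether this is slower or faster than a lookup table would have to be measured on the actual data.

```matlab
% Three packed bytes carry two 12-bit pixels: 0xAB 0xCD 0xEF -> 0xABC, 0xDEF.
raw = uint16([171; 205; 239]);          % one column per 3-byte group

% Pixel 1: byte 1 shifted up by 4 bits plus the upper nibble of byte 2.
firsts  = bitshift(raw(1,:), 4) + bitshift(raw(2,:), -4);
% Pixel 2: lower nibble of byte 2 shifted up by 8 bits plus byte 3.
seconds = bitshift(bitand(raw(2,:), 15), 8) + raw(3,:);

fprintf('%d %d\n', firsts, seconds);    % prints 2748 3567
```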

5 Comments (3 older comments hidden)

Jonas on 27 Jun 2022

Edited: Jonas on 27 Jun 2022

First, a parenthesis is missing in the first formula; it should be (A*256 + bitand(B, 240))/16, if I am not mistaken.

Alright, update on this: this actually IS much faster than the previous syntax. I ran into a problem because

1) in the presented form I need a 2D lookup table

2) the input to the lookup table is a vector

Applying arrayfun to solve 2) leads to much poorer performance, and 2) interferes with 1).

I solved this by decoupling the lookup tables:

LUT1Part1 = zeros(2^16,1,'uint16');
for A = 0:2^8-1
    LUT1Part1(A+1) = A*256/16;
end
LUT1Part2 = zeros(2^16,1,'uint16');
for B = 0:2^8-1
    LUT1Part2(B+1) = bitand(B, 240)/16;
end

LUT2 = zeros(2^16,1,'uint16');
for A = 0:2^8-1
    LUT2(A+1) = bitand(A,15)*256;
end

Later, I read and convert the data as follows.

I use intlut from MATLAB to look up the values. This is also the reason why the lookup tables need 2^16 entries although 2^8 would be enough.

raw = uint16(fread(fid, [3 ceil(obj.Pixels/2)], 'uint8',bitOrder));

firsts = intlut(raw(1,:),LUT1Part1) + intlut(raw(2,:),LUT1Part2);
seconds = intlut(raw(2,:),LUT2) + raw(3,:); % the second part of Walter's conversion formula
I = [firsts;seconds];

% reshape takes the elements column-wise, so the values of firsts
% and seconds alternate
A(:,:) = permute(reshape(I,N),[2 1 3]);

The elapsed time to load 1000 single random images drops from about

29.660486 seconds for the old read and processing method

to 7.204475 seconds for this method.
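As a possible variation on the decoupled tables, ordinary array indexing also works as a lookup and only needs 256 entries per table. The sketch below uses made-up table names and toy bytes in place of the fread result; it has not been timed against intlut, so it is only an alternative to try.

```matlab
% 256-entry tables, one entry per possible byte value 0..255
% (table names are illustrative).
b = 0:255;
lutHigh  = uint16(b*16);                 % byte 1 -> bits 11..4 of pixel 1
lutUpper = uint16(floor(b/16));          % byte 2 -> bits 3..0 of pixel 1
lutLower = uint16(mod(b,16)*256);        % byte 2 -> bits 11..8 of pixel 2

% Toy input in place of fread(fid,[3 n],'*uint8'): two 3-byte groups.
raw = uint8([171 18; 205 52; 239 86]);
idx = uint16(raw) + 1;                   % shift 0..255 to valid indices 1..256

firsts  = lutHigh(idx(1,:)) + lutUpper(idx(2,:));
seconds = lutLower(idx(2,:)) + uint16(raw(3,:));
I = [firsts; seconds];                   % column-major order interleaves pixels
I = I(:);
```

The index is widened to uint16 before adding 1 because a uint8 value of 255 would saturate instead of becoming 256.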

Bjorn Gustavsson on 27 Jun 2022

You shouldn't need to have 2-D lookup tables. MATLAB allows you to do things like this:

first = zeros(3,4);
second = zeros(3,4);
first(:) = randn(12,1);

So if you know the image dimension you should be able to utilize this pattern.

Walter Roberson on 27 Jun 2022

When I was designing the code with 2D lookup tables, I could have used linear indexing by multiplying one input by a constant and adding the other -- but that would have required an explicit multiply and add. I figured that it would probably be faster to use 2D indices, with the implicit lookup, as the Execution Engine would be able to convert that into low-level multiply-and-add instead of explicit. It might even possibly be able to convert to base register + offset machine code.
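The linear-indexing variant mentioned here can be sketched as follows. One reason it may be needed in practice: indexing a matrix with two vector subscripts, as in First12(r1, r2), returns the whole n-by-n block of combinations rather than one value per pair. The table contents below assume the corrected formula from the comments; the toy bytes are made up.

```matlab
% 256-by-256 tables: row index = first argument + 1, column = second + 1.
[A, B]   = ndgrid(0:255, 0:255);
First12  = uint16(A*16 + floor(B/16));
Second12 = uint16(mod(A,16)*256 + B);

% Toy 3-by-n byte groups in place of the fread result.
raw = double(uint8([171 18; 205 52; 239 86]));

% Turn each (row, column) pair into a column-major linear index:
% (a+1) + 256*((b+1)-1) = (a+1) + 256*b.
lin1 = (raw(1,:) + 1) + 256*raw(2,:);
lin2 = (raw(2,:) + 1) + 256*raw(3,:);

firsts  = First12(lin1);                 % one value per byte pair
seconds = Second12(lin2);
```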

More Answers (2)

Steven Lord on 23 Jun 2022

https://jp.mathworks.com/matlabcentral/answers/1746370-speeding-up-fread-for-random-position-in-big-file#answer_992065

Since your data is Big Data (too large to fit in memory all at once) you may want to investigate creating a datastore for your files and using that datastore to create a tall array on which to operate. See this section of the documentation for more information about the tools and techniques you can use for working with tall arrays and datastore objects.

9 Comments (7 older comments hidden)

dpb on 24 Jun 2022

OK, that pretty much confirms that, presuming your camera has 12-bit depth, the output is indeed packed 12-bit. Looking at one of the CIH files would be an interesting exercise.

It looks like it will also save 16-bit files that, while bigger, may well load much faster than translating 12 bits into 16. Why not start with 16 to begin with?
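If a 16-bit export is available, the per-frame read reduces to a seek and a plain uint16 read with no bit unpacking. The sketch below uses a toy frame size and a hypothetical temporary file, and assumes the 16-bit file stores frames back to back in native byte order with no header; a real vendor file may differ.

```matlab
N = [4 3];                       % toy [width height]; the question uses [320 384]
pixels = prod(N);
fname = [tempname '.u16'];       % hypothetical 16-bit export

% Two toy frames: frame k holds the values (k-1)*pixels + (1:pixels).
fid = fopen(fname,'w');
fwrite(fid, uint16(1:2*pixels), 'uint16');
fclose(fid);

frame = 2;
fid = fopen(fname,'r');
fseek(fid, (frame-1)*pixels*2, 'bof');   % 2 bytes per pixel, no bit packing
I = fread(fid, pixels, '*uint16');       % read directly as uint16
fclose(fid);
delete(fname);

A = permute(reshape(I, N), [2 1]);       % same row/column swap as in the question
```

For the 320-by-384 frames in the question this means 245760 bytes per frame instead of 184320, so more data is read per frame in exchange for skipping the 12-bit conversion.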

Jonas on 30 Jun 2022