
Reading all stream data from an ADTF DAT file with a large ItemCount is very slow.

Shikha on 14 Feb 2024
Commented: Shikha on 22 Feb 2024
Hello,
Basically, I need to read the stream data from an ADTF DAT file and perform some preprocessing related to coordinate frame transformations. For this, I am trying to read all data from a particular selected stream, which has an ItemCount of 18000, and store it in a CSV file. The read(streamData) call itself takes around 10-12 minutes, and even more if the stream contains more structured data.
Can someone suggest a way to make this reading process faster?

Accepted Answer

Shubham on 22 Feb 2024
Hi Shikha,
It seems that you are trying to read stream data from an ADTF DAT file. You can try to speed up the reading process by chunking the input data and leveraging Parallel Computing Toolbox to reduce the time taken to read the data.
You can read data with "adtfFileReader" in chunks using the "select" function, providing a time range or index range as arguments. For more information, please refer to the following documentation: https://www.mathworks.com/help/driving/ug/read-data-from-adtf-dat-files.html#ReadDataFromADTFDATFilesExample-7
You can try testing it out using the following example as well:
openExample('driving/ExtractVideoStreamFromADTFDATFileExample');
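For instance, a purely sequential chunked read (before adding any parallelism) might look like the following sketch. The file name, stream index, and item count are placeholders for your own data, not values from the example:

```matlab
% Sketch: read a stream in fixed-size chunks sequentially.
% "sample.dat", streamIndex, and itemCount are placeholders.
fileReader = adtfFileReader("sample.dat");
streamIndex = 2;
itemCount = 18000;   % ItemCount of the selected stream
chunkSize = 1000;
allData = {};
for startIndex = 1:chunkSize:itemCount
    endIndex = min(startIndex + chunkSize - 1, itemCount);
    % Select only the items in [startIndex, endIndex] and read them
    streamReader = select(fileReader, streamIndex, IndexRange=[startIndex endIndex]);
    allData{end+1} = read(streamReader); %#ok<AGROW>
end
```

Even without parallelism, reading in bounded chunks avoids holding the entire stream in one read call, and it is the natural stepping stone to the parfor version below.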
Once you create chunks of the file, you can read them in parallel. Here is a simple example of reading a file using parfor:
% Create dummy data and write to a file
data = (1:1e8)';
lines = length(data);
% Uncomment the following lines when creating the dummy data for the first time
% fileID = fopen('dummy_data.txt', 'w');
% fprintf(fileID, '%d\n', data);
% fclose(fileID);
% Start a parallel pool if one is not already running
if isempty(gcp('nocreate'))
    parpool;
end
% Define the number of workers and chunks
numWorkers = 6;
chunks = 10;
chunkSize = ceil(lines / chunks);
% Preallocate a cell array to hold the data for each chunk
dataCellArray = cell(chunks, 1);
% Read the file in parallel using parfor
parfor (curChunk = 1:chunks, numWorkers)
    startLine = (curChunk - 1) * chunkSize + 1;
    endLine = min(curChunk * chunkSize, lines);
    dataCellArray{curChunk} = readChunk(startLine, endLine, 'dummy_data.txt');
    fprintf("done for chunk %d\n", curChunk);
end
% Concatenate the data from each chunk to form the complete array
dataArray = vertcat(dataCellArray{:});

function dataChunk = readChunk(startLine, endLine, filename)
    fileID = fopen(filename, 'r');
    raw = textscan(fileID, '%d', endLine - startLine + 1, 'HeaderLines', startLine - 1);
    fclose(fileID);
    dataChunk = raw{1};  % textscan returns a cell array; extract the numeric column
end
You can modify the above code snippet to work with "adtfFileReader" as well. Please refer to the following code snippet:
numWorkers = 6;
chunkSize = 10;
itemCount = 149;  % total number of items in the selected stream
numChunks = ceil(itemCount / chunkSize);
dataCellArray = cell(numChunks, 1);
parfor (curChunk = 1:numChunks, numWorkers)
    startIndex = (curChunk - 1) * chunkSize + 1;
    endIndex = min(startIndex + chunkSize - 1, itemCount);
    dataCellArray{curChunk} = readChunk(startIndex, endIndex);
    fprintf("done for chunk %d\n", curChunk);
end
dataArray = vertcat(dataCellArray{:});

function dataChunk = readChunk(startIndex, endIndex)
    dataFolder = fullfile(tempdir, 'adtf-video', filesep);
    datFileName = fullfile(dataFolder, "sample_can_video.dat");
    fileReader = adtfFileReader(datFileName);
    streamIndex = 2;
    streamReader = select(fileReader, streamIndex, IndexRange=[startIndex endIndex]);
    dataChunk = read(streamReader);
end
I have tested the code snippet on the example mentioned above. I created chunks containing 10 frames each (149 frames are present in total), with the concatenated result stored in "dataArray"; the first chunk holds the first 10 frames of the video.
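Since the original goal was to store the stream data in a CSV file, here is one possible way (a sketch, not part of the tested snippet) to export the concatenated result. It assumes "dataArray" ends up as a structure array whose fields are scalar values; nested structures would need flattening first:

```matlab
% Sketch: export concatenated stream data to CSV.
% Assumes dataArray is a structure array with scalar fields;
% struct2table errors on nested structures, which must be
% flattened before export.
T = struct2table(dataArray);
writetable(T, 'stream_data.csv');
```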
I would also suggest profiling your code and performing the tasks asynchronously.
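For the asynchronous route, one option (a sketch, assuming the "readChunk" helper, "itemCount", and "chunkSize" from the snippet above) is parfeval, which queues each chunk read on the pool and lets you collect results as they complete instead of blocking on a parfor loop:

```matlab
% Sketch: queue chunk reads asynchronously with parfeval.
% readChunk, itemCount, and chunkSize are assumed to be defined
% as in the earlier snippet.
itemCount = 149;
chunkSize = 10;
numChunks = ceil(itemCount / chunkSize);
futures(1:numChunks) = parallel.FevalFuture;
for curChunk = 1:numChunks
    startIndex = (curChunk - 1) * chunkSize + 1;
    endIndex = min(startIndex + chunkSize - 1, itemCount);
    % Queue the read; 1 is the number of requested outputs
    futures(curChunk) = parfeval(@readChunk, 1, startIndex, endIndex);
end
% Collect results as they finish, restoring chunk order via idx
dataCellArray = cell(numChunks, 1);
for i = 1:numChunks
    [idx, chunk] = fetchNext(futures);
    dataCellArray{idx} = chunk;
end
dataArray = vertcat(dataCellArray{:});
```

Unlike parfor, this keeps the client free while workers read, so you could start writing early chunks to disk before the last ones arrive.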
I hope this helps!
3 Comments

Shubham on 22 Feb 2024
Hi Shikha,
The runtime should still be reduced when reading a stream of structures of structures using multiple workers. However, if you think you still require additional help, you can start a new thread along with your data files and code snippets.
Thanks
Shikha on 22 Feb 2024
Hi Shubham,
You are right about the relative improvement in runtime when using multiple workers. Also, I will surely open a new thread if I require more help.
Thanks a lot for your help!


More Answers (0)


Release

R2023b
