Import files based on file name

Question

I have a folder containing around 300,000 files. I don't need to import all the files.

Problem: How can I import files based on a specific file name?

Problem example:

In the picture below I have a section of the files, which are all in the same folder. I only want to import the .data files, and not even all of those: only the last one of every series.

Up until now I have the following code:

files = dir(fullfile(pwd,'*.data*'));
expData = cell(length(files),1);
for i = 1:length(files)
    fid = fopen(fullfile(files(i).folder,files(i).name),'r');
    
    %% Reading the data
    % Read all the data from the file
    dataRead = textscan(fid,'%f %f %f %f %f %f %f %f %f %f %f %f %f %f','HeaderLines',1);
end

This code imports all of the .data files. How can I (with a for loop, I think) import only the last .data file of every series?

Answer 1

Stephen23 on 2 Sep 2020

fnm = {...
'LedgeTest_muSP_0.10_muRP_0.10.1.data.0',...
'LedgeTest_muSP_0.10_muRP_0.10.1.data.1',...
'LedgeTest_muSP_0.10_muRP_0.10.1.data.2',...
'LedgeTest_muSP_0.10_muRP_0.10.1.data.21',...
'LedgeTest_muSP_0.10_muRP_0.20.1.data.0',...
'LedgeTest_muSP_0.10_muRP_0.20.1.data.1',...
'LedgeTest_muSP_0.10_muRP_0.20.1.data.2',...
'LedgeTest_muSP_0.10_muRP_0.20.1.data.11'}
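% Split each name once at '.data.' into {series name, step number}
spl = regexp(fnm,'\.data\.','split','once');
spl = vertcat(spl{:});
% Sort numerically by step number (so that 21 comes after 2)
vec = str2double(spl(:,2));
[~,idx] = sort(vec);
% For each series name keep the last, i.e. highest-numbered, entry
[~,idy,idz] = unique(spl(idx,1),'last');
out = fnm(idx(idy))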

Giving:

out =
'LedgeTest_muSP_0.10_muRP_0.10.1.data.21'
'LedgeTest_muSP_0.10_muRP_0.20.1.data.11'
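
To see why the numeric sort matters, here is a small walk-through of the intermediate values. The file names are made up for illustration and are not from the question; the sketch only shows that a plain text sort would place step 9 after step 10, whereas sorting the converted numbers does not.

```matlab
% Made-up file names: two series, steps deliberately out of order
fnm = {'runA.data.9','runA.data.10','runB.data.2','runB.data.1'};

% A plain text sort compares character by character, so '10' sorts before '9'
naive = sort(fnm);                 % {'runA.data.10','runA.data.9','runB.data.1','runB.data.2'}

% Split into series name and step number, then sort by the numeric step
spl = regexp(fnm,'\.data\.','split','once');
spl = vertcat(spl{:});             % 4x2 cell: series name in column 1, step text in column 2
vec = str2double(spl(:,2));        % [9;10;2;1]
[~,idx] = sort(vec);               % idx = [4;3;1;2]

% After sorting, the last occurrence of each series name has the largest step
[~,idy] = unique(spl(idx,1),'last');
out = fnm(idx(idy));               % {'runA.data.10','runB.data.2'}
disp(out)
```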

Use it like this:

D = pwd;
tmp = dir(fullfile(D,'*.data.*'));
fnm = {tmp.name};
...
for k = 1:numel(out)
    fid = fopen(fullfile(D,out{k}),'r');
    ...
end

Stephen23 on 2 Sep 2020

Edited: Stephen23 on 2 Sep 2020

Although innovative, this line:

out = struct2cell(files(idx(idy)))

and the corresponding indexing:

out{k}

need some more thought.

Note that struct2cell creates a cell array where the first dimension encodes the fields and the other dimensions correspond to the dimensions of the input structure arrays shifted/permuted by one.

So with your adaptation, the code will iterate (using linear indexing) down the first column, which thus contains the data from the first element of the structure array returned by dir. Not all of this data is character data, e.g. the fields bytes, isdir, and datenum. So as soon as your code refers to one of the cells with those data, it will throw an error. Even without throwing an error the code would still be incorrect, because only one of the cells in each column is actually the filename.

One fix would be to change the code to use subscript indexing instead of linear indexing. Or use a simple comma-separated list to create a cell array with only the filenames:

out = {files(idx(idy)).name};

Or if you want a sub-structure of that returned by dir then just index into it:

sub = tmp(idx(idy));

The approach I showed you in my answer works without error because it does not include all of the other fields in the cell array, only the filenames, thus trivial linear indexing is all that is required.
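
The difference can be checked on a tiny example. This is only an illustrative sketch: the structure array below is a made-up stand-in for the output of dir, with fewer fields and invented values.

```matlab
% Made-up stand-in for the structure array returned by dir (only three fields)
files = struct('name',{'a.data.0','a.data.1'}, ...
               'folder',{'C:\sim','C:\sim'}, ...
               'bytes',{10,20});

C = struct2cell(files);          % 3x1x2 cell: fields x size(files)
disp(size(C))

% Linear indexing walks down the fields of the FIRST element
first  = C{1};                   % 'a.data.0' (name)
second = C{2};                   % 'C:\sim'   (folder, not the second filename)
third  = C{3};                   % 10         (bytes, numeric)

% Subscript indexing picks the name field of every element
namesA = squeeze(C(1,1,:)).';    % {'a.data.0','a.data.1'}

% A comma-separated list gives the same filenames more directly
namesB = {files.name};
disp(isequal(namesA,namesB))
```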

Tessa Kol on 2 Sep 2020

Edited: Tessa Kol on 2 Sep 2020

Is it also possible to extract a range of files, for example the last 5 .data files of every series?

I also encountered an error when adjusting the previous code:

Error using textscan
Invalid file identifier. Use fopen to generate a valid file identifier.
Error in LedgeTest2D_results (line 30)
dataRead = textscan(fid,'%f %f %f %f %f %f %f %f %f %f %f %f %f %f','HeaderLines',1);

% Select only .data file from the last time step of each simulation
files = dir(fullfile(uigetdir,'*.data*'));
spl = regexp({files.name},'\.data\.','split','once');
spl = vertcat(spl{:});
vec = str2double(spl(:,2));
[~,idx] = sort(vec);
[~,idy,idz] = unique(spl(idx,1),'last');
out = {files(idx(idy)).name};
expData = cell(length(out),1);
for i = 1:length(out)
    fid = fopen(fullfile(files.folder,out{i}),'r');
    %% Reading the data
    % Read all the data from the file
    dataRead = textscan(fid,'%f %f %f %f %f %f %f %f %f %f %f %f %f %f','HeaderLines',1)
end

The MATLAB file is in a different folder from the .data files.

Stephen23 on 2 Sep 2020

Edited: Stephen23 on 2 Sep 2020

"The matlab file is in a different folder from the .data files."

Then you need to tell fopen where to look, otherwise textscan does not get a valid file identifier. Your innovative approach of using

fid = fopen(fullfile(files.folder,out{i}),'r');

will not work for two main reasons:

files.folder creates a comma-separated list containing the folder data from all elements of files, which you then supply as inputs to fullfile. Basically your code does this: fullfile(files(1).folder, files(2).folder, .... , files(end).folder, out{i}), which is very unlikely to be the path of an actual folder.
There is no attempt to use any indexing to provide only the relevant files data. Not all filenames from files occur in out (that was the whole point of your question), but you make no attempt to get only the elements of files that correspond to the filenames in out.
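
Both points can be reproduced on a small example. This is a sketch with invented folder and file names (A, B, C, x.data.N), and it assumes no file exists at the resulting paths, so that fopen fails in the same way as in the error above.

```matlab
% Made-up three-element structure array standing in for the output of dir
files = struct('folder',{'A','B','C'}, ...
               'name',{'x.data.0','x.data.1','x.data.2'});

% files.folder expands to three separate arguments, so these two calls are identical
p1 = fullfile(files.folder, 'x.data.2');
p2 = fullfile('A','B','C','x.data.2');
disp(isequal(p1,p2))

% Indexing a single element gives the intended folder/name pair
p3 = fullfile(files(3).folder, files(3).name);

% fopen does not throw when a file cannot be opened; it returns -1,
% and a later textscan call then reports "Invalid file identifier"
[fid,msg] = fopen(p1,'r');
if fid < 0
    fprintf('fopen failed for %s: %s\n', p1, msg);
end
```

Checking fid right after fopen, as in the last few lines, points at the bad path directly instead of at the textscan call.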

Probably the easiest approach would be to just use the code which I gave at the end of my original answer, but replace pwd with uigetdir:

D = uigetdir(...);

Tessa Kol on 25 Sep 2020

Thank you for your explanation!

Is it also possible to use the same approach when the files are in different folders? I tried that with the following piece of code:

%% Loading the data
rhoPart = 2540;
% Select the main folder
Folder = uigetdir;
% Find all .data files in the sub folders
files = dir(fullfile(Folder,'\**\*.data*'));
% Select only .data file from the last time step of each simulation run
spl = regexp({files.name},'\.data\.','split','once');
spl = vertcat(spl{:});
vec = str2double(spl(:,2));
[~,idx] = sort(vec);
[~,idy,idz] = unique(spl(idx,1),'last');
out = {files(idx(idy)).name};
k = 1;
for i = 1:length(out)
    fid = fopen(fullfile(files(i).folder,out{i}),'r');
    %% Reading the data
    % Read all the data from the file
    dataRead = textscan(fid,'%f %f %f %f %f %f %f %f %f %f %f %f %f %f','HeaderLines',1);
%     frewind(fid);
    % Write headerline N, time, xmin, ymin, zmin, xmax, ymax, zmax
%     runData{k} = strsplit(fgetl(fid), ' ');
    % Write only the x, y, and z components of the particles, particle radius,
    % z component+ particle radius and volume of the particle
    expData{k} = [dataRead{1}(:,1) dataRead{2}(:,1) dataRead{3}(:,1) dataRead{7}(:,1) dataRead{3}(:,1)+dataRead{7}(:,1) rhoPart*(4/3)*pi*(dataRead{7}(:,1).^3)];
    % Write only the vx,vy,vz of the particles and magnitude
    velData{k} = [dataRead{4}(:,1) dataRead{5}(:,1) dataRead{6}(:,1) sqrt(dataRead{4}(:,1).^2 + dataRead{5}(:,1).^2 + dataRead{6}(:,1).^2)];
    fclose(fid);
    k = k + 1;
end

But this obviously doesn't work, because files(i).folder doesn't match out{i}.

The main folder where all the .data files are stored is called 2Dtest4.

Then, I have 4 subfolders called:

2Dtest_all-0.45

2Dtest_all-0.67

2Dtest_all-0.89

2Dtest_all-0.123

The .data files are stored in those subfolders.

Tessa Kol on 25 Sep 2020

files(idx(idy)).folder

Did not work either. I got an error saying:

Error using textscan
Invalid file identifier. Use fopen to generate a valid file identifier.
Error in LedgeTest2D_results (line 33)
    dataRead = textscan(fid,'%f %f %f %f %f %f %f %f %f %f %f %f %f %f','HeaderLines',1);

Stephen23 on 25 Sep 2020

Edited: Stephen23 on 25 Sep 2020

It would probably be easier to get rid of out altogether and just sort the structure itself, e.g.:

files = files(idx(idy));

and then inside the loop you can simply access the folder and name, e.g.:

for k = 1:numel(files)
    fnm = fullfile(files(k).folder,files(k).name);
    fid = fopen(fnm,'rt');
    ...
end
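
Putting the pieces of this thread together, one possible sketch is shown below. It is not code posted by either participant. It indexes the structure array itself, and it also touches on the earlier question about keeping the last few files of every series. The assumptions are: the structure array is a made-up stand-in for the recursive dir call, N is a hypothetical count of trailing time steps to keep, and a series is identified by folder plus base name so that equal base names in different subfolders are not merged.

```matlab
% Made-up stand-in for: files = dir(fullfile(Folder,'**','*.data.*'));
files = struct( ...
    'name',   {'run.data.0','run.data.12','run.data.3','run.data.5','run.data.1'}, ...
    'folder', {'sub1','sub1','sub1','sub2','sub2'});
N = 2;   % hypothetical: number of trailing time steps to keep per series

% Split each name into base name and numeric step
spl = regexp({files.name},'\.data\.','split','once');
spl = vertcat(spl{:});
vec = str2double(spl(:,2));

% Assumption: one series = folder + base name
key = strcat({files.folder}.', filesep, spl(:,1));
[~,~,grp] = unique(key);             % grp(i) is the series index of file i

% Mark the N highest-numbered files of every series
keep = false(numel(files),1);
for g = 1:max(grp)
    members = find(grp == g);
    [~,order] = sort(vec(members));  % ascending step number
    keep(members(order(max(1,end-N+1):end))) = true;
end
sub = files(keep);                   % folder and name stay paired

for k = 1:numel(sub)
    fnm = fullfile(sub(k).folder, sub(k).name);
    fid = fopen(fnm,'rt');
    if fid < 0                       % skip files that cannot be opened
        warning('Could not open %s', fnm);
        continue
    end
    dataRead = textscan(fid, repmat('%f ',1,14), 'HeaderLines',1);
    fclose(fid);
end
```

With N = 1 this reduces to the "last file of every series" selection from the answer. Because sub is a sub-structure of files, sub(k).folder always belongs to sub(k).name, which is the mismatch the out{i} version ran into.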
