Import files based on file name
I have a folder containing around 300,000 files. I don't need to import all the files.
Problem: How can I import files based on a specific file name?
Problem example:
In the picture below I have a section of the files, which are all in the same folder. I only want to import the .data files. However, I don't need all of the .data files imported, only the last one of every series.

Up until now I have the following code:
files = dir(fullfile(pwd,'*.data*'));
expData = cell(length(files),1);
for i = 1:length(files)
fid = fopen(fullfile(files(i).folder,files(i).name),'r');
%% Reading the data
% Read all the data from the file
dataRead = textscan(fid,'%f %f %f %f %f %f %f %f %f %f %f %f %f %f','HeaderLines',1);
end
This code imports all of the data files. How can I import only the last .data file of every series, perhaps with a for loop?
Accepted Answer
Stephen23
2 Sep 2020
fnm = {...
'LedgeTest_muSP_0.10_muRP_0.10.1.data.0',...
'LedgeTest_muSP_0.10_muRP_0.10.1.data.1',...
'LedgeTest_muSP_0.10_muRP_0.10.1.data.2',...
'LedgeTest_muSP_0.10_muRP_0.10.1.data.21',...
'LedgeTest_muSP_0.10_muRP_0.20.1.data.0',...
'LedgeTest_muSP_0.10_muRP_0.20.1.data.1',...
'LedgeTest_muSP_0.10_muRP_0.20.1.data.2',...
'LedgeTest_muSP_0.10_muRP_0.20.1.data.11'}
spl = regexp(fnm,'\.data\.','split','once');
spl = vertcat(spl{:});
vec = str2double(spl(:,2));
[~,idx] = sort(vec);
[~,idy,idz] = unique(spl(idx,1),'last');
out = fnm(idx(idy))
Giving:
out =
'LedgeTest_muSP_0.10_muRP_0.10.1.data.21'
'LedgeTest_muSP_0.10_muRP_0.20.1.data.11'
Use it like this:
D = pwd;
tmp = dir(fullfile(D,'*.data.*'));
fnm = {tmp.name};
...
for k = 1:numel(out)
fid = fopen(fullfile(D,out{k}),'r');
...
end
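A minimal end-to-end sketch of the same idea follows. The temporary folder and the file names runA/runB are made up purely for illustration, and the fopen check is an addition that is not part of the answer above. The key point is that the step numbers are sorted numerically (so 10 comes after 2), and unique with 'last' then keeps the final entry of each name prefix.

```matlab
% Hypothetical demo folder with a few empty files (illustration only).
D = tempname;
mkdir(D);
names = {'runA.data.0','runA.data.2','runA.data.10','runB.data.0','runB.data.3'};
for k = 1:numel(names)
    fclose(fopen(fullfile(D,names{k}),'w'));
end

% List the files and split each name at '.data.' into prefix and step number.
tmp = dir(fullfile(D,'*.data.*'));
fnm = {tmp.name};
spl = regexp(fnm,'\.data\.','split','once');
spl = vertcat(spl{:});
vec = str2double(spl(:,2));

% Sort by numeric step, then keep the last occurrence of each prefix.
[~,idx] = sort(vec);
[~,idy] = unique(spl(idx,1),'last');
out = fnm(idx(idy));

% Open each selected file, checking that fopen succeeded before reading.
for k = 1:numel(out)
    [fid,msg] = fopen(fullfile(D,out{k}),'r');
    if fid < 0
        error('Could not open %s: %s', out{k}, msg);
    end
    % ... read with textscan here ...
    fclose(fid);
end
```

Checking the return value of fopen makes a wrong path fail with a readable message at the point where the path is built, rather than later inside textscan.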
9 Comments
Tessa Kol
2 Sep 2020
I implemented your code into my code:
files = dir(fullfile(pwd,'*.data*'));
spl = regexp({files.name},'\.data\.','split','once');
spl = vertcat(spl{:});
vec = str2double(spl(:,2));
[~,idx] = sort(vec);
[~,idy,idz] = unique(spl(idx,1),'last');
out = struct2cell(files(idx(idy)))
for k = 1:numel(out)
fid = fopen(fullfile(pwd,out{k}),'r');
...
end
But I got the following error:
Error using fullfile (line 103)
All inputs must be strings, character vectors, or cell arrays of character vectors.
Error in untitled2 (line 12)
fid = fopen(fullfile(pwd,out{k}),'r');
Although innovative, this line:
out = struct2cell(files(idx(idy)))
and the corresponding indexing:
out{k}
need some more thought.
Note that struct2cell creates a cell array where the first dimension encodes the fields and the other dimensions correspond to the dimensions of the input structure arrays shifted/permuted by one.
So with your adaption, the code will iterate (using linear indexing) down the first column, which thus contains the data from the first element of the structure array returned by dir. Not all of this data is character, e.g. the fields bytes, isdir, and datenum. So as soon as your code refers to one of the cells with those data, it will throw an error. Even without throwing an error the code would still be incorrect because only one of the cells in each column is actually the filename.
One fix would be to change the code to use subscript indexing instead of linear indexing. Or use a simple comma-separated list to create a cell array with only the filenames:
out = {files(idx(idy)).name};
Or if you want a sub-structure of that returned by dir then just index into it:
sub = tmp(idx(idy));
The approach I showed you in my answer works without error because it does not include all of the other fields in the cell array, only the filenames, thus trivial linear indexing is all that is required.
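To make the struct2cell layout concrete, here is a small sketch. The structure S is a hypothetical stand-in with only three of the fields that dir returns; the real dir output has more fields, but the layout follows the same rule.

```matlab
% Hypothetical 2-element structure with a few dir-like fields.
S = struct('name',{'a.data.0','a.data.1'}, 'bytes',{10,20}, 'isdir',{false,false});

C = struct2cell(S);     % fields along dim 1, then the dimensions of S
sz = size(C);           % 3 fields, and S is 1-by-2

% Linear indexing walks down the fields of S(1) first:
first  = C{1};          % name of S(1)
second = C{2};          % bytes of S(1), which is numeric, not a filename

% Subscript indexing picks the name field of every element:
namesFromCell = squeeze(C(1,1,:)).';

% The comma-separated list gives the same cell array more directly:
namesFromList = {S.name};
```

Because the second linear index already lands on a numeric field, passing C{k} to fullfile fails as soon as k reaches 2.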
Tessa Kol
2 Sep 2020
Thank you so much for all the help!
Is it also possible to extract a range of files, for example the last 5 .data files of every series?
I also encountered an error when adjusting the previous code:
Error using textscan
Invalid file identifier. Use fopen to generate a valid file identifier.
Error in LedgeTest2D_results (line 30)
dataRead = textscan(fid,'%f %f %f %f %f %f %f %f %f %f %f %f %f %f','HeaderLines',1);
% Select only .data file from the last time step of each simulation
files = dir(fullfile(uigetdir,'*.data*'));
spl = regexp({files.name},'\.data\.','split','once');
spl = vertcat(spl{:});
vec = str2double(spl(:,2));
[~,idx] = sort(vec);
[~,idy,idz] = unique(spl(idx,1),'last');
out = {files(idx(idy)).name};
expData = cell(length(out),1);
for i = 1:length(out)
fid = fopen(fullfile(files.folder,out{i}),'r');
%% Reading the data
% Read all the data from the file
dataRead = textscan(fid,'%f %f %f %f %f %f %f %f %f %f %f %f %f %f','HeaderLines',1)
end
The matlab file is in a different folder from the .data files.
"The matlab file is in a different folder from the .data files."
Then you need to tell textscan where to look. Your innovative approach of using
fid = fopen(fullfile(files.folder,out{i}),'r');
will not work for two main reasons:
- files.folder creates a comma-separated list containing the folder data from all elements of files, which you then supply as inputs to fullfile. Basically your code does this: fullfile(files(1).folder, files(2).folder, ... , files(end).folder, out{i}), which is very unlikely to be the path of an actual folder.
- There is no attempt to use any indexing to provide only the relevant files data. Not all filenames from files occur in out (that was the whole point of your question), but you make no attempt to get only the elements of files that correspond to the filenames in out.
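The comma-separated list expansion described in the first point can be seen in a small sketch. The structure array here is a hypothetical stand-in for the output of dir, and the folder name 'results' is made up.

```matlab
% Hypothetical structure array standing in for the output of dir.
files = struct('folder',{'results','results','results'}, ...
               'name',  {'a.data.0','a.data.1','b.data.0'});

% files.folder expands to three separate arguments, so this call is the
% same as fullfile('results','results','results','a.data.1'):
wrong = fullfile(files.folder, 'a.data.1');

% Indexing a single element first gives one folder and a usable path:
right = fullfile(files(2).folder, files(2).name);
```

With 300,000 files the first form would chain 300,000 folder names together, which is why fopen returns an invalid file identifier.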
Probably the easiest approach would be to just use the code which I gave at the end of my original answer, but replace pwd with uigetdir:
D = uigetdir(...);
Stephen23
2 Sep 2020
"That is not ideal."
Personally I would avoid uigetdir, I only referred to it because that was what you showed in your previous comment. Putting UIs into code makes them difficult to generalize, to call in loops or from other functions, or to automatically test.
I would just supply that folder name once as a parameter/function input.
Tessa Kol
25 Sep 2020
Thank you for your explanation!
Is it also possible to use the same approach when the files are in different folders? I tried that with the following piece of code:
%% Loading the data
rhoPart = 2540;
% Select the main folder
Folder = uigetdir;
% Find all .data files in the sub folders
files = dir(fullfile(Folder,'\**\*.data*'));
% Select only .data file from the last time step of each simulation run
spl = regexp({files.name},'\.data\.','split','once');
spl = vertcat(spl{:});
vec = str2double(spl(:,2));
[~,idx] = sort(vec);
[~,idy,idz] = unique(spl(idx,1),'last');
out = {files(idx(idy)).name};
k = 1;
for i = 1:length(out)
fid = fopen(fullfile(files(i).folder,out{i}),'r');
%% Reading the data
% Read all the data from the file
dataRead = textscan(fid,'%f %f %f %f %f %f %f %f %f %f %f %f %f %f','HeaderLines',1);
% frewind(fid);
% Write headerline N, time, xmin, ymin, zmin, xmax, ymax, zmax
% runData{k} = strsplit(fgetl(fid), ' ');
% Write only the x, y, and z components of the particles, particle radius,
% z component+ particle radius and volume of the particle
expData{k} = [dataRead{1}(:,1) dataRead{2}(:,1) dataRead{3}(:,1) dataRead{7}(:,1) dataRead{3}(:,1)+dataRead{7}(:,1) rhoPart*(4/3)*pi*(dataRead{7}(:,1).^3)];
% Write only the vx,vy,vz of the particles and magnitude
velData{k} = [dataRead{4}(:,1) dataRead{5}(:,1) dataRead{6}(:,1) sqrt(dataRead{4}(:,1).^2 + dataRead{5}(:,1).^2 + dataRead{6}(:,1).^2)];
fclose(fid);
k = k + 1;
end
But this obviously doesn't work, because files(i).folder doesn't match out{i}.
The main folder where all the .data files are stored is called 2Dtest4.
Then, I have 4 subfolders called:
2Dtest_all-0.45
2Dtest_all-0.67
2Dtest_all-0.89
2Dtest_all-0.123
The .data files are stored in those subfolders.
Tessa Kol
25 Sep 2020
files(idx(idy)).folder
Did not work either. I got an error saying:
Error using textscan
Invalid file identifier. Use fopen to generate a valid file identifier.
Error in LedgeTest2D_results (line 33)
dataRead = textscan(fid,'%f %f %f %f %f %f %f %f %f %f %f %f %f %f','HeaderLines',1);
It would probably be easier to get rid of out altogether and just sort the structure itself, e.g.:
files = files(idx(idy));
and then inside the loop you can simply access the folder and name, e.g.:
for k = 1:numel(files)
fnm = fullfile(files(k).folder,files(k).name);
fid = fopen(fnm,'rt');
...
end
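One caveat when searching subfolders: if files in different subfolders can share the same name prefix (an assumption, the thread does not say whether they do), then unique on the prefix alone would merge them into one series. A possible sketch that builds the series key from folder plus prefix is shown below. The structure array is a hypothetical stand-in for the output of dir(fullfile(Folder,'**','*.data.*')), and the names sub1, sub2 and run are made up.

```matlab
% Hypothetical stand-in for the recursive dir output.
files = struct( ...
    'folder', {'sub1','sub1','sub1','sub2','sub2'}, ...
    'name',   {'run.data.0','run.data.12','run.data.3','run.data.0','run.data.7'});

% Split each name at '.data.' into prefix and numeric step.
spl = regexp({files.name},'\.data\.','split','once');
spl = vertcat(spl{:});
vec = str2double(spl(:,2));

% Series key = folder plus name prefix, so equal prefixes in different
% subfolders are treated as separate series.
key = strcat({files.folder}.', filesep, spl(:,1));

% Sort by step, keep the last entry of each series, and index the
% structure itself so that folder and name stay paired.
[~,idx] = sort(vec);
[~,idy] = unique(key(idx),'last');
files = files(idx(idy));

for k = 1:numel(files)
    fnm = fullfile(files(k).folder, files(k).name);
    % fid = fopen(fnm,'rt'); ... fclose(fid);
end
```

If the prefixes are already unique across subfolders, the key can simply be spl(:,1) as in the original answer.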