How to extract information from the name of a file without using strsplit function?

Peter, 28 Dec 2014
Edited: per isakson, 29 Dec 2014
Hello!
I need some help extracting some coordinates from the name of a file.
I have 224 files named like this: files=[{' x+0.000mm_y_+0.000mm_z_+0.000mm.dat_ '},....].
I want to extract the X and Y coordinates (both 0.000 in this case) and end up with a 224x2 matrix: one row per file (each file is a point) and two columns for x and y.
I cannot use the strsplit function because I am using MATLAB 2010. I have tried regexp:
delim = '[mm+]';
coord = regexp(files,delim, 'split');
but then I obtain a 224x1 cell array whose cells each contain a 1x10 cell array, and I do not know how to work with it.
Any ideas? I would really appreciate your help.
Thanks!!
5 comments
per isakson, 28 Dec 2014
Read the documentation on regexp and play with a couple of its examples.
Peter, 28 Dec 2014
Thank you very much, isakson. Post your answer as an "answer" and I will accept it so that you get the points.


Accepted Answer

Azzi Abdelmalek, 28 Dec 2014
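files = [{' x+0.000mm_y_+0.000mm_z_+0.000mm.dat_ '} {' x+10.000mm_y_+20.000mm_z_+30.000mm.dat_ '}]
% Match every unsigned decimal number in each file name
coord = regexp(files,'\d+(\.\d+)?','match')
% Convert the matches to numbers and stack them, one row per file (for a 1xN cell array)
coord = cell2mat(cellfun(@str2double,coord,'UniformOutput',false)')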
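Since the question asks only for x and y, and the signs in the names may matter, the two values can also be captured directly as tokens. The following is a minimal sketch, assuming every name follows the x...mm_y_...mm layout shown in the question; the expression and the variable names expr, tok and xy are illustrative and not part of the original answer.

```matlab
% Sample names copied from the question; the real list would have 224 entries.
files = {' x+0.000mm_y_+0.000mm_z_+0.000mm.dat_ '
         ' x+5.000mm_y_+10.000mm_z_+0.000mm.dat_ '
         ' x+10.000mm_y_+20.000mm_z_+30.000mm.dat_ '};

% Capture the signed numbers that follow "x" and "y_" (assumed name layout).
expr = 'x([+-]?\d+(?:\.\d+)?)mm_y_([+-]?\d+(?:\.\d+)?)mm';
tok  = regexp(files, expr, 'tokens', 'once');   % one 1x2 cell of strings per file

% Convert file by file, so that row k of xy belongs to files{k}.
xy = nan(numel(files), 2);
for k = 1:numel(files)
    if ~isempty(tok{k})                 % names that do not match stay NaN
        xy(k, :) = str2double(tok{k});
    end
end
```

Because the loop writes row k from files{k}, the rows of xy are in the same order as the entries of files, whatever that order is.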
3 comments
Peter, 28 Dec 2014
I need to split this row of 672 columns into groups of 3 coordinates, to get 224 rows with 3 columns.
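A single row of 672 values is what the code in the answer produces when files is a 224x1 column rather than a 1x224 row, because the transpose then places the per-file results side by side. A minimal sketch of regrouping such a row, assuming the values are stored file by file as x, y, z (shortened here to three files; the variable names are illustrative):

```matlab
% Row of 3*N values in the order x1 y1 z1 x2 y2 z2 ... (N = 3 here, 224 in the question)
row = [0 0 0, 5 10 0, 10 20 30];

% reshape fills column by column, so build a 3xN matrix first, then transpose it.
xyz = reshape(row, 3, []).';   % N x 3, one row per file

xy = xyz(:, 1:2);              % keep only the x and y columns
```

Alternatively, dropping the transpose in the last line of the answer should give the 224x3 matrix directly when files is a column.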
Peter, 29 Dec 2014
I tried that and it did not work.
When I use:
coord = regexp(files,'\d+(\.\d+)?','match')
I obtain the three coordinates, but they are not in the same order as the file names, and that is a very important problem to solve.
I need the 3 coordinates in the same order as the files they were extracted from.
For example for a group of files like this:
files=[{' x+0.000mm_y_+0.000mm_z_+0.000mm.dat_ '} {' x+5.000mm_y_+10.000mm_z_+0.000mm.dat_ '} {' x+10.000mm_y_+20.000mm_z_+30.000mm.dat_ '} {' x+15.000mm_y_+40.000mm_z_+0.000mm.dat_ '} {' x+20.000mm_y_+60.000mm_z_+0.000mm.dat_ '}]
I need the coordinates in this form and order:
0 0
5 10
10 20
15 40
20 60
Do you know how to do it? Why do they follow a different order? Is this because of 'match'?
Thanks


More Answers (1)

per isakson, 29 Dec 2014
Edited: per isakson, 29 Dec 2014
The best code is the one that will be easiest to understand three months from now, chosen among the alternatives that return the correct result. And sometimes speed matters.
The question is "[...] extracting some coordinates [, i.e. numerical data] from the name of files [on a disk]"
dir is the obvious function to use to create a list of file names. dir returns a structure array. In this case, converting it to a cell array just makes the code more complicated.
The documentation of dir ("List folder contents") says: "Results appear in the order returned by the operating system." (Windows returns the file names in "ASCII order".)
textscan is the best function to use to extract numerical data that is embedded in a fixed character string. More often than not, regexp leads to more complicated code.
A for-loop often makes the code easier to understand and is fast enough in most cases.
sortrows is used to "correct" the sort order returned by dir.
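The effect of the listing order, and one way to keep the file names aligned with the sorted coordinates, can be sketched as follows. This is only an illustration under assumptions: sort is used here to mimic the character-code order that dir may return on Windows, no files are read from disk, and the variable names listed, coord_sorted and names_sorted are hypothetical.

```matlab
% Hypothetical listing: the demo names, put in character-code order with sort
% to mimic what dir may return ('x+10...' then comes before 'x+5...').
listed = sort({'x+0.000mm_y_+0.000mm_z_+0.000mm.dat'
               'x+5.000mm_y_+10.000mm_z_+0.000mm.dat'
               'x+10.000mm_y_+20.000mm_z_+30.000mm.dat'
               'x+15.000mm_y_+40.000mm_z_+0.000mm.dat'
               'x+20.000mm_y_+60.000mm_z_+0.000mm.dat'});

% Parse each name with a fixed format string, as suggested for textscan.
coord = nan(numel(listed), 3);
for k = 1:numel(listed)
    num = textscan(listed{k}, 'x%fmm_y_%fmm_z_%fmm.dat', 'CollectOutput', true);
    coord(k, :) = num{1};
end

% Sort numerically and reorder the names with the same index vector,
% so that names_sorted{k} still belongs to coord_sorted(k, :).
[coord_sorted, idx] = sortrows(coord, [1 2 3]);
names_sorted = listed(idx);
```

Keeping the index vector returned by sortrows matters when the data inside each file must later be matched to its coordinates.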
Demo:
>> coord = cssm('h:\m\cssm')
coord =
0 0 0
5 10 0
10 20 30
15 40 0
20 60 0
>>
where
function out = cssm( folderspec )
    % Create the sample files, then read the coordinates back from their names
    create_some_files( folderspec )
    out = extract_numerical_data( fullfile( folderspec, 'x*.dat' ) );
end
function create_some_files( folderspec )
    filenames = { 'x+0.000mm_y_+0.000mm_z_+0.000mm.dat'
                  'x+5.000mm_y_+10.000mm_z_+0.000mm.dat'
                  'x+10.000mm_y_+20.000mm_z_+30.000mm.dat'
                  'x+15.000mm_y_+40.000mm_z_+0.000mm.dat'
                  'x+20.000mm_y_+60.000mm_z_+0.000mm.dat' };
    for jj = 1 : length( filenames )
        cmd_create_file( fullfile( folderspec, filenames{jj} ) );
    end
end
function out = extract_numerical_data( glob )
    % dir returns a structure array with one element per matching file
    file_data = dir( glob );
    coord = nan( length(file_data), 3 );
    for rr = 1:length(file_data)
        % Read the three numbers embedded in the fixed name pattern
        cell_num = textscan( file_data(rr).name     ...
                        , 'x%fmm_y_%fmm_z_%fmm.dat' ...
                        , 'CollectOutput', true );
        coord( rr, : ) = cell_num{1};
    end
    % Sort numerically by x, then y, then z
    out = sortrows( coord, [1,2,3] );
end
and where cmd_create_file is attached
1 comment
Peter, 29 Dec 2014
Edited: Peter, 29 Dec 2014
Okay, very nice and fine work!
And thanks for the explanation!
