Operations on variables with specific naming patterns
I am reading in a large number of text files. The files contain numerical values that I want, and text that I do not. Using the matlab.lang.makeValidName command, I am able to save the numbers into arrays with names like:
- A123
- B123
- C123
- A234
- B234
- C234
- A345
- B345
- C345
- ...
(It's a lot more complex than this in reality. The variable names are a combination of the filename from which the data was read and the name of the values from the file...but let's try to keep it simple in the example! :)
What I am trying to do now is to run calculations on each of the variables with "A" in the title. Using who('-regexp','A') I get a cell that contains the names of all of the variables in my workspace with "A" in the title, but I can't quite figure out what to do next with that data.
If I wanted to add all of the variables with A in the title, what would the proper command be? Likewise, if I wanted to create a much larger matrix of [A123 ; A234 ; A345 ...] what would that command be? The sizes of all of the "A" variables are the same, so there's nothing to worry about there.
Thanks for the help!
7 Comments
per isakson
11 Jul 2017
See TUTORIAL: Why Variables Should Not Be Named Dynamically (eval). After reading this tutorial, are you sure you want to proceed with your current design?
James
13 Jul 2017
per isakson
13 Jul 2017
Edited: per isakson
13 Jul 2017
Yes, your answer makes me feel better :-) And not only me. Regulars at this forum have devoted a lot of time to discouraging the use of eval.
I don't fully understand your problem and would like to pose some questions.
Regarding "large number of text files"
- Do all files have an identical format, i.e. only the actual numbers differ?
- Would it simplify the analysis if the data from all files were in memory at the same time? Or is it possible to process one file at a time?
- Does the data from all files fit in memory (and leave enough memory for the analysis)?
All the arrays, A123,B123,... do they have the same size? And what size are they?
"[A123 B123 C123 ...] [A234 B234 C234 ...] [A345 B345 C345 ...]" Does this denote three 3D arrays?
James
14 Jul 2017
" Fortunately, the variable names and the order in which they are generated are stored in the text file itself"
In general store your data in the simplest arrays possible. In practice this means that you should put them into one numeric array if possible, or you could consider a cell array if sizes/classes are different between data sets, or a structure if the data names are significant.
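A rough sketch of the "one numeric array" option, assuming every "A" data set is a row vector of the same length (the question says the sizes match). The names fileIDs and A and the placeholder values are made up for illustration; in real code each row would come from the file parser.

```matlab
% Hypothetical setup: three files, each yielding one 1x4 row of "A" values.
fileIDs = [123 234 345];            % assumed file identifiers (metadata)
A = zeros(numel(fileIDs), 4);       % one row per file, preallocated

for k = 1:numel(fileIDs)
    % Placeholder data; replace with the numbers parsed from file k.
    A(k, :) = fileIDs(k) + (1:4);
end

total   = sum(A, 1);   % element-wise sum of all "A" data sets
stacked = A;           % this already is the [A123; A234; A345] matrix
```

With this layout, both operations asked about in the question become a single call or no call at all, and no variable names need to be looked up.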
"just LOAD ALL THE FILES be the way to go about avoiding the eval issue?"
The question is not clear: the problems with eval are not because of lack of memory. And loading data can be done as you wish: badly into dynamically named variables, or neatly into simpler arrays or structures, so what difference would loading all of the files make?
"The variable names are consistent" What does that mean? Do all files contain exactly the same variables?
"The next step was going through and looking for the lines of numbers that corresponded to each group of variable names. Any line that started with a (1) corresponded to the first set of variable names, (2) corresponded to the second set, and so on.."
This indicates that the variable names contain metadata. Note that metadata is data. Data should be stored in variables, not in variable names. Once you put your metadata into variables then accessing and processing that metadata will be much faster and more efficient than any hack code you could come up with that accesses the variable names. Read this carefully:
and in particular this section:
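To illustrate "metadata is data", here is a minimal sketch in which the value name and the file identifier live in ordinary arrays next to the numbers. The variable names group, fileID and values, and all of the numbers, are hypothetical.

```matlab
% Hypothetical metadata and data, stored side by side (same order).
group  = {'A', 'B', 'C', 'A', 'B', 'C'};      % value name read from the file
fileID = [123 123 123 234 234 234];           % which file each entry came from
values = {[1 2], [3 4], [5 6], [7 8], [9 10], [11 12]};  % parsed numbers

isA     = strcmp(group, 'A');        % logical index of all "A" entries
stacked = vertcat(values{isA});      % rows of all "A" data sets
total   = sum(stacked, 1);           % element-wise sum of the "A" data sets
idsOfA  = fileID(isA);               % files that the "A" rows came from
```

Selecting by name is then a plain logical index, and the same pattern works for selecting by file.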
James
21 Jul 2017
Guillaume
21 Jul 2017
In my opinion, dynamically named fields are just as bad as eval. You're still encoding metadata in variable (field) names, and this is not the way you should solve your problem.
There are two orthogonal issues at hand:
- Parsing of the files, so that whatever order the variables come in, you know what they are
- Storing of these variables and storing of the metadata
The first one can be solved any number of ways, with more or less complexity depending on how robust you want your parser to be.
For storage, dynamically named anything is not a good idea. If speed is the focus, then as Stephen said the simplest storage is the best, matrices or cell arrays. Otherwise, you could go more fancy with maps (unfortunately, not very well implemented in matlab) or other containers.
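For reference, a minimal sketch of the map option using containers.Map, assuming the value name is the key and each value is a matrix with one row per file. The keys and numbers are placeholders.

```matlab
% Map from value name (char key) to a matrix holding one row per file.
m = containers.Map('KeyType', 'char', 'ValueType', 'any');
m('A') = [1 2; 5 6];     % hypothetical "A" rows from two files
m('B') = [3 4; 7 8];     % hypothetical "B" rows from two files

Avals  = m('A');         % look up all "A" data by key
totalA = sum(Avals, 1);  % element-wise sum over files
hasC   = isKey(m, 'C');  % false: no "C" data was stored
```

The key is an ordinary value here, so it can be built, compared and looped over like any other data.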
Certainly when I see that you want to have variables A123, B123, A234, etc, what I read is that you need
container(A, 1, 2, 3) %where A could be categorical
container(B, 1, 2, 3)
container(A, 2, 3, 4)
It is then trivial to get all 'A' variables
container(A, :, :, :)
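One possible concrete reading of the container idea, simplified to a 3-D numeric array indexed by (value name, file, element) with a categorical vector carrying the names. This is only a sketch: letters, fileIDs, nVals and the placeholder numbers are assumptions, not part of the original post.

```matlab
letters = categorical({'A', 'B', 'C'});   % value names, kept as data
fileIDs = [123 234 345];                  % assumed file identifiers
nVals   = 4;                              % assumed length of each data set

% data(i, j, :) holds the values of name i read from file j.
data = zeros(numel(letters), numel(fileIDs), nVals);
for i = 1:numel(letters)
    for j = 1:numel(fileIDs)
        data(i, j, :) = 100*i + 10*j + (1:nVals);   % placeholder numbers
    end
end

allA = squeeze(data(letters == 'A', :, :));  % one row per file, all "A" data
sumA = sum(allA, 1);                         % element-wise sum across files
```

Here allA plays the role of [A123 ; A234 ; A345], and selecting a different name only changes the comparison against letters.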