MATLAB Answers

Alternatives for using EVAL to access data in multi-layered struct?

Asked by Sjouke Rinsma on 18 Oct 2018
Latest activity: edited by Sjouke Rinsma on 19 Oct 2018
So, I have read many forum topics regarding the use of EVAL and it being bad practice, though in my situation I feel it makes my code more compact and actually more readable. Nevertheless, I was wondering whether there are any alternatives using for example indexing, though I do not yet see how I would implement that in this case.
In short: I have a MATLAB class which imports data from a given folder, possibly containing a multitude of files of different formats (e.g. CSV, XLSX, or other). The data is sorted in a structure according to "data.(filetype).(filename).(tab)", with 'tab' applying to e.g. Excel Workbook files but being omitted for CSV files. Each file's data is then stored into a table, textdata and headers, which adds another layer. To access specific data I use a recursive function to return the structure tree as strings like {'data.xlsx.file1.tab1'; 'data.xlsx.file1.tab2'; 'data.xlsx.file1.tab1'}.
I currently use EVAL to acquire specific data such as 'variable1' from all tables, in order to avoid constantly having to determine the fieldnames and using a 3- or 4-layered for-loop, which becomes additionally cumbersome when having to include exceptions such as missing variables. At the moment I am also pondering how to write data back to specific fields without having to use EVAL again or some form of 'string-split-at-the-dots' and then dynamically inputting the fieldnames. But then again, maybe my whole approach for using such a multi-layered struct is already poor to begin with, so any suggestions and/or alternatives are more than welcome.

Release: R2017a

2 Answers

Answer by Stephen Cobeldick on 19 Oct 2018
Edited by Stephen Cobeldick on 19 Oct 2018
Accepted Answer

One simple solution is to use getfield and setfield to access nested structures. Instead of returning the structure location as one character vector like this:
S = 'data.xlsx.file1.tab1'
you should return it in a cell array of char vectors, like this:
C = {'xlsx','file1','tab1'}
(this will require only a simple change to the recursive function). Then you can trivially access the data using getfield:
getfield(data,C{:})
and that is all! Here is a simple working demonstration:
>> data.xlsx.file1.tab = 1;
>> data.xlsx.file2.tab = 2;
>> data.csv.file2 = 3;
>> C = {'xlsx','file2','tab'};
>> getfield(data,C{:})
ans = 2
No ugly loops, no evil eval, no problems!
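Writing data back works the same way with setfield. The following is a minimal sketch that reuses the demonstration structure above; the new value 20 and the variable name newValue are made up for illustration.

```matlab
% Build a small nested structure like the one in the demonstration.
data = struct();
data.xlsx.file1.tab = 1;
data.xlsx.file2.tab = 2;
data.csv.file2 = 3;

% Path to one nested field, stored as a cell array of char vectors.
C = {'xlsx','file2','tab'};

% setfield returns a modified copy, so the result must be assigned back.
data = setfield(data, C{:}, 20);

% Read the value back through the same path.
newValue = getfield(data, C{:});
disp(newValue)
```

Note that setfield does not modify data in place; forgetting to assign its output back is an easy mistake to make.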
"But then again, maybe my whole approach for using such a multi-layered struct is already poor to begin with..."
Personally I am not a big fan of nested structures, and I notice that they tend to be overused by beginners wanting to reflect the minutiae of how they see their data organization. One of the main risks (which you are running into) is encoding meta-data like filenames and tab names into the code (as fieldnames). This is a bad way to write code: it makes code complex and makes accessing that meta-data slow and buggy. Your approach is very fragile, e.g. because there are many filenames that are not valid fieldnames: what would your code do with the filename a-1.csv ? Or a.2.csv? The approach of mixing meta-data (like filenames and tab names) into the structure of the data is simply flawed, and should be avoided. Meta-data is data, and it should be stored as data in its own right. Consider those example filenames: if we put them into a structure field named filename, then the code will never break depending on the name itself:
S.filename = 'a-1.2.3-4.csv'
You should consider that a table is a very powerful option and has many advantages for processing groups of data.
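As a rough sketch of what the table option could look like, assuming one row per file/tab combination: the variable names fileType, fileName, tabName and payload, and the magic-square placeholder data, are all hypothetical and not taken from the question.

```matlab
% Meta-data stored as ordinary data, one row per file/tab combination.
fileType = {'xlsx'; 'xlsx'; 'csv'};
fileName = {'file1'; 'file1'; 'a-1.2.3-4.csv'};  % any filename is fine here
tabName  = {'tab1'; 'tab2'; ''};                 % empty for CSV files
payload  = {magic(3); magic(4); magic(5)};       % placeholder for imported data

% The table variable names are taken from the workspace variable names.
T = table(fileType, fileName, tabName, payload);

% Select all XLSX entries with plain logical indexing.
isXlsx = strcmp(T.fileType, 'xlsx');
xlsxPayloads = T.payload(isXlsx);

% Look up one entry by its (non-fieldname-safe) filename.
row = strcmp(T.fileName, 'a-1.2.3-4.csv');
csvPayload = T.payload{row};
```

Because the filename is stored as a value rather than as a fieldname, names such as a-1.2.3-4.csv need no special handling.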
Personally I would probably use a single non-scalar structure, where the meta-data are simply encoded as data in fields:
data(1).type = 'xlsx'
data(1).name = 'file1'
data(1).tab = 'tab1'
data(1).data = ...
data(2).type = 'csv'
data(2).name = 'file2'
data(2).tab = [];
data(2).data = ...
This would make accessing and processing the data quite simple, and has some neat syntaxes that you will find very handy:
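For instance, comma-separated lists and logical indexing can be used directly on such a structure array. This is a sketch with made-up field contents (the numeric row vectors stand in for the real imported data, and the third element is added only for illustration):

```matlab
% Start from an empty structure array with the four fields used above.
data = struct('type', {}, 'name', {}, 'tab', {}, 'data', {});

data(1).type = 'xlsx'; data(1).name = 'file1'; data(1).tab = 'tab1'; data(1).data = [1 2 3];
data(2).type = 'csv';  data(2).name = 'file2'; data(2).tab = [];     data(2).data = [4 5];
data(3).type = 'xlsx'; data(3).name = 'file1'; data(3).tab = 'tab2'; data(3).data = [6 7];

% data.type expands to a comma-separated list; braces collect it into a cell array.
types = {data.type};

% Logical indexing selects all elements of one file type.
isXlsx = strcmp(types, 'xlsx');
xlsxEntries = data(isXlsx);

% Square brackets concatenate the comma-separated list of row vectors.
allValues = [data.data];
xlsxValues = [xlsxEntries.data];
```

The same comma-separated list syntax also works on the left-hand side of an assignment, e.g. with deal.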

  2 Comments

Philip Borghesani on 19 Oct 2018
This works fine and is simple code; however, in the long run, using this along with setfield will produce code that is quite a bit slower and possibly more difficult to restructure. If the performance is acceptable then this is a perfectly fine solution. It can also be mixed with handle use in some spots for gradual improvement.
Sjouke Rinsma on 19 Oct 2018
Yep, this is nice and intuitive, and for my application the most straightforward solution.
Since I don't know the layers of the struct beforehand, I will still use the recursive function to return the references to the structure data in text format, split the strings, create a cell array and that's it.
str = 'xlsx.file2.tab';   % 'str' rather than 'string', which would shadow the built-in string function
D = strsplit(str, '.');
getfield(data, D{:})
I do feel a little silly for not being aware of this set/get functionality for fields 8-)
Anyway, thank you both for responding!
EDIT:
@Stephen: I will also look into your suggestion using tables and see whether this is a fitting alternative.
@Philip: Some data files can indeed be quite large, so I'll keep your solution in mind in case performance becomes an issue. Thanks!
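The split-and-getfield idea from this comment could be extended to collect 'variable1' from every table while skipping tables that lack it. This is only a sketch under simplifying assumptions: each leaf is taken to be a table directly (the extra textdata/headers layer from the question is left out), and the example tables and the names paths, values and found are made up.

```matlab
% Hypothetical example data: three tables, one of them without 'variable1'.
data = struct();
data.xlsx.file1.tab1 = table([1;2;3], 'VariableNames', {'variable1'});
data.xlsx.file1.tab2 = table([4;5],   'VariableNames', {'variable2'});
data.csv.file2       = table([6;7],   'VariableNames', {'variable1'});

% Dotted paths as the recursive function might return them (without 'data.').
paths = {'xlsx.file1.tab1'; 'xlsx.file1.tab2'; 'csv.file2'};

values = cell(size(paths));
for k = 1:numel(paths)
    parts = strsplit(paths{k}, '.');       % {'xlsx','file1','tab1'} etc.
    T = getfield(data, parts{:});          % fetch the nested table
    if ismember('variable1', T.Properties.VariableNames)
        values{k} = T.variable1;           % stays empty if the variable is missing
    end
end

% Logical flag per path: true where 'variable1' was found.
found = ~cellfun(@isempty, values);
```

A single loop over the paths replaces the 3- or 4-layered loop, and the missing-variable exception is handled in one place.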


Answer by Philip Borghesani on 18 Oct 2018

I think this is where you went wrong: "To access specific data I use a recursive function to return the structure tree as strings like {'data.xlsx.file1.tab1'; 'data.xlsx.file1.tab2'; 'data.xlsx.file1.tab1'}."
Instead store your table objects as handle objects inside the structure data. Then have the recursive function return the handle(s) to the data object(s) in a cell array or object array. Access will then be fast and there will be no need for eval to read or modify the table objects.
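One way to read this suggestion is a small handle class that wraps each table. The class below is a hypothetical sketch, not code from this answer: the class name TableRef and the property name Table are assumptions, and a classdef has to be saved in its own file (here TableRef.m) on the MATLAB path.

```matlab
% TableRef.m  (hypothetical class; save in its own file on the MATLAB path)
%
% Usage sketch:
%   data.xlsx.file1.tab1 = TableRef(table([1;2;3], 'VariableNames', {'variable1'}));
%   h = data.xlsx.file1.tab1;     % copies the handle, not the table
%   h.Table.variable1(2) = 20;    % modifies the one shared table
%   data.xlsx.file1.tab1.Table.variable1(2)   % reads the modified value
classdef TableRef < handle
    properties
        Table   % the wrapped table (or any other data)
    end
    methods
        function obj = TableRef(t)
            % Allow construction with or without initial data.
            if nargin > 0
                obj.Table = t;
            end
        end
    end
end
```

With this layout the recursive function could return the handles themselves in a cell array, so reading and writing go through the handle and no path strings are needed.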

  1 Comment

Sjouke Rinsma on 19 Oct 2018
" Instead store your table objects as handle objects inside the structure data."
Phew, that's a brain teaser. Okay, so this sounds like a nice solution; it never crossed my mind that I could create handles to the structure data. I found one example in the forum:
Is the accepted answer what you are referring to? In that case let's see if I get this correct: I would need to create shortnames (or 'pointers' if you will) like e.g. 'XLSX_Fx_Px'. Then I need to input for example 'data.xlsx.file1.tab1' to the hstruct class and assign the handle to it like:
data.xlsx.file1.tab1 = hstruct(data.xlsx.file1.tab1);
XLSX_F1_P1 = data.xlsx.file1.tab1;
I would use the recursive function to apply this to all structure data and collect all shortnames in a cell array. The assignment in the first line does add a new layer resulting in 'data.xlsx.file1.tab1.DATA' (the hstruct class obj.DATA in caps to avoid confusion). Though I can definitely see the ease of use since I can now access and modify the same data using 'XLSX_F1_P1.DATA'... I'm still wondering whether this is exactly what you mean, since this does involve reorganizing my original structure.
