Best way to access external config data (load a MAT-file vs. importdata vs. fscanf, etc.)

Greetings. I need to store some external parameters in a configuration file (no more than 50 lines). It is not yet defined how many subfunctions will need to have access to it, but overall speed will be important. The whole application will be compiled in the end.
I have found several ways to do this but am not sure which is best: I could load a MAT-file, read an ASCII text file, read a CSV file, and so on (each with several possible functions).
What is the best (most efficient) way to do this?
Thanks.

5 Comments

dpb on 13 Jun 2014
For what working definition of "most efficient"?
For only 50 parameters, the time to process even a text file will be minuscule unless you re-read it over and over, and if you do that, it's a poor factorization of the problem, I'd suggest. I wouldn't mention it except for your comment that "it is not yet defined how many subfunctions will need to have access to it". I'd say the file should be read once at the beginning and the parameters then passed to the various functions as needed.
The key question is whether the configuration file needs to be human-readable. If not, either a .mat file or a stream file would be both the smallest and the fastest to load. If, on the other hand, people need to create or modify the file with a text editor or similar tool outside the application, then an ASCII format is obviously needed. Only you, as the program designer, together with input from your future users, can decide that; none of us here can.
Mathieu on 16 Jun 2014
Edited: Mathieu on 16 Jun 2014
Thanks.
Of course, "most efficient" is difficult to define precisely before the program architecture is complete and its bottlenecks are known.
Regarding the key question: a .mat file isn't a problem, as it is always possible to write a script that reads it and generates it.
What do you call a "stream file"?
Poor factorization of the problem: true, but I'd like to keep the function definitions (including the number of arguments) as stable as possible to avoid rewrites and errors (multiple people are working on the files). Maybe a structure containing all the configuration data would be a good solution, but I don't find it very "clean". What is your opinion on this?
In general, in terms of speed and memory, with say 50 parameters, could you correct this order from fast to slow (and say whether the difference is significant)?
1. load('config.mat')
2. importdata('config.csv')
3. fscanf on config.txt
4. textscan on config.txt
dpb on 16 Jun 2014
Edited: dpb on 16 Jun 2014
A stream file is what is often called a "binary" file, i.e. one written with fwrite and read with fread.
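For anyone who has only used fread/fwrite with serial ports, here is a minimal sketch of the same calls on a local file. The file name config.bin and the three values are made up for illustration; the point is only that the file holds raw bytes, so the reader has to know the type and order of what was written.

```matlab
% Hypothetical parameter values and file name, for illustration only.
params = [1.5, 200, 0.01];
fname  = fullfile(tempdir, 'config.bin');

% Write the values as raw 64-bit doubles (no header, no variable names).
fid = fopen(fname, 'w');
assert(fid ~= -1, 'Could not open %s for writing.', fname);
fwrite(fid, params, 'double');
fclose(fid);

% Read them back. The file stores nothing but bytes, so the type
% ('double') and the meaning of each position are up to the reader.
fid = fopen(fname, 'r');
assert(fid ~= -1, 'Could not open %s for reading.', fname);
readBack = fread(fid, Inf, 'double');   % column vector of doubles
fclose(fid);
```

Three doubles take 24 bytes on disk this way, which is about as compact as it gets, at the cost of a file nobody can edit by hand.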
I think a structure acting as a configuration object is by far the cleaner solution. It encapsulates changes inside the structure, so the function interfaces stay fixed and the code is far less prone to rampant changes.
Again, unless you insist on embedding this I/O operation deep within a nested loop or some such, it will almost certainly make no difference to overall performance. It's just unwise not to do as good a job of factorization and encapsulation as feasible from the get-go rather than start with a bad factoring.
If there are subsections that need disparate data, create specific structures or the like for them to minimize the amount of data being passed around, rather than reloading things at random.
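As a rough sketch of that idea (the parameter names gain, offset and nSamples and the two small functions are invented for illustration, not taken from the actual application): read the file once, keep the result in a struct, and hand that struct to whatever needs it.

```matlab
% Hypothetical parameter names and values, for illustration only.
cfgFile = fullfile(tempdir, 'config.mat');
gain = 2.0; offset = 0.5; nSamples = 4;
save(cfgFile, 'gain', 'offset', 'nSamples');   % stands in for the real config file

% Read the file once at start-up. With an output argument, load
% returns a struct with one field per saved variable.
cfg = load(cfgFile);

% Each function takes the struct as a single argument, so adding a
% parameter later does not change any function signature.
makeSignal  = @(cfg) (1:cfg.nSamples);
scaleSignal = @(x, cfg) cfg.gain * x + cfg.offset;

y = scaleSignal(makeSignal(cfg), cfg);
```

Anonymous functions are used here only to keep the sketch in one script; in a real code base these would be ordinary function files that accept cfg.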
I'd say the order would likely be (though only timing with a specific file size and structure can prove it):
fread
load
fscanf
...
The higher-level functions trail in no known order, but generic ones such as importdata, which do a lot of testing to handle anything automagically, will likely bring up the rear, possibly by a fairly wide margin.
Mathieu on 17 Jun 2014
Thanks a lot!
I also intuitively thought importdata would be slow, but for some odd reason I had lumped its speed together with that of load. Your ordering makes complete sense.
I use fread/fwrite with virtual COM ports but have no idea how to use them with local files. Anyhow, I don't think it is worth investigating now.
I'll start with load and will switch to fscanf if needed.
Thanks again for the support.
dpb on 17 Jun 2014
The biggest difference in speed between the classes comes from whether the files are formatted or unformatted. Formatted files carry two overheads: first, there are many more bytes per numeric value, and second, it takes compute time to convert between internal storage and the ASCII representation.
For formatted files, the higher-level, more convenient functions TMW supplies have extra processing and logic built in; that makes them more convenient to use but comes at the price of far more complexity in the processing. load is slightly more complex than a pure stream of bytes because a .mat file also stores the structure and other information associated with the variable(s), whereas all of that is your responsibility if you simply write an array as a byte stream.
Similarly, if you use fscanf by itself, you spell out all the formatting and structure of the file explicitly, so there's no additional internal logic. The more convenient functions eventually get down to the same final I/O call, but they have (a varying amount of) flexibility built in, and that costs something in performance as payment for the convenience (there is no free lunch).
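To make the fscanf case concrete, here is a minimal sketch assuming the simplest possible layout, one numeric value per line. The file name config.txt and the values are placeholders; a real configuration file would likely need a richer format string.

```matlab
% Hypothetical file name and values, for illustration only.
fname  = fullfile(tempdir, 'config.txt');
values = [1.5; 200; 0.01];

% Write one value per line in a form a text editor can open.
fid = fopen(fname, 'w');
assert(fid ~= -1, 'Could not open %s for writing.', fname);
fprintf(fid, '%.15g\n', values);
fclose(fid);

% Read with an explicit format. fscanf reapplies '%f' until the
% end of the file and does no guessing about the layout.
fid = fopen(fname, 'r');
assert(fid ~= -1, 'Could not open %s for reading.', fname);
readBack = fscanf(fid, '%f');   % column vector
fclose(fid);
```

The format string is the whole "parser" here, which is why there is so little overhead compared with the generic import functions.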
OTOH, I'll re-emphasize: write the code for ease of maintenance and clarity of expression first. The amount of data you're talking about is so small as to be of no concern for execution time or storage space (again, presuming you avoid doing the same thing over and over and over...).


More Answers (1)

Pritesh Shah on 17 Jun 2014

I think you could use the tic and toc commands to time each approach and find the best solution for your case.
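Something along these lines might do as a starting point. It is only a sketch: the file names, the 50 random values and the repeat count are assumptions, and the numbers it prints will depend on the machine, the disk cache and the MATLAB release.

```matlab
% Hypothetical file names and data; 50 values to match the question.
params  = rand(50, 1);
matFile = fullfile(tempdir, 'cfg_timing.mat');
txtFile = fullfile(tempdir, 'cfg_timing.txt');
save(matFile, 'params');
fid = fopen(txtFile, 'w');
assert(fid ~= -1, 'Could not open %s for writing.', txtFile);
fprintf(fid, '%.17g\n', params);
fclose(fid);

nRep = 100;   % repeat each read to average out timer noise

% Time load on the MAT-file.
tic;
for k = 1:nRep
    s = load(matFile);
end
tLoad = toc / nRep;

% Time fopen/fscanf/fclose on the text file.
tic;
for k = 1:nRep
    fid = fopen(txtFile, 'r');
    v = fscanf(fid, '%f');
    fclose(fid);
end
tFscanf = toc / nRep;

fprintf('load:   %.3g ms per read\n', 1e3 * tLoad);
fprintf('fscanf: %.3g ms per read\n', 1e3 * tFscanf);
```

The same loop can be copied for importdata, textscan or fread to fill in the rest of the comparison.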

Asked: 13 Jun 2014
Answered: 17 Jun 2014