When should you start to consider using MEX files?
Whilst following some discussion forums about Project Euler problems and the best languages to use, I saw someone ask whether Matlab was any good for solving all of the problems. The reply said that some of the problems were designed to take considerably more time as a Matlab function than in a compiled language.
So my question: how do I decide whether a problem is best suited to Matlab or to C/C++, other than by writing a program/function and letting my dinner get cold while I wait for an answer? Should I teach myself to build MEX files, or would I be better off just writing, compiling and running from the command line?
Answers (2)
Walter Roberson on 14 Nov 2011
One of the first things you have to ponder is execution time compared to development time (including debugging).
In my work, we develop new algorithms and compare them to see which one produces the best answer. For our purposes, ease of development and debugging is important: R&D is a lot of hard thinking, and thinking is expensive. Once we have settled on some particular algorithms, it might be time to make them substantially faster using a different language, or it might be time to license the intellectual property and let some other organization worry about that.
And since R&D often turns out to be about economic politics rather than about the best product, it sometimes turns out that taking the algorithms to market is financially impractical: it costs roughly $3 million to get through all of the regulatory assessments and field trials for anything the US considers a "medical device". In such a case, we may just continue to use the prototyped version in-house without rewriting it in a faster language, knowing that the expense of rewriting would be far more than the in-house expense of waiting a little longer for the MATLAB version to finish.
Now, if "everyone" in-house were using what we develop, then the total waiting time might make it worthwhile to spend the time on a rewrite.
Jan on 15 Nov 2011
The allocation of large temporary arrays can consume more time than the actual calculations. Example:
x = rand(1, 1e7);
v = max(abs(sqrt(sin(x))) > 5);
Required temporary arrays:
tmp1 = sin(x);
tmp2 = sqrt(tmp1);
tmp3 = abs(tmp2);
tmp4 = (tmp3 > 5);
None of these memory blocks fits into the processor's internal data cache or into the 2nd level cache (1e7 doubles occupy 80 MB). Therefore a time-consuming exchange with the RAM is needed.
FOR loops have been much faster since Matlab 6.5 (R13): the JIT acceleration improves their speed massively. For this example, I expect a C-mex to be even faster. But the main benefit of the FOR loop, in Matlab or in C, is that it avoids the need for large temporary arrays.
I've published some C-mex files in the FEX which benefit from this method: DNorm2, anyExceed, anyEq, Shuffle, DGradient. Another, more surprising, field for C-mex functions is replacing built-in functions which seem to have a suboptimal implementation, e.g. filter or horzcat (see FilterM and Cell2Vec).
But before you start to implement functions in C, use the profiler or other speed measurements to find the bottlenecks of your program. As Walter has already said: spending hours to optimize a function which runs for only seconds is a waste of time, except if real-time programming is necessary or the total runtime of the program can be squeezed under 5 seconds. This is the magic limit beyond which waiting provokes physical stress for human users.
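On the MATLAB side the profiler does the measuring. For the C side of such a comparison, a small stopwatch along the following lines might be enough to check whether a rewrite pays off. This is only a sketch: `best_cpu_seconds` and `sum_workload` are names invented for illustration, and `clock()` measures CPU time rather than wall-clock time.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stopwatch helper: runs fn(arg) `repeats` times and returns
 * the shortest CPU time in seconds, as measured by clock(). */
double best_cpu_seconds(void (*fn)(void *), void *arg, int repeats)
{
    assert(fn != NULL && repeats > 0);
    double best = -1.0;
    for (int r = 0; r < repeats; r++) {
        clock_t start = clock();
        fn(arg);
        double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
        if (best < 0.0 || elapsed < best) {
            best = elapsed;   /* keep the fastest run to reduce noise */
        }
    }
    return best;
}

/* Example workload (an assumption for illustration only):
 * sums the integers 0 .. n-1, where arg points to n. */
void sum_workload(void *arg)
{
    long n = *(long *)arg;
    volatile long total = 0;   /* volatile keeps the loop from being optimized away */
    for (long i = 0; i < n; i++) {
        total += i;
    }
}
```

A call such as `best_cpu_seconds(sum_workload, &n, 5)` with `long n = 10000000;` gives a figure that can be set against the MATLAB timing of the same task before any further hours go into optimization.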