How to speed up MEX function?
The following MEX code runs too slowly, but I don't know why or how to make it faster. Any help is greatly appreciated!
calculate_my_way.cpp
#include "mex.hpp"
#include "mexAdapter.hpp"
#include <cmath>
class MexFunction : public matlab::mex::Function {
public:
    void operator()(matlab::mex::ArgumentList outputs, matlab::mex::ArgumentList inputs) {
        matlab::data::TypedArray<double> var0 = inputs[0];
        matlab::data::TypedArray<double> var1 = inputs[1];
        matlab::data::TypedArray<double> var2 = inputs[2];
        matlab::data::TypedArray<double> var3 = inputs[3];
        auto var0Iter = var0.begin();
        auto var1Iter = var1.begin();
        auto var2Iter = var2.begin();
        auto var3Iter = var3.begin();
        const int numOfElements = var0.getNumberOfElements();
        double buffer = 0;
        for (int x = 0; x<numOfElements; x++)
        {
            buffer = std::sin(*var0Iter) + std::sin(*var1Iter) + std::sin(*var2Iter) + std::cos(*var3Iter);
            *var0Iter = buffer;
            buffer = std::sin(*var1Iter + *var2Iter) + std::cos(*var3Iter);
            *var1Iter = buffer;
            var0Iter++;
            var1Iter++;
            var2Iter++;
            var3Iter++;
        }
        outputs[0] = std::move(var0);
        outputs[1] = std::move(var1);
    }
};
It's just a simple calculation, but this code runs even slower than the native distance function, which performs a much more complicated calculation than a few sin and cos calls.
I'm using the compiler that came with Visual Studio 2017. Below is how I run mex, along with the compiler setup info.
mex -v calculate_my_way.cpp
...
Compiler location: C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\
...
OPTIMFLAGS : /O2 /Oy- /DNDEBUG
And this is how I measure the performance issue:
clear
size_test = 1e7;
var1 = zeros(size_test, 1);
var2 = zeros(size_test, 1);
var3 = zeros(size_test, 1);
var4 = zeros(size_test, 1);
cant_beat_me = @() distance(var1,var2,var3,var4);
elapsed_time = timeit(cant_beat_me);
mex_slow = @() calculate_my_way(var1,var2,var3,var4);
elapsed_time = timeit(mex_slow);
15 Comments
Rik
2 Nov 2022
Apart from the segfault if var1 is longer than the others, did you try with a random test set as well? The distance function may have some calls optimized away.
I might be able to try this code on my desktop later today.
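To illustrate the out-of-bounds concern raised above: the loop in the question takes its element count from the first input only, so a shorter second, third, or fourth input would be read past its end. A minimal sketch of a length check, in plain C++ with `std::vector` standing in for the MATLAB arrays. The helper name `same_length` is hypothetical and not part of the MEX API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a MEX API call): returns true only when all four
// inputs hold the same number of elements, so one loop counter is safe.
bool same_length(const std::vector<double>& a, const std::vector<double>& b,
                 const std::vector<double>& c, const std::vector<double>& d) {
    const std::size_t n = a.size();
    return b.size() == n && c.size() == n && d.size() == n;
}
```

In a real MEX function the same comparison would presumably be made on the element counts of the four inputs before the loop starts, reporting an error instead of looping when they differ.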
Walter Roberson
2 Nov 2022
buffer = std::sin(*var0Iter) + std::sin(*var1Iter) + std::sin(*var2Iter) + std::cos(*var3Iter);
*var0Iter = buffer;
buffer = std::sin(*var1Iter + *var2Iter) + std::cos(*var3Iter);
You calculate std::cos(*var3Iter) twice
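A minimal sketch of the point above, assuming the per-element arithmetic is pulled out into a small helper (the name `element_results` is hypothetical): the cosine is evaluated once and reused in both results. Whether this changes the measured time noticeably is not established in this thread.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Hypothetical per-element helper: returns {out0, out1} for one index.
// a, b, c, d correspond to the values of var0..var3 at that index.
std::pair<double, double> element_results(double a, double b, double c, double d) {
    const double cos_d = std::cos(d);  // evaluated once, reused twice below
    const double out0 = std::sin(a) + std::sin(b) + std::sin(c) + cos_d;
    const double out1 = std::sin(b + c) + cos_d;
    return {out0, out1};
}
```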
Yifan Lin
2 Nov 2022
Bruno Luong
2 Nov 2022
Edited: Bruno Luong
2 Nov 2022
"I'm guessing this is a compiler choice? Does matlab uses intel compiler that I don't have?"
I have the Intel compiler, so I can test.
But MATLAB can implement vectorized arithmetic with multithreading; you could do the same with OpenMP.
There are a few people here who work miracles with MEX programming, James Tursa and Jan Simon to name two, but I believe they are more C-oriented than C++-oriented.
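A rough sketch of the OpenMP idea, written in plain C++ with `std::vector` standing in for the MATLAB arrays (the function name `calculate_parallel` is hypothetical). Each iteration touches only its own index, so the loop can be split across threads. The pragma only takes effect when the compiler's OpenMP option is enabled (for example `/openmp` with MSVC or `-fopenmp` with GCC); otherwise it is ignored and the loop runs serially. How to pass that option through `mex` is not covered here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the MEX loop: inputs are read-only, results go
// to separate output vectors.
void calculate_parallel(const std::vector<double>& v0, const std::vector<double>& v1,
                        const std::vector<double>& v2, const std::vector<double>& v3,
                        std::vector<double>& out0, std::vector<double>& out1) {
    assert(v1.size() == v0.size() && v2.size() == v0.size() && v3.size() == v0.size());
    out0.resize(v0.size());
    out1.resize(v0.size());

    const double* p0 = v0.data();
    const double* p1 = v1.data();
    const double* p2 = v2.data();
    const double* p3 = v3.data();
    double* q0 = out0.data();
    double* q1 = out1.data();

    // A signed loop index is used because older OpenMP versions require it.
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(v0.size());

    // Iterations are independent, so they can be distributed over threads.
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        const double c3 = std::cos(p3[i]);
        q0[i] = std::sin(p0[i]) + std::sin(p1[i]) + std::sin(p2[i]) + c3;
        q1[i] = std::sin(p1[i] + p2[i]) + c3;
    }
}
```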
Walter Roberson
2 Nov 2022
Which distance function are you comparing to?
Yifan Lin
2 Nov 2022
Walter Roberson
2 Nov 2022
The Mapping Toolbox distance() function is not coded in mex. You can read the MATLAB source code for it. The code converts the angles to radians, and then uses its local function greatcircledist() to compute using the haversine formula, and then does something that I do not recognize at the moment involving atan2() -- at least for the default calculation. There is a different code path if you use some of the options.
Bruno Luong
2 Nov 2022
timeit results for your code with the VS compiler and the Intel oneAPI compiler (2022):
VS_elapsed_time % 0.1795
Intel_elapsed_time % 0.1781
Bruno Luong
2 Nov 2022
Edited: Bruno Luong
2 Nov 2022
Obviously, the run time of evaluating cos/sin depends on the data.
Comparison between MATLAB and C++ with zero data:
clear
size_test = 1e7;
var1 = zeros(size_test, 1);
var2 = zeros(size_test, 1);
var3 = zeros(size_test, 1);
var4 = zeros(size_test, 1);
cant_beat_me = @() distance(var1,var2,var3,var4);
mex_slow = @() calculate_my_way(var1,var2,var3,var4);
MATLAB_elapsed_time = timeit(cant_beat_me) % 0.0274
Intel_elapsed_time = timeit(mex_slow) % 0.1803
function [out0,out1] = distance(var0, var1, var2, var3)
out0 = sin(var0) + sin(var1) + sin(var2) + cos(var3);
out1 = sin(var1 + var2) + cos(var3);
end
With random data:
clear
size_test = 1e7;
var1 = 2*pi*rand(size_test, 1);
var2 = 2*pi*rand(size_test, 1);
var3 = 2*pi*rand(size_test, 1);
var4 = 2*pi*rand(size_test, 1);
cant_beat_me = @() distance(var1,var2,var3,var4);
mex_slow = @() calculate_my_way(var1,var2,var3,var4);
MATLAB_elapsed_time = timeit(cant_beat_me) % 0.1560
Intel_elapsed_time = timeit(mex_slow) % 0.5101
The factor of
>> 0.5101/0.156
ans =
3.2699
could well be explained by multithreading.
Yifan Lin
2 Nov 2022
Bruno Luong
2 Nov 2022
Or stay with MATLAB?
Yifan Lin
2 Nov 2022
Yifan Lin
2 Nov 2022
Bruno Luong
3 Nov 2022
Out of curiosity, I coded the same calculation in C. The time is 0.24 s: twice as fast as C++ (0.5 s) but about 60% slower than MATLAB (0.147 s).
/* mex -g -R2018a calculate_C_way.c */
#include "mex.h"
#include <math.h>
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mwSize i, n;  /* mwSize matches the return type of mxGetNumberOfElements */
    double *var0Iter, *var1Iter, *var2Iter, *var3Iter, *out0Iter, *out1Iter;
    n = mxGetNumberOfElements(prhs[0]);
    /* Separate output arrays, so the inputs are never written to */
    plhs[0] = mxCreateNumericMatrix(1, n, mxDOUBLE_CLASS, mxREAL);
    plhs[1] = mxCreateNumericMatrix(1, n, mxDOUBLE_CLASS, mxREAL);
    var0Iter = mxGetDoubles(prhs[0]);
    var1Iter = mxGetDoubles(prhs[1]);
    var2Iter = mxGetDoubles(prhs[2]);
    var3Iter = mxGetDoubles(prhs[3]);
    out0Iter = mxGetDoubles(plhs[0]);
    out1Iter = mxGetDoubles(plhs[1]);
    for (i = 0; i < n; i++) {
        *out0Iter = sin(*var0Iter) + sin(*var1Iter) + sin(*var2Iter) + cos(*var3Iter);
        *out1Iter = sin(*var1Iter + *var2Iter) + cos(*var3Iter);
        out0Iter++;
        out1Iter++;
        var0Iter++;
        var1Iter++;
        var2Iter++;
        var3Iter++;
    }
}
Yifan Lin
3 Nov 2022
Accepted Answer
More Answers (1)
Bruno Luong
2 Nov 2022
Edited: Bruno Luong
2 Nov 2022
I don't know C++ well, but I have quite a lot of practice with C MEX.
It looks like these statements just move a bunch of data:
outputs[0] = std::move(var0);
outputs[1] = std::move(var1);
Also, I wonder whether the assignments
*var0Iter = buffer;
...
*var1Iter = buffer;
change your inputs 0 and 1 after calling the MEX function, which is NOT allowed.
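One way to rule out any change to the inputs, sketched in plain C++ with `std::vector` standing in for the MATLAB arrays (the function name `compute_outputs` is hypothetical): take the inputs by const reference and write the results into separate output buffers. This only illustrates the read-only-input pattern; it makes no claim about how the C++ MEX API handles the copies in the question's code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the MEX body. The const references make it a
// compile-time error to write to any input; results go to out0 and out1.
void compute_outputs(const std::vector<double>& in0, const std::vector<double>& in1,
                     const std::vector<double>& in2, const std::vector<double>& in3,
                     std::vector<double>& out0, std::vector<double>& out1) {
    const std::size_t n = in0.size();
    assert(in1.size() == n && in2.size() == n && in3.size() == n);
    out0.resize(n);
    out1.resize(n);
    for (std::size_t i = 0; i < n; ++i) {
        out0[i] = std::sin(in0[i]) + std::sin(in1[i]) + std::sin(in2[i]) + std::cos(in3[i]);
        out1[i] = std::sin(in1[i] + in2[i]) + std::cos(in3[i]);
    }
}
```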
2 Comments
Yifan Lin
2 Nov 2022
Bruno Luong
2 Nov 2022
" Another one of your answer here helped me tremendously a few years back! thank you! "
Oh... really glad to read that...