MEX api and interleaved complex arrays

5 ビュー (過去 30 日間)
Jan
Jan 2020 年 10 月 24 日
編集済み: Bruno Luong 2020 年 10 月 25 日
Does MATLAB store complex data in interleaved format since R2018a?
If you compile a MEX function with the -R2018a flag, you can process complex arrays in interleaved format. But what happens, if a MEX function is compiled with the -R2017b flag and called from a modern Matlab version? Does accessing mxGetPr / mxGetPi cause a deep data copy to create the two arrays in separate memory blocks, and another copy to re-create an interleaved array for the output?
This would mean, that MEX fumnctions processing complex data are very inefficient, when they use the -R2017b flag. Is this correct?

採用された回答

Bruno Luong
Bruno Luong 2020 年 10 月 24 日
編集済み: Bruno Luong 2020 年 10 月 24 日
Where have you been Jan? In a cave? ;-)
Sorry it couldn't resist.
As I understand, when you call mxGetPr and mxGetPi, the data is copied to ensure the compatibility. Yes it will not optimal, but it's not very inefficient either.
Yes, to avoid the data copy you have to compile MCC using -R2018a flag then use function such as mxGetDoubles, etc rather than mxGetPr
You can always use the predefined symbol MX_HAS_INTERLEAVED_COMPLEX to know which option you compile
#if MX_HAS_INTERLEAVED_COMPLEX
a = mxGetDoubles(A); % interleaved complex
/* ... */
#else
ar = mxGetPr(A); % real part
ai = mxGetPi(A); % iaginary part
/* ... */
#endif
  3 件のコメント
James Tursa
James Tursa 2020 年 10 月 25 日
Caveat: I am unable to confirm this until I get on a PC with MATLAB in a couple of days.
I believe that in R2018a and later there are no R2017b mxArray structures supported at all. Everything in the background is strictly interleaved complex. The mxGetPr and friends work with deep copies of the data blocks off to the side, NOT with R2017b style mxArray structures off to the side. Every mxArray that gets created and passed around is strictly interleaved complex. WHEN the data copies happen I will have to investigate later when I have MATLAB access.
Bruno Luong
Bruno Luong 2020 年 10 月 25 日
編集済み: Bruno Luong 2020 年 10 月 25 日
"When does the data copy happen? Is it performed implicitely for all inputs, even for deeply nested structs, or dynamically on demand when an array is accessed by mxGetPi/Pr? What about mxGetData()?"
It copies the first time user invokes mxGetPr (first is for the same mxArray). On my laptop it takes 20-30 ms for 1e7-element array.
I guess same thing happens when mxGetPi, mxGetData is called first.
The test script and c-code (windows specific) is bellow.
I guess there must be another copy to group data in interleaved storage when mex returns. It seems it triggered also when the first mxSetxxx is invoked.
The conclusion is better to rework your old mex files with the new interleaved storage to get out the last ounce of performance.
File testInterleavedGetPr.c
/*
* testInterleavedGetPr.c
*
* benchmark mxSetPr
* Compile:
* mex testInterleavedGetPr.c
* mex -R2018a testInterleavedGetPr.c
*/
#include <Windows.h>
#include <stdio.h>
#include "mex.h"
typedef unsigned int UInt32;
double PCFreq;
LONGLONG StartClock;
// Clock related functions----------------------------------------------------------------------------
void InitClock(void)
{
LARGE_INTEGER li;
if (!QueryPerformanceFrequency(&li))
mexErrMsgTxt("QueryPerformanceFrequency failed!\n");
/* Cycles by ms */
PCFreq = (double)(li.QuadPart) / 1000.0;
}
void tic(void)
{
LARGE_INTEGER li;
QueryPerformanceCounter(&li);
StartClock = li.QuadPart;
}
/* Returns elapsetime in ms since last call of tic () */
UInt32 toc(void) /* UInt32 is sufficient for 50 days */
{
LARGE_INTEGER li;
double dres;
QueryPerformanceCounter(&li);
dres = (double)(li.QuadPart-StartClock) / PCFreq;
return (UInt32)dres;
}
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
UInt32 t, i;
double *p;
InitClock();
for (i=0; i<3; i++)
{
#if MX_HAS_INTERLEAVED_COMPLEX
tic();
p = mxGetDoubles(prhs[0]);
t = toc();
mexPrintf("\tmxGetDoubles time = %i\n", t);
#else
tic();
p = mxGetPr(prhs[0]);
t = toc();
mexPrintf("\tmxGetPr time = %i\n", t);
tic();
p = mxGetPi(prhs[0]);
t = toc();
mexPrintf("\tmxGetPi time = %i\n", t);
tic();
p = mxGetData(prhs[0]);
t = toc();
mexPrintf("\tmxGetData time = %i\n", t);
#endif
}
}
File testInterleavedSetPr.c
/*
* testInterleavedSetPr.c
*
* benchmark mxSetPr
* Compile:
* mex testInterleavedSetPr.c
* mex -R2018a testInterleavedSetPr.c
*/
#include <Windows.h>
#include <stdio.h>
#include "mex.h"
typedef unsigned int UInt32;
double PCFreq;
LONGLONG StartClock;
// Clock related functions----------------------------------------------------------------------------
void InitClock(void)
{
LARGE_INTEGER li;
if (!QueryPerformanceFrequency(&li))
mexErrMsgTxt("QueryPerformanceFrequency failed!\n");
/* Cycles by ms */
PCFreq = (double)(li.QuadPart) / 1000.0;
}
void tic(void)
{
LARGE_INTEGER li;
QueryPerformanceCounter(&li);
StartClock = li.QuadPart;
}
/* Returns elapsetime in ms since last call of tic () */
UInt32 toc(void) /* UInt32 is sufficient for 50 days */
{
LARGE_INTEGER li;
double dres;
QueryPerformanceCounter(&li);
dres = (double)(li.QuadPart-StartClock) / PCFreq;
return (UInt32)dres;
}
#define N 10000000
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
UInt32 t, i;
double *p, *pr, *pi;
InitClock();
p = mxMalloc(sizeof(double)*N);
plhs[0] = mxCreateDoubleMatrix(1, N, mxCOMPLEX);
for (i=0; i<3; i++)
{
#if MX_HAS_INTERLEAVED_COMPLEX
p = mxMalloc(2*sizeof(double)*N);
tic();
mxSetDoubles(plhs[0], p);
t = toc();
mexPrintf("\tmxSetDoubles time = %i\n", t);
#else
pr = mxMalloc(sizeof(double)*N);
pi = mxMalloc(sizeof(double)*N);
tic();
mxSetPr(plhs[0], pr);
t = toc();
mexPrintf("\tmxSetPr time = %i\n", t);
tic();
mxSetPi(plhs[0], pi);
t = toc();
mexPrintf("\tmxSetPi time = %i\n", t);
tic();
#endif
}
}
MATLAB scripts
% script testInterleavedCopy.m
n=1e7;
z = rand(1,n)+1i*rand(1,n);
x = rand(1,n);
mex testInterleavedGetPr.c
fprintf('Get for complex array (<R2017b)\n');
tic
testInterleavedGetPr(z);
toc
fprintf('Get for non-complex array (<R2017b)\n');
tic
testInterleavedGetPr(x);
toc
mex -R2018a testInterleavedGetPr.c
fprintf('Get complex array (>=R2018a)\n');
tic
testInterleavedGetPr(z);
toc
fprintf('Get for non-complex array (>=R2018a)\n');
tic
testInterleavedGetPr(x);
toc
mex testInterleavedSetPr.c
fprintf('Set for complex array (<R2017b)\n');
tic
z = testInterleavedSetPr();
toc
mex -R2018a testInterleavedSetPr.c
fprintf('Set for complex array (>=R2018a)\n');
tic
z = testInterleavedSetPr();
toc
Result
>> testInterleavedCopy
Building with 'Microsoft Visual C++ 2017 (C)'.
MEX completed successfully.
Get for complex array (<R2017b)
mxGetPr time = 43
mxGetPi time = 0
mxGetData time = 0
mxGetPr time = 0
mxGetPi time = 0
mxGetData time = 0
mxGetPr time = 0
mxGetPi time = 0
mxGetData time = 0
Elapsed time is 0.045455 seconds.
Get for non-complex array (<R2017b)
mxGetPr time = 0
mxGetPi time = 0
mxGetData time = 0
mxGetPr time = 0
mxGetPi time = 0
mxGetData time = 0
mxGetPr time = 0
mxGetPi time = 0
mxGetData time = 0
Elapsed time is 0.000221 seconds.
Building with 'Microsoft Visual C++ 2017 (C)'.
MEX completed successfully.
Get complex array (>=R2018a)
mxGetDoubles time = 0
mxGetDoubles time = 0
mxGetDoubles time = 0
Elapsed time is 0.001904 seconds.
Get for non-complex array (>=R2018a)
mxGetDoubles time = 0
mxGetDoubles time = 0
mxGetDoubles time = 0
Elapsed time is 0.000151 seconds.
Building with 'Microsoft Visual C++ 2017 (C)'.
MEX completed successfully.
Set for complex array (<R2017b)
mxSetPr time = 56
mxSetPi time = 0
mxSetPr time = 0
mxSetPi time = 0
mxSetPr time = 0
mxSetPi time = 0
Elapsed time is 0.110597 seconds.
Building with 'Microsoft Visual C++ 2017 (C)'.
MEX completed successfully.
Set for complex array (>=R2018a)
mxSetDoubles time = 0
mxSetDoubles time = 0
mxSetDoubles time = 0
Elapsed time is 0.022854 seconds.
>>

サインインしてコメントする。

その他の回答 (0 件)

カテゴリ

Help Center および File ExchangeWrite C Functions Callable from MATLAB (MEX Files) についてさらに検索

製品

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by