mex crash with fftw calls

Question

kai 2012 年 3 月 27 日

1
リンク

この質問への直接リンク

https://jp.mathworks.com/matlabcentral/answers/33616-mex-crash-with-fftw-calls

コメント済み: PAW 2019 年 9 月 9 日

I'm trying to compile fftw library functions into a mex C++ file. I managed to compile the file successfully. "mex -O filename.cpp -lm -lfftw3 -output test"

But when I run from matlab cmd window, matlab will crash with segmentation fault.

Anyone has any suggestions, thanks

the code is very simple as follows:

#include "mex.h"
#include "stdlib.h"
#include "stdio.h"
#include "fftw3.h"
/* $Revision: 1.8.6.3 $ */
void timestwo(double y[], double x[])
{
  y[0] = 2.0*x[0];
}
void test_fftw()
{
    int N=8;
    double *in, *in2;
    fftw_complex *out;
    fftw_plan p, p_rev;
    in = (double*)fftw_malloc(sizeof(double)*N);
    in2 =(double*) fftw_malloc(sizeof(double) * N);
    out = (fftw_complex*)fftw_malloc(sizeof(fftw_complex) * N);
    fftw_free(in);
    fftw_free(in2);
    fftw_free(out);
}
void mexFunction( int nlhs, mxArray *plhs[],
                  int nrhs, const mxArray *prhs[] )
{
  double *x,*y;
  mwSize mrows,ncols;
    /* Check for proper number of arguments. */
    if(nrhs!=1) {
      mexErrMsgTxt("One input required.");
    } else if(nlhs>1) {
      mexErrMsgTxt("Too many output arguments.");
    }
    /* The input must be a noncomplex scalar double.*/
    mrows = mxGetM(prhs[0]);
    ncols = mxGetN(prhs[0]);
    if( !mxIsDouble(prhs[0]) || mxIsComplex(prhs[0]) ||
        !(mrows==1 && ncols==1) ) {
      mexErrMsgTxt("Input must be a noncomplex scalar double.");
    }
    /* Create matrix for the return argument. */
    plhs[0] = mxCreateDoubleMatrix(mrows,ncols, mxREAL);
    /* Assign pointers to each input and output. */
    x = mxGetPr(prhs[0]);
    y = mxGetPr(plhs[0]);
    /* Call the timestwo subroutine. */
    printf("fjiejfiej\n");
    test_fftw();
    timestwo(y,x);
}

3 件のコメント
1 件の古いコメントを表示1 件の古いコメントを非表示

PAW 2019 年 8 月 9 日

MATLAB Online で開く

Thanks a lot for the fast answer...

Unfortunately, my bug seems to be different, I was too fast in reading the post. On my side : I compile a file on Linux64, run it 1 million times without an issue and with the right output see the code below). Now, if I change the code and recompile it, then automatically, I get a seg fault and need to restart Matlab. I wonder whether this issue is related to Matlab having a similar fftw3...

I don't think the is a memory leak since the code is quite simple.

I have compiled fftw3 with

./configure --enable-threads --enable-openmp --enable-float CFLAGS="-fopenmp -fPIC"
make
sudo make install

Here is a Matlab code to launch the c code and compile it:

mex '-L/usr/local/lib' -lfftw3_omp -lfftw3 -lm -I/usr/local/include/ Evaluate_prod_conv_elementary.cpp CXXFLAGS="\$CXXFLAGS -fopenmp" LDFLAGS="\$LDFLAGS -fopenmp"
% REAL Test
sh1=3;sh2=8;
sb1=2;sb2=5;
Kh=[[-floor(sh1/2),ceil(sh1/2)-1];[-floor(sh2/2),ceil(sh2/2)-1]];
Kb=[[-floor(sb1/2),ceil(sb1/2)-1];[-floor(sb2/2),ceil(sb2/2)-1]];
h=rand(sh1,sh2);
b=rand(sb1,sb2);
[c,Kc] = Evaluate_prod_conv_elementary(h,Kh,b,Kb);
ct=conv2(h,b,'full');
disp(norm(c-ct)/norm(ct));

Here is my c++ code (sorry for the imperfections)

// COMPILE WITH
// mex '-L/usr/local/lib' -lfftw3_omp -lfftw3 -lm Evaluate_prod_conv_elementary.cpp CXXFLAGS="\$CXXFLAGS -fopenmp" LDFLAGS="\$LDFLAGS -fopenmp"
#include "mex.h"
#include <math.h>
#include <iostream>
#include <iomanip>
//#include <stdlib.h>
#include "fftw3.h"
#include <omp.h>
using namespace std;
// Modulo operator
int mod(int a, int b)
{
    int r = a % b;
    return r < 0 ? r + b : r;
}
// Displays a real array (row major) (VALIDATED)
void disp_array(double *u,int n0,int n1){
    cout<<"Size: " << n0 << " -- " << n1<<"\n";
    for (int i=0;i<n0;++i){
        for (int j=0;j<n1;++j){
            cout<<fixed<<setprecision(2) << u[j+i*n1] << " " ;
        }
        cout << "\n";
    }
    cout << "\n\n";
}
// Displays a real array (VALIDATED)
void disp_array2(double *u,int n){
    for (int i=0;i<n;++i){
        printf("%1.4f     ",u[i]);
    }
    printf("\n\n");
}
/**
 * Given two matrices of the form
 *
 *      |l1(K),L1(K)|
 *  K = |  .  ,  .  |
 *      |  .  ,  .  |
 *      |ld(K),Ld(K)|
 *
 * that define a d-dimensional domain [l1,L1]x[l2,L2]x...x[ld,Ld], this
 * function returns support Kout of the convolution of functions supported
 * of K1 and K2:
 *
 *         |l1(K1)+l1(K2),L1(K1)+L1(K2)|
 *  Kc  =  |      .      ,      .      |
 *         |      .      ,      .      |
 *         |ld(K1)+ld(K2),Ld(K1)+Ld(K2)|
 **/
void add_domain(int **K1, int **K2,int d, int **Kc){
    for (int i=0; i<d; i++){
        Kc[i][0] = K1[i][0]+K2[i][0];
        Kc[i][1] = K1[i][1]+K2[i][1];
    }
}
/**
 * Compute the sum of the elements of an array.
 **/
double sum_array(double *a, int n){
    double s = 0;
    for (int i=0; i<n; i++){
        s += a[i];
    }
    return s;
}
/**
 * Compute the product of all the element within an array.
 **/
double prod_array(double *a, int n){
    double s = 1;
    for (int i=0; i<n; i++){
        s *= a[i];
    }
    return s;
}
/**
 * Given a matrix a with d dimensions, d=2 or d=3, and a matrix Ka, the
 * support of a, where
 *      |l1(K),L1(K)|
 *  K = |  .  ,  .  |
 *      |  .  ,  .  |
 *      |ld(K),Ld(K)|
 *
 * defines a d-dimensional domain [l1,L1]x[l2,L2]x...x[ld,Ld].
 *
 * The function returns b, that is an extension of a on the domain spaned
 * by Kb filled with 0 values.
 **/
void zero_padding(double *a, int Ka[][2], double *b, int Kb[][2], int d){
    int inda, indb;
    for (int i=Kb[0][0]; i<=Kb[0][1]; i++){
        for (int j=Kb[1][0]; j<=Kb[1][1]; j++){
            if (d==3){
                for (int k=Kb[2][0]; k<=Kb[2][1]; k++){
                    if (i<=Ka[0][1] && j<=Ka[1][1] && k<=Ka[2][1] && i>=Ka[0][0] && j>=Ka[1][0] && k>=Ka[2][0]){
                        b[((k-Kb[2][0])+(Kb[2][1]-Kb[2][0]+1)*((j-Kb[1][0])+(Kb[1][1]-Kb[1][0]+1)*(i-Kb[0][0])))]= a[((k-Ka[2][0])+(Ka[2][1]-Ka[2][0]+1)*((j-Ka[1][0])+(Ka[1][1]-Ka[1][0]+1)*(i-Ka[0][0])))];
                    } else{
                        b[((k-Kb[2][0])+(Kb[2][1]-Kb[2][0]+1)*((j-Kb[1][0])+(Kb[1][1]-Kb[1][0]+1)*(i-Kb[0][0])))] = 0;
                    }
                }
            }else{
                inda=(j-Ka[1][0])+(Ka[1][1]-Ka[1][0]+1)*(i-Ka[0][0]);
                indb=(j-Kb[1][0])+(Kb[1][1]-Kb[1][0]+1)*(i-Kb[0][0]);
                if (i<=Ka[0][1] && j<=Ka[1][1] && i>=Ka[0][0] && j>=Ka[1][0]){                    
                    b[indb]= a[inda];
                } else{
                    b[indb] = 0;
                }
            }
        }
    }
}
/**
 * Convert array in from matlab (column major), to C array (row major). (VALIDATED)
 **/
void matlab_to_array(double *in, double *out, int *size, int d){
    if (d==3){
        for (int k=0; k<size[2]; k++){
            for (int j=0; j<size[1]; j++){
                for (int i=0; i<size[0]; i++){
                    out[(int) (k+size[2]*(j+size[1]*i))]= in[(int) (i+size[0]*(j+size[1]*k))];
                }
            }
        }
    }else{
        for (int j=0; j<size[1]; j++){
            for (int i=0; i<size[0]; i++){
                out[j+size[1]*i]= in[i+size[0]*j];
            }
        }
    }
}
/**
 * Convert array in from c array (rowmajor), to matlab array (column major).
 **/
void array_to_matlab(double *in, double *out, int *size, int d){
    if (d==3){
        for (int k=0; k<size[2]; k++){
            for (int j=0; j<size[1]; j++){
                for (int i=0; i<size[0]; i++){
                    out[i+size[0]*(j+size[1]*k)]= in[k+size[2]*(j+size[1]*i)];
                }
            }
        }
    }else{
        for (int j=0; j<size[1]; j++){
            for (int i=0; i<size[0]; i++){
                out[i+size[0]*j]= in[j+size[1]*i];
            }
        }
    }
}
/**
 * Compute the element wise product between two fftw_complex arrays.
 **/
void product_carray(fftw_complex *u1,fftw_complex *u2,fftw_complex *out,int n){
#pragma omp parallel for
    for (int i=0;i<n;++i){
        out[i][0]=u1[i][0]*u2[i][0]-u1[i][1]*u2[i][1];
        out[i][1]=u1[i][1]*u2[i][0]+u1[i][0]*u2[i][1];
    }
}
/**
 * Divide each element of an array by its length n.
 **/
void normalize(double *u,int n){
#pragma omp parallel for
    for (int i=0;i<n;++i){
        u[i]=u[i]/n;
    }
}
/**
 * circular shift of 2D array.
 **/
void circshift(double *in, double *out, int *size, int shift0, int shift1){
    for (int i=0;i<size[0];i++){
        for (int j=0;j<size[1];j++){
            int i_mod = mod((i+shift0),size[0]);
            int j_mod = mod((j+shift1),size[1]);
            
            //std::cout<< i_mod <<" " << j_mod<<"\n";
            out[j_mod+i_mod*size[1]] = in[j+i*size[1]];
        }
    }
}
/**
 * Compute the convolution between two real value array.
 **/
void convolution2D(double *a, double *b, double *c, int *size){
    fftw_complex *fa,*fb,*fc;
    fftw_plan plan_a, plan_b, plan_c;
    
    //fftw_init_threads();
    //fftw_plan_with_nthreads(omp_get_max_threads());
    
    int size0 = size[0];
    int size1 = size[1];
    
    fa = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*size0*size1);
    plan_a = fftw_plan_dft_r2c_2d(size0,size1,a,fa, FFTW_ESTIMATE);
    fftw_execute(plan_a);
    
    fb = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*size0*size1);
    plan_b = fftw_plan_dft_r2c_2d(size0,size1,b,fb, FFTW_ESTIMATE);
    fftw_execute(plan_b);
    
    fc = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*size0*size1);
    product_carray(fa,fb,fc,size0*size1);
    
    plan_c = fftw_plan_dft_c2r_2d(size0,size1,fc,c, FFTW_ESTIMATE);
    fftw_execute(plan_c);
    normalize(c,size0*size1);
    fftw_destroy_plan(plan_a);
    fftw_destroy_plan(plan_b);
    fftw_destroy_plan(plan_c);
    
    fftw_free(fa);
    fftw_free(fb);
    fftw_free(fc);   
}
// Entry point for Matlab
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]){
    // Ouput : c=conv(h,b), Kc, nb_elements in Kc
    // Input : h, Kh, b, Kb
    
    // Check for proper input
    switch(nrhs) {
        case 4 :
            break;
        default: mexErrMsgTxt("Bad number of inputs.\n");
        break;
    }
    if (nlhs > 3) {mexErrMsgTxt("Too many outputs.\n");}
    
    double *h_matlab, *b_matlab; // input in column major
    double *Khm, *Kbm;
    int d;
    
    h_matlab=mxGetPr(prhs[0]);
    Khm=mxGetPr(prhs[1]);
    b_matlab=mxGetPr(prhs[2]);
    Kbm=mxGetPr(prhs[3]);
    d = (int) mxGetDimensions(prhs[1])[0];
    if (d > 2) {mexErrMsgTxt("Only 2D images for now.\n");}
    
    // Here we cast the Matlab domain to C-domain, add the two domains and strore the result in Kc
    int Kh[d][2], Kb[d][2], Kc[d][2];
    for (int i=0;i<d;++i){
        Kh[i][0]=Khm[i];Kh[i][1]=Khm[i+d];
        Kb[i][0]=Kbm[i];Kb[i][1]=Kbm[i+d];
        Kc[i][0]=Kb[i][0]+Kh[i][0];Kc[i][1]=Kb[i][1]+Kh[i][1];
    }
    
    // Get size
    int size_h[d];
    int size_b[d];
    int size_c[d];
    double len_h, len_b;
    len_h=(int) mxGetNumberOfElements(prhs[0]);
    len_b=(int) mxGetNumberOfElements(prhs[2]);
    int len_c=1;
    for (int i=0; i<d; i++){
        size_h[i] = (int) mxGetDimensions(prhs[0])[i];
        size_b[i] = (int) mxGetDimensions(prhs[2])[i];
        size_c[i] = (Kc[i][1]-Kc[i][0]+1);
        len_c*=size_c[i];
    }    
        
    // Outputs
    plhs[0] = mxCreateDoubleMatrix(size_c[0],size_c[1],mxREAL);
    plhs[1] = mxCreateDoubleMatrix(d,2,mxREAL);
    
    // Convert into row-major
    double *h = new double[(int)len_h];
    double *b = new double[(int)len_b];
    matlab_to_array(h_matlab,h,size_h,d);
    matlab_to_array(b_matlab,b,size_b,d);
    
    // Create output arguments
    double *c = new double[(int)len_c];
    double *Kcm=mxGetPr(plhs[1]);
    double *c_matlab=mxGetPr(plhs[0]);
    
    // Add domains and compute resulting size
    for (int i=0;i<d;++i){
        Kcm[i]=Kc[i][0];Kcm[i+d]=Kc[i][1];
    }
    
    double *h_ext = new double[len_c];
    double *b_ext = new double[len_c];
    double *cs = new double[(int)len_c];
    zero_padding(h,Kh,h_ext,Kc,d);
    zero_padding(b,Kb,b_ext,Kc,d);
    convolution2D(h_ext,b_ext,c,size_c);
    circshift(c, cs, size_c, Kc[0][0],Kc[1][0]);
    array_to_matlab(cs,c_matlab,size_c,d);
    
    // Freeing arrays
    free(h_ext);
    free(b_ext);
    free(h);
    free(b);
    free(c);
    free(cs);
}

Bruno Luong 2019 年 8 月 9 日

編集済み: Bruno Luong 2019 年 8 月 10 日

Edit See my answer bellow

サインインしてコメントする。

サインインしてこの質問に回答する。

Answer 1

Kaustubha Govind 2012 年 4 月 2 日

0
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/33616-mex-crash-with-fftw-calls#answer_43004

I don't have any experience with this, but my only guess is that your code is somehow not playing well with the FFTW libraries which MATLAB itself loads. I think MATLAB ships with its own FFTW libraries in $matlabroot/bin/$arch named libmwfftw3 - perhaps you should link into this library instead of libfftw3?

1 件のコメント
-1 件の古いコメントを表示-1 件の古いコメントを非表示

PAW 2019 年 9 月 9 日

Dear Kaustubha Govind,

you were right. When we use conv2d in the Matlab code, then we got crash after the second compilation. If we don't, then everyting seems to work. So yes, it seems that there is a conflict between the different libraries...

This is a serious problem with Matlab that I already experienced with CGAL (conflict with voronoin...). At the end I had to use Python because Matlab was 2 orders of magnitude slowlier...

Best,

Pierre

サインインしてコメントする。

Answer 2

PAW 2019 年 8 月 9 日

0
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/33616-mex-crash-with-fftw-calls#answer_386985

I have the same problem. Any idea 7 years later?

Best regards,

Pierre

6 件のコメント
4 件の古いコメントを表示4 件の古いコメントを非表示

Steven Lord 2019 年 8 月 9 日

Bruno, have you explored using MATLAB Coder to generate code for your functions that use fft?

The "C/C++ Code Generation" item in the Extended Capabilities section of the fft documentation page states "For MEX output, MATLAB® Coder™ uses the library that MATLAB uses for FFT algorithms. For standalone C/C++ code, by default, the code generator produces code for FFT algorithms instead of producing FFT library calls. To generate calls to a specific installed FFTW library, provide an FFT library callback class. For more information about an FFT library callback class, see coder.fftw.StandaloneFFTW3Interface."

Bruno Luong 2019 年 8 月 9 日

編集済み: Bruno Luong 2019 年 8 月 9 日

I did tried when the C-coder when the product just came out. My assertion is that the code generated is not good enough for us, because some time we need code quite compact and fast and we prefer to directly code in in C.

Also for safety certification, it is easier to work with a code developed by hand as opposed to code that is generated by a tool.

The FFTW is only the tip of the iceberg.

サインインしてコメントする。