# Complex matrix inversion using LAPACK in MEX

27 views (last 30 days)
Henry on 28 Jan 2011
Commented: Jan on 7 Jun 2019
Hi, I am trying to speed up my matrix inversion by writing a routine in Mex that calls the zgetri function in LAPACK library directly. The routine seems to produce the right answer and returns the answer successfully to the main program in Mex, but it crashes in the end. I am pretty sure there is some memory management bug in my code. Hopefully someone can shed some light to this annoying bug that I could not find. Thanks in advance!
Here is my matrix inversion routine:
extern void invertComplexMatrix(double *MatR, double *MatI, double *invMatR, double *invMatI, int n)
{
int i,*ipiv,lwork,info;
double *wk;
double *zin,*temp,*outR,*outI;
mxArray *in, *out;
lwork = 10*n;
ipiv = (int *)malloc(2*n*sizeof(int) );
wk = (double *)malloc(2*lwork*sizeof(double));
/* Convert input to Fortran format */
in = mxCreateDoubleMatrix(n, n, mxCOMPLEX);
out = mxCreateDoubleMatrix(n,n,mxCOMPLEX);
memcpy(mxGetPr(in), MatR, n * n * sizeof(double));
memcpy(mxGetPi(in), MatI, n * n * sizeof(double));
zin = mat2fort(in, n, n);
temp = zin;
/* Check ill-condition */
zgetrf(&n, &n, zin, &n, ipiv, &info);
/*Invert the complex matrix*/
zgetri(&n, temp, &n, ipiv, wk, &lwork, &info);
out = fort2mat(temp, n, n, n);
outR = mxGetPr(out);
outI = mxGetPi(out);
/* Copy the real and imag parts of the output to output pointers*/
memcpy(invMatR, outR, n * n * sizeof(double));
memcpy(invMatI, outI, n * n * sizeof(double));
/* Check to see if producing correct result*/
mexPrintf("outR[0], outI[0] are: %f,%f\n",outR[200],outI[200]);
/* Free intermediate Fortran format arrays */
mxFree(zin);
mxFree(temp);
mxFree(outR);
mxFree(outI);
mxDestroyArray(in);
};
/*------------------------------------*/
/*Main program:*/
void mexFunction (int nout, mxArray *pout[], int nin, const mxArray *pin[])
{
/*Declaration up top*/
int n = 200;
double *MatR, *MatI, *inMatR, invMatI;
mxArray *Mat, *InvMat;
/*Allocate input matrix*/
Mat = mxCreateDoubleMatrix(n,n,mxCOMPLEX);
MatR = mxGetPr(Mat);
MatI = mxGetPi(Mat);
/*Assign value to MatR, MatI*/
blah blah
/*Allocate output matrix*/
InvMat = mxCreateDoubleMatrix(n,n,mxCOMPLEX);
InvMatR = mxGetPr(InvMat);
InvMatI = mxGetPi(InvMat);
/*Call the matrix inversion routine*/
invertComplexMatrix(MatR,MatI,InvMatR, InvMatI,n);
/*Check to see if producing correct result, should be the same as the one printed inside the routine*/
mexPrintf("outR[0], outI[0] are: %f,%f\n",InvMatR[200],InvMatI[200]);
/*Manipulate the output Matrix, i.e., InvMatR, InvMatI */
blah blah
/*Free the memory*/
mxDestroyArray(Mat);
mxDestroyArray(InvMat);
mexPrintf("check to see if reaching the end of program");
}
So everything went fine, and the last message ("check to see if reaching the end of program") got printed out okay, but then the program crashed.
I often write code in Matlab but this Mex thing is a bit new to me. Any help is very much appreciated.
Thanks,
Henry.

Jan on 28 Jan 2011
Dear Henry, There are very strict rule of how memory is allocated and freed in C. E.g. this will kill the application:
double *outR;
outR = mxGetPr(out);
mxFree(outR);
The rules are simple: just apply mxFree, if an array was obtained by mxMalloc (of mxCalloc or mxRealloc). If you really need to call malloc instead of mxMalloc, apply free to release the memory. And if you copy a pointer to an array, never free both as done here:
temp = zin;
...
mxFree(zin);
mxFree(temp);
Good luck, Jan

Henry on 28 Jan 2011
Dear Jan,
Thanks so much for the tips. Yes, it solved the crash problem that I encountered. However, as I tried a more complicated case where the matrix inversion routine was called in a loop of 1000 iterations, the program crashed after a few hundred iterations. Maybe you could have a good insight of what the problem was. Here is what I did (with your suggestions taken into account):
extern void invertComplexMatrix(double *MatR, double *MatI, double *invMatR, double *invMatI, int n)
{
int i,*ipiv,lwork,info;
double *wk;
double *zin,*temp,*outR,*outI;
mxArray *in, *out;
lwork = 10*n;
ipiv = (int *)mxMalloc(2*n*sizeof(int) );
wk = (double *)mxMalloc(2*lwork*sizeof(double));
/* Convert input to Fortran format */
in = mxCreateDoubleMatrix(n, n, mxCOMPLEX);
out = mxCreateDoubleMatrix(n, n, mxCOMPLEX);
memcpy(mxGetPr(in), MatR, n * n * sizeof(double));
memcpy(mxGetPi(in), MatI, n * n * sizeof(double));
zin = mat2fort(in, n, n);
temp = zin;
/* Check ill-condition */
zgetrf(&n, &n, zin, &n, ipiv, &info);
/*Invert the complex matrix*/
zgetri(&n, temp, &n, ipiv, wk, &lwork, &info);
out = fort2mat(temp, n, n, n);
outR = mxGetPr(out);
outI = mxGetPi(out);
/* Copy the real and imag parts of the output to output pointers*/
memcpy(invMatR, outR, n * n * sizeof(double));
memcpy(invMatI, outI, n * n * sizeof(double));
/* Check to see if producing correct result*/
mexPrintf("outR[0], outI[0] are: %f,%f\n",outR[200],outI[200]);
/* Free intermediate Fortran format arrays */
mxFree(ipiv);
mxFree(wk);
mxDestroyArray(in);
mxDestroyArray(out);
};
/* Main program */
....
for (i = 0;i<1000;i++)
{
/*Allocate input matrix*/
Mat = mxCreateDoubleMatrix(n,n,mxCOMPLEX);
MatR = mxGetPr(Mat);
MatI = mxGetPi(Mat);
/*Assign value to MatR, MatI*/
blah blah
/*Allocate output matrix*/
InvMat = mxCreateDoubleMatrix(n,n,mxCOMPLEX);
InvMatR = mxGetPr(InvMat);
InvMatI = mxGetPi(InvMat);
/*Call the matrix inversion routine*/
invertComplexMatrix(MatR,MatI,InvMatR, InvMatI,n);
/*Manipulate the output Matrix to get some new output, for example */
out[i] = InvMatR[0] + InvMatI[0];
blah blah
/*Free the memory*/
mxDestroyArray(Mat);
mxDestroyArray(InvMat);
}
My questions are:
1) Is it okay to destroy the two mxArray in & out in the invertComplexMatrix routine?
2) Is it okay to allocate the two mxArray Mat & InvMat and then destroy them in the loop like the way I did?
Or if you see anything wrong or potentially wrong, please also let me know. I am a just a beginner in C/Mex programming so I'm quite sure that I've made a lot of dumb mistakes here.
Thank you so much,
Henry.

Jan on 28 Jan 2011
I do not see the bug. As far as I can see both questions are ok. It would be more efficient to create Mat and InvMat before the for(i) loop and destroy if after the loop once only. But at least this is no memory leak.
I'm not sure about ipiv and lwork. I expect that n values are enough for ipiv, but you reserve 2*n - but too much memory will not cause a problem. Is 10*n a valid size for lwork? I ever ask the Lapack function with lwork=-1 for their favourite workspace size.
But most of all I'd catch the info value and ask XERBLA in case of problems. If Lapack offers such a helpful mechanism for debugging, it is worth to use it.
I don't know mat2fort and fort2mat. You do not mxFree(zin), please look in the docs, if this is needed. But you create "out" twice:
out = mxCreateDoubleMatrix(n, n, mxCOMPLEX)
...
out = fort2mat(temp, n, n, n);
So mxFree(out) will probably not be able to cleanup the memory correctly.
A complex problem for a C-Mex beginner! Good luck, Jan

Henry on 29 Jan 2011
Thank you, Jan. I will look into it more carefully to find the bug. The program worked fine when I used mexCallMatlab to call Matlab's "inv" function but the speed was too slow. I know Matlab inv function uses LAPACK to do the inversion so I hope it is worth all the troubles of writing a direct inversion routine inside my MEX file using LAPACK to gain some speed up. Thanks again. Henry.

#### 1 Comment

Jan on 29 Jan 2011
You wil gain more speed, if you allocare the temporary arrays once only outside the loops. This will be the main source of acceleration in a C-mex compared to Matlab, because the actual inversion is perfromed by the same Lapack routine.

James Tursa on 29 Jan 2011
Edited: Jan on 7 Jun 2019
I took your code and modified it to be a mex routine that takes a 2D double complex input (caution: no argument checking done!) and returns the inverse back to the MATLAB workspace. I also put in some LAPACK prototypes, changed your int's to mwSignedIndex's (you should always do this for BLAS and LAPACK calls), and simplified the data copying (you were copying the data twice which was unnecessary). I also didn't put in any checks for the LAPACK function return INFO. Hopefully you can adapt this for your own use.
#include "mex.h"
#ifndef MWSIZE_MAX #define mwIndex int #define mwSignedIndex int #define mwSize int #endif
void zgetrf(mwSignedIndex *, mwSignedIndex *, double *, mwSignedIndex *, mwSignedIndex *, mwSignedIndex *);
void zgetri(mwSignedIndex *, double *, mwSignedIndex *, mwSignedIndex *, double *, mwSignedIndex *, mwSignedIndex *);
void invertComplexMatrix(double *MatR, double *MatI, double *InvMatR, double *InvMatI, mwSignedIndex n)
{ mwSignedIndex i, *ipiv, lwork, info; mwSignedIndex n2 = n*n; double *wk; double *zin; double *zp;
lwork = 10*n;
ipiv = (mwSignedIndex *) mxMalloc(n*sizeof(*ipiv) );
wk = (double * )mxMalloc(2*lwork*sizeof(*wk));
/* Convert input to Fortran format */
zp = zin = mxMalloc(2*n2*sizeof(*zin));
for( i=0; i<n2; i++ ) {
*zp++ = *MatR++;
*zp++ = *MatI++;
}
/* Check ill-condition */
zgetrf(&n, &n, zin, &n, ipiv, &info);
/*Invert the complex matrix*/
zgetri(&n, zin, &n, ipiv, wk, &lwork, &info);
zp = zin;
for( i=0; i<n2; i++ ) {
*InvMatR++ = *zp++;
*InvMatI++ = *zp++;
}
/* Free intermediate arrays */
mxFree(zin);
mxFree(wk);
mxFree(ipiv);
}
/*------------------------------------*/
/*Main program:*/
void mexFunction (int nout, mxArray *pout[], int nin, const mxArray *pin[])
{ /*Declaration up top*/
mwSignedIndex n;
mxArray *Mat, *InvMat;
double *MatR, *MatI, *InvMatR, *InvMatI;
/*Allocate input matrix*/
Mat = pin[0];
MatR = mxGetPr(Mat);
MatI = mxGetPi(Mat);
n = mxGetM(Mat);
/*Allocate output matrix*/
InvMat = mxCreateDoubleMatrix(n,n,mxCOMPLEX);
InvMatR = mxGetPr(InvMat);
InvMatI = mxGetPi(InvMat);
/*Call the matrix inversion routine*/
invertComplexMatrix(MatR, MatI, InvMatR, InvMatI, n);
pout[0] = InvMat;
}

James Tursa on 29 Jan 2011
Sorry about the formatting of my answer ... I don't know how to control this. It was a simple copy & paste from the editor.
Jan on 7 Jun 2019
Formatting fixed.

Henry on 30 Jan 2011
Thanks James. I tried the code and I encountered a strange bug. If I included the two mexPrintf messages (see the code below), the program ran fine. Strange enough, once I commented out the two mexPrintf's, the program crashed.
void invertComplexMatrix(double *MatR, double *MatI, double *InvMatR, double *InvMatI, mwSignedIndex n)
{ mwSignedIndex i, *ipiv, lwork, info;
mwSignedIndex n2 = n*n;
double *wk;
double *zin;
double *zp;
lwork = 10*n;
ipiv = (mwSignedIndex *) mxMalloc(n*sizeof(*ipiv) );
wk = (double * )mxMalloc(2*lwork*sizeof(*wk));
/* The "magic" mexPrintf that makes the program run*/
mexPrintf("MatR[2],MatI[2] is: %f,%f\n",MatR[2],MatI[2]);
/* Convert input to Fortran format */
zin = mxMalloc(2*n2*sizeof(*zin));
zp = zin;
for( i=0; i<n2; i++ ) {
*zp++ = *MatR++;
*zp++ = *MatI++;
}
/* Check ill-condition */
zgetrf(&n, &n, zin, &n, ipiv, &info);
/*Invert the complex matrix*/
zgetri(&n, zin, &n, ipiv, wk, &lwork, &info);
zp = zin;
for( i=0; i<n2; i++ ) {
*InvMatR++ = *zp++;
*InvMatI++ = *zp++;
}
/* Free intermediate arrays */
mxFree(zin);
mxFree(wk);
mxFree(ipiv);
}
...
/*Main program*/
....
/*The second "magic" mexPrintf*/
mexPrintf("MatR[2],MatI[2] is: %f,%f\n",MatR[2],MatI[2]);
invertComplexMatrix(MatR, MatI, MatInvR, MatInvI, n);
...
It really confuses me as mexPrintf in my understanding should not make any difference here. Maybe I'm missing something that you could help. Thank you.

James Tursa on 30 Jan 2011
Hmmm ... I don't see anything obvious. Works fine on my machine using MSVC and LCC. What OS & compiler are you using?
Henry on 30 Jan 2011
I'm using Windows XP 64-bit and Microsoft Visual C++ 2005.

Henry on 31 Jan 2011
I think there might be a bug in the subroutine. "zgetrf" will change the value of "zin". Hence, using "zin" as an input to "zegtri" might be a problem. What about the following?
void invertComplexMatrix(double *MatR, double *MatI, double *InvMatR, double *InvMatI, mwSignedIndex n)
{ mwSignedIndex i, *ipiv, lwork, info;
mwSignedIndex n2 = n*n;
double *wk;
double *zin;
double *zp;
lwork = 10*n;
ipiv = (mwSignedIndex *) mxMalloc(n*sizeof(*ipiv) );
wk = (double * )mxMalloc(2*lwork*sizeof(*wk));
/* The "magic" mexPrintf that makes the program run*/
mexPrintf("MatR[2],MatI[2] is: %f,%f\n",MatR[2],MatI[2]);
/* Convert input to Fortran format */
zin = mxMalloc(2*n2*sizeof(*zin));
zp = zin;
for( i=0; i<n2; i++ ) {
*zp++ = *MatR++;
*zp++ = *MatI++;
}
/* Check ill-condition */
zgetrf(&n, &n, zin, &n, ipiv, &info);
/*Invert the complex matrix*/
zgetri(&n, zp, &n, ipiv, wk, &lwork, &info);
for( i=0; i<n2; i++ ) {
*InvMatR++ = *zp++;
*InvMatI++ = *zp++;
}
/* Free intermediate arrays */
mxFree(zin);
mxFree(wk);
mxFree(ipiv);
}

James Tursa on 1 Feb 2011
No. That will not work. In the first place, after the loop converting the input arrays to Fortran format the zp pointer is pointing beyond valid memory, so you can't use it as an input to anything without first resetting it to valid memory or you will corrupt memory and bomb the program. But beyond that, the zgetrf routine is supposed to change the zin array in preparation for a zgetri call. e.g., from the netlib.org documentation:
zgetri:
http://www.netlib.org/lapack/complex16/zgetri.f
(The 2nd argument is A)
* A (input/output) COMPLEX*16 array, dimension (LDA,N)
* On entry, the factors L and U from the factorization
* A = P*L*U as computed by ZGETRF.
* On exit, if INFO = 0, the inverse of the original matrix A.
Rather, I think the problem may be the way you compile it and the libraries you use. Or perhaps with the specific inputs you use. Can you post explicit details about how you compile and link, and what the inputs you are testing with?