Error was detected while a MEX-file was running and MATLAB is exiting because of fatal error
    Naveen kumar Elumalai
 on 14 Mar 2019
  
    
    
    
    
    Commented: Srinidhi Ganeshan
 on 21 Mar 2019
I am trying to run the batched version of QR (cublasDgeqrfBatched) in MATLAB using cuBLAS by calling it from a MEX file. I am stuck on this error and have not been able to find an answer. Is there any workaround for this problem? I am attaching the code that I am running and also the crash report.
#include "mex.h"
#include "cublas_v2.h"
// The MEX gateway function.
    void mexFunction(int nlhs, mxArray *plhs[], int nrhs,const mxArray *prhs[]) 
    {
    // Get input variables from Matlab (host variables).
    double **A;
    // Get dimensions of input variables from Matlab.
    size_t m, n, k;
    const mwSize *Adims;
    Adims = mxGetDimensions(prhs[0]);
    //Bdims = mxGetDimensions(prhs[1]);
    m = Adims[0];
    n = Adims[1];
    k = Adims[2];
    A = (double**)mxGetPr(prhs[0]);
    int lda = m;
    const int batchSize=k;
//step -1 Allocate storage for batch count 
    double **tau;
    tau = (double**)malloc(batchSize * sizeof(double*));        
    for (int i = 0; i < batchSize; i++) 
    {
    		tau[i] = (double*)malloc(n * sizeof(double));
    }
      int *info;
	info = (int*)malloc(batchSize * sizeof(int));
//step -2 create host pointer array to the gpu array
    double **d_A, **d_TAU, **h_d_A, **h_d_TAU;
	h_d_A = (double**)malloc(batchSize * sizeof(double*));
	h_d_TAU = (double**)malloc(batchSize * sizeof(double*));
	for (int i = 0; i < batchSize; i++) {
		cudaMalloc((double**)&h_d_A[i], m*n * sizeof(double));
		cudaMalloc((double**)&h_d_TAU[i], n * sizeof(double));
	}
//step -3 copy host array of pointers to device
	cudaMalloc((double**)&d_A, batchSize * sizeof(double*));
	cudaMalloc((double**)&d_TAU, batchSize * sizeof(double));
	cudaMemcpy(d_A, h_d_A, batchSize * sizeof(double*), cudaMemcpyHostToDevice);
	cudaMemcpy(d_TAU, h_d_TAU, batchSize * sizeof(double*), cudaMemcpyHostToDevice);
	for (int i = 0; i < batchSize; i++)
	{
		cudaMemcpy(h_d_A[i], A[i], m *n * sizeof(double), cudaMemcpyHostToDevice);
		cudaMemcpy(h_d_TAU[i], tau[i], n * sizeof(double), cudaMemcpyHostToDevice);
	}
// --- CUBLAS initialization
	cublasHandle_t cublas_handle;
	cublasCreate(&cublas_handle);
	cublasDgeqrfBatched(cublas_handle, m, n, d_A, lda, d_TAU, info, batchSize);
	for (int i = 0; i < batchSize; i++)
		cudaMemcpy(A[i], h_d_A[i], m*n * sizeof(double), cudaMemcpyDeviceToHost);
//print the A matrix
	for (int k = 0; k < batchSize; k++) {
		for (int j = 0; j < m; j++) {
			for (int i = 0; i < n; i++) {
				int index = j * m + i;//not tested
				//count = count + 1;
				printf("\n %d The values are %lf",k+index, A[k][index]);
			} // i
		} // j
	} // k
}
When I execute the above program, this is the crash report I get.
--------------------------------------------------------------------------------
       Segmentation violation detected at Thu Mar 14 08:52:21 2019 -0700
--------------------------------------------------------------------------------
Configuration:
  Crash Decoding           : Disabled - No sandbox or build area path
  Crash Mode               : continue (default)
  Default Encoding         : UTF-8
  Deployed                 : false
  GNU C Library            : 2.24 stable
  Graphics Driver          : Unknown software 
  Java Version             : Java 1.8.0_144-b01 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode
  MATLAB Architecture      : glnxa64
  MATLAB Entitlement ID    : 1378095
  MATLAB Root              : /cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/matlab/2018a
  MATLAB Version           : 9.4.0.813654 (R2018a)
  OpenGL                   : software
  Operating System         : "CentOS Linux release 7.6.1810 (Core) "
  Process ID               : 171494
  Processor ID             : x86 Family 6 Model 79 Stepping 1, GenuineIntel
  Session Key              : 32be7088-53bb-46ac-a878-a0e4028bfd50
  Static TLS mitigation    : Disabled: Unnecessary 1
  Window System            : No active display
Fault Count: 1
Abnormal termination
Register State (from fault):
  RAX = 0000000000000000  RBX = 0000000000000000
  RCX = 00002b105df7b080  RDX = 0000000000000000
  RSP = 00002b1073ffcc00  RBP = 00002b1073ffcc10
  RSI = 00002b1073ffcf10  RDI = 0000000000000000
   R8 = 00002b1042d133e8   R9 = 0000000000000030
  R10 = 000000000000042b  R11 = 00002b1047aff750
  R12 = 0000000000000000  R13 = 00002b1073ffcf10
  R14 = 0000000000000000  R15 = 00002b105df7b080
  RIP = 00002b1047aa710c  EFL = 0000000000010206
   CS = 0033   FS = 0000   GS = 0000
Stack Trace (from fault):
[  0] 0x00002b1047aa710c                               bin/glnxa64/libmx.so+00499980 _ZN6matrix6detail10noninlined12mx_array_api15mxGetDimensionsEPK11mxArray_tag+00000012
[  1] 0x00002b10e29b6c00         /home/naveen/Matlab/Mexcuda/example.mexa64+00003072 mexFunction+00000106
[  2] 0x00002b105dd49080                              bin/glnxa64/libmex.so+00413824
[  3] 0x00002b105dd49447                              bin/glnxa64/libmex.so+00414791
[  4] 0x00002b105dd49f2b                              bin/glnxa64/libmex.so+00417579
[  5] 0x00002b105dd3430c                              bin/glnxa64/libmex.so+00328460
[  6] 0x00002b105bdca2ad                   bin/glnxa64/libmwm_dispatcher.so+00979629 _ZN8Mfh_file16dispatch_fh_implEMS_FviPP11mxArray_tagiS2_EiS2_iS2_+00000829
[  7] 0x00002b105bdcabae                   bin/glnxa64/libmwm_dispatcher.so+00981934 _ZN8Mfh_file11dispatch_fhEiPP11mxArray_tagiS2_+00000030
[  8] 0x00002b105ee70da1                          bin/glnxa64/libmwm_lxe.so+12619169
[  9] 0x00002b105ee71982                          bin/glnxa64/libmwm_lxe.so+12622210
[ 10] 0x00002b105ef59fc9                          bin/glnxa64/libmwm_lxe.so+13574089
[ 11] 0x00002b105eefb431                          bin/glnxa64/libmwm_lxe.so+13186097
[ 12] 0x00002b105e7015a8                          bin/glnxa64/libmwm_lxe.so+04822440
[ 13] 0x00002b105e703cbc                          bin/glnxa64/libmwm_lxe.so+04832444
[ 14] 0x00002b105e70001d                          bin/glnxa64/libmwm_lxe.so+04816925
[ 15] 0x00002b105e6f9ba1                          bin/glnxa64/libmwm_lxe.so+04791201
[ 16] 0x00002b105e6f9dd9                          bin/glnxa64/libmwm_lxe.so+04791769
[ 17] 0x00002b105e6ff846                          bin/glnxa64/libmwm_lxe.so+04814918
[ 18] 0x00002b105e6ff92f                          bin/glnxa64/libmwm_lxe.so+04815151
[ 19] 0x00002b105e82e503                          bin/glnxa64/libmwm_lxe.so+06055171
[ 20] 0x00002b105e831cf3                          bin/glnxa64/libmwm_lxe.so+06069491
[ 21] 0x00002b105ed41f6d                          bin/glnxa64/libmwm_lxe.so+11378541
[ 22] 0x00002b105ecef60c                          bin/glnxa64/libmwm_lxe.so+11040268
[ 23] 0x00002b105ecf6448                          bin/glnxa64/libmwm_lxe.so+11068488
[ 24] 0x00002b105ecf7e22                          bin/glnxa64/libmwm_lxe.so+11075106
[ 25] 0x00002b105ed85807                          bin/glnxa64/libmwm_lxe.so+11655175
[ 26] 0x00002b105ed85aea                          bin/glnxa64/libmwm_lxe.so+11655914
[ 27] 0x00002b105dab591a                         bin/glnxa64/libmwbridge.so+00207130 _Z8mnParserv+00000874
[ 28] 0x00002b105b7bebb8                            bin/glnxa64/libmwmcr.so+00641976
[ 29] 0x00002b1048475e9f                         bin/glnxa64/libmwmlutil.so+06524575 _ZNSt13__future_base13_State_baseV29_M_do_setEPSt8functionIFSt10unique_ptrINS_12_Result_baseENS3_8_DeleterEEvEEPb+00000031
[ 30] 0x00002b1046e464f9 /cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib/libpthread.so.0+00058617
[ 31] 0x00002b1048476126                         bin/glnxa64/libmwmlutil.so+06525222 _ZSt9call_onceIMNSt13__future_base13_State_baseV2EFvPSt8functionIFSt10unique_ptrINS0_12_Result_baseENS4_8_DeleterEEvEEPbEJPS1_S9_SA_EEvRSt9once_flagOT_DpOT0_+00000102
[ 32] 0x00002b105b7be9d3                            bin/glnxa64/libmwmcr.so+00641491
[ 33] 0x00002b10435a61a2                            bin/glnxa64/libmwmvm.so+03367330 _ZN14cmddistributor15PackagedTaskIIP10invokeFuncIN7mwboost8functionIFvvEEEEENS2_10shared_ptrINS2_13unique_futureIDTclfp_EEEEEERKT_+00000082
[ 34] 0x00002b10435a64e8                            bin/glnxa64/libmwmvm.so+03368168 _ZNSt17_Function_handlerIFN7mwboost3anyEvEZN14cmddistributor15PackagedTaskIIP10createFuncINS0_8functionIFvvEEEEESt8functionIS2_ET_EUlvE_E9_M_invokeERKSt9_Any_data+00000024
[ 35] 0x00002b105b206e6c                            bin/glnxa64/libmwiqm.so+00867948 _ZN7mwboost6detail8function21function_obj_invoker0ISt8functionIFNS_3anyEvEES4_E6invokeERNS1_15function_bufferE+00000028
[ 36] 0x00002b105b20697f                            bin/glnxa64/libmwiqm.so+00866687 _ZN3iqm18PackagedTaskPlugin7executeEP15inWorkSpace_tagRN7mwboost10shared_ptrIN14cmddistributor17IIPCompletedEventEEE+00000447
[ 37] 0x00002b105b1e4ab1                            bin/glnxa64/libmwiqm.so+00727729
[ 38] 0x00002b105b1c7ac8                            bin/glnxa64/libmwiqm.so+00608968
[ 39] 0x00002b105b1c28bf                            bin/glnxa64/libmwiqm.so+00587967
[ 40] 0x00002b1044694a05                       bin/glnxa64/libmwservices.so+03262981
[ 41] 0x00002b1044695ff2                       bin/glnxa64/libmwservices.so+03268594
[ 42] 0x00002b10446968fb                       bin/glnxa64/libmwservices.so+03270907 _Z25svWS_ProcessPendingEventsiib+00000187
[ 43] 0x00002b105b7bffc3                            bin/glnxa64/libmwmcr.so+00647107
[ 44] 0x00002b105b7c06a4                            bin/glnxa64/libmwmcr.so+00648868
[ 45] 0x00002b105b7b93f1                            bin/glnxa64/libmwmcr.so+00619505
[ 46] 0x00002b1046e3f1f4 /cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib/libpthread.so.0+00029172
[ 47] 0x00002b104554c16f /cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib/libc.so.6+00950639 clone+00000095
[ 48] 0x0000000000000000                                   <unknown-module>+00000000
This error was detected while a MEX-file was running. If the MEX-file
is not an official MathWorks function, please examine its source code
for errors. Please consult the External Interfaces Guide for information
on debugging MEX-files.
2 Comments
  Edric Ellis
    
      
 on 15 Mar 2019
				Not really an answer to your question as such - but note that if you have Parallel Computing Toolbox, you might be able to use pagefun - it doesn't support QR directly, but it does support batched mldivide...
Accepted Answer
  James Tursa
      
      
 on 14 Mar 2019
        
      Edited: James Tursa
      
      
 on 14 Mar 2019
  
      Can you explain what you intended with these lines for A:
    double **A;
        :
    A = (double**)mxGetPr(prhs[0]);
If you pass in a regular double array, there are doubles in the data area of prhs[0], not pointers to doubles.  You've got one too many levels of indirection here.  What were your intentions with this?
Downstream in your code you appear to use A[i] as a pointer in a memory copy. Since there are doubles behind A, and not pointers to doubles behind A, you would be using a floating point double bit pattern as a pointer and this will crash MATLAB.
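If the intent of double **A was to have one pointer per batch, a minimal sketch of building that table from the flat data looks like the following. This is plain C with no MEX or CUDA calls so it can be compiled on its own; fill_plane_pointers, flat and planes are hypothetical names, and flat stands in for the double * that mxGetPr returns for an m-by-n-by-k double array.

```c
#include <assert.h>
#include <stddef.h>

/* Fill planes[i] with the address of the i-th m-by-n slice of a flat,
 * column-major m-by-n-by-k buffer. Nothing is copied: each entry is just
 * an offset into the same block of doubles.
 * Assumption: flat plays the role of the pointer returned by mxGetPr. */
void fill_plane_pointers(const double *flat, size_t m, size_t n, size_t k,
                         const double **planes)
{
    assert(flat != NULL && planes != NULL);
    for (size_t i = 0; i < k; i++) {
        planes[i] = flat + i * m * n;   /* one level of indirection only */
    }
}
```

Note that cublasDgeqrfBatched expects its pointer table to hold device addresses, so in the real MEX file the host planes[i] would only be used as the source of each host-to-device copy.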
3 Comments
  James Tursa
      
      
 on 15 Mar 2019
				
      Edited: James Tursa
      
      
 on 21 Mar 2019
  
			With A declared as a plain double * (what mxGetPr returns), A points to the first batch (we typically use the term "plane" or "page" here to refer to a 2D slice of a multi-dimensional array). To point to the next plane, simply increment the pointer by the appropriate amount. E.g.,
A points to the first plane
A+m*n points to the second plane
A+m*n*2 points to the third plane
A+m*n*3 points to the fourth plane
etc.
So, programmatically you would simply use A+m*n*i as your pointer to the plane you want to process, where i is a 0-based index (like you currently have in your for-loop).
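A rough sketch of that pointer arithmetic in plain C, under a few assumptions: plane_ptr, plane_elem and copy_plane are hypothetical helper names, memcpy stands in for the cudaMemcpy(..., cudaMemcpyHostToDevice) call in the MEX file, and the data is column-major as MATLAB stores it, so element (row, col) of a plane sits at row + col*m.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Address of plane i (0-based) in a flat m-by-n-by-k column-major array. */
const double *plane_ptr(const double *A, size_t m, size_t n, size_t i)
{
    assert(A != NULL);
    return A + m * n * i;
}

/* Element (row, col) of one plane; column-major, so rows vary fastest. */
double plane_elem(const double *plane, size_t m, size_t row, size_t col)
{
    return plane[row + col * m];
}

/* Copy plane i into dst, which must hold m*n doubles.
 * Assumption: memcpy is a stand-in for the host-to-device cudaMemcpy. */
void copy_plane(double *dst, const double *A, size_t m, size_t n, size_t i)
{
    memcpy(dst, plane_ptr(A, m, n, i), m * n * sizeof(double));
}
```

In the original loop this would mean passing A + m*n*i as the source of each copy instead of A[i], and indexing the printed values with row + col*m inside the plane.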