matlabpool local files

Daozheng Chen on 26 Feb 2012
Hi:
I am using matlabpool in jobs that I submit to a cluster. The cluster uses Sun Grid Engine (SGE) as its scheduler. Each submitted job calls "matlabpool local 2;", which creates a set of files in ".matlab/local_scheduler_data/R2011b", such as "Job10.common.mat".
As more jobs are submitted, each job is more likely to fail. Sometimes I get this error:
Error using matlabpool (line 136)
Failed to open matlabpool. (For information in addition to the causing error,
validate the configuration 'local' in the Configurations Manager.)
Caused by:
Error using distcomp.interactiveclient/start (line 88)
Failed to start matlabpool.
This is caused by:
Unable to read MAT-file
/home-nfs/dzchen/.matlab/local_scheduler_data/R2011b/Job19/Task1.in.mat
File may be corrupt.
Any suggestion is appreciated!
Thank you!
Daozheng
  1 Comment
Dana on 25 Sep 2012
We are also having intermittent problems when launching multiple instances of MATLAB on the compute nodes of our cluster. We sometimes get this error, along with a variety of other errors that seem to be related to files in the .matlab folder. Was this issue ever resolved?

Accepted Answer

Thomas on 25 Sep 2012
Did you validate your SGE configuration? Do you have the SGE-specific submit files on the path? These submit function files are required to submit SGE commands to the queueing system.
We have an SGE scheduler with MATLAB DCS which works pretty well.
The files for R2012a are:
SGEClusterInfo.m
communicatingJobWrapper.sh
communicatingSubmitFcn.m
createSubmitScript.m
deleteJobFcn.m
extractJobId.m
getJobStateFcn.m
getRemoteConnection.m
getSubmitString.m
independentJobWrapper.sh
independentSubmitFcn.m
The files for R2011b and older are:
createSubmitScript.m
destroyJobFcn.m
distibsubmit.m
distributedJobWrapper.sh
distributedSubmitFcn.m
extractJobId.m
SGEClusterInfo.m
submitjob.m
getJobStateFcn.m
getRemoteConnection.m
getSubmitString.m
parallelJobWrapper.sh
parallelSubmitFcn.m
These files should be available from MathWorks. Please follow the readme that comes with them, as you will have to edit parts of these files to suit your SGE cluster's queueing system.
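As a rough way to check the "on the path" question above, the following sketch tests whether some of the R2011b-era files from the list are visible to MATLAB. The file names are copied from the list; the selection of which ones to check and the variable names (requiredFiles, isFound, missing) are assumptions made for illustration, so adjust them to your release and setup.

```matlab
% Sketch: check whether the SGE submit files are visible to MATLAB.
% The names below are taken from the R2011b-and-older list; which ones
% you actually need depends on your release and configuration.
requiredFiles = { ...
    'createSubmitScript.m', 'destroyJobFcn.m', 'distributedSubmitFcn.m', ...
    'distributedJobWrapper.sh', 'extractJobId.m', 'getJobStateFcn.m', ...
    'getSubmitString.m', 'parallelJobWrapper.sh', 'parallelSubmitFcn.m'};

% exist(name, 'file') returns 2 for a file found in the current folder
% or on the MATLAB search path, and 0 if nothing is found.
isFound = cellfun(@(f) exist(f, 'file') == 2, requiredFiles);

missing = requiredFiles(~isFound);
if isempty(missing)
    disp('All listed submit files were found on the MATLAB path.');
else
    % The format string is reused once per missing file name.
    fprintf('Not found on the MATLAB path: %s\n', missing{:});
end
```

If any files are reported as missing, adding the folder that contains them with addpath (and rerunning the check) is one way to confirm the path is the problem.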
  1 Comment
Jason Ross on 25 Sep 2012
These files are available in matlabroot\toolbox\distcomp\examples; the examples there go over configuring the generic scheduler interface.
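To see what is in that folder on a given installation, something like the sketch below may help. It only assumes the folder location named above; the variable names (exampleDir, listing) are illustrative, and the folder layout can differ between releases.

```matlab
% Sketch: list the contents of the examples folder mentioned above.
% fullfile builds the path with the correct separator for the platform.
exampleDir = fullfile(matlabroot, 'toolbox', 'distcomp', 'examples');

% exist(..., 'dir') returns 7 when the folder exists.
if exist(exampleDir, 'dir') == 7
    listing = dir(exampleDir);
    disp({listing.name}');   % names of files and subfolders, one per row
else
    fprintf('Folder not found: %s\n', exampleDir);
end
```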
