Launching MATLAB with MPI on multiple nodes: missing output files

Hello, I am running a MATLAB program from a batch script that launches it on multiple nodes. Within each node, several independent jobs are parallelized with a parfor loop. Inside the parfor loop, a for loop saves files.
The code I am running can be simplified to the following example.
The MATLAB code looks like this:
-----------------------------------------------------------
% Open a pool of 16 workers on the local cluster profile.
my_par = parcluster('local');
my_parallel_job = parpool(my_par, 16);
parfor n_loop = 1:16
    for n_loop_2 = 1:5
        % File name: <host><n_loop>_<n_loop_2>.txt
        fileid = fopen([my_par.Host num2str(n_loop) '_' ...
            num2str(n_loop_2) '.txt'], 'w');
        fprintf(fileid, '%d \n', n_loop_2);
        fclose(fileid);
    end
end
delete(my_parallel_job);
exit;
-------------------------------------------------------------
Here is my batch script:
---------------------------------------------------
#!/bin/bash
#SBATCH --ntasks-per-node=1
#SBATCH --partition=cfel
#SBATCH -t 00:10:00
#SBATCH --nodes=2
#SBATCH --job-name name_lalala
#SBATCH --output %j-%N.out
#SBATCH --error %j-%N.err
. /etc/profile.d/modules.sh
module load mpi/mpich-3.2-x86_64
module load matlab
mpirun -np 2 matlab -nodisplay -nosplash -r "my_matlab_file_name"
-------------------------------------------------------------
Everything works properly when the number of nodes in the batch script is 1 (total number of files = 16*5 = 80). But when there is more than one node, some output files are missing.
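To see exactly which files are missing, a quick check along the following lines may help. This is only a sketch: the function name `list_missing` is hypothetical, and it assumes the naming scheme from the MATLAB code above, `<host><n>_<m>.txt`, with n = 1..16 and m = 1..5.

```shell
#!/bin/bash
# list_missing DIR HOST: print every expected file name that is absent in DIR.
# Assumes the naming scheme from the MATLAB code: <host><n>_<m>.txt,
# with n = 1..16 (parfor index) and m = 1..5 (inner loop index).
list_missing() {
    local dir=$1 host=$2 n m name
    for n in $(seq 1 16); do
        for m in $(seq 1 5); do
            name="${host}${n}_${m}.txt"
            # Report the name only when the file does not exist.
            [ -e "${dir}/${name}" ] || echo "${name}"
        done
    done
}
```

For example, `list_missing . node001 | wc -l` would count the missing files for a node called `node001` (a placeholder host name); a result of 0 means all 80 files for that node are present.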
Could anyone suggest what might be happening?
Many thanks, Lu

Answers (1)

Answering my own question: use srun and tasks-per-node instead of mpirun.
Problem solved.
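The final script was not posted, so the following is only a sketch of what the change might look like. It assumes the same partition, time limit, and MATLAB file name as in the question, and that `srun` picks up `--nodes=2` and `--ntasks-per-node=1` from the job allocation and so starts one MATLAB instance on each node. Dropping the MPICH module is also an assumption, since `mpirun` is no longer called.

```shell
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --partition=cfel
#SBATCH -t 00:10:00
#SBATCH --job-name name_lalala
#SBATCH --output %j-%N.out
#SBATCH --error %j-%N.err

# Set up the module command and load MATLAB
# (the MPICH module is assumed to be unnecessary without mpirun).
. /etc/profile.d/modules.sh
module load matlab

# srun launches one task per node (2 nodes x 1 task per node),
# so each node runs its own MATLAB instance with its own local pool.
srun matlab -nodisplay -nosplash -r "my_matlab_file_name"
```

With this layout, each node should write its own 16*5 files, so two nodes would be expected to produce 160 files in total if the host names differ.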

1 Comment

Hi lu wang, could you please look at this question [https://ch.mathworks.com/matlabcentral/answers/451144-error-unexpected-matlab-operator-on-cluster] and tell me what the error means and how to resolve it?

Asked: on 28 Mar 2018

Commented: on 20 Mar 2019