Choose Between Thread-Based and Process-Based Environments

With Parallel Computing Toolbox™, you can run your parallel code in different parallel environments, such as thread-based or process-based environments. These environments offer different advantages.

Note that thread-based environments support only a subset of the MATLAB® functions available for process workers. If you are interested in a function that is not supported, let the MathWorks Technical Support team know. For more information on support, see Check Support for Thread-Based Environment.

Select Parallel Environment

Depending on the type of parallel environment you select, features run on either process workers or thread workers. To decide which environment is right for you, consult the following diagram and table.

  • To use parallel pool features, such as parfor or parfeval, create a parallel pool in the chosen environment by using the parpool function.

    Environment: Thread-based environment on local machine
    Recommendation: Use this setup for reduced memory usage, faster scheduling, and lower data transfer costs.
    Example: parpool('threads')

    Note: If you choose 'threads', check that your code is supported. For more information, see Check Support for Thread-Based Environment.

    To find out if you can get sufficient benefit from a thread-based pool, measure data transfer in a process-based pool with ticBytes and tocBytes. If the data transfer is large, such as above 100 MB, then use 'threads'.

    Environment: Process-based environment on local machine
    Recommendation: Use this setup for most use cases and for prototyping before scaling to clusters or clouds.
    Example: parpool('local')

    Environment: Process-based environment on remote cluster
    Recommendation: Use this setup to scale up your computations.
    Example: parpool('MyCluster'), where MyCluster is the name of a cluster profile.

  • To use cluster features, such as batch, create a cluster object in the chosen environment by using the parcluster function. Note that cluster features are supported only in process-based environments.

    Environment: Process-based environment on local machine
    Recommendation: Use this setup if you have sufficient local resources, or to prototype before scaling to clusters or clouds.
    Example: parcluster('local')

    Environment: Process-based environment on remote cluster
    Recommendation: Use this setup to scale up your computations.
    Example: parcluster('MyCluster'), where MyCluster is the name of a cluster profile.
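
The advice above about measuring data transfer before choosing 'threads' could be applied along the following lines. This is a minimal sketch, not a prescribed workflow: the matrix size is illustrative, the variable names are made up for this example, and the 100 MB threshold is simply the figure quoted in the note. If your release names the local process profile differently (for example 'Processes'), substitute that name for 'local'.

```matlab
% Illustrative workload: a matrix that the task needs in full.
X = rand(2000, 2000);                     % about 32 MB of doubles

% Measure data transfer on a process-based pool.
pool = parpool('local');
ticBytes(pool);
total = fetchOutputs(parfeval(pool, @sum, 1, X, 'all'));
bytes = tocBytes(pool);                   % one row per worker: [sent, received]
delete(pool);

% Apply the rule of thumb from the note (the threshold is an assumption).
thresholdBytes = 100e6;
totalBytes = sum(bytes(:));
if totalBytes > thresholdBytes
    chosenEnvironment = 'threads';
else
    chosenEnvironment = 'local';
end
fprintf('Transferred %.1f MB; suggested pool: %s\n', ...
    totalBytes/1e6, chosenEnvironment);
```

The suggestion is only a starting point: before switching to 'threads', you still need to confirm that the functions your code calls are supported on thread workers.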

Recommendation

Use process-based environments by default, for the following reasons.

  • They support the full parallel language.

  • They are backwards compatible with previous releases.

  • They are more robust in the event of crashes.

  • External libraries do not need to be thread-safe.

Choose thread-based environments when:

  • Your parallel code is supported by thread-based environments.

  • You want reduced memory usage, faster scheduling, and lower data transfer costs.

Compare Process Workers and Thread Workers

The following shows a performance comparison between process workers and thread workers for an example that leverages the efficiency of thread workers.

Create some data.

X = rand(10000, 10000);

Create a parallel pool of process workers.

pool = parpool('local');
Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 6).

Time the execution and measure data transfer of some parallel code. For this example, use a parfeval execution.

ticBytes(pool);
tProcesses = timeit(@() fetchOutputs(parfeval(@sum,1,X,'all')))
tocBytes(pool)
tProcesses = 3.9060

             BytesSentToWorkers    BytesReceivedFromWorkers
             __________________    ________________________

    1                   0                       0          
    2                   0                       0          
    3                   0                       0          
    4                   0                       0          
    5             5.6e+09                   16254          
    6                   0                       0          
    Total         5.6e+09                   16254     

Note that the data transfer is significant. To avoid incurring data transfer costs, you can use thread workers. Delete the current parallel pool and create a thread-based parallel pool.

delete(pool);
pool = parpool('threads');

Time how long the same code takes to run.

tThreads = timeit(@() fetchOutputs(parfeval(@sum,1,X,'all')))
tThreads = 0.0232

Compare the times.

fprintf('Without data transfer, this example is %.2fx faster.\n', tProcesses/tThreads)
Without data transfer, this example is 168.27x faster.

Thread workers outperform process workers because thread workers can use the data X without copying it, and they have less scheduling overhead.
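
As a rough check on these figures, you can compare the size of X with the reported transfer. The sketch below only does arithmetic on numbers already shown above. Reading the ratio as the number of times X was copied to a worker (for example, because timeit calls the function more than once) is an interpretation, not something the output states, and the displayed total is rounded.

```matlab
bytesPerElement = 8;                          % double precision
bytesInX = 10000 * 10000 * bytesPerElement;   % 8e8 bytes, about 800 MB
bytesSent = 5.6e9;                            % total BytesSentToWorkers shown above
copiesOfX = bytesSent / bytesInX;             % ratio of transfer to size of X
fprintf('X is %.0f MB; the transfer equals about %.1f copies of X.\n', ...
    bytesInX/1e6, copiesOfX);
```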

Solve Optimization Problem in Parallel on Process-Based and Thread-Based Pool

This example shows how to use a process-based pool and a thread-based pool to solve an optimization problem in parallel. Thread-based pools are optimized for less data transfer, faster scheduling, and reduced memory usage, so they can result in a performance gain in your applications.

Problem Description

The problem is to change the position and angle of a cannon to fire a projectile as far as possible beyond a wall. The cannon has a muzzle velocity of 300 m/s. The wall is 20 m high. If the cannon is too close to the wall, it fires at too steep an angle, and the projectile does not travel far enough. If the cannon is too far from the wall, the projectile does not travel far enough. For full problem details, see Optimize an ODE in Parallel (Global Optimization Toolbox) or the latter part of the video Surrogate Optimization.

MATLAB Problem Formulation

To solve the problem, call the patternsearch solver from Global Optimization Toolbox. The objective function is in the cannonobjective helper function, which calculates the distance the projectile lands beyond the wall for a given position and angle. The constraint is in the cannonconstraint helper function, which checks whether the projectile hits the wall or falls to the ground before reaching it. The helper functions are in separate files that you can view when you run this example.

Set the following inputs for the patternsearch solver. Note that, to use Parallel Computing Toolbox, you must set 'UseParallel' to true in the optimization options.

lb = [-200;0.05];
ub = [-1;pi/2-.05];
x0 = [-30,pi/3];
opts = optimoptions('patternsearch',...
    'UseCompletePoll', true, ...
    'Display','off',...
    'UseParallel',true);
% No linear constraints, so set these inputs to empty:
A = [];
b = [];
Aeq = [];
beq = [];

Solve on Process-Based Pool

For comparison, solve the problem on a process-based parallel pool first.

Start a parallel pool of process workers.

p = parpool('local');
Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 6).

To reproduce the same computations later, seed the random number generator with the default value.

rng default;

Use a loop to solve the problem several times and average the results.

tProcesses = zeros(5,1);
for repetition = 1:numel(tProcesses)
    tic
    [xsolution,distance,eflag,outpt] = patternsearch(@cannonobjective,x0, ...
        A,b,Aeq,beq,lb,ub,@cannonconstraint,opts);
    tProcesses(repetition) = toc;
end
tProcesses = mean(tProcesses)
tProcesses = 2.7677

To prepare for the comparison with a thread-based pool, delete the current parallel pool.

delete(p);

Solve on Thread-Based Pool

Start a parallel pool of thread workers.

p = parpool('threads');
Starting parallel pool (parpool) ...
Connected to the parallel pool (number of workers: 6).

Restore the random number generator to default settings and run the same code as before.

rng default
tThreads = zeros(5,1);
for repetition = 1:numel(tThreads)
    tic
    [xsolution,distance,eflag,outpt] = patternsearch(@cannonobjective,x0, ...
        A,b,Aeq,beq,lb,ub,@cannonconstraint,opts);
    tThreads(repetition) = toc;
end
tThreads = mean(tThreads)
tThreads = 1.5790

Compare the performance of thread workers and process workers.

fprintf('In this example, thread workers are %.2fx faster than process workers.\n', tProcesses/tThreads)
In this example, thread workers are 1.75x faster than process workers.

Notice the performance gain due to the optimizations of the thread-based pool.

When you are done with computations, delete the parallel pool.

delete(p);
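
The two timing loops above differ only in the pool they run on, so the comparison could also be written once and repeated per environment. The following is a minimal sketch under stated assumptions: it uses a small stand-in workload (a parfeval sum over random data) rather than the cannon problem so that it runs without the helper files, and the variable names environments and meanTime are made up for this illustration.

```matlab
environments = {'local', 'threads'};      % pool names used in this example
meanTime = zeros(numel(environments), 1);
X = rand(1000, 1000);                     % stand-in data, not the cannon problem

for k = 1:numel(environments)
    p = parpool(environments{k});
    rng default                           % same random state for each environment
    t = zeros(5, 1);
    for repetition = 1:numel(t)
        tic
        % Stand-in workload; replace with the patternsearch call shown above.
        s = fetchOutputs(parfeval(p, @sum, 1, X, 'all'));
        t(repetition) = toc;
    end
    meanTime(k) = mean(t);
    delete(p);                            % release workers before the next pool
end

fprintf('Thread workers are %.2fx faster than process workers.\n', ...
    meanTime(1) / meanTime(2));
```

Deleting each pool inside the loop matters because only one parallel pool can be open at a time.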

What Are Thread-Based Environments?

In thread-based environments, parallel language features run on workers that are backed by computing threads, which run code on cores on a machine. Threads differ from computing processes in that they coexist within the same process and can share memory.

Thread-based environments have the following advantages over process-based environments.

  • Because thread workers can share memory, they can access numeric data without copying, so they are more memory efficient.

  • Communication between threads is less time consuming. Therefore, the overhead of scheduling a task or inter-worker communication is smaller.

When you use thread-based environments, keep the following considerations in mind.

  • Check that your code is supported for a thread-based environment. For more information, see Check Support for Thread-Based Environment.

  • If you are using external libraries from workers, then you must ensure that the library functions are thread-safe.
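
One way to act on the first consideration is to try the thread pool and fall back to process workers if the code fails there. This is a minimal sketch, not a recommended pattern from the toolbox: it assumes that any error raised while fetching the result means the code is unsupported on thread workers, and the names runWork and usedEnvironment are hypothetical.

```matlab
% Workload to run; here a function that thread workers can execute.
runWork = @(pool) fetchOutputs(parfeval(pool, @sum, 1, magic(4), 'all'));

pool = parpool('threads');
try
    result = runWork(pool);
    usedEnvironment = 'threads';
catch err
    % Assumption: an error here means the code is not supported on threads.
    warning('Thread pool failed (%s); retrying on process workers.', err.message);
    delete(pool);
    pool = parpool('local');
    result = runWork(pool);
    usedEnvironment = 'local';
end
delete(pool);
fprintf('Result %g computed on a ''%s'' pool.\n', result, usedEnvironment);
```

A fallback like this hides the cost of starting a second pool, so it suits prototyping better than production code.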

What Are Process-Based Environments?

In process-based environments, parallel language features run on workers that are backed by computing processes, which run code on cores on a machine. Processes differ from computing threads in that they are independent of each other and do not share memory.

Process-based environments have the following advantages over thread-based environments.

  • They support all language features and are backwards compatible with previous releases.

  • They are more robust in the event of crashes. If a process worker crashes, then the MATLAB client does not crash. If a process worker crashes and your code does not use spmd or distributed arrays, then the rest of the workers can continue running.

  • If you use external libraries from workers, then you do not need to pay attention to thread-safety.

  • You can use cluster features, such as batch.

When you use a process-based environment, keep the following consideration in mind.

  • If your code accesses files from workers, then you must use additional options, such as 'AttachedFiles' or 'AdditionalPaths', to make the data accessible.
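
The following sketch illustrates the 'AttachedFiles' option on a local process pool. It writes a small helper function to a temporary folder so that the example is self-contained; the function name doubleIt and the folder are hypothetical and stand in for your own code. On a local pool the workers can often see the same files as the client anyway, so treat this as an illustration of the syntax rather than a case where attaching is strictly required.

```matlab
% Create a small helper function in a temporary folder (hypothetical name).
helperDir = tempname;
mkdir(helperDir);
helperFile = fullfile(helperDir, 'doubleIt.m');
fid = fopen(helperFile, 'w');
fprintf(fid, 'function y = doubleIt(x)\ny = 2*x;\nend\n');
fclose(fid);
addpath(helperDir);                       % make the helper visible on the client

% Start process workers and send the helper file to them.
pool = parpool('local', 'AttachedFiles', {helperFile});
f = parfeval(pool, @doubleIt, 1, 21);
y = fetchOutputs(f);
fprintf('Worker returned %d.\n', y);

% Clean up the pool and the client path.
delete(pool);
rmpath(helperDir);
```

'AdditionalPaths' plays a similar role for folders that the workers can already reach, such as a shared file system on a cluster; it is not shown here.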

Check Support for Thread-Based Environment

Thread workers support only a subset of the MATLAB functions available for process workers. If you are interested in a function that is not supported, let the MathWorks Technical Support team know.

parpool, parfor, parfeval, parfevalOnAll, tall, and parallel.pool.Constant are supported, subject to the following limitations.

  • A thread-based parallel pool does not have an associated cluster object.

  • afterEach and afterAll are not supported.

  • FevalQueue is not supported.

  • Tall arrays do not support the write function, and they support only tabular text and in-memory inputs.

Other parallel language functionality, including spmd, distributed, and parallel.pool.DataQueue, is not supported.
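
If the same code may run on either kind of pool, you could check the pool type before using features that the list above marks as unsupported on thread workers. This is a minimal sketch that assumes the thread pool object is of class parallel.ThreadPool; verify the class name in your release. The variable names are made up for this illustration.

```matlab
% Reuse the current pool if there is one; otherwise start a thread pool.
pool = gcp('nocreate');
if isempty(pool)
    pool = parpool('threads');
end
isThreadPool = isa(pool, 'parallel.ThreadPool');   % assumed class name

% parfeval is listed above as supported, so it is safe on either pool.
largest = fetchOutputs(parfeval(pool, @max, 1, [3 1 4]));

if isThreadPool
    disp('Thread pool: skip spmd, distributed, and DataQueue code paths.');
else
    disp('Process pool: the full parallel language is available.');
end
```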

The following core MATLAB functionality is supported on a thread worker.

In general, functionality that modifies or accesses things outside of the thread worker is not supported, including the following core MATLAB functionality.

gpuArray is supported on a thread worker.
