Shared memory in parfor genetic algorithm

Hello, MATLAB community!
I use gamultiobj with parallelization, because my optimization function takes a long time to run. To speed it up, I want to cache repeated results, which are calculated inside the optimization function, in some array or map. That is, the function can receive inputs for which the result has already been computed, so it makes no sense to compute it again. All workers must have access to this cache and be able to read and write values.
The problem is that I do not know how to do this, because gamultiobj uses parfor internally, and, as I understand it, parfor does not allow data to be passed between workers. I wanted to use global variables, but they cannot be used in parfor. An implementation based on labProbe / labReceive / labSend would have suited me perfectly, but unfortunately those functions can only be used in spmd.
Thanks, Alexander.

Accepted Answer

Edric Ellis on 12 May 2022
If you're using R2022a or later, you could use ValueStore to allow workers to share values. The main requirement here is for you to come up with a way to convert the input arguments of your function into a "key" that can be used with the ValueStore. If that is straightforward, then you might be able to get things to work like this:
if isempty(gcp("nocreate")); parpool("local"); end
Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 2).
parfor ii = 1:10
    val = cachingMagic(randi(3));
    out(ii) = sum(val, "all");
end
Analyzing and transferring files to the workers ...done.
Cache miss for magic_3 at 12-May-2022 12:43:00
Cache miss for magic_2 at 12-May-2022 12:43:00
Cache hit for magic_2 at 12-May-2022 12:43:01
Cache hit for magic_1 at 12-May-2022 12:43:01
Cache hit for magic_2 at 12-May-2022 12:43:01
Cache hit for magic_1 at 12-May-2022 12:43:01
Cache hit for magic_2 at 12-May-2022 12:43:00
Cache miss for magic_1 at 12-May-2022 12:43:00
Cache hit for magic_2 at 12-May-2022 12:43:01
Cache hit for magic_2 at 12-May-2022 12:43:01
% cachingMagic returns "magic(in)", with caching.
function out = cachingMagic(in)
arguments
    in (1,1) double {mustBeInteger}
end
% Because the input is a simple scalar, we can generate a string key very
% easily.
cacheKey = sprintf("magic_%d", in);
% getCurrentValueStore returns empty on the client, so we should guard
% against that
vs = getCurrentValueStore();
isCached = ~isempty(vs) && vs.isKey(cacheKey);
if isCached
    fprintf('Cache hit for %s at %s\n', cacheKey, string(datetime));
    out = vs(cacheKey);
else
    fprintf('Cache miss for %s at %s\n', cacheKey, string(datetime));
    % Not in cache, must compute
    out = magic(in);
    % Introduce an arbitrary delay to simulate slow computation
    pause(rand);
    % If we have a ValueStore, cache the result
    if ~isempty(vs)
        vs(cacheKey) = out;
    end
end
end
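For a gamultiobj objective the input is normally a row vector rather than a scalar, so the key has to encode every element. The following is only a sketch of one way to do that, not part of the original answer: cachedObjective, makeCacheKey and expensiveObjective are hypothetical names, and the "%.17g" format is an assumption chosen so that two inputs share a key only when their elements print identically at 17 significant digits.

```matlab
% Usage sketch: on the client there is no ValueStore, so this call simply
% computes the value without caching.
f = cachedObjective([1, 2.5]);

function f = cachedObjective(x)
% Hypothetical wrapper around the real objective, following the same
% pattern as cachingMagic.
key = makeCacheKey(x);
vs = getCurrentValueStore();          % empty on the client
if ~isempty(vs) && isKey(vs, key)
    f = vs(key);                      % reuse a value stored earlier
else
    f = expensiveObjective(x);
    if ~isempty(vs)
        vs(key) = f;                  % make the value visible to other workers
    end
end
end

function key = makeCacheKey(x)
% Assumption: x is a real numeric vector. Each element is printed with 17
% significant digits and followed by an underscore.
key = "x_" + sprintf("%.17g_", x);
end

function f = expensiveObjective(x)
% Placeholder for the slow two-objective function.
f = [sum(x.^2), sum((x - 1).^2)];
end
```

With continuous design variables, exact repeats may be rare, so whether this key produces enough cache hits depends on the problem.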
  1 Comment
Alexander Kobyzhev on 12 May 2022
Thank you very much for the answer, it helped me implement the cache!
Sending and receiving data between workers is not as fast as we would like. The profiler showed that one thousand get and put calls take about 25-30 seconds in total, which is not very fast. Even so, I got a 25% increase in performance.


More Answers (1)

Walter Roberson on 11 May 2022
You can use a parallel data queue (parallel.pool.DataQueue) to send results back from the workers to the client, and another set of queues to distribute results to the workers. It is a bit of a nuisance, and might not be efficient.
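As a rough illustration of the first half of that idea (worker to client only), here is a minimal sketch. The recordResult helper, the results map and the ii^2 stand-in computation are assumptions made for the example; sending values back to the workers needs additional queues and is not shown.

```matlab
% Client-side store for results reported by the workers. containers.Map is
% a handle object, so the callback can update it in place.
results = containers.Map('KeyType', 'double', 'ValueType', 'double');

% Queue created on the client; afterEach runs the callback on the client
% for every message a worker sends.
q = parallel.pool.DataQueue;
afterEach(q, @(msg) recordResult(results, msg));

parfor ii = 1:4
    value = ii^2;            % stand-in for an expensive evaluation
    send(q, [ii, value]);    % worker -> client
end

function recordResult(results, msg)
% Hypothetical callback: msg is [input, value].
results(msg(1)) = msg(2);
end
```

The callback runs asynchronously on the client, so the map may not be fully populated at the exact moment the loop returns.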
You could also do something like hash the arguments to get an index to use into a memory map. This might be a challenge to do efficiently.
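A sketch of just the hashing step might look like the following. It is an assumption-laden example, not the method from the answer: hashToSlot and nSlots are hypothetical, MD5 through Java's MessageDigest is one possible hash among many, and the memory-mapped storage and collision handling are left out entirely.

```matlab
x = [0.25, 1.5, 3];          % example input vector
nSlots = 1024;               % hypothetical number of slots in the table
slot = hashToSlot(x, nSlots);

function slot = hashToSlot(x, nSlots)
% Map a real numeric vector to a slot index in 1..nSlots.
% Requires a MATLAB session with Java available.
md = java.security.MessageDigest.getInstance('MD5');
rawBytes = typecast(double(x), 'uint8');           % raw bytes of the input
digest = typecast(md.digest(rawBytes), 'uint8');   % 16-byte digest
word = typecast(digest(1:4), 'uint32');            % first four bytes as one integer
slot = mod(double(word), nSlots) + 1;              % 1-based index
end
```

Different inputs can land in the same slot, so each slot would also need to store the full input (or key) to detect collisions.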
  1 Comment
Alexander Kobyzhev on 12 May 2022
Thanks for the answer!
It turns out that, to exchange any values between the client and the workers, I need to create one DataQueue on the client side and four DataQueues for the workers (if I have four workers). On top of that, I need to send those four DataQueues from the workers to the client through the client's DataQueue. I can also only process the received data after each generation of the genetic algorithm, which is not ideal, because in some cases the same data may be computed and sent to the client more than once (a loss of performance).
Apparently I will have to abandon this caching idea...


Release

R2022a
