MATLAB random number generation in parfor loop

I want to generate the same normal random numbers in a for loop and a parfor loop. Following the MATLAB documentation, I tried three different methods:
Method 1: Using rng(seed, 'twister')
N = 1;
for ind = 1:10
rng(ind, 'twister')
samps(ind) = normrnd(0,1,N,1);
end
plot(samps); hold on
parfor ind = 1:10
rng(ind, 'twister')
samps(ind) = normrnd(0,1,N,1);
end
scatter(1:length(samps), samps)
legend({'using for loop', 'using parfor loop'})
Method 2: Using RandStream Method
N = 1;
sc = parallel.pool.Constant(RandStream('Threefry'));
for ind = 1:10
stream = sc.Value;
stream.Substream = ind;
samps(ind) = normrnd(0,1,N,1);
end
plot(samps); hold on
sc = parallel.pool.Constant(RandStream('Threefry'));
parfor ind = 1:10
stream = sc.Value;
stream.Substream = ind;
samps(ind) = normrnd(0,1,N,1);
end
scatter(1:length(samps), samps)
legend({'using for loop', 'using parfor loop'})
Method 3: Using another RandStream method
N = 1;
stream = RandStream('mrg32k3a');
for ind = 1:10
stream.Substream = ind;
samps(ind) = normrnd(0,1,N,1);
end
plot(samps); hold on
stream = RandStream('mrg32k3a');
parfor ind = 1:10
set(stream,'Substream',ind);
samps(ind) = normrnd(0,1,N,1);
end
scatter(1:length(samps), samps)
legend({'using for loop', 'using parfor loop'})
My question is why the last two methods are not working. If I understood MATLAB documentation correctly, they should have worked. Also, is there anything wrong with using rng(seed, 'twister') method with different seeds to produce statistically independent samples?
Edit:
Here is the link to the MATLAB documentation that I used.

As @Cris points out, normrnd uses the current global stream. Here's how to make your 3rd case match between client and workers: you need to set the global stream explicitly. (Note that I'm also reverting the global stream immediately after using it.)
N = 1;
stream = RandStream('mrg32k3a');
for ind = 1:10
stream.Substream = ind;
prev = RandStream.setGlobalStream(stream);
samps(ind) = normrnd(0,1,N,1);
RandStream.setGlobalStream(prev);
end
stream = RandStream('mrg32k3a');
parfor ind = 1:10
set(stream,'Substream',ind);
prev = RandStream.setGlobalStream(stream);
pf_samps(ind) = normrnd(0,1,N,1);
RandStream.setGlobalStream(prev);
end
With this code, samps and pf_samps are identical.
Also, you should not use "twister" in parallel: the samples you obtain by setting a different seed per iteration do not have statistical properties as good as those from a "parallel" generator like mrg32k3a. (I'm no expert, but my understanding is that twister "seed" values cannot select statistically independent sequences of samples in the way that streams/substreams can.)
In recent versions of MATLAB, faster generators that support multiple streams are available. See the doc https://www.mathworks.com/help/matlab/math/creating-and-controlling-a-random-number-stream.html#brvku_2 . Philox and Threefry in particular are good parallel generators.
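As a hedged sketch of how the same global-stream trick might be applied to the asker's Method 2 with Threefry: the code below selects a substream per iteration, makes that stream the global one for the draw, and restores the previous global stream afterwards. It assumes the Parallel Computing Toolbox is available; nIter, workerStream, and pf_samps are illustrative names, and randn(N,1) stands in for normrnd(0,1,N,1) so that the sketch does not need the Statistics and Machine Learning Toolbox.

```matlab
N = 1;
nIter = 10;                          % illustrative loop length

% Serial reference: pick one substream per iteration and temporarily
% make the stream the global one, so global-stream draws use it.
stream = RandStream('Threefry');
samps = zeros(1, nIter);
for ind = 1:nIter
    stream.Substream = ind;
    prev = RandStream.setGlobalStream(stream);
    samps(ind) = randn(N, 1);        % draws from the global stream
    RandStream.setGlobalStream(prev);
end

% Parallel version: wrap a fresh stream in a Constant so each worker
% holds its own copy, then repeat the same steps per iteration.
sc = parallel.pool.Constant(RandStream('Threefry'));
pf_samps = zeros(1, nIter);
parfor ind = 1:nIter
    workerStream = sc.Value;
    workerStream.Substream = ind;
    prev = RandStream.setGlobalStream(workerStream);
    pf_samps(ind) = randn(N, 1);
    RandStream.setGlobalStream(prev);
end

disp(isequal(samps, pf_samps))
```

Because both streams are created with the default seed and each iteration starts at the beginning of its own substream, the two vectors are expected to agree regardless of how iterations are distributed across workers.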

I'm inferring that your concern is that the pseudorandomly generated numbers for your for and parfor loops did not match. This is however expected behavior. You used a MRG32K3A pseudorandom algorithm, which "does not make a guarantee that a fixed seed and a fixed series of calls to mrg32k3a.RandomState methods using the same parameters will always produce the same results." Source.
So it appears that everything is working correctly.

When you do normrnd(0,1,N,1), you are not using stream to generate a random number, but rather the default RNG. Instead, use randn(stream,N,1), see the documentation.
As far as I can tell, normrnd and other generators from the Statistics and Machine Learning Toolbox always use the default RNG, and cannot use a custom RandStream object.
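A minimal sketch of that suggestion, applied to the asker's Method 3: draw directly from the stream with randn(stream, N, 1) and scale the result to get the mean and standard deviation that normrnd would have taken. The names mu, sigma, nIter, and pf_samps are illustrative and not from the original post.

```matlab
mu = 0;                              % mean formerly passed to normrnd
sigma = 1;                           % standard deviation formerly passed to normrnd
N = 1;
nIter = 10;                          % illustrative loop length

% Serial loop: one substream per iteration, drawn from explicitly.
stream = RandStream('mrg32k3a');
samps = zeros(1, nIter);
for ind = 1:nIter
    stream.Substream = ind;
    samps(ind) = mu + sigma * randn(stream, N, 1);
end

% Parallel loop: each worker receives a copy of the stream; set() selects
% the substream without assigning to the broadcast variable itself.
stream = RandStream('mrg32k3a');
pf_samps = zeros(1, nIter);
parfor ind = 1:nIter
    set(stream, 'Substream', ind);
    pf_samps(ind) = mu + sigma * randn(stream, N, 1);
end

disp(isequal(samps, pf_samps))
```

This version never touches the global stream, so there is nothing to restore afterwards.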

Related

Sliced Variables in PARFOR loop: Sequential to Parallel Conversion in MATLAB

I have a code in MATLAB in which I'm running monte-carlo simulations using parfor instead of simple for loop to convert the code from sequential to parallel. Following is the piece of code which is inside the parfor loop.
But MATLAB gives an error saying "Valid indices for local_Q_mega_sub_seed are restricted in parfor loop". Suggested action says to "Fix the index" and it suggests to use "Sliced Variables". I've been struggling to use this concept. I have read https://blogs.mathworks.com/loren/2009/10/02/using-parfor-loops-getting-up-and-running/#12 and https://www.mathworks.com/matlabcentral/answers/123922-sliced-variables-in-parfor-loop-restricted-indexing along with MATLAB documentation https://www.mathworks.com/help/distcomp/sliced-variable.html and https://www.mathworks.com/help/distcomp/parfor.html but I'm not getting it right.
Could anyone please let me know how can I use sliced variables in the given piece of code so that I could get an idea?
index_f = 1;
subseed_step = (sub_seed_transmitted_at/fs_local)*sintablen_mega_frequency;
for i = 1 : fs_local
local_Q_mega_sub_seed(i) = SINTAB(round(index_f));
local_I_mega_sub_seed(i) = COSTAB(round(index_f));
index_f = index_f + subseed_step;
if index_f>sintablen_mega_frequency
index_f = index_f - sintablen_mega_frequency;
end
end
You're not showing enough context here, but I bet the problem here is similar to this one:
parfor ii = 1:10
for jj = 1:10
tmp(jj) = rand
end
out(ii) = sum(tmp);
end
In this case, the parfor machinery cannot categorically prove that the way tmp is being used is independent of the order of iterations of the parfor loop. This is because it appears as though values assigned to tmp in one iteration of the parfor loop are still being used in the next iteration.
Fortunately, there's a very simple workaround for this case - convince parfor that you are not doing anything dependent on the order of evaluation of the iterations of the loop by resetting the variable. In the simple case above, this means:
parfor ii = 1:10
% reset 'tmp' at the start of each parfor loop iteration
tmp = [];
for jj = 1:10
tmp(jj) = rand
end
out(ii) = sum(tmp);
end
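To connect the workaround back to the question's snippet, here is a minimal sketch under assumed context. The outer parfor over trials, the small sine and cosine tables, the phaseStep formula, and the energy output are all hypothetical stand-ins, since the original post does not show the surrounding code.

```matlab
% Hypothetical stand-ins for the asker's tables and parameters.
tableLen = 1024;
SINTAB = sin(2*pi*(0:tableLen-1)/tableLen);
COSTAB = cos(2*pi*(0:tableLen-1)/tableLen);
fs_local = 64;
nTrials = 4;
energy = zeros(1, nTrials);          % sliced output, indexed by trial

parfor trial = 1:nTrials
    % Re-create the buffers at the start of every parfor iteration so
    % that they are classified as temporary variables.
    local_Q = zeros(1, fs_local);
    local_I = zeros(1, fs_local);
    phaseStep = trial * tableLen / fs_local;   % hypothetical step size
    index_f = 1;
    for i = 1:fs_local
        local_Q(i) = SINTAB(round(index_f));
        local_I(i) = COSTAB(round(index_f));
        index_f = index_f + phaseStep;
        if index_f > tableLen
            index_f = index_f - tableLen;
        end
    end
    % Only a per-iteration summary leaves the loop.
    energy(trial) = sum(local_Q.^2 + local_I.^2);
end
```

The key design choice is that the buffers never carry data from one parfor iteration to the next; whatever must survive the loop is written to a variable indexed by the parfor loop variable.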

How can we create random numbers without using any function rand in matlab?

b=round(rand(1,20));
Instead of using the function rand, I would like to know how to create such series.
This is a very interesting question. Actually, the easiest way to go, in my opinion is to use a Linear-feedback Shift Register. You can find plenty of examples and implementations googling around (here is one coming from another SO question).
Here is a quick Matlab demo based on this code:
b = lfsr(20)
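function r = lfsr(n)
% Return n pseudorandom bits (n-by-1); the register state persists
% between calls. The argument is named n to avoid shadowing size().
persistent state;
if (isempty(state))
state = uint32(1);
end
if (nargin < 1)
n = 1;
end
r = zeros(n,1);
for i = 1:n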
r(i) = bitand(state,uint32(1));
if (bitand(state,uint32(1)))
state = bitxor(bitshift(state,-1),uint32(142));
else
state = bitshift(state,-1);
end
end
end
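The demo above returns a column vector and keeps its state in a persistent variable. A hedged variant that produces the 1-by-20 row of 0s and 1s the question asks for, with the state held in an ordinary variable, might look like the sketch below. The 16-bit width, the tap mask 0xB400, and the seed 0xACE1 are assumptions chosen for illustration and do not come from the answer above.

```matlab
state = uint16(44257);               % assumed nonzero seed (0xACE1)
mask  = uint16(46080);               % assumed Galois tap mask (0xB400)
nBits = 20;
b = zeros(1, nBits);                 % 1-by-20 row, like round(rand(1,20))

for k = 1:nBits
    lsb = bitand(state, uint16(1));  % output bit is the low bit of the state
    b(k) = double(lsb);
    state = bitshift(state, -1);     % shift right by one
    if lsb == 1
        state = bitxor(state, mask); % apply the feedback taps
    end
end

disp(b)
```

Changing the seed to any other nonzero value gives a different starting point in the same bit sequence; a seed of zero would leave the register stuck at zero.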

Avoiding race conditions when using parfor in MATLAB

I'm looping in parallel and changing a variable if a condition is met. Super idiomatic code that I'm sure everyone has written a hundred times:
trials = 100;
greatest_so_far = 0;
best_result = 0;
for trial_i = 1:trials
[amount, result] = do_work();
if amount > greatest_so_far
greatest_so_far = amount;
best_result = result;
end
end
If I wanted to replace for by parfor, how can I ensure that there aren't race conditions when checking whether we should replace greatest_so_far? Is there a way to lock this variable outside of the check? Perhaps like:
trials = 100;
greatest_so_far = 0;
best_result = 0;
parfor trial_i = 1:trials
[amount, result] = do_work();
somehow_lock(greatest_so_far);
if amount > greatest_so_far
greatest_so_far = amount;
best_result = result;
end
somehow_unlock(greatest_so_far);
end
Skewed answer. It does not exactly solve your problem, but it might help you avoid it.
If you can afford the memory to store the outputs of your do_work() in some vectors, then you could simply run your parfor on this function only, store the result, then do your scoring at the end (outside of the loop):
amount = zeros( trials , 1 ) ;
result = zeros( trials , 1 ) ;
parfor trial_i = 1:trials
[amount(trial_i), result(trial_i)] = do_work();
end
[ greatest_of_all , greatest_index ] = max(amount) ;
best_result = result(greatest_index) ;
Edit/comment : (wanted to put that in comment of your question but it was too long, sorry).
I am familiar with .NET and completely understand your lock/unlock request. I have myself made many attempts to implement a kind of progress indicator for very long parfor loops ... to no avail.
If I understand MATLAB's classification of variables correctly, the mere fact that you assign greatest_so_far (in greatest_so_far=amount) makes MATLAB treat it as a temporary variable, which will be cleared and reinitialized at the beginning of every loop iteration (hence unusable for your purpose).
So an easy locked variable may not be a concept we can implement simply at the moment. Some convoluted class event or file writing/checking may do the trick, but I am afraid the timing would suffer greatly. If each iteration takes a long time to execute, the overhead might be worth it, but if you use parfor to accelerate a high number of short iterations, then the convoluted solutions would slow you down more than help ...
You can have a look at this stack exchange question, you may find something of interest for your case: Semaphores and locks in MATLAB
The solution from Hoki is the right way to solve the problem as stated. However, as you asked about race conditions and preventing them when loop iterations depend on each other you might want to investigate spmd and the various lab* functions.
You need to use SPMD to do this - SPMD allows communication between the workers. Something like this:
bestResult = -Inf;
bestIndex = NaN;
N = 97;
spmd
% we need to round up the loop range to ensure that each
% worker executes the same number of iterations
loopRange = numlabs * ceil(N / numlabs);
% start at labindex so that each worker handles a distinct set of indices
for idx = labindex:numlabs:loopRange
if idx <= N
local_result = rand(); % obviously replace this with your actual function
else
local_result = -Inf;
end
% Work out which index has the best result - use a really simple approach
% by concatenating all the results this time from each worker
% allResultsThisTime will be 2-by-numlabs where the first row is all the
% results this time, and the second row is all the values of idx from this time
allResultsThisTime = gcat([local_result; idx]);
% The best result this time - consider the first row
[bestResultThisTime, labOfBestResult] = max(allResultsThisTime(1, :));
if bestResultThisTime > bestResult
bestResult = bestResultThisTime;
bestIndex = allResultsThisTime(2, labOfBestResult);
end
end
end
disp(bestResult{1})
disp(bestIndex{1})

MATLAB parfor how to slice a matrix

I have a little parfor test script which gives the warning in the title.
The code is this
out = zeros(10, 1);
in = rand(5e8, 10);
tic
parfor i = 1:10
for j = 1:5e8
p = randi(5e8); % random row index in 1..5e8; floor(rand*5e8) could return 0
out(i) = out(i) + in(p, i);
end
end
toc
tot = sum(out)
The warning comes out on line 7 and concerns how the variable in is accessed.
I don't understand why; slicing should be trivial: just send each column of in to each worker.
If I change the code to
out = zeros(10, 1);
in = rand(5e8, 10);
tic
parfor i = 1:10
a = in(:,i);
for j = 1:5e8
p = randi(5e8); % random row index in 1..5e8; floor(rand*5e8) could return 0
out(i) = out(i) + a(p);
end
end
toc
tot = sum(out)
the warning disappears but I don't like that assignment to a.
The code was explicitly designed to mess up the cache memory.
Unfortunately, as explained here http://www.mathworks.com/help/distcomp/advanced-topics.html#bq_of7_-1 , MATLAB does not understand how to slice in, hence the code analyser warning. You have to read that page fairly closely to understand why it cannot be sliced. The relevant paragraph is:
Form of Indexing. Within the list of indices for a sliced variable, one of these indices is of the form i, i+k, i-k, k+i, or k-i, where i is the loop variable and k is a constant or a simple (nonindexed) broadcast variable; and every other index is a scalar constant, a simple broadcast variable, colon, or end.
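The final clause ("every other index is a scalar constant, a simple broadcast variable, colon, or end") is the relevant one: in your case, p is a value computed inside the loop, so it does not match this constraint.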
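A scaled-down sketch of an access pattern that does satisfy the quoted rule: the only reference to in inside the parfor loop is in(:, i), that is, the loop variable plus a colon, and the random row indices are applied to the local copy afterwards. The sizes nRows, nCols, and nDraws and the name col are illustrative; the inner loop of the question is replaced by a vectorized draw of the same kind.

```matlab
nRows = 1000;                        % illustrative sizes, far smaller than 5e8
nCols = 10;
nDraws = 500;
in = rand(nRows, nCols);
out = zeros(nCols, 1);

parfor i = 1:nCols
    col = in(:, i);                  % valid sliced access: colon and loop variable
    p = randi(nRows, nDraws, 1);     % temporary variable, computed per iteration
    out(i) = sum(col(p));            % random indexing on the local copy only
end

tot = sum(out);
```

With this form only column i of in needs to be sent to the worker that runs iteration i, which is the behaviour the question expected from slicing.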

How to use a variable outside a PARFOR loop in MATLAB?

In MATLAB, I have a variable proba and I have a parfor loop as showing below:
parfor f = 1:N
proba = (1/M)*ones(1, M);
% rest of the code
end
pi_proba = proba;
MATLAB said that: "The temporary variable 'proba' is used after the PARFOR loop, but its value is nondeterministic"
I do not understand how to correct this error. I need to use a parallel loop and I need proba after the loop. How to do this?
When using parfor, the variables are classified according to these categories. Make sure every variable matches one of these categories. For read-only access to proba, a broadcast variable would be the best choice:
proba = (1/M)*ones(1, M);
parfor f = 1:N
% rest of the code
end
pi_proba = proba;
In case of write access within the loop, a sliced variable is necessary:
proba=cell(1,N);
parfor f = 1:N
%now use proba{f} inside the loop
proba{f}=(1/M)*ones(1, M);
% rest of the code
end
%get proba from whatever iteration you want
pi_proba = proba{N};