MATLAB: Generating random numbers in parfor or parallel computing - matlab

In a single for loop, I use a single random seed to generate all the "random numbers". They behave like a proper random sequence because I take them from one stream, one at a time, without any gaps.
However, in parfor, each worker uses a different random seed, so the numbers obtained by different workers may interfere with each other. They are therefore not really random, as they do not come from a single seed.
Also, for my case, I do not know how many random numbers each worker needs beforehand. How can I solve this problem?

In parfor, the workers use different streams from a random number generator that is specifically designed to be used in parallel. Therefore, you can rely on the random numbers generated inside parfor having reasonable statistical qualities. More here: http://www.mathworks.com/help/distcomp/control-random-number-streams.html
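If the results also need to be reproducible, one option is to give every iteration its own substream. The following is only a minimal sketch under some assumptions: the 'mrg32k3a' generator (which supports substreams), an arbitrary seed of 42, and made-up per-iteration draw counts. Because each iteration jumps to its own substream, it can draw as many numbers as it needs without knowing the count in advance.

```matlab
N = 8;                          % number of iterations (illustrative)
seed = 42;                      % arbitrary seed, an assumption of this sketch
nDraws = mod(0:N-1, 3) + 1;     % stand-in for "unknown" per-iteration demand
total = zeros(1, N);

parfor i = 1:N
    % Same generator and seed in every iteration ...
    s = RandStream('mrg32k3a', 'Seed', seed);
    % ... but each iteration uses its own, non-overlapping substream.
    s.Substream = i;
    x = rand(s, 1, nDraws(i));  % draw however many values this iteration needs
    total(i) = sum(x);
end
```

With this layout the values drawn in iteration i depend only on the seed and on i, not on which worker happens to run that iteration.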

Related

Set random-seed just for the first simulation of a series of repetitions with Behavior Space

I have to test different algorithms with NetLogo. I have a different NetLogo model to simulate each algorithm.
I would like to use the BehaviorSpace tool to run a series of simulations with every model, and use random-seed to replicate the random events across all the models (algorithms).
Therefore, I will set up a BehaviorSpace experiment for each method, and in the experiment settings I will set a number of repetitions to compute different samples of the results.
The problem is that setting a random-seed, for example in the setup procedure of the models, will reproduce the same events in each BehaviorSpace experiment, but it will also produce the same results in all the repetitions of an experiment.
What I need is to run the series of simulations of a model setting the random-seed only in the first simulation, so that the repeated simulations (the samples obtained) differ from each other, while all the experiments use the same sequence of random events, which I need in order to compare the different algorithms.
Is there any way of setting up a BehaviorSpace experiment with a number of repetitions, and generating the same random sequence in another experiment with the same number of repetitions?
Regards
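Use behaviorspace-run-number, which reports the number of the current run within the experiment. Each repetition then gets a different seed, while run k uses the same seed in every experiment. As the simplest example, include this line in the model setup: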
random-seed behaviorspace-run-number

How to generate n independent normal random variables in Matlab

I am new to this, and I'd like to know whether there's a function or any way I can generate n independent normal random variables in Matlab?
randn will produce independent random variates from a Normal distribution.
In general, you have to trust that the variates produced by a pseudo-random number generator are statistically independent. Smart people who are experts in designing RNGs have worked hard to achieve that.
You can use these two below,
randn
which will give you normally distributed random numbers.
rand
which will give you uniformly distributed random numbers.
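As a short illustration of the two functions above (the values of n, mu and sigma are arbitrary choices for this sketch):

```matlab
n = 5;                          % how many variates to generate (arbitrary)

x = randn(n, 1);                % n draws from the standard normal N(0, 1)

mu = 10;                        % illustrative mean
sigma = 2;                      % illustrative standard deviation
y = mu + sigma * randn(n, 1);   % n draws from N(mu, sigma^2)

u = rand(n, 1);                 % n draws from the uniform distribution on (0, 1)
```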

Strange rand() behaviour in MATLAB

rand() does not seem to generate really random numbers. I have a simple program that returns a 6-digit number by calling:
for i=1:6
r=rand(1,1)
end
so I ran this 4-5 times yesterday. And saved the output. Today I opened MATLAB again and called the same function again 4-5 times. The same numbers were returned.
Why is this happening?
Should I provide a random seed or any other fix?
Thanks for any help!
To expand on @alexforrence's answer, rand and other related functions produce pseudo-random numbers (PRNs) that require an initial value to begin production. These numbers are not truly random since, following the initial seed, the numbers are produced via an algorithm, which is deterministic by its very nature.
However, being pseudo-random isn't necessarily a bad thing since models that use PRNs (e.g., Monte Carlo Methods) can generate portable, repeatable results across many users and platforms.
Additionally, the seed can be changed to create sets of random numbers and results that are statistically independent but also produce repeatable results.
For many scientific applications, this is very important.
Also, "true" random numbers (see the next paragraph) tend to "clump" together rather than spread evenly over their range when only a small sample is drawn, which can degrade the performance of some methods that rely on stochastic processes.
There are methods to create "true-er" random numbers by introducing randomness from various analogue sources (e.g., hardware noise). These types of numbers are extremely important for cryptographically secure random number generation, where non-repeatability is an important feature (in contrast to the scientific usage). True random number generators require special hardware that leverages natural noise (e.g., quantum effects).
It is important to remember, though, that the total number of distinct random numbers that can be generated and used computationally is limited by the precision of the numbers being used.
You can re-seed MATLAB with a pseudo-random seed using the rng function.
However, "reseeding the generator too frequently within a session is not a good idea because the statistical properties of your random numbers can be adversely affected" [src].
From the Mathworks documentation, you can use
rng('shuffle');
before calling rand to set a "random" seed (based on the current time). Keeping the seed fixed (either by not changing the seed at startup, by resetting it with rng('default'), or by setting it explicitly with rng(number)) allows you to exactly repeat previous behavior.
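A small sketch of both behaviours, assuming an arbitrary seed value of 42 and illustrative variable names:

```matlab
% Repeatable: the same seed reproduces the same numbers.
rng(42);                        % 42 is an arbitrary seed chosen for this sketch
a = rand(1, 3);
rng(42);                        % reset to the same seed
b = rand(1, 3);
sameAfterReseed = isequal(a, b);

% Not repeatable: seed from the current time, once per session.
rng('shuffle');
c = rand(1, 3);
```

Here sameAfterReseed is true because both draws start from the same generator state, while c will normally differ from one MATLAB session to the next.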

Correct way to generate random numbers

On page 3 of "Lecture 8, White Noise and Power Spectral Density" it is mentioned that rand and randn create pseudo-random numbers. Please correct me if I am wrong: a sequence of random numbers is one for which two sequences are never exactly the same, even for the same seed.
Pseudo-random numbers, on the other hand, are deterministic, i.e., two sequences are the same if they are generated from the same seed.
How can I create random numbers rather than pseudo-random numbers? I was under the impression that Matlab's rand and randn functions generate independent and identically distributed random numbers, but the slides mention that they create pseudo-random numbers. Googling for how to create random numbers only returns the rand and randn functions.
The reason for distinguishing random numbers from pseudo-random numbers is that, for a cryptography application, I need to compare the performance of (A) a random signal with white noise characteristics and (B) a pseudo-random signal with white noise characteristics. So, (A) must be different from (B). I would be grateful for any code and the correct way to generate random numbers and pseudo-random numbers.
Generating "true" random numbers is a tricky exercise; you can check Wikipedia on RNGs and the tests of randomness (http://en.wikipedia.org/wiki/Random_number_generation). This link offers an RNG based on atmospheric noise (http://www.random.org/).
As mentioned above, it is really difficult (probably impossible) to create real random numbers with computer software. There are numerous projects on the internet that provide real random numbers generated by physical processes (for example the one Kostya mentioned). A particularly interesting one is this from HU Berlin.
That being said, for experiments like the one you want to perform, Matlab's pseudo-RNGs are more than fine. Matlab's algorithms include the Mersenne Twister, which is one of the best-known pseudo-RNGs (I would suggest you google the Mersenne Twister's properties). See the Matlab rng documentation here.
Since you did not mention which type of system you want to simulate, one simple approach to solve your issue would be to use a good RNG (Mersenne Twister) for process A and a not-so-good one for process B.
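A hedged sketch of that last suggestion, using two separate RandStream objects. The choice of 'mcg16807' (an older multiplicative congruential generator) as the "not-so-good" one is an assumption made for illustration, as are the seed and the sample length:

```matlab
n = 1000;                                   % number of samples (arbitrary)

% Process A: Mersenne Twister, MATLAB's default generator type.
sA = RandStream('mt19937ar', 'Seed', 1);
a = randn(sA, 1, n);                        % white-noise-like Gaussian samples

% Process B: a legacy multiplicative congruential generator with a much
% shorter period, standing in here for the "not-so-good" RNG.
sB = RandStream('mcg16807', 'Seed', 1);
b = randn(sB, 1, n);
```

Both sequences are still pseudo-random; this only gives two signals of different generator quality, not a truly random one.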

Matlab parfor work distribution

I have a parfor loop with, say, 100 iterations, and the workload of each iteration is different but changes linearly, so that the first one takes the most time and the last one is the fastest. But when I run the parfor loop with my four instances/labs, during the last few hours only one lab is active, as it is running through the first few iterations on its own.
So I know which iterations are the slow ones. How could I make the workload more even between the cores? For example, could I somehow force all labs to start working on the first four slow ones and then proceed in order? Or is there something similar that prevents a single active core from running the few slow ones alone?
Matlab's parfor does nothing more than split up the indices and distribute them to the workers. It does this by creating contiguous chunks from the indices. I don't know the exact algorithm, but this means that data with similar indices get computed in the same chunk and by the same worker.
The simplest solution would be a stochastic one. Just shuffle your indices so that the work intensive steps are distributed nicely. While this doesn't give you any guarantees on performance it is simple and will work most of the time.
Some example code:
% dummy data
N = 10;
data = 1:N;
% generate the permuted indices
permIndex = randperm(N);
% permute the data
dataPermuted = data(permIndex);
% run the loop
parfor i = 1:N
    % do something, e.g. pause for the time specified by the data
    pause(dataPermuted(i));
end
% invert the index permutation
dataInversePermuted(permIndex) = dataPermuted;
I used pause to simulate the different computation times.
I don't think this is documented anywhere, but you can quickly deduce that PARFOR runs iterations in reverse loop order (using pause and disp if you want to see it in action). So, you should simply reverse your loop. PARFOR gives you no means to explicitly control execution order, but SPMD using for-drange does (PARFOR is significantly easier to use though).
@denahiro's suggestion is also a good one.
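Since a parfor range has to consist of increasing consecutive integers, "reversing the loop" in practice means remapping the index inside the body. A minimal sketch, assuming the undocumented reverse execution order described above actually holds on your version; the cost vector and the squared result are placeholders:

```matlab
N = 8;
cost = linspace(0.08, 0.01, N);   % placeholder run times: task 1 is the slowest
result = zeros(1, N);

parfor k = 1:N
    idx = N - k + 1;              % k = N maps to the slowest task, idx = 1
    pause(cost(idx));             % stand-in for the real work of task idx
    result(k) = idx^2;            % placeholder result, sliced by the loop variable
end

result = result(end:-1:1);        % put the results back in original task order
```

If parfor does start from the highest values of k, the slowest tasks are picked up first and the short ones fill the gaps at the end.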