Matlab parallel processing with big data [duplicate] - matlab

This question already has answers here:
Sending data to workers
(3 answers)
Saving time and memory using parfor?
(2 answers)
Closed 4 years ago.
I'm new to parallel processing. Here's my problem:
I have a big data variable that cannot fit twice in RAM. Therefore, this won't work:
for ind = 1:4
    data{ind} = load_data(ind);
end
parfor ind = 1:4
    process_longtime(data{ind});
end
This runs out of memory. My hypothesis is that MATLAB tries to copy the whole data variable to every worker.
If this is correct, is there a way to distribute the data in 4 (or n) parts to the workers, so that they do not need access to the whole data variable?
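One common way around this is to load each part inside the parfor loop, so the client never holds the full variable. This is a minimal sketch, assuming load_data can be called on a worker (for example, it reads from a file that every worker can reach). The two function handles below are hypothetical stand-ins for the real load_data and process_longtime, and result is a name introduced only for illustration.

```matlab
% Hypothetical stand-ins for the asker's load_data and process_longtime.
load_data = @(ind) ind * ones(1000, 1);
process_longtime = @(part) sum(part);

n = 4;
result = zeros(1, n);              % sliced output, one entry per iteration
parfor ind = 1:n
    % 'part' is a temporary variable: it is created on the worker that runs
    % this iteration and is never copied to or from the client.
    % feval is used because indexing a function handle with the loop
    % variable can be mistaken for a sliced variable inside parfor.
    part = feval(load_data, ind);
    result(ind) = feval(process_longtime, part);
end
```

With this layout each worker holds only the part it is currently processing, and only the (small) result is sent back to the client.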

Related

How to slow down a MATLAB script so that you can view graphs updating [duplicate]

This question already has answers here:
How to do an animated plot in matlab
(3 answers)
Closed 4 years ago.
Is there a way in matlab to slow down the execution time of a script so that you can view the graphs? I currently use breakpoints and step through the code but that is not ideal for showing demos.
You can try the pause function:
pause(1)
This makes MATLAB wait for 1 second, so you can watch the plot being built.
https://it.mathworks.com/matlabcentral/answers/55874-how-to-stop-delay-execution-for-specified-time
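For a demo, pause is typically placed inside the loop that updates the plot. A minimal sketch, assuming a simple line plot; the sine data, the number of points, and the 0.05 second delay are made up for illustration.

```matlab
x = linspace(0, 2*pi, 20);
h = plot(x(1), sin(x(1)), 'o-');   % start with a single point
axis([0 2*pi -1 1]);               % fix the axes so they do not rescale
for k = 2:numel(x)
    % Extend the existing line instead of replotting everything.
    set(h, 'XData', x(1:k), 'YData', sin(x(1:k)));
    drawnow;                       % flush the graphics queue
    pause(0.05);                   % slow the loop down so the update is visible
end
```

Increasing the argument of pause slows the animation down further.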

Not enough RAM for parfor [duplicate]

This question already has answers here:
Saving time and memory using parfor?
(2 answers)
Closed 6 years ago.
If a parfor reckons that the computer will not have enough ram to run the code in parallel will it automatically serialize it? That definitely seems to be the case.
I have two parfor loops that are identical except for the size of the matrices inside them. The first one easily reaches 100% CPU and half my RAM. The second one reaches only 12-20% CPU and all my RAM.
I have addressed the same issue in this question here.
In short, since each worker in the MATLAB pool is independent of the others, each worker needs its own share of memory.
And no, MATLAB does not automatically serialise your loop if it runs out of memory. If MATLAB throws a proper error (to my knowledge this does happen on Windows PCs), you can use a try-catch statement. The try-catch executes the code in the try branch and, if an error occurs, automatically executes the catch block. In your case it would be something like:
try
    % parfor here
catch
    % standard for here
end
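Filled in, the pattern might look like the sketch below. The work function, the loop bounds, and the variable names are hypothetical. Whether an out-of-memory condition actually surfaces as a catchable error depends on the platform, as noted above.

```matlab
n = 8;
out = zeros(1, n);
work = @(k) sum((1:k).^2);         % hypothetical stand-in for the loop body

try
    parfor k = 1:n
        out(k) = feval(work, k);   % feval avoids the handle looking like a sliced variable
    end
catch err
    % Fall back to a serial loop if the parallel loop raised an error.
    warning('parfor failed (%s); falling back to a serial for loop.', err.message);
    for k = 1:n
        out(k) = work(k);
    end
end
```

The fallback repeats the loop body, so in real code it is worth moving the body into one function that both loops call.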

Matlab: Error using parallel_function: Out of Memory [duplicate]

This question already has answers here:
Saving time and memory using parfor?
(2 answers)
Closed 7 years ago.
I am using MATLAB R2011b on 64-bit Windows 7, with a Core i7 CPU and 8 GB of RAM. I am running an approximate nearest neighbor algorithm, Locality Sensitive Hashing, using matlabpool. Upon starting the pool, I get the output
Starting matlabpool using the 'local' configuration ... connected to 4 labs.
When control reaches the parfor loop, MATLAB throws this error:
Error using parallel_function (line 598)
Out of memory. Type HELP MEMORY for your options.
Error stack:
remoteParallelFunction.m at 29
Error in Evaluate (line 19)
parfor i=1:query_num
I have no clue how to solve this problem. Please help. Thank you
That is because parfor requires a lot more memory than a regular for loop.
All the workers/labs in a parfor loop are independent, so each of them needs its own share of memory. There is also overhead because the pool must distribute data to the workers and collect the results from them.
Try using a regular for loop, or open a pool with 2 workers instead of 4.
It also depends on how optimized your some_function() is: try to use as few variables as possible.
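Besides opening a smaller pool, the number of workers used by a single loop can be capped with the optional second argument of parfor. A minimal sketch; query_num is taken from the error message above, while dist and the loop body are hypothetical placeholders.

```matlab
% Pool size can be reduced when the pool is opened, for example
%   matlabpool open 2        (R2011b-era syntax)
%   parpool(2)               (newer releases)
% Alternatively, limit the workers used by this one loop:
query_num = 10;
dist = zeros(1, query_num);
parfor (i = 1:query_num, 2)        % use at most 2 workers for this loop
    dist(i) = i^2;                 % placeholder for the real per-query work
end
```

Fewer workers means fewer simultaneous copies of the per-iteration data, at the cost of less parallel speed-up.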

How to give the matlab "find" function a hint that the list is sorted? [duplicate]

This question already has answers here:
Faster version of find for sorted vectors (MATLAB)
(5 answers)
Closed 7 years ago.
I want to use the find function in MATLAB to get the index of the first value that is bigger than a number C. The list is very long, so this takes a lot of time to execute. However, the values are sorted in increasing order. How can I take advantage of that property of the data in MATLAB?
find(Data>C,1,'first')
Set the 'first' option in find. find then stops as soon as it reaches the first element that satisfies the criterion, although the comparison Data>C itself is still evaluated for every element.
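A binary search is one way to actually exploit the sort order, since it needs only about log2(n) comparisons. This is a minimal sketch, assuming Data is sorted in increasing order. The sample values and the names lo, hi, mid and idx are made up for illustration.

```matlab
Data = [1 3 5 7 9 11];             % sample sorted data
C = 6;

lo = 1;
hi = numel(Data) + 1;              % numel(Data)+1 means "no element is bigger than C"
while lo < hi
    mid = floor((lo + hi) / 2);
    if Data(mid) > C
        hi = mid;                  % the first bigger element is at mid or to its left
    else
        lo = mid + 1;              % the first bigger element is to the right of mid
    end
end
idx = lo;                          % same index as find(Data > C, 1, 'first')
```

If no element is bigger than C, idx ends up as numel(Data)+1, so check for that case before indexing with it.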

how to calculate mean and variance in online learning [duplicate]

This question already has answers here:
Rolling variance algorithm
(13 answers)
Closed 7 years ago.
How can I calculate the mean and variance in online learning with MATLAB?
Suppose we have a stream of data and receive only 40 values at a time.
Every time I receive 40 values, I would like to update the mean and variance of all the data received so far. Please note that I cannot store all the data: at any time I can keep only the latest 40 values.
Thanks a lot.
You might want to calculate a running mean and a running variance. There is a very good tutorial here:
http://www.johndcook.com/blog/standard_deviation/
With these algorithms you don't need to keep all values in memory.
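For data that arrives 40 values at a time, the running update can be applied per batch. The sketch below uses the pairwise combination formula of Chan et al., which is a batch variant of the single-value update described in the tutorial, not code taken from it. The stream contents and all variable names are hypothetical.

```matlab
n  = 0;                            % number of values seen so far
mu = 0;                            % running mean
M2 = 0;                            % running sum of squared deviations from the mean

stream = reshape(1:120, 40, 3);    % made-up data: three batches of 40 values
for b = 1:size(stream, 2)
    batch = stream(:, b);          % only the current 40 values are used
    nb  = numel(batch);
    mb  = mean(batch);
    M2b = sum((batch - mb).^2);

    % Merge the statistics of the batch into the running statistics.
    delta = mb - mu;
    ntot  = n + nb;
    mu = mu + delta * nb / ntot;
    M2 = M2 + M2b + delta^2 * n * nb / ntot;
    n  = ntot;
end
s2 = M2 / (n - 1);                 % sample variance of everything seen so far
```

Only n, mu and M2 need to be kept between batches, so memory use stays constant no matter how long the stream is.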