Parallel computation with cell array - matlab

A,B,C: cell arrays of size 100 x 1.
Each cell of A is a matrix. All have same size.
B and C contain vectors.
I need to create a cell array D of size 100 x 1. Serial code will look something like:
for i=1:100
D{i}=my_func(A{i},B{i},C{i});
end
where my_func is a function that takes input of a matrix and vectors, producing a vector.
I want to use parfor (or spmd) to make things faster. However, A is large, so I don't want to broadcast A to all workers. Is there a way to do this efficiently, given that my_func takes some time to run? If anyone can give me a small example, I would appreciate it.

It looks like your cell arrays are all sliced variables. This means they will not be broadcast to all workers; each worker receives only the cells for the iterations it runs.
Thus, you can safely replace the for with parfor:
parfor i = 1:100
D{i} = my_func(A{i}, B{i}, C{i});
end
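Since a small example was requested, here is a minimal sketch of the same loop with made-up data so it can be run as is. The sizes, the random inputs, and the anonymous my_func are placeholders and not the asker's real function. Without Parallel Computing Toolbox, parfor simply runs as an ordinary loop.

```matlab
% Hypothetical stand-in data: 100 matrices of equal size, plus vectors.
n = 100;
A = cell(n, 1);  B = cell(n, 1);  C = cell(n, 1);
for i = 1:n
    A{i} = rand(50, 50);
    B{i} = rand(50, 1);
    C{i} = rand(50, 1);
end

% Placeholder for the real my_func: a matrix and two vectors in, a vector out.
my_func = @(M, b, c) M * b + c;

D = cell(n, 1);              % preallocate the output cell array
parfor i = 1:n
    % A, B, C and D are indexed only by the loop variable, so each is sliced:
    % a worker receives just the cells for the iterations it runs.
    D{i} = my_func(A{i}, B{i}, C{i});
end
```

The only variable sent to every worker here is the small function handle my_func.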

Related

MATLAB: how to divide and distribute a list of cell arrays over workers in a parallel loop?

I have a question about parallelizing code in MATLAB. I use MATLAB 2017a.
Let's say I have a cell array:
A = { A1, ..., A10}
and these matrices are quite big ( size > 10000 ). Now I want to start manipulating these matrices in a parallel pool. In fact, the first worker needs only A1, the second worker needs only A2 and so on.
I now have this code:
parfor i = 1:10
matrix = A{i};
blabla = manipulate(matrix);
save(blabla);
end
I think that MATLAB gives every worker all the matrices in A but this is not really needed. Is there a way to say:
"Give i-th worker only matrix Ai"?
Based on the documentation for variables in parfor loops, specifically sliced variables, it appears as though the cell array A in your example meets the criteria to be treated as a sliced variable by default. You shouldn't have to do anything special. You may want to confirm that all the listed criteria are met, and take a look at each variable to see how they are used both inside and outside your parfor loop.
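One way to check this rather than take it on trust is to measure the data transfer. The sketch below assumes Parallel Computing Toolbox R2016b or later, where ticBytes and tocBytes are available; the data and the sum inside the loop are placeholders for the real matrices and manipulate.

```matlab
% Hypothetical data: ten matrices, one per iteration.
A = cell(1, 10);
for k = 1:10
    A{k} = rand(500);
end
out = zeros(1, 10);

pool = gcp;                 % current pool (by default one is started if none is running)
ticBytes(pool);             % start counting bytes sent to and from each worker
parfor i = 1:10
    matrix = A{i};          % A is sliced: only A{i} is sent for iteration i
    out(i) = sum(matrix(:));
end
tocBytes(pool)              % prints the bytes transferred per worker
```

If A were broadcast, every worker would report roughly the size of the whole cell array; with slicing, the totals across workers should add up to about one copy of A.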
You want spmd blocks. That way, you explicitly handle the slicing of the parallel data, rather than letting Matlab do it automagically with the parfor block.
parpool('myprofile',10)
spmd
i = labindex;
B = foo(A{i});
end
for i = 1:10
bar(B{i});
end

Error message when storing output from loop in matrix

I have this program which calculates the realized covariance for each day in my sample but I have some troubles with storing the output in a matrix.
the program is as follows:
for i=1:66:(2071*66)
vec = realized_covariance(datapa(i:i+65),data(i:i+65),datapo(i:i+65),data(i:i+65),'wall','Fixed',fixedInterval,5)
mat(2,4142) = vec
end
Output:
vec =
1.0e-03 *
0.1353 -0.0283
-0.0283 0.0185
Subscripted assignment dimension mismatch.
I have tried various ways to store the output in a matrix, like defining a matrix of zeros to store the output in, or leaving the row dimension of the storing matrix undefined, but nothing seems to do the job.
I would really appreciate any advice on how to tackle this challenge.
I have found a solution which does the job.
I defined a matrix and then filled in my output one result at a time using the following:
A = zeros(0,0) %before the loop, only serves to define the storing matrix
A = [A; vec] %after the calculating function, inside the loop
Actually mat(2,4142) is a single location in the matrix; you can't assign four values to it.
You need to specify the exact block inside mat every time you assign values into it. Try it like this:
mat=zeros(2,4142); % 2071 results, two columns each
for k=1:66:(2071*66)
vec=realized_covariance(datapa(k:k+65),data(k:k+65),datapo(k:k+65),data(k:k+65),'wall','Fixed',fixedInterval,5);
mat(:,[(((k-1)/66)*2)+1 (((k-1)/66)*2)+2])=vec; % columns 1:2 for k=1, 3:4 for k=67, ...
end
You're trying to store a 2 x 2 matrix into a single element, i.e. 4 elements on the right hand side, one on the left. That won't fit. See it like this: you have a garage beside your house where 1 car fits. You've got three friends coming over and they also want to park their cars inside. That's a problem though, as you've got space for only one. So you have to buy a bigger garage: assign 4 elements on the left (e.g. mat(ii:ii+1,jj:jj+1) = [1 2;3 4]), or use a cell/structure array.
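A minimal sketch of both options, using a made-up fake_cov in place of realized_covariance and a small hypothetical number of iterations:

```matlab
nDays = 5;                          % hypothetical number of iterations
fake_cov = @(k) [k, -1; -1, 2*k];   % stand-in for a function returning a 2-by-2 result

% Option 1: give every result its own 2-by-2 block of columns.
mat = zeros(2, 2*nDays);
% Option 2: one cell per result.
results = cell(1, nDays);

for k = 1:nDays
    vec = fake_cov(k);
    mat(:, 2*k-1:2*k) = vec;        % four elements on both sides of the assignment
    results{k} = vec;               % a cell can hold an array of any size
end
```

The cell array is the more forgiving choice if the results might not all have the same size.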
As Steve suggests in a comment below, you can use a 3D matrix quite easily:
counters = 1:66:(2071*66);
mat = zeros(2,2,numel(counters)); %// initialise output matrix
for ii=1:numel(counters)
vec = realized_covariance(datapa(counters(ii):counters(ii)+65),...
data(counters(ii):counters(ii)+65),datapo(counters(ii):counters(ii)+65),...
data(counters(ii):counters(ii)+65),'wall','Fixed',fixedInterval,5);
mat(:,:,ii) = vec; %// store in a 3D matrix
end
Now mat is 3D: the first two indices address your regular output, i.e. vec, and the last index is the iteration number. So to access the output of iteration 1032 you'd use mat(:,:,1032), which is already a 2-by-2 matrix because MATLAB drops trailing singleton dimensions; a squeeze is only needed when the singleton dimensions are not the trailing ones.
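A short check of how indexing into such a 3D result behaves, with three made-up 2-by-2 results instead of the real covariance output:

```matlab
mat = zeros(2, 2, 3);               % three hypothetical 2-by-2 results
for ii = 1:3
    mat(:, :, ii) = ii * eye(2);
end

page   = mat(:, :, 2);              % 2-by-2: the trailing singleton dimension is dropped
corner = mat(1, 1, :);              % 1-by-1-by-3: leading singleton dimensions are kept
flat   = squeeze(mat(1, 1, :));     % 3-by-1 after squeeze
```

So pulling out one iteration needs no extra step, while following one matrix entry across iterations is where squeeze helps.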

How to preallocate a list of external data structure in matlab?

My problem is related to an externally defined data structure: tensor. Tensor is a multidimensional array. In the Matlab tensor toolbox 2.5, tensor is a class with two fields: t.data, t.size:
% Create the tensor
t.data = data;
t.size = siz;
t = class(t, 'tensor');
return;
Like the built-in function zeros() in Matlab, I can use tenzeros() , to create a tensor full of zeros, e.g., tenzeros([2,3,4]). There're also other types of tensor data structure in this toolbox: tensor, sptensor, ktensor, ttensor, etc.
My question is: how can I preallocate 200 tenzeros or other tensor types, where each tensor has the same size [100,200,300]? That is, how do I preallocate memory for 200 tensors? The reason is that I currently use a for loop to create 200 tensors one by one, and the memory requirements go up very high. Some people advised me to preallocate memory for the large data structures I need before I actually compute them.
Thus, I want to preallocate an array of 200 tensors in the beginning; then in a for loop (parfor loop specifically), I compute the actual result of each tensor and send it to the preallocated space.
Why can't I use:
c=repmat(tenzeros([100, 200, 300]),200,1)
which throws:
Error using tensor.size
Too many output arguments.
Error in repmat (line 73)
[m,n] = size(A);
----------
update:
I pre-allocate the memory for the 200 tensors just because I heard memory preallocation can make the data contiguous in memory and thus can alleviate the OutOfMemory problem. Actually I only need each computed tensor to be written to its own txt file in a for loop, which means I do not need the 200 tensors all together as my final result.
So currently I am using @Andrew Janke's third piece of code to pre-allocate the memory for the 200 tensors in the beginning:
%Memory pre-allocation
c = cell([200, 1]);
parfor i = 1:numel(c)
c{i} = tenrand([100,200,300]); %This is just a tensor with random values to fill in the memory space
end
Then I virtually compute the 200 tensors in a parfor loop and fill in the pre-allocated memory space (i.e. c):
%Compute the 200 tensors in a parfor loop
parfor i = 1: 200
c{i} = computeTensorFunction(...)...;
aTensor = c{i};
write aTensor (i.e. c{i}) into a text file...;
end
Will the second part overwrite the preallocated memory in c?
The expression aTensor = c{i} doesn't make a duplicate copy, right? (I do not make changes to aTensor.)
You can preallocate a cell array of initialized tensor objects by using repmat basically the way you are, but by sticking each tensor inside a cell.
c=repmat( { tenzeros([100, 200, 300]) }, 200, 1);
The { } curly braces surrounding the tenzeros call enclose it in a 1-by-1 cell.
If repmat is blowing up, you may be able to work around it by assigning the cell contents yourself from a re-used temporary variable. This will be basically as fast as repmat, and have the same memory usage characteristics.
sz = [200, 1];
c = cell(sz);
% Construct initial value *once* outside the loop
tmp = tensor(...);
for i = 1:numel(c)
c{i} = tmp;
end
Note that this isn't going to do as much for performance as preallocating primitive arrays, because only the top "container" level of composite types gets preallocated and possibly modified in-place. The arrays stored in fields of objects (like tensors) will still get copied when their values are changed inside functions, and probably even in the local workspace that first created them.
This will help a little bit with the peak memory usage because all of the initial zero tensors will be sharing their memory via the copy-on-write optimization. So it's more efficient than initializing the cell array with new tensors in a loop over multiple constructor calls. But since you're going to be discarding those initial zero values anyway, the most memory-efficient way to do this would be to just initialize it with empty cells.
sz = [200, 1];
c = cell(sz);
parfor i = 1:numel(c)
c{i} = calculate_your_result(...);
end
Because the tensor is a composite type (object), preallocation won't help much with the space they consume. You should probably work out an estimate of how much memory your data set will require in the best case scenario and see how that compares to the actual usage you're seeing. You might just need more memory for this application.
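Since the update says each result only needs to be written to its own text file, one option is not to keep the results at all. The sketch below is an assumption about how that could look: a plain random array stands in for the computed tensor, the folder and file names are made up, and it presumes the workers can write to the same file system as the client (true for a local pool).

```matlab
outDir = tempname;                  % hypothetical output folder
mkdir(outDir);
nResults = 4;                       % small stand-in for the 200 tensors

parfor i = 1:nResults
    % Stand-in for the real computation; a plain array replaces the tensor.
    result = rand(3, 4, 5);
    fname = fullfile(outDir, sprintf('result_%03d.txt', i));
    fid = fopen(fname, 'w');
    fprintf(fid, '%.6g\n', result(:));   % one value per line
    fclose(fid);
    % result is a temporary variable, so each worker holds one array at a time
end
```

With this layout the peak memory is roughly one result per worker rather than all results at once.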

Error when trying to use parfor (parallel for loop) in MATLAB

I am dealing with a very huge matrix and so wanted to use parallel computing in MATLAB to run in clusters. Here I have created a sparse matrix using:
Ad = sparse(length(con)*length(uni_core), length(con)*length(uni_core));
I have written a function adj which I use to fill the matrix Ad.
Every time the loop runs, from the function adj I get a square symmetric matrix which is to be assigned to the Ad from 3682*(i-1)+1 to 3682 *(i-1)+3682 in the first index and similarly in the second index. This is shown here:
parfor i = 1:length(con)
Ad((3682*(i-1))+1:((3682*(i-1))+3682), ...
(3682*(i-1))+1:((3682*(i-1))+3682)) = adj(a, b, uni_core);
end
In a normal for loop it is running without any problem. But in parfor in parallel computing I am getting an error that there is a problem in using the sliced arrays with parfor.
Outputs from PARFOR loops must either be reduction variables (e.g. calculating a summation) or "sliced". See this page in the doc for more.
In your case, you're trying to form a "sliced" output, but your indexing expression is too complicated for PARFOR. In a PARFOR, a sliced output must be indexed by: the loop variable for one subscript, and by some constant expression for the other subscripts. The constant expression must be either :, end or a literal scalar. The following example shows several sliced outputs:
x3 = zeros(4, 10, 3);
parfor ii = 1:10
x1(ii) = rand;
x2(ii,:) = rand(1,10);
x3(:,ii,end) = rand(4,1);
x4{ii} = rand(ii);
end
In your case, your indexing expression into Ad is too complicated for PARFOR to handle. Probably the simplest thing you can do is return the calculations as a cell array, and then inject them into Ad on the host side using a regular FOR loop, like so:
parfor i = 1:length(con)
tmpout{i} = ....;
end
for i = 1:length(con)
Ad(...) = tmpout{i};
end
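To make the pattern concrete, here is a small self-contained sketch. The block size of 3, the block count of 4, and make_block are all hypothetical stand-ins for 3682, length(con), and adj(...); the index range mirrors the one in the question.

```matlab
nBlocks = 4;                        % stands in for length(con)
bs = 3;                             % stands in for the 3682-by-3682 block size
make_block = @(i) i * ones(bs);     % hypothetical replacement for adj(...)

tmpout = cell(1, nBlocks);
parfor i = 1:nBlocks
    tmpout{i} = make_block(i);      % sliced output: indexed by the loop variable only
end

Ad = sparse(nBlocks*bs, nBlocks*bs);
for i = 1:nBlocks
    idx = bs*(i-1) + (1:bs);        % rows and columns of the i-th diagonal block
    Ad(idx, idx) = tmpout{i};       % complicated indexing is fine in a regular for loop
end
```

The expensive part, computing the blocks, runs in parallel; only the cheap copy into Ad is serial.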
Edric has already explained why you're getting an error, but I wanted to make another suggestion for a solution. The matrix Ad you are creating is made up of a series of 3682-by-3682 blocks along the main diagonal, with zeroes everywhere else. One solution is to first create your blocks in a PARFOR loop, storing them in a cell array. Then you can combine them all into one matrix with a call to the function BLKDIAG:
cellArray = cell(1,length(con)); %# Preallocate the cell array
parfor i = 1:length(con)
cellArray{i} = sparse(adj(a,b,uni_core)); %# Compute matrices in parallel
end
Ad = blkdiag(cellArray{:});
The resulting matrix Ad will be sparse because each block was converted to a sparse matrix before being placed in the cell array.
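A tiny illustration of what blkdiag does with a comma-separated list of blocks; the three blocks here are made up and deliberately differ in size:

```matlab
blocks = {sparse([1 2; 2 1]), sparse(3), sparse([4 0; 0 4])};  % hypothetical blocks
Ad = blkdiag(blocks{:});            % 5-by-5, with the blocks along the main diagonal
disp(full(Ad))
```

The blocks do not have to be the same size, so this also works if adj returns matrices of varying dimensions.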

What's the best way to iterate through columns of a matrix?

I want to apply a function to all columns in a matrix with MATLAB. For example, I'd like to be able to call smooth on every column of a matrix, instead of having smooth treat the matrix as a vector (which is the default behaviour if you call smooth(matrix)).
I'm sure there must be a more idiomatic way to do this, but I can't find it, so I've defined a map_column function:
function result = map_column(m, func)
result = m;
for col = 1:size(m,2)
result(:,col) = func(m(:,col));
end
end
which I can call with:
smoothed = map_column(input, @(c) (smooth(c, 9)));
Is there anything wrong with this code? How could I improve it?
The MATLAB "for" statement actually loops over the columns of whatever's supplied - normally, this just results in a sequence of scalars since the vector passed into for (as in your example above) is a row vector. This means that you can rewrite the above code like this:
function result = map_column(m, func)
result = [];
for m_col = m
result = horzcat(result, func(m_col));
end
If func does not return a column vector, then you can add something like
f = func(m_col);
result = horzcat(result, f(:));
to force it into a column.
Your solution is fine.
Note that horzcat exacts a substantial performance penalty for large matrices. It makes the code O(N^2) instead of O(N). For a 100x10,000 matrix, your implementation takes 2.6s on my machine, while the horzcat one takes 64.5s. For a 100x5000 matrix, the horzcat implementation takes 15.7s.
If you wanted, you could generalize your function a little and make it be able to iterate over the final dimension or even over arbitrary dimensions (not just columns).
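One possible way to generalize, sketched as a script rather than a finished function: move the chosen dimension to the front with permute, flatten everything else into columns, apply the function per column, and undo the rearrangement. The input, dim, and func are hypothetical; the sketch assumes dim is at most ndims(m) and that func returns a vector of the same length as its input.

```matlab
m    = reshape(1:24, 2, 3, 4);      % hypothetical 3-D input
dim  = 3;                           % dimension to iterate along
func = @(v) cumsum(v);              % stand-in for smooth or any vector function

nd   = ndims(m);
perm = [dim, 1:dim-1, dim+1:nd];    % bring dim to the front
mp   = permute(m, perm);
szp  = size(mp);
cols = reshape(mp, szp(1), []);     % every fibre along dim becomes a column

for c = 1:size(cols, 2)
    cols(:, c) = func(cols(:, c));
end

result = ipermute(reshape(cols, szp), perm);   % undo the reshape and the permute
```

With cumsum as the stand-in, result matches cumsum(m, 3), which is an easy way to convince yourself the reshaping is right.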
Maybe you could always transform the matrix with the ' operator and then transform the result back.
smoothed = smooth(input', 9)';
That at least works with the fft function.
A way to cause an implicit loop across the columns of a matrix is to use cellfun. That is, you must first convert the matrix to a cell array, each cell will hold one column. Then call cellfun. For example:
A = randn(10,5);
See that here I've computed the standard deviation for each column.
cellfun(@std,mat2cell(A,size(A,1),ones(1,size(A,2))))
ans =
0.78681 1.1473 0.89789 0.66635 1.3482
Of course, many functions in MATLAB are already set up to work on rows or columns of an array as the user indicates. This is true of std of course, but this is a convenient way to test that cellfun worked successfully.
std(A,[],1)
ans =
0.78681 1.1473 0.89789 0.66635 1.3482
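For a function that returns a whole vector per column, as smooth does, cellfun needs 'UniformOutput' set to false. The sketch below uses num2cell(A, 1) as a shorter way to split the columns and a made-up mean-removal function in place of smooth:

```matlab
A = magic(4);                                  % hypothetical input
cols = num2cell(A, 1);                         % 1-by-4 cell, one column per cell
func = @(c) c - mean(c);                       % stand-in for smooth(c, 9)
out  = cellfun(func, cols, 'UniformOutput', false);
B = [out{:}];                                  % concatenate the columns back into a matrix
```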
Don't forget to preallocate the result matrix if you are dealing with large matrices. Otherwise your CPU will spend lots of cycles repeatedly re-allocating the matrix every time it adds a new row/column.
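A rough way to see the difference on your own machine; the matrix size and the doubling function are arbitrary placeholders, and the timings will vary:

```matlab
m = rand(100, 2000);                % hypothetical input
func = @(c) c * 2;                  % stand-in for the per-column function

% Growing the result: every horzcat copies everything stored so far.
tic
grown = [];
for col = 1:size(m, 2)
    grown = horzcat(grown, func(m(:, col)));
end
tGrown = toc;

% Preallocated result: each column is written in place.
tic
pre = zeros(size(m));
for col = 1:size(m, 2)
    pre(:, col) = func(m(:, col));
end
tPre = toc;

fprintf('grown: %.3f s, preallocated: %.3f s\n', tGrown, tPre);
```

Both loops produce the same matrix; only the allocation pattern differs.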
If this is a common use-case for your function, it would perhaps be a good idea to make the function iterate through the columns automatically if the input is not a vector.
This doesn't exactly solve your problem but it would simplify the functions' usage. In that case, the output should be a matrix, too.
You can also transform the matrix to one long column by using m = m(:). However, it depends on your function if this would make sense.