MATLAB parfor how to slice a matrix - matlab

I have a little parfor test script which gives the warning in the title.
The code is this
out = zeros(10, 1);
in = rand(5e8, 10);
tic
parfor i = 1:10
for j = 1:5e8
p = floor(rand(1,1)*5e8);
out(i) = out(i) + in(p, i);
end
end
toc
tot = sum(out)
the warning comes out on line 7 regarding how variable in is accessed.
I don't understand why, slicing should be trivial. Just send each column of in to each worker.
If I change the code to
out = zeros(10, 1);
in = rand(5e8, 10);
tic
parfor i = 1:10
a = in(:,i);
for j = 1:5e8
p = floor(rand(1,1)*5e8);
out(i) = out(i) + a(p);
end
end
toc
tot = sum(out)
the warning disappears but I don't like that assignment to a.
The code was explicitly designed to mess up the cache memory.

Unfortunately, as explained here http://www.mathworks.com/help/distcomp/advanced-topics.html#bq_of7_-1 , MATLAB does not understand how to slice in, hence the code analyser warning. You have to read that page fairly closely to understand why it cannot be sliced. The relevant paragraph is:
Form of Indexing. Within the list of indices for a sliced variable, one of these indices is of the form i, i+k, i-k, k+i, or k-i, where i
is the loop variable and k is a constant or a simple (nonindexed)
broadcast variable; and every other index is a scalar constant, a
simple broadcast variable, colon, or end.
The clause in bold type at the end is the relevant one - in your case, p does not match this constraint.

Related

Declaring a vector in matlab whose size we don't know

Suppose we are running an infinite for loop in MATLAB, and we want to store the iterative values in a vector. How can we declare the vector without knowing the size of it?
z=??
for i=1:inf
z(i,1)=i;
if(condition)%%condition is met then break out of the loop
break;
end;
end;
Please note first that this is bad practise, and you should preallocate where possible.
That being said, using the end keyword is the best option for extending arrays by a single element:
z = [];
for ii = 1:x
z(end+1, 1) = ii; % Index to the (end+1)th position, extending the array
end
You can also concatenate results from previous iterations, this tends to be slower since you have the assignment variable on both sides of the equals operator
z = [];
for ii = 1:x
z = [z; ii];
end
Sadar commented that directly indexing out of bounds (as other answers are suggesting) is depreciated by MathWorks, I'm not sure on a source for this.
If your condition computation is separate from the output computation, you could get the required size first
k = 0;
while ~condition
condition = true; % evaluate the condition here
k = k + 1;
end
z = zeros( k, 1 ); % now we can pre-allocate
for ii = 1:k
z(ii) = ii; % assign values
end
Depending on your use case you might not know the actual number of iterations and therefore vector elements, but you might know the maximum possible number of iterations. As said before, resizing a vector in each loop iteration could be a real performance bottleneck, you might consider something like this:
maxNumIterations = 12345;
myVector = zeros(maxNumIterations, 1);
for n = 1:maxNumIterations
myVector(n) = someFunctionReturningTheDesiredValue(n);
if(condition)
vecLength = n;
break;
end
end
% Resize the vector to the length that has actually been filled
myVector = myVector(1:vecLength);
By the way, I'd give you the advice to NOT getting used to use i as an index in Matlab programs as this will mask the imaginary unit i. I ran into some nasty bugs in complex calculations inside loops by doing so, so I would advise to just take n or any other letter of your choice as your go-to loop index variable name even if you are not dealing with complex values in your functions ;)
You can just declare an empty matrix with
z = []
This will create a 0x0 matrix which will resize when you write data to it.
In your case it will grow to a vector ix1.
Keep in mind that this is much slower than initializing your vector beforehand with the zeros(dim,dim) function.
So if there is any way to figure out the max value of i you should initialize it withz = zeros(i,1)
cheers,
Simon
You can initialize z to be an empty array, it'll expand automatically during looping ...something like:
z = [];
for i = 1:Inf
z(i) = i;
if (condition)
break;
end
end
However this looks nasty (and throws a warning: Warning: FOR loop index is too large. Truncating to 9223372036854775807), I would do here a while (true) or the condition itself and increment manually.
z = [];
i = 0;
while !condition
i=i+1;
z[i]=i;
end
And/or if your example is really what you need at the end, replace the re-creation of the array with something like:
while !condition
i=i+1;
end
z = 1:i;
As mentioned in various times in this thread the resizing of an array is very processing intensive, and could take a lot of time.
If processing time is not an issue:
Then something like #Wolfie mentioned would be good enough. In each iteration the array length will be increased and that is that:
z = [];
for ii = 1:x
%z = [z; ii];
z(end+1) = ii % Best way
end
If processing time is an issue:
If the processing time is a large factor, and you want it to run as smooth as possible, then you need to preallocating.If you have a rough idea of the maximum number of iterations that will run then you can use #PluginPenguin's suggestion. But there could still be a change of hitting that preset limit, which will break (or severely slow down) the program.
My suggestion:
If your loop is running infinitely until you stop it, you could do occasional resizing. Essentially extending the size as you go, but only doing it once in a while. For example every 100 loops:
z = zeros(100,1);
for i=1:inf
z(i,1)=i;
fprintf("%d,\t%d\n",i,length(z)); % See it working
if i+1 >= length(z) %The array as run out of space
%z = [z; zeros(100,1)]; % Extend this array (note the semi-colon)
z((length(z)+100),1) = 0; % Seems twice as fast as the commented method
end
if(condition)%%condition is met then break out of the loop
break;
end;
end
This means that the loop can run forever, the array will increase with it, but only every once in a while. This means that the processing time hit will be minimal.
Edit:
As #Cris kindly mentioned MATLAB already does what I proposed internally. This makes two of my comments completely wrong. So the best will be to follow what #Wolfie and #Cris said with:
z(end+1) = i
Hope this helps!

How to correct "Valid indices for 'variable' are restricted in PARFOR loops" error in matlab

I am trying to set up a parfor nested loop in MatLab R2016a as below.
N = size(A,1);
M = size(v,1);
in = zeros(N*M,1);
parfor i=1:N
for j=1:M
k = (i-1)*M+j;
if sqrt(sum((A(i,:)-v(j,:)).^2))<=tol
in(k) = i;
end
end
end
However, I am getting the following error Valid indices for 'in' are restricted in PARFOR loops. Is there some way I can correct this because both arrays A and v are considerably large, over 40,000 rows for A and 8,000 v? The variable tol is 0.0959.
The problem is that MATLAB doesn't recognize that the variable k is slicing the matrix in correctly. The solution should be to index in using i and j separately:
N = size(A,1);
M = size(v,1);
in = zeros(M,N);
parfor i=1:N
for j=1:M
if sqrt(sum((A(i,:)-v(j,:)).^2))<=tol
in(j,i) = i;
end
end
end
in = in(:); % reshape to a column vector, as the output in the question's code
The other alternative, but it requires more intermediate memory, is to compute this without a loop at all:
A = reshape(A,1,N,[]);
v = reshape(v,M,1,[]);
in = sum(bsxfun(#minus,A,v).^2,3) < tol*tol;
in = in(:);
(Or something similar to that, I have not run this code... Please let me know, or fix the post, if there is a typo or other mistake.)
N = size(A,1);
M = size(v,1);
in = cell(N,1);
parfor i=1:N
s=v;
p=zeros(1:M,1);
for j=1:M
k = (i-1)*M+j;
if sqrt(sum((A(i,:)-s(j,:)).^2))<=tol
p(k) = i;
end
end
in{i}=single(p);
end
in=cell2mat(in);
in=reshape(in,[N*M,1]);
sometime matlab doesnt recognize a variable in parfor loop as "sliced variable"
, a sliced variable is a variable that has a reference out of parfor loop and each of its element only accessed by a single worker (in parfor parallel workers)
so you could use a temporary variable and collect results after the parfor loop,
NOTE 1: It is better to vectorise code in the older versions because loops didn't use to be as good as they are now in R2017 referring to (this).
NOTE 2: if most elements of "in" are zero try using "sparse matrix" which could save a lot of memory;

Sliced Variables in PARFOR loop: Sequential to Parallel Conversion in MATLAB

I have a code in MATLAB in which I'm running monte-carlo simulations using parfor instead of simple for loop to convert the code from sequential to parallel. Following is the piece of code which is inside the parfor loop.
But MATLAB gives an error saying "Valid indices for local_Q_mega_sub_seed are restricted in parfor loop". Suggested action says to "Fix the index" and it suggests to use "Sliced Variables". I've been struggling to use this concept. I have read https://blogs.mathworks.com/loren/2009/10/02/using-parfor-loops-getting-up-and-running/#12 and https://www.mathworks.com/matlabcentral/answers/123922-sliced-variables-in-parfor-loop-restricted-indexing along with MATLAB documentation https://www.mathworks.com/help/distcomp/sliced-variable.html and https://www.mathworks.com/help/distcomp/parfor.html but I'm not getting it right.
Could anyone please let me know how can I use sliced variables in the given piece of code so that I could get an idea?
index_f = 1;
subseed_step = (sub_seed_transmitted_at/fs_local)*sintablen_mega_frequency;
for i = 1 : fs_local
local_Q_mega_sub_seed(i) = SINTAB(round(index_f));
local_I_mega_sub_seed(i) = COSTAB(round(index_f));
index_f = index_f + subseed_step;
if index_f>sintablen_mega_frequency
index_f = index_f - sintablen_mega_frequency;
end
You're not showing enough context here, but I bet the problem here is similar to this one:
parfor ii = 1:10
for jj = 1:10
tmp(jj) = rand
end
out(ii) = sum(tmp);
end
In this case, the parfor machinery cannot categorically prove that the way tmp is being used is independent of the order of iterations of the parfor loop. This is because it appears as though values assigned to tmp in one iteration of the parfor loop are still being used in the next iteration.
Fortunately, there's a very simple workaround for this case - convince parfor that you are not doing anything dependent on the order of evaluation of the iterations of the loop by resetting the variable. In the simple case above, this means:
parfor ii = 1:10
% reset 'tmp' at the start of each parfor loop iteration
tmp = [];
for jj = 1:10
tmp(jj) = rand
end
out(ii) = sum(tmp);
end

MATLAB: Using a for loop within another function

I am trying to concatenate several structs. What I take from each struct depends on a function that requires a for loop. Here is my simplified array:
t = 1;
for t = 1:5 %this isn't the for loop I am asking about
a(t).data = t^2; %it just creates a simple struct with 5 data entries
end
Here I am doing concatenation manually:
A = [a(1:2).data a(1:3).data a(1:4).data a(1:5).data] %concatenation function
As you can see, the range (1:2), (1:3), (1:4), and (1:5) can be looped, which I attempt to do like this:
t = 2;
A = [for t = 2:5
a(1:t).data
end]
This results in an error "Illegal use of reserved keyword "for"."
How can I do a for loop within the concatenate function? Can I do loops within other functions in Matlab? Is there another way to do it, other than copy/pasting the line and changing 1 number manually?
You were close to getting it right! This will do what you want.
A = []; %% note: no need to initialize t, the for-loop takes care of that
for t = 2:5
A = [A a(1:t).data]
end
This seems strange though...you are concatenating the same elements over and over...in this example, you get the result:
A =
1 4 1 4 9 1 4 9 16 1 4 9 16 25
If what you really need is just the .data elements concatenated into a single array, then that is very simple:
A = [a.data]
A couple of notes about this: why are the brackets necessary? Because the expressions
a.data, a(1:t).data
don't return all the numbers in a single array, like many functions do. They return a separate answer for each element of the structure array. You can test this like so:
>> [b,c,d,e,f] = a.data
b =
1
c =
4
d =
9
e =
16
f =
25
Five different answers there. But MATLAB gives you a cheat -- the square brackets! Put an expression like a.data inside square brackets, and all of a sudden those separate answers are compressed into a single array. It's magic!
Another note: for very large arrays, the for-loop version here will be very slow. It would be better to allocate the memory for A ahead of time. In the for-loop here, MATLAB is dynamically resizing the array each time through, and that can be very slow if your for-loop has 1 million iterations. If it's less than 1000 or so, you won't notice it at all.
Finally, the reason that HBHB could not run your struct creating code at the top is that it doesn't work unless a is already defined in your workspace. If you initialize a like this:
%% t = 1; %% by the way, you don't need this, the t value is overwritten by the loop below
a = []; %% always initialize!
for t = 1:5 %this isn't the for loop I am asking about
a(t).data = t^2; %it just creates a simple struct with 5 data entries
end
then it runs for anyone the first time.
As an appendix to gariepy's answer:
The matrix concatenation
A = [A k];
as a way of appending to it is actually pretty slow. You end up reassigning N elements every time you concatenate to an N size vector. If all you're doing is adding elements to the end of it, it is better to use the following syntax
A(end+1) = k;
In MATLAB this is optimized such that on average you only need to reassign about 80% of the elements in a matrix. This might not seam much, but for 10k elements this adds up to ~ an order of magnitude of difference in time (at least for me).
Bare in mind that this works only in MATLAB 2012b and higher as described in this thead: Octave/Matlab: Adding new elements to a vector
This is the code I used. tic/toc syntax is not the most accurate method for profiling in MATLAB, but it illustrates the point.
close all; clear all; clc;
t_cnc = []; t_app = [];
N = 1000;
for n = 1:N;
% Concatenate
tic;
A = [];
for k = 1:n;
A = [A k];
end
t_cnc(end+1) = toc;
% Append
tic;
A = [];
for k = 1:n;
A(end+1) = k;
end
t_app(end+1) = toc;
end
t_cnc = t_cnc*1000; t_app = t_app*1000; % Convert to ms
% Fit a straight line on a log scale
P1 = polyfit(log(1:N),log(t_cnc),1); P_cnc = #(x) exp(P1(2)).*x.^P1(1);
P2 = polyfit(log(1:N),log(t_app),1); P_app = #(x) exp(P2(2)).*x.^P2(1);
% Plot and save
loglog(1:N,t_cnc,'.',1:N,P_cnc(1:N),'k--',...
1:N,t_app,'.',1:N,P_app(1:N),'k--');
grid on;
xlabel('log(N)');
ylabel('log(Elapsed time / ms)');
title('Concatenate vs. Append in MATLAB 2014b');
legend('A = [A k]',['O(N^{',num2str(P1(1)),'})'],...
'A(end+1) = k',['O(N^{',num2str(P2(1)),'})'],...
'Location','northwest');
saveas(gcf,'Cnc_vs_App_test.png');

Using struct arrays in parfor

I am having trouble using struct arrays in Matlab's parfor loop. The following code has 2 problems I do not understand:
s=struct('a',{},'b',{});
if matlabpool('size')==0
matlabpool open local 2
end
for j = 1:2
parfor k=1:4
fprintf('[%d,%d]\n',k,j)
s(j,k).a = k;
s(j,k).b = j;
end
end
matlabpool close
It fails with an error Error using parallel_function (line 589)
Insufficient number of outputs from right hand side of equal sign to satisfy assignment.
On output, variable s is a vector, not an array (as it should be, even if the code breaks before finishing).
EDIT the problem is solved if I initialize the struct arrays to the correct size, by:
s=struct('a',cell(2,4),'b',cell(2,4));
However, I would still be happy to get insights about the problem (e.g is it rally a bug, as suggested by Oleg Komarov)
It was originally working fine for me but then I don't know what happens. In general you need to be careful with parfor loops and there are ample documentation on how to align everything. Two different words of advice.
First and more importantly, the parfor loop is on the outside loop:
function s = foo
s=struct('a',{},'b',{});
parfor j = 1:2
for k=1:4
fprintf('[%d,%d]\n',k,j)
s(j,k).a = k;
s(j,k).b = j;
end
end
Two, Matlab gets very picky about writing the main exit variable (i.e. the variable contained in the parfor loop which is indexed to the loop, in your case, s). You first want to create a dummy variable that holds all the innerloop information, and then writes to it once at the end of the loops. Example:
function s = khal
s=struct('a',{},'b',{});
parfor j = 1:2
dummy=struct('a',{},'b',{});
for k=1:4
fprintf('[%d,%d]\n',k,j)
dummy(k).a = k;
dummy(k).b = j;
end
s(j,:) = dummy;
end
You don't have a problem here, but it can get complicated in other instances