Declaring a vector in matlab whose size we don't know - matlab

Suppose we are running an infinite for loop in MATLAB, and we want to store the iterative values in a vector. How can we declare the vector without knowing the size of it?
z=??
for i=1:inf
z(i,1)=i;
if(condition)%%condition is met then break out of the loop
break;
end;
end;

Please note first that this is bad practice, and you should preallocate where possible.
That being said, using the end keyword is the best option for extending arrays by a single element:
z = [];
for ii = 1:x
z(end+1, 1) = ii; % Index to the (end+1)th position, extending the array
end
You can also concatenate results from previous iterations. This tends to be slower, since the variable being assigned appears on both sides of the equals operator:
z = [];
for ii = 1:x
z = [z; ii];
end
Sadar commented that directly indexing out of bounds (as other answers are suggesting) is deprecated by MathWorks, though I don't have a source for this.
If your condition computation is separate from the output computation, you could get the required size first
k = 0;
while ~condition
condition = true; % evaluate the condition here
k = k + 1;
end
z = zeros( k, 1 ); % now we can pre-allocate
for ii = 1:k
z(ii) = ii; % assign values
end
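To get a feel for the cost difference on your own machine, the two approaches can be timed side by side. This is only a rough sketch: the iteration count n is an arbitrary value chosen for illustration, and the timings depend on the MATLAB version and hardware.

```matlab
n = 1e5;  % assumed iteration count, for illustration only

% Grow the vector one element at a time
tic;
grown = [];
for ii = 1:n
    grown(end+1, 1) = ii;  % the array is extended on every iteration
end
tGrow = toc;

% Preallocate once, then fill in place
tic;
prealloc = zeros(n, 1);
for ii = 1:n
    prealloc(ii) = ii;
end
tPre = toc;

fprintf('grow: %.4f s, preallocate: %.4f s\n', tGrow, tPre);
```

Both loops produce the same vector; only the allocation strategy differs.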

Depending on your use case, you might not know the actual number of iterations (and therefore vector elements), but you might know the maximum possible number of iterations. As said before, resizing a vector in each loop iteration can be a real performance bottleneck, so you might consider something like this:
maxNumIterations = 12345;
myVector = zeros(maxNumIterations, 1);
for n = 1:maxNumIterations
myVector(n) = someFunctionReturningTheDesiredValue(n);
if(condition)
vecLength = n;
break;
end
end
% Resize the vector to the length that has actually been filled
myVector = myVector(1:vecLength);
By the way, I'd advise you NOT to get used to using i as an index in MATLAB programs, as this will mask the imaginary unit i. I ran into some nasty bugs in complex calculations inside loops by doing so, so I would advise using n or any other letter of your choice as your go-to loop index variable name, even if you are not dealing with complex values in your functions ;)

You can just declare an empty matrix with
z = []
This will create a 0x0 matrix which will resize when you write data to it.
In your case it will grow to an i-by-1 vector.
Keep in mind that this is much slower than initializing your vector beforehand with the zeros function.
So if there is any way to figure out the maximum value of i, you should initialize it with z = zeros(i,1)
cheers,
Simon

You can initialize z to be an empty array, it'll expand automatically during looping ...something like:
z = [];
for i = 1:Inf
z(i) = i;
if (condition)
break;
end
end
However, this looks nasty (and throws a warning: Warning: FOR loop index is too large. Truncating to 9223372036854775807). Here I would use a while (true) loop, or loop on the condition itself, and increment the counter manually:
z = [];
i = 0;
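while ~condition % MATLAB negates with ~ (the ! form is Octave-only)
i = i + 1;
z(i) = i; % MATLAB indexes with parentheses, not square brackets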
end
And/or if your example is really what you need in the end, replace the element-by-element construction of the array with something like:
while ~condition
i=i+1;
end
z = 1:i;

As mentioned several times in this thread, resizing an array is processing-intensive and can take a lot of time.
If processing time is not an issue:
Then something like what @Wolfie mentioned would be good enough. In each iteration the array length is increased by one, and that is that:
z = [];
for ii = 1:x
%z = [z; ii];
z(end+1) = ii; % Best way
end
If processing time is an issue:
If processing time is a large factor and you want the loop to run as smoothly as possible, then you need to preallocate. If you have a rough idea of the maximum number of iterations that will run, you can use @PluginPenguin's suggestion. But there is still a chance of hitting that preset limit, which will break (or severely slow down) the program.
My suggestion:
If your loop is running infinitely until you stop it, you could do occasional resizing. Essentially extending the size as you go, but only doing it once in a while. For example every 100 loops:
z = zeros(100,1);
for i=1:inf
z(i,1)=i;
fprintf("%d,\t%d\n",i,length(z)); % See it working
if i+1 >= length(z) % The array has run out of space
%z = [z; zeros(100,1)]; % Extend this array (note the semi-colon)
z((length(z)+100),1) = 0; % Seems twice as fast as the commented method
end
if(condition)%%condition is met then break out of the loop
break;
end;
end
This means that the loop can run forever and the array will grow with it, but only every once in a while, so the processing time hit will be minimal.
Edit:
As @Cris kindly mentioned, MATLAB already does what I proposed internally. This makes two of my comments completely wrong. So the best option is to follow what @Wolfie and @Cris said:
z(end+1) = i
Hope this helps!

Related

Avoiding race conditions when using parfor in MATLAB

I'm looping in parallel and changing a variable if a condition is met. Super idiomatic code that I'm sure everyone has written a hundred times:
trials = 100;
greatest_so_far = 0;
best_result = 0;
for trial_i = 1:trials
[amount, result] = do_work();
if amount > greatest_so_far
greatest_so_far = amount;
best_result = result;
end
end
If I wanted to replace for by parfor, how can I ensure that there aren't race conditions when checking whether we should replace greatest_so_far? Is there a way to lock this variable outside of the check? Perhaps like:
trials = 100;
greatest_so_far = 0;
best_result = 0;
parfor trial_i = 1:trials
[amount, result] = do_work();
somehow_lock(greatest_so_far);
if amount > greatest_so_far
greatest_so_far = amount;
best_result = result;
end
somehow_unlock(greatest_so_far);
end
A sideways answer: it does not exactly solve your problem, but it might help you avoid it.
If you can afford the memory to store the outputs of your do_work() in some vectors, then you could simply run your parfor on this function only, store the result, then do your scoring at the end (outside of the loop):
amount = zeros( trials , 1 ) ;
result = zeros( trials , 1 ) ;
parfor trial_i = 1:trials
[amount(trial_i), result(trial_i)] = do_work(); % index with the parfor loop variable
end
[ greatest_of_all , greatest_index ] = max(amount) ;
best_result = result(greatest_index) ;
Edit/comment : (wanted to put that in comment of your question but it was too long, sorry).
I am familiar with .NET and completely understand your lock/unlock request. I made many attempts myself to implement a kind of progress indicator for a very long parfor loop ... to no avail.
If I understand MATLAB's classification of variables correctly, the mere fact that you assign greatest_so_far (in greatest_so_far=amount) makes MATLAB treat it as a temporary variable, which is cleared and reinitialized at the beginning of every loop iteration (hence unusable for your purpose).
So an easy locked variable may not be a concept we can implement simply at the moment. Some convoluted class event or file writing/checking may do the trick, but I am afraid the timing would suffer greatly. If each iteration takes a long time to execute, the overhead might be worth it, but if you use parfor to accelerate a large number of short iterations, then the convoluted solutions would slow you down more than they help ...
You can have a look at this stack exchange question, you may find something of interest for your case: Semaphores and locks in MATLAB
The solution from Hoki is the right way to solve the problem as stated. However, as you asked about race conditions and preventing them when loop iterations depend on each other you might want to investigate spmd and the various lab* functions.
You need to use SPMD to do this - SPMD allows communication between the workers. Something like this:
bestResult = -Inf;
bestIndex = NaN;
N = 97;
spmd
% we need to round up the loop range to ensure that each
% worker executes the same number of iterations
loopRange = numlabs * ceil(N / numlabs);
for idx = labindex:numlabs:loopRange % start at this worker's own index so the workers split the range
if idx <= N
local_result = rand(); % obviously replace this with your actual function
else
local_result = -Inf;
end
% Work out which index has the best result - use a really simple approach
% by concatenating all the results this time from each worker
% allResultsThisTime will be 2-by-numlabs where the first row is all the
% results this time, and the second row is all the values of idx from this time
allResultsThisTime = gcat([local_result; idx]);
% The best result this time - consider the first row
[bestResultThisTime, labOfBestResult] = max(allResultsThisTime(1, :));
if bestResultThisTime > bestResult
bestResult = bestResultThisTime;
bestIndex = allResultsThisTime(2, labOfBestResult);
end
end
end
disp(bestResult{1})
disp(bestIndex{1})

Get the iteration number inside a MATLAB for loop

Say I have a for loop in MATLAB:
scales = 5:5:95;
for scale = scales
do stuff
end
How can I get the iteration number inside a MATLAB for loop as concisely as possible?
In Python for example I would use:
for idx, item in enumerate(scales):
where idx is the iteration number.
I know that in MATLAB (like in any other language) I could create a count variable:
scales = 5:5:95;
scale_count = 0;
for scale = scales
scale_count = scale_count + 1;
do stuff
end
I could otherwise use find:
scales = 5:5:95;
for scale = scales
scale_count = find(scales == scale);
do stuff
end
But I'm curious to know whether there exists a more concise way to do it, e.g. like in the Python example.
Maybe you can use the following:
scales = 5:5:95;
for iter = 1:length(scales)
scale=scales(iter); % "iter" is the iteration number.
do stuff
end
Since for iterates over the columns of whatever you give it, another way of approximating multiple loop variables would be to use an appropriately constructed matrix:
for scale=[5:5:95; 1:19]
% do stuff with scale(1) or scale(2) as appropriate
end
(my personal preference is to loop over the indices as per Parag's answer and just refer to data(index) directly within the loop, without an intermediate. Matlab's syntax isn't very concise at the best of times - you just get used to it)
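A runnable sketch of the matrix trick above, with the index placed in the first row so it reads like Python's enumerate. The names pair and seen are made up for this illustration; seen only records what each iteration received so the result can be inspected afterwards.

```matlab
scales = 5:5:95;
seen = zeros(numel(scales), 2);  % illustrative log of [idx, scale] pairs

% Each column of the 2-by-N matrix becomes one loop value
for pair = [1:numel(scales); scales]
    idx = pair(1);     % iteration number
    scale = pair(2);   % value from scales
    seen(idx, :) = [idx, scale];
end
```

Note that the loop variable is a column vector here, so both rows are forced to a common numeric type; this works for numeric data but not for mixed types.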
The MATLAB way is probably doing it with vectors.
For example suppose you want to find in a vector if there is a value that is equal to its position. You would generally do this:
a = [10 20 1 3 5];
found = 0;
for index = 1:length(a)
if a(index) == index
found = 1;
break;
end
end
Instead you can do:
found = any(a == 1:length(a));
In general
for i=1:length(a)
dostuff(a(i), i);
end
can be replaced with:
dostuff(a, 1:length(a))
if dostuff can be vectorized, or
arrayfun(@dostuff, a, 1:length(a))
otherwise.

How to check a matrix is/isn't in an array (Matlab)

I have an array (M) of matrices. I perform an operation on the matrix in the ith position, and it adds three more matrices to my array in the (3i-1), (3i) and (3i+1)th positions. I want to continue this process until I reach the jth position in the array, where j is such that all matrices in the (j+1)th position and onwards have appeared already somewhere between positions 1 and j (inclusive).
EDIT: I've been asked to clarify what I mean. I am unable to write code that makes my algorithm terminate when I want it to as explained above. If I knew a proper way of searching through an array of matrices to check if a given matrix is contained, then I could do it. I tried the following:
done = 0;
ii = 1
while done ~= 1
%operation on matrix in ith position omitted, but this is where it goes
for jj = ii+1:numel(M)
for kk = 1:ii
if M{jj} == M{kk};
done = done + 1/(numel(M) - ii);
break
end
end
end
if done ~= 1
done = 0;
end
ii = ii + 1
end
The problem I have with this (as I'm sure you can see) is that if the process goes on for too long, rounding errors stop ever allowing done = 1, and the algorithm doesn't terminate. I tried getting round this by introducing thresholds, something like
while abs(done - 1) > thresh
and
if abs(done - 1) > thresh
done = 0;
end
This makes the algorithm work more often, but I don't have a 'one size fits all' threshold that I could use (the process could continue for arbitrarily many steps), so it still ends up breaking.
What can I do to fix this?
Thanks
Why don't you initialize done at 0, keep your while done==0 loop, and instead of computing done as a sum of elements, check if your condition (finding if the matrix already exists) is verified for all jj, something like this:
alldone=zeros(numel(M)-ii,1);
for jj = ii+1:numel(M)
for kk = 1:ii
if isequal(M{jj},M{kk})
alldone(jj-ii) = 1;
break
end
end
end
done=prod(alldone);
There is probably a more elegant way to code this, though.
For instance, you could add early termination:
while done==0
done=1;
for jj = ii+1:numel(M)
match_success=0;
for kk = 1:ii
if isequal(M{jj},M{kk})
match_success=1;
break
end
end
if match_success==0
done=0;
break;
end
end
end
At the beginning of each pass through the while loop, the algorithm assumes it is going to succeed and stop there (hence the done=1). Then, for each jj, we create a match_success flag which is set to 1 only if a match is found for M{jj}. If a match is found, we break and go on to the next jj. If no match is found for jj, match_success is left at 0, done is reset to 0, and the while loop continues. I haven't checked it, but I think it should work.
This is just a simple tweak, but again, more thought can probably speed up this whole code a lot.
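For the membership test itself, the inner loops can be collapsed with cellfun and isequal. The following is a minimal sketch, assuming M is a cell array of matrices as in the question; the sample data and the helper name containsMatrix are invented for this example.

```matlab
% Sample data: entries 3 and 4 repeat entries 1 and 2
M = {eye(2), [1 2; 3 4], eye(2), [1 2; 3 4]};

% Hypothetical helper: true if matrix A equals any entry of cell array C
containsMatrix = @(C, A) any(cellfun(@(m) isequal(m, A), C));

ii = 2;  % position reached so far

% For every later entry, check whether it already occurs in M(1:ii)
tailSeen = arrayfun(@(jj) containsMatrix(M(1:ii), M{jj}), ii+1:numel(M));
allSeen = all(tailSeen);  % termination test for the while loop
```

Because isequal returns a logical rather than a floating-point score, this avoids the rounding problem with accumulating done as a sum of fractions.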

How do I know how many iterations are left in a parfor loop in Matlab?

I am running a parfor loop in Matlab that takes a lot of time and I would like to know how many iterations are left. How can I get that info?
I don't believe you can get that information directly from MATLAB, short of printing something with each iteration and counting these lines by hand.
To see why, recall that each parfor iteration executes in its own workspace: while incrementing a counter within the loop is legal, accessing its "current" value is not (because this value does not really exist until completion of the loop). Furthermore, the parfor construct does not guarantee any particular execution order, so printing the iterator value isn't helpful.
cnt = 0;
parfor i=1:n
cnt = cnt + 1; % legal
disp(cnt); % illegal
disp(i); % legal, of course, but printed out of order
end
Maybe someone does have a clever workaround, but I think that the independent nature of the parfor iterations precludes taking a reliable count. The restrictions mentioned above, plus those on using evalin, etc., support this conclusion.
As @Jonas suggested, you could obtain the iteration count via side effects occurring outside of MATLAB, e.g. creating empty files in a certain directory and counting them. This you can do in MATLAB of course:
fid = fopen(['countingDir/f' num2str(i)],'w');
fclose(fid);
numel(dir('countingDir/f*')) % count only the marker files, not the '.' and '..' entries
Try this FEX file: http://www.mathworks.com/matlabcentral/fileexchange/32101-progress-monitor--progress-bar--that-works-with-parfor
You can easily modify it to return the iteration number instead of displaying a progress bar.
Something like a progress bar could be done similar to this...
Before the parfor loop :
fprintf('Progress:\n');
fprintf(['\n' repmat('.',1,m) '\n\n']);
And during the loop:
fprintf('\b|\n');
Here m is the total number of iterations: the row of . characters shows the total number of iterations and each | marks a completed iteration. The \n makes sure the characters are printed from within the parfor loop.
With Matlab 2017a or later you can use a data queue or a pollable data queue to achieve this. Here's the MathWorks documentation example of how to do a progress bar from the first link :
function a = parforWaitbar
D = parallel.pool.DataQueue;
h = waitbar(0, 'Please wait ...');
afterEach(D, @nUpdateWaitbar);
N = 200;
p = 1;
parfor i = 1:N
a(i) = max(abs(eig(rand(400))));
send(D, i);
end
function nUpdateWaitbar(~)
waitbar(p/N, h);
p = p + 1;
end
end
If you just want to know approximately how much time is left, you can run the program once, record the maximum run time (tMax below, in seconds), and then do this:
tStart = tic;
parfor i=1:n
tElapsed = toc(tStart);
disp(['Time left in min ~ ', num2str( ( tMax - tElapsed ) / 60 ) ]);
...
end
I created a utility to do this:
http://www.mathworks.com/matlabcentral/fileexchange/48705-drdan14-parforprogress

Matlab: how to implement a dynamic vector

I am referring to an example like this
I have a function to analyze the elements of a vector, 'input'. If these elements have a special property, I store their values in a vector, 'output'.
The problem is that at the beginning I don't know how many elements will need to be stored in 'output', so I don't know its size.
I have a loop in which I walk through the vector 'input' by index. When I consider some element of this vector special, I capture its value and store it in a vector 'output' through a statement like this:
for i=1:N %Where N denotes the number of elements of 'input'
...
output(j) = input(i);
...
end
The problem is that I get an error if I don't previously "declare" 'output'. I don't want to "declare" 'output' before reaching the loop as output = input, because it would then hold values from input in which I am not interested, and I would have to think of some way to remove all the stored values that are not relevant to me.
Can anyone shed some light on this issue?
Thank you.
How complicated is the logic in the for loop?
If it's simple, something like this would work:
output = input ( logic==true )
Alternatively, if the logic is complicated and you're dealing with big vectors, I would preallocate a vector that stores whether to save an element or not. Here is some example code:
N = length(input); %Where N denotes the number of elements of 'input'
saveInput = zeros(1,N); % create a vector of 0s
for i=1:N
...
if (input meets criteria)
saveInput(i) = 1;
end
end
output = input( saveInput==1 ); %only save elements worth saving
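As a concrete illustration of the logical-indexing idea, here is a small sketch. The data and the criterion (values greater than 3) are invented for the example, and the vector is named values rather than input to avoid shadowing MATLAB's built-in input function.

```matlab
values = [3 8 1 9 4];     % stand-in for the 'input' vector
keep = values > 3;        % logical mask: true where the criterion holds
output = values(keep);    % keeps only the selected elements: [8 9 4]
```

No preallocation is needed here, because output is created in a single step at its final size.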
The trivial solution is:
% if input(i) meets your conditions
output = [output; input(i)]
Though I don't know if this has good performance or not
If N is not too big so that it would cause you memory problems, you can pre-assign output to a vector of the same size as input, and remove all useless elements at the end of the loop.
output = NaN(N,1);
for i=1:N
...
output(i) = input(i);
...
end
output(isnan(output)) = [];
There are two alternatives
If output would be too big if it was assigned the size of N, or if you didn't know the upper limit of the size of output, you can do the following
lengthOutput = 100;
output = NaN(lengthOutput,1);
counter = 1;
for i=1:N
...
output(counter) = input(i);
counter = counter + 1;
if counter > lengthOutput
%# append output if necessary by doubling its size
output = [output;NaN(lengthOutput,1)];
lengthOutput = length(output);
end
end
%# remove unused entries
output(counter:end) = [];
Finally, if N is small, it is perfectly fine to call
output = [];
for i=1:N
...
output = [output;input(i)];
...
end
Note that performance degrades dramatically if N becomes large (say >1000).