how to assign one value to a list of Objects in an efficient way in Matlab? - matlab

I want to assign one value to a list of objects in Matlab without using a for-loop (In order to increase efficiency)
Basically this works:
for i=1:Nr_of_Objects
Objectlist(i,1).weight=0.2
end
But I would like something like this:
Objectlist(:,1).weight=0.2
Which is not working. I get this error:
Expected one output from a curly brace or dot indexing expression, but there were 5 results.
Writing an array to the right hand side is also not working.
I`m not very familiar with object oriented programming in Matlab, so I would be happy if someone could help me.

Your looking for the deal function:
S(1,1).a = 1
S(2,1).a = 2
S(1,2).a = 3
[S(:,1).a] = deal(4)
Now S(1,1).a and S(2,1).a equal to 4.
In matlab you can concatenate several output in one array using []. And deal(X) copies the single input to all the requested outputs.
So in your case:
[Objectlist(:,1).weight] = deal(0.2)
Should work.
Noticed that I'm not sure that it will be faster than the for loop since I don't know how the deal function is implemented.
EDIT: Benchmark
n = 1000000;
[S(1:n,1).a] = deal(1);
tic
for ii=1:n
S(ii,1).a = 2;
end
toc
% Elapsed time is 3.481088 seconds
tic
[S(1:n,1).a] = deal(2);
toc
% Elapsed time is 0.472028 seconds
Or with timeit
n = 1000000;
[S(1:n,1).a] = deal(1);
g = #() func1(S,n);
h = #() func2(S,n);
timeit(g)
% ans = 3.67
timeit(h)
% ans = 0.41
function func1(S,n)
for ii=1:n
S(ii,1).a = 2;
end
end
function func2(S,n)
[S(1:n,1).a] = deal(2);
end
So it seems that using the deal function reduce the computational time.

Related

Create array with function and size

Recently I wrote this statement
v = arrayfun(#(x) sum(randn(1, 4).^2), zeros(1, 1000000));
to create a new vector and now I'm asking if there exists a function in Matlab to avoid the creation of the unnecessary second vector zeros(1, 1000000). I'm looking for something like
v = FUN(#someInitFunction, [rows, cols]);
without loops, recursion and unnecessary allocation where someInitFunction is given and can't be changed. Does Matlab provide such a function FUN? A simple "No, it doesn't exist" would be a valid answer for me.
To summarize the function FUN: I want to create a new array by calling a function someInitFunction for each element of this new array. The array should be equivalent to
[
someInitFunction() someInitFunction() ...;
someInitFunction() someInitFunction() ...;
.
.
.
someInitFunction() someInitFunction() ...
]
As far as I know there is no builtin function for that. It's relatively easy to create your own however.
You asked a solution without loop but the current solution you are using (arrayfun) uses loop under the hood, and generally coding the same in a properly organised loop is actually faster than arrayfun.
For your case, the function GenArrayFun.m :
function out = GenArrayFun(initFunction , arraySize)
out = zeros(arraySize) ;
for k=1:numel(out)
out(k) = initFunction() ;
end
It has a loop, but no more than arrayfun, and seem to perform twice as fast (at least on my installation, R2016a, win10):
initFunction = #() sum(randn(1, 4).^2) ;
tic
v = arrayfun(#(x) sum(randn(1, 4).^2), zeros(1, 1000000));
toc
tic
out = GenArrayFun( initFunction , [1,1000000] );
toc
Sorry I did not take the time to build a proper timeit benchmark for such a small example, I think the results are significant enough to notice a difference:
Elapsed time is 6.815043 seconds. % arrayfun
Elapsed time is 3.060161 seconds. % GenArrayFun
And just to make sure it evaluate the initFunction for every element:
>> out = GenArrayFun( initFunction , [2,3] )
out =
6.25676106665387 6.52758807745462 2.99236122767462
0.386750258201569 0.566092999842791 2.21158011908878

Declaring a vector in matlab whose size we don't know

Suppose we are running an infinite for loop in MATLAB, and we want to store the iterative values in a vector. How can we declare the vector without knowing the size of it?
z=??
for i=1:inf
z(i,1)=i;
if(condition)%%condition is met then break out of the loop
break;
end;
end;
Please note first that this is bad practise, and you should preallocate where possible.
That being said, using the end keyword is the best option for extending arrays by a single element:
z = [];
for ii = 1:x
z(end+1, 1) = ii; % Index to the (end+1)th position, extending the array
end
You can also concatenate results from previous iterations, this tends to be slower since you have the assignment variable on both sides of the equals operator
z = [];
for ii = 1:x
z = [z; ii];
end
Sadar commented that directly indexing out of bounds (as other answers are suggesting) is depreciated by MathWorks, I'm not sure on a source for this.
If your condition computation is separate from the output computation, you could get the required size first
k = 0;
while ~condition
condition = true; % evaluate the condition here
k = k + 1;
end
z = zeros( k, 1 ); % now we can pre-allocate
for ii = 1:k
z(ii) = ii; % assign values
end
Depending on your use case you might not know the actual number of iterations and therefore vector elements, but you might know the maximum possible number of iterations. As said before, resizing a vector in each loop iteration could be a real performance bottleneck, you might consider something like this:
maxNumIterations = 12345;
myVector = zeros(maxNumIterations, 1);
for n = 1:maxNumIterations
myVector(n) = someFunctionReturningTheDesiredValue(n);
if(condition)
vecLength = n;
break;
end
end
% Resize the vector to the length that has actually been filled
myVector = myVector(1:vecLength);
By the way, I'd give you the advice to NOT getting used to use i as an index in Matlab programs as this will mask the imaginary unit i. I ran into some nasty bugs in complex calculations inside loops by doing so, so I would advise to just take n or any other letter of your choice as your go-to loop index variable name even if you are not dealing with complex values in your functions ;)
You can just declare an empty matrix with
z = []
This will create a 0x0 matrix which will resize when you write data to it.
In your case it will grow to a vector ix1.
Keep in mind that this is much slower than initializing your vector beforehand with the zeros(dim,dim) function.
So if there is any way to figure out the max value of i you should initialize it withz = zeros(i,1)
cheers,
Simon
You can initialize z to be an empty array, it'll expand automatically during looping ...something like:
z = [];
for i = 1:Inf
z(i) = i;
if (condition)
break;
end
end
However this looks nasty (and throws a warning: Warning: FOR loop index is too large. Truncating to 9223372036854775807), I would do here a while (true) or the condition itself and increment manually.
z = [];
i = 0;
while !condition
i=i+1;
z[i]=i;
end
And/or if your example is really what you need at the end, replace the re-creation of the array with something like:
while !condition
i=i+1;
end
z = 1:i;
As mentioned in various times in this thread the resizing of an array is very processing intensive, and could take a lot of time.
If processing time is not an issue:
Then something like #Wolfie mentioned would be good enough. In each iteration the array length will be increased and that is that:
z = [];
for ii = 1:x
%z = [z; ii];
z(end+1) = ii % Best way
end
If processing time is an issue:
If the processing time is a large factor, and you want it to run as smooth as possible, then you need to preallocating.If you have a rough idea of the maximum number of iterations that will run then you can use #PluginPenguin's suggestion. But there could still be a change of hitting that preset limit, which will break (or severely slow down) the program.
My suggestion:
If your loop is running infinitely until you stop it, you could do occasional resizing. Essentially extending the size as you go, but only doing it once in a while. For example every 100 loops:
z = zeros(100,1);
for i=1:inf
z(i,1)=i;
fprintf("%d,\t%d\n",i,length(z)); % See it working
if i+1 >= length(z) %The array as run out of space
%z = [z; zeros(100,1)]; % Extend this array (note the semi-colon)
z((length(z)+100),1) = 0; % Seems twice as fast as the commented method
end
if(condition)%%condition is met then break out of the loop
break;
end;
end
This means that the loop can run forever, the array will increase with it, but only every once in a while. This means that the processing time hit will be minimal.
Edit:
As #Cris kindly mentioned MATLAB already does what I proposed internally. This makes two of my comments completely wrong. So the best will be to follow what #Wolfie and #Cris said with:
z(end+1) = i
Hope this helps!

Fast way to get mean values of rows accordingly to subscripts

I have a data, which may be simulated in the following way:
N = 10^6;%10^8;
K = 10^4;%10^6;
subs = randi([1 K],N,1);
M = [randn(N,5) subs];
M(M<-1.2) = nan;
In other words, it is a matrix, where the last row is subscripts.
Now I want to calculate nanmean() for each subscript. Also I want to save number of rows for each subscript. I have a 'dummy' code for this:
uniqueSubs = unique(M(:,6));
avM = nan(numel(uniqueSubs),6);
for iSub = 1:numel(uniqueSubs)
tmpM = M(M(:,6)==uniqueSubs(iSub),1:5);
avM(iSub,:) = [nanmean(tmpM,1) size(tmpM,1)];
end
The problem is, that it is too slow. I want it to work for N = 10^8 and K = 10^6 (see commented part in the definition of these variables.
How can I find the mean of the data in a faster way?
This sounds like a perfect job for findgroups and splitapply.
% Find groups in the final column
G = findgroups(M(:,6));
% function to apply per group
fcn = #(group) [mean(group, 1, 'omitnan'), size(group, 1)];
% Use splitapply to apply fcn to each group in M(:,1:5)
result = splitapply(fcn, M(:, 1:5), G);
% Check
assert(isequaln(result, avM));
M = sortrows(M,6); % sort the data per subscript
IDX = diff(M(:,6)); % find where the subscript changes
tmp = find(IDX);
tmp = [0 ;tmp;size(M,1)]; % add start and end of data
for iSub= 2:numel(tmp)
% Calculate the mean over just a single subscript, store in iSub-1
avM2(iSub-1,:) = [nanmean(M(tmp(iSub-1)+1:tmp(iSub),1:5),1) tmp(iSub)-tmp(iSub-1)];tmp(iSub-1)];
end
This is some 60 times faster than your original code on my computer. The speed-up mainly comes from presorting the data and then finding all locations where the subscript changes. That way you do not have to traverse the full array each time to find the correct subscripts, but rather you only check what's necessary each iteration. You thus calculate the mean over ~100 rows, instead of first having to check in 1,000,000 rows whether each row is needed that iteration or not.
Thus: in the original you check numel(uniqueSubs), 10,000 in this case, whether all N, 1,000,000 here, numbers belong to a certain category, which results in 10^12 checks. The proposed code sorts the rows (sorting is NlogN, thus 6,000,000 here), and then loop once over the full array without additional checks.
For completion, here is the original code, along with my version, and it shows the two are the same:
N = 10^6;%10^8;
K = 10^4;%10^6;
subs = randi([1 K],N,1);
M = [randn(N,5) subs];
M(M<-1.2) = nan;
uniqueSubs = unique(M(:,6));
%% zlon's original code
avM = nan(numel(uniqueSubs),7); % add the subscript for comparison later
tic
uniqueSubs = unique(M(:,6));
for iSub = 1:numel(uniqueSubs)
tmpM = M(M(:,6)==uniqueSubs(iSub),1:5);
avM(iSub,:) = [nanmean(tmpM,1) size(tmpM,1) uniqueSubs(iSub)];
end
toc
%%%%% End of zlon's code
avM = sortrows(avM,7); % Sort for comparison
%% Start of Adriaan's code
avM2 = nan(numel(uniqueSubs),6);
tic
M = sortrows(M,6);
IDX = diff(M(:,6));
tmp = find(IDX);
tmp = [0 ;tmp;size(M,1)];
for iSub = 2:numel(tmp)
avM2(iSub-1,:) = [nanmean(M(tmp(iSub-1)+1:tmp(iSub),1:5),1) tmp(iSub)-tmp(iSub-1)];
end
toc %tic/toc should not be used for accurate timing, this is just for order of magnitude
%%%% End of Adriaan's code
all(avM(:,1:6) == avM2) % Do the comparison
% End of script
% Output
Elapsed time is 58.561347 seconds.
Elapsed time is 0.843124 seconds. % ~70 times faster
ans =
1×6 logical array
1 1 1 1 1 1 % i.e. the matrices are equal to one another

MATLAB: Using a for loop within another function

I am trying to concatenate several structs. What I take from each struct depends on a function that requires a for loop. Here is my simplified array:
t = 1;
for t = 1:5 %this isn't the for loop I am asking about
a(t).data = t^2; %it just creates a simple struct with 5 data entries
end
Here I am doing concatenation manually:
A = [a(1:2).data a(1:3).data a(1:4).data a(1:5).data] %concatenation function
As you can see, the range (1:2), (1:3), (1:4), and (1:5) can be looped, which I attempt to do like this:
t = 2;
A = [for t = 2:5
a(1:t).data
end]
This results in an error "Illegal use of reserved keyword "for"."
How can I do a for loop within the concatenate function? Can I do loops within other functions in Matlab? Is there another way to do it, other than copy/pasting the line and changing 1 number manually?
You were close to getting it right! This will do what you want.
A = []; %% note: no need to initialize t, the for-loop takes care of that
for t = 2:5
A = [A a(1:t).data]
end
This seems strange though...you are concatenating the same elements over and over...in this example, you get the result:
A =
1 4 1 4 9 1 4 9 16 1 4 9 16 25
If what you really need is just the .data elements concatenated into a single array, then that is very simple:
A = [a.data]
A couple of notes about this: why are the brackets necessary? Because the expressions
a.data, a(1:t).data
don't return all the numbers in a single array, like many functions do. They return a separate answer for each element of the structure array. You can test this like so:
>> [b,c,d,e,f] = a.data
b =
1
c =
4
d =
9
e =
16
f =
25
Five different answers there. But MATLAB gives you a cheat -- the square brackets! Put an expression like a.data inside square brackets, and all of a sudden those separate answers are compressed into a single array. It's magic!
Another note: for very large arrays, the for-loop version here will be very slow. It would be better to allocate the memory for A ahead of time. In the for-loop here, MATLAB is dynamically resizing the array each time through, and that can be very slow if your for-loop has 1 million iterations. If it's less than 1000 or so, you won't notice it at all.
Finally, the reason that HBHB could not run your struct creating code at the top is that it doesn't work unless a is already defined in your workspace. If you initialize a like this:
%% t = 1; %% by the way, you don't need this, the t value is overwritten by the loop below
a = []; %% always initialize!
for t = 1:5 %this isn't the for loop I am asking about
a(t).data = t^2; %it just creates a simple struct with 5 data entries
end
then it runs for anyone the first time.
As an appendix to gariepy's answer:
The matrix concatenation
A = [A k];
as a way of appending to it is actually pretty slow. You end up reassigning N elements every time you concatenate to an N size vector. If all you're doing is adding elements to the end of it, it is better to use the following syntax
A(end+1) = k;
In MATLAB this is optimized such that on average you only need to reassign about 80% of the elements in a matrix. This might not seam much, but for 10k elements this adds up to ~ an order of magnitude of difference in time (at least for me).
Bare in mind that this works only in MATLAB 2012b and higher as described in this thead: Octave/Matlab: Adding new elements to a vector
This is the code I used. tic/toc syntax is not the most accurate method for profiling in MATLAB, but it illustrates the point.
close all; clear all; clc;
t_cnc = []; t_app = [];
N = 1000;
for n = 1:N;
% Concatenate
tic;
A = [];
for k = 1:n;
A = [A k];
end
t_cnc(end+1) = toc;
% Append
tic;
A = [];
for k = 1:n;
A(end+1) = k;
end
t_app(end+1) = toc;
end
t_cnc = t_cnc*1000; t_app = t_app*1000; % Convert to ms
% Fit a straight line on a log scale
P1 = polyfit(log(1:N),log(t_cnc),1); P_cnc = #(x) exp(P1(2)).*x.^P1(1);
P2 = polyfit(log(1:N),log(t_app),1); P_app = #(x) exp(P2(2)).*x.^P2(1);
% Plot and save
loglog(1:N,t_cnc,'.',1:N,P_cnc(1:N),'k--',...
1:N,t_app,'.',1:N,P_app(1:N),'k--');
grid on;
xlabel('log(N)');
ylabel('log(Elapsed time / ms)');
title('Concatenate vs. Append in MATLAB 2014b');
legend('A = [A k]',['O(N^{',num2str(P1(1)),'})'],...
'A(end+1) = k',['O(N^{',num2str(P2(1)),'})'],...
'Location','northwest');
saveas(gcf,'Cnc_vs_App_test.png');

MATLAB: Nested for-loop takes longer every successive iteration

/edit: The loop doesn't become slower. I didn't take the time correctly. See Rasman's answer.
I'm looping over 3 parameters for a somewhat long and complicated function and I noticed two things that I don't understand:
The execution gets slower with each successive iteration, although the function only returns one struct (of which I only need one field) that I overwrite with each iteration.
The profiler shows that the end statement for the innermost for takes a quite long time.
Consider the following example (I'm aware that this can easily be vectorized, but as far as I understand the function I call can't):
function stuff = doSomething( x, y, z )
stuff.one = x+y+z;
stuff.two = x-y-z;
end
and how I execute the function
n = 50;
i = 0;
currenttoc = 0;
output = zeros(n^3,4);
tic
for x = 1:n
for y = 1:n
for z = 1:n
i = i + 1;
output(i,1) = x;
output(i,2) = y;
output(i,3) = z;
stuff = doSomething(x,y,z);
output(i,4) = stuff.one;
if mod(i,1e4) == 0 % only for demonstration, not in final script
currenttoc = toc - currenttoc;
fprintf(1,'time for last 10000 iterations: %f \n',currenttoc)
end
end
end
end
How can I speed this up? Why does every iteration take longer than the one before? I'm pretty sure this is horrible programming, sorry for that.
When I replace the call to doSomething with output(i,4)=toc;, and I plot diff(output(:,4)), I see that it's the call to fprintf that takes longer and longer every time, apparently.
Removing the if-clause returns to every iteration taking about the same amount of time.
So, the problem gets largely eliminated when I replace the if statement with:
if mod(i,1e4) == 0 % only for demonstration, not in final script
fprintf(1,'time for last 10000 iterations: %f \n',toc); tic;
end
I think the operation on toc may be causing the problem
It's MUCH faster if doSomething returns multiple output variables rather than a struct
function [out1,out2] = doSomething( x, y, z )
out1 = x+y+z;
out2 = x-y-z;
end
The fact that it gets slower on each subsequent iteration is strange and i have no explanation for it but hopefully that gives you some speed up at least.