Avoiding loops in MatLab code (barycentric weights) - matlab

After having learned basic programming in Java, I have found that the most difficult part of transitioning to MatLab for my current algorithm course, is to avoid loops. I know that there are plenty of smart ways to vectorize operations in MatLab, but my mind is so "stuck" in loop-thinking, that I am finding it hard to intuitively see how I may vectorize code. Once I am shown how it can be done, it makes sense to me, but I just don't see it that easily myself. Currently I have the following code for finding the barycentric weights used in Lagrangian interpolation:
function w = barycentric_weights(x);
% The function is used to find the weights of the
% barycentric formula based on a given grid as input.
n = length(x);
w = zeros(1,n);
% Calculating the weights
for i = 1:n
prod = 1;
for j = 1:n
if i ~= j
prod = prod*(x(i) - x(j));
end
end
w(i) = prod;
end
w = 1./w;
I am pretty sure there must be a smarter way to do this in MatLab, but I just can't think of it. If anyone has any tips I will be very grateful :). And the only way I'll ever learn all the vectorizing tricks in MatLab is to see how they are used in various scenarios such as above.

One has to be creative in matlab to avoid for loop:
[X,Y] =meshgrid(x,x)
Z = X - Y
w =1./prod(Z+eye(length(x)))

Kristian, there are a lot of ways to vectorize code. You've already gotten two. (And I agree with shakinfree: you should always consider 1) how long it takes to run in non-vectorized form (so you'll have an idea of how much time you might save by vectorizing); 2) how long it might take you to vectorize (so you'll have a better sense of whether or not it's worth your time; 3) how many times you will call it (again: is it worth doing); and 3) readability. As shakinfree suggests, you don't want to come back to your code a year from now and scratch your head about what you've implemented. At least make sure you've commented well.
But at a meta-level, when you decide that you need to improve runtime performance by vectorizing, first start with small (3x1 ?) array and make sure you understand exactly what's happening for each iteration. Then, spend some time reading this document, and following relevant links:
http://www.mathworks.com/help/releases/R2012b/symbolic/code-performance.html
It will help you determine when and how to vectorize.
Happy MATLABbing!
Brett

I can see the appeal of vectorization, but I often ask myself how much time it actually saves when I go back to the code a month later and have to decipher all that repmat gibberish. I think your current code is clean and clear and I wouldn't mess with it unless performance is really critical. But to answer your question here is my best effort:
function w = barycentric_weights_vectorized(x)
n = length(x);
w = 1./prod(eye(n) + repmat(x,n,1) - repmat(x',1,n),1);
end
Hope that helps!
And I am assuming x is a row vector here.

Related

Replace two for loops using vectorization?

I have a function f(x,y) = abs(cos(x+3) * sin(y+2)) that I need to sum up using two for loops. Note: the real function is more complex, this is a toy version of it for the purposes of the question.
f = #(x,y) abs(cos(x+3) * sin(y+2));
tot = 0;
for m=1:100
for n=1:100
tot = tot + f(m,n);
end
end
disp(tot)
Output: 4.026314876227891e+03
How can I vectorize this code to get rid of the for loops and make it faster?
[n,m]=meshgrid(1:100,1:100);
tot=sum(f(m,n),'all')
However I am not sure this is any faster, you can time it. Matlab is quite fast in loops, the old truth about it being slower when you loop is outdated by 5 years or so. Most of the times the JIT compiler will find the fastest way to run it. This is one of the cases where your toy problem may hid the actual problem, as the JIT may find this toy problem easier to speed up, but not your real one, or vice versa.
You will need to time.

Optimizing Fourier Series Fitting Function Matlab

I am trying to iterate through a set of samples that seems to show periodic changes. I need continuously apply the fit function to get the fourier series coefficients, the regression has to be n samples in the past (in my case, around 30). The problem is, my code is extremely slow! It will take like 1 hour to do this for a set of 50,000 samples. Is there any way to optimize this? What am I doing wrong?
Here's my code:
function[coefnames,coef] = fourier_regression(vect_waves,n)
j = 1;
coef = zeros(length(vect_waves)-n,10);
for i=n+1:length(vect_waves)
take_fourier = vect_waves(i-n+1:i);
x = 1:n;
f = fit(x,take_fourier,'fourier4');
current_coef = coeffvalues(f);
coef(j,1:length(current_coef)) = current_coef;
j = j + 1;
end
coefnames = coeffnames(f);
end
When I call [coefnames,coef] = fourier_regression(VECTOR,30); This takes forever to compute. Is there any way to fix it? What's wrong with my code?
Note: I have a intel i7 5500 U cpu, 16GB RAM, and using Matlab 2015a.
As I am not familiar with your application, I am not sure whether it is possible to vectorize the code to improve performance. However, I have a couple of other tips.
One thing you should consider is preallocation of arrays. In this case, you should preallocate at least the array coef since I believe you do know its size before starting the loop.
Another thing I suggest is to profile your code. This will provide information on what parts of your code are consuming the most time, helping you focus your effort on improving those parts' performance.

vectorizing "for" loop with bidirectionally related variables

Last week I asked the following:
https://stackoverflow.com/questions/32658199/vectorizing-gibbs-sampler-in-matlab
Perhaps it was not that clear what I want to do, so this might be more clear.
I would like to vectorize a "for" loop in matlab, where some variables inside of the loop are bidirectionally related. So, here is an example:
A=2;
B=3;
for i=1:10000
A=3*B;
B=exp(A*(-1/2))
end
Thank you once again for your time.
A quick Excel calculation indicates that this quickly converges to 0.483908 (after much less than 10000 loops - so one way of speeding it up would be to check for convergence). If A and B are always 2 and 3 respectively, you could just replace the loop with this value.
Alternatively, using some series analysis you might be able to come up with an analytical expression for B when i is large - although with the nested exponents deriving this is a bit beyond my own abilities!
Edit
A bit of googling reveals this. Wikipedia states that for a tetration of x to infinity (i.e. x^x^x^x^x...), the solution y satisfies y = x^y. In your case, for example, 0.483908 = e^(-3/2)^0.483908, so 0.483908 is a solution. Not sure how you would exploit this though.
Wikipedia also gives a convergence condition, which might be of use to you: x lies between e^-e and e^1/e.
Final Edit (?)
Turns out you need Lambert's W function to solve for equations of the form of y = x^y. There seems to be no native function for this, but there seems to be something in the FileExchange - see here and here.

Replacement for repmat in MATLAB

I have a function which does the following loop many, many times:
for cluster=1:max(bins), % bins is a list in the same format as kmeans() IDX output
select=bins==cluster; % find group of values
means(select,:)=repmat_fast_spec(meanOneIn(x(select,:)),sum(select),1);
% (*, above) for each point, write the mean of all points in x that
% share its label in bins to the equivalent row of means
delta_x(select,:)=x(select,:)-(means(select,:));
%subtract out the mean from each point
end
Noting that repmat_fast_spec and meanOneIn are stripped-down versions of repmat() and mean(), respectively, I'm wondering if there's a way to do the assignment in the line labeled (*) that avoids repmat entirely.
Any other thoughts on how to squeeze performance out of this thing would also be welcome.
Here is a possible improvement to avoid REPMAT:
x = rand(20,4);
bins = randi(3,[20 1]);
d = zeros(size(x));
for i=1:max(bins)
idx = (bins==i);
d(idx,:) = bsxfun(#minus, x(idx,:), mean(x(idx,:)));
end
Another possibility:
x = rand(20,4);
bins = randi(3,[20 1]);
m = zeros(max(bins),size(x,2));
for i=1:max(bins)
m(i,:) = mean( x(bins==i,:) );
end
dd = x - m(bins,:);
One obvious way to speed up calculation in MATLAB is to make a MEX file. You can compile C code and perform any operations you want. If you're searching for the fastest-possible performance, turning the operation into a custom MEX file would likely be the way to go.
You may be able to get some improvement by using ACCUMARRAY.
%# gather array sizes
[nPts,nDims] = size(x);
nBins = max(bins);
%# calculate means. Not sure whether it might be faster to loop over nDims
meansCell = accumarray(bins,1:nPts,[nBins,1],#(idx){mean(x(idx,:),1)},{NaN(1,nDims)});
means = cell2mat(meansCell);
%# subtract cluster means from x - this is how you can avoid repmat in your code, btw.
%# all you need is the array with cluster means.
delta_x = x - means(bins,:);
First of all: format your code properly, surround any operator or assignment by whitespace. I find your code very hard to comprehend as it looks like a big blob of characters.
Next of all, you could follow the other responses and convert the code to C (mex) or Java, automatically or manually, but in my humble opinion this is a last resort. You should only do such things when your performance is not there yet by a small margin. On the other hand, your algorithm doesn't show obvious flaws.
But the first thing you should do when trying to improve performance: profile. Use the MATLAB profiler to determine which part of your code is causing your problems. How much would you need to improve this to meet your expectations? If you don't know: first determine this boundary, otherwise you will be looking for a needle in a hay stack which might not even be in there in the first place. MATLAB will never be the fastest kid on the block with respect to runtime, but it might be the fastest with respect to development time for certain kinds of operations. In that respect, it might prove useful to sacrifice the clarity of MATLAB over the execution speed of other languages (C or even Java). But in the same respect, you might as well code everything in assembler to squeeze all of the performance out of the code.
Another obvious way to speed up calculation in MATLAB is to make a Java library (similar to #aardvarkk's answer) since MATLAB is built on Java and has very good integration with user Java libraries.
Java's easier to interface and compile than C. It might be slower than C in some cases, but the just-in-time (JIT) compiler in the Java virtual machine generally speeds things up very well.

Methods to speed up for loop in MATLAB

I have just profiled my MATLAB code and there is a bottle-neck in this for loop:
for vert=-down:up
for horz=-lhs:rhs
y = y + x(k+vert.*length+horz).*DM(abs(vert).*nu+abs(horz)+1);
end
end
where y, x and DM are vectors I have already defined. I vectorised the loop by writing,
B=(-down:up)'*ones(1,lhs+rhs+1);
C=ones(up+down+1,1)*(-lhs:rhs);
y = sum(sum(x(k+length.*B+C).*DM(abs(B).*nu+abs(C)+1)));
But this ended up being sufficiently slower.
Are there any suggestions on how I can speed up this for loop?
Thanks in advance.
What you've done is not really vectorization. It's very difficult, if not impossible, to write proper vectorization procedures for image processing (I assume that's what you're doing) in Matlab. When we use the term vectorized, we really mean "vectorized with no additional computation". For example, this code
a = 1:1000000;
for i = a
n = n+i;
end
would run much slower then this code
a = 1:1000000;
sum(a)
Update: code above has been modified, thanks to #Rasman's keen suggestion. The reason is that Matlab does not compile your code into machine language before running it, and that's what causes it to be slower. Built-in functions like sum, mean and the .* operator run pre-compiled C code behind the scenes. For loops are a great example of code that runs slowly when not optimized for you CPU's registers.
What you have done, and please ignore my first comment, is rewriting your procedure with a vector operation and some additional operations. Those are the operations that take extra CPU simply because you're telling your computer to do more computations, even though each computation separately may (or may not) take less time.
If you are really after speeding up you code, take a look at MEX files. They allow you to write and compile C and C++ code, compile it and run as Matlab functions, just like those fast built-in ones. In any case, Matlab is not meant to be a fast general-purpose programming platform, but rather a computer simulation environment, though this approach has been changing in the recent years. My advise (from experience) is that if you do image processing, you will write for loops, and there's rarely a way around it. Vector operations were written for a more intuitive approach to linear algebra problems, and we rarely treat digital images as regular rectangular matrices in terms of what we do with them.
I hope this helps.
I would use matrices when handling images... you could then try to extract submatrices like so:
X = reshape(x,height,length);
kx = mod(k,length);
ky = floor(k/length);
xstamp = X( [kx-down:kx+up], [ky-lhs:ky+rhs]);
xstamp = xstamp.*getDMMMask(width, height);
y = sum(xstamp);
...
function mask = getDMMask(width, height, nu)
% I don't get what you're doing there .. return an appropriate sized mask here.
return mask;
end