Optimising a loop: simple operations but a large number of iterations taking too long - MATLAB

I am running a piece of MATLAB code that is taking almost 70 hours, and I'm sure there's a more efficient way of scripting it, but I cannot figure out how.
A single iteration takes about 1 second; the problem, of course, is that the loop runs size(braindip,1) = 186144 times.
braindip = normrnd(0, 50, 186144, 3);
nobrain = normrnd(0, 45, 25014656, 3);
ok = 1;
alpha = 2;
h = waitbar(0, 'Please wait...');
dip_away = nan(size(braindip));
for i = 1:size(braindip,1)
    tic
    h_norm = repmat(braindip(i,:), size(nobrain,1), 1);
    nn = sqrt(sum((h_norm - nobrain).^2, 2));
    if min(nn) > alpha
        dip_away(ok,:) = braindip(i,:);
        ok = ok + 1;
    end
    toc
    waitbar(i / size(braindip,1))
end
Does anyone have a clever suggestion for optimising this loop? Thanks very much!

Assuming you are using MATLAB R2016b or later, which supports implicit expansion (automatic broadcasting), you can change:
h_norm = repmat(braindip(i,:), size(nobrain,1),1);
nn = sqrt(sum((h_norm - nobrain).^2,2));
to
nn = sqrt(sum((braindip(i,:) - nobrain).^2,2));
A second option would be to eliminate the sqrt. Use:
nn_qbd = sum((braindip(i,:) - nobrain).^2,2);
if min(nn_qbd) > alpha_qbd
where alpha_qbd = alpha^2 is obviously calculated only once, in advance. This leads to the third step: there is no need to store nn_qbd in a variable, since you are only interested in the minimum:
nn_qbd_min = min(sum((braindip(i,:) - nobrain).^2,2));
if nn_qbd_min > alpha_qbd
Comparing the original code to the third option, the execution time is roughly cut in half.
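Putting the three changes together, the loop might look like this (a sketch, assuming MATLAB R2016b or later; note that dip_away should be trimmed to the filled rows afterwards):
alpha_qbd = alpha^2;    % squared threshold, computed once in advance
ok = 1;
for i = 1:size(braindip,1)
    nn_qbd_min = min(sum((braindip(i,:) - nobrain).^2, 2));
    if nn_qbd_min > alpha_qbd
        dip_away(ok,:) = braindip(i,:);
        ok = ok + 1;
    end
end
dip_away = dip_away(1:ok-1,:);   % keep only the rows that were filled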

Solving probability problems with MATLAB

How can I simulate this question using MATLAB?
Out of 100 apples, 10 are rotten. We randomly choose 5 apples without replacement. What is the probability that there is at least one rotten apple?
The Expected Answer
0.4162476
My Attempt:
r = 0;
for i = 1:10000
    for c = 1:5
        a = randi(1,100);
        if a < 11
            r = r + 1;
        end
    end
end
r/10000
but it didn't work, so what would be a better way of doing it?
Use randperm to choose randomly without replacement:
A = false(1, 100);           % logical mask: apples 1-10 are the rotten ones
A(1:10) = true;
r = 0;
for k = 1:10000
    a = randperm(100, 5);    % draw 5 distinct apples
    r = r + any(A(a));       % success if at least one is rotten
end
result = r/10000;
Short answer:
Your problem follows a hypergeometric distribution (similar to a binomial distribution, but without replacement). If you have the necessary toolbox, you can simply use the probability density function of the hypergeometric distribution:
r = 1-hygepdf(0,100,10,5) % r = 0.4162
Since P(x>=1) = P(x=1) + P(x=2) + P(x=3) + P(x=4) + P(x=5) = 1-P(x=0)
Of course, here I calculate the exact probability; this is not an experimental result.
To get further:
Note that if you do not have access to hygepdf, you can easily write the function yourself using binomial coefficients:
N = 100; K = 10;
n = 5; k = 0;
r = 1-(nchoosek(K,k)*nchoosek(N-K,n-k))/nchoosek(N,n) % r = 0.4162
You can also use the binomial probability density function; it is a bit more tricky (but also more intuitive):
r = 1-prod(binopdf(0,1,10./(100-[0:4])))
Here we compute the probability of obtaining 0 rotten apples five times in a row; the probability increases at every step, since we remove 1 good juicy apple each time. Then, according to the above explanation, we take 1-P(x=0).
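As a quick sanity check, the exact computations above can be run side by side; this is just a verification sketch, assuming the Statistics toolbox is available for hygepdf and binopdf:
r1 = 1 - hygepdf(0,100,10,5);                  % hypergeometric pdf
r2 = 1 - nchoosek(90,5)/nchoosek(100,5);       % binomial coefficients
r3 = 1 - prod(binopdf(0,1,10./(100-(0:4))));   % sequential draws
fprintf('%.7f %.7f %.7f\n', r1, r2, r3)        % all print 0.4162476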
There are a couple of issues with your code. First, randi(1,100) returns a 100-by-100 matrix of ones (the first argument of randi is the upper bound, not the size), so the comparison never does what you intended; you presumably meant randi(100). Second, implicitly in what you wrote, you replace the apple after you look at it: when you generate the random number, you need to eliminate the possibility of choosing that number again.
I've rewritten your code to include better practices:
clear
n_runs = 1000;
success = zeros(n_runs, 1);
failure = zeros(n_runs, 1);
approach = zeros(n_runs, 1);
for ii = 1:n_runs
    a = randperm(100, 5);    % 5 distinct apples out of 100
    if any(a < 11)           % apples 1-10 are the rotten ones
        success(ii) = 1;
    else                     % no rotten apple in this sample
        failure(ii) = 1;
    end
    approach(ii) = sum(success)/(sum(success)+sum(failure));
end
figure; hold on
plot(approach)
title("r = " + approach(end))
hold off
The results are stored in an array (called approach), rather than a single number being updated every time, which means you can see how quickly you approach the end value of r.
Another good habit is including clear at the beginning of any script, which reduces the possibility of an error occurring due to variables stored in the workspace.

How do I properly "slice" a 4D matrix in Matlab in a parfor loop?

I am trying to make a portion of my code run faster in MATLAB, and I'd like to use parfor. When I try to, I get the following error about one of my variables, D_all:
"The PARFOR loop cannot run because of the way D_all is used".
Here is a sample of my code.
M = 161;
N = 24;
P = 161;
parfor n = 1:M*N*P
    [j,i,k] = ind2sub([N,M,P], n);
    r0 = Rw(n,1:3);
    R0 = repmat(r0, M*N*P, 1);
    delta = sqrt(dXnd(i)^2 + dZnd(k)^2);
    d = R_prime - R0;
    inS = Rw_prime(find(sqrt(sum(d.^2,2)) < 0.8*delta),:);
    if isempty(inS)
        D_all(j,i,k,tj) = D_all(j,i,k,tj-1);
    else
        y0 = r0(2);
        inC = inS(find(inS(:,2)==y0),:);
        dw = sqrt(sum(d(find(sqrt(sum(d.^2,2))<0.8*delta & d(:,2)==0),:).^2,2));
        V_avg = sum(dw.^(-1).*inC(:,4))/sum(dw.^(-1));
        D_all(j,i,k,tj) = V_avg;
    end
end
I'm not very familiar with parallel computing, and I've looked at the guides online and don't really understand how to apply them to my situation. I guess I need to "slice" D_all but I don't know how to do that.
EDIT: I think I understand that the major problem is that D_all is indexed with both tj and tj-1.
EDIT 2: I didn't show this above (it probably would have been helpful), but I defined D_all(:,:,:,1) = V_1, where V_1 corresponds to a previous time step. I tried making multiple variables V_2, V_3, etc. for each step and replacing D_all(j,i,k,tj-1) with V_1(j,i,k). This still led to the same error I am seeing with D_all:
"Valid indices for D_all are restricted for PARFOR loops"

On histogram equalization

So I'd like to do it without histeq, but my code produces a rather peculiar, really whited-out image that doesn't seem much improved over the original picture. Is there a better way to apply the proper histogram?
Cumlative = zeros(256,1);
CumHisty = uint8(zeros(ROWS,COLS));
% First we need to find the probabilities and the frequencies
freq = zeros(256,1);
probab = zeros(256,1);
for i = 1:ROWS
    for j = 1:COLS
        value = I1(i,j);
        freq(value+1) = freq(value+1) + 1;
        probab(value+1) = freq(value+1)/(ROWS*COLS);
    end
end
count = 0;
cumprobab = zeros(256,1);
distrib = zeros(256,1);
for i = 1:size(probab)
    count = count + freq(i);
    Cumlative(i) = count;
    cumprobab(i) = Cumlative(i)/(ROWS*COLS);
    distrib(i) = round(cumprobab(i)*(ROWS*COLS));
end
for i = 1:ROWS
    for j = 1:COLS
        CumHisty(i,j) = distrib(I1(i,j)+1);
    end
end
You probably want to do:
distrib(i) = round(cumprobab(i)*255);
The cumulative probability is a value in [0,1], so it needs to be scaled to the output gray-level range [0,255]. Scaling by ROWS*COLS instead produces values far beyond 255, which is why your image comes out whited out.
EDIT:
Here is a version of your code without the redundant computations, and simplified looping:
freq = zeros(256,1);
for i = 1:numel(I1)
    index = double(I1(i)) + 1;   % double() avoids uint8 saturation at 255
    freq(index) = freq(index) + 1;
end
count = 0;
distrib = zeros(256,1);
for i = 1:length(freq)
    count = count + freq(i);
    cumprobab = count/numel(I1);
    distrib(i) = round(cumprobab*255);
end
CumHisty = zeros(size(I1),'uint8');
for i = 1:numel(I1)
    CumHisty(i) = distrib(double(I1(i))+1);
end
I use linear indexing above; it's simpler (one loop instead of two) and automatically accesses the pixels in the same order in which they are stored. The way you looped (over rows in the outer loop and columns in the inner loop) means you were not accessing the pixels in the optimal order, since MATLAB arrays are stored column-wise (column-major order). Accessing data in the order in which it is stored in memory allows for optimal cache usage (i.e. it is faster).
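As a tiny illustration of column-major storage (not part of the original answer):
A = [1 2; 3 4];
A(:).'    % prints 1 3 2 4: elements are stored column by column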
The above can also be written as:
freq = histcounts(I1,0:256);
distrib = round(cumsum(freq)*(255/numel(I1)));
distrib = uint8(distrib);
CumHisty = distrib(double(I1)+1);
This is faster than the loop code, but within the same order of magnitude. Recent versions of MATLAB are no longer terribly slow doing loops.
I clocked your code at 40 ms, with simplified loops at 19.5 ms, and without loops at 5.8 ms, using an image of size 1280x1024.
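For reference, a minimal way to reproduce such a timing (my own harness, using a stand-in random image rather than the original one):
I1 = randi([0 255], 1024, 1280, 'uint8');   % stand-in test image
tic
freq = histcounts(I1, 0:256);
distrib = uint8(round(cumsum(freq)*(255/numel(I1))));
CumHisty = distrib(double(I1)+1);
toc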

MATLAB Code to Approximate the Exponential Function

Does anyone know how to make the following MATLAB code approximate the exponential function more accurately when dealing with large negative real numbers?
For example when x = 1, the code works well, when x = -100, it returns an answer of 8.7364e+31 when it should be closer to 3.7201e-44.
The code is as follows:
s = 1;
a = 1;
y = 1;
for k = 1:40
    a = a/k;
    y = y*x;
    s = s + a*y;
end
s
Any assistance is appreciated, cheers.
EDIT:
Ok so the question is as follows:
Which mathematical function does this code approximate? (I say the exponential function.) Does it work when x = 1? (Yes.) Unfortunately, using this when x = -100 produces the answer s = 8.7364e+31. Your colleague believes that there is a silly bug in the program, and asks for your assistance. Explain the behaviour carefully and give a simple fix which produces a better result. [You must suggest a modification to the above code, or it's use. You must also check your simple fix works.]
So I somewhat understand the problem: with large numbers, when there are 16 (or more) orders of magnitude between terms, precision is lost. But the solution eludes me.
Thanks
EDIT:
So in the end I went with this:
s = 1;
x = -100;
a = 1;
y = 1;
x1 = 1;
for k=1:40
x1 = x/10;
a = a/k;
y = y*x1;
s = s + a*y;
end
s = s^10;
s
Not sure if it's completely correct but it returns some good approximations.
exp(-100) = 3.720075976020836e-044
s = 3.722053303838800e-044
After further analysis (and unfortunately after submitting the assignment), I realised that increasing the number of iterations, and thus the number of terms, further improves accuracy. In fact, the following was even more accurate:
s = 1;
x = -100;
a = 1;
y = 1;
x1 = x/200;
for k = 1:200
    a = a/k;
    y = y*x1;
    s = s + a*y;
end
s = s^200;
s
Which gives:
exp(-100) = 3.720075976020836e-044
s = 3.720075976020701e-044
As John points out in a comment, you have an error inside the loop: the y = y*k line (in your original version of the code) does not do what you need. Look more carefully at the terms in the series for exp(x).
Anyway, I assume this is why you were given this homework assignment: to learn that series like this don't converge very well for large arguments. Instead, you should consider how to do range reduction.
For example, can you use the identity
exp(x+y) = exp(x)*exp(y)
to your advantage? Suppose you store the value of exp(1) = 2.7182818284590452353...
Now, if I were to ask you to compute the value of exp(1.3), how would you use the above information?
exp(1.3) = exp(1)*exp(0.3)
But we KNOW the value of exp(1) already. In fact, with a little thought, this will let you reduce the range for an exponential down to needing the series to converge rapidly only for abs(x) <= 0.5.
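In code, that reduction might look like this (a sketch with made-up variable names; the final exp(f) stands in for what the truncated series would compute):
x = 1.3;
n = round(x);            % nearest integer, so abs(f) <= 0.5
f = x - n;               % here n = 1 and f = 0.3
e1 = exp(1);             % the stored constant 2.71828...
approx = e1^n * exp(f)   % the series only ever sees the small argument f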
Edit: There is a second way one can do range reduction using a variation of the same identity.
exp(x) = exp(x/2)*exp(x/2) = exp(x/2)^2
Thus, suppose you wish to compute the exponential of a large number, perhaps 12.8. Getting this to converge acceptably fast will take many terms of the simple series, and there will be a great deal of subtractive cancellation, so you won't get good accuracy anyway. However, if we recognize that
12.8 = 2*6.4 = 2*2*3.2 = ... = 16*0.8
then IF you could efficiently compute the exponential of 0.8, then the desired value is easy to recover, perhaps by repeated squaring.
exp(12.8)
ans =
   362217.449611248
a = exp(0.8)
a =
   2.22554092849247
a = a*a;
a = a*a;
a = a*a;
a = a*a
a =
   362217.449611249
exp(0.8)^16
ans =
   362217.449611249
Note that WHENEVER you do range reduction using methods like this, while you may incur numerical problems due to the additional computations necessary, you will usually come out way ahead due to the greatly enhanced convergence of your series.
Why do you think that's the wrong answer? Look at the last term of that sequence, and its size, and tell me why you expected an answer that's close to 0.
My original answer stated that roundoff error was the problem. That will be a problem with this basic approach, but why do you think 40 is enough terms for the appropriate mathematical (as opposed to computer floating-point arithmetic) answer?
100^40 / 40! ~= 10^32, which matches the magnitude of the answer you got.
woodchips has the right idea with range reduction; that's the typical approach people use to implement these kinds of functions very quickly. Once you get that figured out, you deal with the roundoff error of an alternating series by summing adjacent terms within the loop, stepping with k = 1:2:40 (for instance). That doesn't work here until you use woodchips' idea, because for x = -100 the summands grow for a very long time; you need |x| < 1 to guarantee the intermediate terms are shrinking, and thus that such a rewrite will work.
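As a concrete sketch of the range-reduction idea (my own helper function, not code from the thread): halve x until |x| <= 0.5, sum the Taylor series, then square the result back up.
function s = exp_approx(x)
    m = 0;
    while abs(x) > 0.5      % range reduction: exp(x) = exp(x/2)^2
        x = x/2;
        m = m + 1;
    end
    s = 1;
    term = 1;
    for k = 1:20            % the series converges quickly for |x| <= 0.5
        term = term*x/k;    % term is x^k/k!
        s = s + term;
    end
    for p = 1:m             % undo the reduction by repeated squaring
        s = s^2;
    end
end
For x = -100 this reduces the argument to -100/256 before summing, and exp_approx(-100) should agree with exp(-100) to roughly 13 significant digits.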

How to run multiple trials in this Matlab program without iteration step?

I have a simple MATLAB program that uses a set of random number lists and runs a series of trials using those numbers. Right now, the trials are run iteratively using the code below. How could this code be modified to eliminate the need for that iterative step? The program would be a lot more efficient if it could be properly vectorized.
size = 1000;     % note: this shadows the built-in size function
trials = 1000;
grid = zeros(size,size);
rx1 = randi(size,trials,1);
ry1 = randi(size,trials,1);
rx2 = randi(size,trials,1);
ry2 = randi(size,trials,1);
xmin = min(rx1,rx2);
xmax = max(rx1,rx2);
ymin = min(ry1,ry2);
ymax = max(ry1,ry2);
% This is the loop that I want to eliminate
for n = 1:trials
    grid(ymin(n):ymax(n),xmin(n):xmax(n)) = grid(ymin(n):ymax(n),xmin(n):xmax(n)) + 1;
end
figure
mesh(grid);
I would use a trick inspired by integral images: setting
grid(ymin(n):ymax(n),xmin(n):xmax(n)) = 1;
is equivalent to placing markers at the four corners,
grid(ymin(n),xmin(n)) = 1;
grid(ymin(n),xmax(n)+1) = -1;
grid(ymax(n)+1,xmin(n)) = -1;
grid(ymax(n)+1,xmax(n)+1) = 1;
and taking cumulative sums at the end:
grid = cumsum(cumsum(grid,1),2);
So for your problem I would do:
grid = full( sparse(ymin,xmin,1,size+1,size+1) ...
           + sparse(ymax+1,xmax+1,1,size+1,size+1) ...
           - sparse(ymin,xmax+1,1,size+1,size+1) ...
           - sparse(ymax+1,xmin,1,size+1,size+1));
grid = cumsum(cumsum(grid,1),2);
grid = grid(1:end-1,1:end-1);
(sparse conveniently accumulates values for repeated index pairs, so overlapping rectangles are handled correctly.)
I've tested it on my laptop, and the results are the same:
Elapsed time for code with loop is 1.802788 seconds.
Elapsed time for vectorized code is 0.033834 seconds.
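To convince yourself the two versions agree, a quick check (my own snippet, reusing the variables above) is to build the grid both ways and compare:
gridLoop = zeros(size,size);   % 'size' is the scalar defined in the question
for n = 1:trials
    gridLoop(ymin(n):ymax(n),xmin(n):xmax(n)) = gridLoop(ymin(n):ymax(n),xmin(n):xmax(n)) + 1;
end
isequal(gridLoop, grid)        % returns logical 1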