Optimizing a nested for loop in MATLAB

I'm trying to optimize the performance (i.e. speed) of my code. I'm new to vectorization and tried to vectorize it myself, but without success (I also tried bsxfun, parfor, and other kinds of vectorization). Can anyone help me optimize this code, with a short description of how to do it?
% for simplicity, create dummy data
Z = rand(250,1);
z1 = rand(100,100);
z2 = rand(100,100);
% updated with the missing parameters, thanks @Bas Swinckels and @Daniel R
j = 2;
n = length(Z);
h = 0.4;
tic
[K1, K2] = size(z1);
result = zeros(K1,K2);
for l = 1 : K1
for m = 1: K2
result(l,m) = sum(K_h(h, z1(l,m), Z(j+1:n)).*K_h(h, z2(l,m), Z(1:n-j)));
end
end
result = result ./ (n-j);
toc
The K_h.m function is the boundary kernel and defined as (x is scalar and y can be vector)
function res = K_h(h, x,y)
res = 0;
if ( x >= 0 & x < h)
denominator = integral(@kernelFunc,-x./h,1);
res = 1./h.*kernelFunc((x-y)/h)/denominator;
elseif (x>=h & x <= 1-h)
res = 1./h*kernelFunc((x-y)/h);
elseif (x > 1 - h & x <= 1)
denominator = integral(@kernelFunc,-1,(1-x)./h);
res = 1./h.*kernelFunc((x-y)/h)/denominator;
else
fprintf('x is out of [0,1]');
return;
end
end
It takes a long time to obtain the results: Elapsed time is 13.616413 seconds.
Thank you. Any comments are welcome.
P/S: Sorry for my lack of English

Some observations: Z(j+1:n) and Z(1:n-j) are constant inside the loop, so do the indexing once before the loop. Also, the loop body is simple: every result(l, m) depends only on z1(l, m) and z2(l, m). This is an ideal case for arrayfun. A solution might look something like this (untested):
tic
% do constant stuff outside of the loop
Zhigh = Z(j+1:n);
Zlow = Z(1:n-j);
result = arrayfun(@(zz1, zz2) sum(K_h(h, zz1, Zhigh).*K_h(h, zz2, Zlow)), z1, z2);
result = result ./ (n-j);
toc
I am not sure if this will be a lot faster, since I guess the running time will not be dominated by the for-loops, but by all the work done inside the K_h function.

Related

Serious performance issue with iterating simulations

I recently stumbled upon a performance problem while implementing a simulation algorithm. I managed to find the bottleneck function (specifically, it's the internal call to arrayfun that slows everything down):
function sim = simulate_frequency(the_f,k,n)
r = rand(1,n); %
x = arrayfun(@(x) find(x <= the_f,1,'first'),r);
sim = (histcounts(x,[1:k Inf]) ./ n).';
end
It is being used in other parts of code as follows:
h0 = zeros(1,sims);
for i = 1:sims
p = simulate_frequency(the_f,k,n);
h0(i) = max(abs(p - the_p));
end
Here are some possible values:
% Test Case 1
sims = 10000;
the_f = [0.3010; 0.4771; 0.6021; 0.6990; 0.7782; 0.8451; 0.9031; 0.9542; 1.0000];
k = 9;
n = 95;
% Test Case 2
sims = 10000;
the_f = [0.0413; 0.0791; 0.1139; 0.1461; 0.1760; 0.2041; 0.2304; 0.2552; 0.2787; 0.3010; 0.3222; 0.3424; 0.3617; 0.3802; 0.3979; 0.4149; 0.4313; 0.4471; 0.4623; 0.4771; 0.4913; 0.5051; 0.5185; 0.5314; 0.5440; 0.5563; 0.5682; 0.5797; 0.5910; 0.6020; 0.6127; 0.6232; 0.6334; 0.6434; 0.6532; 0.6627; 0.6720; 0.6812; 0.6901; 0.6989; 0.7075; 0.7160; 0.7242; 0.7323; 0.7403; 0.7481; 0.7558; 0.7634; 0.7708; 0.7781; 0.7853; 0.7923; 0.7993; 0.8061; 0.8129; 0.8195; 0.8260; 0.8325; 0.8388; 0.8450; 0.8512; 0.8573; 0.8633; 0.8692; 0.8750; 0.8808; 0.8864; 0.8920; 0.8976; 0.9030; 0.9084; 0.9138; 0.9190; 0.9242; 0.9294; 0.9344; 0.9395; 0.9444; 0.9493; 0.9542; 0.9590; 0.9637; 0.9684; 0.9731; 0.9777; 0.9822; 0.9867; 0.9912; 0.9956; 1.000];
k = 90;
n = 95;
The scalar sims must be in the range 1000 to 1000000. The vector of cumulative frequencies the_f never contains more than 100 elements. The scalar k represents the number of elements in the_f. Finally, the scalar n represents the number of elements in the empirical sample vector, and can be very large (up to 10000 elements, as far as I can tell).
Any clue about how to improve the computation time of this process?
This seems to be slightly faster for me in the second test case, not the first. The time differences might be larger for longer the_f and larger values of n.
function sim = simulate_frequency(the_f,k,n)
r = rand(1,n); %
[row,col] = find(r <= the_f); % Implicit singleton expansion going on here!
[~,ind] = unique(col,'first');
x = row(ind);
sim = (histcounts(x,[1:k Inf]) ./ n).';
end
I'm using implicit singleton expansion in r <= the_f; use bsxfun if you have an older version of MATLAB (before R2016b).
find then returns the row and column of every location where r is less than or equal to the_f. unique then gives, for each column, the index of its first such entry.
Credit: Andrei Bobrov over on MATLAB Answers
Another option (derived from this other answer) is a bit shorter but also a bit more obscure IMO:
mask = r <= the_f;
[x,~] = find(mask & (cumsum(mask,1)==1));
If I want performance, I would avoid arrayfun. Even this for loop is faster:
function sim = simulate_frequency(the_f,k,n)
r = rand(1,n); %
for i = 1:numel(r)
x(i) = find(r(i)<the_f,1,'first');
end
sim = (histcounts(x,[1:k Inf]) ./ n).';
end
Running 10000 sims with the first set of the sample data gives the following timing.
Your arrayfun function:
>Elapsed time is 2.848206 seconds.
The for loop function:
>Elapsed time is 0.938479 seconds.
Inspired by Cris Luengo's answer, I suggest below:
function sim = simulate_frequency(the_f,k,n)
r = rand(1,n); %
x = sum(r > the_f)+1;
sim = (histcounts(x,[1:k Inf]) ./ n)';
end
Time:
>Elapsed time is 0.264146 seconds.
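The reason `sum(r > the_f)+1` can stand in for `find(r <= the_f,1,'first')` is that the_f is sorted in ascending order, so the number of entries strictly below r is exactly one less than the index of the first entry that is at least r. A minimal Python sketch of that equivalence follows; the helper names and the sample values are illustrative assumptions, not part of the original code, and the binary-search variant is an extra option that none of the answers use.

```python
from bisect import bisect_left


def first_index_linear(r, cum_freq):
    """1-based index of the first entry with r <= entry (the find(...,1,'first') idea)."""
    for i, f in enumerate(cum_freq, start=1):
        if r <= f:
            return i
    raise ValueError("r exceeds the last cumulative frequency")


def first_index_count(r, cum_freq):
    """Same index, obtained by counting entries strictly below r (sum(r > the_f) + 1)."""
    return sum(r > f for f in cum_freq) + 1


def first_index_bisect(r, cum_freq):
    """Binary-search variant; only valid because cum_freq is sorted ascending."""
    return bisect_left(cum_freq, r) + 1


# Cumulative frequencies from Test Case 1; the sample values are made up.
the_f = [0.3010, 0.4771, 0.6021, 0.6990, 0.7782, 0.8451, 0.9031, 0.9542, 1.0000]
samples = [0.0, 0.25, 0.3010, 0.5, 0.95, 0.9999]
indices = [first_index_count(r, the_f) for r in samples]
```

The counting form does a fixed amount of work per sample and needs no early exit, which is what makes it easy to vectorize over all of r at once.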
You can use histcounts with r as its input:
r = rand(1,n);
sim = (histcounts(r,[-inf ;the_f]) ./ n).';
If histc is used instead of histcounts the whole simulation can be vectorized:
r = rand(n,sims);
p = histc(r, [-inf; the_f],1);
p = [p(1:end-2,:) ;sum(p(end-1:end,:))]./n;
h0 = max(abs(p-the_p(:))); %h0 = max(abs(bsxfun(@minus,p,the_p(:))));

MATLAB - Finding Zero of Sum of Functions by Iteration

I am trying to sum a function and then attempting to find the root of said function. That is, for example, take:
Consider that I have a matrix,X, and vector,t, of values: X(2*n+1,n+1), t(n+1)
for j = 1:n+1
sum = 0;
for i = 1:2*j+1
f = @(g)exp[-exp[X(i,j)+g]*(t(j+1)-t(j))];
sum = sum + f;
end
fzero(sum,0)
end
That is,
I want to evaluate at
j = 1
f = @(g)exp[-exp[X(1,1)+g]*(t(j+1)-t(j))]
fzero(f,0)
j = 2
f = @(g)exp[-exp[X(1,2)+g]*(t(j+1)-t(j))] + exp[-exp[X(2,2)+g]*(t(j+1)-t(j))] + exp[-exp[X(3,2)+g]*(t(j+1)-t(j))]
fzero(f,0)
j = 3
etc...
However, I have no idea how to actually implement this in practice.
Any help is appreciated!
PS - I do not have the symbolic toolbox in Matlab.
I suggest making use of matlab's array operations:
zerovec = zeros(1,n+1); %preallocate
for k = 1:n+1
f = @(y) sum(exp(-exp(X(1:2*k+1,k)+y)*(t(k+1)-t(k))));
zerovec(k) = fzero(f,0);
end
However, note that a sum of exponentials of real arguments is always positive, so it has no real root; it could only vanish for complex arguments, which fzero will never find. The question is therefore somewhat moot.
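To make the "no real root" point concrete, here is a small Python sketch that evaluates the summed function at a few values of g. The column values and the time step are made-up assumptions chosen only for illustration. Every term exp(-exp(x+g)*dt) lies strictly between 0 and 1 when dt is positive, so the sum decreases towards zero as g grows but stays positive.

```python
import math


def summed(g, x_col, dt):
    """Sum of exp(-exp(x + g) * dt) over one column of X."""
    return sum(math.exp(-math.exp(x + g) * dt) for x in x_col)


# Hypothetical data: one column of X and one time step t(j+1) - t(j).
x_col = [0.1, -0.4, 0.7]
dt = 0.5

g_values = (-3.0, -1.0, 0.0, 1.0, 3.0)
values = [summed(g, x_col, dt) for g in g_values]
```

In floating point the terms eventually underflow to exactly 0.0 for very large g, so a numerical solver may report a "root" there, but that is an artifact of the arithmetic rather than a zero of the function.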
Another solution is to write a function:
function [ sum ] = func(j,g,t,X)
sum = 0;
for i = 0:2*j
f = exp(-exp(X(i+1,j+1)+g)*(t(j+3)-t(j+2)));
sum = sum + f;
end
end
Then loop your solver
for j=0:n
fun = @(g)func(j,g,t,X);
fzero(fun,0)
end

Plotting own function in scilab

Hey, I have an issue with plotting my own function in Scilab.
I want to plot the following function
function f = test(n)
if n < 0 then
f(n) = 0;
elseif n <= 1 & n >= 0 then
f(n) = sin((%pi * n)/2);
else
f(n) = 1;
end
endfunction
followed by the console command
x = [-2:0.1:2];
plot(x, test(x));
I loaded the function and got the following error:
!--error 21
Invalid Index.
at line 7 of function lala called by :
plot(x, test(x))
Can you please tell me how I can fix this?
So i now did it with a for loop. I don't think it is the best solution but i can't get the other ones running atm...
function f = test(n)
f = zeros(size(n));
t = length(n);
for i = 1:t
if n(i) < 0 then
f(i) = 0;
elseif n(i) <= 1 & n(i) >= 0
f(i) = sin((%pi * n(i)/2));
elseif n(i) > 1 then
f(i) = 1;
end
end
endfunction
I guess I need to find a source about this issue and get used to the features and perks MATLAB/Scilab have to offer :)
Thanks for the help tho
The original sin is
function f = test(n)
(...)
f(n) = (...)
(...)
endfunction
f is supposed to be the result of the function. Therefore, f(n) is not "the value that the function test takes on argument n", but "the n-th element of f". Scilab then handles this however it can; on your test case, it tries to access a non-integer index, which results in an error. Your loop solution solves the problem.
Replacing all three f(n) by f in your first formulation makes it into something that works... as long as the argument is a scalar (not an array).
If you want test to be able to accept vector arguments without making a loop, the problem is that n < 0 is a vector of the same size as n. My solution would use logical arrays for indexing each of the three conditions:
function f = test(n)
f = zeros(size(n));
negative = (n<0);//parentheses are optional, but I like them for readability
greater_than_1 = (n>1);
others = ~negative & ~greater_than_1;
f(negative)=0;
f(greater_than_1)=1;
f(others) = sin(%pi/2*n(others));
endfunction

Create faster Fibonacci function for n > 100 in MATLAB / octave

I have a function that tells me the nth number in a Fibonacci sequence. The problem is that it becomes very slow for larger n. Does anyone know how I can fix this?
function f = rtfib(n)
if (n==1)
f= 1;
elseif (n == 2)
f = 2;
else
f =rtfib(n-1) + rtfib(n-2);
end
The Results,
tic; rtfib(20), toc
ans = 10946
Elapsed time is 0.134947 seconds.
tic; rtfib(30), toc
ans = 1346269
Elapsed time is 16.6724 seconds.
I can't even get a value after 5 mins doing rtfib(100)
PS: I'm using octave 3.8.1
If time is important (not programming techniques):
function f = fib(n)
if (n == 1)
f = 1;
elseif (n == 2)
f = 2;
else
fOld = 2;
fOlder = 1;
for i = 3 : n
f = fOld + fOlder;
fOlder = fOld;
fOld = f;
end
end
end
tic;fib(40);toc; ans = 165580141; Elapsed time is 0.000086 seconds.
You could even use uint64. n = 92 is the most you can get from uint64:
tic;fib(92);toc; ans = 12200160415121876738; Elapsed time is 0.001409 seconds.
Because,
fib(93) = 19740274219868223167 > intmax('uint64') = 18446744073709551615
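The uint64 limit can be checked independently with arbitrary-precision integers. The Python sketch below uses the question's convention fib(1) = 1, fib(2) = 2; the helper `largest_n_fitting` is a hypothetical name introduced only for this check.

```python
UINT64_MAX = 2**64 - 1  # same value as intmax('uint64')


def fib(n):
    """Iterative Fibonacci with the question's convention fib(1) = 1, fib(2) = 2."""
    older, old = 1, 2
    if n == 1:
        return older
    for _ in range(3, n + 1):
        older, old = old, older + old
    return old


def largest_n_fitting(limit):
    """Largest n such that fib(n) does not exceed limit."""
    n = 1
    while fib(n + 1) <= limit:
        n += 1
    return n


largest = largest_n_fitting(UINT64_MAX)
```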
Edit
In order to get fib(n) up to n = 183, It is possible to use two uint64 as one number,
with a special function for summation,
function [] = fib(n)
fL = uint64(0);
fH = uint64(0);
MaxNum = uint64(1e19);
if (n == 1)
fL = 1;
elseif (n == 2)
fL = 2;
else
fOldH = uint64(0);
fOlderH = uint64(0);
fOldL = uint64(2);
fOlderL = uint64(1);
for i = 3 : n
[fL q] = LongSum (fOldL , fOlderL , MaxNum);
fH = fOldH + fOlderH + q;
fOlderL = fOldL;
fOlderH = fOldH;
fOldL = fL;
fOldH = fH;
end
end
sprintf('%u',fH,fL)
end
LongSum is:
function [s q] = LongSum (a, b, MaxNum)
if a + b >= MaxNum
q = 1;
if a >= MaxNum
s = a - MaxNum;
s = s + b;
elseif b >= MaxNum
s = b - MaxNum;
s = s + a;
else
s = MaxNum - a;
s = b - s;
end
else
q = 0;
s = a + b;
end
Note some complications in LongSum might seem unnecessary, but they are not!
(All the deal with inner if is that I wanted to avoid s = a + b - MaxNum in one command, because it might overflow and store an irrelevant number in s)
Results
tic;fib(159);toc; Elapsed time is 0.009631 seconds.
ans = 1226132595394188293000174702095995
tic;fib(183);toc; Elapsed time is 0.009735 seconds.
fib(183) = 127127879743834334146972278486287885163
However, you have to be careful about sprintf.
I also did it with three uint64, and I could get up to,
tic;fib(274);toc; Elapsed time is 0.032249 seconds.
ans = 1324695516964754142521850507284930515811378128425638237225
(It's pretty much the same code, but I could share it if you are interested).
Note that we have fib(1) = 1, fib(2) = 2 according to the question, while fib(1) = 1, fib(2) = 1 is more common. The first 300 Fibonacci numbers are listed here (thanks to @Rick T).
It seems the Fibonacci series follows the golden ratio, as discussed in some detail here.
This was used in this MATLAB File Exchange code, and I am writing just the essence of it here:
sqrt5 = sqrt(5);
alpha = (1 + sqrt5)/2; %// alpha = 1.618... is the golden ratio
fibs = round( alpha.^n ./ sqrt5 )
You can feed an integer into n for the nth number in Fibonacci Series or feed an array 1:n to have the whole series.
Please note that this method holds good till n = 69 only.
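The accuracy limit comes from double-precision rounding, and it can be probed by comparing the closed form against exact integer arithmetic. The Python sketch below is an illustration, not the File Exchange code; it uses the common convention F(1) = F(2) = 1, and the exact n at which the two first disagree may differ slightly between environments.

```python
import math


def fib_exact(n):
    """Exact Fibonacci number with F(1) = F(2) = 1, using Python integers."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_closed_form(n):
    """Rounded golden-ratio formula evaluated in double precision."""
    sqrt5 = math.sqrt(5)
    alpha = (1 + sqrt5) / 2
    return round(alpha ** n / sqrt5)


def first_mismatch(limit=200):
    """Smallest n up to limit where the closed form is wrong, or None."""
    for n in range(1, limit + 1):
        if fib_closed_form(n) != fib_exact(n):
            return n
    return None


breakdown = first_mismatch()
```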
If you have access to the Symbolic Math Toolbox in MATLAB, you could always just call the Fibonacci function from MuPAD:
>> fib = @(n) evalin(symengine, ['numlib::fibonacci(' num2str(n) ')'])
>> fib(274)
ans =
818706854228831001753880637535093596811413714795418360007
It is pretty fast:
>> timeit(@() fib(274))
ans =
0.0011
Plus you can go for numbers as large as you want (limited only by how much RAM you have), and it is still blazing fast:
% see if you can beat that!
>> tic
>> x = fib(100000);
>> toc % Elapsed time is 0.004621 seconds.
% result has more than 20 thousand digits!
>> length(char(x)) % 20899
Here is the full value of fib(100000): http://pastebin.com/f6KPGKBg
To reach large numbers you can use symbolic computation. The following works in Matlab R2010b.
syms x y %// declare variables
z = x + y; %// define formula
xval = '0'; %// initialize x, y values
yval = '1';
for n = 2:300
zval = subs(z, [x y], {xval yval}); %// update z value
disp(['Iteration ' num2str(n) ':'])
disp(zval)
xval = yval; %// shift values
yval = zval;
end
You can do it in O(log n) time with matrix exponentiation:
X = [0 1
1 1]
X^n will give you the nth fibonacci number in the lower right-hand corner; X^n can be represented as the product of several matrices X^(2^i), so for example X^11 would be X^1 * X^2 * X^8, i <= log_2(n). And X^8 = (X^4)^2, etc, so at most 2*log(n) matrix multiplications.
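A minimal sketch of the repeated-squaring idea, written in Python so that the integers cannot overflow. The function names are illustrative assumptions. It reads the lower right-hand entry of X^n, which matches the question's convention fib(1) = 1, fib(2) = 2.

```python
def mat_mul(a, b):
    """Product of two 2x2 integer matrices given as nested lists."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def mat_pow(x, n):
    """x**n by repeated squaring: one squaring per bit of n, one extra product per set bit."""
    result = [[1, 0], [0, 1]]  # identity matrix
    base = x
    while n > 0:
        if n & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        n >>= 1
    return result


def fib_matrix(n):
    """Lower right-hand entry of [0 1; 1 1]^n."""
    return mat_pow([[0, 1], [1, 1]], n)[1][1]
```

For n = 11 (binary 1011) the loop multiplies together X^1, X^2 and X^8, which is the decomposition described above.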
One performance issue is that you use a recursive solution. Going for an iterative method will spare you of the argument passing for each function call. As Olivier pointed out, it will reduce the complexity to linear.
You can also look here. Apparently there's a formula that computes the n'th member of the Fibonacci sequence. I tested it for up to 50'th element. For higher n values it's not very accurate.
The implementation of a fast Fibonacci computation in Python could be as follows. I know this is Python not MATLAB/Octave, however it might be helpful.
Basically, rather than calling the same Fibonacci function over and over again with O(2^n) complexity, we store the Fibonacci sequence in a list/array with O(n) complexity:
#!/usr/bin/env python3.5
class Fib:
    def __init__(self,n):
        self.n=n
        self.fibList=[None]*(self.n+1)
        self.populateFibList()
    def populateFibList(self):
        for i in range(len(self.fibList)):
            if i==0:
                self.fibList[i]=0
            if i==1:
                self.fibList[i]=1
            if i>1:
                self.fibList[i]=self.fibList[i-1]+self.fibList[i-2]
    def getFib(self):
        print('Fibonacci sequence up to ', self.n, ' is:')
        for i in range(len(self.fibList)):
            print(i, ' : ', self.fibList[i])
        return self.fibList[self.n]
def isNonnegativeInt(value):
    try:
        if int(value)>=0:#throws an exception if non-convertible to int: returns False
            return True
        else:
            return False
    except:
        return False
n=input('Please enter a non-negative integer: ')
while isNonnegativeInt(n)==False:
    n=input('A non-negative integer is needed: ')
n=int(n) # convert string to int
print('We are using ', n, 'based on what you entered')
print('Fibonacci result is ', Fib(n).getFib())
Output for n=12 would be like:
I tested the runtime for n=100, 300, 1000 and the code is really fast, I don't even have to wait for the output.
One simple way to speed up the recursive implementation of a Fibonacci function is to realize that, substituting f(n-1) by its definition,
f(n) = f(n-1) + f(n-2)
= f(n-2) + f(n-3) + f(n-2)
= 2*f(n-2) + f(n-3)
This simple transformation greatly reduces the number of steps taken to compute a number in the series.
If we start with OP's code, slightly corrected:
function result = fibonacci(n)
switch n
case 0
result = 0;
case 1
result = 1;
case 2
result = 1;
case 3
result = 2;
otherwise
result = fibonacci(n-2) + fibonacci(n-1);
end
And apply our transformation:
function result = fibonacci_fast(n)
switch n
case 0
result = 0;
case 1
result = 1;
case 2
result = 1;
case 3
result = 2;
otherwise
result = fibonacci_fast(n-3) + 2*fibonacci_fast(n-2);
end
Then we see a 30x speed improvement for computing the 20th number in the series (using Octave):
>> tic; for ii=1:100, fibonacci(20); end; toc
Elapsed time is 12.4393 seconds.
>> tic; for ii=1:100, fibonacci_fast(20); end; toc
Elapsed time is 0.448623 seconds.
Of course Rashid's non-recursive implementation is another 60x faster still: 0.00706792 seconds.
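One way to see where the speed-up comes from is to count function calls instead of measuring time. The Python sketch below mirrors the two switch-based functions above; the `counter` argument is an assumption added purely for instrumentation and is not part of the original code.

```python
def fibonacci(n, counter):
    """Plain recursion with the same base cases as the corrected code."""
    counter[0] += 1
    if n == 0:
        return 0
    if n in (1, 2):
        return 1
    if n == 3:
        return 2
    return fibonacci(n - 2, counter) + fibonacci(n - 1, counter)


def fibonacci_fast(n, counter):
    """Recursion after substituting f(n-1) = f(n-2) + f(n-3)."""
    counter[0] += 1
    if n == 0:
        return 0
    if n in (1, 2):
        return 1
    if n == 3:
        return 2
    return fibonacci_fast(n - 3, counter) + 2 * fibonacci_fast(n - 2, counter)


naive_calls = [0]
fast_calls = [0]
naive_value = fibonacci(20, naive_calls)
fast_value = fibonacci_fast(20, fast_calls)
```

Both variants still make two recursive calls per level, so the growth remains exponential; the transformed version simply descends faster because its arguments shrink by 2 and 3 instead of 1 and 2.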

matlab partial equation with 2 intervals

Here is the code I have so far :
function [u]=example222(xrange,trange,uinit,u0bound,uLbound);
n = length(xrange);
m = length(trange);
u = zeros(n,m); %%%
Dx = (xrange(n)-xrange(1))/(n-1);
Dt = (trange(m)-trange(1))/(m-1);
u(:,1)=uinit';
u(1,:)=u0bound;
u(n,:)=uLbound;
gegu=0.08;
alpha=0;
koefa=(-Dt*gegu/(2*Dx));
koefb=(alpha*Dt/(Dx)^2);
u
% first time step
for i = 2:n-1
u(i,2) = u(i,1)+2*koefa*(u(i+1,1)-u(i-1,1))+(koefb/2)*(u(i-1,1)-2*u(i,1)+u(i+1,1))
end;
% subsequent time steps
for j = 2:m-1
for i = 2:n-1
u(i,j+1)=u(i,j-1)+koefa*(u(i+1,j)-u(i-1,j))+koefb*(u(i-1,j)-2*u(i,j)+u(i+1,j))
end;
end;
______________________________________
x = (0:0.1:1);
t = (0:0.8:8) ;
u0=zeros;uL=zeros;
uinit=1-(10*x-1).^2;
[u]=example222(x,t,uinit,u0,uL);
surf(x,t,u,'EdgeColor','black')
Next thing I need to do is to implement an interval for uinit=1-(10*x-1).^2
IF x-0.08*t < 0.2. then => uinit=1-(10*x-1).^2;
else uinit=0;
Can someone help me with that please. I was trying to do it with if clauses and loops and couldn't make it work. Help is greatly appreciated.
There are many ways to define a piecewise function. My usual method is as follows:
uinit = zeros(size(x));
I = x<(0.08*t+0.2); %// Find the indices where x<(0.08*t+0.2)
uinit(I) = 1-(10*x(I)-1).^2;
This case is easier than what you may often have since you want all the other values of uinit to be zero. If you had another function in the other region (say x^2) you could also do:
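uinit(~I) = x(~I).^2;
The operation ~I is the logical complement of I: it is false where I is true and true where I is false, so it selects the remaining region. (Indexing with 1-I would not work, because that produces a numeric array containing zeros, which are not valid indices.)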