Vectorizing a for loop on a square matrix in MATLAB

I have a big square matrix (refConnect) with approx 500000 elements.
I need to perform this operation:
tmp = find(referenceCluster == 67);
for j=1:length(tmp)
refConnect(tmp(j),tmp)=1;
end
I wonder if there is a simple way to vectorise this so I can avoid the for loop which is taking forever.
Thanks for any help.
Cheers

It seems you can't significantly decrease the execution time.
Try measuring the execution time with this test function:
function test_speed(N, M)
if nargin < 1
N = 707;
end
if nargin < 2
M = 2;
end
refConnectIn = rand(N, N);
referenceCluster = randi(M, N, 1);
refConnectA = refConnectIn;
tic
tmpA = find(referenceCluster == 1);
for j=1:length(tmpA)
refConnectA(tmpA(j),tmpA) = 1;
end
toc
refConnectB = refConnectIn;
tic
tmpB = referenceCluster == 1;
refConnectB(tmpB, tmpB) = 1;
toc
if isequal(refConnectA, refConnectB)
disp('Result are equals');
else
disp('Result are UNEQUALS!');
end
With default parameters you get:
>> test_speed
Elapsed time is 0.002865 seconds.
Elapsed time is 0.001575 seconds.
Result are equals
Note that the vectorized code (case B) can be slower for large M (when M is large, very few entries of referenceCluster equal 1, so the loop has almost nothing to do):
>> test_speed(707,1000)
Elapsed time is 0.001623 seconds.
Elapsed time is 0.002219 seconds.
Result are equals
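As a minimal sketch of why the two cases in the test function agree, the toy example below (the 5-element referenceCluster is made up for illustration) runs the original loop and the single indexed assignment side by side. Indexing with tmp for both rows and columns addresses every (row, column) pair drawn from tmp, which is exactly what the loop fills in one row at a time.

```matlab
% Toy data (assumed for illustration): 5 points, three of them in cluster 67
referenceCluster = [67; 3; 67; 5; 67];
refConnectLoop = zeros(5);
refConnectVec  = zeros(5);

tmp = find(referenceCluster == 67);

% Original loop: one row at a time
for j = 1:length(tmp)
    refConnectLoop(tmp(j), tmp) = 1;
end

% Vectorized: every (row, column) pair drawn from tmp in one assignment
refConnectVec(tmp, tmp) = 1;

% Displays 1 (true) when both matrices are identical
disp(isequal(refConnectLoop, refConnectVec))
```

The find call is optional here; a logical mask such as referenceCluster == 67 can be used as the index directly, as in case B above.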

Related

How to speed up iterative function call in MatLab?

In MatLab I have to call the cdf of the t distribution (tcdf) iteratively (since the next input value depends on the previous output of tcdf), which unfortunately slows down my code massively.
tic
z = NaN(1e5,1);
z(1) = 1;
x = 2;
for ii = 2:1e5
x = tcdf(z(ii-1),x);
z(ii) = z(ii-1)*x;
end
toc
Elapsed time is 4.717087 seconds.
Is there a way to speed this up somehow?
For comparison:
tic
z = randn(1e5,1);
tcdf(z,5);
toc
Elapsed time is 0.091353 seconds.
Move the random number generation outside the loop as suggested below
numVal = 1e5
z = randn(numVal,1);
for ii = 2:numVal
z(ii) = z(ii-1) + z(ii);
end
tcdf(z,5);

Parallel implementation for Jacobi algorithm takes too much time

I implemented a parallel version of Jacobi's method for solving a linear system. In some tests I noticed that the parallel function takes far longer to execute than the sequential one. This is strange, because Jacobi's method should be faster when executed in parallel.
I think I'm doing something wrong in the code:
function [x,niter,resrel] = Parallel_Jacobi(A,b,TOL,MAXITER)
[n, m] = size(A);
D = 1./spdiags(A,0);
B = speye(n)-A./spdiags(A,0);
C= D.*b;
x0=sparse(zeros(length(A),1));
spmd
cod_vett=codistributor1d(1,codistributor1d.unsetPartition,[n,1]);
cod_mat=codistributor1d(1,codistributor1d.unsetPartition,[n,m]);
B= codistributed(B,cod_mat);
C= codistributed(C,cod_vett);
x= codistributed(B*x0 + C,cod_vett);
Niter = 1;
TOLX = TOL;
while(norm(x-x0,Inf) > norm(x0,Inf)*TOLX && Niter < MAXITER)
if(TOL*norm(x,Inf) > realmin)
TOLX = norm(x,Inf)*TOL;
else
TOLX = realmin;
end
x0 = x;
x = B*x0 + C;
Niter=Niter+1;
end
end
Niter=Niter{1};
x=gather(x);
end
The tests are below:
%sequential Jacobi
format long;
A = gallery('poisson',20);
tic;
x= jacobi(A,ones(400,1),1e-6,2000000);
toc;
Elapsed time is 0.009054 seconds.
%parallel Jacobi
format long;
A = gallery('poisson',20);
tic;
x= Parallel_Jacobi(A,ones(400,1),1e-6,2000000);
toc;
Elapsed time is 11.484130 seconds.
I timed the parpool function with 1,2,3 and 4 workers (I have a quad core processor) and the result is the following:
%Test
format long;
A = gallery('poisson',20);
delete(gcp('nocreate'));
tic
%parpool(1/2/3/4) means that i executed 4 tests that differ only for the
%argument in the function: first parpool(1), second parpool(2) and so on.
parpool(1/2/3/4);
toc
tic;
x= Parallel_Jacobi(A,ones(400,1),1e-6,2000000);
toc;
4 workers: parpool=13.322899 seconds, function=23.772271
3 workers: parpool=10.911769 seconds, function=16.402633
2 workers: parpool=9.371729 seconds, function=12.945154
1 worker: parpool=8.460357 seconds, function=7.982958 .
The fewer workers, the better the time, which is, as @Adriaan said, likely due to overhead.
Does this mean that, in this case, the sequential function is always faster than the parallel function? Or is there a better way to implement the parallel one?
In this question it is said that the performance in parallel is better when the number of iterations is high. In my case, with this test, there are only 32 iterations.
The sequential implementation of Jacobi's method is this:
function [x,niter,resrel] = jacobi(A,b,TOL,MAXITER)
n = size(A,1);
D = 1./spdiags(A,0);
B = speye(n)-A./spdiags(A,0);
C= D.*b;
x0=sparse(zeros(length(A),1));
x = B*x0 + C;
Niter = 1;
TOLX = TOL;
while(norm(x-x0,Inf) > norm(x0,Inf)*TOLX && Niter < MAXITER)
if(TOL*norm(x,Inf) > realmin)
TOLX = norm(x,Inf)*TOL;
else
TOLX = realmin;
end
x0 = x;
x = B*x0 + C;
Niter=Niter+1;
end
end
I timed the code with the timeit function and these are the results (the inputs are the same as before):
4 workers: 11.693473075964102
3 workers: 9.221281335264003
2 workers: 9.150417240778545
1 worker: 6.047181992020434
sequential: 0.002893932969688

Computational complexity of expanding a vector using its index in MATLAB

Suppose I have an n-by-1 vector A and an m-by-1 vector b of integers, where max(b)<=n and min(b)>0. Can anyone tell me the computational complexity (in big-O notation) of the command A(b) in MATLAB?
I tested A(b) with n = 100, 1000, 10^4, ..., 10^7 and m = 100, with A and b generated randomly. On the same machine, the average time is approximately the same for every n (about 5.00679e-05 seconds for A(b)). Part of the code is below (to get more accurate numbers, you should run each case 100 times and take the average, and also compute the variance to check that the values do not differ much):
A = randi(100,100,1);
b = randi(100,100,1);
tic; A(b); toc
> Elapsed time is 5.38826e-05 seconds.
A = randi(100,1000,1);
b = randi(100,100,1);
tic; A(b); toc
> Elapsed time is 6.31809e-05 seconds.
A = randi(100,10000,1);
b = randi(100,100,1);
tic; A(b); toc
> Elapsed time is 4.88758e-05 seconds.
...
% Also, as you can see, the range of b grows with the size of A.
% However, I can't see any change in the time scale.
A = randi(100,1000000,1);
b = randi(1000000,100,1);
tic; A(b); toc
> Elapsed time is 6.60419e-05 seconds.
I also ran the test with n = 100 and m = 10^4, 10^5, ..., 10^7, and the time grew with m: about 10^-5, 10^-4, and so on. Repeating this for different n gave the same result. For example:
A = randi(100,100,1);
b = randi(100,10000,1);
tic;A(b);toc
> Elapsed time is 7.00951e-05 seconds.
A = randi(100,100,1);
b = randi(100,100000,1);
tic;A(b);toc
> Elapsed time is 0.000529051 seconds.
A = randi(100,100,1);
b = randi(100,1000000,1);
tic;A(b);toc
> Elapsed time is 0.00533104 seconds.
...
To be more accurate, you can run the code like this:
a = zeros(100,1);
for idx = 1:100
A = randi(100,100,1);b = randi(100,100000,1);
tic; A(b);a(idx) = toc;
end
var(a)
> ans = 4.4092e-10 % very little
mean(a)
> ans = 3.6702e-04 % 10^-4 scale
As you can see, the variance is very small and the mean is on the 10^-4 scale. You can verify the other cases by the same method.
Therefore, based on the above analysis and on how indexing is commonly implemented, we can say the time complexity does not depend on n and does depend on the size of b. In sum, the time complexity is O(m).
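The averaging procedure described above can be sketched as a single script. This is only an illustration: the array size, the repetition count reps, and the variable names are assumptions, and the absolute numbers will differ between machines and between MATLAB and Octave. It fixes n and grows m, so a roughly tenfold increase per row would be consistent with O(m).

```matlab
% Fixed n (length of A), growing m (length of b)
A = rand(1e6, 1);
reps = 200;                      % repetitions to average out timer noise

for m = 10.^[3 4 5]
    b = randi(numel(A), m, 1);   % m random valid indices into A
    tic;
    for r = 1:reps
        c = A(b);                % the operation being measured
    end
    t = toc / reps;              % mean time per A(b) call
    fprintf('m = %7d: %.3g s per call\n', m, t);
end
```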
The complexity is 'unusual'
I came up with a short test script and ran it a few times. If the size of b increases, there is a clear increase in the required time. However, the results get confusing if you look at the size of A. (Note that in this script m is the length of A and n is the length of b.)
R = zeros(3,4);
mc = 0;
for m = 10.^[4 5 6]
mc = mc+1;
nc =0;
for n = 10.^[1 2 3 4]
nc = nc+1;
A = rand(m,1);
b = randi(m, n, 1);
tic; A(b); R(mc,nc) = toc*10^5;
end
end
R
The above script gave the following results (also similar results when repeated). Note that this was in Octave Online, rather than Matlab.
R =
3.2902 1.7881 2.3127 9.3937
5.3167 2.0027 2.8133 8.2970
21.6007 3.3140 4.9829 17.3092

Create faster Fibonacci function for n > 100 in MATLAB / octave

I have a function that gives me the nth number in the Fibonacci sequence. The problem is that it becomes very slow for larger n. Does anyone know how I can fix this?
function f = rtfib(n)
if (n==1)
f= 1;
elseif (n == 2)
f = 2;
else
f =rtfib(n-1) + rtfib(n-2);
end
The Results,
tic; rtfib(20), toc
ans = 10946
Elapsed time is 0.134947 seconds.
tic; rtfib(30), toc
ans = 1346269
Elapsed time is 16.6724 seconds.
I can't even get a value after 5 mins doing rtfib(100)
PS: I'm using octave 3.8.1
If time is important (not programming techniques):
function f = fib(n)
if (n == 1)
f = 1;
elseif (n == 2)
f = 2;
else
fOld = 2;
fOlder = 1;
for i = 3 : n
f = fOld + fOlder;
fOlder = fOld;
fOld = f;
end
end
end
tic;fib(40);toc; ans = 165580141; Elapsed time is 0.000086 seconds.
You could even use uint64. n = 92 is the most you can get from uint64:
tic;fib(92);toc; ans = 12200160415121876738; Elapsed time is 0.001409 seconds.
Because,
fib(93) = 19740274219868223167 > intmax('uint64') = 18446744073709551615
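A minimal sketch of the uint64 variant mentioned above, written as a script and keeping the question's convention fib(1) = 1, fib(2) = 2. The variable names follow the answer; the choice n = 92 is just the limit quoted above. One thing to keep in mind is that MATLAB and Octave integer arithmetic saturates instead of wrapping around, so for n > 92 the result would silently stay at intmax('uint64').

```matlab
n = 92;                    % largest n that still fits in uint64
fOlder = uint64(1);        % fib(1)
fOld   = uint64(2);        % fib(2)

if n == 1
    f = fOlder;
else
    f = fOld;              % covers n == 2; overwritten in the loop otherwise
end

for i = 3:n
    f = fOld + fOlder;     % exact integer addition, saturates on overflow
    fOlder = fOld;
    fOld = f;
end

% A result equal to intmax('uint64') would indicate saturation
fprintf('%u\n', f);
```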
Edit
In order to get fib(n) up to n = 183, it is possible to use two uint64 values as one number, with a special function for summation:
function [] = fib(n)
fL = uint64(0);
fH = uint64(0);
MaxNum = uint64(1e19);
if (n == 1)
fL = 1;
elseif (n == 2)
fL = 2;
else
fOldH = uint64(0);
fOlderH = uint64(0);
fOldL = uint64(2);
fOlderL = uint64(1);
for i = 3 : n
[fL q] = LongSum (fOldL , fOlderL , MaxNum);
fH = fOldH + fOlderH + q;
fOlderL = fOldL;
fOlderH = fOldH;
fOldL = fL;
fOldH = fH;
end
end
sprintf('%u',fH,fL)
end
LongSum is:
function [s q] = LongSum (a, b, MaxNum)
if a + b >= MaxNum
q = 1;
if a >= MaxNum
s = a - MaxNum;
s = s + b;
elseif b >= MaxNum
s = b - MaxNum;
s = s + a;
else
s = MaxNum - a;
s = b - s;
end
else
q = 0;
s = a + b;
end
Note some complications in LongSum might seem unnecessary, but they are not!
(The point of the inner if is to avoid computing s = a + b - MaxNum in one command, because it might overflow and store an irrelevant number in s.)
Results
tic;fib(159);toc; Elapsed time is 0.009631 seconds.
ans = 1226132595394188293000174702095995
tic;fib(183);toc; Elapsed time is 0.009735 seconds.
fib(183) = 127127879743834334146972278486287885163
However, you have to be careful about sprintf.
I also did it with three uint64, and I could get up to,
tic;fib(274);toc; Elapsed time is 0.032249 seconds.
ans = 1324695516964754142521850507284930515811378128425638237225
(It's pretty much the same code, but I could share it if you are interested).
Note that we have fib(1) = 1, fib(2) = 2 according to the question, while fib(1) = 1, fib(2) = 1 is more common. The first 300 Fibonacci numbers are listed here (thanks to @Rick T).
It seems the Fibonacci series follows the golden ratio, as discussed in some detail here.
This was used in this MATLAB File Exchange code, and I am writing just the essence of it here:
sqrt5 = sqrt(5);
alpha = (1 + sqrt5)/2; %// alpha = 1.618... is the golden ratio
fibs = round( alpha.^n ./ sqrt5 )
You can feed an integer into n for the nth number in Fibonacci Series or feed an array 1:n to have the whole series.
Please note that this method holds good till n = 69 only.
If you have access to the Symbolic Math Toolbox in MATLAB, you could always just call the Fibonacci function from MuPAD:
>> fib = @(n) evalin(symengine, ['numlib::fibonacci(' num2str(n) ')'])
>> fib(274)
ans =
818706854228831001753880637535093596811413714795418360007
It is pretty fast:
>> timeit(@() fib(274))
ans =
0.0011
Plus, you can go for numbers as large as you want (limited only by how much RAM you have), and it is still blazing fast:
% see if you can beat that!
>> tic
>> x = fib(100000);
>> toc % Elapsed time is 0.004621 seconds.
% result has more than 20 thousand digits!
>> length(char(x)) % 20899
Here is the full value of fib(100000): http://pastebin.com/f6KPGKBg
To reach large numbers you can use symbolic computation. The following works in Matlab R2010b.
syms x y %// declare variables
z = x + y; %// define formula
xval = '0'; %// initialize x, y values
yval = '1';
for n = 2:300
zval = subs(z, [x y], {xval yval}); %// update z value
disp(['Iteration ' num2str(n) ':'])
disp(zval)
xval = yval; %// shift values
yval = zval;
end
You can do it in O(log n) time with matrix exponentiation:
X = [0 1
1 1]
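X^n = [F(n-1) F(n); F(n) F(n+1)] (with F(1) = F(2) = 1), so the lower right-hand corner holds F(n+1), which is the nth number under the question's convention (rtfib(1) = 1, rtfib(2) = 2); the standard F(n) sits in the off-diagonal entries. X^n can be represented as the product of several matrices X^(2^i) with i <= log_2(n), so for example X^11 would be X^1 * X^2 * X^8. And X^8 = (X^4)^2, etc., so at most 2*log_2(n) matrix multiplications are needed.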
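The repeated-squaring idea can be sketched as follows. This is an illustration rather than a tuned implementation: the variable names are made up, and because it uses doubles the entries are exact only while they stay below 2^53 (roughly n < 78). In practice X^n with the built-in mpower gives the same matrix; the explicit loop only shows where the O(log n) count comes from.

```matlab
n = 30;
X = [0 1; 1 1];

P = eye(2);     % accumulates the product of the selected powers
B = X;          % current power X^(2^i)
k = n;          % remaining exponent, consumed bit by bit

while k > 0
    if mod(k, 2) == 1
        P = P * B;          % this bit of n is set: include X^(2^i)
    end
    B = B * B;              % square to get the next power X^(2^(i+1))
    k = floor(k / 2);
end

% P equals X^n = [F(n-1) F(n); F(n) F(n+1)] with F(1) = F(2) = 1
fibStandard = P(1, 2);      % F(n) in the common convention
fibQuestion = P(2, 2);      % F(n+1), i.e. rtfib(n) in the question's convention
fprintf('%d %d\n', fibStandard, fibQuestion);
```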
One performance issue is that you use a recursive solution. Going for an iterative method will spare you of the argument passing for each function call. As Olivier pointed out, it will reduce the complexity to linear.
You can also look here. Apparently there's a formula that computes the n'th member of the Fibonacci sequence. I tested it for up to 50'th element. For higher n values it's not very accurate.
The implementation of a fast Fibonacci computation in Python could be as follows. I know this is Python, not MATLAB/Octave, but it might be helpful.
Basically, rather than calling the same Fibonacci function over and over again, which costs O(2^n), we store the Fibonacci sequence in a list/array, which costs O(n):
#!/usr/bin/env python3.5
class Fib:
    def __init__(self,n):
        self.n=n
        self.fibList=[None]*(self.n+1)
        self.populateFibList()
    def populateFibList(self):
        for i in range(len(self.fibList)):
            if i==0:
                self.fibList[i]=0
            if i==1:
                self.fibList[i]=1
            if i>1:
                self.fibList[i]=self.fibList[i-1]+self.fibList[i-2]
    def getFib(self):
        print('Fibonacci sequence up to ', self.n, ' is:')
        for i in range(len(self.fibList)):
            print(i, ' : ', self.fibList[i])
        return self.fibList[self.n]
def isNonnegativeInt(value):
    try:
        if int(value)>=0:#throws an exception if non-convertible to int: returns False
            return True
        else:
            return False
    except:
        return False
n=input('Please enter a non-negative integer: ')
while isNonnegativeInt(n)==False:
    n=input('A non-negative integer is needed: ')
n=int(n) # convert string to int
print('We are using ', n, 'based on what you entered')
print('Fibonacci result is ', Fib(n).getFib())
Output for n=12 would be like:
I tested the runtime for n=100, 300, 1000 and the code is really fast, I don't even have to wait for the output.
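Since the question is about MATLAB/Octave, the same table-filling idea might look like the sketch below. It follows the Python code's convention (F(0) = 0, F(1) = 1), not the question's; the name fibList is borrowed from the Python class and the +1 index shift is needed because MATLAB indices start at 1. With doubles the values are exact only up to about n = 78.

```matlab
n = 30;

% fibList(k+1) holds F(k), because MATLAB arrays are 1-based
fibList = zeros(1, n + 1);
if n >= 1
    fibList(2) = 1;                              % F(1) = 1
end

% Each entry is computed once from the two stored predecessors: O(n) work
for k = 3:(n + 1)
    fibList(k) = fibList(k - 1) + fibList(k - 2);
end

result = fibList(n + 1);                         % F(n)
fprintf('%d\n', result);
```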
One simple way to speed up the recursive implementation of a Fibonacci function is to realize that, substituting f(n-1) by its definition,
f(n) = f(n-1) + f(n-2)
= f(n-2) + f(n-3) + f(n-2)
= 2*f(n-2) + f(n-3)
This simple transformation greatly reduces the number of steps taken to compute a number in the series.
If we start with OP's code, slightly corrected:
function result = fibonacci(n)
switch n
case 0
result = 0;
case 1
result = 1;
case 2
result = 1;
case 3
result = 2;
otherwise
result = fibonacci(n-2) + fibonacci(n-1);
end
And apply our transformation:
function result = fibonacci_fast(n)
switch n
case 0
result = 0;
case 1
result = 1;
case 2
result = 1;
case 3
result = 2;
otherwise
result = fibonacci_fast(n-3) + 2*fibonacci_fast(n-2);
end
Then we see a 30x speed improvement for computing the 20th number in the series (using Octave):
>> tic; for ii=1:100, fibonacci(20); end; toc
Elapsed time is 12.4393 seconds.
>> tic; for ii=1:100, fibonacci_fast(20); end; toc
Elapsed time is 0.448623 seconds.
Of course Rashid's non-recursive implementation is another 60x faster still: 0.00706792 seconds.

MATLAB sum series function

I am very new to MATLAB. I am trying to implement the sum of the series 1+x+x^2/2!+x^3/3!..... but I could not figure out how to do it. So far I have only summed numbers. Help please.
for ii = 1:length(a)
sum_a = sum_a + a(ii)
sum_a
end
n = 0 : 10; % elements of the series
x = 2; % value of x
s = sum(x .^ n ./ factorial(n)); % sum
The second part of your answer is:
n = 0:input('variable?')
Cheery's approach is perfectly valid when the number of terms of the series is small. For large values, a faster approach is as follows. This is more efficient because it avoids repeating multiplications:
m = 10;
x = 2;
result = 1+sum(cumprod(x./[1:m]));
Example running time for m = 1000; x = 1;
tic
for k = 1:1e4
result = 1+sum(cumprod(x./[1:m]));
end
toc
tic
for k = 1:1e4
result = sum(x.^(0:m)./factorial(0:m));
end
toc
gives
Elapsed time is 1.572464 seconds.
Elapsed time is 2.999566 seconds.
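As a quick sanity check on both approaches: the series 1 + x + x^2/2! + x^3/3! + ... is the Taylor series of exp(x), so either partial sum can be compared against the built-in exp. The sketch below assumes x = 2 and m = 20 terms (values chosen only for illustration); the variable names are made up.

```matlab
x = 2;
m = 20;                     % highest power included in the partial sum
n = 0:m;

% Direct formula: each term x^n / n! is computed independently
sDirect = sum(x.^n ./ factorial(n));

% cumprod formula: term k is the previous term times x/k
sCumprod = 1 + sum(cumprod(x ./ (1:m)));

fprintf('direct : %.15g\n', sDirect);
fprintf('cumprod: %.15g\n', sCumprod);
fprintf('exp(x) : %.15g\n', exp(x));
```

For small x a few dozen terms already agree with exp(x) to machine precision, so a very large m mostly adds running time rather than accuracy.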