Time complexity in terms of Big O - matlab

I have code that does a 4-level decomposition of an image. The levels are similar to the wavelet transform of an image, which decomposes the image into 4 parts: the approximation part and the three detail parts. The code that I have implemented uses a generalized SVD to do this decomposition. Here is the code:
function [Y,U] = MSVD1(X)
%multiresolution SVD (MSVD)
%input-> X: image (spatial domain)
%outputs-> Y: one-level MSVD decomposition of X
%          U: the unitary matrix (U in SVD)
[m,n] = size(X);
m = m/2; n = n/2;
A = zeros(4,m*n);
for j = 1:n
    for i = 1:m
        A(:,i + (j-1)*m) = reshape(X((i-1)*2+(1:2),(j-1)*2+(1:2)),4,1);
    end
end
[U,S] = svd(A);
T = U'*A;
Y.LL = reshape(T(1,:),m,n);
Y.LH = reshape(T(2,:),m,n);
Y.HL = reshape(T(3,:),m,n);
Y.HH = reshape(T(4,:),m,n);
end
Now the basic operation involved in this is the SVD. So my question is: should the time complexity in terms of Big O notation be the same as that of a normal SVD of a matrix? If not, which terms do we need to take into account to find the complexity in terms of the input size of the image? Do the reshape calls also contribute to the time complexity, or are they just O(1)?
Can somebody help?

First, the complexity of the constant-size reshape (inside the loop) is O(1). Hence, the complexity of the for loop is Θ(m*n). Second, the complexity of svd for an m-by-n matrix is O(max(m, n) * min(m, n)^2), and depending on which outputs the function has to return it can be O(max(m, n)^2) (according to this reference). Moreover, based on @Daniel's comment, the worst case for the reshapes at the end of your code is O(m*n) (it is usually less than this).
Therefore, the complexity of the code is O(max(m, n)^2). Also, because of the loop, it is Ω(m*n).
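If you want to check this scaling empirically, here is a minimal timing sketch (my own addition, not part of the original answer) that calls the MSVD1 function above on progressively larger square images; if the O(max(m, n)^2) bound is tight, doubling the side length should roughly quadruple the runtime:
% Hypothetical scaling test for MSVD1 (requires MSVD1.m on the path)
sizes = [128 256 512 1024];          % image side lengths
t = zeros(size(sizes));
for k = 1:numel(sizes)
    X = rand(sizes(k));              % random square test image
    t(k) = timeit(@() MSVD1(X));     % median runtime of one decomposition
end
disp(t(2:end) ./ t(1:end-1));        % growth factors between successive sizes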

Related

MATLAB - Solving multiple linear systems where A (the matrix to be inverted) is the same

I have the following code excerpt:
%suppose A is a 40000 by 40000 matrix
phi = zeros(40000,1);
for t = 1:360
    phivec = -phi;
    phi = full(A\sparse(phivec));
end
For each t, I solve the system A*x = phivec. For t=1, phivec is given. For t>1, phivec comes from the phivec at t-1. Is there a way to speed up the process, since A is the same for each t? I did try to invert A before the loop starts and do a matrix multiplication, but since A is huge (40000x40000), it takes a lot of memory. Is there a way to invert A just once to save time?
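One common way to get this effect (a sketch of my own, not from the question) is to factorize A once before the loop and reuse the factorization for every right-hand side, instead of forming inv(A). For example, assuming A fits in memory as stored:
% Factor A once; each backslash in the loop is then only a pair of triangular solves.
% decomposition() is available from R2017b; on older versions [L,U,P] = lu(A) plus
% phi = U\(L\(P*phivec)) achieves the same reuse.
dA = decomposition(A);
phi = zeros(40000,1);
for t = 1:360
    phivec = -phi;
    phi = dA \ phivec;
end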

Vectorizing the solution of a linear equation system in MATLAB

Summary: This question deals with the improvement of an algorithm for the computation of linear regression.
I have a 3D array (dlMAT) representing monochrome photographs of the same scene taken at different exposure times (the vector IT). Mathematically, every vector along the 3rd dimension of dlMAT represents a separate linear regression problem that needs to be solved. The equation whose coefficients need to be estimated is of the form:
DL = R*IT^P, where DL and IT are obtained experimentally and R and P must be estimated.
The above equation can be transformed into a simple linear model after applying a logarithm:
log(DL) = log(R) + P*log(IT) => y = a + b*x
Presented below is the most "naive" way to solve this system of equations, which essentially involves iterating over all "3rd dimension vectors" and fitting a polynomial of order 1 to (IT, DL(ind1,ind2,:)):
%// Define some nominal values:
R = 0.3;
IT = 600:600:3000;
P = 0.97;
%// Impose some believable spatial variations:
pMAT = 0.01*randn(3)+P;
rMAT = 0.1*randn(3)+R;
%// Generate "fake" observation data:
dlMAT = bsxfun(@times,rMAT,bsxfun(@power,permute(IT,[3,1,2]),pMAT));
%// Regression:
sol = cell(size(rMAT)); %// preallocation
for ind1 = 1:size(dlMAT,1)
    for ind2 = 1:size(dlMAT,2)
        sol{ind1,ind2} = polyfit(log(IT(:)),log(squeeze(dlMAT(ind1,ind2,:))),1);
    end
end
fittedP = cellfun(@(x)x(1),sol);      %// Estimate of pMAT
fittedR = cellfun(@(x)exp(x(2)),sol); %// Estimate of rMAT
The above approach seems like a good candidate for vectorization, since it does not utilize MATLAB's main strength, that is, MATrix operations. For this reason, it does not scale very well and takes much longer to execute than I think it should.
There exist alternative ways to perform this computation based on matrix division, as demonstrated here and here, which involve something like this:
sol = [ones(size(x)),log(x)]\log(y);
That is, appending a vector of 1s to the observations, followed by mldivide to solve the equation system.
The main challenge I'm facing is how to adapt my data to the algorithm (or vice versa).
Question #1: How can the matrix-division-based solution be extended to solve the problem presented above (and potentially replace the loops I am using)?
Question #2 (bonus): What is the principle behind this matrix-division-based solution?
The secret ingredient behind the solution that includes matrix division is the Vandermonde matrix. The question discusses a linear problem (linear regression), and those can always be formulated as a matrix problem, which \ (mldivide) can solve in a mean-square-error sense. Such an algorithm, solving a similar problem, is demonstrated and explained in this answer.
Below is benchmarking code that compares the original solution with two alternatives suggested in chat (1, 2):
function regressionBenchmark(numEl)
clc
if nargin<1, numEl=10; end
%// Define some nominal values:
R = 5;
IT = 600:600:3000;
P = 0.97;
%// Impose some believable spatial variations:
pMAT = 0.01*randn(numEl)+P;
rMAT = 0.1*randn(numEl)+R;
%// Generate "fake" measurement data using the relation "DL = R*IT.^P"
dlMAT = bsxfun(@times,rMAT,bsxfun(@power,permute(IT,[3,1,2]),pMAT));
%% // Method1: loops + polyval
disp('-------------------------------Method 1: loops + polyval')
tic; [fR,fP] = method1(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
%% // Method2: loops + Vandermonde
disp('-------------------------------Method 2: loops + Vandermonde')
tic; [fR,fP] = method2(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
%% // Method3: vectorized Vandermonde
disp('-------------------------------Method 3: vectorized Vandermonde')
tic; [fR,fP] = method3(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
function [fittedR,fittedP] = method1(IT,dlMAT)
sol = cell(size(dlMAT,1),size(dlMAT,2));
for ind1 = 1:size(dlMAT,1)
    for ind2 = 1:size(dlMAT,2)
        sol{ind1,ind2} = polyfit(log(IT(:)),log(squeeze(dlMAT(ind1,ind2,:))),1);
    end
end
fittedR = cellfun(@(x)exp(x(2)),sol);
fittedP = cellfun(@(x)x(1),sol);

function [fittedR,fittedP] = method2(IT,dlMAT)
sol = cell(size(dlMAT,1),size(dlMAT,2));
for ind1 = 1:size(dlMAT,1)
    for ind2 = 1:size(dlMAT,2)
        sol{ind1,ind2} = flipud([ones(numel(IT),1) log(IT(:))]\log(squeeze(dlMAT(ind1,ind2,:)))).';
    end
end
fittedR = cellfun(@(x)exp(x(2)),sol);
fittedP = cellfun(@(x)x(1),sol);

function [fittedR,fittedP] = method3(IT,dlMAT)
N = 1; %// Degree of polynomial
VM = bsxfun(@power, log(IT(:)), 0:N); %// Vandermonde matrix
result = fliplr((VM\log(reshape(dlMAT,[],size(dlMAT,3)).')).');
%// Compressed version:
%// result = fliplr(([ones(numel(IT),1) log(IT(:))]\log(reshape(dlMAT,[],size(dlMAT,3)).')).');
fittedR = exp(real(reshape(result(:,2),size(dlMAT,1),size(dlMAT,2))));
fittedP = real(reshape(result(:,1),size(dlMAT,1),size(dlMAT,2)));
The reason why method 2 can be vectorized into method 3 is essentially that matrix multiplication can be separated by the columns of the second matrix. If A*B produces matrix X, then by definition A*B(:,n) gives X(:,n) for any n. Moving A to the right-hand side with mldivide, this means that the divisions A\X(:,n) can be done in one go for all n with A\X. The same holds for an overdetermined system (linear regression problem), in which there is no exact solution in general, and mldivide finds the matrix that minimizes the mean-square error. In this case too, the operations A\X(:,n) (method 2) can be done in one go for all n with A\X (method 3).
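To see this column separability concretely, here is a small self-contained check (example data of my own, not part of the benchmark) showing that solving all right-hand sides at once gives the same least-squares result as solving them column by column:
A = [ones(5,1), (1:5)'];       %// 5x2 overdetermined system matrix
X = rand(5,3);                 %// three right-hand sides
S1 = A \ X;                    %// all columns in one go
S2 = zeros(2,3);
for n = 1:3
    S2(:,n) = A \ X(:,n);      %// one column at a time
end
disp(max(abs(S1(:) - S2(:))))  %// ~0 up to round-off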
The implications of improving the algorithm as the size of dlMAT increases can be seen in the benchmark plot (not reproduced here): for the case of 500*500 (or 2.5E5) elements, the speedup from Method 1 to Method 3 is about x3500!
It is also interesting to observe the output of profile (here, for the case of 500*500):
(profiler screenshots: Method 1, Method 2, Method 3)
From the above it is seen that rearranging the elements via squeeze and flipud takes up about half (!) of the runtime of Method 2. It is also seen that some time is lost on the conversion of the solution from cells to matrices.
Since the 3rd solution avoids all of these pitfalls, as well as the loops altogether (and with them the overhead of re-evaluating the loop body on every iteration), it unsurprisingly results in a considerable speedup.
Notes:
There was very little difference between the "compressed" and the "explicit" versions of Method 3, slightly in favor of the "explicit" version. For this reason the "compressed" version was not included in the comparison.
A solution was attempted where the inputs to Method 3 were gpuArray-ed. This did not improve performance (and even degraded it somewhat), possibly due to a wrong implementation, or the overhead of copying matrices back and forth between RAM and VRAM.

Vectorize Evaluations of Meshgrid Points in Matlab

I need the "for" loop in the following representative section of code to run as efficiently as possible. The mean function in the code is acting as a representative placeholder for my own function.
x = linspace(-1,1,15);
y = linspace(2,4,15);
[xgrid, ygrid] = meshgrid(x,y);
mc = rand(100000,1);
z=zeros(size(xgrid));
for i=1:length(xgrid)
    for j=1:length(ygrid)
        z(i,j) = mean(xgrid(i,j) + ygrid(i,j) + xgrid(i,j)*ygrid(i,j)*mc);
    end
end
I have vectorized the code and improved its speed by about 2.5 times by building a matrix in which mc is replicated for each grid point. My implementation results in a very large matrix (3 x 22500000) filled with repeated data. I've mitigated the memory penalty of this approach by converting the matrix to single precision, but it seems like there should be a more efficient way to do what I want that avoids replicating so much data.
You could use bsxfun with a few reshapes:
A = bsxfun(@times,y,x.');
B = bsxfun(@plus,y,x.');
C = mean(bsxfun(@plus,bsxfun(@times,mc,reshape(A,1,[])) , reshape(B,1,[])),1);
z_out = reshape(C,numel(x),[]).';
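As a quick sanity check (a test harness of my own, not from the original answer), you can compare the vectorized result against the double loop on the same inputs; on R2016b or newer the bsxfun calls could equivalently be written with implicit expansion:
x = linspace(-1,1,15);
y = linspace(2,4,15);
[xgrid, ygrid] = meshgrid(x,y);
mc = rand(100000,1);
% Loop version (reference)
z = zeros(size(xgrid));
for i = 1:length(xgrid)
    for j = 1:length(ygrid)
        z(i,j) = mean(xgrid(i,j) + ygrid(i,j) + xgrid(i,j)*ygrid(i,j)*mc);
    end
end
% Vectorized version from the answer
A = bsxfun(@times,y,x.');
B = bsxfun(@plus,y,x.');
C = mean(bsxfun(@plus,bsxfun(@times,mc,reshape(A,1,[])),reshape(B,1,[])),1);
z_out = reshape(C,numel(x),[]).';
disp(max(abs(z(:) - z_out(:))))   % ~0 up to round-off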

How can I Vectorize this For Loop in MATLAB Code?

I have the for loop (outlined below) in my code which takes a while to run. CALC is a function I have defined; D is a matrix; Y is a matrix; k is a vector. Is there a way I can vectorize this code such that I do away with the for loop? Any contribution will be highly appreciated.
for column = 1:n
    q(:,column) = CALC(D,Y(:,column), k(column));
end
The CALC function is outlined below:
function [x] = CALC(A, y, s)
[m, n] = size(A);
% y is an m x 1 vector
% s is an integer
r = y;
index_cols = [];
atoms = [];
for i = 1 : s
    [max_r, lambda_t] = max(abs(r'*A));
    index_cols = [index_cols, lambda_t];
    atoms = [atoms, A(:,lambda_t)];
    x_t = pinv(atoms)*y;
    r = y - atoms*x_t;
end
x = zeros(n,1);
x(index_cols) = x_t;
end
I will expand on rayryeng's comment. Vectorization means grouping some elementary operations together in such a way that they can be jointly handled by a low-level routine. But the bulk of the execution time of your code is the computation of pinv(atoms); everything else is not nearly as expensive.
If your task is to saw several pieces of wood, you can clamp them together and saw them all at once. That's vectorization.
But that does not work when you're a mechanic whose task is to repair several cars. The bulk of your time will have to be spent working on an individual car.
Things you can consider:
Caching. Your code computes pseudoinverses of matrices that are always made of columns of the same matrix D, so it may end up calling pinv with the same atoms input multiple times. Investigate whether this happens often enough to warrant caching the pseudoinverses (see the sketch after this list). Here's an example of caching Matlab results.
Parallelizing, if you have the hardware and software for this.
Rethink the algorithm...
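As an illustration of the caching idea, here is a minimal sketch (a hypothetical helper of my own, not part of CALC) that memoizes pseudoinverses in a containers.Map keyed on the selected column indices:

function P = cachedPinv(A, index_cols, cache)
% Return pinv(A(:,index_cols)), reusing a stored result when the same column
% selection has been seen before. 'cache' is a containers.Map created once by
% the caller: cache = containers.Map('KeyType','char','ValueType','any');
key = mat2str(index_cols);        % the exact column selection as a string key
if isKey(cache, key)
    P = cache(key);               % cache hit: reuse the stored pseudoinverse
else
    P = pinv(A(:,index_cols));    % cache miss: compute and store it
    cache(key) = P;
end
end

Inside CALC, x_t = pinv(atoms)*y would then become x_t = cachedPinv(A, index_cols, cache)*y, with the map created once in the calling loop; whether this pays off depends on how often the same atom sets recur across the columns of Y.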

matlab - optimize getting the angle between each vector with all others in a large array

I am trying to get the angle between every pair of vectors in a large array (1896378x4 - EDIT: this means I need 1.7981e+12 angles... TOO LARGE, but if there's room to optimize the code, let me know anyways). It's too slow - I haven't seen it finish yet. Here are the steps towards optimizing that I've taken:
First, logically what I (think I) want (just use Bt=rand(N,4) for testing):
[ro,col]=size(Bt);
angbtwn = zeros(ro-1); %too long to compute!! total non-zero = ro*(ro-1)/2
count=1;
for ii=1:ro-1
    for jj=ii+1:ro
        angbtwn(count) = atan2(norm(cross(Bt(ii,1:3),Bt(jj,1:3))), dot(Bt(ii,1:3),Bt(jj,1:3))).*180/pi;
        count=count+1;
    end
end
So, I thought I'd try to vectorize it and get rid of the non-built-in functions:
[ro,col]=size(Bt);
% angbtwn = zeros(ro-1); %TOO LONG!
for ii=1:ro-1
    allAxes=Bt(ii:ro,1:3);
    repeachAxis = allAxes(ones(ro-ii+1,1),1:3);
    c = [repeachAxis(:,2).*allAxes(:,3)-repeachAxis(:,3).*allAxes(:,2)
         repeachAxis(:,3).*allAxes(:,1)-repeachAxis(:,1).*allAxes(:,3)
         repeachAxis(:,1).*allAxes(:,2)-repeachAxis(:,2).*allAxes(:,1)];
    crossedAxis = reshape(c,size(repeachAxis));
    normedAxis = sqrt(sum(crossedAxis.^2,2));
    dottedAxis = sum(repeachAxis.*allAxes,2);
    angbtwn(1:ro-ii+1,ii) = atan2(normedAxis,dottedAxis)*180/pi;
end
angbtwn(1,:)=[]; %angle btwn vec and itself
%only upper left triangle are values...
Still too long, even to pre-allocate... So I tried sparse, but I haven't implemented it right:
[ro,col]=size(Bt);
%spalloc:
angbtwn = sparse([],[],[],ro,ro,ro*(ro-1)/2); %zeros(ro-1); %cell(ro,1)
for ii=1:ro-1
    ...same
    angbtwn(1:ro-ii+1,ii) = atan2(normedAxis,dottedAxis)*180/pi; %WARNED: indexing => overhead
    % WHAT? Can't index sparse?? what's the point of spalloc then?
end
So if my logic can be improved, or if sparse is really the way to go, and I just can't implement it right, let me know where to improve. THANKS for your help.
Are you trying to get the angle between every pair of vectors in Bt? If Bt has 2 million vectors, that's a trillion pairs, each (apparently) requiring an inner product to get the angle between them. I don't know that any kind of optimization is going to make this operation finish in a reasonable amount of time in MATLAB on a single machine.
In any case, you can turn this problem into a matrix multiplication between matrices of unit vectors:
N=1000;
Bt=rand(N,4); % for testing. A matrix of N (row) vectors of length 4.
[ro,col]=size(Bt);
magnitude = zeros(N,1); % the magnitude of each row vector.
units = zeros(size(Bt)); % the unit vectors
% Compute the unit vectors for the row vectors
for ii=1:ro
    magnitude(ii) = norm(Bt(ii,:));
    units(ii,:) = Bt(ii,:) / magnitude(ii);
end
angbtwn = acos(units * units') * 360 / (2*pi);
But you'll run out of memory during the matrix multiplication for largish N.
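One way to keep the memory bounded (a blocking sketch of my own, not part of the original answer) is to form the product in row blocks, so only a slice of the full angle matrix exists at any one time:
blockSize = 5000;                              % rows of 'units' handled per block
for s = 1:blockSize:ro
    e = min(s+blockSize-1, ro);
    G = units(s:e,:) * units';                 % cosines for rows s..e against all rows
    angBlock = acosd(min(max(G,-1),1));        % clamp round-off before acosd
    % ... process/accumulate/store angBlock here before it is overwritten ...
end
Each block still costs O(blockSize*ro) memory and the total work is unchanged, so this only helps with the memory limit, not with the trillion-pair operation count.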
You might want to use pdist with the 'cosine' distance to compute 1-cos(angbtwn). Another perk of this approach is that it does not compute n^2 values but exactly 0.5*n*(n-1) unique values :)
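A minimal sketch of that idea (my own illustration, assuming the Statistics and Machine Learning Toolbox for pdist, and using only the first three components as in the question's loop):
N = 1000;
Bt = rand(N,4);                        % test data, as in the question
d = pdist(Bt(:,1:3), 'cosine');        % d(k) = 1 - cos(angle) for each unique pair
angbtwn = acosd(min(max(1-d,-1),1));   % angles in degrees; 0.5*N*(N-1) values
Like the matrix-multiplication approach, this is only practical for moderate N: for the 1896378x4 array in the question the output alone would not fit in memory.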