I have a couple of big 3-dimensional matrices (e.g. dimensions 16330 x 1300 x 16). For each pixel I need to fit a simple linear regression model and extract some information, such as the slope and intercept of the fitted model. I created a loop and run the processing pixel by pixel, but it takes forever. Is there any way to improve the following code?
% read the multiband image (16330,1300,16)
[A,R] = geotiffread('16Bands_image.tif');
% external is a 1x16 vector that is fitted against the third dimension of
% each pixel throughout the image
load external.m
intercept = zeros(size(A,1),size(A,2));
slope = zeros(size(A,1),size(A,2));
for i=1:size(A,1)
for j=1:size(A,2)
REF=squeeze(A(i,j,:));
p=fitlm(REF,external);
intercept(i,j)=p.Coefficients.Estimate(1);
slope(i,j) = p.Coefficients.Estimate(2);
end
end
Thanks
If p = fitlm(external, REF) is what you need, there is a fast solution: reshape the image to 16 by (16330*1300) and solve for all pixels at once, without a loop.
A = reshape(A, [], 16)'; % reshape and transpose to 16 by N
X = external(:);
X = X - mean(X);
b = [ones(16,1) X] \ A; % solve all once
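Row 2 of b is the slope. Because X is centered, row 1 is the fitted value at mean(external); subtract slope*mean(external) from it to get the intercept that fitlm reports.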
I don't know your data, but this supposes A is the measured data.
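To map the result back onto the image grid and sanity-check it, something along these lines may help. It is only a sketch on small synthetic data: A3, x and the sizes are made up for illustration, X is left uncentered so that row 1 of b is directly the intercept, and polyfit serves as an independent reference for one pixel.

```matlab
rng(0);                                 % reproducible synthetic data (assumption)
A3 = rand(4, 5, 16);                    % hypothetical stand-in for the 16-band image
x  = (1:16).' + rand(16, 1);            % hypothetical stand-in for the external vector

[nr, nc, nb] = size(A3);
Y = reshape(A3, [], nb).';              % nb-by-(nr*nc), one column per pixel
b = [ones(nb, 1) x] \ Y;                % least squares for all pixels at once

intercept = reshape(b(1, :), nr, nc);   % back to the image layout
slope     = reshape(b(2, :), nr, nc);

% Independent check for one pixel: polyfit returns [slope intercept]
p = polyfit(x, squeeze(A3(2, 3, :)), 1);
fprintf('slope diff %g, intercept diff %g\n', ...
        abs(slope(2, 3) - p(1)), abs(intercept(2, 3) - p(2)));
```

The reshape back to nr-by-nc works because reshape(A3, [], nb) walks the pixels in the same column-major order.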
If you really want it the other way around, you may still need to loop over pixels:
external = external(:); % make sure it is column
b = zeros(2, size(A,2)); % A in 16 by N
for i = 1:size(A,2)
X = A(:,i);
X = X - mean(X);
b(:,i) = [ones(16,1) X] \ external;
end
But this is still slow, although it is faster than fitlm.
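For this direction a loop-free closed form may also work, since a simple linear regression has slope cov(x,y)/var(x) and intercept mean(y) - slope*mean(x). The following is a minimal sketch on made-up data, assuming A is already 16 by N. The intercept here is the ordinary (uncentered) one, so it differs from row 1 of b in the loop above, which equals mean(external) because X is centered there.

```matlab
rng(0);                                   % reproducible synthetic data (assumption)
A = rand(16, 1000);                       % hypothetical 16-by-N pixel matrix
external = rand(1, 16);                   % hypothetical external vector

y  = external(:);
mA = mean(A, 1);                          % per-pixel mean, 1-by-N
Ac = bsxfun(@minus, A, mA);               % center every column
yc = y - mean(y);

slope     = (yc.' * Ac) ./ sum(Ac.^2, 1); % cov(x,y)/var(x) for every pixel
intercept = mean(y) - slope .* mA;        % uncentered intercept

% Check one pixel against the backslash solution
k  = 7;
bk = [ones(16, 1) A(:, k)] \ y;           % bk(1) = intercept, bk(2) = slope
```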
I only know of the following power iteration, but it needs to create a huge matrix A'*A when both the number of rows and the number of columns are large, and A is a dense matrix as well. Is there any alternative to the power iteration method below? I have heard of Krylov subspace methods, but I am not familiar with them. In any case, I am looking for any method faster than the one below:
B = A'*A; % or B = A*A' if it is smaller
x = B(:,1); % example of starting point, x converges to the dominant eigenvector
x = x/norm(x);
for i = 1:200
y = B*x;
y = y/norm(y);
% norm(x - y); % <- residual, you can try to use it to stop iteration
x = y;
end;
n3 = sqrt(mean(B*x./x)) % translate eigenvalue of B to singular value of A
I checked the 'svd' command of MATLAB with a randomly generated 100x100 matrix. It is almost 5 times faster than your code.
s = svd(A);
n3 = s(1);
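If the concern is the size of A'*A, the same power iteration can be run without ever forming it, by applying A and A' in turn. This is a rough sketch on a made-up random matrix, not a tuned implementation; the iteration cap and tolerance are arbitrary choices.

```matlab
rng(0);                          % reproducible synthetic data (assumption)
A = randn(300, 200);             % hypothetical dense test matrix

x = randn(size(A, 2), 1);        % random starting vector
x = x / norm(x);
sigma = 0;
for k = 1:500
    z = A * x;                   % two matrix-vector products replace B*x
    sigmaNew = norm(z);          % with norm(x) = 1 this never exceeds norm(A)
    y = A' * z;                  % y = (A'*A)*x without forming A'*A
    x = y / norm(y);
    if abs(sigmaNew - sigma) <= 1e-8 * sigmaNew
        break;                   % estimate has stopped changing
    end
    sigma = sigmaNew;
end
sigma = sigmaNew;                % estimate of the largest singular value
```

Each step costs two matrix-vector products and only vectors are stored, so memory stays at the size of A itself.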
Suppose I have a matrix A. I want to calculate its 2-norm/spectral norm. How can I calculate this efficiently?
I know the 2-norm of a matrix is equal to its largest singular value, so the result of the following MATLAB code will be zero:
>> [u,s,v]=svd(A,'econ');
norm(A,2)-s(1,1)
But to get the 2-norm this way I have to calculate the SVD of the full matrix A. Is there a more efficient way to calculate the 2-norm? An answer in the form of MATLAB code would be much appreciated.
This example with norm and random data
A = randn(2000,2000);
tic;
n1 = norm(A)
toc;
gives
n1 = 89.298
Elapsed time is 2.16777 seconds.
You can try eigs to find only one (the largest) eigenvalue of the symmetric matrix A'*A (or A*A' if it is smaller for A rectangular). It uses a Lanczos iteration method.
tic;
B = A'*A; % symmetric positive-definite. B = A*A' if it is smaller
n2 = sqrt(eigs(B, 1)),
toc
it outputs:
n2 = 89.298
Elapsed time is 0.311942 seconds.
If you don't want to use norm or eigs, and your matrix A has good properties (singular values properly separated), you can try to approximate it with a power iteration method:
tic;
B = A'*A; % or B = A*A' if it is smaller
x = B(:,1); % example of starting point, x converges to the dominant eigenvector
x = x/norm(x);
for i = 1:200
y = B*x;
y = y/norm(y);
% norm(x - y); % <- residual, you can try to use it to stop iteration
x = y;
end;
n3 = sqrt(mean(B*x./x)) % translate eigenvalue of B to singular value of A
toc
which, for the same random matrix (whose singular values are not particularly well separated), gives a solution accurate to about 0.1%:
n3 = 89.420
Elapsed time is 0.428032 seconds.
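Two built-in functions may also be worth timing, since neither needs A'*A: svds(A,1) returns only the largest singular value, and normest(A,tol) gives an iterative estimate of the 2-norm. The sketch below uses a made-up random matrix and an arbitrary tolerance; how the timings compare will depend on the matrix and the MATLAB version.

```matlab
rng(1);                        % reproducible synthetic data (assumption)
A = randn(500, 400);           % hypothetical dense test matrix

n_exact = norm(A);             % reference: full 2-norm
n_svds  = svds(A, 1);          % largest singular value only
n_est   = normest(A, 1e-6);    % iterative estimate, tol is a relative tolerance

fprintf('norm %.6f, svds %.6f, normest %.6f\n', n_exact, n_svds, n_est);
```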
Here is the original code:
K = zeros(N*N);
for a=1:N
for i=1:I
for j=1:J
M = kron(X(:,:,a).',Y(:,:,a,i,j));
%A function that essentially adds M to K.
end
end
end
The goal is to vectorize the Kronecker product calls. My intuition is to think of X and Y as containers of matrices (for reference, the slices of X and Y being fed to kron are square matrices of order 7x7). Under this container scheme, X appears as a 1-D container and Y as a 3-D container. My next guess was to reshape Y into a 2-D container, or better yet a 1-D container, and then do element-wise multiplication of X and Y. The questions are: how would I do this reshaping in a way that preserves the trace of M, and can MATLAB even handle this container idea, or do the containers need to be reshaped further to expose the inner matrix elements?
Approach #1: Matrix multiplication with 6D permute
% Get sizes
[m1,m2,~] = size(X);
[n1,n2,N,n4,n5] = size(Y);
% Lose the third dim from X and Y with matrix-multiplication
parte1 = reshape(permute(Y,[1,2,4,5,3]),[],N)*reshape(X,[],N).';
% Rearrange the leftover dims to bring kron format
parte2 = reshape(parte1,[n1,n2,n4,n5,m1,m2]);
% Lose the dims corresponding to the last two dims coming in from Y, which
% correspond to the iterative summation as suggested in the question
out = reshape(permute(sum(sum(parte2,3),4),[1,6,2,5,3,4]),m1*n1,m2*n2)
Approach #2: Simple 7D permute
% Get sizes
[m1,m2,~] = size(X);
[n1,n2,N,n4,n5] = size(Y);
% Perform kron format elementwise multiplication between the first two dims
% of X and Y, keeping the third dim aligned and "pushing out" leftover dims
% from Y to the back
mults = bsxfun(@times,permute(X,[4,2,5,1,3]),permute(Y,[1,6,2,7,3,4,5]));
% Lose the two dims with summation reduction for final output
out = sum(reshape(mults,m1*n1,m2*n2,[]),3);
Verification
Here's a setup for running the original and the proposed approaches -
% Setup inputs
X = rand(10,10,10);
Y = rand(10,10,10,10,10);
% Original approach
[n1,n2,N,I,J] = size(Y);
K = zeros(100);
for a=1:N
for i=1:I
for j=1:J
M = kron(X(:,:,a).',Y(:,:,a,i,j));
K = K + M;
end
end
end
% Approach #2 (bsxfun with 7D permute); result stored in out1
[m1,m2,~] = size(X);
[n1,n2,N,n4,n5] = size(Y);
mults = bsxfun(@times,permute(X,[4,2,5,1,3]),permute(Y,[1,6,2,7,3,4,5]));
out1 = sum(reshape(mults,m1*n1,m2*n2,[]),3);
% Approach #1 (matrix multiplication with 6D permute); result stored in out2
[m1,m2,~] = size(X);
[n1,n2,N,n4,n5] = size(Y);
parte1 = reshape(permute(Y,[1,2,4,5,3]),[],N)*reshape(X,[],N).';
parte2 = reshape(parte1,[n1,n2,I,J,m1,m2]);
out2 = reshape(permute(sum(sum(parte2,3),4),[1,6,2,5,3,4]),m1*n1,m2*n2);
After running, we see the max. absolute deviation with the proposed approaches against the original one -
>> error_app1 = max(abs(K(:)-out1(:)))
error_app1 =
1.1369e-12
>> error_app2 = max(abs(K(:)-out2(:)))
error_app2 =
1.1937e-12
Values look good to me!
Benchmarking
Timing these three approaches using the same big dataset as used for verification, we get something like this -
----------------------------- With Loop
Elapsed time is 1.541443 seconds.
----------------------------- With BSXFUN
Elapsed time is 1.283935 seconds.
----------------------------- With MATRIX-MULTIPLICATION
Elapsed time is 0.164312 seconds.
It seems matrix multiplication does fairly well for datasets of these sizes!
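One more simplification may be worth trying before any permute tricks: kron is linear in its second argument, so if the update really is a plain sum K = K + M (as the verification above assumes), Y can be summed over its last two dims first and only the loop over a remains. A small sketch with made-up sizes; Ys and K2 are names introduced here for illustration.

```matlab
rng(0);                                  % reproducible synthetic data (assumption)
X = rand(3, 3, 4);                       % hypothetical small inputs
Y = rand(3, 3, 4, 2, 5);
[m1, m2, ~]      = size(X);
[n1, n2, N, I, J] = size(Y);

% Reference: the original triple loop with plain accumulation
K = zeros(m2*n1, m1*n2);
for a = 1:N
    for i = 1:I
        for j = 1:J
            K = K + kron(X(:,:,a).', Y(:,:,a,i,j));
        end
    end
end

% Sum Y over dims 4 and 5 once, then a single loop over a
Ys = sum(sum(Y, 4), 5);                  % n1-by-n2-by-N
K2 = zeros(size(K));
for a = 1:N
    K2 = K2 + kron(X(:,:,a).', Ys(:,:,a));
end
err = max(abs(K(:) - K2(:)));            % should be at round-off level
```

This cuts the number of kron calls from N*I*J to N; it does not apply if the real accumulation step is something other than addition.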
I have a KxLxM matrix A, which is an image with a feature vector of length M for each pixel location.
I also have a feature vector v of length M. At each pixel location of image A I want to calculate the correlation of the pixel's feature vector with my feature vector v.
I've already done this using a loop, but loops are slow in MATLAB. Does anyone have a suggestion on how to vectorize this?
function test()
A = rand(4,5,3);
v = [1 2 3];
c = somecorr(A, v);
size(c)
function c = somecorr(a,v)
c = a(:,:,1).*0;
for y = 1:size(a,1)
for x = 1:size(a,2)
c(y,x) = corr2(squeeze(a(y,x,1:length(v)))',v);
end
end
>>test()
ans =
4 5
You could try this and see if it's faster:
function c = somecorr2(a,v)
as = reshape(a,size(a,1)*size(a,2),size(a,3));
cs = corr(as',v');
c = reshape(cs,size(a,1),size(a,2));
size(c)
I only did some small tests, but it seems to be more than 100x faster. At least for my test cases.
If you do not have the 'corr' function, you can use this one, inspired by the answer to "What is a fast way to compute column by column correlation in matlab":
function C = manualCorr(A,B)
An=bsxfun(@minus,A,mean(A,1)); %%% zero-mean
Bn=bsxfun(@minus,B,mean(B,1)); %%% zero-mean
An=bsxfun(@times,An,1./sqrt(sum(An.^2,1))); %% L2-normalization
Bn=bsxfun(@times,Bn,1./sqrt(sum(Bn.^2,1))); %% L2-normalization
C=sum(An.*repmat(Bn,1,size(An,2)),1); %% correlation
For a 100x100x3 matrix I get the following runtimes:
Your version: 1.643065 seconds.
mine with 'corr': 0.007191 seconds.
mine with 'manualCorr': 0.006206 seconds.
I was using Matlab R2012a.
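To see the reshape and the toolbox-free correlation working together, and to check the result, here is a small sketch on the same kind of test data as in the question. It assumes v has exactly size(a,3) elements and uses corrcoef (base MATLAB) as an independent reference for one pixel; the variable names are made up for illustration.

```matlab
rng(0);                                   % reproducible synthetic data (assumption)
a = rand(4, 5, 3);                        % K-by-L-by-M image
v = [1 2 3];                              % feature vector of length M

[K, L, M] = size(a);
P  = reshape(a, K*L, M).';                % M-by-(K*L), one column per pixel
Pc = bsxfun(@minus, P, mean(P, 1));       % zero-mean columns
vc = v(:) - mean(v);                      % zero-mean feature vector

% Pearson correlation of every column with v, reshaped to the image grid
c = reshape((vc.' * Pc) ./ (sqrt(sum(Pc.^2, 1)) * norm(vc)), K, L);

% Independent check for one pixel
r = corrcoef(squeeze(a(2, 3, :)), v(:));  % r(1,2) is the correlation
fprintf('difference at pixel (2,3): %g\n', abs(c(2, 3) - r(1, 2)));
```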