matlab precision determinant problem - matlab
I have the following program
format compact; format short g; clear; clc;
L = 140; J = 77; Jm = 10540; G = 0.8*10^8; d = L/3;
for i=1:500000
omegan=1.+0.0001*i;
a(1,1) = ((omegan^2)*(Jm/(G*J))*d^2)-2; a(1,2) = 2; a(1,3) = 0; a(1,4) = 0;
a(2,1) = 1; a(2,2) = ((omegan^2)*(Jm/(G*J))*d^2)-2; a(2,3) = 1; a(2,4) = 0;
a(3,1) = 0; a(3,2) = 1; a(3,3) = ((omegan^2)*(Jm/(G*J))*d^2)-2; a(3,4) = 1;
a(4,1) = 0; a(4,2) = 0; a(4,3) = 2; a(4,4) = ((omegan^2)*(Jm/(G*J))*d^2)-2;
if(abs(det(a))<1E-10) sprintf('omegan= %8.3f det= %8.3f',omegan,det(a))
end
end
The analytical solution of the above system, and the same program written in Fortran, give values of omegan equal to 16.3818 and 32.7636 (these are the Fortran values; the analytical ones differ a little, but they are close).
So, now I'm wondering ... where am I going wrong with this ? Why is matlab not giving the expected results ?
(this is probably something terribly simple, but it's giving me headaches)
Your determinant threshold is too small. Matlab may be computing the determinant differently from your Fortran code (or there may be some other cause, such as the floating-point accuracy of the two methods). I'll show you that Matlab is essentially giving you the correct values, and then a better way to approach this problem in general.
First, let's take your code and change it slightly.
format compact; format short g; clear; clc;
L = 140; J = 77; Jm = 10540; G = 0.8*10^8; d = L/3;
vals = zeros(1,500000);
for i=1:500000
omegan=1.+0.0001*i;
a(1,1) = ((omegan^2)*(Jm/(G*J))*d^2)-2; a(1,2) = 2; a(1,3) = 0; a(1,4) = 0;
a(2,1) = 1; a(2,2) = ((omegan^2)*(Jm/(G*J))*d^2)-2; a(2,3) = 1; a(2,4) = 0;
a(3,1) = 0; a(3,2) = 1; a(3,3) = ((omegan^2)*(Jm/(G*J))*d^2)-2; a(3,4) = 1;
a(4,1) = 0; a(4,2) = 0; a(4,3) = 2; a(4,4) = ((omegan^2)*(Jm/(G*J))*d^2)-2;
vals(i) = abs(det(a));
if(vals(i)<1E-10)
sprintf('omegan= %8.3f det= %8.3f',omegan,det(a))
end
end
plot(1.+0.0001*(1:500000),log(vals))
All I've really done is store the absolute value of the determinant for every value of omegan and plot the log of those values as a function of omegan. (The plot is not reproduced here.)
The graph shows three major dips. Two coincide with your results of 16.3818 and 32.7636, but there is also an additional one that you were missing (probably because your condition of the determinant being less than 1e-10 was too strict even for your Fortran code to pick it up). So Matlab is also telling you that those are the values of omegan you were looking for, but because the determinant is computed in a different manner in Matlab, the values aren't the same, which is not surprising when dealing with badly conditioned matrices. It may also have to do with Fortran using single-precision floats, as someone else said. I'm not going to look into why the values differ. Instead, let's look at what you are trying to do and try a different approach.
You, as I'm sure you are aware, are trying to find the eigenvalues of the matrix
a = [[-2 2 0 0]; [1 -2 1 0]; [0 1 -2 1]; [0 0 2 -2]];
set them equal to
-omegan^2*(Jm/(G*J)*d^2)
and solve for omegan. This is how I went about it:
format compact; format short g; clear; clc;
L = 140; J = 77; Jm = 10540; G = 0.8*10^8; d = L/3;
C1 = (Jm/(G*J)*d^2);
a = [[-2 2 0 0]; [1 -2 1 0]; [0 1 -2 1]; [0,0,2,-2]];
myeigs = eig(a);
myeigs(abs(myeigs) < eps) = 0.0;
for i=1:4
sprintf('omegan= %8.3f', sqrt(-myeigs(i)/C1))
end
This gives you all four solutions - not just the two that you had found with your Fortran code (though one of them, zero, was outside of your testing range for omegan ). If you want to go about solving this by checking the determinant in Matlab, as you've been trying to do, then you'll have to play with the value that you're checking the absolute value of the determinant to be less than. I got it to work for a value of 1e-4 (it gave 3 solutions: 16.382, 28.374, and 32.764).
Sorry for such a long solution, but hopefully it helps.
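If you would rather stay with the determinant approach, one option is to look for sign changes of det(a) instead of testing against a threshold. The following is only a sketch under my own assumptions: the names T, detfun and roots_found are made up for illustration, the grid spacing of 0.01 is arbitrary, and it relies on the determinant actually crossing zero at each root (which should hold here because the roots are simple).

```matlab
% Sketch: bracket the zeros of det(a) by sign change, then refine with fzero.
L = 140; J = 77; Jm = 10540; G = 0.8e8; d = L/3;
C1 = Jm/(G*J)*d^2;

% Off-diagonal part of the matrix a from the question (hypothetical name T)
T = [0 2 0 0; 1 0 1 0; 0 1 0 1; 0 0 2 0];
detfun = @(w) det((C1*w^2 - 2)*eye(4) + T);

w = 1:0.01:51;                          % coarse grid over the same range
f = arrayfun(detfun, w);
idx = find(f(1:end-1).*f(2:end) < 0);   % intervals where the sign flips

roots_found = zeros(size(idx));
for k = 1:numel(idx)
    roots_found(k) = fzero(detfun, [w(idx(k)) w(idx(k)+1)]);
end
disp(roots_found)
```

A much coarser grid is enough this way, because fzero does the refinement and no tolerance on the determinant has to be guessed.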
Update:
In my first block of code above, I replaced
vals(i) = abs(det(a));
with
[L,U] = lu(a);
s = det(L);
vals(i) = abs(s*prod(diag(U)));
which is the algorithm that det is supposedly using according to the Matlab docs. Now, I am able to use 1E-10 as the condition and it works. So maybe Matlab isn't calculating the determinant exactly as the docs say? This is kind of disturbing.
New answer:
You can investigate this problem using symbolic equations, which gives me the correct answers:
>> clear all %# Clear all existing variables
>> format long %# Display more digits of precision
>> syms Jm d omegan G J %# Your symbolic variables
>> a = ((Jm*(d*omegan)^2)/(G*J)-2).*eye(4)+... %# Create the matrix a
diag([2 1 1],1)+...
diag([1 1 2],-1);
>> solns = solve(det(a),'omegan') %# Solve for where the determinant is 0
solns =
0
0
(G*J*Jm)^(1/2)/(Jm*d)
-(G*J*Jm)^(1/2)/(Jm*d)
-(2*(G*J*Jm)^(1/2))/(Jm*d)
(2*(G*J*Jm)^(1/2))/(Jm*d)
(3^(1/2)*(G*J*Jm)^(1/2))/(Jm*d)
-(3^(1/2)*(G*J*Jm)^(1/2))/(Jm*d)
>> solns = subs(solns,{G,J,Jm,d},{8e7,77,10540,140/3}) %# Substitute values
solns =
0
0
16.381862247021893
-16.381862247021893
-32.763724494043785
32.763724494043785
28.374217734436371
-28.374217734436371
I think you either just weren't choosing values in your loop close enough to the solutions for omegan or your threshold for how close the determinant is to zero is too strict. When I plug in the given values to a, along with omegan = 16.3819 (which is the closest value to one solution your loop produces), I get this:
>> det(subs(a,{omegan,G,J,Jm,d},{16.3819,8e7,77,10540,140/3}))
ans =
2.765476845475786e-005
Which is still larger in absolute amplitude than 1e-10.
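To put a number on why the threshold cannot be met, you can estimate the slope of the determinant at a root and multiply by the grid spacing. This is a rough sketch, not part of the original program: detfun, w0, slope and typical_det are names I made up, and the central difference with h = 1e-6 is just a convenient choice.

```matlab
% Sketch: how small can |det(a)| get on a grid with spacing 1e-4?
Jm = 10540; G = 8e7; J = 77; d = 140/3;
C1 = Jm/(G*J)*d^2;
T = [0 2 0 0; 1 0 1 0; 0 1 0 1; 0 0 2 0];      % off-diagonal part of a
detfun = @(w) det((C1*w^2 - 2)*eye(4) + T);

w0 = (G*J*Jm)^(1/2)/(Jm*d);                    % first positive root from solns
h = 1e-6;
slope = (detfun(w0 + h) - detfun(w0 - h))/(2*h);   % d(det)/d(omegan) at w0

step = 1e-4;                                   % grid spacing used in the loop
typical_det = abs(slope)*step/2;               % |det| half a step from the root
fprintf('slope = %g, |det| half a step away = %g\n', slope, typical_det);
```

The slope comes out at roughly 0.73, so a grid point would have to land within about 1e-10 of the root for abs(det(a)) to drop below 1E-10, which a step of 0.0001 will essentially never do.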
I put this as an answer because I cannot paste this into a comment: Here's how Matlab calculates the determinant. I assume the rounding errors come from calculating the product of multiple diagonal elements in U.
Algorithm
The determinant is computed from the triangular factors obtained by Gaussian elimination
[L,U] = lu(A)
s = det(L) %# This is always +1 or -1
det(A) = s*prod(diag(U))
Related
Matlab function for cumulative power
Is there a function in MATLAB that generates the following matrix for a given scalar r:
1 r r^2 r^3 ... r^n
0 1 r r^2 ... r^(n-1)
0 0 1 r ... r^(n-2)
...
0 0 0 0 ... 1
where each row behaves somewhat like a power analog of the CUMSUM function?
You can compute each term directly using implicit expansion and element-wise power, and then apply triu:
n = 5; % size
r = 2; % base
result = triu(r.^max((1:n)-(1:n).',0));
Or, maybe a little faster because it doesn't compute unwanted powers:
n = 5; % size
r = 2; % base
t = (1:n)-(1:n).';
u = find(t>=0);
t = t(u);
result = zeros(n);
result(u) = r.^t;
Using cumprod and triu:
% parameters
n = 5;
r = 2;
% Create a square matrix filled with 1:
A = ones(n);
% Assign the upper triangular part shifted by one with r
A(triu(A,1)==1)=r;
% cumprod along the second dimension and get only the upper triangular part
A = triu(cumprod(A,2))
Well, cumsum accumulates the sum of a vector, but you are asking for a specially designed matrix, so the comparison is a bit problematic. Anyway, there might be a function for this if it is a common special-case triangular matrix (my mathematical knowledge is limited here, sorry), but we can also build it quite easily (and efficiently):
N = 10;
r = 2;
% allocate array
ary = ones(1,N);
% initialize array
ary(2) = r;
for i = 3:N
    ary(i) = ary(i-1)*r;
end
% build matrix i.e. copy the array
M = eye(N);
for i = 1:N
    M(i,i:end) = ary(1:end-i+1);
end
This assumes that you want a matrix of size NxN and that r is the value whose powers you want to calculate.
FIX: a previous version stated in line 13 M(i,i:end) = ary(i:end);, but the assignment always needs to start at the first position of ary.
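Since every diagonal of the requested matrix is constant, it is an upper-triangular Toeplitz matrix, so another possible variant (a sketch of mine, not taken from the answers above) is to build it with toeplitz and keep the upper triangle. As in the other answers, the result is n-by-n with powers up to r^(n-1).

```matlab
% Sketch: upper-triangular Toeplitz matrix of powers of r.
n = 5;   % size
r = 2;   % base

% toeplitz(c) with a single vector builds a symmetric Toeplitz matrix whose
% first row and column are c; triu then discards the part below the diagonal.
result = triu(toeplitz(r.^(0:n-1)));
disp(result)
```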
Modified Richardson Iteration - How to implement
I have an assignment to do in Matlab. I have to implement the modified Richardson iteration. I couldn't really understand the algorithm, but I came up with this:
A = [9 1 1; 2 10 3; 3 4 11];
b = [10; 19; 0];
x = [0; 0; 0];
G=eye(3)-A; %I-A
z = [0,x'];
for k=1:30
    x = G*x + b;
    z = [k,x'];
    fprintf('Number of Iterations: %d \n', k);
    display(z);
end
The output I receive is wrong and I don't really know why. Any help is well received. Thanks!
You are missing the omega parameter. From the wiki page, the iteration is:
x(k+1) = x(k) + omega*( b - A*x(k) ) = (I - omega*A)*x(k) + omega*b
where omega is a scalar parameter that has to be chosen appropriately. So you need to change your calculation of G to:
G = eye(3)-omega*A;
and the calculation of x inside the loop to:
x = G*x + omega*b;
The wiki page discusses how the value of omega can be chosen. For your particular case, omega = 0.1 seems to work well.
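Putting the two changes together, a complete run could look like the sketch below. The iteration count of 200 and the comparison against A\b are my own additions for checking convergence, and x_ref is a made-up name. With omega = 0.1 the infinity norm of I - omega*A is 0.8 for this matrix, so the error shrinks by at least that factor per step.

```matlab
% Sketch: modified Richardson iteration with a fixed omega.
A = [9 1 1; 2 10 3; 3 4 11];
b = [10; 19; 0];
omega = 0.1;                 % value suggested above
G = eye(3) - omega*A;        % iteration matrix I - omega*A

x = [0; 0; 0];
for k = 1:200                % assumed iteration count, generous for this case
    x = G*x + omega*b;
end

x_ref = A\b;                 % direct solve, only used as a reference
fprintf('error after %d iterations: %g\n', k, norm(x - x_ref));
```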
Replicate vectors shifting them to the right
In Matlab, I have two single-row (1x249) vectors in a 2x249 matrix and I have to create a matrix A by replicating them many times, each time shifting the vectors 2 positions to the right. I would like to fill the entries on the left with zeros. Is there a smart way to do this? Currently, I am using a for loop and circshift, and at each iteration I add the new rows to A, but this is probably highly inefficient.
Code (myMat is the matrix I want to shift):
A = [];
myMat = [1 0 -1 zeros(1,246); 0 2 0 -2 zeros(1,245)];
N = 20;
for i=1:N-1
    aux = circshift(myMat,[0,2*(i-1)]);
    aux(:,1:2*(i-1)) = 0;
    A =[A; aux];
end
As you are probably aware, loops in Matlab are not so efficient. I know that the Mathworks keep saying this is no longer so with JIT compilation, but I haven't experienced the fast loops yet. I put your method for constructing the matrix A in a function:
function A = replvector1(myMat,shift_right,width,N)
pre_alloc = true; % make implementation faster using pre-allocation yes/no
% Pad myMat with zeros to make it wide enough
myMat(1,width)=0;
% initialize A
if pre_alloc
    A = zeros(size(myMat,1)*(N-1),width);
else
    A = [];
end
% Fill A
for i=1:N-1
    aux = circshift(myMat,[0,shift_right*(i-1)]);
    aux(:,1:min(width,shift_right*(i-1))) = 0;
    A(size(myMat,1)*(i-1)+1:size(myMat,1)*i,:) =aux;
end
Your matrix operation looks a lot like a Kronecker product, but the block matrices have overlapping column ranges, so a direct Kronecker product will not work. Instead, I constructed the following function:
function A = replvector2(myMat,shift_right,width,N)
[i,j,a] = find(myMat);
i = kron(ones(N-1,1),i) + kron([0:N-2]',ones(size(i))) * size(myMat,1);
j = kron(ones(N-1,1),j) + kron([0:N-2]',ones(size(j))) * shift_right;
a = kron(ones(N-1,1),a);
ok = j<=width;
A = full(sparse(i(ok),j(ok),a(ok),(N-1)*size(myMat,1),width));
You can follow the algorithm by removing semicolons and looking at intermediate results. The following main program runs your example, and can easily be modified to run similar examples:
% inputs (you may vary them to see that it always works)
shift_right = 2;
width = 249;
myMat = [ 1 0 -1 0 ; 0 2 0 -2 ];
N = 20;
% Run your implementation
tic;
A = replvector1(myMat,shift_right,width,N);
disp(sprintf('\n original implementation took %e sec',toc))
% Run the new implementation
tic;
B = replvector2(myMat,shift_right,width,N);
disp(sprintf(' new implementation took %e sec',toc))
disp(sprintf('\n norm(B-A)=%e\n',norm(B-A)))
I've taken Nathan's code (see his answer to this question), and added another possible implementation (replvector3). My idea here stems from you not really needing a circular shift. You need to right-shift and add zeros to the left. If you start with a pre-allocated array (this is really where the big wins in time are for you, the rest is peanuts), then you already have the zeros. Now you just need to copy myMat over to the right locations.
These are the times I see (MATLAB R2017a):
OP's, with pre-allocation: 1.1730e-04
Nathan's:                   5.1992e-05
Mine:                       3.5426e-05
                            ^ shift by one on purpose, to make comparison of times easier
This is the full code; copy-paste it into an M-file and run:
function so
shift_right = 2;
width = 249;
myMat = [ 1 0 -1 0 ; 0 2 0 -2 ];
N = 20;
A = replvector1(myMat,shift_right,width,N);
B = replvector2(myMat,shift_right,width,N);
norm(B(:)-A(:))
C = replvector3(myMat,shift_right,width,N);
norm(C(:)-A(:))
timeit(@()replvector1(myMat,shift_right,width,N))
timeit(@()replvector2(myMat,shift_right,width,N))
timeit(@()replvector3(myMat,shift_right,width,N))
% Original version, modified to pre-allocate
function A = replvector1(myMat,shift_right,width,N)
% Assuming width > shift_right * (N-1) + size(myMat,2)
myMat(1,width) = 0;
M = size(myMat,1);
A = zeros(M*(N-1),width);
for i = 1:N-1
    aux = circshift(myMat,[0,shift_right*(i-1)]);
    aux(:,1:shift_right*(i-1)) = 0;
    A(M*(i-1)+(1:M),:) = aux;
end
% Nathan's version
function A = replvector2(myMat,shift_right,width,N)
[i,j,a] = find(myMat);
i = kron(ones(N-1,1),i) + kron((0:N-2)',ones(size(i))) * size(myMat,1);
j = kron(ones(N-1,1),j) + kron((0:N-2)',ones(size(j))) * shift_right;
a = kron(ones(N-1,1),a);
ok = j<=width;
A = full(sparse(i(ok),j(ok),a(ok),(N-1)*size(myMat,1),width));
% My trivial version with loops
function A = replvector3(myMat,shift_right,width,N)
% Assuming width > shift_right * (N-1) + size(myMat,2)
[M,K] = size(myMat);
A = zeros(M*(N-1),width);
for i = 1:N-1
    A(M*(i-1)+(1:M),shift_right*(i-1)+(1:K)) = myMat;
end
Optimizing repetitive estimation (currently a loop) in MATLAB
I've found myself needing to do a least-squares (or similar matrix-based operation) for every pixel in an image. Every pixel has a set of numbers associated with it, and so it can be arranged as a 3D matrix.
(This next bit can be skipped)
Quick explanation of what I mean by least-squares estimation:
Let's say we have some quadratic system that is modeled by Y = Ax^2 + Bx + C and we're looking for those A,B,C coefficients. With a few samples (at least 3) of X and the corresponding Y, we can estimate them as follows:
Arrange the (let's say 10) X samples into a matrix like X = [x(:).^2 x(:) ones(10,1)];
Arrange the Y samples into a similar matrix: Y = y(:);
Estimate the coefficients A,B,C by solving: coeffs = (X'*X)^(-1)*X'*Y;
Try this on your own if you want:
A = 5; B = 2; C = 1;
x = 1:10;
y = A*x(:).^2 + B*x(:) + C + .25*randn(10,1); % added some noise here
X = [x(:).^2 x(:) ones(10,1)];
Y = y(:);
coeffs = (X'*X)^-1*X'*Y
coeffs =
5.0040
1.9818
0.9241
START PAYING ATTENTION AGAIN IF I LOST YOU THERE
*MAJOR REWRITE* I've modified the example to bring it as close as possible to the real problem I have while still keeping it a minimal working example.
Problem Setup
%// Setup
xdim = 500;
ydim = 500;
ncoils = 8;
nshots = 4;
%// matrix size for each pixel is ncoils x nshots (an overdetermined system)
%// each pixel has a matrix stored in the 3rd and 4th dimensions
regressor = randn(xdim,ydim, ncoils,nshots);
regressand = randn(xdim, ydim,ncoils);
So my problem is that I have to do a (X'*X)^-1*X'*Y (least-squares or similar) operation for every pixel in an image. While that operation itself is vectorized, the only way I have found to do it for every pixel is in a for loop, like:
Original code style
%// Actual work
tic
estimate = zeros(xdim,ydim);
for col=1:size(regressor,2)
    for row=1:size(regressor,1)
        X = squeeze(regressor(row,col,:,:));
        Y = squeeze(regressand(row,col,:));
        B = X\Y;
        % B = (X'*X)^(-1)*X'*Y; %// equivalently
        estimate(row,col) = B(1);
    end
end
toc
Elapsed time = 27.6 seconds
EDITS in response to comments and other ideas
I tried some things:
1. Reshaped into a long vector and removed the double for loop. This saved some time.
2. Removed the squeeze (and in-line transposing) by permuting the picture beforehand. This saved a lot more time.
Current example:
%// Actual work
tic
estimate2 = zeros(xdim*ydim,1);
regressor_mod = permute(regressor,[3 4 1 2]);
regressor_mod = reshape(regressor_mod,[ncoils,nshots,xdim*ydim]);
regressand_mod = permute(regressand,[3 1 2]);
regressand_mod = reshape(regressand_mod,[ncoils,xdim*ydim]);
for ind=1:size(regressor_mod,3) % for every pixel
    X = regressor_mod(:,:,ind);
    Y = regressand_mod(:,ind);
    B = X\Y;
    estimate2(ind) = B(1);
end
estimate2 = reshape(estimate2,[xdim,ydim]);
toc
Elapsed time = 2.30 seconds (avg of 10)
isequal(estimate2,estimate) == 1;
Rody Oldenhuis's way
N = xdim*ydim*ncoils; %// number of columns
M = xdim*ydim*nshots; %// number of rows
ii = repmat(reshape(1:N,[ncoils,xdim*ydim]),[nshots 1]); %//column indices
jj = repmat(1:M,[ncoils 1]); %//row indices
X = sparse(ii(:),jj(:),regressor_mod(:));
Y = regressand_mod(:);
B = X\Y;
B = reshape(B(1:nshots:end),[xdim ydim]);
Elapsed time = 2.26 seconds (avg of 10) or 2.18 seconds (if you don't include the definition of N,M,ii,jj)
SO THE QUESTION IS: Is there an (even) faster way? (I don't think so.)
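For readers who skipped the explanation, a noise-free version of the quadratic fit makes it easy to check that the backslash operator and the normal equations agree. This is only an illustrative sketch: c_normal and c_backslash are names I made up, and the noise term is dropped so the expected coefficients are exactly 5, 2 and 1.

```matlab
% Sketch: normal equations versus backslash on a noise-free quadratic.
A = 5; B = 2; C = 1;
x = (1:10).';
y = A*x.^2 + B*x + C;            % no noise, so the fit should be exact

X = [x.^2 x ones(10,1)];         % design matrix, one row per sample
c_normal    = (X'*X)\(X'*y);     % normal equations
c_backslash = X\y;               % least squares via backslash

disp([c_normal c_backslash])
```

Both columns should agree to many digits. The backslash form avoids forming X'*X explicitly, which is generally the better-conditioned route.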
You can achieve a speed-up of roughly a factor of 2 by precomputing the transpose of X, i.e.
for x=1:size(picture,2) % second dimension b/c already transposed
    X = picture(:,x);
    XX = X';
    Y = randn(n_timepoints,1);
    %B = (X'*X)^-1*X'*Y;
    B = (XX*X)^-1*XX*Y;
    est(x) = B(1);
end
Before: Elapsed time is 2.520944 seconds.
After: Elapsed time is 1.134081 seconds.
EDIT: Your code, as it stands in your latest edit, can be replaced by the following
tic
xdim = 500;
ydim = 500;
n_timepoints = 10; % for example
% Actual work
picture = randn(xdim,ydim,n_timepoints);
picture = reshape(picture, [xdim*ydim,n_timepoints])'; % note transpose
YR = randn(n_timepoints,size(picture,2));
% (XX*X).^-1 = sum(picture.*picture).^-1;
% XX*Y = sum(picture.*YR);
est = sum(picture.*picture).^-1 .* sum(picture.*YR);
est = reshape(est,[xdim,ydim]);
toc
Elapsed time is 0.127014 seconds.
This is an order of magnitude speed-up on the latest edit, and the results are all but identical to the previous method.
EDIT2: Okay, so if X is a matrix, not a vector, things are a little more complicated. We basically want to precompute as much as possible outside of the for-loop to keep our costs down. We can also get a significant speed-up by computing XT*X manually: since the result will always be a symmetric matrix, we can cut a few corners to speed things up.
First, the symmetric multiplication function:
function XTX = sym_mult(X) % X is a 3-d matrix
n = size(X,2);
XTX = zeros(n,n,size(X,3));
for i=1:n
    for j=i:n
        XTX(i,j,:) = sum(X(:,i,:).*X(:,j,:));
        if i~=j
            XTX(j,i,:) = XTX(i,j,:);
        end
    end
end
Now the actual computation script
xdim = 500;
ydim = 500;
n_timepoints = 10; % for example
Y = randn(10,xdim*ydim);
picture = randn(xdim,ydim,n_timepoints); % 500x500x10
% Actual work
tic % start timing
picture = reshape(picture, [xdim*ydim,n_timepoints])';
% Here we precompute the (XT*Y) calculation to speed things up later
picture_y = [sum(Y);sum(Y.*picture)];
% initialize
est = zeros(size(picture,2),1);
picture = permute(picture,[1,3,2]);
XTX = cat(2,ones(n_timepoints,1,size(picture,3)),picture);
XTX = sym_mult(XTX); % precompute (XT*X) for speed
X = zeros(2,2); % preallocate for speed
XY = zeros(2,1);
for x=1:size(picture,3) % third dimension: pixels, after the permute above
    %For some reason this is a lot faster than X = XTX(:,:,x);
    X(1,1) = XTX(1,1,x);
    X(2,1) = XTX(2,1,x);
    X(1,2) = XTX(1,2,x);
    X(2,2) = XTX(2,2,x);
    XY(1) = picture_y(1,x);
    XY(2) = picture_y(2,x);
    % Here we utilise the fact that A\B is faster than inv(A)*B
    % We also use the fact that (A*B)*C = A*(B*C) to speed things up
    B = X\XY;
    est(x) = B(1);
end
est = reshape(est,[xdim,ydim]);
toc % end timing
Before: Elapsed time is 4.56 seconds.
After: Elapsed time is 2.24 seconds.
This is a speed-up of about a factor of 2. This code should be extensible to X being any dimensions you want. For instance, in the case where X = [1 x x^2], you would change picture_y to the following
picture_y = [sum(Y);sum(Y.*picture);sum(Y.*picture.^2)];
and change XTX to
XTX = cat(2,ones(n_timepoints,1,size(picture,3)),picture,picture.^2);
You would also change a lot of 2s to 3s in the code, and add XY(3) = picture_y(3,x) to the loop. It should be fairly straightforward, I believe.
Results
I sped up your original version, since your edit 3 was actually not working (and also does something different). So, on my PC:
Your (original) version: 8.428473 seconds.
My obfuscated one-liner given below: 0.964589 seconds.
First, for no other reason than to impress, I'll give it as I wrote it:
%%// Some example data
xdim = 500;
ydim = 500;
n_timepoints = 10; % for example
estimate = zeros(xdim,ydim); %// initialization with explicit size
picture = randn(xdim,ydim,n_timepoints);
%%// Your original solution
%// (slightly altered to make my version's results agree with yours)
tic
Y = randn(n_timepoints,xdim*ydim);
ii = 1;
for x = 1:xdim
    for y = 1:ydim
        X = squeeze(picture(x,y,:)); %// or similar creation of X matrix
        B = (X'*X)^(-1)*X' * Y(:,ii);
        ii = ii+1;
        %// sometimes you keep everything and do
        %// estimate(x,y,:) = B(:);
        %// sometimes just the first element is important and you do
        estimate(x,y) = B(1);
    end
end
toc
%%// My version
tic
%// UNLEASH THE FURY!!
estimate2 = reshape(sparse(1:xdim*ydim*n_timepoints, ...
    builtin('_paren', ones(n_timepoints,1)*(1:xdim*ydim),:), ...
    builtin('_paren', permute(picture, [3 2 1]),:))\Y(:), ydim,xdim).'; %'
toc
%%// Check for equality
max(abs(estimate(:)-estimate2(:))) % (always less than ~1e-14)
Breakdown
First, here's the version that you should actually use:
%// Construct sparse block-diagonal matrix
%// (Type "help sparse" for more information)
N = xdim*ydim; %// number of columns
M = N*n_timepoints; %// number of rows
ii = 1:M;
jj = ones(n_timepoints,1)*(1:N);
s = permute(picture, [3 2 1]);
X = sparse(ii,jj(:), s(:));
%// Compute ALL the estimates at once
estimates = X\Y(:);
%// You loop through the *second* dimension first, so to make everything
%// agree, we have to extract elements in the "wrong" order, and transpose:
estimate2 = reshape(estimates, ydim,xdim).'; %'
Here's an example of what picture and the corresponding matrix X look like for xdim = ydim = n_timepoints = 2:
>> clc, picture, full(X)
picture(:,:,1) =
-0.5643 -2.0504
-0.1656 0.4497
picture(:,:,2) =
0.6397 0.7782
0.5830 -0.3138
ans =
-0.5643 0 0 0
0.6397 0 0 0
0 -2.0504 0 0
0 0.7782 0 0
0 0 -0.1656 0
0 0 0.5830 0
0 0 0 0.4497
0 0 0 -0.3138
You can see why sparse is necessary: the matrix is mostly zeros, but will grow large quickly. The full matrix would quickly consume all your RAM, while the sparse one will not consume much more than the original picture matrix does.
With this matrix X, the new problem
X·b = Y
now contains all the problems
X1 · b1 = Y1
X2 · b2 = Y2
...
where
b = [b1; b2; b3; ...]
Y = [Y1; Y2; Y3; ...]
so the single command
X\Y
will solve all your systems at once.
This offloads all the hard work to a set of highly specialized, compiled to machine-specific code, optimized-in-every-way algorithms, rather than the interpreted, generic, always-two-steps-away from the hardware loops in MATLAB.
It should be straightforward to convert this to a version where X is a matrix; you'll end up with something like what blkdiag does, which can also be used by mldivide in exactly the same way as above.
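To illustrate the last remark for the case where each per-pixel X is a matrix, here is a tiny sketch with two made-up 3x2 systems stacked through blkdiag. The numbers and the names X1, X2, Y1, Y2 and b_ref are purely illustrative, and I am assuming that backslash on the sparse rectangular matrix returns the least-squares solution, as it does for full-rank problems like this one.

```matlab
% Sketch: two small overdetermined systems solved at once via a
% sparse block-diagonal matrix.
X1 = [1 0; 1 1; 1 2];   Y1 = [1; 2; 4];
X2 = [1 1; 1 2; 1 4];   Y2 = [0; 1; 1];

X = sparse(blkdiag(X1, X2));     % 6x4 block-diagonal system matrix
Y = [Y1; Y2];

b = X\Y;                         % stacked solution [b1; b2]
b_ref = [X1\Y1; X2\Y2];          % each system solved separately

disp(max(abs(b - b_ref)))        % should be at round-off level
```

Because the blocks do not share rows or columns, the least-squares problem decouples, so the stacked solve and the separate solves should agree up to round-off.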
I had a wee play around with an idea, and I decided to stick it in as a separate answer, as it's a completely different approach to my other idea, and I don't actually condone what I'm about to do. I think this is the fastest approach so far:
Original (unoptimised): 13.507176 seconds.
Fast Cholesky-decomposition method: 0.424464 seconds
First, we've got a function to quickly do the X'*X multiplication. We can speed things up here because the result will always be symmetric.
function XX = sym_mult(X)
n = size(X,2);
XX = zeros(n,n,size(X,3));
for i=1:n
    for j=i:n
        XX(i,j,:) = sum(X(:,i,:).*X(:,j,:));
        if i~=j
            XX(j,i,:) = XX(i,j,:);
        end
    end
end
Then we have a function to do the LDL Cholesky decomposition of a 3D matrix (we can do this because the (X'*X) matrix will always be symmetric) and then do forward and backward substitution to solve the LDL inversion equation
function Y = fast_chol(X,XY)
n=size(X,2);
L = zeros(n,n,size(X,3));
D = zeros(n,n,size(X,3));
B = zeros(n,1,size(X,3));
Y = zeros(n,1,size(X,3));
% These loops compute the LDL decomposition of the 3D matrix
for i=1:n
    D(i,i,:) = X(i,i,:);
    L(i,i,:) = 1;
    for j=1:i-1
        L(i,j,:) = X(i,j,:);
        for k=1:(j-1)
            L(i,j,:) = L(i,j,:) - L(i,k,:).*L(j,k,:).*D(k,k,:);
        end
        D(i,j,:) = L(i,j,:);
        L(i,j,:) = L(i,j,:)./D(j,j,:);
        if i~=j
            D(i,i,:) = D(i,i,:) - L(i,j,:).^2.*D(j,j,:);
        end
    end
end
for i=1:n
    B(i,1,:) = XY(i,:);
    for j=1:(i-1)
        B(i,1,:) = B(i,1,:)-D(i,j,:).*B(j,1,:);
    end
    B(i,1,:) = B(i,1,:)./D(i,i,:);
end
for i=n:-1:1
    Y(i,1,:) = B(i,1,:);
    for j=n:-1:(i+1)
        Y(i,1,:) = Y(i,1,:)-L(j,i,:).*Y(j,1,:);
    end
end
Finally, we have the main script which calls all of this
xdim = 500;
ydim = 500;
n_timepoints = 10; % for example
Y = randn(10,xdim*ydim);
picture = randn(xdim,ydim,n_timepoints); % 500x500x10
tic % start timing
picture = reshape(picture, [xdim*ydim,n_timepoints])';
% Here we precompute the (XT*Y) calculation
picture_y = [sum(Y);sum(Y.*picture)];
% initialize
est2 = zeros(size(picture,2),1);
picture = permute(picture,[1,3,2]);
% Now we calculate the X'*X matrix
XTX = cat(2,ones(n_timepoints,1,size(picture,3)),picture);
XTX = sym_mult(XTX);
% Call our fast Cholesky decomposition routine
B = fast_chol(XTX,picture_y);
est2 = B(1,:);
est2 = reshape(est2,[xdim,ydim]);
toc
Again, this should work equally well for an Nx3 X matrix, or however big you want.
I use Octave, so I can't say anything about the resulting performance in Matlab, but I would expect this code to be slightly faster:
pictureT=picture';
est=arrayfun(@(x)( (pictureT(x,:)*picture(:,x))^-1*pictureT(x,:)*randn(n_timepoints,1)),1:size(picture,2));
Written Convolution function in matlab giving trouble
Hey there, I've been having difficulty writing the Matlab equivalent of the conv(x,y) function. I can't figure out why this gives the incorrect output for the arrays x1 = [1 2 1] and x2 = [3 1 1]. Here's what I have:
x1 = [1 2 1];
x2 = [3 1 1];
x1len = leng(x1);
x2len = leng(x2);
len = x1len + x2len - 1;
x1 = zeros(1,len);
x2 = zeros(1,len);
buffer = zeros(1,len);
answer = zeros(1,len);
for n = 1:len
    buffer(n) = x(n);
    answer(n) = 0;
    for i = 1:len
        answer(n) = answer(n) + x(i) * buffer(i);
    end
end
The Matlab conv(x1,x2) gives 3 7 6 3 1 as the output, but this is giving me 3 5 6 6 6 for answer. Where have I gone wrong? Also, sorry for the formatting, I am on Opera Mini.
Aside from not having x defined, and having all zeroes for your variables x1, x2, buffer, and answer, I'm not certain why you have your nested loops set up like they are. I don't know why you need to reproduce the behavior of CONV this way, but here's how I would set up a nested for-loop solution:
X = [1 2 1];
Y = [3 1 1];
nX = length(X);
nY = length(Y);
nOutput = nX+nY-1;
output = zeros(1,nOutput);
for indexY = 1:nY
    for indexX = 1:nX
        indexOutput = indexY+indexX-1;
        output(indexOutput) = output(indexOutput) + X(indexX)*Y(indexY);
    end
end
However, since this is MATLAB, there are vectorized alternatives to looping in this way. One such solution is the following, which uses the functions SUM, SPDIAGS, and FLIPUD:
output = sum(spdiags(flipud(X(:))*Y));
In the code as given, all vectors are zeroed out before you start, except for x which is never defined. So it's hard to see exactly what you're getting at. But a couple of things to note: In your inner for loop you are using values of buffer which have not yet been set by your outer loop. The inner loop always covers the full range 1:len rather than shifting one vector relative to the other. You might also want to think about "vectorizing" some of this rather than nesting for loops -- eg, your inner loop is just calculating a dot product, for which a perfectly good Matlab function already exists. (Of course the same can be said for conv -- but I guess you're reimplementing either as homework or to understand how it works?)
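To make the dot-product remark concrete, here is one way the loop could be arranged so that each output sample is a dot product between a sliding window of the zero-padded first vector and the flipped second vector. It is a sketch of the idea rather than a fix of the original code line by line: xpad and h are names I introduced.

```matlab
% Sketch: convolution as a sliding dot product.
x1 = [1 2 1];
x2 = [3 1 1];
n1 = numel(x1);
n2 = numel(x2);
len = n1 + n2 - 1;

% Pad x1 with n2-1 zeros on each side so every window has n2 samples
xpad = [zeros(1,n2-1) x1 zeros(1,n2-1)];
h = fliplr(x2);                        % convolution flips one operand

answer = zeros(1,len);
for n = 1:len
    answer(n) = dot(xpad(n:n+n2-1), h);   % shift the window by one each step
end
disp(answer)
```

For the vectors in the question this reproduces the 3 7 6 3 1 that conv(x1,x2) returns.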