does white noise without mean zero do some changes? - matlab

again my question is related to white noise ,but with different meaning.let us compare following two code.first
function [ x ] = generate(N,m,A3)
f1 = 100;
f2 = 200;
T = 1./f1;
t = (0:(N*T/m):(N*T))'; %'
wn = rand(length(t),1).*2 - 1;
x = 20.*sin(2.*pi.*f1.*t) + 30.*cos(2.*pi.*f2.*t) + A3.*wn;
%[pks,locs] = findpeaks(x);
plot(x)
end
using generate(3,500,10)
graph of this code is following
but let us change our code so that it makes zero mean with white noise
function [ x ] = generate1(N,m,A3)
f1 = 100;
f2 = 200;
T = 1./f1;
t = (0:(N*T/m):(N*T))'; %'
wn = rand(length(t),1).*2 - 1;
mn=wn-mean(wn);
x = 20.*sin(2.*pi.*f1.*t) + 30.*cos(2.*pi.*f2.*t) + A3.*mn;
%[pks,locs] = findpeaks(x);
plot(x)
end
and graph is following
if we compare these two picture,we could say that it is almost same,just some changes,so does matter if we make zero mean or not?for real analysis,like for finding
peaks and so on.thanks very much
UPDATED:
there is updated code
function [ x ] = generate1(N,m,A3)
f1 = 100;
f2 = 200;
T = 1./f1;
t = (0:(N*T/m):(N*T))'; %'
wn = randn(length(t),1); %zero mean variance 1
x = 20.*sin(2.*pi.*f1.*t) + 30.*cos(2.*pi.*f2.*t) + A3.*wn;
%[pks,locs] = findpeaks(x);
plot(x)
end
and it's picture

Your initial noise is uniformly distributed between -1 & +1
Your second noise is also uniformly disributed between -1 & +1, because mean is already zero, subtracting it is meaningless
in order to obtain white noise you can use randn() function:
wn = randn(length(t),1); %zero mean variance 1
You may not observe any much difference again if your noise coefficient A3 has a much lower value compared to 20 & 30 which are the coefficients of your signal.
In order to find peaks, adding noise may not serve any purpose because noise tends to decrease the information content of signals

What is the value of mean(wm)? If it is close to zero, then no, it does not matter.
Technically, white noise has zero mean by definition.

Related

How to perform adaptive step size using Runge-Kutta fourth order (Matlab)?

For me, it seems like the estimated hstep takes quite a long time and long iteration to converge.
I tried it with this first ODE.
Basically, you perform the difference between RK4 with stepsize of h with h/2.Please note that to reach the same timestep value, you will have to use the y value after two timestep of h/2 so that it reaches h also.
frhs=#(x,y) x.^2*y;
Is my code correct?
clear all;close all;clc
c=[]; i=1; U_saved=[]; y_array=[]; y_array_alt=[];
y_arr=1; y_arr_2=1;
frhs=#(x,y) 20*cos(x);
tol=0.001;
y_ini= 1;
y_ini_2= 1;
c=abs(y_ini-y_ini_2)
hc=1
all_y_values=[];
for m=1:500
if (c>tol || m==1)
fprintf('More')
y_arr
[Unew]=vpa(Runge_Kutta(0,y_arr,frhs,hc))
if (m>1)
y_array(m)=vpa(Unew);
y_array=y_array(logical(y_array));
end
[Unew_alt]=Runge_Kutta(0,y_arr_2,frhs,hc/2);
[Unew_alt]=vpa(Runge_Kutta(hc/2,Unew_alt,frhs,hc/2))
if (m>1)
y_array_alt(m)=vpa(Unew_alt);
y_array_alt=y_array_alt(logical(y_array_alt));
end
fprintf('More')
%y_array_alt(m)=vpa(Unew_alt);
c=vpa(abs(Unew_alt-Unew) )
hc=abs(tol/c)^0.25*hc
if (c<tol)
fprintf('Less')
y_arr=vpa(y_array(end) )
y_arr_2=vpa(y_array_alt(end) )
[Unew]=Runge_Kutta(0,y_arr,frhs,hc)
all_y_values(m)=Unew;
[Unew_alt]=Runge_Kutta(0,y_arr_2,frhs,hc/2);
[Unew_alt]=Runge_Kutta(hc/2,Unew_alt,frhs,hc/2)
c=vpa( abs(Unew_alt-Unew) )
hc=abs(tol/c)^0.2*hc
end
end
end
all_y_values
A better structure for the time loop has only one place where the time step is computed.
x_array = [x0]; y_array = [y0]; h = h_init;
x = x0; y = y0;
while x < x_end
[y_new, err] = RK4_step_w_error(x,y,rhs,h);
factor = abs(tol/err)^0.2;
if factor >= 1
y_array(end+1) = y = y_new;
x_array(end+1) = x = x+h;
end
h = factor*h;
end
For the data given in the code
rhs = #(x,y) 20*cos(x);
x0 = 0; y0 = 1; x_end = 6.5; tol = 1e-3; h_init = 1;
this gives the result against the exact solution
The computed points lie exactly on the exact solution, for the segments between them one would need to use a "dense output" interpolation. Or as a first improvement, just include the middle value from the half-step computation.
function [ y_next, err] = RK4_step_w_error(x,y,rhs,h)
y2 = RK4_step(x,y,rhs,h);
y1 = RK4_step(x,y,rhs,h/2);
y1 = RK4_step(x+h/2,y1,rhs,h/2);
y_next = y1;
err = (y2-y1)/15;
end
function y_next = RK4_step(x,y,rhs,h)
k1 = h*rhs(x,y);
k2 = h*rhs(x+h/2,y+k1);
k3 = h*rhs(x+h/2,y+k2);
k4 = h*rhs(x+h,y+k3);
y_next = y + (k1+2*k2+2*k3+k4)/6;
end
Revision 1
The error returned is the actual step error. The error that is required for the step size control however is the unit step error or error density, which is the step error with divided by h
function [ y_next, err] = RK4_step_w_error(x,y,rhs,h)
y2 = RK4_step(x,y,rhs,h);
y1 = RK4_step(x,y,rhs,h/2);
y1 = RK4_step(x+h/2,y1,rhs,h/2);
y_next = y1;
err = (y2-y1)/15/h;
end
Changing the example to a simple bi-stable model oscillating between two branches of stable equilibria
rhs = #(x,y) 3*y-y^3 + 3*cos(x);
x0 = 0; y0 = 1; x_end = 13.5; tol = 5e-3; h_init = 5e-2;
gives plots of solution, error (against an ode45 integration) and step sizes
Red crosses are the step sizes of rejected steps.
Revision 2
The error in the function values can be used as an error guidance for the extrapolation value which is of 5th order, making the method a 5th order method in extrapolation mode. As it uses the 4th order error to predict the 5th order optimal step size, a caution factor is recommended, the code changes in the appropriate places to
factor = 0.75*abs(tol/err)^0.2;
...
function [ y_next, err] = RK4_step_w_error(x,y,rhs,h)
y2 = RK4_step(x,y,rhs,h);
y1 = RK4_step(x,y,rhs,h/2);
y1 = RK4_step(x+h/2,y1,rhs,h/2);
y_next = y1+(y1-y2)/15;
err = (y1-y2)/15;
end
In the plots the step size is appropriately larger, but the error shows sharper and larger spikes, this version of the method is apparently less stable.

Code unable to reconcile vector assignment

I am interested in modifying the code below to find the parameters I_1, I_2, I_3 and I_4, to be used in another code. Every time I run the code, it throws up
In an assignment A(:) = B, the number of elements in A and B must be the same
on this line " mult(mult == 0) = B;".
I have spent eternity figuring out what the problem could be. Here is the code:
%%% Some Parameters %%
delta = 0.6; % Blanked subframe ratio
B = [0 0.2 0.4 0.6 0.8 1]; %Power splitting factor
k = 2.3; %Macro BS density
f = k*5; %Small cell density
j = 300; %users density
P_m = 46; %Macro BS transmission power
P_s = 23; %SC transmit power
Zm = -15;
Zs = -15;
iter = 30; %Iteration run
h = 500; %Simulation area
hu = 0.8*h; %users simulation area
Vm = round(k*h); %Macro BS average no in h
Vs = round(f*h); %SC average no in h
Vu = round(j*hu); %%users average no in hu
Pm = 10^(P_m/10)/1000*10^(Zm/10);
Ps = 10^(P_s/10)/1000*10^(Zs/10);
for i = iter;
%% XY coodinates for Macrocell, small cells and users.
Xm = sqrt(h)*(rand(Vm,1)-0.5);
Ym = sqrt(h)*(rand(Vm,1)-0.5);
Xs = sqrt(h)*(rand(Vs,1)-0.5);
Ys = sqrt(h)*(rand(Vs,1)-0.5);
Xu = sqrt(hu)*(rand(Vu,1)-0.5);
Yu = sqrt(hu)*(rand(Vu,1)-0.5);
%Total coordinates for MBS and small cells
Total_Coord = [Xs Ys ones(size(Xs)) Xm Ym 2*ones(size(Xm))];
%Distance between BSs and users
[Xsm_mat, Xu_mat] = meshgrid(Total_Coord(:,1),Xu);
[Ysm_mat, Yu_mat] = meshgrid(Total_Coord(:,2),Yu);
Distance = sqrt((Xsm_mat-Xu_mat).^2 + (Ysm_mat-Yu_mat).^2);
%% To determine serving BS for each user
[D_m,idx_m] = min(Distance(:,(length(Xs)+1):end),[],2);
idx_m = idx_m + length(Xs);
[D_s,idx_s] = min(Distance(:,1:length(Xs)),[],2);
%% Power received by users from each BS
Psm_mat = [Ps*ones(length(Xu),length(Xs))
Pm*ones(length(Xu),length(Xm))]; % Transmit power of MBS and small cells
Pr_n = Psm_mat.*exprnd(1,size(Psm_mat))./(Distance*1e3).^4;
mult = binornd(1,delta,1,length(Xm)); % Full transmission power of each
interfering MBS for delta
mult(mult == 0) = B; % Reduced transmission power for (1-delta)
Pr = Pr_n.*[ones(length(Xu),length(Xs)) repmat(mult,length(Xu),1)];%
Interference from each BS
%% Power received by each user from serving BSs
Prm = Pr(sub2ind(size(Pr),(1:length(idx_m))',idx_m));
Prs = Pr(sub2ind(size(Pr),(1:length(idx_s))',idx_s));
P_m_n = Pr_n(sub2ind(size(Pr_n),(1:length(idx_m))',idx_m));
%% Total interference for each UE
I_T = sum(Pr,2) - Prm - Prs;
I_1 = P_m_n./(Prs + I_T);
I_2 = Prs./(P_m_n + I_T);
I_3 = B*I_1;
I_4 = Prs./(B*P_m_n + I_T);
end
The error appeared on this line "mult(mult == 0) = B;".
I know it to be assignment problem which requires equality in both the left and right dimensions. Suggestions for correction will be appreciated.
Your vector mult has length Vm (number of macro BS?). Assigning to mult(mult==0) will assign to a subset of this vector (those that have a value equal to zero). What you are assigning is your variable B which you define as B = [0 0.2 0.4 0.6 0.8 1], i.e., it is a length-6 vector. The assignment fails unless mult has exactly 6 zeros.
I highly doubt that this is what you want. It looks like you are trying to assign the same value ("Reduced transmission power"). Then your B should be scalar though.
Since we have no idea what you are trying to do (and your code is not exactly an MCVE), we can only guess.

MATLAB - Getting an NaN error

I'm trying to generate plots of a galactic orbit in a particular potential. My code is given by
function Eulersystem_MNmodel()
parsec = 3.08*10^18;
r_1 = 8.5*1000.0*parsec; % This converts our value of r into cm.
z_1 = 1.0;
theta_1 = 0.0; %Initial Value for Theta.
U_1 = 100.0*10^5; %Initial value for U in cm/sec
V = 156.972*10^5; %Proposed value of the radial velocity in cm/sec
W_1 = 150*10^5.0;
grav = 6.6720*10^-8; %Universal gravitational constant
amsun = 1.989*10^33; %Proposed Angular momentum of the sun
amg = 1.5d11*amsun;
gm = grav*amg; %Note this is constant
nsteps = 50000; %The number of steps
deltat = 5.0*10^11; %Measured in seconds
angmom = r_1*V; %The angular momentum
angmom2 = angmom^2.0; %The square of the angular momentum
E = -gm/r_1 + U_1*U_1/2 + angmom2/(2.0*r_1*r_1); %Total Energy of the system
time = 0.0;
for i=1:nsteps
r_1 = r_1 + deltat*U_1;
U_1 = U_1 + deltat*((-gm*r_1)/((r_1^2.0 + (1+sqrt(z_1^2.0+1))^2.0)^1.50))
z_1 = z_1 + deltat*W_1;
W_1 = W_1 + deltat*(gm*z_1*(1+sqrt(z_1^2.0+1))/(sqrt(z_1^2.0+1))*(r_1^2.0+(1+sqrt(z_1^2.0+1))^2.0)^1.5);
E = -gm/r_1+U_1/2.0+angmom2/(2.0*r_1*r_1);
ecc = (1.0 + (2.0*E*angmom2)/(gm^2.0))^0.5;
time1(i) = time;
time = time+deltat;
r(i) = r_1;
z(i) = z_1;
end
figure()
plot(r,z)
I keep getting straight lines, rather than a more interesting curve, which led me to investigate the output of the U_1 function. Upon this investigation I realised that it was consistently outputting NaN "Not a Number". I can't see why I'm getting this however. I've tried replacing sqrt() with ()^0.5, this still yields NaN.
You are operating with rather large numbers. The largest floating point number is something like 1.7e+308. Lager numbers would evaluate to Inf. Similarly, positive numbers smaller than 1.7e-308 would evaluate to zero and cause NaN during division?

spectral structure of sinusoidal model

let us consider following code
function [ x ] = generate1(N,m,A3)
f1 = 100;
f2 = 200;
T = 1./f1;
t = (0:(N*T/m):(N*T))'; %'
wn = randn(length(t),1); %zero mean variance 1
x = 20.*sin(2.*pi.*f1.*t) + 30.*cos(2.*pi.*f2.*t) + A3.*wn;
%[pks,locs] = findpeaks(x);
%plot(x);
end
as i know peaks in Fourier domain represent at this frequency,which are present in signal,for example let us take plot of Fourier transform of this signal
let us run this signal
y=generate1(3,500,1);
and plot
plot(abs(fft(y)))
but clearly it does not shows me peaks at frequency given in signal,what is problem?please help me,generally it is stationary signal,that why this graph should show me exact picture but it does not do,why?
EDITED :
y1=generate1(3,500,0);
function [ x, fs ] = generate1(N,m,A3)
f1 = 100;
f2 = 200;
T = 1./f1;
t = (0:(N*T/m):(N*T))'; %'
wn = randn(length(t),1); %zero mean variance 1
x = 20.*sin(2.*pi.*f1.*t) + 30.*cos(2.*pi.*f2.*t) + A3.*wn;
%[pks,locs] = findpeaks(x);
%plot(x);
fs = 1/(t(2)-t(1));
end
and see
absfft = abs(fft(y));
plot(fs/2*linspace(0,1,length(absfft)/2+1),2*absfft(1:end/2+1))
or
plot(linspace(-fs/2,fs/2,length(absfft)),fftshift(absfft))
the x-axis in your plot is from 0 to fs/2 and then from -fs/2 to 0

Optimizing repetitive estimation (currently a loop) in MATLAB

I've found myself needing to do a least-squares (or similar matrix-based operation) for every pixel in an image. Every pixel has a set of numbers associated with it, and so it can be arranged as a 3D matrix.
(This next bit can be skipped)
Quick explanation of what I mean by least-squares estimation :
Let's say we have some quadratic system that is modeled by Y = Ax^2 + Bx + C and we're looking for those A,B,C coefficients. With a few samples (at least 3) of X and the corresponding Y, we can estimate them by:
Arrange the (lets say 10) X samples into a matrix like X = [x(:).^2 x(:) ones(10,1)];
Arrange the Y samples into a similar matrix: Y = y(:);
Estimate the coefficients A,B,C by solving: coeffs = (X'*X)^(-1)*X'*Y;
Try this on your own if you want:
A = 5; B = 2; C = 1;
x = 1:10;
y = A*x(:).^2 + B*x(:) + C + .25*randn(10,1); % added some noise here
X = [x(:).^2 x(:) ones(10,1)];
Y = y(:);
coeffs = (X'*X)^-1*X'*Y
coeffs =
5.0040
1.9818
0.9241
START PAYING ATTENTION AGAIN IF I LOST YOU THERE
*MAJOR REWRITE*I've modified to bring it as close to the real problem that I have and still make it a minimum working example.
Problem Setup
%// Setup
xdim = 500;
ydim = 500;
ncoils = 8;
nshots = 4;
%// matrix size for each pixel is ncoils x nshots (an overdetermined system)
%// each pixel has a matrix stored in the 3rd and 4rth dimensions
regressor = randn(xdim,ydim, ncoils,nshots);
regressand = randn(xdim, ydim,ncoils);
So my problem is that I have to do a (X'*X)^-1*X'*Y (least-squares or similar) operation for every pixel in an image. While that itself is vectorized/matrixized the only way that I have to do it for every pixel is in a for loop, like:
Original code style
%// Actual work
tic
estimate = zeros(xdim,ydim);
for col=1:size(regressor,2)
for row=1:size(regressor,1)
X = squeeze(regressor(row,col,:,:));
Y = squeeze(regressand(row,col,:));
B = X\Y;
% B = (X'*X)^(-1)*X'*Y; %// equivalently
estimate(row,col) = B(1);
end
end
toc
Elapsed time = 27.6 seconds
EDITS in reponse to comments and other ideas
I tried some things:
1. Reshaped into a long vector and removed the double for loop. This saved some time.
2. Removed the squeeze (and in-line transposing) by permute-ing the picture before hand: This save alot more time.
Current example:
%// Actual work
tic
estimate2 = zeros(xdim*ydim,1);
regressor_mod = permute(regressor,[3 4 1 2]);
regressor_mod = reshape(regressor_mod,[ncoils,nshots,xdim*ydim]);
regressand_mod = permute(regressand,[3 1 2]);
regressand_mod = reshape(regressand_mod,[ncoils,xdim*ydim]);
for ind=1:size(regressor_mod,3) % for every pixel
X = regressor_mod(:,:,ind);
Y = regressand_mod(:,ind);
B = X\Y;
estimate2(ind) = B(1);
end
estimate2 = reshape(estimate2,[xdim,ydim]);
toc
Elapsed time = 2.30 seconds (avg of 10)
isequal(estimate2,estimate) == 1;
Rody Oldenhuis's way
N = xdim*ydim*ncoils; %// number of columns
M = xdim*ydim*nshots; %// number of rows
ii = repmat(reshape(1:N,[ncoils,xdim*ydim]),[nshots 1]); %//column indicies
jj = repmat(1:M,[ncoils 1]); %//row indicies
X = sparse(ii(:),jj(:),regressor_mod(:));
Y = regressand_mod(:);
B = X\Y;
B = reshape(B(1:nshots:end),[xdim ydim]);
Elapsed time = 2.26 seconds (avg of 10)
or 2.18 seconds (if you don't include the definition of N,M,ii,jj)
SO THE QUESTION IS:
Is there an (even) faster way?
(I don't think so.)
You can achieve a ~factor of 2 speed up by precomputing the transposition of X. i.e.
for x=1:size(picture,2) % second dimension b/c already transposed
X = picture(:,x);
XX = X';
Y = randn(n_timepoints,1);
%B = (X'*X)^-1*X'*Y; ;
B = (XX*X)^-1*XX*Y;
est(x) = B(1);
end
Before: Elapsed time is 2.520944 seconds.
After: Elapsed time is 1.134081 seconds.
EDIT:
Your code, as it stands in your latest edit, can be replaced by the following
tic
xdim = 500;
ydim = 500;
n_timepoints = 10; % for example
% Actual work
picture = randn(xdim,ydim,n_timepoints);
picture = reshape(picture, [xdim*ydim,n_timepoints])'; % note transpose
YR = randn(n_timepoints,size(picture,2));
% (XX*X).^-1 = sum(picture.*picture).^-1;
% XX*Y = sum(picture.*YR);
est = sum(picture.*picture).^-1 .* sum(picture.*YR);
est = reshape(est,[xdim,ydim]);
toc
Elapsed time is 0.127014 seconds.
This is an order of magnitude speed up on the latest edit, and the results are all but identical to the previous method.
EDIT2:
Okay, so if X is a matrix, not a vector, things are a little more complicated. We basically want to precompute as much as possible outside of the for-loop to keep our costs down. We can also get a significant speed-up by computing XT*X manually - since the result will always be a symmetric matrix, we can cut a few corners to speed things up. First, the symmetric multiplication function:
function XTX = sym_mult(X) % X is a 3-d matrix
n = size(X,2);
XTX = zeros(n,n,size(X,3));
for i=1:n
for j=i:n
XTX(i,j,:) = sum(X(:,i,:).*X(:,j,:));
if i~=j
XTX(j,i,:) = XTX(i,j,:);
end
end
end
Now the actual computation script
xdim = 500;
ydim = 500;
n_timepoints = 10; % for example
Y = randn(10,xdim*ydim);
picture = randn(xdim,ydim,n_timepoints); % 500x500x10
% Actual work
tic % start timing
picture = reshape(picture, [xdim*ydim,n_timepoints])';
% Here we precompute the (XT*Y) calculation to speed things up later
picture_y = [sum(Y);sum(Y.*picture)];
% initialize
est = zeros(size(picture,2),1);
picture = permute(picture,[1,3,2]);
XTX = cat(2,ones(n_timepoints,1,size(picture,3)),picture);
XTX = sym_mult(XTX); % precompute (XT*X) for speed
X = zeros(2,2); % preallocate for speed
XY = zeros(2,1);
for x=1:size(picture,2) % second dimension b/c already transposed
%For some reason this is a lot faster than X = XTX(:,:,x);
X(1,1) = XTX(1,1,x);
X(2,1) = XTX(2,1,x);
X(1,2) = XTX(1,2,x);
X(2,2) = XTX(2,2,x);
XY(1) = picture_y(1,x);
XY(2) = picture_y(2,x);
% Here we utilise the fact that A\B is faster than inv(A)*B
% We also use the fact that (A*B)*C = A*(B*C) to speed things up
B = X\XY;
est(x) = B(1);
end
est = reshape(est,[xdim,ydim]);
toc % end timing
Before: Elapsed time is 4.56 seconds.
After: Elapsed time is 2.24 seconds.
This is a speed up of about a factor of 2. This code should be extensible to X being any dimensions you want. For instance, in the case where X = [1 x x^2], you would change picture_y to the following
picture_y = [sum(Y);sum(Y.*picture);sum(Y.*picture.^2)];
and change XTX to
XTX = cat(2,ones(n_timepoints,1,size(picture,3)),picture,picture.^2);
You would also change a lot of 2s to 3s in the code, and add XY(3) = picture_y(3,x) to the loop. It should be fairly straight-forward, I believe.
Results
I sped up your original version, since your edit 3 was actually not working (and also does something different).
So, on my PC:
Your (original) version: 8.428473 seconds.
My obfuscated one-liner given below: 0.964589 seconds.
First, for no other reason than to impress, I'll give it as I wrote it:
%%// Some example data
xdim = 500;
ydim = 500;
n_timepoints = 10; % for example
estimate = zeros(xdim,ydim); %// initialization with explicit size
picture = randn(xdim,ydim,n_timepoints);
%%// Your original solution
%// (slightly altered to make my version's results agree with yours)
tic
Y = randn(n_timepoints,xdim*ydim);
ii = 1;
for x = 1:xdim
for y = 1:ydim
X = squeeze(picture(x,y,:)); %// or similar creation of X matrix
B = (X'*X)^(-1)*X' * Y(:,ii);
ii = ii+1;
%// sometimes you keep everything and do
%// estimate(x,y,:) = B(:);
%// sometimes just the first element is important and you do
estimate(x,y) = B(1);
end
end
toc
%%// My version
tic
%// UNLEASH THE FURY!!
estimate2 = reshape(sparse(1:xdim*ydim*n_timepoints, ...
builtin('_paren', ones(n_timepoints,1)*(1:xdim*ydim),:), ...
builtin('_paren', permute(picture, [3 2 1]),:))\Y(:), ydim,xdim).'; %'
toc
%%// Check for equality
max(abs(estimate(:)-estimate2(:))) % (always less than ~1e-14)
Breakdown
First, here's the version that you should actually use:
%// Construct sparse block-diagonal matrix
%// (Type "help sparse" for more information)
N = xdim*ydim; %// number of columns
M = N*n_timepoints; %// number of rows
ii = 1:N;
jj = ones(n_timepoints,1)*(1:N);
s = permute(picture, [3 2 1]);
X = sparse(ii,jj(:), s(:));
%// Compute ALL the estimates at once
estimates = X\Y(:);
%// You loop through the *second* dimension first, so to make everything
%// agree, we have to extract elements in the "wrong" order, and transpose:
estimate2 = reshape(estimates, ydim,xdim).'; %'
Here's an example of what picture and the corresponding matrix X looks like for xdim = ydim = n_timepoints = 2:
>> clc, picture, full(X)
picture(:,:,1) =
-0.5643 -2.0504
-0.1656 0.4497
picture(:,:,2) =
0.6397 0.7782
0.5830 -0.3138
ans =
-0.5643 0 0 0
0.6397 0 0 0
0 -2.0504 0 0
0 0.7782 0 0
0 0 -0.1656 0
0 0 0.5830 0
0 0 0 0.4497
0 0 0 -0.3138
You can see why sparse is necessary -- it's mostly zeros, but will grow large quickly. The full matrix would quickly consume all your RAM, while the sparse one will not consume much more than the original picture matrix does.
With this matrix X, the new problem
X·b = Y
now contains all the problems
X1 · b1 = Y1
X2 · b2 = Y2
...
where
b = [b1; b2; b3; ...]
Y = [Y1; Y2; Y3; ...]
so, the single command
X\Y
will solve all your systems at once.
This offloads all the hard work to a set of highly specialized, compiled to machine-specific code, optimized-in-every-way algorithms, rather than the interpreted, generic, always-two-steps-away from the hardware loops in MATLAB.
It should be straightforward to convert this to a version where X is a matrix; you'll end up with something like what blkdiag does, which can also be used by mldivide in exactly the same way as above.
I had a wee play around with an idea, and I decided to stick it as a separate answer, as its a completely different approach to my other idea, and I don't actually condone what I'm about to do. I think this is the fastest approach so far:
Orignal (unoptimised): 13.507176 seconds.
Fast Cholesky-decomposition method: 0.424464 seconds
First, we've got a function to quickly do the X'*X multiplication. We can speed things up here because the result will always be symmetric.
function XX = sym_mult(X)
n = size(X,2);
XX = zeros(n,n,size(X,3));
for i=1:n
for j=i:n
XX(i,j,:) = sum(X(:,i,:).*X(:,j,:));
if i~=j
XX(j,i,:) = XX(i,j,:);
end
end
end
The we have a function to do LDL Cholesky decomposition of a 3D matrix (we can do this because the (X'*X) matrix will always be symmetric) and then do forward and backwards substitution to solve the LDL inversion equation
function Y = fast_chol(X,XY)
n=size(X,2);
L = zeros(n,n,size(X,3));
D = zeros(n,n,size(X,3));
B = zeros(n,1,size(X,3));
Y = zeros(n,1,size(X,3));
% These loops compute the LDL decomposition of the 3D matrix
for i=1:n
D(i,i,:) = X(i,i,:);
L(i,i,:) = 1;
for j=1:i-1
L(i,j,:) = X(i,j,:);
for k=1:(j-1)
L(i,j,:) = L(i,j,:) - L(i,k,:).*L(j,k,:).*D(k,k,:);
end
D(i,j,:) = L(i,j,:);
L(i,j,:) = L(i,j,:)./D(j,j,:);
if i~=j
D(i,i,:) = D(i,i,:) - L(i,j,:).^2.*D(j,j,:);
end
end
end
for i=1:n
B(i,1,:) = XY(i,:);
for j=1:(i-1)
B(i,1,:) = B(i,1,:)-D(i,j,:).*B(j,1,:);
end
B(i,1,:) = B(i,1,:)./D(i,i,:);
end
for i=n:-1:1
Y(i,1,:) = B(i,1,:);
for j=n:-1:(i+1)
Y(i,1,:) = Y(i,1,:)-L(j,i,:).*Y(j,1,:);
end
end
Finally, we have the main script which calls all of this
xdim = 500;
ydim = 500;
n_timepoints = 10; % for example
Y = randn(10,xdim*ydim);
picture = randn(xdim,ydim,n_timepoints); % 500x500x10
tic % start timing
picture = reshape(pr, [xdim*ydim,n_timepoints])';
% Here we precompute the (XT*Y) calculation
picture_y = [sum(Y);sum(Y.*picture)];
% initialize
est2 = zeros(size(picture,2),1);
picture = permute(picture,[1,3,2]);
% Now we calculate the X'*X matrix
XTX = cat(2,ones(n_timepoints,1,size(picture,3)),picture);
XTX = sym_mult(XTX);
% Call our fast Cholesky decomposition routine
B = fast_chol(XTX,picture_y);
est2 = B(1,:);
est2 = reshape(est2,[xdim,ydim]);
toc
Again, this should work equally well for a Nx3 X matrix, or however big you want.
I use octave, thus I can't say anything about the resulting performance in Matlab, but would expect this code to be slightly faster:
pictureT=picture'
est=arrayfun(#(x)( (pictureT(x,:)*picture(:,x))^-1*pictureT(x,:)*randn(n_ti
mepoints,1)),1:size(picture,2));