2nd Order ODE Approx using Finite Difference - matlab

I am trying to approximate and plot the solution to u"(x) = exp(x) in the interval 0-3, with boundary conditions x(0)=1,x(3)=3. I am able to plot the approximate solution vs the exact, but the plot looks a bit off:
% Interval
a=0;
b=3;
n=10;
% Boundary vals
alpha=1;
beta=3;
%grid size
h=(b-a)/(n+1);
%Matrix generation
m = -2;
u = 1;
l = 1;
% Obtained from (y(i-1) -2y(i) + y(i+1)) = h^2 exp(x(i)
M = (1/h^2).*(diag(m*ones(1,n)) + diag(u*ones(1,n-1),1) + diag(l*ones(1,n-1),-1));
B=[];
xjj=[];
for j=1:n
xjj=[xjj,j*h];
if j==1
B=[B,f(j*h)-(alpha/h^2)];
continue
end
if j==n
B=[B,f(j*h)-(beta/h^2)];
continue
else
B=[B,f(j*h)];
end
end
X=M\B';
x=linspace(0,3,101);
plot(xjj',X,'r*')
hold on
plot(x,exp(x),'b-')
I appreciate all the advice and explanation. This is the scheme I am following: http://web.mit.edu/10.001/Web/Course_Notes/Differential_Equations_Notes/node9.html

You could shorten the big loop to simply
x=linspace(a,b,n+2);
B = f(x(2:end-1));
B(1)-=alpha/h^2;
B(n)-=beta/h^2;
The exact solution is u(x)=C*x+D+exp(x), the boundary conditions give D=0 and 3*C+exp(3)=3 <=> C=1-exp(3)/3.
Plotting this exact solution against the numerical solution gives a quite good fit for this large step size:
f=#(x)exp(x)
a=0; b=3;
n=10;
% Boundary vals
alpha=1; beta=3;
%grid
x=linspace(a,b,n+2);
h=x(2)-x(1);
% M*u=B obtained from (u(i-1) -2u(i) + u(i+1)) = h^2 exp(x(i))
M = (1/h^2).*(diag(-2*ones(1,n)) + diag(1*ones(1,n-1),1) + diag(1*ones(1,n-1),-1));
B = f(x(2:end-1));
B(1)-=alpha/h^2; B(n)-=beta/h^2;
U=M\B';
U = [ alpha; U; beta ];
clf;
plot(x',U,'r*')
hold on
x=linspace(0,3,101);
C = 1-exp(3)/3
plot(x,exp(x)+C*x,'b-')

Related

How do I fix my code to find the convergence rate

The IBVP is
u_t+2u_x=0 , 0<x<1, t>0
u(x,0)=sin(pi*x)
u(0,t)=g(t)=sin(-2*pi*t)
I have to first implement the scheme using 4th-order central difference SBP operator in space and RK4 in time with spacing xi=i*h, i=0,1,...,N and update g(t) at every RK stage. Also, using the code compute the convergence rate on a sequence of grids.
Below I have shown my working which is not helping me find the convergence rate so can anyone help me with finding my mistakes.
%Parameters
nstp = 16; %number of grid points in space
t0 = 0; %initial time
tend = 1; %end time
x0 = 0; %left boundary
xN = 1; %right boundary
x = linspace(0,1,nstp); %points in space
h = xN-x0/nstp-1; %grid size
cfl = 4;
k = cfl*h; %length of time steps
N = ceil(tend/k); %number of steps in time
k = tend/N; %length of time steps
u0 = sin(pi*x); %initial data
e = zeros(nstp);
e(1) = 1;
e0 = e(:,1);
m = zeros(nstp);
m(1) = sin(-2*pi*tend);
g = m(:,1);
%4th order central SBP operator in space
m=10; %points
H=diag(ones(m,1),0);
H(1:4,1:4)=diag([17/48 59/48 43/48 49/48]);
H(m-3:m,m-3:m)=fliplr(flipud(diag([17/48 59/48 43/48 49/48])));
H=H*h;
HI=inv(H);
D1=(-1/12*diag(ones(m-2,1),2)+8/12*diag(ones(m-1,1),1)- ...
8/12*diag(ones(m-1,1),-1)+1/12*diag(ones(m-2,1),-2));
D1(1:4,1:6)=[-24/17,59/34,-4/17,-3/34,0,0; -1/2,0,1/2,0,0,0;
4/43,-59/86,0,59/86,-4/43,0; 3/98,0,-59/98,0,32/49,-4/49];
D1(m-3:m,m-5:m)=flipud( fliplr(-D1(1:4,1:6)));
D1=D1/h;
%SBP-SAT scheme
u = -2*D1*x(1:N-1)'-(2*HI*(u0-g)*e);
%Runge Kutta for ODE
for i=1:nstp %calculation loop
t=(i-1)*k;
k1=D*u;
k2=D*(u+k*k1/2);
k3=D*(u+k*k2/2);
k4=D*(u+k*k3);
u=u+(h*(k1+k2+k3+k4))/6; %main equation
figure(1)
plot(x(1:N-1),u); %plot
drawnow
end
%error calculcation
ucorrect = sin(pi*(x-2*tend)); %correct solution
ucomp = u(:,end); %computed solution
errornorm = sqrt((ucomp-ucorrect)'*H*(ucomp-ucorrect)); %norm of error**
One egregious error is
h = xN-x0/nstp-1; %grid size
You need parentheses here, else you compute h as xN-(x0/nstp)-1=1-(0/nstp)-1=0.
Or avoid this all and use the actual grid step. Also, if nstp is the number of steps or segments in the subdivision, the number of nodes is one larger. And if you define x0,xN then you should also actually use them.
x = linspace(x0,xN,nstp+1); %points in space
h = x(2)-x(1); %grid size
As to the SBP-SAT approach, the space discretization gives
u_t(t) + 2*D_1*u(t) = - H_inv * (dot(e_0,u(t)) - g(t))*e_0
which should give the MoL ODE function as
F = #(u,t) -2*D1*u - HI*e0*(u(1)-g(t))
The matrix construction has to be adapted to have the correct extends for these changed operations.
The implement RK4 as
k1 = F(u,t)
k2 = F(u+0.5*h*k1, t+0.5*h)
...
In total this gives a modified script below. Stable solutions only occur for cfl=1 or below. With that the errors seemingly behave like 4th order. The factor for the initial condition also works best in the range 0.5 to 1, larger factors tend to increase the error at x=0, at least in the initial time steps. With the smaller factors the error is uniform over the interval.
%Parameters
xstp = 16; %number of grid points in space
x0 = 0; %left boundary
xM = 1; %right boundary
x = linspace(x0,xM,xstp+1)'; %points in space
h = x(2)-x(1); %grid size
cfl = 1;
t0 = 0; %initial time
tend = 1; %end time
k = cfl*h; %length of time steps
N = ceil(tend/k); %number of steps in time
k = tend/N; %length of time steps
u = sin(pi*x); %initial data
e0 = zeros(xstp+1,1);
e0(1) = 1;
g = #(t) sin(-2*pi*t);
%4th order central SBP operator in space
M=xstp+1; %points
H=diag(ones(M,1),0); % or eye(M)
H(1:4,1:4)=diag([17/48 59/48 43/48 49/48]);
H(M-3:M,M-3:M)=fliplr(flipud(diag([17/48 59/48 43/48 49/48])));
H=H*h;
HI=inv(H);
D1=(-1/12*diag(ones(M-2,1),2)+8/12*diag(ones(M-1,1),1)- ...
8/12*diag(ones(M-1,1),-1)+1/12*diag(ones(M-2,1),-2));
D1(1:4,1:6) = [ -24/17, 59/34, -4/17, -3/34, 0, 0;
-1/2, 0, 1/2, 0, 0, 0;
4/43, -59/86, 0, 59/86, -4/43, 0;
3/98, 0, -59/98, 0, 32/49, -4/49 ];
D1(M-3:M,M-5:M) = flipud( fliplr(-D1(1:4,1:6)));
D1=D1/h;
%SBP-SAT scheme
F = #(u,t) -2*D1*u - 0.5*HI*(u(1)-g(t))*e0;
clf;
%Runge Kutta for ODE
for i=1:N %calculation loop
t=(i-1)*k;
k1=F(u,t);
k2=F(u+0.5*k*k1, t+0.5*k);
k3=F(u+0.5*k*k2, t+0.5*k);
k4=F(u+k*k3, t+k);
u=u+(k/6)*(k1+2*k2+2*k3+k4); %main equation
figure(1)
subplot(211);
plot(x,u,x,sin(pi*(x-2*(t+k)))); %plot
ylim([-1.1,1.1]);
grid on;
legend(["computed";"exact"])
drawnow
subplot(212);
plot(x,u-sin(pi*(x-2*(t+k)))); %plot
grid on;
hold on;
drawnow
end
%error calculcation
ucorrect = sin(pi*(x-2*tend)); %correct solution
ucomp = u; %computed solution
errornorm = sqrt((ucomp-ucorrect)'*H*(ucomp-ucorrect)); %norm of error**

Error in FDM for a coupled PDEs

Here is the code which is trying to solve a coupled PDEs using finite difference method,
clear;
Lmax = 1.0; % Maximum length
Wmax = 1.0; % Maximum wedth
Tmax = 2.; % Maximum time
% Parameters needed to solve the equation
K = 30; % Number of time steps
n = 3; % Number of space steps
m =30; % Number of space steps
M = 2;
N = 1;
Pr = 1;
Re = 1;
Gr = 5;
maxn=20; % The wave-front: intermediate point from which u=0
maxm = 20;
maxk = 20;
dt = Tmax/K;
dx = Lmax/n;
dy = Wmax/m;
%M = a*B1^2*l/(p*U)
b =1/(1+M*dt);
c =dt/(1+M*dt);
d = dt/((1+M*dt)*dy);
%Gr = gB*(T-T1)*l/U^2;
% Initial value of the function u (amplitude of the wave)
for i = 1:n
if i < maxn
u(i,1)=1.;
else
u(i,1)=0.;
end
x(i) =(i-1)*dx;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
for j = 1:m
if j < maxm
v(j,1)=1.;
else
v(j,1)=0.;
end
y(j) =(j-1)*dy;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
for k = 1:K
if k < maxk
T(k,1)=1.;
else
T(k,1)=0.;
end
z(k) =(k-1)*dt;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Value at the boundary
%for k=0:K
%end
% Implementation of the explicit method
for k=0:K % Time loop
for i=1:n % Space loop
for j=1:m
u(i,j,k+1) = b*u(i,j,k)+c*Gr*T(i,j,k+1)+d*[((u(i,j+1,k)-u(i,j,k))/dy)^(N-1)*((u(i,j+1,k)-u(i,j,k))/dy)]-d*[((u(i,j,k)-u(i,j-1,k))/dy)^(N-1)*((u(i,j,k)-u(i,j-1,k))/dy)]-d*[u(i,j,k)*((u(i,j,k)-u(i-1,j,k))/dx)+v(i,j,k)*((u(i,j+1,k)-u(i,j,k))/dy)];
v(i,j,k+1) = dy*[(u(i-1,j,k+1)-u(i,j,k+1))/dx]+v(i,j-1,k+1);
T(i,j,k+1) = T(i,j,k)+(dt/(Pr*Re))*{(T(i,j+1,k)-2*T(i,j,k)+T(i,j-1,k))/dy^2-Pr*Re{u(i,j,k)*((T(i,j,k)-T(i-1,j,k))/dx)+v(i,j,k)*((T(i,j+1,k)-T(i,j,k))/dy)}};
end
end
end
% Graphical representation of the wave at different selected times
plot(x,u(:,1),'-',x,u(:,10),'-',x,u(:,50),'-',x,u(:,100),'-')
title('graphs')
xlabel('X')
ylabel('Y')
But I am getting this error
Subscript indices must either be real positive integers or logicals.
I am trying to implement this
with boundary conditions
Can someone please help me out!
Thanks
To be quite honest, it looks like you started with something that's way over your head, just typed everything down in one go without thinking much, and now you are surprised that it doesn't work...
In the future, please break down problems like these into waaaay smaller chunks that you can individually plot, check, test, etc. Better yet, try simpler problems first (wave equation, heat equation, ...), gradually working your way up to this.
I say this so harshly, because there were quite a number of fairly basic things wrong with your code:
you've used braces ({}) and brackets ([]) exactly as they are written in the equation. In MATLAB, braces are a constructor for a special container object called a cell array, and brackets are used to construct arrays and matrices. To group things like in the equation, you always have to use parentheses (()).
You had quite a number of parentheses wrong, which became apparent when I re-grouped and broke up those huge unintelligible lines into multiple lines that humans can actually read with understanding
you forgot to take the absolute values in the 3rd and 4th terms of u
you looped over k = 0:K and j = 1:m and then happily index everything with k and j-1. MATLAB is 1-based, meaning, the first element of anything is element 1, and indexing with 0 is an error
you've initialized 3 vectors u, v and T, but then index those in the loop as if they are 3D arrays
Now, I've managed to come up with the following code, which runs OK and at least more or less agrees with the equations shown. But I think it still doesn't make much sense because I get only zeros out (except for the initial values).
But, with this feedback, you should be able to correct any problems left.
Lmax = 1.0; % Maximum length
Wmax = 1.0; % Maximum wedth
Tmax = 2.; % Maximum time
% Parameters needed to solve the equation
K = 30; % Number of time steps
n = 3; % Number of space steps
m = 30; % Number of space steps
M = 2;
N = 1;
Pr = 1;
Re = 1;
Gr = 5;
maxn = 20; % The wave-front: intermediate point from which u=0
maxm = 20;
maxk = 20;
dt = Tmax/K;
dx = Lmax/n;
dy = Wmax/m;
%M = a*B1^2*l/(p*U)
b = 1/(1+M*dt);
c = dt/(1+M*dt);
d = dt/((1+M*dt)*dy);
%Gr = gB*(T-T1)*l/U^2;
% Initial value of the function u (amplitude of the wave)
u = zeros(n,m,K+1);
x = zeros(n,1);
for i = 1:n
if i < maxn
u(i,1)=1.;
else
u(i,1)=0.;
end
x(i) =(i-1)*dx;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
v = zeros(n,m,K+1);
y = zeros(m,1);
for j = 1:m
if j < maxm
v(1,j,1)=1.;
else
v(1,j,1)=0.;
end
y(j) =(j-1)*dy;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
T = zeros(n,m,K+1);
z = zeros(K,1);
for k = 1:K
if k < maxk
T(1,1,k)=1.;
else
T(1,1,k)=0.;
end
z(k) =(k-1)*dt;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Value at the boundary
%for k=0:K
%end
% Implementation of the explicit method
for k = 2:K % Time loop
for i = 2:n % Space loop
for j = 2:m-1
u(i,j,k+1) = b*u(i,j,k) + ...
c*Gr*T(i,j,k+1) + ...
d*(abs(u(i,j+1,k) - u(i,j ,k))/dy)^(N-1)*((u(i,j+1,k) - u(i,j ,k))/dy) - ...
d*(abs(u(i,j ,k) - u(i,j-1,k))/dy)^(N-1)*((u(i,j ,k) - u(i,j-1,k))/dy) - ...
d*(u(i,j,k)*((u(i,j ,k) - u(i-1,j,k))/dx) +...
v(i,j,k)*((u(i,j+1,k) - u(i ,j,k))/dy));
v(i,j,k+1) = dy*(u(i-1,j,k+1)-u(i,j,k+1))/dx + ...
v(i,j-1,k+1);
T(i,j,k+1) = T(i,j,k) + dt/(Pr*Re) * (...
(T(i,j+1,k) - 2*T(i,j,k) + T(i,j-1,k))/dy^2 - Pr*Re*(...
u(i,j,k)*((T(i,j,k) - T(i-1,j,k))/dx) + v(i,j,k)*((T(i,j+1,k) - T(i,j,k))/dy))...
);
end
end
end
% Graphical representation of the wave at different selected times
figure, hold on
plot(x, u(:, 1), '-',...
x, u(:, 10), '-',...
x, u(:, 50), '-',...
x, u(:,100), '-')
title('graphs')
xlabel('X')
ylabel('Y')

Code wont produce the value of a definite integral in MATLAB

I've had problems with my code as I've tried to make an integral compute, but it will not for the power, P2.
I've tried using anonymous function handles to use the integral() function on MATLAB as well as just using int(), but it will still not compute. Are the values too small for MATLAB to integrate or am I just missing something small?
Any help or advice would be appreciated to push me in the right direction. Thanks!
The problem in the code is in the bottom of the section labelled "Power Calculations". My integral also gets quite messy if that makes a difference.
%%%%%%%%%%% Parameters %%%%%%%%%%%%
n0 = 1; %air
n1 = 1.4; %layer 1
n2 = 2.62; %layer 2
n3 = 3.5; %silicon
L0 = 650*10^(-9); %centre wavelength
L1 = 200*10^(-9): 10*10^(-9): 2200*10^(-9); %lambda from 200nm to 2200nm
x = ((pi./2).*(L0./L1)); %layer phase thickness
r01 = ((n0 - n1)./(n0 + n1)); %reflection coefficient 01
r12 = ((n1 - n2)./(n1 + n2)); %reflection coefficient 12
r23 = ((n2 - n3)./(n2 + n3)); %reflection coefficient 23
t01 = ((2.*n0)./(n0 + n1)); %transmission coefficient 01
t12 = ((2.*n1)./(n1 + n2)); %transmission coefficient 12
t23 = ((2.*n2)./(n2 + n3)); %transmission coefficient 23
Q1 = [1 r01; r01 1]; %Matrix Q1
Q2 = [1 r12; r12 1]; %Matrix Q2
Q3 = [1 r23; r23 1]; %Matrix Q3
%%%%%%%%%%%% Graph of L vs R %%%%%%%%%%%
R = zeros(size(x));
for i = 1:length(x)
P = [exp(j.*x(i)) 0; 0 exp(-j.*x(i))]; %General Matrix P
T = ((1./(t01.*t12.*t23)).*(Q1*P*Q2*P*Q3)); %Transmission
T11 = T(1,1); %T11 value
T21 = T(2,1); %T21 value
R(i) = ((abs(T21./T11))^2).*100; %Percent reflectivity
end
plot(L1,R)
title('Percent Reflectance vs. wavelength for 2 Layers')
xlabel('Wavelength (m)')
ylabel('Reflectance (%)')
%%%%%%%%%%% Power Calculation %%%%%%%%%%
syms L; %General lamda
y = ((pi./2).*(L0./L)); %Layer phase thickness with variable Lamda
P1 = [exp(j.*y) 0; 0 exp(-j.*y)]; %Matrix P with variable Lambda
T1 = ((1./(t01.*t12.*t23)).*(Q1*P1*Q2*P1*Q3)); %Transmittivity matrix T1
I = ((6.16^(15))./((L.^(5)).*exp(2484./L) - 1)); %Blackbody Irradiance
Tf11 = T1(1,1); %New T11 section of matrix with variable Lambda
Tf2 = (((abs(1./Tf11))^2).*(n3./n0)); %final transmittivity
P1 = Tf2.*I; %Power before integration
L_initial = 200*10^(-9); %Initial wavelength
L_final = 2200*10^(-9); %Final wavelength
P2 = int(P1, L, L_initial, L_final) %Power production
I've refactored your code
to make it easier to read
to improve code reuse
to improve performance
to make it easier to understand
Why do you use so many unnecessary parentheses?!
Anyway, there's a few problems I saw in your code.
You used i as a loop variable, and j as the imaginary unit. It was OK for this one instance, but just barely so. In the future it's better to use 1i or 1j for the imaginary unit, and/or m or ii or something other than i or j as the loop index variable. You're helping yourself and your colleagues; it's just less confusing that way.
Towards the end, you used the variable name P1 twice in a row, and in two different ways. Although it works here, it's confusing! Took me a while to unravel why a matrix-producing function was producing scalars instead...
But by far the biggest problem in your code is the numerical problems with the blackbody irradiance computation. The term
L⁵ · exp(2484/L) - 1
for λ₀ = 200 · 10⁻⁹ m will require computing the quantity
exp(1.242 · 10¹⁰)
which, needless to say, is rather difficult for a computer :) Actually, the problem with your computation is two-fold. First, the exponentiation is definitely out of range of 64 bit IEEE-754 double precision, and will therefore result in ∞. Second, the parentheses are wrong; Planck's law should read
C/L⁵ · 1/(exp(D) - 1)
with C and D the constants (involving Planck's constant, speed of light, and Boltzmann constant), which you've presumably precomputed (I didn't check the values. I do know choice of units can mess these up, so better check).
So, aside from the silly parentheses error, I suspect the main problem is that you simply forgot to rescale λ to nm. Changing everything in the blackbody equation to nm and correcting those parentheses gives the code
I = 6.16^(15) / ( (L*1e+9)^5 * (exp(2484/(L*1e+9)) - 1) );
With this, I got a finite value for the integral of
P2 = 1.052916498836486e-010
But, again, you'd better double-check everything.
Note that I used quadgk(), because it's one of the better ones available on R2010a (which I'm stuck with), but you can just as easily replace this with integral() available on anything newer than R2012a.
Here's the code I ended up with:
function pwr = my_fcn()
% Parameters
n0 = 1; % air
n1 = 1.4; % layer 1
n2 = 2.62; % layer 2
n3 = 3.5; % silicon
L0 = 650e-9; % centre wavelength
% Reflection coefficients
r01 = (n0 - n1)/(n0 + n1);
r12 = (n1 - n2)/(n1 + n2);
r23 = (n2 - n3)/(n2 + n3);
% Transmission coefficients
t01 = (2*n0) / (n0 + n1);
t12 = (2*n1) / (n1 + n2);
t23 = (2*n2) / (n2 + n3);
% Quality factors
Q1 = [1 r01; r01 1];
Q2 = [1 r12; r12 1];
Q3 = [1 r23; r23 1];
% Initial & Final wavelengths
L_initial = 200e-9;
L_final = 2200e-9;
% plot reflectivity for selected lambda range
plot_reflectivity(L_initial, L_final, 1000);
% Compute power production
pwr = quadgk(#power_production, L_initial, L_final);
% Helper functions
% ========================================
% Graph of lambda vs reflectivity
function plot_reflectivity(L_initial, L_final, N)
L = linspace(L_initial, L_final, N);
R = zeros(size(L));
for ii = 1:numel(L)
% Transmission
T = transmittivity(L(ii));
% Percent reflectivity
R(ii) = 100 * abs(T(2,1)/T(1,1))^2 ;
end
plot(L, R)
title('Percent Reflectance vs. wavelength for 2 Layers')
xlabel('Wavelength (m)')
ylabel('Reflectance (%)')
end
% Compute transmittivity matrix for a single wavelength
function T = transmittivity(L)
% Layer phase thickness with variable Lamda
y = pi/2 * L0/L;
% Matrix P with variable Lambda
P1 = [exp(+1j*y) 0
0 exp(-1j*y)];
% Transmittivity matrix T1
T = 1/(t01*t12*t23) * Q1*P1*Q2*P1*Q3;
end
% Power for a specific wavelength. Note that this function
% accepts vector-valued wavelengths; needed for quadgk()
function pwr = power_production(L)
pwr = zeros(size(L));
for ii = 1:numel(L)
% Transmittivity matrix
T1 = transmittivity(L(ii));
% Blackbody Irradiance
I = 6.16^(15) / ( (L(ii)*1e+9)^5 * (exp(2484/(L(ii)*1e+9)) - 1) );
% final transmittivity
Tf2 = abs(1/T1(1))^2 * n3/n0;
% Power before integration
pwr(ii) = Tf2 * I;
end
end
end

Optimizing repetitive estimation (currently a loop) in MATLAB

I've found myself needing to do a least-squares (or similar matrix-based operation) for every pixel in an image. Every pixel has a set of numbers associated with it, and so it can be arranged as a 3D matrix.
(This next bit can be skipped)
Quick explanation of what I mean by least-squares estimation :
Let's say we have some quadratic system that is modeled by Y = Ax^2 + Bx + C and we're looking for those A,B,C coefficients. With a few samples (at least 3) of X and the corresponding Y, we can estimate them by:
Arrange the (lets say 10) X samples into a matrix like X = [x(:).^2 x(:) ones(10,1)];
Arrange the Y samples into a similar matrix: Y = y(:);
Estimate the coefficients A,B,C by solving: coeffs = (X'*X)^(-1)*X'*Y;
Try this on your own if you want:
A = 5; B = 2; C = 1;
x = 1:10;
y = A*x(:).^2 + B*x(:) + C + .25*randn(10,1); % added some noise here
X = [x(:).^2 x(:) ones(10,1)];
Y = y(:);
coeffs = (X'*X)^-1*X'*Y
coeffs =
5.0040
1.9818
0.9241
START PAYING ATTENTION AGAIN IF I LOST YOU THERE
*MAJOR REWRITE*I've modified to bring it as close to the real problem that I have and still make it a minimum working example.
Problem Setup
%// Setup
xdim = 500;
ydim = 500;
ncoils = 8;
nshots = 4;
%// matrix size for each pixel is ncoils x nshots (an overdetermined system)
%// each pixel has a matrix stored in the 3rd and 4rth dimensions
regressor = randn(xdim,ydim, ncoils,nshots);
regressand = randn(xdim, ydim,ncoils);
So my problem is that I have to do a (X'*X)^-1*X'*Y (least-squares or similar) operation for every pixel in an image. While that itself is vectorized/matrixized the only way that I have to do it for every pixel is in a for loop, like:
Original code style
%// Actual work
tic
estimate = zeros(xdim,ydim);
for col=1:size(regressor,2)
for row=1:size(regressor,1)
X = squeeze(regressor(row,col,:,:));
Y = squeeze(regressand(row,col,:));
B = X\Y;
% B = (X'*X)^(-1)*X'*Y; %// equivalently
estimate(row,col) = B(1);
end
end
toc
Elapsed time = 27.6 seconds
EDITS in reponse to comments and other ideas
I tried some things:
1. Reshaped into a long vector and removed the double for loop. This saved some time.
2. Removed the squeeze (and in-line transposing) by permute-ing the picture before hand: This save alot more time.
Current example:
%// Actual work
tic
estimate2 = zeros(xdim*ydim,1);
regressor_mod = permute(regressor,[3 4 1 2]);
regressor_mod = reshape(regressor_mod,[ncoils,nshots,xdim*ydim]);
regressand_mod = permute(regressand,[3 1 2]);
regressand_mod = reshape(regressand_mod,[ncoils,xdim*ydim]);
for ind=1:size(regressor_mod,3) % for every pixel
X = regressor_mod(:,:,ind);
Y = regressand_mod(:,ind);
B = X\Y;
estimate2(ind) = B(1);
end
estimate2 = reshape(estimate2,[xdim,ydim]);
toc
Elapsed time = 2.30 seconds (avg of 10)
isequal(estimate2,estimate) == 1;
Rody Oldenhuis's way
N = xdim*ydim*ncoils; %// number of columns
M = xdim*ydim*nshots; %// number of rows
ii = repmat(reshape(1:N,[ncoils,xdim*ydim]),[nshots 1]); %//column indicies
jj = repmat(1:M,[ncoils 1]); %//row indicies
X = sparse(ii(:),jj(:),regressor_mod(:));
Y = regressand_mod(:);
B = X\Y;
B = reshape(B(1:nshots:end),[xdim ydim]);
Elapsed time = 2.26 seconds (avg of 10)
or 2.18 seconds (if you don't include the definition of N,M,ii,jj)
SO THE QUESTION IS:
Is there an (even) faster way?
(I don't think so.)
You can achieve a ~factor of 2 speed up by precomputing the transposition of X. i.e.
for x=1:size(picture,2) % second dimension b/c already transposed
X = picture(:,x);
XX = X';
Y = randn(n_timepoints,1);
%B = (X'*X)^-1*X'*Y; ;
B = (XX*X)^-1*XX*Y;
est(x) = B(1);
end
Before: Elapsed time is 2.520944 seconds.
After: Elapsed time is 1.134081 seconds.
EDIT:
Your code, as it stands in your latest edit, can be replaced by the following
tic
xdim = 500;
ydim = 500;
n_timepoints = 10; % for example
% Actual work
picture = randn(xdim,ydim,n_timepoints);
picture = reshape(picture, [xdim*ydim,n_timepoints])'; % note transpose
YR = randn(n_timepoints,size(picture,2));
% (XX*X).^-1 = sum(picture.*picture).^-1;
% XX*Y = sum(picture.*YR);
est = sum(picture.*picture).^-1 .* sum(picture.*YR);
est = reshape(est,[xdim,ydim]);
toc
Elapsed time is 0.127014 seconds.
This is an order of magnitude speed up on the latest edit, and the results are all but identical to the previous method.
EDIT2:
Okay, so if X is a matrix, not a vector, things are a little more complicated. We basically want to precompute as much as possible outside of the for-loop to keep our costs down. We can also get a significant speed-up by computing XT*X manually - since the result will always be a symmetric matrix, we can cut a few corners to speed things up. First, the symmetric multiplication function:
function XTX = sym_mult(X) % X is a 3-d matrix
n = size(X,2);
XTX = zeros(n,n,size(X,3));
for i=1:n
for j=i:n
XTX(i,j,:) = sum(X(:,i,:).*X(:,j,:));
if i~=j
XTX(j,i,:) = XTX(i,j,:);
end
end
end
Now the actual computation script
xdim = 500;
ydim = 500;
n_timepoints = 10; % for example
Y = randn(10,xdim*ydim);
picture = randn(xdim,ydim,n_timepoints); % 500x500x10
% Actual work
tic % start timing
picture = reshape(picture, [xdim*ydim,n_timepoints])';
% Here we precompute the (XT*Y) calculation to speed things up later
picture_y = [sum(Y);sum(Y.*picture)];
% initialize
est = zeros(size(picture,2),1);
picture = permute(picture,[1,3,2]);
XTX = cat(2,ones(n_timepoints,1,size(picture,3)),picture);
XTX = sym_mult(XTX); % precompute (XT*X) for speed
X = zeros(2,2); % preallocate for speed
XY = zeros(2,1);
for x=1:size(picture,2) % second dimension b/c already transposed
%For some reason this is a lot faster than X = XTX(:,:,x);
X(1,1) = XTX(1,1,x);
X(2,1) = XTX(2,1,x);
X(1,2) = XTX(1,2,x);
X(2,2) = XTX(2,2,x);
XY(1) = picture_y(1,x);
XY(2) = picture_y(2,x);
% Here we utilise the fact that A\B is faster than inv(A)*B
% We also use the fact that (A*B)*C = A*(B*C) to speed things up
B = X\XY;
est(x) = B(1);
end
est = reshape(est,[xdim,ydim]);
toc % end timing
Before: Elapsed time is 4.56 seconds.
After: Elapsed time is 2.24 seconds.
This is a speed up of about a factor of 2. This code should be extensible to X being any dimensions you want. For instance, in the case where X = [1 x x^2], you would change picture_y to the following
picture_y = [sum(Y);sum(Y.*picture);sum(Y.*picture.^2)];
and change XTX to
XTX = cat(2,ones(n_timepoints,1,size(picture,3)),picture,picture.^2);
You would also change a lot of 2s to 3s in the code, and add XY(3) = picture_y(3,x) to the loop. It should be fairly straight-forward, I believe.
Results
I sped up your original version, since your edit 3 was actually not working (and also does something different).
So, on my PC:
Your (original) version: 8.428473 seconds.
My obfuscated one-liner given below: 0.964589 seconds.
First, for no other reason than to impress, I'll give it as I wrote it:
%%// Some example data
xdim = 500;
ydim = 500;
n_timepoints = 10; % for example
estimate = zeros(xdim,ydim); %// initialization with explicit size
picture = randn(xdim,ydim,n_timepoints);
%%// Your original solution
%// (slightly altered to make my version's results agree with yours)
tic
Y = randn(n_timepoints,xdim*ydim);
ii = 1;
for x = 1:xdim
for y = 1:ydim
X = squeeze(picture(x,y,:)); %// or similar creation of X matrix
B = (X'*X)^(-1)*X' * Y(:,ii);
ii = ii+1;
%// sometimes you keep everything and do
%// estimate(x,y,:) = B(:);
%// sometimes just the first element is important and you do
estimate(x,y) = B(1);
end
end
toc
%%// My version
tic
%// UNLEASH THE FURY!!
estimate2 = reshape(sparse(1:xdim*ydim*n_timepoints, ...
builtin('_paren', ones(n_timepoints,1)*(1:xdim*ydim),:), ...
builtin('_paren', permute(picture, [3 2 1]),:))\Y(:), ydim,xdim).'; %'
toc
%%// Check for equality
max(abs(estimate(:)-estimate2(:))) % (always less than ~1e-14)
Breakdown
First, here's the version that you should actually use:
%// Construct sparse block-diagonal matrix
%// (Type "help sparse" for more information)
N = xdim*ydim; %// number of columns
M = N*n_timepoints; %// number of rows
ii = 1:N;
jj = ones(n_timepoints,1)*(1:N);
s = permute(picture, [3 2 1]);
X = sparse(ii,jj(:), s(:));
%// Compute ALL the estimates at once
estimates = X\Y(:);
%// You loop through the *second* dimension first, so to make everything
%// agree, we have to extract elements in the "wrong" order, and transpose:
estimate2 = reshape(estimates, ydim,xdim).'; %'
Here's an example of what picture and the corresponding matrix X looks like for xdim = ydim = n_timepoints = 2:
>> clc, picture, full(X)
picture(:,:,1) =
-0.5643 -2.0504
-0.1656 0.4497
picture(:,:,2) =
0.6397 0.7782
0.5830 -0.3138
ans =
-0.5643 0 0 0
0.6397 0 0 0
0 -2.0504 0 0
0 0.7782 0 0
0 0 -0.1656 0
0 0 0.5830 0
0 0 0 0.4497
0 0 0 -0.3138
You can see why sparse is necessary -- it's mostly zeros, but will grow large quickly. The full matrix would quickly consume all your RAM, while the sparse one will not consume much more than the original picture matrix does.
With this matrix X, the new problem
X·b = Y
now contains all the problems
X1 · b1 = Y1
X2 · b2 = Y2
...
where
b = [b1; b2; b3; ...]
Y = [Y1; Y2; Y3; ...]
so, the single command
X\Y
will solve all your systems at once.
This offloads all the hard work to a set of highly specialized, compiled to machine-specific code, optimized-in-every-way algorithms, rather than the interpreted, generic, always-two-steps-away from the hardware loops in MATLAB.
It should be straightforward to convert this to a version where X is a matrix; you'll end up with something like what blkdiag does, which can also be used by mldivide in exactly the same way as above.
I had a wee play around with an idea, and I decided to stick it as a separate answer, as its a completely different approach to my other idea, and I don't actually condone what I'm about to do. I think this is the fastest approach so far:
Orignal (unoptimised): 13.507176 seconds.
Fast Cholesky-decomposition method: 0.424464 seconds
First, we've got a function to quickly do the X'*X multiplication. We can speed things up here because the result will always be symmetric.
function XX = sym_mult(X)
n = size(X,2);
XX = zeros(n,n,size(X,3));
for i=1:n
for j=i:n
XX(i,j,:) = sum(X(:,i,:).*X(:,j,:));
if i~=j
XX(j,i,:) = XX(i,j,:);
end
end
end
The we have a function to do LDL Cholesky decomposition of a 3D matrix (we can do this because the (X'*X) matrix will always be symmetric) and then do forward and backwards substitution to solve the LDL inversion equation
function Y = fast_chol(X,XY)
n=size(X,2);
L = zeros(n,n,size(X,3));
D = zeros(n,n,size(X,3));
B = zeros(n,1,size(X,3));
Y = zeros(n,1,size(X,3));
% These loops compute the LDL decomposition of the 3D matrix
for i=1:n
D(i,i,:) = X(i,i,:);
L(i,i,:) = 1;
for j=1:i-1
L(i,j,:) = X(i,j,:);
for k=1:(j-1)
L(i,j,:) = L(i,j,:) - L(i,k,:).*L(j,k,:).*D(k,k,:);
end
D(i,j,:) = L(i,j,:);
L(i,j,:) = L(i,j,:)./D(j,j,:);
if i~=j
D(i,i,:) = D(i,i,:) - L(i,j,:).^2.*D(j,j,:);
end
end
end
for i=1:n
B(i,1,:) = XY(i,:);
for j=1:(i-1)
B(i,1,:) = B(i,1,:)-D(i,j,:).*B(j,1,:);
end
B(i,1,:) = B(i,1,:)./D(i,i,:);
end
for i=n:-1:1
Y(i,1,:) = B(i,1,:);
for j=n:-1:(i+1)
Y(i,1,:) = Y(i,1,:)-L(j,i,:).*Y(j,1,:);
end
end
Finally, we have the main script which calls all of this
xdim = 500;
ydim = 500;
n_timepoints = 10; % for example
Y = randn(10,xdim*ydim);
picture = randn(xdim,ydim,n_timepoints); % 500x500x10
tic % start timing
picture = reshape(pr, [xdim*ydim,n_timepoints])';
% Here we precompute the (XT*Y) calculation
picture_y = [sum(Y);sum(Y.*picture)];
% initialize
est2 = zeros(size(picture,2),1);
picture = permute(picture,[1,3,2]);
% Now we calculate the X'*X matrix
XTX = cat(2,ones(n_timepoints,1,size(picture,3)),picture);
XTX = sym_mult(XTX);
% Call our fast Cholesky decomposition routine
B = fast_chol(XTX,picture_y);
est2 = B(1,:);
est2 = reshape(est2,[xdim,ydim]);
toc
Again, this should work equally well for a Nx3 X matrix, or however big you want.
I use octave, thus I can't say anything about the resulting performance in Matlab, but would expect this code to be slightly faster:
pictureT=picture'
est=arrayfun(#(x)( (pictureT(x,:)*picture(:,x))^-1*pictureT(x,:)*randn(n_ti
mepoints,1)),1:size(picture,2));

MATLAB How to vectorize these for loops?

I've searched a lot but didn't find any solution to my problem, could you please help me vectorizing (or just a way to make it way faster) these loops ?
% n is the size of C
h = 1/(n-1)
dt = 1e-6;
a = 1e-2;
F=zeros(n,n);
F2=zeros(n,n);
C2=zeros(n,n);
t = 0.0;
for iter=1:12000
F2=F.^3-F;
for i=1:n
for j=1:n
F2(i,j)=F2(i,j)-(C(ij(i-1),j)+C(ij(i+1),j)+C(i,ij(j-1))+C(i,ij(j+1))-4*C(i,j)).*(a.^2)./(h.^2);
end
end
F=F2;
for i=1:n
for j=1:n
C2(i,j)=C(i,j)+(F(ij(i-1),j)+F(ij(i+1),j)+F(i,ij(j-1))+F(i,ij(j+1))-4*F(i,j)).*dt./(h^2);
end
end
C=C2;
t = t + dt;
end
function i=ij(i) %Just to have a matrix as loop (the n+1 th cases are the 1 th and 0 the 0th are nth)
if i==0
i=n;
return
elseif i==n+1
i=1;
end
return
end
thanks a lot
EDIT: Found an answer, it was totally ridiculous and I was searching way too far
%n is still the size of C
h = 1/((n-1))
dt = 1e-6;
a = 1e-2;
F=zeros(n,n);
var1=(a^2)/(h^2); %to make a bit less calculus
var2=dt/(h^2); % the same
t = 0.0;
for iter=1:12000
F=C.^3-C-var1*(C([n 1:n-1],1:n) + C([2:n 1], 1:n) + C(1:n, [n 1:n-1]) + C(1:n, [2:n 1]) - 4*C);
C = C + var2*(F([n 1:n-1], 1:n) + F([2:n 1], 1:n) + F(1:n, [n 1:n-1]) + F(1:n,[2:n 1]) - 4*F);
t = t + dt;
end
Found an answer, it was totally ridiculous and I was searching way too far
%n is still the size of C
h = 1/((n-1))
dt = 1e-6;
a = 1e-2;
F=zeros(n,n);
var1=(a^2)/(h^2); %to make a bit less calculus
var2=dt/(h^2); % the same
prev = [n 1:n-1];
next = [2:n 1];
t = 0.0;
for iter=1:12000
F = C.*C.*C - C - var1*(C(:,next)+C(:,prev)+C(next,:)+C(prev,:)-4*C);
C = C + var2*(F(:,next)+F(:,prev)+F(next,:)+F(prev,:)-4*F);
t = t + dt;
end
The behavior of the inner loop looks like a 2-dimensional circular convolution. That's the same as multiplication in the FFT domain. Subtraction is invariant across a linear operation such as FFT.
You'll want to use the fft2 and ifft2 functions.
Once you do that, I think you'll find that the repeated convolution can be eliminated by raising the convolution kernel (element-wise) to the power iter. If that optimization is correct, I'm predicting a speedup of 5 orders of magnitude.
You can replace for example C(ij(i-1),j) by using circshift(C,[1,0]) or circshift(C,[1,0]) (i can't figure out witch one of two is correct)
http://www.mathworks.com/help/matlab/ref/circshift.htm