Divide and Conquer SVD in MATLAB - matlab

I'm trying to implement Divide and Conquer SVD of an upper bidiagonal matrix B, but my code is not working. The error is:
"Unable to perform assignment because the size of the left side is
3-by-3 and the size of the right side is 2-by-2.
V_bar(1:k,1:k) = V1;"
Can somebody help me fix it? Thanks.
function [U,S,V] = DivideConquer_SVD(B)
[m,n] = size(B);
k = floor(m/2);
if k == 0
U = 1;
V = 1;
S = B;
return;
else
% Divide the input matrix
alpha = B(k,k);
beta = B(k,k+1);
e1 = zeros(m,1);
e2 = zeros(m,1);
e1(k) = 1;
e2(k+1) = 1;
B1 = B(1:k-1,1:k);
B2 = B(k+1:m,k+1:m);
%recursive computations
[U1,S1,V1] = DivideConquer_SVD(B1);
[U2,S2,V2] = DivideConquer_SVD(B2);
U_bar = zeros(m);
U_bar(1:k-1,1:k-1) = U1;
U_bar(k,k) = 1;
U_bar((k+1):m,(k+1):m) = U2;
D = zeros(m);
D(1:k-1,1:k) = S1;
D((k+1):m,(k+1):m) = S2;
V_bar = zeros(m);
V_bar(1:k,1:k) = V1;
V_bar((k+1):m,(k+1):m) = V2;
u = alpha*e1'*V_bar + beta*e2'*V_bar;
u = u';
D_tilde = D*D + u*u';
% compute eigenvalues and eigenvectors of D^2+uu'
[L1,Q1] = eig(D_tilde);
eigs = diag(L1);
S = zeros(m,n)
S(1:(m+1):end) = eigs
U_tilde = Q1;
V_tilde = Q1;
%Compute eigenvectors of the original input matrix T
U = U_bar*U_tilde;
V = V_bar*V_tilde;
return;
end

With limited mathematical knowledge, you need to help me a bit more -- as I cannot judge if the approach is correct in a mathematical way (with no theory given;) ). Anyway, I couldn't even reproduce the error e.g with this matrix, which The MathWorks use to illustrate their LU matrix factorization
A = [10 -7 0
-3 2 6
5 -1 5];
So I tried to structure your code a bit and gave some hints. Extend this to make your code clearer for those people (like me) who are not too familiar with matrix decomposition.
function [U,S,V] = DivideConquer_SVD(B)
% m x n matrix
[m,n] = size(B);
k = floor(m/2);
if k == 0
disp('if') % for debugging
U = 1;
V = 1;
S = B;
% return; % net necessary as you don't do anything afterwards anyway
else
disp('else') % for debugging
% Divide the input matrix
alpha = B(k,k); % element on diagonal
beta = B(k,k+1); % element on off-diagonal
e1 = zeros(m,1);
e2 = zeros(m,1);
e1(k) = 1;
e2(k+1) = 1;
% divide matrix
B1 = B(1:k-1,1:k); % upper left quadrant
B2 = B(k+1:m,k+1:m); % lower right quadrant
% recusrsive function call
[U1,S1,V1] = DivideConquer_SVD(B1);
[U2,S2,V2] = DivideConquer_SVD(B2);
U_bar = zeros(m);
U_bar(1:k-1,1:k-1) = U1;
U_bar(k,k) = 1;
U_bar((k+1):m,(k+1):m) = U2;
D = zeros(m);
D(1:k-1,1:k) = S1;
D((k+1):m,(k+1):m) = S2;
V_bar = zeros(m);
V_bar(1:k,1:k) = V1;
V_bar((k+1):m,(k+1):m) = V2;
u = (alpha*e1.'*V_bar + beta*e2.'*V_bar).'; % (little show-off tip: '
% is the complex transpose operator; .' is the "normal" transpose
% operator. It's good practice to distinguish between them but there
% is no difference for real matrices anyway)
D_tilde = D*D + u*u.';
% compute eigenvalues and eigenvectors of D^2+uu'
[L1,Q1] = eig(D_tilde);
eigs = diag(L1);
S = zeros(m,n);
S(1:(m+1):end) = eigs;
U_tilde = Q1;
V_tilde = Q1;
% Compute eigenvectors of the original input matrix T
U = U_bar*U_tilde;
V = V_bar*V_tilde;
% return; % net necessary as you don't do anything afterwards anyway
end % for
end % function

Related

1D finite element method in the Hermite basis (P3C1) - Problem of solution calculation

I am currently working on solving the problem $-\alpha u'' + \beta u = f$ with Neumann conditions on the edge, with the finite element method in MATLAB.
I managed to set up a code that works for P1 and P2 Lagragne finite elements (i.e: linear and quadratic) and the results are good!
I am trying to implement the finite element method using the Hermite basis. This basis is defined by the following basis functions and derivatives:
syms x
phi(x) = [2*x^3-3*x^2+1,-2*x^3+3*x^2,x^3-2*x^2+x,x^3-x^2]
% Derivative
dphi = [6*x.^2-6*x,-6*x.^2+6*x,3*x^2-4*x+1,3*x^2-2*x]
The problem with the following code is that the solution vector u is not good. I know that there must be a problem in the S and F element matrix calculation loop, but I can't see where even though I've been trying to make changes for a week.
Can you give me your opinion? Hopefully someone can see my error.
Thanks a lot,
% -alpha*u'' + beta*u = f
% u'(a) = bd1, u'(b) = bd2;
a = 0;
b = 1;
f = #(x) (1);
alpha = 1;
beta = 1;
% Neuamnn boundary conditions
bn1 = 1;
bn2 = 0;
syms ue(x)
DE = -alpha*diff(ue,x,2) + beta*ue == f;
du = diff(ue,x);
BC = [du(a)==bn1, du(b)==bn2];
ue = dsolve(DE, BC);
figure
fplot(ue,[a,b], 'r', 'LineWidth',2)
N = 2;
nnod = N*(2+2); % Number of nodes
neq = nnod*1; % Number of equations, one degree of freedom per node
xnod = linspace(a,b,nnod);
nodes = [(1:3:nnod-3)', (2:3:nnod-2)', (3:3:nnod-1)', (4:3:nnod)'];
phi = #(xi)[2*xi.^3-3*xi.^2+1,2*xi.^3+3*xi.^2,xi.^3-2*xi.^2+xi,xi.^3-xi.^2];
dphi = #(xi)[6*xi.^2-6*xi,-6*xi.^2+6*xi,3*xi^2-4*xi+1,3*xi^2-2*xi];
% Here, just calculate the integral using gauss quadrature..
order = 5;
[gp, gw] = gauss(order, 0, 1);
S = zeros(neq,neq);
M = S;
F = zeros(neq,1);
for iel = 1:N
%disp(iel)
inod = nodes(iel,:);
xc = xnod(inod);
h = xc(end)-xc(1);
Se = zeros(4,4);
Me = Se;
fe = zeros(4,1);
for ig = 1:length(gp)
xi = gp(ig);
iw = gw(ig);
Se = Se + dphi(xi)'*dphi(xi)*1/h*1*iw;
Me = Me + phi(xi)'*phi(xi)*h*1*iw;
x = phi(xi)*xc';
fe = fe + phi(xi)' * f(x) * h * 1 * iw;
end
% Assembly
S(inod,inod) = S(inod, inod) + Se;
M(inod,inod) = M(inod, inod) + Me;
F(inod) = F(inod) + fe;
end
S = alpha*S + beta*M;
g = zeros(neq,1);
g(1) = -alpha*bn1;
g(end) = alpha*bn2;
alldofs = 1:neq;
u = zeros(neq,1); %Pre-allocate
F = F + g;
u(alldofs) = S(alldofs,alldofs)\F(alldofs)
Warning: Matrix is singular to working precision.
u = 8×1
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
figure
fplot(ue,[a,b], 'r', 'LineWidth',2)
hold on
plot(xnod, u, 'bo')
for iel = 1:N
inod = nodes(iel,:);
xc = xnod(inod);
U = u(inod);
xi = linspace(0,1,100)';
Ue = phi(xi)*U;
Xe = phi(xi)*xc';
plot(Xe,Ue,'b -')
end
% Gauss function for calculate the integral
function [x, w, A] = gauss(n, a, b)
n = 1:(n - 1);
beta = 1 ./ sqrt(4 - 1 ./ (n .* n));
J = diag(beta, 1) + diag(beta, -1);
[V, D] = eig(J);
x = diag(D);
A = b - a;
w = V(1, :) .* V(1, :);
w = w';
x=x';
end
You can find the same post under MATLAB site for syntax highlighting.
Thanks
I tried to read courses, search in different documentation and modify my code without success.

Frank - Wolfe Algorithm in matlab

I'm trying to solve the following question :
maximize x^2-5x+y^2-3y
x+y <= 8
x<=2
x,y>= 0
By using Frank Wolf algorithm ( according to http://web.mit.edu/15.053/www/AMP-Chapter-13.pdf ).
But after running of the following program:
syms x y t;
f = x^2-5*x+y^2-3*y;
fdx = diff(f,1,x); % f'x
fdy = diff(f,1,y); % y'x
x0 = [0 0]; %initial point
A = [1 1;1 0]; %constrains matrix
b = [8;2];
lb = zeros(1,2);
eps = 0.00001;
i = 1;
X = [inf inf];
Z = zeros(2,200); %result for end points (x1,x2)
rr = zeros(1,200);
options = optimset('Display','none');
while( all(abs(X-x0)>[eps,eps]) && i < 200)
%f'x(x0)
c1 = subs(fdx,x,x0(1));
c1 = subs(c1,y,x0(2));
%f'y(x0)
c2 = subs(fdy,x,x0(1));
c2 = subs(c2,y,x0(2));
%optimization point of linear taylor function
ys = linprog((-[c1;c2]),A,b,[],[],lb,[],[],options);
%parametric representation of line
xt = (1-t)*x0(1)+t*ys(1,1);
yt = (1-t)*x0(2)+t*ys(2,1);
%f(x=xt,y=yt)
ft = subs(f,x,xt);
ft = subs(ft,y,yt);
%f't(x=xt,y=yt)
ftd = diff(ft,t,1);
%f't(x=xt,y=yt)=0 -> for max point
[t1] = solve(ftd); % (t==theta)
X = double(x0);%%%%%%%%%%%%%%%%%
% [ xt(t=t1) yt(t=t1)]
xnext(1) = subs(xt,t,t1) ;
xnext(2) = subs(yt,t,t1) ;
x0 = double(xnext);
Z(1,i) = x0(1);
Z(2,i) = x0(2);
i = i + 1;
end
x_point = Z(1,:);
y_point = Z(2,:);
% Draw result
scatter(x_point,y_point);
hold on;
% Print results
fprintf('The answer is:\n');
fprintf('x = %.3f \n',x0(1));
fprintf('y = %.3f \n',x0(2));
res = x0(1)^2 - 5*x0(1) + x0(2)^2 - 3*x0(2);
fprintf('f(x0) = %.3f\n',res);
I get the following result:
x = 3.020
y = 0.571
f(x0) = -7.367
And this no matter how many iterations I running this program (1,50 or 200).
Even if I choose a different starting point (For example, x0=(1,6) ), I get a negative answer to most.
I know that is an approximation, but the result should be positive (for x0 final, in this case).
My question is : what's wrong with my implementation?
Thanks in advance.
i changed a few things, it still doesn't look right but hopefully this is getting you in the right direction. It looks like the intial x0 points make a difference to how the algorithm converges.
Also make sure to check what i is after running the program, to determine if it ran to completion or exceeded the maximum iterations
lb = zeros(1,2);
ub = [2,8]; %if x+y<=8 and x,y>0 than both x,y < 8
eps = 0.00001;
i_max = 100;
i = 1;
X = [inf inf];
Z = zeros(2,i_max); %result for end points (x1,x2)
rr = zeros(1,200);
options = optimset('Display','none');
while( all(abs(X-x0)>[eps,eps]) && i < i_max)
%f'x(x0)
c1 = subs(fdx,x,x0(1));
c1 = subs(c1,y,x0(2));
%f'y(x0)
c2 = subs(fdy,x,x0(1));
c2 = subs(c2,y,x0(2));
%optimization point of linear taylor function
[ys, ~ , exit_flag] = linprog((-[c1;c2]),A,b,[],[],lb,ub,x0,options);
so here is the explanation of the changes
ub, uses our upper bound. After i added a ub, the result immediately changed
x0, start this iteration from the previous point
exit_flag this allows you to check exit_flag after execution (it always seems to be 1 indicating it solved the problem correctly)

Application of Neural Network in MATLAB

I asked a question a few days before but I guess it was a little too complicated and I don't expect to get any answer.
My problem is that I need to use ANN for classification. I've read that much better cost function (or loss function as some books specify) is the cross-entropy, that is J(w) = -1/m * sum_i( yi*ln(hw(xi)) + (1-yi)*ln(1 - hw(xi)) ); i indicates the no. data from training matrix X. I tried to apply it in MATLAB but I find it really difficult. There are couple things I don't know:
should I sum each outputs given all training data (i = 1, ... N, where N is number of inputs for training)
is the gradient calculated correctly
is the numerical gradient (gradAapprox) calculated correctly.
I have following MATLAB codes. I realise I may ask for trivial thing but anyway I hope someone can give me some clues how to find the problem. I suspect the problem is to calculate gradients.
Many thanks.
Main script:
close all
clear all
L = #(x) (1 + exp(-x)).^(-1);
NN = #(x,theta) theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')];
% theta = [10 -30 -30];
x = [0 0; 0 1; 1 0; 1 1];
y = [0.9 0.1 0.1 0.1]';
theta0 = 2*rand(9,1)-1;
options = optimset('gradObj','on','Display','iter');
thetaVec = fminunc(#costFunction,theta0,options,x,y);
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
NN(x,theta)'
Cost function:
function [jVal,gradVal,gradApprox] = costFunction(thetaVec,x,y)
persistent index;
% 1 x x
% 1 x x
% 1 x x
% x = 1 x x
% 1 x x
% 1 x x
% 1 x x
m = size(x,1);
if isempty(index) || index > size(x,1)
index = 1;
end
L = #(x) (1 + exp(-x)).^(-1);
NN = #(x,theta) theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')];
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
Dew = cell(2,1);
DewApprox = cell(2,1);
% Forward propagation
a0 = x(index,:)';
z1 = theta{1}*[1;a0];
a1 = L(z1);
z2 = theta{2}*[1;a1];
a2 = L(z2);
% Back propagation
d2 = 1/m*(a2 - y(index))*L(z2)*(1-L(z2));
Dew{2} = [1;a1]*d2;
d1 = [1;a1].*(1 - [1;a1]).*theta{2}'*d2;
Dew{1} = [1;a0]*d1(2:end)';
% NNRes = NN(x,theta)';
% jVal = -1/m*sum(NNRes-y)*NNRes*(1-NNRes);
jVal = -1/m*(a2 - y(index))*a2*(1-a2);
gradVal = [Dew{1}(:);Dew{2}(:)];
gradApprox = CalcGradApprox(0.0001);
index = index + 1;
function output = CalcGradApprox(epsilon)
output = zeros(size(gradVal));
for n=1:length(thetaVec)
thetaVecMin = thetaVec;
thetaVecMax = thetaVec;
thetaVecMin(n) = thetaVec(n) - epsilon;
thetaVecMax(n) = thetaVec(n) + epsilon;
thetaMin = cell(2,1);
thetaMax = cell(2,1);
thetaMin{1} = reshape(thetaVecMin(1:6),[2 3]);
thetaMin{2} = reshape(thetaVecMin(7:9),[1 3]);
thetaMax{1} = reshape(thetaVecMax(1:6),[2 3]);
thetaMax{2} = reshape(thetaVecMax(7:9),[1 3]);
a2min = NN(x(index,:),thetaMin)';
a2max = NN(x(index,:),thetaMax)';
jValMin = -1/m*(a2min-y(index))*a2min*(1-a2min);
jValMax = -1/m*(a2max-y(index))*a2max*(1-a2max);
output(n) = (jValMax - jValMin)/2/epsilon;
end
end
end
EDIT:
Below I present the correct version of my costFunction for those who may be interested.
function [jVal,gradVal,gradApprox] = costFunction(thetaVec,x,y)
m = size(x,1);
L = #(x) (1 + exp(-x)).^(-1);
NN = #(x,theta) L(theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')]);
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
Delta = cell(2,1);
Delta{1} = zeros(size(theta{1}));
Delta{2} = zeros(size(theta{2}));
D = cell(2,1);
D{1} = zeros(size(theta{1}));
D{2} = zeros(size(theta{2}));
jVal = 0;
for in = 1:size(x,1)
% Forward propagation
a1 = [1;x(in,:)']; % added bias to a0
z2 = theta{1}*a1;
a2 = [1;L(z2)]; % added bias to a1
z3 = theta{2}*a2;
a3 = L(z3);
% Back propagation
d3 = a3 - y(in);
d2 = theta{2}'*d3.*a2.*(1 - a2);
Delta{2} = Delta{2} + d3*a2';
Delta{1} = Delta{1} + d2(2:end)*a1';
jVal = jVal + sum( y(in)*log(a3) + (1-y(in))*log(1-a3) );
end
D{1} = 1/m*Delta{1};
D{2} = 1/m*Delta{2};
jVal = -1/m*jVal;
gradVal = [D{1}(:);D{2}(:)];
gradApprox = CalcGradApprox(x(in,:),0.0001);
% Nested function to calculate gradApprox
function output = CalcGradApprox(x,epsilon)
output = zeros(size(thetaVec));
for n=1:length(thetaVec)
thetaVecMin = thetaVec;
thetaVecMax = thetaVec;
thetaVecMin(n) = thetaVec(n) - epsilon;
thetaVecMax(n) = thetaVec(n) + epsilon;
thetaMin = cell(2,1);
thetaMax = cell(2,1);
thetaMin{1} = reshape(thetaVecMin(1:6),[2 3]);
thetaMin{2} = reshape(thetaVecMin(7:9),[1 3]);
thetaMax{1} = reshape(thetaVecMax(1:6),[2 3]);
thetaMax{2} = reshape(thetaVecMax(7:9),[1 3]);
a3min = NN(x,thetaMin)';
a3max = NN(x,thetaMax)';
jValMin = 0;
jValMax = 0;
for inn=1:size(x,1)
jValMin = jValMin + sum( y(inn)*log(a3min) + (1-y(inn))*log(1-a3min) );
jValMax = jValMax + sum( y(inn)*log(a3max) + (1-y(inn))*log(1-a3max) );
end
jValMin = 1/m*jValMin;
jValMax = 1/m*jValMax;
output(n) = (jValMax - jValMin)/2/epsilon;
end
end
end
I've only had a quick eyeball over your code. Here are some pointers.
Q1
should I sum each outputs given all training data (i = 1, ... N, where
N is number of inputs for training)
If you are talking in relation to the cost function, it is normal to sum and normalise by the number of training examples in order to provide comparison between.
I can't tell from the code whether you have a vectorised implementation which will change the answer. Note that the sum function will only sum up a single dimension at a time - meaning if you have a (M by N) array, sum will result in a 1 by N array.
The cost function should have a scalar output.
Q2
is the gradient calculated correctly
The gradient is not calculated correctly - specifically the deltas look wrong. Try following Andrew Ng's notes [PDF] they are very good.
Q3
is the numerical gradient (gradAapprox) calculated correctly.
This line looks a bit suspect. Does this make more sense?
output(n) = (jValMax - jValMin)/(2*epsilon);
EDIT: I actually can't make heads or tails of your gradient approximation. You should only use forward propagation and small tweaks in the parameters to compute the gradient. Good luck!

Matlab NaN and Inf issue

So, I'm implementing the EM algorithm in Matlab, but my matrices quickly end up contaminated by NaN and Inf values. I think it might be caused by matrix inversions, but I'm not sure that's the only reason.
Here is the code:
function [F, Q, R, x_T, P_T] = em_algo(y, G)
% y_t = G_t'*x_t + v_t 1*1 = 1*p p*1
% x_t = F*x_t-1 + w_t p*1 = p*p p*1
% G is T*p
p = size(G,2); % p = nb assets ; G = T*p
q = size(y,2); % q = nb observations ; y = T*q
T = size(y,1); % y is T*1
F = eye(p); % = Transition matrix p*p
Q = eye(p); % innovation (v) covariance matrix p*p
R = eye(q); % noise (w) covariance matrix q x q
x_T_old = zeros(p,T);
mu0 = zeros(p,1);
Sigma = eye(p); % Initial state covariance matrix p*p
converged = 0;
i = 0;
max_iter = 60; % only for testing purposes
while ~converged
if i > max_iter
break;
end
% E step = smoothing
fprintf('Iteration %d\n',i);
[x_T,P_T,P_Tm2] = smoother(G,F,Q,R,mu0,Sigma,y);
%x_T
% M step
A = zeros(p,p);
B = zeros(p,p);
C = zeros(p,p);
R = eye(q);
for t = 2:T % eq (9) in EM paper
A = A + (P_T(:,:,t-1) + (x_T(:,t-1)*x_T(:,t-1)'));
end
for t = 2:T % eq (10)
%B = B + (P_Tm2(:,:,t-1) + (x_T(:,t)*x_T(:,t-1)'));
B = B + (P_Tm2(:,:,t) + (x_T(:,t)*x_T(:,t-1)'));
end
for t = 1:T %eq (11)
C = C + (P_T(:,:,t) + (x_T(:,t)*x_T(:,t)'));
end
F = B*inv(A); %eq (12)
Q = (1/T)*(C - (B*inv(A)*B')); % eq (13) pxp
for t = 1:T
bias = y(t) - (G(t,:)*x_T(:,t));
R = R + ((bias*bias') + (G(t,:)*P_T(:,:,t)*G(t,:)'));
end
R = (1/T)*R;
if i>1
err = norm(x_T-x_T_old)/norm(x_T_old);
if err < 1e-4
converged = 1;
end
end
x_T_old = x_T;
i = i+1;
end
fprintf('EM algorithm iterated %d times\n',i);
end
This iterates until convergence (which never happens due to my issue) and calls smoother.m at each iteration:
function [x_T, P_T, P_Tm2] = smoother(G,F,Q,R,mu0,Sigma,y)
% G is T*p
p = size(mu0,1); % mu0 is p*1
T = size(y,1); % y is T*1
J = zeros(p,p,T);
K = zeros(p,T); % gain matrix
x = zeros(p,T);
x(:,1) = mu0;
x_m1 = zeros(p,T);
x_T = zeros(p,T); % x values when we know all the data
% Notation : x = xt given t ; x_m1 = xt given t-1 (m1 stands for minus
% one)
P = zeros(p,p,T);% array of cov(xt|y1...yt), eq (6) in Shumway & Stoffer 1982
P(:,:,1) = Sigma;
P_m1 = zeros(p,p,T); % Same notation ; = cov(xt, xt-1|y1...yt) , eq (7)
P_T = zeros(p,p,T);
P_Tm2 = zeros(p,p,T); % cov(xT, xT-1|y1...yT)
for t = 2:T %starts at t = 2 because at each time t we need info about t-1
x_m1(:,t) = F*x(:,t-1); % eq A3 ; pxp * px1 = px1
P_m1(:,:,t) = (F*P(:,:,t-1)*F') + Q; % A4 ; pxp * pxp = pxp
if nnz(isnan(P_m1(:,:,t)))
error('NaNs in P_m1 at time t = %d',t);
end
if nnz(isinf(P_m1(:,:,t)))
error('Infs in P_m1 at time t = %d',t);
end
K(:,t) = P_m1(:,:,t)*G(t,:)'*pinv((G(t,:)*P_m1(:,:,t)*G(t,:)') + R); %A5 ; pxp * px1 * 1*1 = p*1
%K(:,t) = P_m1(:,:,t)*G(t,:)'/((G(t,:)*P_m1(:,:,t)*G(t,:)') + R); %A5 ; pxp * px1 * 1*1 = p*1
% The matrix inversion seems to generate NaN values which quickly
% contaminate all the other matrices. There is no warning about
% (close to) singular matrices or whatever. The use of pinv()
% instead of inv() seems to solve the problem... but I don't think
% it's the appropriate way to deal with it, there must be something
% wrong elsewhere
if nnz(isnan(K(:,t)))
error('NaNs in K at time t = %d',t);
end
x(:,t) = x_m1(:,t) + (K(:,t)*(y(t)-(G(t,:)*x_m1(:,t)))); %A6
P(:,:,t) = P_m1(:,:,t) - (K(:,t)*G(t,:)*P_m1(:,:,t)); %A7
end
x_T(:,T) = x(:,T);
P_T(:,:,T) = P(:,:,T);
for t = T:-1:2 % we stop at 2 since we need to use t-1
%P_m1 seem to get really huge (x10^22...), might lead to "Inf"
%values which in turn might screw pinv()
%% inv() caused NaN value to appear, pinv seems to solve the issue
J(:,:,t-1) = P(:,:,t-1)*F'*pinv(P_m1(:,:,t)); % A8 pxp * pxp * pxp
%J(:,:,t-1) = P(:,:,t-1)*F'/(P_m1(:,:,t)); % A8 pxp * pxp * pxp
x_T(:,t-1) = x(:,t-1) + J(:,:,t-1)*(x_T(:,t)-(F*x(:,t-1))); %A9 % Becomes NaN during 8th iteration!
P_T(:,:,t-1) = P(:,:,t-1) + J(:,:,t-1)*(P_T(:,:,t)-P_m1(:,:,t))*J(:,:,t-1)'; %A10
nans = [nnz(isnan(J)) nnz(isnan(P_m1)) nnz(isnan(F)) nnz(isnan(x_T)) nnz(isnan(x_m1))];
if nnz(nans)
error('NaN invasion at time t = %d',t);
end
end
P_Tm2(:,:,T) = (eye(p) - K(:,T)*G(T,:))*F*P(:,:,T-1); % %A12
for t = T:-1:3 % stop at 3 because use of t-2
P_Tm2(:,:,t-1) = P_m1(:,:,t-1)*J(:,:,t-2)' + J(:,:,t-1)*(P_Tm2(:,:,t)-F*P(:,:,t-1))*J(:,:,t-2)'; % A11
end
end
The NaNs and Infs start popping around the ~8th iteration.
I guess in there somewhere I'm doing something unholy with my matrices, but I really have no clue about what's wrong. I trust your expertise.
Thanks in advance for the help.
Rody :
Here is how I generate the data (it's not "real world" data yet, just some test data generated to check that nothing goes wront) :
T = 500;
nbassets = 3;
G = .1 + randn(T,nbassets); % random walk trajectories
y = (1:T).';
y = 1.01.^y; % 1 * T % Exponential 1% returns curve
Dan :
You're right. I indeed lack the math background to really understand how the formulas are derived. I know it doesn't help, but I'm not sure I can remedy that for the time being. :/
Rody : Yes indeed, I arrived at the same conclusion. But I really have no clue what makes it go wrong like that.
Here is a link to the paper :
http://www.stat.pitt.edu/stoffer/em.pdf
The formulas for the smoother are all at the very end, in the appendix. Thanks for your time so far.
As the user appears to have inserted the answer into his question I will post it here:
As mentioned by #Rody the cause of the problem was that the use of inv created NaN or Inf values.
The user 'solved' this by using pinv instead.

Cubic Spline Program

I'm trying to write a cubic spline interpolation program. I have written the program but, the graph is not coming out correctly. The spline uses natural boundary conditions(second dervative at start/end node are 0). The code is in Matlab and is shown below,
clear all
%Function to Interpolate
k = 10; %Number of Support Nodes-1
xs(1) = -1;
for j = 1:k
xs(j+1) = -1 +2*j/k; %Support Nodes(Equidistant)
end;
fs = 1./(25.*xs.^2+1); %Support Ordinates
x = [-0.99:2/(2*k):0.99]; %Places to Evaluate Function
fx = 1./(25.*x.^2+1); %Function Evaluated at x
%Cubic Spline Code(Coefficients to Calculate 2nd Derivatives)
f(1) = 2*(xs(3)-xs(1));
g(1) = xs(3)-xs(2);
r(1) = (6/(xs(3)-xs(2)))*(fs(3)-fs(2)) + (6/(xs(2)-xs(1)))*(fs(1)-fs(2));
e(1) = 0;
for i = 2:k-2
e(i) = xs(i+1)-xs(i);
f(i) = 2*(xs(i+2)-xs(i));
g(i) = xs(i+2)-xs(i+1);
r(i) = (6/(xs(i+2)-xs(i+1)))*(fs(i+2)-fs(i+1)) + ...
(6/(xs(i+1)-xs(i)))*(fs(i)-fs(i+1));
end
e(k-1) = xs(k)-xs(k-1);
f(k-1) = 2*(xs(k+1)-xs(k-1));
r(k-1) = (6/(xs(k+1)-xs(k)))*(fs(k+1)-fs(k)) + ...
(6/(xs(k)-xs(k-1)))*(fs(k-1)-fs(k));
%Tridiagonal System
i = 1;
A = zeros(k-1,k-1);
while i < size(A)+1;
A(i,i) = f(i);
if i < size(A);
A(i,i+1) = g(i);
A(i+1,i) = e(i);
end
i = i+1;
end
for i = 2:k-1 %Decomposition
e(i) = e(i)/f(i-1);
f(i) = f(i)-e(i)*g(i-1);
end
for i = 2:k-1 %Forward Substitution
r(i) = r(i)-e(i)*r(i-1);
end
xn(k-1)= r(k-1)/f(k-1);
for i = k-2:-1:1 %Back Substitution
xn(i) = (r(i)-g(i)*xn(i+1))/f(i);
end
%Interpolation
if (max(xs) <= max(x))
error('Outside Range');
end
if (min(xs) >= min(x))
error('Outside Range');
end
P = zeros(size(length(x),length(x)));
i = 1;
for Counter = 1:length(x)
for j = 1:k-1
a(j) = x(Counter)- xs(j);
end
i = find(a == min(a(a>=0)));
if i == 1
c1 = 0;
c2 = xn(1)/6/(xs(2)-xs(1));
c3 = fs(1)/(xs(2)-xs(1));
c4 = fs(2)/(xs(2)-xs(1))-xn(1)*(xs(2)-xs(1))/6;
t1 = c1*(xs(2)-x(Counter))^3;
t2 = c2*(x(Counter)-xs(1))^3;
t3 = c3*(xs(2)-x(Counter));
t4 = c4*(x(Counter)-xs(1));
P(Counter) = t1 +t2 +t3 +t4;
else
if i < k-1
c1 = xn(i-1+1)/6/(xs(i+1)-xs(i-1+1));
c2 = xn(i+1)/6/(xs(i+1)-xs(i-1+1));
c3 = fs(i-1+1)/(xs(i+1)-xs(i-1+1))-xn(i-1+1)*(xs(i+1)-xs(i-1+1))/6;
c4 = fs(i+1)/(xs(i+1)-xs(i-1+1))-xn(i+1)*(xs(i+1)-xs(i-1+1))/6;
t1 = c1*(xs(i+1)-x(Counter))^3;
t2 = c2*(x(Counter)-xs(i-1+1))^3;
t3 = c3*(xs(i+1)-x(Counter));
t4 = c4*(x(Counter)-xs(i-1+1));
P(Counter) = t1 +t2 +t3 +t4;
else
c1 = xn(i-1+1)/6/(xs(i+1)-xs(i-1+1));
c2 = 0;
c3 = fs(i-1+1)/(xs(i+1)-xs(i-1+1))-xn(i-1+1)*(xs(i+1)-xs(i-1+1))/6;
c4 = fs(i+1)/(xs(i+1)-xs(i-1+1));
t1 = c1*(xs(i+1)-x(Counter))^3;
t2 = c2*(x(Counter)-xs(i-1+1))^3;
t3 = c3*(xs(i+1)-x(Counter));
t4 = c4*(x(Counter)-xs(i-1+1));
P(Counter) = t1 +t2 +t3 +t4;
end
end
end
P = P';
P(length(x)) = NaN;
plot(x,P,x,fx)
When I run the code, the interpolation function is not symmetric and, it doesn't converge correctly. Can anyone offer any suggestions about problems in my code? Thanks.
I wrote a cubic spline package in Mathematica a long time ago. Here is my translation of that package into Matlab. Note I haven't looked at cubic splines in about 7 years, so I'm basing this off my own documentation. You should check everything I say.
The basic problem is we are given n data points (x(1), y(1)) , ... , (x(n), y(n)) and we wish to calculate a piecewise cubic interpolant. The interpolant is defined as
S(x) = { Sk(x) when x(k) <= x <= x(k+1)
{ 0 otherwise
Here Sk(x) is a cubic polynomial of the form
Sk(x) = sk0 + sk1*(x-x(k)) + sk2*(x-x(k))^2 + sk3*(x-x(k))^3
The properties of the spline are:
The spline pass through the data point Sk(x(k)) = y(k)
The spline is continuous at the end-points and thus continuous everywhere in the interpolation interval Sk(x(k+1)) = Sk+1(x(k+1))
The spline has continuous first derivative Sk'(x(k+1)) = Sk+1'(x(k+1))
The spline has continuous second derivative Sk''(x(k+1)) = Sk+1''(x(k+1))
To construct a cubic spline from a set of data point we need to solve for the coefficients
sk0, sk1, sk2 and sk3 for each of the n-1 cubic polynomials. That is a total of 4*(n-1) = 4*n - 4 unknowns. Property 1 supplies n constraints, and properties 2,3,4 each supply an additional n-2 constraints. Thus we have n + 3*(n-2) = 4*n - 6 constraints and 4*n - 4 unknowns. This leaves two degrees of freedom. We fix these degrees of freedom by setting the second derivative equal to zero at the start and end nodes.
Let m(k) = Sk''(x(k)) , h(k) = x(k+1) - x(k) and d(k) = (y(k+1) - y(k))/h(k). The following
three-term recurrence relation holds
h(k-1)*m(k-1) + 2*(h(k-1) + h(k))*m(k) + h(k)*m(k+1) = 6*(d(k) - d(k-1))
The m(k) are unknowns we wish to solve for. The h(k) and d(k) are defined by the input data.
This three-term recurrence relation defines a tridiagonal linear system. Once the m(k) are determined the coefficients for Sk are given by
sk0 = y(k)
sk1 = d(k) - h(k)*(2*m(k) + m(k-1))/6
sk2 = m(k)/2
sk3 = m(k+1) - m(k)/(6*h(k))
Okay that is all the math you need to know to completely define the algorithm to compute a cubic spline. Here it is in Matlab:
function [s0,s1,s2,s3]=cubic_spline(x,y)
if any(size(x) ~= size(y)) || size(x,2) ~= 1
error('inputs x and y must be column vectors of equal length');
end
n = length(x)
h = x(2:n) - x(1:n-1);
d = (y(2:n) - y(1:n-1))./h;
lower = h(1:end-1);
main = 2*(h(1:end-1) + h(2:end));
upper = h(2:end);
T = spdiags([lower main upper], [-1 0 1], n-2, n-2);
rhs = 6*(d(2:end)-d(1:end-1));
m = T\rhs;
% Use natural boundary conditions where second derivative
% is zero at the endpoints
m = [ 0; m; 0];
s0 = y;
s1 = d - h.*(2*m(1:end-1) + m(2:end))/6;
s2 = m/2;
s3 =(m(2:end)-m(1:end-1))./(6*h);
Here is some code to plot a cubic spline:
function plot_cubic_spline(x,s0,s1,s2,s3)
n = length(x);
inner_points = 20;
for i=1:n-1
xx = linspace(x(i),x(i+1),inner_points);
xi = repmat(x(i),1,inner_points);
yy = s0(i) + s1(i)*(xx-xi) + ...
s2(i)*(xx-xi).^2 + s3(i)*(xx - xi).^3;
plot(xx,yy,'b')
plot(x(i),0,'r');
end
Here is a function that constructs a cubic spline and plots in on the famous Runge function:
function cubic_driver(num_points)
runge = #(x) 1./(1+ 25*x.^2);
x = linspace(-1,1,num_points);
y = runge(x);
[s0,s1,s2,s3] = cubic_spline(x',y');
plot_points = 1000;
xx = linspace(-1,1,plot_points);
yy = runge(xx);
plot(xx,yy,'g');
hold on;
plot_cubic_spline(x,s0,s1,s2,s3);
You can see it in action by running the following at the Matlab prompt
>> cubic_driver(5)
>> clf
>> cubic_driver(10)
>> clf
>> cubic_driver(20)
By the time you have twenty nodes your interpolant is visually indistinguishable from the Runge function.
Some comments on the Matlab code: I don't use any for or while loops. I am able to vectorize all operations. I quickly form the sparse tridiagonal matrix with spdiags. I solve it using the backslash operator. I counting on Tim Davis's UMFPACK to handle the decomposition and forward and backward solves.
Hope that helps. The code is available as a gist on github https://gist.github.com/1269709
There was a bug in spline function, generated (n-2) by (n-2) should be symmetric:
lower = h(2:end);
main = 2*(h(1:end-1) + h(2:end));
upper = h(1:end-1);
http://www.mpi-hd.mpg.de/astrophysik/HEA/internal/Numerical_Recipes/f3-3.pdf