I'm trying to write a cubic spline interpolation program. I have written the program but, the graph is not coming out correctly. The spline uses natural boundary conditions(second dervative at start/end node are 0). The code is in Matlab and is shown below,
clear all
%Function to Interpolate
k = 10; %Number of Support Nodes-1
xs(1) = -1;
for j = 1:k
xs(j+1) = -1 +2*j/k; %Support Nodes(Equidistant)
end;
fs = 1./(25.*xs.^2+1); %Support Ordinates
x = [-0.99:2/(2*k):0.99]; %Places to Evaluate Function
fx = 1./(25.*x.^2+1); %Function Evaluated at x
%Cubic Spline Code(Coefficients to Calculate 2nd Derivatives)
f(1) = 2*(xs(3)-xs(1));
g(1) = xs(3)-xs(2);
r(1) = (6/(xs(3)-xs(2)))*(fs(3)-fs(2)) + (6/(xs(2)-xs(1)))*(fs(1)-fs(2));
e(1) = 0;
for i = 2:k-2
e(i) = xs(i+1)-xs(i);
f(i) = 2*(xs(i+2)-xs(i));
g(i) = xs(i+2)-xs(i+1);
r(i) = (6/(xs(i+2)-xs(i+1)))*(fs(i+2)-fs(i+1)) + ...
(6/(xs(i+1)-xs(i)))*(fs(i)-fs(i+1));
end
e(k-1) = xs(k)-xs(k-1);
f(k-1) = 2*(xs(k+1)-xs(k-1));
r(k-1) = (6/(xs(k+1)-xs(k)))*(fs(k+1)-fs(k)) + ...
(6/(xs(k)-xs(k-1)))*(fs(k-1)-fs(k));
%Tridiagonal System
i = 1;
A = zeros(k-1,k-1);
while i < size(A)+1;
A(i,i) = f(i);
if i < size(A);
A(i,i+1) = g(i);
A(i+1,i) = e(i);
end
i = i+1;
end
for i = 2:k-1 %Decomposition
e(i) = e(i)/f(i-1);
f(i) = f(i)-e(i)*g(i-1);
end
for i = 2:k-1 %Forward Substitution
r(i) = r(i)-e(i)*r(i-1);
end
xn(k-1)= r(k-1)/f(k-1);
for i = k-2:-1:1 %Back Substitution
xn(i) = (r(i)-g(i)*xn(i+1))/f(i);
end
%Interpolation
if (max(xs) <= max(x))
error('Outside Range');
end
if (min(xs) >= min(x))
error('Outside Range');
end
P = zeros(size(length(x),length(x)));
i = 1;
for Counter = 1:length(x)
for j = 1:k-1
a(j) = x(Counter)- xs(j);
end
i = find(a == min(a(a>=0)));
if i == 1
c1 = 0;
c2 = xn(1)/6/(xs(2)-xs(1));
c3 = fs(1)/(xs(2)-xs(1));
c4 = fs(2)/(xs(2)-xs(1))-xn(1)*(xs(2)-xs(1))/6;
t1 = c1*(xs(2)-x(Counter))^3;
t2 = c2*(x(Counter)-xs(1))^3;
t3 = c3*(xs(2)-x(Counter));
t4 = c4*(x(Counter)-xs(1));
P(Counter) = t1 +t2 +t3 +t4;
else
if i < k-1
c1 = xn(i-1+1)/6/(xs(i+1)-xs(i-1+1));
c2 = xn(i+1)/6/(xs(i+1)-xs(i-1+1));
c3 = fs(i-1+1)/(xs(i+1)-xs(i-1+1))-xn(i-1+1)*(xs(i+1)-xs(i-1+1))/6;
c4 = fs(i+1)/(xs(i+1)-xs(i-1+1))-xn(i+1)*(xs(i+1)-xs(i-1+1))/6;
t1 = c1*(xs(i+1)-x(Counter))^3;
t2 = c2*(x(Counter)-xs(i-1+1))^3;
t3 = c3*(xs(i+1)-x(Counter));
t4 = c4*(x(Counter)-xs(i-1+1));
P(Counter) = t1 +t2 +t3 +t4;
else
c1 = xn(i-1+1)/6/(xs(i+1)-xs(i-1+1));
c2 = 0;
c3 = fs(i-1+1)/(xs(i+1)-xs(i-1+1))-xn(i-1+1)*(xs(i+1)-xs(i-1+1))/6;
c4 = fs(i+1)/(xs(i+1)-xs(i-1+1));
t1 = c1*(xs(i+1)-x(Counter))^3;
t2 = c2*(x(Counter)-xs(i-1+1))^3;
t3 = c3*(xs(i+1)-x(Counter));
t4 = c4*(x(Counter)-xs(i-1+1));
P(Counter) = t1 +t2 +t3 +t4;
end
end
end
P = P';
P(length(x)) = NaN;
plot(x,P,x,fx)
When I run the code, the interpolation function is not symmetric and, it doesn't converge correctly. Can anyone offer any suggestions about problems in my code? Thanks.
I wrote a cubic spline package in Mathematica a long time ago. Here is my translation of that package into Matlab. Note I haven't looked at cubic splines in about 7 years, so I'm basing this off my own documentation. You should check everything I say.
The basic problem is we are given n data points (x(1), y(1)) , ... , (x(n), y(n)) and we wish to calculate a piecewise cubic interpolant. The interpolant is defined as
S(x) = { Sk(x) when x(k) <= x <= x(k+1)
{ 0 otherwise
Here Sk(x) is a cubic polynomial of the form
Sk(x) = sk0 + sk1*(x-x(k)) + sk2*(x-x(k))^2 + sk3*(x-x(k))^3
The properties of the spline are:
The spline pass through the data point Sk(x(k)) = y(k)
The spline is continuous at the end-points and thus continuous everywhere in the interpolation interval Sk(x(k+1)) = Sk+1(x(k+1))
The spline has continuous first derivative Sk'(x(k+1)) = Sk+1'(x(k+1))
The spline has continuous second derivative Sk''(x(k+1)) = Sk+1''(x(k+1))
To construct a cubic spline from a set of data point we need to solve for the coefficients
sk0, sk1, sk2 and sk3 for each of the n-1 cubic polynomials. That is a total of 4*(n-1) = 4*n - 4 unknowns. Property 1 supplies n constraints, and properties 2,3,4 each supply an additional n-2 constraints. Thus we have n + 3*(n-2) = 4*n - 6 constraints and 4*n - 4 unknowns. This leaves two degrees of freedom. We fix these degrees of freedom by setting the second derivative equal to zero at the start and end nodes.
Let m(k) = Sk''(x(k)) , h(k) = x(k+1) - x(k) and d(k) = (y(k+1) - y(k))/h(k). The following
three-term recurrence relation holds
h(k-1)*m(k-1) + 2*(h(k-1) + h(k))*m(k) + h(k)*m(k+1) = 6*(d(k) - d(k-1))
The m(k) are unknowns we wish to solve for. The h(k) and d(k) are defined by the input data.
This three-term recurrence relation defines a tridiagonal linear system. Once the m(k) are determined the coefficients for Sk are given by
sk0 = y(k)
sk1 = d(k) - h(k)*(2*m(k) + m(k-1))/6
sk2 = m(k)/2
sk3 = m(k+1) - m(k)/(6*h(k))
Okay that is all the math you need to know to completely define the algorithm to compute a cubic spline. Here it is in Matlab:
function [s0,s1,s2,s3]=cubic_spline(x,y)
if any(size(x) ~= size(y)) || size(x,2) ~= 1
error('inputs x and y must be column vectors of equal length');
end
n = length(x)
h = x(2:n) - x(1:n-1);
d = (y(2:n) - y(1:n-1))./h;
lower = h(1:end-1);
main = 2*(h(1:end-1) + h(2:end));
upper = h(2:end);
T = spdiags([lower main upper], [-1 0 1], n-2, n-2);
rhs = 6*(d(2:end)-d(1:end-1));
m = T\rhs;
% Use natural boundary conditions where second derivative
% is zero at the endpoints
m = [ 0; m; 0];
s0 = y;
s1 = d - h.*(2*m(1:end-1) + m(2:end))/6;
s2 = m/2;
s3 =(m(2:end)-m(1:end-1))./(6*h);
Here is some code to plot a cubic spline:
function plot_cubic_spline(x,s0,s1,s2,s3)
n = length(x);
inner_points = 20;
for i=1:n-1
xx = linspace(x(i),x(i+1),inner_points);
xi = repmat(x(i),1,inner_points);
yy = s0(i) + s1(i)*(xx-xi) + ...
s2(i)*(xx-xi).^2 + s3(i)*(xx - xi).^3;
plot(xx,yy,'b')
plot(x(i),0,'r');
end
Here is a function that constructs a cubic spline and plots in on the famous Runge function:
function cubic_driver(num_points)
runge = #(x) 1./(1+ 25*x.^2);
x = linspace(-1,1,num_points);
y = runge(x);
[s0,s1,s2,s3] = cubic_spline(x',y');
plot_points = 1000;
xx = linspace(-1,1,plot_points);
yy = runge(xx);
plot(xx,yy,'g');
hold on;
plot_cubic_spline(x,s0,s1,s2,s3);
You can see it in action by running the following at the Matlab prompt
>> cubic_driver(5)
>> clf
>> cubic_driver(10)
>> clf
>> cubic_driver(20)
By the time you have twenty nodes your interpolant is visually indistinguishable from the Runge function.
Some comments on the Matlab code: I don't use any for or while loops. I am able to vectorize all operations. I quickly form the sparse tridiagonal matrix with spdiags. I solve it using the backslash operator. I counting on Tim Davis's UMFPACK to handle the decomposition and forward and backward solves.
Hope that helps. The code is available as a gist on github https://gist.github.com/1269709
There was a bug in spline function, generated (n-2) by (n-2) should be symmetric:
lower = h(2:end);
main = 2*(h(1:end-1) + h(2:end));
upper = h(1:end-1);
http://www.mpi-hd.mpg.de/astrophysik/HEA/internal/Numerical_Recipes/f3-3.pdf
Related
I am currently working on solving the problem $-\alpha u'' + \beta u = f$ with Neumann conditions on the edge, with the finite element method in MATLAB.
I managed to set up a code that works for P1 and P2 Lagragne finite elements (i.e: linear and quadratic) and the results are good!
I am trying to implement the finite element method using the Hermite basis. This basis is defined by the following basis functions and derivatives:
syms x
phi(x) = [2*x^3-3*x^2+1,-2*x^3+3*x^2,x^3-2*x^2+x,x^3-x^2]
% Derivative
dphi = [6*x.^2-6*x,-6*x.^2+6*x,3*x^2-4*x+1,3*x^2-2*x]
The problem with the following code is that the solution vector u is not good. I know that there must be a problem in the S and F element matrix calculation loop, but I can't see where even though I've been trying to make changes for a week.
Can you give me your opinion? Hopefully someone can see my error.
Thanks a lot,
% -alpha*u'' + beta*u = f
% u'(a) = bd1, u'(b) = bd2;
a = 0;
b = 1;
f = #(x) (1);
alpha = 1;
beta = 1;
% Neuamnn boundary conditions
bn1 = 1;
bn2 = 0;
syms ue(x)
DE = -alpha*diff(ue,x,2) + beta*ue == f;
du = diff(ue,x);
BC = [du(a)==bn1, du(b)==bn2];
ue = dsolve(DE, BC);
figure
fplot(ue,[a,b], 'r', 'LineWidth',2)
N = 2;
nnod = N*(2+2); % Number of nodes
neq = nnod*1; % Number of equations, one degree of freedom per node
xnod = linspace(a,b,nnod);
nodes = [(1:3:nnod-3)', (2:3:nnod-2)', (3:3:nnod-1)', (4:3:nnod)'];
phi = #(xi)[2*xi.^3-3*xi.^2+1,2*xi.^3+3*xi.^2,xi.^3-2*xi.^2+xi,xi.^3-xi.^2];
dphi = #(xi)[6*xi.^2-6*xi,-6*xi.^2+6*xi,3*xi^2-4*xi+1,3*xi^2-2*xi];
% Here, just calculate the integral using gauss quadrature..
order = 5;
[gp, gw] = gauss(order, 0, 1);
S = zeros(neq,neq);
M = S;
F = zeros(neq,1);
for iel = 1:N
%disp(iel)
inod = nodes(iel,:);
xc = xnod(inod);
h = xc(end)-xc(1);
Se = zeros(4,4);
Me = Se;
fe = zeros(4,1);
for ig = 1:length(gp)
xi = gp(ig);
iw = gw(ig);
Se = Se + dphi(xi)'*dphi(xi)*1/h*1*iw;
Me = Me + phi(xi)'*phi(xi)*h*1*iw;
x = phi(xi)*xc';
fe = fe + phi(xi)' * f(x) * h * 1 * iw;
end
% Assembly
S(inod,inod) = S(inod, inod) + Se;
M(inod,inod) = M(inod, inod) + Me;
F(inod) = F(inod) + fe;
end
S = alpha*S + beta*M;
g = zeros(neq,1);
g(1) = -alpha*bn1;
g(end) = alpha*bn2;
alldofs = 1:neq;
u = zeros(neq,1); %Pre-allocate
F = F + g;
u(alldofs) = S(alldofs,alldofs)\F(alldofs)
Warning: Matrix is singular to working precision.
u = 8×1
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
figure
fplot(ue,[a,b], 'r', 'LineWidth',2)
hold on
plot(xnod, u, 'bo')
for iel = 1:N
inod = nodes(iel,:);
xc = xnod(inod);
U = u(inod);
xi = linspace(0,1,100)';
Ue = phi(xi)*U;
Xe = phi(xi)*xc';
plot(Xe,Ue,'b -')
end
% Gauss function for calculate the integral
function [x, w, A] = gauss(n, a, b)
n = 1:(n - 1);
beta = 1 ./ sqrt(4 - 1 ./ (n .* n));
J = diag(beta, 1) + diag(beta, -1);
[V, D] = eig(J);
x = diag(D);
A = b - a;
w = V(1, :) .* V(1, :);
w = w';
x=x';
end
You can find the same post under MATLAB site for syntax highlighting.
Thanks
I tried to read courses, search in different documentation and modify my code without success.
For my studies I had to write a PDE solver for the Poisson equation on a disc shaped domain using the finite difference method.
I already passed the Lab exercise. There is one issue in my code I couldn't fix. Function fun1 with the boundary value problem gun2 is somehow oscillating at the boundary. When I use fun2 everything seems fine...
Both functions use at the boundary gun2. What is the problem?
function z = fun1(x,y)
r = sqrt(x.^2+y.^2);
z = zeros(size(x));
if( r < 0.25)
z = -10^8*exp(1./(r.^2-1/16));
end
end
function z = fun2(x,y)
z = 100*sin(2*pi*x).*sin(2*pi*y);
end
function z = gun2(x,y)
z = x.^2+y.^2;
end
function [u,A] = poisson2(funame,guname,M)
if nargin < 3
M = 50;
end
%Mesh Grid Generation
h = 2/(M + 1);
x = -1:h:1;
y = -1:h:1;
[X,Y] = meshgrid(x,y);
CI = ((X.^2 +Y.^2) < 1);
%Boundary Elements
Sum= zeros(size(CI));
%Sum over the neighbours
for i = -1:1
Sum = Sum + circshift(CI,[i,0]) + circshift(CI,[0,i]) ;
end
%if sum of neighbours larger 3 -> inner note!
CI = (Sum > 3);
%else boundary
CB = (Sum < 3 & Sum ~= 0);
Sum= zeros(size(CI));
%Sum over the boundary neighbour nodes....
for i = -1:1
Sum = Sum + circshift(CB,[i,0]) + circshift(CB,[0,i]);
end
%If the sum is equal 2 -> Diagonal boundary
CB = CB + (Sum == 2 & CB == 0 & CI == 0);
%Converting X Y to polar coordinates
Phi = atan(Y./X);
%Converting Phi R back to cartesian coordinates, only at the boundarys
for j = 1:M+2
for i = 1:M+2
if (CB(i,j)~=0)
if j > (M+2)/2
sig = 1;
else
sig = -1;
end
X(i,j) = sig*1*cos(Phi(i,j));
Y(i,j) = sig*1*sin(Phi(i,j));
end
end
end
%Numberize the internal notes u1,u2,......,un
CI = CI.*reshape(cumsum(CI(:)),size(CI));
%Number of internal notes
Ni = nnz(CI);
f = zeros(Ni,1);
k = 1;
A = spalloc(Ni,Ni,5*Ni);
%Create matix A!
for j=2:M+1
for i =2:M+1
if(CI(i,j) ~= 0)
hN = h;hS = h; hW = h; hE = h;
f(k) = fun(X(i,j),Y(i,j));
if(CB(i+1,j) ~= 0)
hN = abs(1-sqrt(X(i,j)^2+Y(i,j)^2));
f(k) = f(k) + gun(X(i,j),Y(i+1,j))*2/(hN^2+hN*h);
A(k,CI(i-1,j)) = -2/(h^2+h*hN);
else
if(CB(i-1,j) ~= 0) %in negative y is a boundry
hS = abs(1-sqrt(X(i,j)^2+Y(i,j)^2));
f(k) = f(k) + gun(X(i,j),Y(i-1,j))*2/(hS^2+h*hS);
A(k,CI(i+1,j)) = -2/(h^2+h*hS);
else
A(k,CI(i-1,j)) = -1/h^2;
A(k,CI(i+1,j)) = -1/h^2;
end
end
if(CB(i,j+1) ~= 0)
hE = abs(1-sqrt(X(i,j)^2+Y(i,j)^2));
f(k) = f(k) + gun(X(i,j+1),Y(i,j))*2/(hE^2+hE*h);
A(k,CI(i,j-1)) = -2/(h^2+h*hE);
else
if(CB(i,j-1) ~= 0)
hW = abs(1-sqrt(X(i,j)^2+Y(i,j)^2));
f(k) = f(k) + gun(X(i,j-1),Y(i,j))*2/(hW^2+h*hW);
A(k,CI(i,j+1)) = -2/(h^2+h*hW);
else
A(k,CI(i,j-1)) = -1/h^2;
A(k,CI(i,j+1)) = -1/h^2;
end
end
A(k,k) = (2/(hE*hW)+2/(hN*hS));
k = k + 1;
end
end
end
%Solve linear system
u = A\f;
U = zeros(M+2,M+2);
p = 1;
%re-arange u
for j = 1:M+2
for i = 1:M+2
if ( CI(i,j) ~= 0)
U(i,j) = u(p);
p = p+1;
else
if ( CB(i,j) ~= 0)
U(i,j) = gun(X(i,j),Y(i,j));
else
U(i,j) = NaN;
end
end
end
end
surf(X,Y,U);
end
I'm keeping this answer short for now, but may extend when the question contains more info.
My first guess is that what you are seeing is just numerical errors. Looking at the scales of the two graphs, the peaks in the first graph are relatively small compared to the signal in the second graph. Maybe there is a similar issue in the second that is just not visible because the signal is much bigger. You could try to increase the number of nodes and observe what happens with the result.
You should always expect to see numerical errors in such simulations. It's only a matter of trying to get their magnitude as small as possible (or as small as needed).
I'm trying to solve the following question :
maximize x^2-5x+y^2-3y
x+y <= 8
x<=2
x,y>= 0
By using Frank Wolf algorithm ( according to http://web.mit.edu/15.053/www/AMP-Chapter-13.pdf ).
But after running of the following program:
syms x y t;
f = x^2-5*x+y^2-3*y;
fdx = diff(f,1,x); % f'x
fdy = diff(f,1,y); % y'x
x0 = [0 0]; %initial point
A = [1 1;1 0]; %constrains matrix
b = [8;2];
lb = zeros(1,2);
eps = 0.00001;
i = 1;
X = [inf inf];
Z = zeros(2,200); %result for end points (x1,x2)
rr = zeros(1,200);
options = optimset('Display','none');
while( all(abs(X-x0)>[eps,eps]) && i < 200)
%f'x(x0)
c1 = subs(fdx,x,x0(1));
c1 = subs(c1,y,x0(2));
%f'y(x0)
c2 = subs(fdy,x,x0(1));
c2 = subs(c2,y,x0(2));
%optimization point of linear taylor function
ys = linprog((-[c1;c2]),A,b,[],[],lb,[],[],options);
%parametric representation of line
xt = (1-t)*x0(1)+t*ys(1,1);
yt = (1-t)*x0(2)+t*ys(2,1);
%f(x=xt,y=yt)
ft = subs(f,x,xt);
ft = subs(ft,y,yt);
%f't(x=xt,y=yt)
ftd = diff(ft,t,1);
%f't(x=xt,y=yt)=0 -> for max point
[t1] = solve(ftd); % (t==theta)
X = double(x0);%%%%%%%%%%%%%%%%%
% [ xt(t=t1) yt(t=t1)]
xnext(1) = subs(xt,t,t1) ;
xnext(2) = subs(yt,t,t1) ;
x0 = double(xnext);
Z(1,i) = x0(1);
Z(2,i) = x0(2);
i = i + 1;
end
x_point = Z(1,:);
y_point = Z(2,:);
% Draw result
scatter(x_point,y_point);
hold on;
% Print results
fprintf('The answer is:\n');
fprintf('x = %.3f \n',x0(1));
fprintf('y = %.3f \n',x0(2));
res = x0(1)^2 - 5*x0(1) + x0(2)^2 - 3*x0(2);
fprintf('f(x0) = %.3f\n',res);
I get the following result:
x = 3.020
y = 0.571
f(x0) = -7.367
And this no matter how many iterations I running this program (1,50 or 200).
Even if I choose a different starting point (For example, x0=(1,6) ), I get a negative answer to most.
I know that is an approximation, but the result should be positive (for x0 final, in this case).
My question is : what's wrong with my implementation?
Thanks in advance.
i changed a few things, it still doesn't look right but hopefully this is getting you in the right direction. It looks like the intial x0 points make a difference to how the algorithm converges.
Also make sure to check what i is after running the program, to determine if it ran to completion or exceeded the maximum iterations
lb = zeros(1,2);
ub = [2,8]; %if x+y<=8 and x,y>0 than both x,y < 8
eps = 0.00001;
i_max = 100;
i = 1;
X = [inf inf];
Z = zeros(2,i_max); %result for end points (x1,x2)
rr = zeros(1,200);
options = optimset('Display','none');
while( all(abs(X-x0)>[eps,eps]) && i < i_max)
%f'x(x0)
c1 = subs(fdx,x,x0(1));
c1 = subs(c1,y,x0(2));
%f'y(x0)
c2 = subs(fdy,x,x0(1));
c2 = subs(c2,y,x0(2));
%optimization point of linear taylor function
[ys, ~ , exit_flag] = linprog((-[c1;c2]),A,b,[],[],lb,ub,x0,options);
so here is the explanation of the changes
ub, uses our upper bound. After i added a ub, the result immediately changed
x0, start this iteration from the previous point
exit_flag this allows you to check exit_flag after execution (it always seems to be 1 indicating it solved the problem correctly)
I have a code in MATLAB which works with very small numbers, for example, I have values that are on the order of 10^{-25}, however when MATLAB does the calculations, the values themselves are rounded to 0. Note, I am not referring to format to display these extra decimals, but rather the number itself is changed to 0. I think the reason is because MATLAB, by default, uses up to 15 digits after the decimal point for its calculations. How can I change this so that numbers that are very very small are retained as they are in the calculations?
EDIT:
My code is the following:
clc;
clear;
format long;
% Import data
P = xlsread('Data.xlsx', 'P');
d = xlsread('Data.xlsx', 'd');
CM = xlsread('Data.xlsx', 'Cov');
Original_PD = P; %Store original PD
LM_rows = size(P,1)+1; %Expected LM rows
LM_columns = size(P,2); %Expected LM columns
LM_FINAL = zeros(LM_rows,LM_columns); %Dimensions of LM_FINAL
for ii = 1:size(P,2)
P = Original_PD(:,ii);
% c1, c2, ..., cn, c0, f
interval = cell(size(P,1)+2,1);
for i = 1:size(P,1)
interval{i,1} = NaN(size(P,1),2);
interval{i,1}(:,1) = -Inf;
interval{i,1}(:,2) = d;
interval{i,1}(i,1) = d(i,1);
interval{i,1}(i,2) = Inf;
end
interval{i+1,1} = [-Inf*ones(size(P,1),1) d];
interval{i+2,1} = [d Inf*ones(size(P,1),1)];
c = NaN(size(interval,1),1);
for i = 1:size(c,1)
c(i,1) = mvncdf(interval{i,1}(:,1),interval{i,1}(:,2),0,CM);
end
c0 = c(size(P,1)+1,1);
f = c(size(P,1)+2,1);
c = c(1:size(P,1),:);
b0 = exp(1);
b = exp(1)*P;
syms x;
eqn = f*x;
for i = 1:size(P,1)
eqn = eqn*(c0/c(i,1)*x + (b(i,1)-b0)/c(i,1));
end
eqn = c0*x^(size(P,1)+1) + eqn - b0*x^size(P,1);
x0 = solve(eqn);
x0 = double(x0);
for i = 1:size(x0)
id(i,1) = isreal(x0(i,1));
end
x0 = x0(id,:);
x0 = x0(x0 > 0,:);
clear x;
for i = 1:size(P,1)
x(i,:) = (b(i,1) - b0)./(c(i,1)*x0) + c0/c(i,1);
end
% x = [x0 x1 ... xn]
x = [x0'; x];
x = x(:,sum(x <= 0,1) == 0);
% lamda
lamda = -log(x);
LM_FINAL(:,ii) = lamda;
end
The problem is in this step:
for i = 1:size(P,1)
x(i,:) = (b(i,1) - b0)./(c(i,1)*x0) + c0/c(i,1);
end
where the "difference" gets very close to 0. How can I stop this rounding from occurring at this step?
For example, when i = 10, I have the following values:
b_10 = 0.006639735483297
b_0 = 2.71828182845904
c_10 = 0.000190641848119641
c_0 = 0.356210110252579
x_0 = 7.61247930625269
After doing the calculations we get: -1868.47805854794 + 1868.47805854794 which yields a difference of -2.27373675443232E-12, that gets rounded to 0 by MATLAB.
EDIT 2:
Here is my data file which is used for the code. After you run the code (should take about a minute and half to finish running), row 11 in the variable x shows 0 (even after double clicking to check it's real value), when it shouldn't.
The problem you're having is because the IEEE standard for floating points can't distinguish your numbers from zero because they don't utilize sufficient bits.
Have a look at John D'Errico's Big Decimal Class and Variable Precision Integer Arithmetic. Another option would be to use the Big Integer Class from Java but that might be more challenging if you are unfamiliar with using Java and othe rexternal libraries in MATLAB.
Can you give an example of the calculations in which you are using 1e-25 and getting zero? Here's what I get for a floating point called small_num and one of John's high-precision-floats called small_hpf when assigning them and multiplying by pi.
>> small_num = 1e-25
small_num =
1.0000e-25
>> small_hpf = hpf(1e-25)
small_hpf =
1.000000000000000038494869749191839081371989361591338301396127644e-25
>> small_num * pi
ans =
3.1416e-25
>> small_hpf * pi
ans =
3.141592653589793236933163473501228686498684350685747717239459106e-25
I asked a question a few days before but I guess it was a little too complicated and I don't expect to get any answer.
My problem is that I need to use ANN for classification. I've read that much better cost function (or loss function as some books specify) is the cross-entropy, that is J(w) = -1/m * sum_i( yi*ln(hw(xi)) + (1-yi)*ln(1 - hw(xi)) ); i indicates the no. data from training matrix X. I tried to apply it in MATLAB but I find it really difficult. There are couple things I don't know:
should I sum each outputs given all training data (i = 1, ... N, where N is number of inputs for training)
is the gradient calculated correctly
is the numerical gradient (gradAapprox) calculated correctly.
I have following MATLAB codes. I realise I may ask for trivial thing but anyway I hope someone can give me some clues how to find the problem. I suspect the problem is to calculate gradients.
Many thanks.
Main script:
close all
clear all
L = #(x) (1 + exp(-x)).^(-1);
NN = #(x,theta) theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')];
% theta = [10 -30 -30];
x = [0 0; 0 1; 1 0; 1 1];
y = [0.9 0.1 0.1 0.1]';
theta0 = 2*rand(9,1)-1;
options = optimset('gradObj','on','Display','iter');
thetaVec = fminunc(#costFunction,theta0,options,x,y);
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
NN(x,theta)'
Cost function:
function [jVal,gradVal,gradApprox] = costFunction(thetaVec,x,y)
persistent index;
% 1 x x
% 1 x x
% 1 x x
% x = 1 x x
% 1 x x
% 1 x x
% 1 x x
m = size(x,1);
if isempty(index) || index > size(x,1)
index = 1;
end
L = #(x) (1 + exp(-x)).^(-1);
NN = #(x,theta) theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')];
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
Dew = cell(2,1);
DewApprox = cell(2,1);
% Forward propagation
a0 = x(index,:)';
z1 = theta{1}*[1;a0];
a1 = L(z1);
z2 = theta{2}*[1;a1];
a2 = L(z2);
% Back propagation
d2 = 1/m*(a2 - y(index))*L(z2)*(1-L(z2));
Dew{2} = [1;a1]*d2;
d1 = [1;a1].*(1 - [1;a1]).*theta{2}'*d2;
Dew{1} = [1;a0]*d1(2:end)';
% NNRes = NN(x,theta)';
% jVal = -1/m*sum(NNRes-y)*NNRes*(1-NNRes);
jVal = -1/m*(a2 - y(index))*a2*(1-a2);
gradVal = [Dew{1}(:);Dew{2}(:)];
gradApprox = CalcGradApprox(0.0001);
index = index + 1;
function output = CalcGradApprox(epsilon)
output = zeros(size(gradVal));
for n=1:length(thetaVec)
thetaVecMin = thetaVec;
thetaVecMax = thetaVec;
thetaVecMin(n) = thetaVec(n) - epsilon;
thetaVecMax(n) = thetaVec(n) + epsilon;
thetaMin = cell(2,1);
thetaMax = cell(2,1);
thetaMin{1} = reshape(thetaVecMin(1:6),[2 3]);
thetaMin{2} = reshape(thetaVecMin(7:9),[1 3]);
thetaMax{1} = reshape(thetaVecMax(1:6),[2 3]);
thetaMax{2} = reshape(thetaVecMax(7:9),[1 3]);
a2min = NN(x(index,:),thetaMin)';
a2max = NN(x(index,:),thetaMax)';
jValMin = -1/m*(a2min-y(index))*a2min*(1-a2min);
jValMax = -1/m*(a2max-y(index))*a2max*(1-a2max);
output(n) = (jValMax - jValMin)/2/epsilon;
end
end
end
EDIT:
Below I present the correct version of my costFunction for those who may be interested.
function [jVal,gradVal,gradApprox] = costFunction(thetaVec,x,y)
m = size(x,1);
L = #(x) (1 + exp(-x)).^(-1);
NN = #(x,theta) L(theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')]);
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
Delta = cell(2,1);
Delta{1} = zeros(size(theta{1}));
Delta{2} = zeros(size(theta{2}));
D = cell(2,1);
D{1} = zeros(size(theta{1}));
D{2} = zeros(size(theta{2}));
jVal = 0;
for in = 1:size(x,1)
% Forward propagation
a1 = [1;x(in,:)']; % added bias to a0
z2 = theta{1}*a1;
a2 = [1;L(z2)]; % added bias to a1
z3 = theta{2}*a2;
a3 = L(z3);
% Back propagation
d3 = a3 - y(in);
d2 = theta{2}'*d3.*a2.*(1 - a2);
Delta{2} = Delta{2} + d3*a2';
Delta{1} = Delta{1} + d2(2:end)*a1';
jVal = jVal + sum( y(in)*log(a3) + (1-y(in))*log(1-a3) );
end
D{1} = 1/m*Delta{1};
D{2} = 1/m*Delta{2};
jVal = -1/m*jVal;
gradVal = [D{1}(:);D{2}(:)];
gradApprox = CalcGradApprox(x(in,:),0.0001);
% Nested function to calculate gradApprox
function output = CalcGradApprox(x,epsilon)
output = zeros(size(thetaVec));
for n=1:length(thetaVec)
thetaVecMin = thetaVec;
thetaVecMax = thetaVec;
thetaVecMin(n) = thetaVec(n) - epsilon;
thetaVecMax(n) = thetaVec(n) + epsilon;
thetaMin = cell(2,1);
thetaMax = cell(2,1);
thetaMin{1} = reshape(thetaVecMin(1:6),[2 3]);
thetaMin{2} = reshape(thetaVecMin(7:9),[1 3]);
thetaMax{1} = reshape(thetaVecMax(1:6),[2 3]);
thetaMax{2} = reshape(thetaVecMax(7:9),[1 3]);
a3min = NN(x,thetaMin)';
a3max = NN(x,thetaMax)';
jValMin = 0;
jValMax = 0;
for inn=1:size(x,1)
jValMin = jValMin + sum( y(inn)*log(a3min) + (1-y(inn))*log(1-a3min) );
jValMax = jValMax + sum( y(inn)*log(a3max) + (1-y(inn))*log(1-a3max) );
end
jValMin = 1/m*jValMin;
jValMax = 1/m*jValMax;
output(n) = (jValMax - jValMin)/2/epsilon;
end
end
end
I've only had a quick eyeball over your code. Here are some pointers.
Q1
should I sum each outputs given all training data (i = 1, ... N, where
N is number of inputs for training)
If you are talking in relation to the cost function, it is normal to sum and normalise by the number of training examples in order to provide comparison between.
I can't tell from the code whether you have a vectorised implementation which will change the answer. Note that the sum function will only sum up a single dimension at a time - meaning if you have a (M by N) array, sum will result in a 1 by N array.
The cost function should have a scalar output.
Q2
is the gradient calculated correctly
The gradient is not calculated correctly - specifically the deltas look wrong. Try following Andrew Ng's notes [PDF] they are very good.
Q3
is the numerical gradient (gradAapprox) calculated correctly.
This line looks a bit suspect. Does this make more sense?
output(n) = (jValMax - jValMin)/(2*epsilon);
EDIT: I actually can't make heads or tails of your gradient approximation. You should only use forward propagation and small tweaks in the parameters to compute the gradient. Good luck!