I am having a system of 6 ode.
What i am trying to manage, is to solve it, for several different values of one of its parameters, parameter 'p', using a for loop at the integration.
I have managed to do it using ode45, however,
using a non-built-in integrator, it seems i am making a mistake and I don't get back the same values, or graphs.
I would appreciate any feedback on my code on what went wrong
Using ODE45
p = -100:+1:350;
time = 0:.05:3;
initial = [0 0 0 0 0 0];
x = NaN(length(time),length(initial),length(p));
for i=1:length(p)
[t,x(:,:,i)] = ode45(#ode,time,initial,[],p(i));
end
figure(1)
plot3(squeeze(x(:,1,:)),squeeze(x(:,2,:)),squeeze(x(:,3,:)))
function dx = ode(~,x,p)
bla bla
end
Using the non-built-in integrator
figure
hold all
for p = -100:+1:350
inicond = [0 0 0 0 0 0];
dt = 0.01;
time = 0:dt:10;
[y] = integrator(#ode,inicond,time,dt,p);
x1 = y(:,1);
x2 = y(:,2);
x3 = y(:,3);
figure(1)
plot3(x1,x2,x3)
axis equal
legend('-DynamicLegend')
end
function [y] = integrator(ode,inicond,time,dt,p)
y = NaN(length(time),length(inicond));
y(1,:) = inicond;
for j=2:length(time)
k1 = dt*ode(y(j-1,:),p);
k2 = dt*ode(y(j-1,:)+k1/2,p);
k3 = dt*ode(y(j-1,:)+k2/2,p);
k4 = dt*ode(y(j-1,:)+k3,p);
y(j,:) = y(j-1,:) + k1/6+k2/3+k3/3+k4/6;
end
end
function [dydt] = ode(y,p)
bla bla
end
Using the non-built-in function to solve Van der Pol for multiple 'epsilon'values: This works., while for the previous it doesnt..
figure
hold all
for epsilon = 0:.2:5
inicond = [0.2 0.8];
dt = 0.1;%timestep for integration
time = 0:dt:100;
[x] = integrator(#VanDerPol,inicond,time,dt,epsilon);
xdot = x(:,2);
x = x(:,1);
figure(1)
plot(x,xdot,'DisplayName',sprintf('epsilon = %1.0f',epsilon))
figure(2)
plot3(time, x,xdot)
axis equal
legend('-DynamicLegend')
end
function [x] = integrator(VanDerPol,inicond,time,dt,epsilon)
x = NaN(length(time),length(inicond));
x(1,:) = inicond;
%%% ACTUAL CALL OF INTEGRATOR %%%%%%%%%%%%%%
for j=2:length(time)
k1 = dt*feval(VanDerPol,x(j-1,:),epsilon);
k2 = dt*feval(VanDerPol,x(j-1,:)+k1/2,epsilon);
k3 = dt*feval(VanDerPol,x(j-1,:)+k2/2,epsilon);
k4 = dt*feval(VanDerPol,x(j-1,:)+k3,epsilon);
x(j,:) = x(j-1,:) + k1/6+k2/3+k3/3+k4/6;
end
end
function [dxdt] = VanDerPol(x,epsilon)
dxdt=NaN(1,2);
dxdt(1,1) = x(:,2);
dxdt(1,2) = epsilon*(1 - x(:,1)^2)*x(:,2) - x(:,1);
end
I am afraid that with the second way, the values of all the repetitions might not be stored correctly and for this reason the values of the occurring matrices after integration, hardly vary one from another
Related
Consider the following problem:
I am now in the third part of this question. I wrote the vectorial loop equations (q=teta2, x=teta3 and y=teta4):
fval(1,1) = r2*cos(q)+r3*cos(x)-r4*cos(y)-r1;
fval(2,1) = r2*sin(q)+r3*sin(x)-r4*sin(y);
I have these 2 functions, and all variables except x and y are given. I found the roots with help of this video.
Now I need to plot graphs of q versus x and q versus y when q is at [0,2pi] with delta q of 2.5 degree. What should I do to plot the graphs?
Below is my attempt so far:
function [fval,jac] = lorenzSystem(X)
%Define variables
x = X(1);
y = X(2);
q = pi/2;
r2 = 15
r3 = 50
r4 = 45
r1 = 40
%Define f(x)
fval(1,1)=r2*cos(q)+r3*cos(x)-r4*cos(y)-r1;
fval(2,1)=r2*sin(q)+r3*sin(x)-r4*sin(y);
%Define Jacobian
jac = [-r3*sin(X(1)), r4*sin(X(2));
r3*cos(X(1)), -r4*cos(X(2))];
%% Multivariate NR
%Initial conditions:
X0 = [0.5;1];
maxIter = 50;
tolX = 1e-6;
X = X0;
Xold = X0;
for i = 1:maxIter
[f,j] = lorenzSystem(X);
X = X - inv(j)*f;
err(:,i) = abs(X-Xold);
Xold = X;
if (err(:,i)<tolX)
break;
end
end
Please take a look at my solution below, and study how it differs from your own.
function [th2,th3,th4] = q65270276()
[th2,th3,th4] = lorenzSystem();
hF = figure(); hAx = axes(hF);
plot(hAx, deg2rad(th2), deg2rad(th3), deg2rad(th2), deg2rad(th4));
xlabel(hAx, '\theta_2')
xticks(hAx, 0:pi/3:2*pi);
xticklabels(hAx, {'$0$','$\frac{\pi}{3}$','$\frac{2\pi}{3}$','$\pi$','$\frac{4\pi}{3}$','$\frac{5\pi}{3}$','$2\pi$'});
hAx.TickLabelInterpreter = 'latex';
yticks(hAx, 0:pi/6:pi);
yticklabels(hAx, {'$0$','$\frac{\pi}{6}$','$\frac{\pi}{3}$','$\frac{\pi}{2}$','$\frac{2\pi}{3}$','$\frac{5\pi}{6}$','$\pi$'});
set(hAx, 'XLim', [0 2*pi], 'YLim', [0 pi], 'FontSize', 16);
grid(hAx, 'on');
legend(hAx, '\theta_3', '\theta_4')
end
function [th2,th3,th4] = lorenzSystem()
th2 = (0:2.5:360).';
[th3,th4] = deal(zeros(size(th2)));
% Define geometry:
r1 = 40;
r2 = 15;
r3 = 50;
r4 = 45;
% Define the residual:
res = #(q,X)[r2*cosd(q)+r3*cosd(X(1))-r4*cosd(X(2))-r1; ... Δx=0
r2*sind(q)+r3*sind(X(1))-r4*sind(X(2))]; % Δy=0
% Define the Jacobian:
J = #(X)[-r3*sind(X(1)), r4*sind(X(2));
r3*cosd(X(1)), -r4*cosd(X(2))];
X0 = [acosd((45^2-25^2-50^2)/(-2*25*50)); 180-acosd((50^2-25^2-45^2)/(-2*25*45))]; % Accurate guess
maxIter = 500;
tolX = 1e-6;
for idx = 1:numel(th2)
X = X0;
Xold = X0;
err = zeros(maxIter, 1); % Preallocation
for it = 1:maxIter
% Update the guess
f = res( th2(idx), Xold );
X = Xold - J(Xold) \ f;
% X = X - pinv(J(X)) * res( q(idx), X ); % May help when J(X) is close to singular
% Determine convergence
err(it) = (X-Xold).' * (X-Xold);
if err(it) < tolX
break
end
% Update history
Xold = X;
end
% Unpack and store θ₃, θ₄
th3(idx) = X(1);
th4(idx) = X(2);
% Update X0 for faster convergence of the next case:
X0 = X;
end
end
Several notes:
All computations are performed in degrees.
The specific plotting code I used is less interesting, what matters is that I defined all θ₂ in advance, then looped over them to find θ₃ and θ₄ (without recursion, as was done in your own implementation).
The initial guess (actually, analytical solution) for the very first case (θ₂=0) can be found by solving the problem manually (i.e. "on paper") using the law of cosines. The solver also works for other guesses, but you might need to increase maxIter. Also, for certain guesses (e.g. X(1)==X(2)), the Jacobian is ill-conditioned, in which case you can use pinv.
If my computation is correct, this is the result:
I'm making a Matlab program for cell growth,
I have a problem with some function in main.m and I found some functions that I thought are written wrong way:
function y = f(c)
y = 0.5*(1-tanh(4*c-2));
function y = h(c)
y = 0.5*f(c);
function y = g(c)
global beta
y = beta*exp(beta*c);
%y=1+0.2*c;
main.m
%%main program
clear; clc;
global alpha beta gamma
%set parameter values
alpha = 0.9; beta = 0.5; gamma = 10;
dx = 1; X = 210; dt = 0.04; T = 16;
c0 = 1;
%set up arrays
x = [dx:dx:X]; Nx = round(X/dx); Nt = round(T/dt);
p = zeros(1,Nx); nextp = zeros(1,Nx);
q = zeros(1,Nx); nextq = zeros(1,Nx);
n = zeros(1,Nx); nextn = zeros(1,Nx);
u = zeros(1,Nx); v = zeros(1,Nx); r = zeros(1,Nx); c = zeros(1,Nx);
P = zeros(Nt,Nx); Q = zeros(Nt,Nx); N = zeros(Nt,Nx);
%set initial values
p = exp(-0.1.*x);
function y = f(c)
y = 0.5*(1-tanh(4*c-2));
function y = h(c)
y = 0.5*f(c);
function y = g(c)
global beta
y = beta*exp(beta*c);
%y=1+0.2*c;
%start FDM time-stepping
for k=1:Nt
r = p + q;
c = (c0.*gamma./(gamma+p)).*(1-alpha.*(p+q+n));
for i=2:Nx-1
u(i)=((p(i+1)-p(i-1))*r(i)*(r(i+1)-r(i-1))+ 4*p(i)*r(i)*...
(r(i+1)-2*r(i)+r(i-1))-p(i)*(r(i+1)-r(i-1))^2)/(2*...
(dx*r(i))^2);
v(i)=((q(i+1)-q(i-1))*r(i)*(r(i+1)-r(i-1))+ 4*q(i)*r(i)*...
(r(i+1)-2*r(i)+r(i-1))-q(i)*(r(i+1)-r(i-1))^2)/(2*...
(dx*r(i))^2);
end
nextp=p+dt.*(u+g(c).*p.*(1-(p+q+n))-f(c).*p);
nextq=q+dt.*(v+f(c).*p-h(c).*q);
nextn=n+dt.*(h(c).*q);
p=nextp;
q=nextq;
n=nextn;
P(k,:)=p; Q(k,:)=q; N(k,:)=n;
end
figure(1)
for n=1:500:Nt
plot(P(n,:),'LineWidth',1.2); hold on;
end
axis([0 270 0 0.6]);
figure(2)
for n=1:500:Nt
plot(Q(n,:),'LineWidth',1.2); hold on;
end
axis([0 270 0 0.6]);
figure(3)
for n=1:500:Nt
plot(N(n,:),'LineWidth',1.2); hold on;
end
axis([0 270 0 1]);
animation.m
%create image for cells
rand('state', sum(100*clock));
prefix='t';
Nm=0;
figure(1)
for n=1:250:Nt
Nm=Nm+1;
for i=1:Nx
tP=round(P(n,i)),tQ=round(Q(n,i)),tN=round(N(n,i));
for m=1:tP
theta=2*pi*rand();
plot(i*sin(theta),i*cos(theta),'b.'); hold on;
end
for m=1:tQ
theta=2*pi*rand();
plot(i*sin(theta),i*cos(theta),'r.'); hold on;
end
for m=1:tN
theta=2*pi*rand();
plot(i*sin(theta),i*cos(theta),'k.'); hold on;
end
axis square
axis([-300 300 -300 300])
end
print('-djpeg','-r100',sprintf('%s_%s',prefix,num2str(Nm)));
end
clear MM
for i=1:Nm
[XX,map]=imread(sprintf('%s_%s',prefix,num2str(i)),'jpeg');
imagesc(XX);
MM(i)=getframe;
pause(0.1);
end
Please help me solve this problem..
You're main.m is a script, not a function (it doesn't start with the word function), see function and Scripts vs. Functions).
In short: function declaration is not allowed in command-line or scripts only in function files.
Do as oro777 advises,
create a seperate file for each function.
Another way is to turn your main.m into a function too, by starting it with function main. Then decleration of additional functions is allowed in the same file, but these are all local functions which means they will only be available from within that file. You can not call them from outside the file they were declared in.
You should write the output error so we can see which function is badly written. At first view, I would say you should use longer function names because with only one letter, it may easily shadow with a variable name and I would advise not to use global variables.
I see that your functions are defined in the same file as the main program, I would advise to create one function per file. For example a f_function m-file with the following code.
function y = f_function(c)
y = 0.5*(1-tanh(4*c-2));
Do the same for your two other functions and place your newly created m-files in the same folder as your main program.
I'm trying to solve the following question :
maximize x^2-5x+y^2-3y
x+y <= 8
x<=2
x,y>= 0
By using Frank Wolf algorithm ( according to http://web.mit.edu/15.053/www/AMP-Chapter-13.pdf ).
But after running of the following program:
syms x y t;
f = x^2-5*x+y^2-3*y;
fdx = diff(f,1,x); % f'x
fdy = diff(f,1,y); % y'x
x0 = [0 0]; %initial point
A = [1 1;1 0]; %constrains matrix
b = [8;2];
lb = zeros(1,2);
eps = 0.00001;
i = 1;
X = [inf inf];
Z = zeros(2,200); %result for end points (x1,x2)
rr = zeros(1,200);
options = optimset('Display','none');
while( all(abs(X-x0)>[eps,eps]) && i < 200)
%f'x(x0)
c1 = subs(fdx,x,x0(1));
c1 = subs(c1,y,x0(2));
%f'y(x0)
c2 = subs(fdy,x,x0(1));
c2 = subs(c2,y,x0(2));
%optimization point of linear taylor function
ys = linprog((-[c1;c2]),A,b,[],[],lb,[],[],options);
%parametric representation of line
xt = (1-t)*x0(1)+t*ys(1,1);
yt = (1-t)*x0(2)+t*ys(2,1);
%f(x=xt,y=yt)
ft = subs(f,x,xt);
ft = subs(ft,y,yt);
%f't(x=xt,y=yt)
ftd = diff(ft,t,1);
%f't(x=xt,y=yt)=0 -> for max point
[t1] = solve(ftd); % (t==theta)
X = double(x0);%%%%%%%%%%%%%%%%%
% [ xt(t=t1) yt(t=t1)]
xnext(1) = subs(xt,t,t1) ;
xnext(2) = subs(yt,t,t1) ;
x0 = double(xnext);
Z(1,i) = x0(1);
Z(2,i) = x0(2);
i = i + 1;
end
x_point = Z(1,:);
y_point = Z(2,:);
% Draw result
scatter(x_point,y_point);
hold on;
% Print results
fprintf('The answer is:\n');
fprintf('x = %.3f \n',x0(1));
fprintf('y = %.3f \n',x0(2));
res = x0(1)^2 - 5*x0(1) + x0(2)^2 - 3*x0(2);
fprintf('f(x0) = %.3f\n',res);
I get the following result:
x = 3.020
y = 0.571
f(x0) = -7.367
And this no matter how many iterations I running this program (1,50 or 200).
Even if I choose a different starting point (For example, x0=(1,6) ), I get a negative answer to most.
I know that is an approximation, but the result should be positive (for x0 final, in this case).
My question is : what's wrong with my implementation?
Thanks in advance.
i changed a few things, it still doesn't look right but hopefully this is getting you in the right direction. It looks like the intial x0 points make a difference to how the algorithm converges.
Also make sure to check what i is after running the program, to determine if it ran to completion or exceeded the maximum iterations
lb = zeros(1,2);
ub = [2,8]; %if x+y<=8 and x,y>0 than both x,y < 8
eps = 0.00001;
i_max = 100;
i = 1;
X = [inf inf];
Z = zeros(2,i_max); %result for end points (x1,x2)
rr = zeros(1,200);
options = optimset('Display','none');
while( all(abs(X-x0)>[eps,eps]) && i < i_max)
%f'x(x0)
c1 = subs(fdx,x,x0(1));
c1 = subs(c1,y,x0(2));
%f'y(x0)
c2 = subs(fdy,x,x0(1));
c2 = subs(c2,y,x0(2));
%optimization point of linear taylor function
[ys, ~ , exit_flag] = linprog((-[c1;c2]),A,b,[],[],lb,ub,x0,options);
so here is the explanation of the changes
ub, uses our upper bound. After i added a ub, the result immediately changed
x0, start this iteration from the previous point
exit_flag this allows you to check exit_flag after execution (it always seems to be 1 indicating it solved the problem correctly)
So I have three differential equations relating to diabetes, I have to plot the 2 out of the three, them being G and I in a subplot. For some reason when I attempt to run it the command window prints out: "Not enough input arguments" This is the criteria for it:
function dx = problem1(t,x)
P1 = 0.028735 ;
P2 = 0.028344 ;
P3 = 5.035 * 10^(-5) ;
Vi = 12 ;
n = 5/54 ;
D_t = 3*exp(-0.05*t) ;
U_t = 3 ;
Gb = 4.5;
Xb = 15;
Ib = 15;
G = x(1);
X = x(2);
I = x(3);
dx = zeros(3,1);
dx(1) = -P1*(G-Gb) - (X-Xb)*G + D_t ;
dx(2) = -P2*(X-Xb) + P3*(I-Ib) ;
dx(3) = -n*I + U_t/Vi ;
[T,X] = ode15s(#problem1,[0 60*24],[4.5 15 15]) ;
subplot(3,1,1);
plot(T,X(:,1)); % Plot G
subplot(3,1,2); % Second subplot
plot(T,X(:,2)); % Plot I
The error is thrown when you run the function and MATLAB attempts to evaluate D_t = 3*exp(-0.05*t);. Since no value of t was given, MATLAB throws an error saying that the up-to-that-point unused t variable must be specified.
The main problem with the code is in the function's design. Namely, ode15s needs a function that accepts a t and an x and returns dx; however, as it is currently laid out, the call to ode15s is embedded within problem1 which itself requires a t and x. It is a chicken-or-egg problem.
All of the input is correct aside from this design problem and can easily be corrected using a separate function for the ODE's definition:
function problem1
[T,X] = ode15s(#ODE,[0 60*24],[4.5 15 15]) ;
subplot(3,1,1);
plot(T,X(:,1)); % Plot G
subplot(3,1,2); % Second subplot
plot(T,X(:,2)); % Plot I
end
function dx = ODE(t,x)
P1 = 0.028735 ;
P2 = 0.028344 ;
P3 = 5.035 * 10^(-5) ;
Vi = 12 ;
n = 5/54 ;
D_t = 3*exp(-0.05*t) ;
U_t = 3 ;
Gb = 4.5;
Xb = 15;
Ib = 15;
G = x(1);
X = x(2);
I = x(3);
dx = zeros(3,1);
dx(1) = -P1*(G-Gb) - (X-Xb)*G + D_t ;
dx(2) = -P2*(X-Xb) + P3*(I-Ib) ;
dx(3) = -n*I + U_t/Vi ;
end
Notes:
The first line function problem1 is short-hand for function [] = problem1(). I prefer the latter form myself, but I'm in the minority.
The function handle passed to ode15s #ODE is short-hand for #(t,x) ODE(t,x). I prefer the latter form myself, but it is no less or more valid as long as you are not parametrizing functions.
You can also use a nested function and give the problem1 function access to the model constants, but I opted for a separate function here.
I asked a question a few days before but I guess it was a little too complicated and I don't expect to get any answer.
My problem is that I need to use ANN for classification. I've read that much better cost function (or loss function as some books specify) is the cross-entropy, that is J(w) = -1/m * sum_i( yi*ln(hw(xi)) + (1-yi)*ln(1 - hw(xi)) ); i indicates the no. data from training matrix X. I tried to apply it in MATLAB but I find it really difficult. There are couple things I don't know:
should I sum each outputs given all training data (i = 1, ... N, where N is number of inputs for training)
is the gradient calculated correctly
is the numerical gradient (gradAapprox) calculated correctly.
I have following MATLAB codes. I realise I may ask for trivial thing but anyway I hope someone can give me some clues how to find the problem. I suspect the problem is to calculate gradients.
Many thanks.
Main script:
close all
clear all
L = #(x) (1 + exp(-x)).^(-1);
NN = #(x,theta) theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')];
% theta = [10 -30 -30];
x = [0 0; 0 1; 1 0; 1 1];
y = [0.9 0.1 0.1 0.1]';
theta0 = 2*rand(9,1)-1;
options = optimset('gradObj','on','Display','iter');
thetaVec = fminunc(#costFunction,theta0,options,x,y);
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
NN(x,theta)'
Cost function:
function [jVal,gradVal,gradApprox] = costFunction(thetaVec,x,y)
persistent index;
% 1 x x
% 1 x x
% 1 x x
% x = 1 x x
% 1 x x
% 1 x x
% 1 x x
m = size(x,1);
if isempty(index) || index > size(x,1)
index = 1;
end
L = #(x) (1 + exp(-x)).^(-1);
NN = #(x,theta) theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')];
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
Dew = cell(2,1);
DewApprox = cell(2,1);
% Forward propagation
a0 = x(index,:)';
z1 = theta{1}*[1;a0];
a1 = L(z1);
z2 = theta{2}*[1;a1];
a2 = L(z2);
% Back propagation
d2 = 1/m*(a2 - y(index))*L(z2)*(1-L(z2));
Dew{2} = [1;a1]*d2;
d1 = [1;a1].*(1 - [1;a1]).*theta{2}'*d2;
Dew{1} = [1;a0]*d1(2:end)';
% NNRes = NN(x,theta)';
% jVal = -1/m*sum(NNRes-y)*NNRes*(1-NNRes);
jVal = -1/m*(a2 - y(index))*a2*(1-a2);
gradVal = [Dew{1}(:);Dew{2}(:)];
gradApprox = CalcGradApprox(0.0001);
index = index + 1;
function output = CalcGradApprox(epsilon)
output = zeros(size(gradVal));
for n=1:length(thetaVec)
thetaVecMin = thetaVec;
thetaVecMax = thetaVec;
thetaVecMin(n) = thetaVec(n) - epsilon;
thetaVecMax(n) = thetaVec(n) + epsilon;
thetaMin = cell(2,1);
thetaMax = cell(2,1);
thetaMin{1} = reshape(thetaVecMin(1:6),[2 3]);
thetaMin{2} = reshape(thetaVecMin(7:9),[1 3]);
thetaMax{1} = reshape(thetaVecMax(1:6),[2 3]);
thetaMax{2} = reshape(thetaVecMax(7:9),[1 3]);
a2min = NN(x(index,:),thetaMin)';
a2max = NN(x(index,:),thetaMax)';
jValMin = -1/m*(a2min-y(index))*a2min*(1-a2min);
jValMax = -1/m*(a2max-y(index))*a2max*(1-a2max);
output(n) = (jValMax - jValMin)/2/epsilon;
end
end
end
EDIT:
Below I present the correct version of my costFunction for those who may be interested.
function [jVal,gradVal,gradApprox] = costFunction(thetaVec,x,y)
m = size(x,1);
L = #(x) (1 + exp(-x)).^(-1);
NN = #(x,theta) L(theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')]);
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
Delta = cell(2,1);
Delta{1} = zeros(size(theta{1}));
Delta{2} = zeros(size(theta{2}));
D = cell(2,1);
D{1} = zeros(size(theta{1}));
D{2} = zeros(size(theta{2}));
jVal = 0;
for in = 1:size(x,1)
% Forward propagation
a1 = [1;x(in,:)']; % added bias to a0
z2 = theta{1}*a1;
a2 = [1;L(z2)]; % added bias to a1
z3 = theta{2}*a2;
a3 = L(z3);
% Back propagation
d3 = a3 - y(in);
d2 = theta{2}'*d3.*a2.*(1 - a2);
Delta{2} = Delta{2} + d3*a2';
Delta{1} = Delta{1} + d2(2:end)*a1';
jVal = jVal + sum( y(in)*log(a3) + (1-y(in))*log(1-a3) );
end
D{1} = 1/m*Delta{1};
D{2} = 1/m*Delta{2};
jVal = -1/m*jVal;
gradVal = [D{1}(:);D{2}(:)];
gradApprox = CalcGradApprox(x(in,:),0.0001);
% Nested function to calculate gradApprox
function output = CalcGradApprox(x,epsilon)
output = zeros(size(thetaVec));
for n=1:length(thetaVec)
thetaVecMin = thetaVec;
thetaVecMax = thetaVec;
thetaVecMin(n) = thetaVec(n) - epsilon;
thetaVecMax(n) = thetaVec(n) + epsilon;
thetaMin = cell(2,1);
thetaMax = cell(2,1);
thetaMin{1} = reshape(thetaVecMin(1:6),[2 3]);
thetaMin{2} = reshape(thetaVecMin(7:9),[1 3]);
thetaMax{1} = reshape(thetaVecMax(1:6),[2 3]);
thetaMax{2} = reshape(thetaVecMax(7:9),[1 3]);
a3min = NN(x,thetaMin)';
a3max = NN(x,thetaMax)';
jValMin = 0;
jValMax = 0;
for inn=1:size(x,1)
jValMin = jValMin + sum( y(inn)*log(a3min) + (1-y(inn))*log(1-a3min) );
jValMax = jValMax + sum( y(inn)*log(a3max) + (1-y(inn))*log(1-a3max) );
end
jValMin = 1/m*jValMin;
jValMax = 1/m*jValMax;
output(n) = (jValMax - jValMin)/2/epsilon;
end
end
end
I've only had a quick eyeball over your code. Here are some pointers.
Q1
should I sum each outputs given all training data (i = 1, ... N, where
N is number of inputs for training)
If you are talking in relation to the cost function, it is normal to sum and normalise by the number of training examples in order to provide comparison between.
I can't tell from the code whether you have a vectorised implementation which will change the answer. Note that the sum function will only sum up a single dimension at a time - meaning if you have a (M by N) array, sum will result in a 1 by N array.
The cost function should have a scalar output.
Q2
is the gradient calculated correctly
The gradient is not calculated correctly - specifically the deltas look wrong. Try following Andrew Ng's notes [PDF] they are very good.
Q3
is the numerical gradient (gradAapprox) calculated correctly.
This line looks a bit suspect. Does this make more sense?
output(n) = (jValMax - jValMin)/(2*epsilon);
EDIT: I actually can't make heads or tails of your gradient approximation. You should only use forward propagation and small tweaks in the parameters to compute the gradient. Good luck!