Creating a table from a variable inside a for loop - MATLAB

I am writing a for loop to calculate the values of four different variables. The first variable is M, which increases from 10^2 to 10^5:
M = [10^2,10^3,10^4,10^5];
The other three variables needed for the table are shown in the code below.
confmc
confcv
confmcSize/confcvSize
I first create a for loop to iterate through the four different values of M. I then create the table outside of the for loop.
How could I adjust the implementation so that the table displays all four values of M?
randn('state',100)
%%%%%% Problem and method parameters %%%%%%%%%
S = 5; E = 6; sigma = 0.3; r = 0.05; T = 1;
Dt = 1e-2; N = T/Dt; M = [10^2,10^3,10^4,10^5];
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
for k=1:numel(M)
%%%%%%%%% Geom Asian exact mean %%%%%%%%%%%%
sigsqT= sigma^2*T*(N+1)*(2*N+1)/(6*N*N);
muT = 0.5*sigsqT + (r - 0.5*sigma^2)*T*(N+1)/(2*N);
d1 = (log(S/E) + (muT + 0.5*sigsqT))/(sqrt(sigsqT));
d2 = d1 - sqrt(sigsqT);
N1 = 0.5*(1+erf(d1/sqrt(2)));
N2 = 0.5*(1+erf(d2/sqrt(2)));
geo = exp(-r*T)*( S*exp(muT)*N1 - E*N2 );
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Spath = S*cumprod(exp((r-0.5*sigma^2)*Dt+sigma*sqrt(Dt)*randn(M(k),N)),2);
% Standard Monte Carlo
arithave = mean(Spath,2);
Parith = exp(-r*T)*max(arithave-E,0); % payoffs
Pmean = mean(Parith);
Pstd = std(Parith);
confmc = [Pmean-1.96*Pstd/sqrt(M(k)), Pmean+1.96*Pstd/sqrt(M(k))];
confmcSize = [(Pmean+1.96*Pstd/sqrt(M(k)))-(Pmean-1.96*Pstd/sqrt(M(k)))];
% Control Variate
geoave = exp((1/N)*sum(log(Spath),2));
Pgeo = exp(-r*T)*max(geoave-E,0); % geo payoffs
Z = Parith + geo - Pgeo; % control variate version
Zmean = mean(Z);
Zstd = std(Z);
confcv = [Zmean-1.96*Zstd/sqrt(M(k)), Zmean+1.96*Zstd/sqrt(M(k))];
confcvSize = [(Zmean+1.96*Zstd/sqrt(M(k)))-(Zmean-1.96*Zstd/sqrt(M(k)))];
end
T = table(M,confmc,confcv,confmcSize/confcvSize)
The current code returns
T =
1×4 table
M confmc confcv Var4
_____ ____________________ ____________________ ______
1e+05 0.096756 0.1007 0.097306 0.097789 8.1622
How could I change my implementation so that all four values of M are computed?

I just modified a few things. Take a look at the following code.
randn('state',100)
%%%%%% Problem and method parameters %%%%%%%%%
S = 5; E = 6; sigma = 0.3; r = 0.05; T = 1;
Dt = 1e-2; N = T/Dt; M = [10^2,10^3,10^4,10^5];
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
confmc = zeros(numel(M), 2);
confcv = zeros(numel(M), 2);
confmcSize = zeros(numel(M), 1);
confcvSize = zeros(numel(M), 1);
for k=1:numel(M)
%%%%%%%%% Geom Asian exact mean %%%%%%%%%%%%
sigsqT= sigma^2*T*(N+1)*(2*N+1)/(6*N*N);
muT = 0.5*sigsqT + (r - 0.5*sigma^2)*T*(N+1)/(2*N);
d1 = (log(S/E) + (muT + 0.5*sigsqT))/(sqrt(sigsqT));
d2 = d1 - sqrt(sigsqT);
N1 = 0.5*(1+erf(d1/sqrt(2)));
N2 = 0.5*(1+erf(d2/sqrt(2)));
geo = exp(-r*T)*( S*exp(muT)*N1 - E*N2 );
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Spath = S*cumprod(exp((r-0.5*sigma^2)*Dt+sigma*sqrt(Dt)*randn(M(k),N)),2);
% Standard Monte Carlo
arithave = mean(Spath,2);
Parith = exp(-r*T)*max(arithave-E,0); % payoffs
Pmean = mean(Parith);
Pstd = std(Parith);
confmc(k,:) = [Pmean-1.96*Pstd/sqrt(M(k)), Pmean+1.96*Pstd/sqrt(M(k))];
confmcSize(k,1) = [(Pmean+1.96*Pstd/sqrt(M(k)))-(Pmean-1.96*Pstd/sqrt(M(k)))];
% Control Variate
geoave = exp((1/N)*sum(log(Spath),2));
Pgeo = exp(-r*T)*max(geoave-E,0); % geo payoffs
Z = Parith + geo - Pgeo; % control variate version
Zmean = mean(Z);
Zstd = std(Z);
confcv(k,:) = [Zmean-1.96*Zstd/sqrt(M(k)), Zmean+1.96*Zstd/sqrt(M(k))];
confcvSize(k,1) = [(Zmean+1.96*Zstd/sqrt(M(k)))-(Zmean-1.96*Zstd/sqrt(M(k)))];
end
T = table(M',confmc,confcv,confmcSize./confcvSize)
In short, I used a matrix (or vector) with one row per value of M as each table variable, and stored the results of iteration k in row k. In your code, the variables (confmc, confcv, confmcSize, confcvSize) were overwritten on every pass of the loop, so only the results for the last M survived.
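If you also want the fourth column to carry a readable name instead of the automatic Var4 header, you can store the ratio in a named variable first, or pass 'VariableNames' explicitly. A small sketch of the latter, using the same arrays built above:
ratio = confmcSize ./ confcvSize;   % confidence-interval width ratio, one row per M
T = table(M', confmc, confcv, ratio, 'VariableNames', {'M','confmc','confcv','ratio'})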

Related

How to correct grid search?

I'm trying to find the optimal hyperparameters for my SVM model using a grid search, but it simply returns 1 for both hyperparameters.
function evaluations = inner_kfold_trainer(C,q,k,features_xy,labels)
features_xy_flds = kdivide(features_xy, k);
labels_flds = kdivide(labels, k);
evaluations = zeros(k,3);
for i = 1:k
fprintf('Fold %i of %i\n',i,k);
train_data = cell2mat(features_xy_flds(1:end ~= i));
train_labels = cell2mat(labels_flds(1:end ~= i));
test_data = cell2mat(features_xy_flds(i));
test_labels = cell2mat(labels_flds(i));
%AU1
train_labels = train_labels(:,1);
test_labels = test_labels(:,1);
[k,~] = size(test_labels);
%train
sv = fitcsvm(train_data,train_labels, 'KernelFunction','polynomial', 'PolynomialOrder',q,'BoxConstraint',C);
sv.predict(test_data);
%Calculate evaluative measures
%svm_outputs = zeros(k,1);
sv_predictions = sv.predict(test_data);
[precision,recall,F1] = evaluation(sv_predictions,test_labels);
evaluations(i,1) = precision;
evaluations(i,2) = recall;
evaluations(i,3) = F1;
end
save('eval.mat', 'evaluations');
end
This is the inner k-fold cross-validation function.
Below is the grid-search function, where something seems to be going wrong:
function [q,C] = grid_search(features_xy,labels,k)
% n x n grid
n = 3;
q_grid = linspace(1,19,n);
C_grid = linspace(1,59,n);
tic
evals = zeros(n,n,3);
for i = 1:n
for j = 1:n
fprintf('## i=%i, j=%i ##\n', i, j);
svm_results = inner_kfold_trainer(C_grid(i), q_grid(j),k,features_xy,labels);
evals(i,j,:) = mean(svm_results(:,:));
% precision only
%evals(i,j,:) = max(svm_results(:,1));
toc
end
end
f = evals;
% retrieving the best value of the hyper parameters, to use in the outer
% fold
[M1,I1] = max(f);
[~,I2] = max(M1(1,1,:));
index = I1(:,:,I2);
C = C_grid(index(1))
q = q_grid(index(2))
end
When I run grid_search(features_xy,labels,8), for example, I get C=1 and q=1 for any value of k (the number of folds). Also, features_xy is a 500×98 matrix.
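For reference, one thing worth double-checking independently of the SVM settings is how the best cell of evals is recovered: max on a 3-D array works along the first dimension only, so indexing its outputs the way the posted code does may not land on the intended (i, j) pair. A minimal sketch of a more direct lookup, assuming the third slice of evals (the mean F1) is the selection criterion:
f1 = evals(:,:,3);                    % n-by-n matrix of mean F1 per (C, q) pair
[~, linIdx] = max(f1(:));             % best cell in the flattened matrix
[iBest, jBest] = ind2sub(size(f1), linIdx);
C_best = C_grid(iBest)                % C was indexed by i in the loops above
q_best = q_grid(jBest)                % q was indexed by j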

My approximate entropy script for MATLAB isn't working

This is my approximate entropy calculator in MATLAB (https://en.wikipedia.org/wiki/Approximate_entropy).
I'm not sure why it isn't working; it returns a negative value. Can anyone help me with this? R1 is the data.
FindSize = size(R1);
N = FindSize(1);
% N = input('insert number of data values');
% If you want to put in your own N, remove the % from the line above
% and put a % before the N = FindSize(1) line.
% m = input('insert m: integer representing length of data, embedding dimension');
m = 2;
% r = input('insert r: positive real number for filtering, threshold');
r = 0.2*std(R1);
for x1= R1(1:N-m+1,1)
D1 = pdist2(x1,x1);
C11 = (D1 <= r)/(N-m+1);
c1 = C11(1);
end
for i1 = 1:N-m+1
s1 = sum(log(c1));
end
phi1 = (s1/(N-m+1));
for x2= R1(1:N-m+2,1)
D2 = pdist2(x2,x2);
C21 = (D2 <= r)/(N-m+2);
c2 = C21(1);
end
for i2 = 1:N-m+2
s2 = sum(log(c2));
end
phi2 = (s2/(N-m+2));
Ap = phi1 - phi2;
Apen = Ap(1)
Following the documentation provided by the Wikipedia article, I developed this small function that calculates the approximate entropy:
function res = approximate_entropy(U,m,r)
N = numel(U);
res = zeros(1,2);
for i = [1 2]
off = m + i - 1;
off_N = N - off;
off_N1 = off_N + 1;
x = zeros(off_N1,off);
for j = 1:off
x(:,j) = U(j:off_N+j);
end
C = zeros(off_N1,1);
for j = 1:off_N1
dist = abs(x - repmat(x(j,:),off_N1,1));
C(j) = sum(~any((dist > r),2)) / off_N1;
end
res(i) = sum(log(C)) / off_N1;
end
res = res(1) - res(2);
end
I first tried to replicate the computation shown in the article, and the result I obtain matches the result shown in the example:
U = repmat([85 80 89],1,17);
approximate_entropy(U,2,3)
ans =
-1.09965411068114e-05
Then I created another example that shows a case in which approximate entropy produces a meaningful result (the entropy of the first sample is always less than the entropy of the second one):
% starting variables...
s1 = repmat([10 20],1,10);
s1_m = mean(s1);
s1_s = std(s1);
s2_m = 0;
s2_s = 0;
% datasample will not always return a perfect M and S match
% so let's repeat this until equality is achieved...
while ((s1_m ~= s2_m) && (s1_s ~= s2_s))
s2 = datasample([10 20],20,'Replace',true,'Weights',[0.5 0.5]);
s2_m = mean(s2);
s2_s = std(s2);
end
m = 2;
r = 3;
ae1 = approximate_entropy(s1,m,r)
ae2 = approximate_entropy(s2,m,r)
ae1 =
0.00138568170752751
ae2 =
0.680090884817465
Finally, I tried with your sample data:
fid = fopen('O1.txt','r');
U = cell2mat(textscan(fid,'%f'));
fclose(fid);
m = 2;
r = 0.2 * std(U);
approximate_entropy(U,m,r)
ans =
1.08567461184858
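For what it's worth, the inner counting loop of approximate_entropy can also be written with pdist2 and the Chebyshev (maximum-coordinate) distance, which is the metric the C_i^m(r) definition uses. A minimal helper sketch, assuming the Statistics and Machine Learning Toolbox is available and x is the off_N1-by-off matrix of template vectors built inside the function above:
function phi = phi_m(x, r)
n = size(x,1);
d = pdist2(x, x, 'chebychev');   % pairwise max-coordinate distances between template vectors
C = sum(d <= r, 2) / n;          % fraction of templates within r of each template
phi = sum(log(C)) / n;           % phi^m(r) as in the Wikipedia definition
end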

Numerical Integration by Simpsons method

I am trying to solve this integration by Simpson's method and plot the result in a figure. The figure only takes the value P0 = -6 from the for loop. When I write I(k,P) instead, it gives the error:
Attempted to access P0(0); index must be a positive integer or logical
How can I solve it?
alpha = 45;
beta = 185;
gamma_e = 116;
% Gain values
G_ei = -18.96;
G_ee = 18.52;
G_sr = -0.26;
G_rs = 16.92;
G_es = 2.55;
G_re = 4.67;
G_se = 0.73;
G_sn = 2.78;
G_esre = G_es*G_sr*G_re;
G_srs = G_sr*G_rs;
G_ese = G_es*G_se;
G_esn = G_es*G_sn;
t_0 = 0.085; % corticothalamic loop delay in second
r_e = 0.086; % Excitatory axon range in metre
f = linspace(-40,40,500); % f = frequency in Hz
w = 2*pi*f; % angular frequency in radian per second
delt_P = 0.5;
L=zeros(1,500);
Q=repmat(L,1);
P=repmat(L,1);
%%%%%%%%%%%%% integration %%%%%%%%%%%%
a = -80*pi;
b = 80*pi;
n=500;
I=repmat(L,1);
P_initial = repmat(L,1);
P_shift = repmat(L,1);
p = repmat(L,1);
for k = 1:length(w)
for P0 = [6 -6]
L_initial = @(w1) (1-((1i*w1)/alpha))^(-1)*(1-((1i*w1)/beta))^(-1);
Q_initial = @(w1) (1/(r_e^2))*((1-((1i*w1)/gamma_e))^(2) - (1/(1-G_ei*L_initial(w1)))*...
(L_initial(w1)*G_ee + (exp(1i*w1*t_0)*(L_initial(w1)^2*G_ese +L_initial(w1)^3*G_esre))/(1-L_initial(w1)^2*G_srs)));
P_initial = @(w1) (pi/r_e^4)* (abs((L_initial(w1)^2*G_esn)/((1-L_initial(w1)^2*G_srs)*...
(1-G_ei*L_initial(w1)))))^2 * abs((atan2((imag(Q_initial(w1))),(real(Q_initial(w1)))))/imag(Q_initial(w1)));
G = 150*exp(- (f - P0).^2./(2*(delt_P).^2));
P2 = @(w1) G(k) + P_initial(w1);
L_shift = @(w1) (1-((1i*(w(k)-w1))/alpha))^(-1)* (1-((1i*(w(k)-w1))/beta))^(-1);
Q_shift = @(w1) (1/(r_e^2))*((1-((1i*(w(k)-w1))/gamma_e))^(2) - (1/(1-G_ei*L_shift(w1)))*...
(L_shift(w1)*G_ee + (exp(1i*(w(k)-w1)*t_0)*(L_shift(w1)^2*G_ese +L_shift(w1)^3*G_esre))/(1-L_shift(w1)^2*G_srs)));
P_shift = @(w1) (pi/r_e^4)* (abs((L_shift(w1)^2*G_esn)/((1-L_shift(w1)^2*G_srs)*(1-G_ei*L_shift(w1)))))^2 *...
abs((atan2((imag(Q_shift(w1))),(real(Q_shift(w1)))))/imag(Q_shift(w1)));
p = @(w1) P2(w1)*P_shift(w1); % Power spectrum formula for P(w1)*p(w-w1)
I(k) = simprl(p,a,b,n);
end
end
figure(1)
plot(f,I,'r--')
figure(2)
plot(f,G,'k')
At the moment you only use the results for P0 = -6, because they are what ends up saved in I(k). First you save the result for P0 = 6, then you overwrite it with the result for P0 = -6. The results for P0 = 6 are therefore neither used nor saved. If you write your code as follows, this becomes clearer:
for k = 1:length(w)
L_shift = @(w1) (1-((1i*(w(k)-w1))/alpha))^(-1)* (1-((1i*(w(k)-w1))/beta))^(-1);
Q_shift = @(w1) (1/(r_e^2))*((1-((1i*(w(k)-w1))/gamma_e))^(2) - (1/(1-G_ei*L_shift(w1)))*...
(L_shift(w1)*G_ee + (exp(1i*(w(k)-w1)*t_0)*(L_shift(w1)^2*G_ese +L_shift(w1)^3*G_esre))/(1-L_shift(w1)^2*G_srs)));
P_shift = @(w1) (pi/r_e^4)* (abs((L_shift(w1)^2*G_esn)/((1-L_shift(w1)^2*G_srs)*(1-G_ei*L_shift(w1)))))^2 *...
abs((atan2((imag(Q_shift(w1))),(real(Q_shift(w1)))))/imag(Q_shift(w1)));
for P0 = [6 -6]
G = 150*exp(- (f - P0).^2./(2*(delt_P).^2));
P2 = @(w1) G(k) + P_initial(w1);
p = @(w1) P2(w1)*P_shift(w1);
I(k) = simprl(p,a,b,n);
end
end
You can't access I(k,P) because I is a vector, not a matrix; doing so would give you an Index exceeds matrix dimensions error. You could save the results for P0 = -6 in one variable and the results for P0 = 6 in another, since the two cases in your code do not depend on each other.
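A minimal structural sketch of that suggestion, reusing P_initial, P_shift and simprl from the code above (the omitted lines stay as they are) and keeping one row of results per P0 value:
P0_values = [6 -6];
I = zeros(numel(P0_values), length(w));   % row 1: P0 = 6, row 2: P0 = -6
for k = 1:length(w)
% ... define L_shift, Q_shift, P_shift (and P_initial) for w(k) as above ...
for idx = 1:numel(P0_values)
G = 150*exp(- (f - P0_values(idx)).^2./(2*(delt_P).^2));
P2 = @(w1) G(k) + P_initial(w1);
p = @(w1) P2(w1)*P_shift(w1);
I(idx,k) = simprl(p,a,b,n);              % simprl is your Simpson's-rule routine
end
end
figure(1)
plot(f, I(1,:), 'r--', f, I(2,:), 'b--')     % one curve per P0 value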

Why do I get such a bad loss in my implementation of k-Nearest Neighbor?

I'm trying to implement k-NN in MATLAB. I have a matrix of 214 examples with 9 columns of attributes, the 10th column being the label. I want to measure loss with a 0-1 function on 10 cross-validation tests. I have the following code:
function q3(file)
data = knnfile(file);
loss(data(:,1:9),'KFold',data(:,10))
losses = zeros(25,3);
new_data = data;
new_data(:,10) = [];
sdd = std(new_data);
meand = mean(new_data);
for s = 1:214
for q = 1:9
new_data(s,q) = (new_data(s,q) - meand(q)) / sdd(q);
end
end
new_data = [new_data data(:,10)];
for k = 1:25
loss1 = 0;
loss2 = 0;
for j = 0:9
index = floor(214/10)*j+1;
curd1 = data([1:index-1,index+21:end],:);
curd2 = new_data([1:index-1,index+21:end],:);
for l = 0:20
c1 = knn(curd1,k,data(index+l,:));
c2 = knn(curd2,k,new_data(index+l,:));
loss1 = loss1 + (c1 ~= data(index+l,10));
loss2 = loss2 + (c2 ~= new_data(index+l,10));
end
end
losses(k,1) = k;
losses(k,2) = 100*loss1/210;
losses(k,3) = 100*loss2/210;
end
function cluster = knn(Data,k,x)
distances = zeros(193,2);
for i = 1:size(Data,1)
row = Data(i,:);
d = norm(row(1:size(row,2)-1) - x(1:size(x,2)-1));
distances(i,:) = [d row(10)];
end
distances = sortrows(distances,1);
cluster = mode(distances(1:k,2));
I'm getting over 40% loss with almost no dependence on k. I'm sure something here is wrong, but I'm not quite sure what.
Any help would be appreciated!
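For reference, one way to sanity-check the hand-rolled loop, assuming the Statistics and Machine Learning Toolbox is available, is to compare its loss against MATLAB's built-in k-NN under 10-fold cross-validation on the same standardized data (X and y below are just placeholder names built from your data matrix):
X = zscore(data(:,1:9));             % standardize the 9 attribute columns
y = data(:,10);                      % class labels
mdl = fitcknn(X, y, 'NumNeighbors', 5);
cv = crossval(mdl, 'KFold', 10);     % 10-fold cross-validation
err = kfoldLoss(cv)                  % 0-1 misclassification rate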

Application of Neural Network in MATLAB

I asked a question a few days ago, but I guess it was a little too complicated and I don't expect to get an answer.
My problem is that I need to use an ANN for classification. I've read that a much better cost function (or loss function, as some books call it) is the cross-entropy, that is J(w) = -1/m * sum_i( y_i*ln(h_w(x_i)) + (1-y_i)*ln(1 - h_w(x_i)) ), where i runs over the examples in the training matrix X. I tried to apply it in MATLAB but I find it really difficult. There are a couple of things I don't know:
should I sum each outputs given all training data (i = 1, ... N, where N is number of inputs for training)
is the gradient calculated correctly
is the numerical gradient (gradAapprox) calculated correctly.
I have the following MATLAB code. I realise I may be asking about something trivial, but I hope someone can give me some clues on how to find the problem. I suspect the problem lies in calculating the gradients.
Many thanks.
Main script:
close all
clear all
L = @(x) (1 + exp(-x)).^(-1);
NN = @(x,theta) theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')];
% theta = [10 -30 -30];
x = [0 0; 0 1; 1 0; 1 1];
y = [0.9 0.1 0.1 0.1]';
theta0 = 2*rand(9,1)-1;
options = optimset('gradObj','on','Display','iter');
thetaVec = fminunc(@costFunction,theta0,options,x,y);
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
NN(x,theta)'
Cost function:
function [jVal,gradVal,gradApprox] = costFunction(thetaVec,x,y)
persistent index;
% 1 x x
% 1 x x
% 1 x x
% x = 1 x x
% 1 x x
% 1 x x
% 1 x x
m = size(x,1);
if isempty(index) || index > size(x,1)
index = 1;
end
L = @(x) (1 + exp(-x)).^(-1);
NN = @(x,theta) theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')];
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
Dew = cell(2,1);
DewApprox = cell(2,1);
% Forward propagation
a0 = x(index,:)';
z1 = theta{1}*[1;a0];
a1 = L(z1);
z2 = theta{2}*[1;a1];
a2 = L(z2);
% Back propagation
d2 = 1/m*(a2 - y(index))*L(z2)*(1-L(z2));
Dew{2} = [1;a1]*d2;
d1 = [1;a1].*(1 - [1;a1]).*theta{2}'*d2;
Dew{1} = [1;a0]*d1(2:end)';
% NNRes = NN(x,theta)';
% jVal = -1/m*sum(NNRes-y)*NNRes*(1-NNRes);
jVal = -1/m*(a2 - y(index))*a2*(1-a2);
gradVal = [Dew{1}(:);Dew{2}(:)];
gradApprox = CalcGradApprox(0.0001);
index = index + 1;
function output = CalcGradApprox(epsilon)
output = zeros(size(gradVal));
for n=1:length(thetaVec)
thetaVecMin = thetaVec;
thetaVecMax = thetaVec;
thetaVecMin(n) = thetaVec(n) - epsilon;
thetaVecMax(n) = thetaVec(n) + epsilon;
thetaMin = cell(2,1);
thetaMax = cell(2,1);
thetaMin{1} = reshape(thetaVecMin(1:6),[2 3]);
thetaMin{2} = reshape(thetaVecMin(7:9),[1 3]);
thetaMax{1} = reshape(thetaVecMax(1:6),[2 3]);
thetaMax{2} = reshape(thetaVecMax(7:9),[1 3]);
a2min = NN(x(index,:),thetaMin)';
a2max = NN(x(index,:),thetaMax)';
jValMin = -1/m*(a2min-y(index))*a2min*(1-a2min);
jValMax = -1/m*(a2max-y(index))*a2max*(1-a2max);
output(n) = (jValMax - jValMin)/2/epsilon;
end
end
end
EDIT:
Below I present the correct version of my costFunction for those who may be interested.
function [jVal,gradVal,gradApprox] = costFunction(thetaVec,x,y)
m = size(x,1);
L = @(x) (1 + exp(-x)).^(-1);
NN = @(x,theta) L(theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')]);
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
Delta = cell(2,1);
Delta{1} = zeros(size(theta{1}));
Delta{2} = zeros(size(theta{2}));
D = cell(2,1);
D{1} = zeros(size(theta{1}));
D{2} = zeros(size(theta{2}));
jVal = 0;
for in = 1:size(x,1)
% Forward propagation
a1 = [1;x(in,:)']; % added bias to a0
z2 = theta{1}*a1;
a2 = [1;L(z2)]; % added bias to a1
z3 = theta{2}*a2;
a3 = L(z3);
% Back propagation
d3 = a3 - y(in);
d2 = theta{2}'*d3.*a2.*(1 - a2);
Delta{2} = Delta{2} + d3*a2';
Delta{1} = Delta{1} + d2(2:end)*a1';
jVal = jVal + sum( y(in)*log(a3) + (1-y(in))*log(1-a3) );
end
D{1} = 1/m*Delta{1};
D{2} = 1/m*Delta{2};
jVal = -1/m*jVal;
gradVal = [D{1}(:);D{2}(:)];
gradApprox = CalcGradApprox(x(in,:),0.0001);
% Nested function to calculate gradApprox
function output = CalcGradApprox(x,epsilon)
output = zeros(size(thetaVec));
for n=1:length(thetaVec)
thetaVecMin = thetaVec;
thetaVecMax = thetaVec;
thetaVecMin(n) = thetaVec(n) - epsilon;
thetaVecMax(n) = thetaVec(n) + epsilon;
thetaMin = cell(2,1);
thetaMax = cell(2,1);
thetaMin{1} = reshape(thetaVecMin(1:6),[2 3]);
thetaMin{2} = reshape(thetaVecMin(7:9),[1 3]);
thetaMax{1} = reshape(thetaVecMax(1:6),[2 3]);
thetaMax{2} = reshape(thetaVecMax(7:9),[1 3]);
a3min = NN(x,thetaMin)';
a3max = NN(x,thetaMax)';
jValMin = 0;
jValMax = 0;
for inn=1:size(x,1)
jValMin = jValMin + sum( y(inn)*log(a3min) + (1-y(inn))*log(1-a3min) );
jValMax = jValMax + sum( y(inn)*log(a3max) + (1-y(inn))*log(1-a3max) );
end
jValMin = 1/m*jValMin;
jValMax = 1/m*jValMax;
output(n) = (jValMax - jValMin)/2/epsilon;
end
end
end
I've only had a quick eyeball over your code. Here are some pointers.
Q1
should I sum each outputs given all training data (i = 1, ... N, where
N is number of inputs for training)
If you are asking about the cost function, it is normal to sum over all training examples and normalise by their number, so that the cost is comparable across training sets of different sizes.
I can't tell from the code whether you have a vectorised implementation, which would change the answer. Note that the sum function only sums over a single dimension at a time; if you have an (M by N) array, sum will return a 1 by N array.
The cost function should have a scalar output.
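For what it's worth, a minimal vectorized sketch of that summed-and-normalised cross-entropy for a single sigmoid output, where h and y stand for the m-by-1 vectors of network outputs and targets:
% h: m-by-1 sigmoid outputs for all training examples, y: m-by-1 targets
J = -(1/m) * sum( y.*log(h) + (1 - y).*log(1 - h) );   % scalar cross-entropy cost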
Q2
is the gradient calculated correctly
The gradient is not calculated correctly; specifically, the deltas look wrong. Try following Andrew Ng's notes [PDF]; they are very good.
Q3
is the numerical gradient (gradAapprox) calculated correctly.
This line looks a bit suspect. Does this make more sense?
output(n) = (jValMax - jValMin)/(2*epsilon);
EDIT: I actually can't make heads or tails of your gradient approximation. You should only use forward propagation and small tweaks in the parameters to compute the gradient. Good luck!
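For completeness, a minimal sketch of such a central-difference check, assuming costFun is a handle that returns just the scalar cost for a given parameter vector (e.g. @(t) costFunction(t,x,y) with only the first output used):
function gradApprox = numerical_gradient(costFun, thetaVec, epsilon)
% Perturb one parameter at a time, recompute the cost by forward propagation
% only, and take the symmetric difference.
gradApprox = zeros(size(thetaVec));
for n = 1:numel(thetaVec)
tPlus = thetaVec;  tPlus(n)  = tPlus(n)  + epsilon;
tMinus = thetaVec; tMinus(n) = tMinus(n) - epsilon;
gradApprox(n) = (costFun(tPlus) - costFun(tMinus)) / (2*epsilon);
end
end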