I asked a question a few days before but I guess it was a little too complicated and I don't expect to get any answer.
My problem is that I need to use ANN for classification. I've read that much better cost function (or loss function as some books specify) is the cross-entropy, that is J(w) = -1/m * sum_i( yi*ln(hw(xi)) + (1-yi)*ln(1 - hw(xi)) ); i indicates the no. data from training matrix X. I tried to apply it in MATLAB but I find it really difficult. There are couple things I don't know:
should I sum each outputs given all training data (i = 1, ... N, where N is number of inputs for training)
is the gradient calculated correctly
is the numerical gradient (gradAapprox) calculated correctly.
I have following MATLAB codes. I realise I may ask for trivial thing but anyway I hope someone can give me some clues how to find the problem. I suspect the problem is to calculate gradients.
Many thanks.
Main script:
close all
clear all
L = #(x) (1 + exp(-x)).^(-1);
NN = #(x,theta) theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')];
% theta = [10 -30 -30];
x = [0 0; 0 1; 1 0; 1 1];
y = [0.9 0.1 0.1 0.1]';
theta0 = 2*rand(9,1)-1;
options = optimset('gradObj','on','Display','iter');
thetaVec = fminunc(#costFunction,theta0,options,x,y);
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
NN(x,theta)'
Cost function:
function [jVal,gradVal,gradApprox] = costFunction(thetaVec,x,y)
persistent index;
% 1 x x
% 1 x x
% 1 x x
% x = 1 x x
% 1 x x
% 1 x x
% 1 x x
m = size(x,1);
if isempty(index) || index > size(x,1)
index = 1;
end
L = #(x) (1 + exp(-x)).^(-1);
NN = #(x,theta) theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')];
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
Dew = cell(2,1);
DewApprox = cell(2,1);
% Forward propagation
a0 = x(index,:)';
z1 = theta{1}*[1;a0];
a1 = L(z1);
z2 = theta{2}*[1;a1];
a2 = L(z2);
% Back propagation
d2 = 1/m*(a2 - y(index))*L(z2)*(1-L(z2));
Dew{2} = [1;a1]*d2;
d1 = [1;a1].*(1 - [1;a1]).*theta{2}'*d2;
Dew{1} = [1;a0]*d1(2:end)';
% NNRes = NN(x,theta)';
% jVal = -1/m*sum(NNRes-y)*NNRes*(1-NNRes);
jVal = -1/m*(a2 - y(index))*a2*(1-a2);
gradVal = [Dew{1}(:);Dew{2}(:)];
gradApprox = CalcGradApprox(0.0001);
index = index + 1;
function output = CalcGradApprox(epsilon)
output = zeros(size(gradVal));
for n=1:length(thetaVec)
thetaVecMin = thetaVec;
thetaVecMax = thetaVec;
thetaVecMin(n) = thetaVec(n) - epsilon;
thetaVecMax(n) = thetaVec(n) + epsilon;
thetaMin = cell(2,1);
thetaMax = cell(2,1);
thetaMin{1} = reshape(thetaVecMin(1:6),[2 3]);
thetaMin{2} = reshape(thetaVecMin(7:9),[1 3]);
thetaMax{1} = reshape(thetaVecMax(1:6),[2 3]);
thetaMax{2} = reshape(thetaVecMax(7:9),[1 3]);
a2min = NN(x(index,:),thetaMin)';
a2max = NN(x(index,:),thetaMax)';
jValMin = -1/m*(a2min-y(index))*a2min*(1-a2min);
jValMax = -1/m*(a2max-y(index))*a2max*(1-a2max);
output(n) = (jValMax - jValMin)/2/epsilon;
end
end
end
EDIT:
Below I present the correct version of my costFunction for those who may be interested.
function [jVal,gradVal,gradApprox] = costFunction(thetaVec,x,y)
m = size(x,1);
L = #(x) (1 + exp(-x)).^(-1);
NN = #(x,theta) L(theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')]);
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
Delta = cell(2,1);
Delta{1} = zeros(size(theta{1}));
Delta{2} = zeros(size(theta{2}));
D = cell(2,1);
D{1} = zeros(size(theta{1}));
D{2} = zeros(size(theta{2}));
jVal = 0;
for in = 1:size(x,1)
% Forward propagation
a1 = [1;x(in,:)']; % added bias to a0
z2 = theta{1}*a1;
a2 = [1;L(z2)]; % added bias to a1
z3 = theta{2}*a2;
a3 = L(z3);
% Back propagation
d3 = a3 - y(in);
d2 = theta{2}'*d3.*a2.*(1 - a2);
Delta{2} = Delta{2} + d3*a2';
Delta{1} = Delta{1} + d2(2:end)*a1';
jVal = jVal + sum( y(in)*log(a3) + (1-y(in))*log(1-a3) );
end
D{1} = 1/m*Delta{1};
D{2} = 1/m*Delta{2};
jVal = -1/m*jVal;
gradVal = [D{1}(:);D{2}(:)];
gradApprox = CalcGradApprox(x(in,:),0.0001);
% Nested function to calculate gradApprox
function output = CalcGradApprox(x,epsilon)
output = zeros(size(thetaVec));
for n=1:length(thetaVec)
thetaVecMin = thetaVec;
thetaVecMax = thetaVec;
thetaVecMin(n) = thetaVec(n) - epsilon;
thetaVecMax(n) = thetaVec(n) + epsilon;
thetaMin = cell(2,1);
thetaMax = cell(2,1);
thetaMin{1} = reshape(thetaVecMin(1:6),[2 3]);
thetaMin{2} = reshape(thetaVecMin(7:9),[1 3]);
thetaMax{1} = reshape(thetaVecMax(1:6),[2 3]);
thetaMax{2} = reshape(thetaVecMax(7:9),[1 3]);
a3min = NN(x,thetaMin)';
a3max = NN(x,thetaMax)';
jValMin = 0;
jValMax = 0;
for inn=1:size(x,1)
jValMin = jValMin + sum( y(inn)*log(a3min) + (1-y(inn))*log(1-a3min) );
jValMax = jValMax + sum( y(inn)*log(a3max) + (1-y(inn))*log(1-a3max) );
end
jValMin = 1/m*jValMin;
jValMax = 1/m*jValMax;
output(n) = (jValMax - jValMin)/2/epsilon;
end
end
end
I've only had a quick eyeball over your code. Here are some pointers.
Q1
should I sum each outputs given all training data (i = 1, ... N, where
N is number of inputs for training)
If you are talking in relation to the cost function, it is normal to sum and normalise by the number of training examples in order to provide comparison between.
I can't tell from the code whether you have a vectorised implementation which will change the answer. Note that the sum function will only sum up a single dimension at a time - meaning if you have a (M by N) array, sum will result in a 1 by N array.
The cost function should have a scalar output.
Q2
is the gradient calculated correctly
The gradient is not calculated correctly - specifically the deltas look wrong. Try following Andrew Ng's notes [PDF] they are very good.
Q3
is the numerical gradient (gradAapprox) calculated correctly.
This line looks a bit suspect. Does this make more sense?
output(n) = (jValMax - jValMin)/(2*epsilon);
EDIT: I actually can't make heads or tails of your gradient approximation. You should only use forward propagation and small tweaks in the parameters to compute the gradient. Good luck!
Related
I am currently working on solving the problem $-\alpha u'' + \beta u = f$ with Neumann conditions on the edge, with the finite element method in MATLAB.
I managed to set up a code that works for P1 and P2 Lagragne finite elements (i.e: linear and quadratic) and the results are good!
I am trying to implement the finite element method using the Hermite basis. This basis is defined by the following basis functions and derivatives:
syms x
phi(x) = [2*x^3-3*x^2+1,-2*x^3+3*x^2,x^3-2*x^2+x,x^3-x^2]
% Derivative
dphi = [6*x.^2-6*x,-6*x.^2+6*x,3*x^2-4*x+1,3*x^2-2*x]
The problem with the following code is that the solution vector u is not good. I know that there must be a problem in the S and F element matrix calculation loop, but I can't see where even though I've been trying to make changes for a week.
Can you give me your opinion? Hopefully someone can see my error.
Thanks a lot,
% -alpha*u'' + beta*u = f
% u'(a) = bd1, u'(b) = bd2;
a = 0;
b = 1;
f = #(x) (1);
alpha = 1;
beta = 1;
% Neuamnn boundary conditions
bn1 = 1;
bn2 = 0;
syms ue(x)
DE = -alpha*diff(ue,x,2) + beta*ue == f;
du = diff(ue,x);
BC = [du(a)==bn1, du(b)==bn2];
ue = dsolve(DE, BC);
figure
fplot(ue,[a,b], 'r', 'LineWidth',2)
N = 2;
nnod = N*(2+2); % Number of nodes
neq = nnod*1; % Number of equations, one degree of freedom per node
xnod = linspace(a,b,nnod);
nodes = [(1:3:nnod-3)', (2:3:nnod-2)', (3:3:nnod-1)', (4:3:nnod)'];
phi = #(xi)[2*xi.^3-3*xi.^2+1,2*xi.^3+3*xi.^2,xi.^3-2*xi.^2+xi,xi.^3-xi.^2];
dphi = #(xi)[6*xi.^2-6*xi,-6*xi.^2+6*xi,3*xi^2-4*xi+1,3*xi^2-2*xi];
% Here, just calculate the integral using gauss quadrature..
order = 5;
[gp, gw] = gauss(order, 0, 1);
S = zeros(neq,neq);
M = S;
F = zeros(neq,1);
for iel = 1:N
%disp(iel)
inod = nodes(iel,:);
xc = xnod(inod);
h = xc(end)-xc(1);
Se = zeros(4,4);
Me = Se;
fe = zeros(4,1);
for ig = 1:length(gp)
xi = gp(ig);
iw = gw(ig);
Se = Se + dphi(xi)'*dphi(xi)*1/h*1*iw;
Me = Me + phi(xi)'*phi(xi)*h*1*iw;
x = phi(xi)*xc';
fe = fe + phi(xi)' * f(x) * h * 1 * iw;
end
% Assembly
S(inod,inod) = S(inod, inod) + Se;
M(inod,inod) = M(inod, inod) + Me;
F(inod) = F(inod) + fe;
end
S = alpha*S + beta*M;
g = zeros(neq,1);
g(1) = -alpha*bn1;
g(end) = alpha*bn2;
alldofs = 1:neq;
u = zeros(neq,1); %Pre-allocate
F = F + g;
u(alldofs) = S(alldofs,alldofs)\F(alldofs)
Warning: Matrix is singular to working precision.
u = 8×1
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
figure
fplot(ue,[a,b], 'r', 'LineWidth',2)
hold on
plot(xnod, u, 'bo')
for iel = 1:N
inod = nodes(iel,:);
xc = xnod(inod);
U = u(inod);
xi = linspace(0,1,100)';
Ue = phi(xi)*U;
Xe = phi(xi)*xc';
plot(Xe,Ue,'b -')
end
% Gauss function for calculate the integral
function [x, w, A] = gauss(n, a, b)
n = 1:(n - 1);
beta = 1 ./ sqrt(4 - 1 ./ (n .* n));
J = diag(beta, 1) + diag(beta, -1);
[V, D] = eig(J);
x = diag(D);
A = b - a;
w = V(1, :) .* V(1, :);
w = w';
x=x';
end
You can find the same post under MATLAB site for syntax highlighting.
Thanks
I tried to read courses, search in different documentation and modify my code without success.
I'm having serious problems in a simple example of fem assembly.
I just want to assemble the Mass matrix without any coefficient. The geometry is simple:
conn=[1, 2, 3];
p = [0 0; 1 0; 0 1];
I made it like this so that the physical element will be equal to the reference one.
my basis functions:
phi_1 = #(eta) 1 - eta(1) - eta(2);
phi_2 = #(eta) eta(1);
phi_3 = #(eta) eta(2);
phi = {phi_1, phi_2, phi_3};
Jacobian matrix:
J = #(x,y) [x(2) - x(1), x(3) - x(1);
y(2) - y(1), y(3) - y(1)];
The rest of the code:
M = zeros(np,np);
for K = 1:size(conn,1)
l2g = conn(K,:); %local to global mapping
x = p(l2g,1); %node x-coordinate
y = p(l2g,2); %node y-coordinate
jac = J(x,y);
loc_M = localM(jac, phi);
M(l2g, l2g) = M(l2g, l2g) + loc_M; %add element masses to M
end
localM:
function loc_M = localM(J,phi)
d_J = det(J);
loc_M = zeros(3,3);
for i = 1:3
for j = 1:3
loc_M(i,j) = d_J * quadrature(phi{i}, phi{j});
end
end
end
quadrature:
function value = quadrature(phi_i, phi_j)
p = [1/3, 1/3;
0.6, 0.2;
0.2, 0.6;
0.2, 0.2];
w = [-27/96, 25/96, 25/96, 25/96];
res = 0;
for i = 1:size(p,1)
res = res + phi_i(p(i,:)) * phi_j(p(i,:)) * w(i);
end
value = res;
end
For the simple entry (1,1) I obtain 0.833, while computing the integral by hand or on wolfram alpha I get 0.166 (2 times the result of the quadrature).
I tried with different points and weights for quadrature, but really I do not know what I am doing wrong.
I'm trying to implement Divide and Conquer SVD of an upper bidiagonal matrix B, but my code is not working. The error is:
"Unable to perform assignment because the size of the left side is
3-by-3 and the size of the right side is 2-by-2.
V_bar(1:k,1:k) = V1;"
Can somebody help me fix it? Thanks.
function [U,S,V] = DivideConquer_SVD(B)
[m,n] = size(B);
k = floor(m/2);
if k == 0
U = 1;
V = 1;
S = B;
return;
else
% Divide the input matrix
alpha = B(k,k);
beta = B(k,k+1);
e1 = zeros(m,1);
e2 = zeros(m,1);
e1(k) = 1;
e2(k+1) = 1;
B1 = B(1:k-1,1:k);
B2 = B(k+1:m,k+1:m);
%recursive computations
[U1,S1,V1] = DivideConquer_SVD(B1);
[U2,S2,V2] = DivideConquer_SVD(B2);
U_bar = zeros(m);
U_bar(1:k-1,1:k-1) = U1;
U_bar(k,k) = 1;
U_bar((k+1):m,(k+1):m) = U2;
D = zeros(m);
D(1:k-1,1:k) = S1;
D((k+1):m,(k+1):m) = S2;
V_bar = zeros(m);
V_bar(1:k,1:k) = V1;
V_bar((k+1):m,(k+1):m) = V2;
u = alpha*e1'*V_bar + beta*e2'*V_bar;
u = u';
D_tilde = D*D + u*u';
% compute eigenvalues and eigenvectors of D^2+uu'
[L1,Q1] = eig(D_tilde);
eigs = diag(L1);
S = zeros(m,n)
S(1:(m+1):end) = eigs
U_tilde = Q1;
V_tilde = Q1;
%Compute eigenvectors of the original input matrix T
U = U_bar*U_tilde;
V = V_bar*V_tilde;
return;
end
With limited mathematical knowledge, you need to help me a bit more -- as I cannot judge if the approach is correct in a mathematical way (with no theory given;) ). Anyway, I couldn't even reproduce the error e.g with this matrix, which The MathWorks use to illustrate their LU matrix factorization
A = [10 -7 0
-3 2 6
5 -1 5];
So I tried to structure your code a bit and gave some hints. Extend this to make your code clearer for those people (like me) who are not too familiar with matrix decomposition.
function [U,S,V] = DivideConquer_SVD(B)
% m x n matrix
[m,n] = size(B);
k = floor(m/2);
if k == 0
disp('if') % for debugging
U = 1;
V = 1;
S = B;
% return; % net necessary as you don't do anything afterwards anyway
else
disp('else') % for debugging
% Divide the input matrix
alpha = B(k,k); % element on diagonal
beta = B(k,k+1); % element on off-diagonal
e1 = zeros(m,1);
e2 = zeros(m,1);
e1(k) = 1;
e2(k+1) = 1;
% divide matrix
B1 = B(1:k-1,1:k); % upper left quadrant
B2 = B(k+1:m,k+1:m); % lower right quadrant
% recusrsive function call
[U1,S1,V1] = DivideConquer_SVD(B1);
[U2,S2,V2] = DivideConquer_SVD(B2);
U_bar = zeros(m);
U_bar(1:k-1,1:k-1) = U1;
U_bar(k,k) = 1;
U_bar((k+1):m,(k+1):m) = U2;
D = zeros(m);
D(1:k-1,1:k) = S1;
D((k+1):m,(k+1):m) = S2;
V_bar = zeros(m);
V_bar(1:k,1:k) = V1;
V_bar((k+1):m,(k+1):m) = V2;
u = (alpha*e1.'*V_bar + beta*e2.'*V_bar).'; % (little show-off tip: '
% is the complex transpose operator; .' is the "normal" transpose
% operator. It's good practice to distinguish between them but there
% is no difference for real matrices anyway)
D_tilde = D*D + u*u.';
% compute eigenvalues and eigenvectors of D^2+uu'
[L1,Q1] = eig(D_tilde);
eigs = diag(L1);
S = zeros(m,n);
S(1:(m+1):end) = eigs;
U_tilde = Q1;
V_tilde = Q1;
% Compute eigenvectors of the original input matrix T
U = U_bar*U_tilde;
V = V_bar*V_tilde;
% return; % net necessary as you don't do anything afterwards anyway
end % for
end % function
A filter g is called separable if it can be expressed as the multiplication of two vectors grow and gcol . Employing one dimensional filters decreases the two dimensional filter's computational complexity from O(M^2 N^2) to O(2M N^2) where M and N are the width (and height) of the filter mask and the image respectively.
In this stackoverflow link, I wrote the equation of a Gabor filter in the spatial domain, then I wrote a matlab code which serves to create 64 gabor features.
According to the definition of separable filters, the Gabor filters are parallel to the image axes - theta = k*pi/2 where k=0,1,2,etc.. So if theta=pi/2 ==> the equation in this stackoverflow link can be rewritten as:
The equation above is extracted from this article.
Note: theta can be extented to be equal k*pi/4. By comparing to the equation in this stackoverflow link, we can consider that f= 1 / lambda.
By changing my previous code in this stackoverflow link, I wrote a matlab code to make the Gabor filters separable by using the equation above, but I am sure that my code below is not correct especially when I initialized the gbp and glp equations. That is why I need your help. I will appreciate your help very much.
Let's show now my code:
function [fSiz,filters1,filters2,c1OL,numSimpleFilters] = init_gabor(rot, RF_siz)
image=imread('xxx.jpg');
image_gray=rgb2gray(image);
image_gray=imresize(image_gray, [100 100]);
image_double=double(image_gray);
rot = [0 45 90 135]; % we have four orientations
RF_siz = [7:2:37]; %we get 16 scales (7x7 to 37x37 in steps of two pixels)
minFS = 7; % the minimum receptive field
maxFS = 37; % the maximum receptive field
sigma = 0.0036*RF_siz.^2 + 0.35*RF_siz + 0.18; %define the equation of effective width
lambda = sigma/0.8; % it the equation of wavelength (lambda)
G = 0.3; % spatial aspect ratio: 0.23 < gamma < 0.92
numFilterSizes = length(RF_siz); % we get 16
numSimpleFilters = length(rot); % we get 4
numFilters = numFilterSizes*numSimpleFilters; % we get 16x4 = 64 filters
fSiz = zeros(numFilters,1); % It is a vector of size numFilters where each cell contains the size of the filter (7,7,7,7,9,9,9,9,11,11,11,11,......,37,37,37,37)
filters1 = zeros(max(RF_siz),numFilters);
filters2 = zeros(numFilters,max(RF_siz));
for k = 1:numFilterSizes
for r = 1:numSimpleFilters
theta = rot(r)*pi/180;
filtSize = RF_siz(k);
center = ceil(filtSize/2);
filtSizeL = center-1;
filtSizeR = filtSize-filtSizeL-1;
sigmaq = sigma(k)^2;
for x = -filtSizeL:filtSizeR
fx = exp(-(x^2)/(2*sigmaq))*cos(2*pi*x/lambda(k));
f1(x+center,1) = fx;
end
for y = -filtSizeL:filtSizeR
gy = exp(-(y^2)/(2*sigmaq));
f2(1,y+center) = gy;
end
f1 = f1 - mean(mean(f1));
f1 = f1 ./ sqrt(sum(sum(f1.^2)));
f2 = f2 - mean(mean(f2));
f2 = f2 ./ sqrt(sum(sum(f2.^2)));
p = numSimpleFilters*(k-1) + r;
filters1(1:filtSize,p)=f1;
filters2(p,1:filtSize)=f2;
convv1=imfilter(image_double, filters1(1:filtSize,p),'conv');
convv2=imfilter(double(convv1), filters2(p,1:filtSize),'conv');
figure
imagesc(convv2);
colormap(gray);
end
end
I think the code is correct provided your previous version of Gabor filter code is correct too. The only thing is that if theta = k * pi/4;, your formula here should be separated to:
fx = exp(-(x^2)/(2*sigmaq))*cos(2*pi*x/lambda(k));
gy = exp(-(G^2 * y^2)/(2*sigmaq));
To be consistent, you may use
f1(1,x+center) = fx;
f2(y+center,1) = gy;
or keep f1 and f2 as it is but transpose your filters1 and filters2 thereafter.
Everything else looks good to me.
EDIT
My answer above works for theta = k * pi/4;, with other angles, based on your paper,
x = i*cos(theta) - j*sin(theta);
y = i*sin(theta) + j*cos(theta);
fx = exp(-(x^2)/(2*sigmaq))*exp(sqrt(-1)*x*cos(theta));
gy = exp(-(G^2 * y^2)/(2*sigmaq))*exp(sqrt(-1)*y*sin(theta));
The final code will be:
function [fSiz,filters1,filters2,c1OL,numSimpleFilters] = init_gabor(rot, RF_siz)
image=imread('xxx.jpg');
image_gray=rgb2gray(image);
image_gray=imresize(image_gray, [100 100]);
image_double=double(image_gray);
rot = [0 45 90 135];
RF_siz = [7:2:37];
minFS = 7;
maxFS = 37;
sigma = 0.0036*RF_siz.^2 + 0.35*RF_siz + 0.18;
lambda = sigma/0.8;
G = 0.3;
numFilterSizes = length(RF_siz);
numSimpleFilters = length(rot);
numFilters = numFilterSizes*numSimpleFilters;
fSiz = zeros(numFilters,1);
filters1 = zeros(max(RF_siz),numFilters);
filters2 = zeros(numFilters,max(RF_siz));
for k = 1:numFilterSizes
for r = 1:numSimpleFilters
theta = rot(r)*pi/180;
filtSize = RF_siz(k);
center = ceil(filtSize/2);
filtSizeL = center-1;
filtSizeR = filtSize-filtSizeL-1;
sigmaq = sigma(k)^2;
for x = -filtSizeL:filtSizeR
fx = exp(-(x^2)/(2*sigmaq))*exp(sqrt(-1)*x*cos(theta));
f1(1, x+center) = fx;
end
for y = -filtSizeL:filtSizeR
gy=exp(-(y^2)/(2*sigmaq))*exp(sqrt(-1)*y*sin(theta));
f2(y+center,1) = gy;
end
f1 = f1 - mean(mean(f1));
f1 = f1 ./ sqrt(sum(sum(f1.^2)));
f2 = f2 - mean(mean(f2));
f2 = f2 ./ sqrt(sum(sum(f2.^2)));
p = numSimpleFilters*(k-1) + r;
filters1(1:filtSize,p)=f1;
filters2(p,1:filtSize)=f2;
convv1=imfilter(image_double, filters1(1:filtSize,p),'conv');
convv2=imfilter(double(convv1), filters2(p,1:filtSize),'conv');
figure
imagesc(imag(convv2));
colormap(gray);
end
end
I've written some code to implement an algorithm that takes as input a vector q of real numbers, and returns as an output a complex matrix R. The Matlab code below produces a plot showing the input vector q and the output matrix R.
Given only the complex matrix output R, I would like to obtain the input vector q. Can I do this using least-squares optimization? Since there is a recursive running sum in the code (rs_r and rs_i), the calculation for a column of the output matrix is dependent on the calculation of the previous column.
Perhaps a non-linear optimization can be set up to recompose the input vector q from the output matrix R?
Looking at this in another way, I've used an algorithm to compute a matrix R. I want to run the algorithm "in reverse," to get the input vector q from the output matrix R.
If there is no way to recompose the starting values from the output, thereby treating the problem as a "black box," then perhaps the mathematics of the model itself can be used in the optimization? The program evaluates the following equation:
The Utilde(tau, omega) is the output matrix R. The tau (time) variable comprises the columns of the response matrix R, whereas the omega (frequency) variable comprises the rows of the response matrix R. The integration is performed as a recursive running sum from tau = 0 up to the current tau timestep.
Here are the plots created by the program posted below:
Here is the full program code:
N = 1001;
q = zeros(N, 1); % here is the input
q(1:200) = 55;
q(201:300) = 120;
q(301:400) = 70;
q(401:600) = 40;
q(601:800) = 100;
q(801:1001) = 70;
dt = 0.0042;
fs = 1 / dt;
wSize = 101;
Glim = 20;
ginv = 0;
R = get_response(N, q, dt, wSize, Glim, ginv); % R is output matrix
rows = wSize;
cols = N;
figure; plot(q); title('q value input as vector');
ylim([0 200]); xlim([0 1001])
figure; imagesc(abs(R)); title('Matrix output of algorithm')
colorbar
Here is the function that performs the calculation:
function response = get_response(N, Q, dt, wSize, Glim, ginv)
fs = 1 / dt;
Npad = wSize - 1;
N1 = wSize + Npad;
N2 = floor(N1 / 2 + 1);
f = (fs/2)*linspace(0,1,N2);
omega = 2 * pi .* f';
omegah = 2 * pi * f(end);
sigma2 = exp(-(0.23*Glim + 1.63));
sign = 1;
if(ginv == 1)
sign = -1;
end
ratio = omega ./ omegah;
rs_r = zeros(N2, 1);
rs_i = zeros(N2, 1);
termr = zeros(N2, 1);
termi = zeros(N2, 1);
termr_sub1 = zeros(N2, 1);
termi_sub1 = zeros(N2, 1);
response = zeros(N2, N);
% cycle over cols of matrix
for ti = 1:N
term0 = omega ./ (2 .* Q(ti));
gamma = 1 / (pi * Q(ti));
% calculate for the real part
if(ti == 1)
Lambda = ones(N2, 1);
termr_sub1(1) = 0;
termr_sub1(2:end) = term0(2:end) .* (ratio(2:end).^-gamma);
else
termr(1) = 0;
termr(2:end) = term0(2:end) .* (ratio(2:end).^-gamma);
rs_r = rs_r - dt.*(termr + termr_sub1);
termr_sub1 = termr;
Beta = exp( -1 .* -0.5 .* rs_r );
Lambda = (Beta + sigma2) ./ (Beta.^2 + sigma2); % vector
end
% calculate for the complex part
if(ginv == 1)
termi(1) = 0;
termi(2:end) = (ratio(2:end).^(sign .* gamma) - 1) .* omega(2:end);
else
termi = (ratio.^(sign .* gamma) - 1) .* omega;
end
rs_i = rs_i - dt.*(termi + termi_sub1);
termi_sub1 = termi;
integrand = exp( 1i .* -0.5 .* rs_i );
if(ginv == 1)
response(:,ti) = Lambda .* integrand;
else
response(:,ti) = (1 ./ Lambda) .* integrand;
end
end % ti loop
No, you cannot do so unless you know the "model" itself for this process. If you intend to treat the process as a complete black box, then it is impossible in general, although in any specific instance, anything can happen.
Even if you DO know the underlying process, then it may still not work, as any least squares estimator is dependent on the starting values, so if you do not have a good guess there, it may converge to the wrong set of parameters.
It turns out that by using the mathematics of the model, the input can be estimated. This is not true in general, but for my problem it seems to work. The cumulative integral is eliminated by a partial derivative.
N = 1001;
q = zeros(N, 1);
q(1:200) = 55;
q(201:300) = 120;
q(301:400) = 70;
q(401:600) = 40;
q(601:800) = 100;
q(801:1001) = 70;
dt = 0.0042;
fs = 1 / dt;
wSize = 101;
Glim = 20;
ginv = 0;
R = get_response(N, q, dt, wSize, Glim, ginv);
rows = wSize;
cols = N;
cut_val = 200;
imagLogR = imag(log(R));
Mderiv = zeros(rows, cols-2);
for k = 1:rows
val = deriv_3pt(imagLogR(k,:), dt);
val(val > cut_val) = 0;
Mderiv(k,:) = val(1:end-1);
end
disp('Running iteration');
q0 = 10;
q1 = 500;
NN = cols - 2;
qout = zeros(NN, 1);
for k = 1:NN
data = Mderiv(:,k);
qout(k) = fminbnd(#(q) curve_fit_to_get_q(q, dt, rows, data),q0,q1);
end
figure; plot(q); title('q value input as vector');
ylim([0 200]); xlim([0 1001])
figure;
plot(qout); title('Reconstructed q')
ylim([0 200]); xlim([0 1001])
Here are the supporting functions:
function output = deriv_3pt(x, dt)
% Function to compute dx/dt using the 3pt symmetrical rule
% dt is the timestep
N = length(x);
N0 = N - 1;
output = zeros(N0, 1);
denom = 2 * dt;
for k = 2:N0
output(k - 1) = (x(k+1) - x(k-1)) / denom;
end
function sse = curve_fit_to_get_q(q, dt, rows, data)
fs = 1 / dt;
N2 = rows;
f = (fs/2)*linspace(0,1,N2); % vector for frequency along cols
omega = 2 * pi .* f';
omegah = 2 * pi * f(end);
ratio = omega ./ omegah;
gamma = 1 / (pi * q);
termi = ((ratio.^(gamma)) - 1) .* omega;
Error_Vector = termi - data;
sse = sum(Error_Vector.^2);