Related
I'm having serious problems in a simple example of fem assembly.
I just want to assemble the Mass matrix without any coefficient. The geometry is simple:
conn=[1, 2, 3];
p = [0 0; 1 0; 0 1];
I made it like this so that the physical element will be equal to the reference one.
my basis functions:
phi_1 = #(eta) 1 - eta(1) - eta(2);
phi_2 = #(eta) eta(1);
phi_3 = #(eta) eta(2);
phi = {phi_1, phi_2, phi_3};
Jacobian matrix:
J = #(x,y) [x(2) - x(1), x(3) - x(1);
y(2) - y(1), y(3) - y(1)];
The rest of the code:
M = zeros(np,np);
for K = 1:size(conn,1)
l2g = conn(K,:); %local to global mapping
x = p(l2g,1); %node x-coordinate
y = p(l2g,2); %node y-coordinate
jac = J(x,y);
loc_M = localM(jac, phi);
M(l2g, l2g) = M(l2g, l2g) + loc_M; %add element masses to M
end
localM:
function loc_M = localM(J,phi)
d_J = det(J);
loc_M = zeros(3,3);
for i = 1:3
for j = 1:3
loc_M(i,j) = d_J * quadrature(phi{i}, phi{j});
end
end
end
quadrature:
function value = quadrature(phi_i, phi_j)
p = [1/3, 1/3;
0.6, 0.2;
0.2, 0.6;
0.2, 0.2];
w = [-27/96, 25/96, 25/96, 25/96];
res = 0;
for i = 1:size(p,1)
res = res + phi_i(p(i,:)) * phi_j(p(i,:)) * w(i);
end
value = res;
end
For the simple entry (1,1) I obtain 0.833, while computing the integral by hand or on wolfram alpha I get 0.166 (2 times the result of the quadrature).
I tried with different points and weights for quadrature, but really I do not know what I am doing wrong.
EDIT: The code that I have pasted is too long. Basicaly I dont know how to work with the second code, If I know how calculate alpha from the second code I think my problem will be solved. I have tried a lot of input arguments for the second code but it does not work!
I have written following code to solve a convex optimization problem using Gradient descend method:
function [optimumX,optimumF,counter,gNorm,dx] = grad_descent()
x0 = [3 3]';%'//
terminationThreshold = 1e-6;
maxIterations = 100;
dxMin = 1e-6;
gNorm = inf; x = x0; counter = 0; dx = inf;
% ************************************
f = #(x1,x2) 4.*x1.^2 + 2.*x1.*x2 +8.*x2.^2 + 10.*x1 + x2;
%alpha = 0.01;
% ************************************
figure(1); clf; ezcontour(f,[-5 5 -5 5]); axis equal; hold on
f2 = #(x) f(x(1),x(2));
% gradient descent algorithm:
while and(gNorm >= terminationThreshold, and(counter <= maxIterations, dx >= dxMin))
g = grad(x);
gNorm = norm(g);
alpha = linesearch_strongwolfe(f,-g, x0, 1);
xNew = x - alpha * g;
% check step
if ~isfinite(xNew)
display(['Number of iterations: ' num2str(counter)])
error('x is inf or NaN')
end
% **************************************
plot([x(1) xNew(1)],[x(2) xNew(2)],'ko-')
refresh
% **************************************
counter = counter + 1;
dx = norm(xNew-x);
x = xNew;
end
optimumX = x;
optimumF = f2(optimumX);
counter = counter - 1;
% define the gradient of the objective
function g = grad(x)
g = [(8*x(1) + 2*x(2) +10)
(2*x(1) + 16*x(2) + 1)];
end
end
As you can see, I have commented out the alpha = 0.01; part. I want to calculate alpha via an other code. Here is the code (This code is not mine)
function alphas = linesearch_strongwolfe(f,d,x0,alpham)
alpha0 = 0;
alphap = alpha0;
c1 = 1e-4;
c2 = 0.5;
alphax = alpham*rand(1);
[fx0,gx0] = feval(f,x0,d);
fxp = fx0;
gxp = gx0;
i=1;
while (1 ~= 2)
xx = x0 + alphax*d;
[fxx,gxx] = feval(f,xx,d);
if (fxx > fx0 + c1*alphax*gx0) | ((i > 1) & (fxx >= fxp)),
alphas = zoom(f,x0,d,alphap,alphax);
return;
end
if abs(gxx) <= -c2*gx0,
alphas = alphax;
return;
end
if gxx >= 0,
alphas = zoom(f,x0,d,alphax,alphap);
return;
end
alphap = alphax;
fxp = fxx;
gxp = gxx;
alphax = alphax + (alpham-alphax)*rand(1);
i = i+1;
end
function alphas = zoom(f,x0,d,alphal,alphah)
c1 = 1e-4;
c2 = 0.5;
[fx0,gx0] = feval(f,x0,d);
while (1~=2),
alphax = 1/2*(alphal+alphah);
xx = x0 + alphax*d;
[fxx,gxx] = feval(f,xx,d);
xl = x0 + alphal*d;
fxl = feval(f,xl,d);
if ((fxx > fx0 + c1*alphax*gx0) | (fxx >= fxl)),
alphah = alphax;
else
if abs(gxx) <= -c2*gx0,
alphas = alphax;
return;
end
if gxx*(alphah-alphal) >= 0,
alphah = alphal;
end
alphal = alphax;
end
end
But I get this error:
Error in linesearch_strongwolfe (line 11) [fx0,gx0] = feval(f,x0,d);
As you can see I have written the f function and its gradient manually.
linesearch_strongwolfe(f,d,x0,alpham) takes a function f, Gradient of f, a vector x0 and a constant alpham. is there anything wrong with my declaration of f? This code works just fine if I put back alpha = 0.01;
As I see it:
x0 = [3; 3]; %2-element column vector
g = grad(x0); %2-element column vector
f = #(x1,x2) 4.*x1.^2 + 2.*x1.*x2 +8.*x2.^2 + 10.*x1 + x2;
linesearch_strongwolfe(f,-g, x0, 1); %passing variables
inside the function:
[fx0,gx0] = feval(f,x0,-g); %variable names substituted with input vars
This will in effect call
[fx0,gx0] = f(x0,-g);
but f(x0,-g) is a single 2-element column vector with these inputs. Assingning the output to two variables will not work.
You either have to define f as a proper named function (just like grad) to output 2 variables (one for each component), or edit the code of linesearch_strongwolfe to return a single variable, then slice that into 2 separate variables yourself afterwards.
If you experience a very rare kind of laziness and don't want to define a named function, you can still use an anonymous function at the cost of duplicating code for the two components (at least I couldn't come up with a cleaner solution):
f = #(x1,x2) deal(4.*x1(1)^2 + 2.*x1(1)*x2(1) +8.*x2(1)^2 + 10.*x1(1) + x2(1),...
4.*x1(2)^2 + 2.*x1(2)*x2(2) +8.*x2(2)^2 + 10.*x1(2) + x2(2));
[fx0,gx0] = f(x0,-g); %now works fine
as long as you always have 2 output variables. Note that this is more like a proof of concept, since this is ugly, inefficient, and very susceptible to typos.
I wrote a python function to get the parameters of the following cosine function:
param = Parameters()
param.add( 'amp', value = amp_guess, min = 0.1 * amp_guess, max = amp_guess )
param.add( 'off', value = off_guess, min = -10, max = 10 )
param.add( 'shift', value = shift_guess[0], min = 0, max = 2 * np.pi, )
fit_values = minimize( self.residual, param, args = ( azi_unique, los_unique ) )
def residual( self, param, azi, data ):
"""
Parameters
----------
Returns
-------
"""
amp = param['amp'].value
off = param['off'].value
shift = param['shift'].value
model = off + amp * np.cos( azi - shift )
return model - data
In Matlab how can get the amplitude, offset and shift of the cosine function?
My experience tells me that it's always good to depend as little as possible on toolboxes. For your particular case, the model is simple and doing it manually is pretty straightforward.
Assuming that you have the following model:
y = B + A*cos(w*x + phi)
and that your data is equally-spaced, then:
%// Create some bogus data
A = 8;
B = -4;
w = 0.2;
phi = 1.8;
x = 0 : 0.1 : 8.4*pi;
y = B + A*cos(w*x + phi) + 0.5*randn(size(x));
%// Find kick-ass initial estimates
L = length(y);
N = 2^nextpow2(L);
B0 = (max(y(:))+min(y(:)))/2;
Y = fft(y-B0, N)/L;
f = 5/(x(2)-x(1)) * linspace(0,1,N/2+1);
[A0,I] = max( 2*abs(Y(1:N/2+1)) );
w0 = f(I);
phi0 = 2*imag(Y(I));
%// Refine the fit
sol = fminsearch(#(t) sum( (y(:)-t(1)-t(2)*cos(t(3)*x(:)+t(4))).^2 ), [B0 A0 w0 phi0])
Results:
sol = %// B was -4 A was 8 w was 0.2 phi was 1.8
-4.0097e+000 7.9913e+000 1.9998e-001 1.7961e+000
MATLAB has a function called lsqcurvefit in the optimisation toolbox:
lsqcurvefit(fun,X0,xdata,ydata,lbound,ubound);
where fun is the function to fit, x0 is the initial parameter guess, xdata and ydata are self-explanatory, and lbound and ubound are the lower and upper bounds to the parameters. So, for instance, you might have a function:
% x(1) = amp
% x(2) = shift
% x(3) = offset
% note cosd instead of cos, because your data appears to be in degrees
cosfit = #(x,xdata) x(1) .* cosd(xdata - x(2)) + x(3);
You would then call the lsqcurvefit function as follows:
guess = [7,150,0.5];
lbound = [-10,0,-10]
ubound = [10,360,10]
fit_values = lsqcurvefit(cosfit,guess,azi_unique,los_unique,lbound,ubound);
I asked a question a few days before but I guess it was a little too complicated and I don't expect to get any answer.
My problem is that I need to use ANN for classification. I've read that much better cost function (or loss function as some books specify) is the cross-entropy, that is J(w) = -1/m * sum_i( yi*ln(hw(xi)) + (1-yi)*ln(1 - hw(xi)) ); i indicates the no. data from training matrix X. I tried to apply it in MATLAB but I find it really difficult. There are couple things I don't know:
should I sum each outputs given all training data (i = 1, ... N, where N is number of inputs for training)
is the gradient calculated correctly
is the numerical gradient (gradAapprox) calculated correctly.
I have following MATLAB codes. I realise I may ask for trivial thing but anyway I hope someone can give me some clues how to find the problem. I suspect the problem is to calculate gradients.
Many thanks.
Main script:
close all
clear all
L = #(x) (1 + exp(-x)).^(-1);
NN = #(x,theta) theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')];
% theta = [10 -30 -30];
x = [0 0; 0 1; 1 0; 1 1];
y = [0.9 0.1 0.1 0.1]';
theta0 = 2*rand(9,1)-1;
options = optimset('gradObj','on','Display','iter');
thetaVec = fminunc(#costFunction,theta0,options,x,y);
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
NN(x,theta)'
Cost function:
function [jVal,gradVal,gradApprox] = costFunction(thetaVec,x,y)
persistent index;
% 1 x x
% 1 x x
% 1 x x
% x = 1 x x
% 1 x x
% 1 x x
% 1 x x
m = size(x,1);
if isempty(index) || index > size(x,1)
index = 1;
end
L = #(x) (1 + exp(-x)).^(-1);
NN = #(x,theta) theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')];
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
Dew = cell(2,1);
DewApprox = cell(2,1);
% Forward propagation
a0 = x(index,:)';
z1 = theta{1}*[1;a0];
a1 = L(z1);
z2 = theta{2}*[1;a1];
a2 = L(z2);
% Back propagation
d2 = 1/m*(a2 - y(index))*L(z2)*(1-L(z2));
Dew{2} = [1;a1]*d2;
d1 = [1;a1].*(1 - [1;a1]).*theta{2}'*d2;
Dew{1} = [1;a0]*d1(2:end)';
% NNRes = NN(x,theta)';
% jVal = -1/m*sum(NNRes-y)*NNRes*(1-NNRes);
jVal = -1/m*(a2 - y(index))*a2*(1-a2);
gradVal = [Dew{1}(:);Dew{2}(:)];
gradApprox = CalcGradApprox(0.0001);
index = index + 1;
function output = CalcGradApprox(epsilon)
output = zeros(size(gradVal));
for n=1:length(thetaVec)
thetaVecMin = thetaVec;
thetaVecMax = thetaVec;
thetaVecMin(n) = thetaVec(n) - epsilon;
thetaVecMax(n) = thetaVec(n) + epsilon;
thetaMin = cell(2,1);
thetaMax = cell(2,1);
thetaMin{1} = reshape(thetaVecMin(1:6),[2 3]);
thetaMin{2} = reshape(thetaVecMin(7:9),[1 3]);
thetaMax{1} = reshape(thetaVecMax(1:6),[2 3]);
thetaMax{2} = reshape(thetaVecMax(7:9),[1 3]);
a2min = NN(x(index,:),thetaMin)';
a2max = NN(x(index,:),thetaMax)';
jValMin = -1/m*(a2min-y(index))*a2min*(1-a2min);
jValMax = -1/m*(a2max-y(index))*a2max*(1-a2max);
output(n) = (jValMax - jValMin)/2/epsilon;
end
end
end
EDIT:
Below I present the correct version of my costFunction for those who may be interested.
function [jVal,gradVal,gradApprox] = costFunction(thetaVec,x,y)
m = size(x,1);
L = #(x) (1 + exp(-x)).^(-1);
NN = #(x,theta) L(theta{2}*[ones(1,size(x,1));L(theta{1}*[ones(size(x,1),1) x]')]);
theta = cell(2,1);
theta{1} = reshape(thetaVec(1:6),[2 3]);
theta{2} = reshape(thetaVec(7:9),[1 3]);
Delta = cell(2,1);
Delta{1} = zeros(size(theta{1}));
Delta{2} = zeros(size(theta{2}));
D = cell(2,1);
D{1} = zeros(size(theta{1}));
D{2} = zeros(size(theta{2}));
jVal = 0;
for in = 1:size(x,1)
% Forward propagation
a1 = [1;x(in,:)']; % added bias to a0
z2 = theta{1}*a1;
a2 = [1;L(z2)]; % added bias to a1
z3 = theta{2}*a2;
a3 = L(z3);
% Back propagation
d3 = a3 - y(in);
d2 = theta{2}'*d3.*a2.*(1 - a2);
Delta{2} = Delta{2} + d3*a2';
Delta{1} = Delta{1} + d2(2:end)*a1';
jVal = jVal + sum( y(in)*log(a3) + (1-y(in))*log(1-a3) );
end
D{1} = 1/m*Delta{1};
D{2} = 1/m*Delta{2};
jVal = -1/m*jVal;
gradVal = [D{1}(:);D{2}(:)];
gradApprox = CalcGradApprox(x(in,:),0.0001);
% Nested function to calculate gradApprox
function output = CalcGradApprox(x,epsilon)
output = zeros(size(thetaVec));
for n=1:length(thetaVec)
thetaVecMin = thetaVec;
thetaVecMax = thetaVec;
thetaVecMin(n) = thetaVec(n) - epsilon;
thetaVecMax(n) = thetaVec(n) + epsilon;
thetaMin = cell(2,1);
thetaMax = cell(2,1);
thetaMin{1} = reshape(thetaVecMin(1:6),[2 3]);
thetaMin{2} = reshape(thetaVecMin(7:9),[1 3]);
thetaMax{1} = reshape(thetaVecMax(1:6),[2 3]);
thetaMax{2} = reshape(thetaVecMax(7:9),[1 3]);
a3min = NN(x,thetaMin)';
a3max = NN(x,thetaMax)';
jValMin = 0;
jValMax = 0;
for inn=1:size(x,1)
jValMin = jValMin + sum( y(inn)*log(a3min) + (1-y(inn))*log(1-a3min) );
jValMax = jValMax + sum( y(inn)*log(a3max) + (1-y(inn))*log(1-a3max) );
end
jValMin = 1/m*jValMin;
jValMax = 1/m*jValMax;
output(n) = (jValMax - jValMin)/2/epsilon;
end
end
end
I've only had a quick eyeball over your code. Here are some pointers.
Q1
should I sum each outputs given all training data (i = 1, ... N, where
N is number of inputs for training)
If you are talking in relation to the cost function, it is normal to sum and normalise by the number of training examples in order to provide comparison between.
I can't tell from the code whether you have a vectorised implementation which will change the answer. Note that the sum function will only sum up a single dimension at a time - meaning if you have a (M by N) array, sum will result in a 1 by N array.
The cost function should have a scalar output.
Q2
is the gradient calculated correctly
The gradient is not calculated correctly - specifically the deltas look wrong. Try following Andrew Ng's notes [PDF] they are very good.
Q3
is the numerical gradient (gradAapprox) calculated correctly.
This line looks a bit suspect. Does this make more sense?
output(n) = (jValMax - jValMin)/(2*epsilon);
EDIT: I actually can't make heads or tails of your gradient approximation. You should only use forward propagation and small tweaks in the parameters to compute the gradient. Good luck!
Why doesn't color filter below find green peppers?
The code:
function [ outhsv ] = ColorFilter( hsv, h, s )
%COLORFILTER Summary of this function goes here
% Detailed explanation goes here
if nargin < 2
h = [];
end
if nargin < 3
s = [];
end
if size(h,2)==1
h = padarray(h, [0 1], 1/100, 'post');
end
if size(s,2)==1
s = padarray(s, [0 1], 1/100, 'post');
end
if isempty(h)
v_of_h = ones(size(hsv,1), size(hsv,2));
else
v_of_h = WeightFunction( hsv(:,:,1), h(:,1), h(:,2));
end
if isempty(s)
v_of_s = ones(size(hsv,1), size(hsv,2));
else
v_of_s = WeightFunctionOnce( hsv(:,:,2), s(:,1), s(:,2));
end
outhsv = hsv;
outhsv(:,:,3) = hsv(:,:,3) .* v_of_h .* v_of_s;
function y = WeightFunction( x, mu, sigma )
%y = WeightFunctionOnce(x,mu,sigma) + WeightFunctionOnce(x-1,mu,sigma);
y = 1 - (1-WeightFunctionOnce(x,mu,sigma)) .* (1-WeightFunctionOnce(x-1,mu,sigma));
function y = WeightFunctionOnce( x, mu, sigma )
if nargin<2
mu=0;
elseif nargin<3
sigma=1./100.;
end
if any(size(mu) ~= size(sigma))
error('mu and sigma should be of the same size');
end
y = zeros([size(x) numel(mu)]);
for i=1:numel(mu)
y(:,:,i) = exp(-((x - mu(i)) .^ 2 ./ (2 .* sigma(i) .^ 2)));
end
%y = sum(y,3)/size(y,3);
y = 1-prod(1-y,3);
Display code:
hue = 120;
h = [hue/360 0.05];
s = [];
rgb1 = imread('huescale.png');
%rgb1 = imread('peppers.png');
hsv1 = rgb2hsv(rgb1);
hsv2 = ColorFilter(hsv1, h, s);
rgb2 = hsv2rgb(hsv2);
bitmask = hsv1(:,:,1)>(h(1)-h(2)) & hsv1(:,:,1)<(h(1)+h(2));
figure;
subplot(3,1,1); imshow(rgb1);
subplot(3,1,2); imshow(rgb2);
subplot(3,1,3); imshow(bitmask);
result on scale
(works)
result on peppers:
(does not)
Why?
If you looked closer at the H values, those green peppers are kind of yellowish, so you might want to widen the rule a bit.
I would suggest something in between 0.15 and 0.5. You can also combine with saturation channel, say only consider portions of images that are vibrant, i.e., we want to get rid of the onions. Try the following codes to get a preview.
hsv_dat = rgb2hsv(imread('peppers.png'));
imagesc(hsv_dat(:,:,1) > 0.15 & hsv_dat(:,:,1) < 0.5 & hsv_dat(:,:,2) > 0.3)
colormap(gray)
You should get