What values do you supply to the sigmoid activation function? - neural-network

I have my sigmoid activation function:
s(x) = 1/(1+e^(-x))
I have my output neuron:
Expecting = 1
Actual = 1.13
I know the value that comes out of the sigmoid activation function is 1.1254, but I can't figure out which values to plug in to get that result.

x = 1.1254
If you plug this x into your sigmoid function you get:
s(x) = 1 / (1 + e^(-x))
= 1 / (1 + 2.71828^(-1.1254))
= 0.7550
The derivative of the sigmoid, s'(x) is:
s'(x) = s(x) * (1 - s(x)), or
s'(x) = 0.7550 * (1 - 0.7550)
= 0.1850
As @Engineero points out in the comments, e is the base of natural logarithms and is approximately equal to 2.71828.
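If you want to verify these numbers yourself, here is a minimal MATLAB sketch (the anonymous functions are only for illustration):
s  = @(x) 1 ./ (1 + exp(-x));    % s(x) = 1/(1+e^(-x))
ds = @(x) s(x) .* (1 - s(x));    % s'(x) = s(x)*(1-s(x))
s(1.1254)     % ans = 0.7550
ds(1.1254)    % ans = 0.1850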

Related

MATLAB math operation with the power function is giving me very off results (giving me extremely small numbers)

I wrote the following code to plot a graph. The math operation for value does not make sense.
The new_x and new_y values are -3 and 5.0249 respectively in this iteration, but the value calculated is some absurdly small number.
I tried the exact same thing in the command window using the same inputs and it works fine...
for i = 1:length(x_vals)
    syms f(y);
    f(y) = 3 * x_vals(i)^(7) + 2 * y^(5) - x_vals(i)^(3) + y^(3);
    df = diff(f, y);
    f_func = matlabFunction(f(y));
    f_prime = matlabFunction(df);
    a = 4;
    tolerance = 10.^-6;
    root(i) = Newton(f_func, f_prime, a, tolerance);
    %value(i) = G(x_vals(i),root(i));
    new_x = double(x_vals(i));
    new_y = double(root(i));
    value = 3 * new_x^7 + 2 * new_y^5 - new_x^3 + new_y^3;
    if (value >= -3 - e && value <= -3 + e)
        y_vals(i) = root(i);
        z_vals(i) = -3;
    end
end

Matlab: Is the square operation in the real domain the conjugate operation in the complex domain?

I am trying to evaluate the expression z = (x-y)^2 in the real domain and its corresponding adaptation in the complex domain. In the real domain, this expression is implemented as
x = 5;
y = 2;
z = (x-y)^2
z =
     9
In the complex domain, the expression would become (please correct me if I am wrong):
z_c = (x_c - y_c)(x_c - y_c)*
This is implemented in MATLAB by
>> x_c = 5 + 0.9i;
y_c = 2 - 0.34i;
z_c = (x_c-y_c)*conj((x_c -y_c))
z_c =
10.5376
The * (conjugate) operator from mathematics is implemented by conj().
The answers are different; am I using the correct operator?
You have many ways to deal with that in MATLAB:
x = 5 + 2i;
y = 2 - 4i;
% Method A
(x - y) * conj(x - y);
% Method B
(x - y)' * (x - y);
% Method C
norm(x - y, 2) ^ 2;
The first method is using the Conjugate Operator.
This method is written assuming both x and y are scalar.
Method B is using the definition of Inner Product (The ' is the Vector Adjoint Operator - Transpose and Conjugate).
It will work for vectors as well.
Method C is using the built in norm() function of MATLAB.
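A quick scalar sanity check (the values below are only for illustration) shows that all three methods agree:
d = x - y;                % 3 + 6i for the x and y above
abs2_A = d * conj(d)      % 45
abs2_B = d' * d           % 45
abs2_C = norm(d, 2) ^ 2   % 45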
Enjoy.

Fresnel-Kirchhoff integral in paraxial approximation

I'd like to integrate this equation numerically.
b = 62.5*10^-6;
a = 4*10^-6;
n_co = 1.473;
n_cl = n_co - 0.016;
n_s = 1.37868*10^-5;
L = 457.9*10^-9;
k = 2*pi/L;
z = 0.4;
x = 0.3;
int(exp(-1i*(2*k*(n_cl-n_s)*(sqrt(b^2 - x_p.^2)) + 2*k*(n_co-n_cl)*(sqrt(a^2 - x_p.^2)))) ...
    * exp(1i*((x - x_p).^2)/2*z), -a, a);
So what should I do to get the result numerically? Since it contains exponential and complex functions, it is hard to evaluate.
Thanks in advance
You can integrate a function f in the interval (from, to) with integral(f, from, to). For example, you can compute the value of the integral of exp(-x^2) from x=1 to 5 with:
f = @(x) exp(-x^2);
integral(f, 1, 5)
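Applied to the integrand in the question, a minimal sketch could look like this (it assumes the constants above, and that /(2*z), rather than the original /2*z, was intended in the paraxial phase term; adjust if not):
% Numerical evaluation of the question's integral with integral()
b    = 62.5e-6;
a    = 4e-6;
n_co = 1.473;
n_cl = n_co - 0.016;
n_s  = 1.37868e-5;
L    = 457.9e-9;
k    = 2*pi/L;
z    = 0.4;
x    = 0.3;
f = @(x_p) exp(-1i*(2*k*(n_cl - n_s)*sqrt(b^2 - x_p.^2) ...
                  + 2*k*(n_co - n_cl)*sqrt(a^2 - x_p.^2))) ...
        .* exp(1i*(x - x_p).^2/(2*z));
result = integral(f, -a, a)   % complex value; integral() handles complex integrands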

Neural Networks: Sigmoid Activation Function for continuous output variable

Okay, so I am in the middle of Andrew Ng's machine learning course on Coursera and would like to adapt the neural network which I completed as part of assignment 4.
In particular, the neural network which I had completed correctly as part of the assignment was as follows:
Sigmoid activation function: g(z) = 1/(1+e^(-z))
10 output units, each of which could take the value 0 or 1
1 hidden layer
Back-propagation method used to minimize cost function
Cost function (the standard regularized cross-entropy cost from the assignment):
J = -(1/m) * sum_{i=1..m} sum_{k=1..K} [ y_k^(i) * log((h(x^(i)))_k) + (1 - y_k^(i)) * log(1 - (h(x^(i)))_k) ] + lambda/(2m) * sum_{l=1..L-1} sum_{i=1..s_l} sum_{j=1..s_(l+1)} (Theta_ji^(l))^2
where L = number of layers, s_l = number of units in layer l, m = number of training examples, K = number of output units
Now I want to adjust the exercise so that there is one continuous output unit that takes any value between [0,1], and I am trying to work out what needs to change. So far I have:
Replaced the data with my own, i.e. such that the output is a continuous variable between 0 and 1
Updated references to the number of output units
Updated the cost function in the back-propagation algorithm to:
J = 1/(2m) * sum_{i=1..m} (a_3^(i) - y^(i))^2
where a_3 is the value of the output unit determined from forward propagation.
I am certain that something else must change, because the gradient checking method shows that the gradient determined by back-propagation and the gradient from the numerical approximation no longer match up. I did not change the sigmoid gradient; it is left at f(z)*(1-f(z)), where f(z) is the sigmoid function 1/(1+e^(-z)). Nor did I update the numerical approximation of the derivative, which is simply (J(theta+e) - J(theta-e))/(2e).
Can anyone advise of what other steps would be required?
Coded in MATLAB as follows:
% FORWARD PROPAGATION
% input layer
a1 = [ones(m,1),X];
% hidden layer
z2 = a1*Theta1';
a2 = sigmoid(z2);
a2 = [ones(m,1),a2];
% output layer
z3 = a2*Theta2';
a3 = sigmoid(z3);
% BACKWARD PROPAGATION
delta3 = a3 - y;
delta2 = delta3*Theta2(:,2:end).*sigmoidGradient(z2);
Theta1_grad = (delta2'*a1)/m;
Theta2_grad = (delta3'*a2)/m;
% COST FUNCTION
J = 1/(2 * m) * sum( (a3-y).^2 );
% Implement regularization with the cost function and gradients.
Theta1_grad(:,2:end) = Theta1_grad(:,2:end) + Theta1(:,2:end)*lambda/m;
Theta2_grad(:,2:end) = Theta2_grad(:,2:end) + Theta2(:,2:end)*lambda/m;
J = J + lambda/(2*m)*( sum(sum(Theta1(:,2:end).^2)) + sum(sum(Theta2(:,2:end).^2)));
I have since realised that this question is similar to one asked by @Mikhail Erofeev on Stack Overflow; however, in my case I want the continuous variable to be between 0 and 1 and therefore use a sigmoid function.
First, your cost function should be:
J = 1/m * sum( (a3-y).^2 );
I think your Theta2_grad = (delta3'*a2)/m is expected to match the numerical approximation once delta3 is changed to delta3 = 1/2 * (a3 - y).
Check this slide for more details.
EDIT:
In case there is some minor discrepancy between our codes, I pasted my code below for your reference. The code has already been compared with the numerical approximation function checkNNGradients(lambda); the relative difference is less than 1e-4 (though it does not meet the 1e-11 requirement from Dr. Andrew Ng).
function [J grad] = nnCostFunctionRegression(nn_params, ...
                                             input_layer_size, ...
                                             hidden_layer_size, ...
                                             num_labels, ...
                                             X, y, lambda)
    % Reshape nn_params back into the weight matrices
    Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
                     hidden_layer_size, (input_layer_size + 1));
    Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
                     num_labels, (hidden_layer_size + 1));
    m = size(X, 1);
    J = 0;
    Theta1_grad = zeros(size(Theta1));
    Theta2_grad = zeros(size(Theta2));

    % Forward propagation
    X = [ones(m, 1) X];
    z1 = sigmoid(X * Theta1');
    zs = z1;
    z1 = [ones(m, 1) z1];
    z2 = z1 * Theta2';
    ht = sigmoid(z2);

    % Recode y into a 0/1 matrix, one column per label
    y_recode = zeros(length(y), num_labels);
    for i = 1:length(y)
        y_recode(i, y(i)) = 1;
    end
    y = y_recode;

    % Squared-error cost with regularization
    regularization = lambda/2/m * (sum(sum(Theta1(:,2:end).^2)) + sum(sum(Theta2(:,2:end).^2)));
    J = 1/(m) * sum(sum((ht - y).^2)) + regularization;

    % Back propagation
    delta_3 = 1/2 * (ht - y);
    delta_2 = delta_3 * Theta2(:,2:end) .* sigmoidGradient(X * Theta1');
    delta_cap2 = delta_3' * z1;
    delta_cap1 = delta_2' * X;
    Theta1_grad = ((1/m) * delta_cap1) + ((lambda/m) * Theta1);
    Theta2_grad = ((1/m) * delta_cap2) + ((lambda/m) * Theta2);

    % Do not regularize the bias columns
    Theta1_grad(:,1) = Theta1_grad(:,1) - ((lambda/m) * Theta1(:,1));
    Theta2_grad(:,1) = Theta2_grad(:,1) - ((lambda/m) * Theta2(:,1));

    grad = [Theta1_grad(:); Theta2_grad(:)];
end
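For reference, the centered-difference check described in the question, (J(theta+e) - J(theta-e))/(2e), can be written as a generic helper; this is an illustrative sketch, not the course's checkNNGradients:
function numgrad = numericalGradient(J, theta)
    % J is a handle that returns the scalar cost for a parameter vector
    e = 1e-4;
    numgrad = zeros(size(theta));
    for p = 1:numel(theta)
        perturb = zeros(size(theta));
        perturb(p) = e;
        numgrad(p) = (J(theta + perturb) - J(theta - perturb)) / (2*e);
    end
end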
If you want to have continuous output, try not to use sigmoid activation when computing the target value.
a1 = [ones(m, 1) X];
a2 = sigmoid(a1 * Theta1');
a2 = [ones(m, 1) a2];
a3 = a2 * Theta2';
ht = a3;
Normalize the input before using it in nnCostFunction. Everything else remains the same.
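For the normalization step, a minimal sketch (assuming column-wise standardization computed on the training set; reuse the same mu and sigma for any test data):
% Standardize each feature column of X before training
mu    = mean(X, 1);
sigma = std(X, 0, 1);
sigma(sigma == 0) = 1;    % guard against constant features
X = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);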

Matlab: How to solve the system of nonlinear equations with additional parameters?

I would like to create a function that finds the parameters p and q of Bass diffusion model, given the data of two time periods.
The model (equation) is the following:
n(T) = p*m + (q-p)*n(T-1) + q/m*n(T-1)^2
where
n(T) = number of adoptions occurring in period T
n(T-1) = number of cumulative adoptions that occurred before T
p = coefficient of innovation
q = coefficient of imitation
m = number of eventual adopters
for example, if m = 3,000,000
and the data for the years below is the following:
2000: n(T) = 820, n(T-1) = 0
2005: n(T) = 25000, n(T-1) = 18000
then the following equation system has to be solved (in order to determine the values of p and q):
p*m + (q-p)*0 + q/3,000,000 * 0^2 == 820
p*m + (q-p)*18000 + q/3,000,000 * 18000^2 == 25000
Following the MATLAB documentation, I tried to create a function Bass:
function F = Bass(m, p, q, cummulativeAdoptersBefore)
    F = [p*m + (q-p)*cummulativeAdoptersBefore(1) + q/m*cummulativeAdoptersBefore(1).^2;
         p*m + (q-p)*cummulativeAdoptersBefore(2) + q/m*cummulativeAdoptersBefore(2).^2];
end
This should be used in fsolve(@Bass,x0,options), but in that case m, p, q, cummulativeAdoptersBefore(1), and cummulativeAdoptersBefore(2) would all have to be packed into x0, and all of them would be treated as unknowns instead of just p and q.
Does anyone know how to solve the system of equations such as above?
Thank you!
fsolve() seeks a root of the function you supply as argument, i.e. it looks for a point where the function value is zero. Thus, you have to change your equations to
p*m + (q-p)*0 + q/3,000,000 * 0^2 - 820 == 0
p*m + (q-p)*18000 + q/3,000,000 * 18000^2 - 25000 == 0
and in Matlab syntax
function F = Bass(m, p, q, cumulativeAdoptersBefore, cumulativeAdoptersAfter)
    F = [p*m + (q-p)*cumulativeAdoptersBefore(1) ...
             + q/m*cumulativeAdoptersBefore(1).^2 ...
             - cumulativeAdoptersAfter(1);
         p*m + (q-p)*cumulativeAdoptersBefore(2) ...
             + q/m*cumulativeAdoptersBefore(2).^2 ...
             - cumulativeAdoptersAfter(2)];
end
Note: There is a typo in your Bass function (multiplication instead of sum).
Now you have a function which takes more parameters than there are unknowns.
One option is to create an anonymous function which only takes the unknowns as arguments and fixes the other parameters via a closure.
To fit the unknowns p and q, you could use something like
cumulativeAdoptersBefore = [0, 18000];
cumulativeAdoptersAfter = [820, 25000];
m = 3e6;
x0 = [0, 0]; % This is probably not a good starting guess.
xopt = fsolve(@(x) Bass(m, x(1), x(2), cumulativeAdoptersBefore, cumulativeAdoptersAfter), x0);
So fsolve() sees a function taking only a single argument (a vector with two elements) and it also returns a vector value.
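To sanity-check the result, plug xopt back into Bass; both residuals should be close to zero if fsolve converged:
res = Bass(m, xopt(1), xopt(2), cumulativeAdoptersBefore, cumulativeAdoptersAfter)
% res is approximately [0; 0] when the fitted p and q reproduce the data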