I am trying to initialize the following NN model:
def initialize_parameters(n_x, n_h, n_y):
    W1 = np.random.randn(4, 2) * 0.01
    b1 = np.zeros((4, 1))
    W2 = np.random.randn(1, 4) * 0.01
    b2 = np.zeros((1, 1))

    assert (W1.shape == (n_h, n_x))
    assert (b1.shape == (n_h, 1))
    assert (W2.shape == (n_y, n_h))
    assert (b2.shape == (n_y, 1))

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}
    return parameters
My output comes out as:
W1 = [[-0.00416758 -0.00056267]
[-0.02136196 0.01640271]
[-0.01793436 -0.00841747]
[ 0.00502881 -0.01245288]]
b1 = [[ 0.]
[ 0.]
[ 0.]
[ 0.]]
W2 = [[-0.01057952 -0.00909008 0.00551454 0.02292208]]
b2 = [[ 0.]]
Whereas the correct answer should be:
W1 [[-0.00416758 -0.00056267] [-0.02136196 0.01640271] [-0.01793436 -0.00841747] [ 0.00502881 -0.01245288]]
b1 [[ 0.] [ 0.] [ 0.] [ 0.]]
W2 [[-0.01057952 -0.00909008 0.00551454 0.02292208]]
b2 [[ 0.]]
W1 and b1 are obviously wrong, but I cannot make it work any other way. Newbie here.
It isn't accepted because you hardcoded all the values, e.g. 4 and 2 in np.random.randn(4,2) * 0.01. Use the arguments n_h, n_x, and n_y instead.
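For reference, a minimal sketch of what the fixed function could look like, with the shapes taken from the arguments instead of hardcoded (the 0.01 scaling and the parameters dictionary are kept from the original):

def initialize_parameters(n_x, n_h, n_y):
    W1 = np.random.randn(n_h, n_x) * 0.01   # (n_h, n_x) instead of the hardcoded (4, 2)
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01   # (n_y, n_h) instead of the hardcoded (1, 4)
    b2 = np.zeros((n_y, 1))

    parameters = {"W1": W1, "b1": b1, "W2": W2, "b2": b2}
    return parameters

With n_x = 2, n_h = 4, n_y = 1 this produces the same shapes as before, but the asserts now pass for any layer sizes.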
I am working through Andrew Ng's new deep learning Coursera course.
We are implementing the following code:
def forward_propagation_with_dropout(X, parameters, keep_prob=0.5):
    np.random.seed(1)

    # retrieve parameters
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]
    W3 = parameters["W3"]
    b3 = parameters["b3"]

    # LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SIGMOID
    Z1 = np.dot(W1, X) + b1
    A1 = relu(Z1)
    ### START CODE HERE ### (approx. 4 lines)  # Steps 1-4 below correspond to the Steps 1-4 described above.
    D1 = np.random.rand(*A1.shape)  # Step 1: initialize matrix D1 = np.random.rand(..., ...)
    D1 = (D1 < 0.5)                 # Step 2: convert entries of D1 to 0 or 1 (using keep_prob as the threshold)
    A1 = A1 * D1                    # Step 3: shut down some neurons of A1
    A1 = A1 / keep_prob             # Step 4: scale the value of neurons that haven't been shut down
    ### END CODE HERE ###
    Z2 = np.dot(W2, A1) + b2
    A2 = relu(Z2)
    ### START CODE HERE ### (approx. 4 lines)
    D2 = np.random.rand(*A2.shape)  # Step 1: initialize matrix D2 = np.random.rand(..., ...)
    D2 = (D2 < 0.5)                 # Step 2: convert entries of D2 to 0 or 1 (using keep_prob as the threshold)
    A2 = A2 * D2                    # Step 3: shut down some neurons of A2
    A2 = A2 / keep_prob             # Step 4: scale the value of neurons that haven't been shut down
    ### END CODE HERE ###
    Z3 = np.dot(W3, A2) + b3
    A3 = sigmoid(Z3)

    cache = (Z1, D1, A1, W1, b1, Z2, D2, A2, W2, b2, Z3, A3, W3, b3)

    return A3, cache
Calling:
X_assess, parameters = forward_propagation_with_dropout_test_case()
A3, cache = forward_propagation_with_dropout(X_assess, parameters, keep_prob = 0.7)
print ("A3 = " + str(A3))
My output was:
A3 = [[ 0.36974721 0.49683389 0.04565099 0.49683389 0.36974721]]
The expected output should be:
A3 = [[ 0.36974721 0.00305176 0.04565099 0.49683389 0.36974721]]
Only one number is different. Any ideas why?
I think it is because of the way I shaped D1 and D2.
I think it is because you put D1 = (D1 < 0.5) and D2 = (D2 < 0.5).
You need to use keep_prob instead of 0.5.
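Concretely, the mask lines for the first hidden layer would then read as in this minimal sketch (the same change applies to D2):

D1 = np.random.rand(*A1.shape)   # Step 1: random matrix with the same shape as A1
D1 = (D1 < keep_prob)            # Step 2: keep each unit with probability keep_prob
A1 = A1 * D1                     # Step 3: shut down the units where the mask is 0
A1 = A1 / keep_prob              # Step 4: rescale so the expected activation is unchanged

With a hardcoded 0.5 the mask no longer depends on keep_prob, which is why the test case with keep_prob = 0.7 produces a different A3.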
A = rand(4,2);
B = rand(4,3)
Now, after performing some operations on B (roots, derivatives, etc.), we get a new matrix B1 whose dimensions satisfy size(B1) = size(B).
The operation I want to perform is
B.' * ( A - B1.')
That is, when each element of B.' multiplies A, the corresponding element of B1 should be subtracted from A before the multiplication.
The final dimensions need to be those of the usual product B.' * A.
Note: the dimensions of the initialized matrices change at each run, so no manual operations.
EXAMPLE
Let's say we have
A = 2x2
[ x1, x2 ]
[ y1, y2 ]
and
B = 2x1
[a1]
[b1]
and
B1 = 2x1
[a11]
[b11]
So a simple multiplication of B.' * A gives
[(a1 * x1 + b1 * y1), (a1 * x2 + b1 * y2)]
I want to subtract B1 such that
[ (a1 * (x1-a11) + b1 * (y1-b11)), (a1 * (x2-a11) + b1 * (y2-b11))]
Example inputs of different size:
INPUTS
B =
[ a1 b1;
a2 b2;
a3 b3;
a4 b4]
A =
[ x11 x12 x13;
x21 x22 x23;
x31 x32 x33;
x41 x42 x43]
B1 =
[a10 b10;
a20 b20;
a30 b30;
a40 b40]
Result =
[b1(x11-b10)+b2(x21-b20)+b3(x31-b30)+b4(x41-b40) b1(x12-b10)+b2(x22-b20)+b3(x32-b30)+b4(x42-b40) b1(x13-b10)+b2(x23-b20)+b3(x33-b30)+b4(x43-b40);
a1(x11-a10)+a2(x21-a20)+a3(x31-a30)+a4(x41-a40) a1(x12-a10)+a2(x22-a20)+a3(x32-a30)+a4(x42-a40) a1(x13-a10)+a2(x23-a20)+a3(x33-a30)+a4(x43-a40)]
I assumed that size(B,2) >= size(A,2):
A = rand(4,2);
B = rand(4,3);
B1 = rand(size(B)).*B;
res = B' * ( A - B1(:,1:size(A,2)))
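For anyone following along in NumPy rather than MATLAB, a rough sketch of that last line under the same assumption that size(B,2) >= size(A,2) (the random matrices here are only placeholders for the real inputs):

import numpy as np

A = np.random.rand(4, 2)
B = np.random.rand(4, 3)
B1 = np.random.rand(*B.shape) * B

# Subtract the first A.shape[1] columns of B1 from A, then multiply by B transposed;
# the result has the shape of B.' * A, here (3, 2).
res = B.T @ (A - B1[:, :A.shape[1]])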
See the code below:
import numpy as np
from scipy.integrate import odeint

# Define constants
U = 1000
a = 5
Ta0 = (37 + 273)
V_tot = 6
FA0 = 14.7
CA0 = 9.3
deltaCp = 0
Hrx_T = -6900
sumTheta_Cp = 159

# Integration conditions
V = np.linspace(0, V_tot, 10)
X0 = 0
T0 = 310

# Define reaction rate constant as a function of temperature
def k(T):
    return 31.1*np.exp(7906*((T-360)/(T*360)))

def Kc(T):
    return 3.03*np.exp(-830.3*((T-333)/(T*333)))

# Define rate law as a function of temperature and conversion
def rA(T, X):
    return -k(T)*CA0*(1-(1+(1/Kc(T))*X))

# Energy balance equation for tubular reactor
def ode_vector(diff_var, V):
    T = diff_var[0]
    X = diff_var[1]
    dT_dV = (U*a*(Ta0-T)+rA(T,X)*Hrx_T/(FA0*sumTheta_Cp+deltaCp*X))
    dX_dV = -rA(T,X)/FA0
    d_dV = [dT_dV, dX_dV]
    return d_dV

initial_vector = [T0, X0]
solution = odeint(ode_vector, initial_vector, V)
print solution
When I execute this code, I get an array of my initial conditions:
[[ 310. 0.]
[ 310. 0.]
[ 310. 0.]
[ 310. 0.]
[ 310. 0.]
[ 310. 0.]
[ 310. 0.]
[ 310. 0.]
[ 310. 0.]
[ 310. 0.]]
Any thoughts on this would be much appreciated.
I have A, B, and F matrices as follows:
A = [ 2 1
      1 2 ];
B = [ 4; 3 ];
F = [ 1; 1 ];
LB = [ 0; 0 ];
[X, fval, exitflag, output, lambda] = linprog(F, A, B, [], [], LB)
After that, the solution provided by MATLAB is surprising: it says the value of fval is 1.2169e-013. This is not the solution. Can you help me by identifying the fault?
This is probably very simple, but I'm having trouble setting up matrices to solve two linear equations using symbolic objects.
The equations are of the form:
(1) a11*x1 + a12*x2 + b1 = 0
(2) a21*x1 + a22*x2 + b2 = 0
So I have a vector {E}:
{E} = [ a11*x1 + a12*x2 + b1 ]
      [ a21*x1 + a22*x2 + b2 ]
I want to get a matrix [A] and a vector {B} so I can solve the equations, i.e.
[A]*{X} + {B} = 0 => {X} = -[A]^(-1)*{B}.
Where
{X} = [ x1 ]
      [ x2 ]

[A] = [ a11 a12 ]
      [ a21 a22 ]

{B} = [ b1 ]
      [ b2 ]
Matrix [A] is just the Jacobian matrix of {E} but what operation do I have to perform on {E} to get {B}, i.e. the terms that don't include an x?
This is what I have done:
x = sym('x', [2 1]);
a = sym('a', [2 2]);
b = sym('b', [2 1]);
E = a*x + b;
A = jacobian(E,x);
n = length(E);
B = -E;
for i = 1:n
for j = 1:n
B(i) = subs(B(i), x(j), 0);
end
end
X = A\B
I'm thinking there must be some function that does this in one line.
So basically my question is: what can I do instead of those for loops?
(I realize this is something very simple and easily found by searching. The problem is I don't know what this is called so I don't know what to look for.)
It is just B = subs(B,x,[0 0])
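For completeness, a sketch of the same idea in SymPy (a translation for Python readers, not part of the original MATLAB question; the symbol names mirror the question):

import sympy as sp

x1, x2 = sp.symbols('x1 x2')
a11, a12, a21, a22, b1, b2 = sp.symbols('a11 a12 a21 a22 b1 b2')

X = sp.Matrix([x1, x2])
E = sp.Matrix([a11*x1 + a12*x2 + b1,
               a21*x1 + a22*x2 + b2])

A = E.jacobian(X)              # [A], the coefficient matrix
B = E.subs({x1: 0, x2: 0})     # {B}, the terms without any x, in one call
sol = A.solve(-B)              # {X} = -[A]^(-1)*{B}
print(sol)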