I am trying to solve a model of 4 equations, where everything is real. It is essentially a pipe branch problem with "1" being upstream of the pump (with some head on it), and "2" and "3" are downstream w/o head. The units are English.
Anyway, I'm just solving the system of 4 equations and 4 unknowns and looking at a plot of head (ha) vs. total flow rate (Q1). Across this space though I only am collecting points in a vertical line, i.e. the head changes but the flowrate doesn't.
I was thinking the problem may be some variables are being held onto, but I treat "ha" the same as "Q1" so that doesn't seem likely... Realistically, this is matching a system curve to a pump curve, and of course that point moves when the system changes (both flowrate and head), so it seems I am doing something wrong....
Here is the script:
clear sol1 sol2 sol3 sol4 S q h
clc; clearvars; close all;
syms Q1 Q2 Q3 ha
D1 = 1.25/12; Z1 = 0; P1 = 62.4*10; A1 = pi*(1/4)*D1*D1; f1 = 0.011; L1 = 50;
D2 = 0.5/12; Z2 = 25; P2 = 0; A2 = pi*(1/4)*D2*D2; f2 = 0.011; L2 = 100; %V2 = Q2 / A2;
D3 = 0.5/12; Z3 = 10; P3 = 0; A3 = pi*(1/4)*D3*D3; f3 = 0.011; L3 = 100; %V3 = Q3 / A3;
gamma = 62.468;
origPumpData = [0 150; 0.4 148; 0.8 145; 1.2 143; 1.6 141; 2 139; 2.4 138; 2.8 136; 3.2 134; 3.6 132; 4 130; 4.4 128; 4.8 127; 5.2 125; 5.6 124; 6 121; 6.4 118; 6.8 117; 7.2 115; 7.6 113; 8 111; 8.4 110; 8.8 108; 9.2 106; 9.6 104; 10 102; 10.4 100; 10.8 98; 11.2 97;
11.6 94; 12.0 93; 12.4 91; 12.8 89; 13.2 87; 13.6 86; 14 84; 14.4 82; 14.8 80; 15.2 78; 15.6 76; 16 74; 16.4 72; 16.6 70];
pumpData(:,1) = origPumpData(:,1)/448.8312;
pumpData(:,2) = origPumpData(:,2);
polyFit = fit(pumpData(:,1),pumpData(:,2),'poly2');
assume(in(Q1,'real') & Q1 > 0.0001 & Q1 < 1 );
assume(in(Q2,'real') & Q2 > 0.0001 & Q2 < 1 );
assume(in(Q3,'real') & Q3 > 0.0001 & Q3 < 1 ) ;
assume(in(ha,'real') & ha > 0.0001 & ha < 1000.0);
eq1 = #(Q1) polyFit.p1*Q1*Q1 + polyFit.p2*Q1 + polyFit.p3;
eq2 = #(P1,Z1,F1,Q1,A1,L1,D1,P2,Z2,F2,Q2,A2,L2,D2) P1/gamma + ((Q1/A1)^2)/64.4 + Z1 - hf(F1,L1,D1,Q1,A1) - hf(F2,L2,D2,Q2,A2) + ha == P2/gamma + ((Q2/A2)^2)/64.4 + Z2;
eq3 = #(P1,Z1,F1,Q1,A1,L1,D1,P3,Z3,F3,Q3,A3,L3,D3) P1/gamma + ((Q1/A1)^2)/64.4 + Z1 - hf(F1,L1,D1,Q1,A1) - hf(F3,L3,D3,Q3,A3) + ha == P3/gamma + ((Q3/A3)^2)/64.4 + Z3;
eq4 = Q1 == Q2 + Q3;
guesses = [0.07116936917086152675558661517184;
0.035125850031368916048762418970837;
0.036043519139492610706824196201003;
303.02361035523126099594683749354];
for i = 1:1:2
disp(i)
disp(Z1)
Z1 = Z1-5
f1 = f1+0.01
f3 = f3 + 0.01
S(i) = vpasolve([eq1(Q1) eq2(P1,Z1,f1,Q1,A1,L1,D1,P2,Z2,f2,Q2,A2,L2,D2) eq3(P1,Z1,f1,Q1,A1,L1,D1,P3,Z3,f3,Q3,A3,L3,D3) eq4], [Q1 Q2 Q3 ha], guesses )
S(i).Q1
S(i).Q2
S(i).Q3
S(i).ha
q(i) = S(i).Q1
h(i) = S(i).ha
guesses = [S(i).Q1 S(i).Q2 S(i).Q3 S(i).ha]'
clear S
end
plot( q, h, '*b');
function hf = hf(f,L,D,Q,A)
hf = f*(L/D)*(1/64.4)*(Q/A)^2;
end
you're solving eq1 == 0(default) and computing Q1, independent of other equations and parameters. that's why you get constant Q1 in every iteration.
I want to use solve() to solve a large system of linear equations. The solve() function needs equations and variables. I use a loop to generate the equations and my variables are contained in a large array. This is a simple code of what I am trying to do:
x = sym('x',[1 3])
eqn = sym('eqn', [1,3])
eqn1 = 2*x1 + x2 + x3 - 2 == 0
eqn2 = 2*x2 -x2 -x1- x3 == 3
eqn3 = 2*x2+ x1 + 3*x3 == -10
Y = solve(eqn, x)
MATLAB does not recognize my variable x1. I have solved the same system using the following code:
syms x1 x2 x3
eqn1 = 2*x1 + x2 + x3 == 2
eqn2 = 2*x2 -x2 -x1 - x3 == 3
eqn3 = 2*x2+ x1 + 3*x3 == -10
X = solve([eqn1, eqn2, eqn3, eqn4], [x1 x2 x3])
structfun(#subs,X)
But this is useless for a very large number of equations. What am I doing wrong?
You don't need symbolic (syms) for that. This is a standard linear system of equations that can be represented as:
Ax = b where A = [2 1 1; -1 1 -1; 1 2 3], x = [x1; x2; x3] and b = [0; 3; -10]
To solve for x, you would first define
A = [2 1 1; -1 1 -1; 1 2 3]
and
b = [0; 3; -10]
and then solve using
x = A\b
PS. There are some odd things in your question, eg. in eq.2 eqn2 = 2*x2 -x2 -x1- x3 == 3 I assume you omitted simplying this to -x1 +x2 -x3 ==3.
PS2. This is pretty standard Matlab, you can find a lot of info under the standard mldivide page in the documentation along with a lot similar questions here on SO.
This is my matlab code to predict [1;1;1] given [1;0;1] :
m = 1;
alpha = .00001;
x = [1;0;1;0;0;0;0;0;0];
y = [1;1;1;0;0;0;0;0;0];
theta1 = [4.7300;3.2800;1.4600;0;0;0;4.7300;3.2800;1.4600];
theta1 = theta1 - (alpha/m .* (x .* theta1-y)' * x)'
theta1 = reshape(theta1(1:9) , 3 , 3)
sigmoid(theta1 * [1; 0; 1])
x = [1;0;1;0;0;0;0;0;0];
y = [1; 1; 1;0;0;0;0;0;0];
theta2 = [8.892;6.167;2.745;8.892;6.167;2.745;8.892;6.167;2.745];
theta2 = theta2 - (alpha/m .* (x .* theta2-y)' * x)'
theta2 = reshape(theta2(1:9) , 3 , 3)
sigmoid(theta2 * [1; 0; 1])
x = [1;0;1;0;0;0;0;0;0];
y = [1; 1; 1;0;0;0;0;0;0];
theta3 = [9.446;6.55;2.916;9.351;6.485;2.886;8.836;6.127;2.727];
theta3 = theta3 - (alpha/m .* (x .* theta3-y)' * x)'
theta3 = reshape(theta3(1:9) , 3 , 3)
sigmoid(theta3 * [1; 0; 1])
I'm computing theta1, theta2, theta3 individually but I think they
should be linked between each computation ?
Though gradient descent appears to be working as :
sigmoid(theta1 * [1; 0; 1]) =
0.9999
0.9986
0.9488
sigmoid(theta2 * [1; 0; 1]) =
1.0000
1.0000
0.9959
sigmoid(theta3 * [1; 0; 1]) =
1.0000
1.0000
0.9965
This shows for each theta value (layer in the network) the prediction is moving closer to [1;1;1]
Update : sigmoid function :
function g = sigmoid(z)
g = 1.0 ./ (1.0 + exp(-z));
end
Update2 :
After extended discussion with user davidhigh who provided key insights have made following changes :
x = [1;0;1];
y = [1;1;1];
theta1 =
4.7300 3.2800 1.4600
0 0 0
4.7300 3.2800 1.4600
theta2 =
8.8920 8.8920 8.8920
6.1670 6.1670 6.1670
2.7450 2.7450 2.7450
theta3 =
9.4460 6.5500 2.9160
9.3510 6.4850 2.8860
8.8360 6.1270 2.7270
The crux of my issue is that I don't feed output of each layer into the next layer, once I made this change I get better result :
z1 = sigmoid(theta1 * x)
z1 =
0.9980
0.5000
0.9980
z2 = sigmoid(theta2 * z1)
z2 =
1.0000
1.0000
0.9989
z3 = sigmoid(theta3 * z2)
z3 =
1.0000
1.0000
1.0000
z3 is the predicted value which correctly is [1;1;1;] whereas previously it is approx [1;1;1;]
It might seem a simple question. I need it, though. Let's assume we have two equations:
2 * y + x + 1 = 0
and
y - 2 * x = 0
I would like to find their bisection which can be calculated from this equation:
|x + 2 * y + 1| |-2 *x + y |
------------------- = -----------------
(sqrt(2^2 + 1^2)) (sqrt(1^2 + 2^2))
To make the long story short, we only need to solve this below system of equation:
2 * y + x + 1 = -2 *x + y
and
2 * y + x + 1 = 2 *x - y
However, using solve function of MATLAB:
syms x y
eqn1 = 2 * y + x + 1 == -2 *x + y ;
eqn2 = 2 * y + x + 1 == 2 *x - y ;
[x, y] = solve (eqn1 , eqn2, x, y) ;
Will give me:
x = -1/5 and y = -2/5
But, I am looking for the result equations, which is:
y = -3 * x - 1 and 3 * y = 2 * x - 1
So, does anyone know how I can get the above line equation instead of the result point? Thanks,
The following should solve both equations with y on the left-hand-side:
y1 = solve(eqn1,y)
y2 = solve(eqn2,y)
Result:
y1 =
- 3*x - 1
y2 =
x/3 - 1/3
As an aside, it would be much faster to solve this system by thinking of it it as a matrix inversion problem Ax=b rather than using MATLAB's symbolic tools:
A = [1 2; -2 1];
b = [-1; 0];
x = A\b
Result:
x =
-0.2000
-0.4000
So, I'm hoping this is a real dumb thing I'm doing, and there's an easy answer. I'm trying to train a 2x3x1 neural network to do the XOR problem. It wasn't working, so I decided to dig in to see what was happening. Finally, I decided to assign the weights my self. This was the weight vector I came up with:
theta1 = [11 0 -5; 0 12 -7;18 17 -20];
theta2 = [14 13 -28 -6];
(In Matlab notation). I deliberately tried to make no two weights be the same (barring the zeros)
And, my code, really simple in matlab is
function layer2 = xornn(iters)
if nargin < 1
iters = 50
end
function s = sigmoid(X)
s = 1.0 ./ (1.0 + exp(-X));
end
T = [0 1 1 0];
X = [0 0 1 1; 0 1 0 1; 1 1 1 1];
theta1 = [11 0 -5; 0 12 -7;18 17 -20];
theta2 = [14 13 -28 -6];
for i = [1:iters]
layer1 = [sigmoid(theta1 * X); 1 1 1 1];
layer2 = sigmoid(theta2 * layer1)
delta2 = T - layer2;
delta1 = layer1 .* (1-layer1) .* (theta2' * delta2);
% remove the bias from delta 1. There's no real point in a delta on the bias.
delta1 = delta1(1:3,:);
theta2d = delta2 * layer1';
theta1d = delta1 * X';
theta1 = theta1 - 0.1 * theta1d;
theta2 = theta2 - 0.1 * theta2d;
end
end
I believe that's right. I tested various parameters (of the thetas) with the finite differences method to see if they were right, and they seemed to be.
But, when I run it, it eventually just all boils down to returning all zeros. If I do xornn(1) (for 1 iteration) I get
0.0027 0.9966 0.9904 0.0008
But, if I do xornn(35)
0.0026 0.9949 0.9572 0.0007
(It's started a descent in the wrong direction) and by the time I get to xornn(45) I get
0.0018 0.0975 0.0000 0.0003
If I run it for 10,000 iterations, it just returns all 0's.
What is going on? Must I add regularization? I would have thought such a simple network wouldn't need it. But, regardless, why does it move away from an obvious good solution that I have hand fed it?
Thanks!
AAARRGGHHH! The solution was simply a matter of changing
theta1 = theta1 - 0.1 * theta1d;
theta2 = theta2 - 0.1 * theta2d;
to
theta1 = theta1 + 0.1 * theta1d;
theta2 = theta2 + 0.1 * theta2d;
sigh
Now tho, I need to figure out how I'm computing the negative derivative somehow when what I thought I was computing was the ... Never mind. I'll post here anyway, just in case it helps someone else.
So, z = is the sum of inputs to the sigmoid, and y is the output of the sigmoid.
C = -(T * Log[y] + (1-T) * Log[(1-y))
dC/dy = -((T/y) - (1-T)/(1-y))
= -((T(1-y)-y(1-T))/(y(1-y)))
= -((T-Ty-y+Ty)/(y(1-y)))
= -((T-y)/(y(1-y)))
= ((y-T)/(y(1-y))) # This is the source of all my woes.
dy/dz = y(1-y)
dC/dz = ((y-T)/(y(1-y))) * y(1-y)
= (y-T)
So, the problem, is that I accidentally was computing T-y, because I forgot about the negative sign in front of the cost function. Then, I was subtracting what I thought was the gradient, but was in fact the negative gradient. And, there. That's the problem.
Once I did that:
function layer2 = xornn(iters)
if nargin < 1
iters = 50
end
function s = sigmoid(X)
s = 1.0 ./ (1.0 + exp(-X));
end
T = [0 1 1 0];
X = [0 0 1 1; 0 1 0 1; 1 1 1 1];
theta1 = [11 0 -5; 0 12 -7;18 17 -20];
theta2 = [14 13 -28 -6];
for i = [1:iters]
layer1 = [sigmoid(theta1 * X); 1 1 1 1];
layer2 = sigmoid(theta2 * layer1)
delta2 = T - layer2;
delta1 = layer1 .* (1-layer1) .* (theta2' * delta2);
% remove the bias from delta 1. There's no real point in a delta on the bias.
delta1 = delta1(1:3,:);
theta2d = delta2 * layer1';
theta1d = delta1 * X';
theta1 = theta1 + 0.1 * theta1d;
theta2 = theta2 + 0.1 * theta2d;
end
end
xornn(50) returns 0.0028 0.9972 0.9948 0.0009 and
xornn(10000) returns 0.0016 0.9989 0.9993 0.0005
Phew! Maybe this will help someone else in debugging their version..