I'm trying to build a neural network that takes in the solutions to a system of ODEs and predicts the parameters of the system. I'm using Julia, and in particular the DiffEqFlux package. The structure of the network is a few simple Dense layers chained together, which predict some intermediate parameters (in this case, some chemical reaction free energies) that then feed into some deterministic (non-trained) layers converting those parameters into the ones that go into the system of equations (in this case, reaction rate constants). I've tried two different approaches from here:
1. Chain the ODE solve on directly as the last layer of the network. In this case, the loss function just compares the inputs to the outputs.
2. Have the ODE solve in the loss function, so the network output is just the parameters.
However, in neither case can I get Flux.train! to actually run.
A silly little example for the first option that gives the same error I'm getting is shown below (I've tried to keep as many things parallel to my actual case as possible, e.g. the solver, though I did omit the intermediate deterministic layers since they don't seem to make a difference).
using Flux, DiffEqFlux, DifferentialEquations
# let's use Chris' favorite example, Lotka-Volterra
function lotka_volterra(du,u,p,t)
x, y = u
α, β, δ, γ = p
du[1] = dx = α*x - β*x*y
du[2] = dy = -δ*y + γ*x*y
end
u0 = [1.0,1.0]
tspan = (0.0,10.0)
# generate a couple sets of solutions to train on
training_params = [[1.5,1.0,3.0,1.0], [1.4,1.1,3.1,0.9]]
training_sols = [solve(ODEProblem(lotka_volterra, u0, tspan, tp)).u[end] for tp in training_params]
model = Chain(Dense(2,3), Dense(3,4), p -> diffeq_adjoint(p, ODEProblem(lotka_volterra, u0, tspan, p), Rodas4())[:,end])
# in this case we just want outputs to match inputs
# (actual parameters we're after are outputs of next-to-last layer)
training_data = zip(training_sols, training_sols)
# mean squared error loss
loss(x,y) = Flux.mse(model(x), y)
p = Flux.params(model[1:2])
Flux.train!(loss, p, training_data, ADAM(0.001))
# gives TypeError: in typeassert, expected Float64, got ForwardDiff.Dual{Nothing, Float64, 8}
I've tried all three solver layers, diffeq_adjoint, diffeq_rd, and diffeq_fd, none of which work, but all of which give different errors that I'm having trouble parsing.
For the other option (which I'd actually prefer, but either way would work), just replace the model and loss function definitions with:
model = Chain(Dense(2,3), Dense(3,4))
function loss(x,y)
p = model(x)
sol = diffeq_adjoint(p, ODEProblem(lotka_volterra, u0, tspan, p), Rodas4())[:,end]
Flux.mse(sol, y)
end
The same error is thrown as above.
I've been hacking at this for over a week now and am completely stumped; any ideas?
You're running into https://github.com/JuliaDiffEq/DiffEqFlux.jl/issues/31, i.e. forward-mode AD for the Jacobian doesn't play nicely with Flux.jl right now. To get around this, use Rodas4(autodiff=false) instead, so the solver call becomes diffeq_adjoint(p, ODEProblem(lotka_volterra, u0, tspan, p), Rodas4(autodiff=false))[:,end].
Related
I have a differential equation: dx/dt = a * x. Using Matlab Simulink, I need to solve this equation and output it using Scope block.
The problem is, I don't know how to specify an initial condition value t = 0.
So far I have managed to create a solution that looks like this:
I know that inside the integrator there is a possibility to set "Initial condition", but I can't figure out how that affects the final result. How can I simply set the value of x at t = 0, i.e. x(0) = 6?
Let's work this problem through analytically first so we know if the model is correct.
dx/dt = a*x % Separable differential equation
=> (1/x) dx = a dt % Now we can integrate
=> ln(x) = a*t + c % We can determine c using the initial condition x(0)
=> ln(x0) = a*0 + c
=> ln(x0) = c
=> x = exp(a*t + ln(x0)) % Subbing into 3rd line and taking exp of both sides
=> x = x0 * exp(a*t)
So now we have an idea. Let's look at this for t = 0 .. 1, x0 = 6, a = 5:
% Plot x vs t using plain MATLAB
x0 = 6; a = 5;
t = 0:1e-2:1; x = x0*exp(a*t);
plot(t,x)
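As a quick numerical cross-check (my addition, not part of the original answer), the same IVP can also be handed to ode45 in plain MATLAB before moving to Simulink:

% Numerical cross-check with ode45
[t, x] = ode45(@(t,x) 5*x, [0 1], 6);   % dx/dt = a*x with a = 5, x(0) = 6
plot(t, x)                              % should match the analytical curve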
Now let's make a Simulink model which acts as a numerical integrator. We don't actually need the Integrator block for this application, we simply want to add the change at each time step!
To run this, we must first set a couple of things up. In Simulation > Model Configuration Parameters, we must set the time step to match the time step we've used to switch between dx/dt and dx (2nd Gain block).
Lastly, we must set the initial condition x0; this can be done in the Memory block.
Setting the end time to 1s and running the model, we see the expected result in the Scope. Because it matches our analytical solution, we know it is correct.
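In plain MATLAB, the Memory-plus-Gain loop amounts to forward Euler; here is a small sketch of what it computes (my own illustration, assuming the fixed step dt = 1e-2 set above):

% Forward-Euler equivalent of the Memory + Gain loop
x0 = 6; a = 5; dt = 1e-2;
t = 0:dt:1;
x = zeros(size(t)); x(1) = x0;     % the Memory block holds the previous x
for k = 1:numel(t)-1
    x(k+1) = x(k) + a*x(k)*dt;     % add the change dx = a*x*dt each step
end
plot(t, x)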
Now we understand what's going on, we can re-introduce the integration block to make the model more flexible. Using the integrator means that dt is automatically calculated, and we don't need to micro-manage the Gain block; in fact we can get rid of it. We still need a Memory block though. We now also need initial conditions in both the integrator and the memory block. Put scopes in different locations and just complete the first few time steps to work out why!
Note that the initial conditions are less clear when using the integrator block.
The way to think about the integrator block is either completely in the Laplace picture, or as representing the equivalent integral equation of the IVP: the problem
y'(t) = f(t, y(t)), y(0) = y_0
is equivalent to
y(t) = y_0 + int(s=0 to t) f(s, y(s)) ds
The feed-back loop in the block diagram realizes almost literally this fixed-point equation for the solution function.
So there is no need for complicated constructions and extra blocks.
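To make the fixed-point idea concrete, here is a small MATLAB sketch (my own illustration, reusing the a = 5, x0 = 6 example from above) of Picard iteration on that integral equation:

% Picard iteration: repeatedly feed y back through the integral
f  = @(t, y) 5*y;             % right-hand side f(t,y) = a*y with a = 5
y0 = 6;
t  = linspace(0, 1, 101);
y  = y0*ones(size(t));        % initial guess: the constant function y0
for iter = 1:20
    y = y0 + cumtrapz(t, f(t, y));   % y <- y0 + int_0^t f(s, y(s)) ds
end
plot(t, y)                    % approaches y0*exp(5*t), as derived above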
I have a program that runs ode15s a few thousand times in order to find a particular solution. However, I'm getting many integration tolerance errors such as the following:
"Warning: Failure at t=5.144337e+02. Unable to meet integration tolerances without reducing the step size below the smallest value allowed (1.818989e-12) at time t."
Such warnings cause the program to slow down drastically, and sometimes even grind to a complete halt. The following is some test code that reproduces the error:
%Simulation constants
G = 6.672e-11; %Gravitational constant
M = 6.39e23; %Mass of Mars (kg)
g = 9.81; %Gravitational acceleration on Earth (m/s^2);
T1 = 845000/3; %Total engine thrust, 1 engine (N)
Isp = 282; %Engine specific impulse (s)
mdot1 = T1/(g*Isp); %Engine mass flow rate (kg/s)
xinit_on2 = [72368.8347685214;
3384891.40103322;
-598.36623436025;
-1440.49702235844;
16330.430678033]
tspan_on2 = [436.600093957202, 520.311296453027];
[t3,x3] = ode15s(@(t,x) engine_on_2(t, x, G, g, M, Isp, T1), tspan_on2, xinit_on2)
where the function engine_on_2 contains the system of ODEs that model the descent of a rocket, and is given by,
function xdot = engine_on_2(t, x, G, g, M, Isp, T1)
gamma = atan2(x(4),x(3)); %flight-path angle
xdot = [x(3); %xdot1: x-velocity
x(4); %xdot2: y-velocity
-(G*M*x(1))/((x(1)^2+x(2)^2)^(3/2))-(T1/x(5))*cos(gamma); %xdot3: x-acceleration
-(G*M*x(2))/((x(1)^2+x(2)^2)^(3/2))-(T1/x(5))*sin(gamma); %xdot4: y-acceleration
-T1/(g*Isp)]; %xdot5: rate of change of mass (negative of engine mass flow rate)
end
Having done some testing, it seems that I am getting the integration tolerance warnings because of the use of the atan2 function in gamma = atan2(x(4),x(3)), which is used to calculate the flight-path angle of the rocket. If I change atan2 to another function (for example cos or sin), I don't get any integration tolerance warnings anymore (although, due to such a change, my solutions are obviously incorrect). As such, I was wondering if I am using atan2 incorrectly, or if there is a way to implement it differently so that I no longer get the integration tolerance errors. Furthermore, could it be that I am mistaken and that something other than atan2 is causing the errors?
Use the odeset function to create an options structure that you then pass to the solver. RelTol and AbsTol can be adjusted in the ODE solver to eliminate your error. I was able to run your code using this addition without any errors:
options = odeset('RelTol',1e-13,'AbsTol',1e-20)
[t3,x3] = ode15s(@(t,x) engine_on_2(t, x, G, g, M, Isp, T1), tspan_on2, xinit_on2, options)
See how the options are passed to the ODE solver as a 4th input parameter. Note that RelTol has a lower bound near machine precision (around 2.2e-14), so 1e-13 is close to the tightest setting allowed, but hopefully that's fine for your application. Also, you can try any of the other ODE solvers, which might get rid of your error, but from my playing around ode15s seems quite fast.
I want to solve a system of linear equations of the type Ax = b+u, where A and b are known. I used a function in MATLAB like this:
x = @(u) gmres(A,b+u);
Then I used fmincon, where a value for u is given to this expression and x is computed. For example
J = @(u) (x(u)' * x(u) - x^*)^2
and
[J^*,u] = fmincon(J,...);
with the dots standing for the matrices and vectors of the equality and inequality constraints.
My problem is that MATLAB always prints output with information about the gmres command, and I have no idea how to stop this (it makes the program much slower).
I hope you know an answer.
Patsch
It's a little hidden in the documentation, but it does say
No messages are displayed if the flag output is specified.
So you need to call gmres with at least two outputs. You can do this by making a wrapper function
function x = gmresnomsg(varargin)
[x,~] = gmres(varargin{:});
end
and use that for your handle creation
x = #(u) gmresnomsg(A,b+u);
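(The wrapper matters because calling x(u) with a single output would invoke gmres with a single output, and gmres only stays quiet when the flag output is also requested; the wrapper always requests it and discards it.)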
I need to obtain the values of a, b and c in the following equations so that the step response of the system matches that of the figure below.
x_dot = a*x + b*u;
y = c*x;
Where x_dot is the first derivative of x.
I have been trying to achieve this through Matlab and have so far achieved the following, using just arbitrary values for a, b and c for testing purposes:
clc;
close all;
clear all;
a=1;
b=2;
c=3;
tspan = [0:0.01:12];
% The next lines are the intended model in symbolic form; x and u are
% not defined at this point, so they are commented out:
% x_dot = a*x + b*u;
% x = (a*x^2)/2 + b*u*x;
% y = c*x;
f = @(t,x) [a*x(1)+b*x(2); c*x(1)];
[t, xa] = ode45(f,tspan,[0,0]);
plot(t,xa(:,1));
This certainly sounds like a parameter estimation problem as already hinted. You want to minimise the error between the outcome modelled using your ode and the values in your graph (fitting your three parameters a,b and c to the data).
A first step is to write an error function that takes the ode output values and compares how close they are to the data values (the sum of squared errors, for instance).
Then you have to search through a range of a,b,c values (which may be a large search space) and you pick the set of (a,b,c) which minimise your error function (i.e. get as close to your graph as possible).
Many search/optimisation strategies exist (e.g. genetic algorithms, etc.).
Please note that the parameters are real numbers (which includes negative values and extremely large or small values); the large search space is usually what makes these problems difficult to solve. A minimal sketch of this approach is shown below.
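Here is a minimal sketch of that idea (my own illustration, not from the original answer), assuming the target curve from the figure has been sampled into vectors tdata and ydata (hypothetical names):

% Error function: sum-of-squares distance between model output and data
function e = sse(p, tdata, ydata, x0)
    f = @(t,x) [p(1)*x(1) + p(2)*x(2); p(3)*x(1)];  % same model as above
    [~, xa] = ode45(f, tdata, x0);                  % solve on the data's time grid
    e = sum((xa(:,1) - ydata).^2);                  % how far off are we?
end

and then, for example:

p_best = fminsearch(@(p) sse(p, tdata, ydata, [0; 10]), [-0.5; 0.2; -1e-8])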
Also, I think you have to be careful with initial conditions; e.g. [0,0] doesn't seem to lead to interesting results. (Try a = -0.5, b = 0.2 and c = -0.00000001, with an IC of [0,10], as below.)
clc;
close all;
clear all;
a=-0.5;
b=0.2;
c=-0.00000001;
tspan = [0:0.01:12];
f = @(t,x) [a*x(1)+b*x(2); c*x(1)];
[t, xa] = ode45(f,tspan,[0,10]);
plot(t,xa);
hold on
plot(t, 4*ones(size(t)))  % constant reference line at 4
Here 10 is the starting point of the green line and the blue line starts at 0. What I would also note is that the ICs change the results, so there are many possible solutions for a, b, c for a given IC.
Looks interesting... good luck.
This question is somewhat related to a previous question of mine, where I didn't quite get the right solution. Link: Earlier SO-thread
I am solving PDEs which are time variant with one spatial dimension (e.g. the heat equation - see link below). I'm using the numerical method of lines, i.e. discretizing the spatial derivatives yielding a system of ODEs which are readily solved in Modelica (using the Dymola tool). My problems arise when I simulate the system, or when I plot the results, to be precise. The equations themselves appear to be solved correctly, but I want to express the spatial changes in all the discretized state variables at specific points in time rather than the individual time-varying behavior of each discrete state.
The strategy leading up to my problems is illustrated in this Youtube tutorial, which by the way is not made by me. As you can see at the very end of the tutorial, the time-varying behavior of the temperature is plotted for all the discrete points in the rod, individually. What I would like is a plot showing the temperature through the rod at a specific time, that is the temperature as a function of the spatial coordinate. My strategy to achieve this, which I'm struggling with, is: Given a state vector of N entries:
Real[N] T "Temperature";
..I would use the plotArray Dymola function as shown below.
plotArray( {i for i in 1:N}, {T[i] for i in 1:N} )
Intuitively, this would yield a plot showing the temperature as a function of the spatial coordinate, or rather of the index of the discrete segment, to be precise. Although this command yields a result, all T-values appear to be 0 in the plot, which is definitely not the case. My question is: How can I successfully obtain and plot the temperatures at all the discrete points at a given time? Thanks in advance for your help.
The code for the problem is as indicated below.
model conduction
parameter Real rho = 1;
parameter Real Cp = 1;
parameter Real L = 1;
parameter Real k = 1;
parameter Real Tlo = 0;
parameter Real Thi = 100;
parameter Real Tinit = 30;
parameter Integer N = 10 "Number of discrete segments";
Real T[N-1] "Temperatures";
Real deltaX = L/N;
initial equation
for i in 1:N-1 loop
T[i] = Tinit;
end for;
equation
rho*Cp*der(T[1]) = k*( T[2] - 2*T[1] + Thi) /deltaX^2;
rho*Cp*der(T[N-1]) = k*( Tlo - 2*T[N-1] + T[N-2]) /deltaX^2;
for i in 2:N-2 loop
rho*Cp*der(T[i]) = k*( T[i+1] - 2*T[i] + T[i-1]) /deltaX^2;
end for;
annotation (uses(Modelica(version="3.2")));
end conduction;
Additional edit: The simulations show clearly that, for example, T[3], the temperature of discrete segment no. 3, starts out at 30 and ends up at 70 degrees. When I type T[3] in my command window, however, I get 0.0 in return. Why is that? This is at the heart of the problem, because the plotArray approach would work if I managed to extract the actual variable values at specific times rather than just 0.0.
Suggested solution: This is a rather tedious solution to achieve what I want, and I hope someone knows a better one. When I run the simulation in Dymola, the software generates a .mat-file containing the values of the variables throughout the simulation. I am able to load this file into MATLAB and manually extract the variables of my choice for plotting. For the problem above, I wrote the following command:
plot( [1:9]' , data_2(2:2:18 , 10)' )
This command will plot the temperatures (as the temperatures are stored together with their derivatives in the data_2 array in the .mat-file) against the respective number of the discrete segment/element. I was really hoping to do this inside Dymola, that is, avoid using MATLAB for this. For this specific problem, the number of variables was low on account of the simplicity of the problem, but I can easily imagine a .mat-file which is significantly harder to navigate through manually like I just did.
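For completeness, the MATLAB side of this workaround might look like the following (the indices are the ones from the command above; the exact layout of data_2 depends on the Dymola version, so treat this as a sketch):

load('conduction.mat')                  % result file written by Dymola
plot( [1:9]' , data_2(2:2:18 , 10)' )   % segment number vs. temperature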
Although you do not mention it explicitly, I assume that you enter your plotArray command in Dymola's command window. That won't work directly, since the variables you see there do not include your simulation results: if I simulate your model and then enter T[:] in Dymola's command window, the printed result is
T[:]
= {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0}
I'm not a Dymola expert, and the only solution I've found (to actively store and load the desired simulation results) is quite cumbersome:
simulateModel("conduction", resultFile="conduction.mat")
n = readTrajectorySize("conduction.mat")
X = readTrajectory("conduction.mat", {"Time"}, n)
Y = readTrajectory("conduction.mat", {"T[1]", "T[2]", "T[3]"}, n)
plotArrays(X[1, :], transpose(Y))