How to correctly set the 'rtol' and 'atol' in scipy integration module 'solve_ivp' for solving a system of ODE with unknown analytic solution?

I was trying to reproduce some results of the ode45 solver in Python using solve_ivp. Although all parameters, initial conditions, step size, and 'atol' and 'rtol' (which are 1e-6 and 1e-3) are the same, I am getting different solutions. Both solutions converge to a periodic solution, but of different kinds. Since solve_ivp uses the same RK4(5) method as ode45, I do not understand this discrepancy in the final result. How can we know which one is the correct solution?
The code is included below:
import sys
import numpy as np
from scipy.integrate import solve_ivp
#from scipy import integrate
import matplotlib.pyplot as plt
from matplotlib.patches import Circle
# Pendulum rod lengths (m), bob masses (kg).
L1, L2, mu, a1 = 1, 1, 1/5, 1
m1, m2, B = 1, 1, 0.1
# The gravitational acceleration (m.s-2).
g = 9.81
# The forcing frequency,forcing amplitude
w, a_m =10, 4.5
A=(a_m*w**2)/g
A1=a_m/g
def deriv(t, y, mu, a1, B, w, A): # beware of the order of the arguments
    """Return the first derivatives of y = theta1, z1, theta2, z2, z3."""
    a, c, b, d, e = y
    #c, s = np.cos(theta1-theta2), np.sin(theta1-theta2)
    adot = c
    cdot = (-(1-A*np.sin(e))*(((1+mu)*np.sin(a))-(mu*np.cos(a-b)*np.sin(b)))-((mu/a1)*((d**2)+(a1*np.cos(a-b)*c**2))*np.sin(a-b))-(2*B*(1+(np.sin(a-b))**2)*c)-((2*B*A/w)*(2*np.sin(a)-(np.cos(a-b)*np.sin(b)))*np.cos(e)))/(1+mu*(np.sin(a-b))**2)
    bdot = d
    ddot = ((-a1*(1+mu)*(1-A*np.sin(e))*(np.sin(b)-(np.cos(a-b)*np.sin(a))))+(((a1*(1+mu)*c**2)+(mu*np.cos(a-b)*d**2))*np.sin(a-b))-((2*B/mu)*(((1+mu*(np.sin(a-b))**2)*d)+(a1*(1-mu)*np.cos(a-b)*c)))-((2*B*a1*A/(w*mu))*(((1+mu)*np.sin(b))-(2*mu*np.cos(a-b)*np.sin(a)))*np.cos(e)))/(1+mu*(np.sin(a-b))**2)
    edot = w
    return adot, cdot, bdot, ddot, edot
# Initial conditions: theta1, dtheta1/dt, theta2, dtheta2/dt, forcing phase.
y0 = np.array([3.15, -0.1, 3.13, 0.1, 0])
# Do the numerical integration of the equations of motion
# (solve_ivp is imported directly above, so it is called without the integrate. prefix)
sol = solve_ivp(deriv, [0, 40000], y0, args=(mu, a1, B, w, A), method='RK45',
                t_eval=np.arange(0, 40000, 0.005), dense_output=True, rtol=1e-3, atol=1e-6)
T = sol.t
Y = sol.y
I expect similar results from ode45 in MATLAB and solve_ivp in Python. How can I reproduce the ode45 result exactly in Python? What is the reason for the discrepancy?

Even if ode45 and RK45 use the same underlying scheme, they do not necessarily use exactly the same strategy for adapting the time step to meet the error tolerance. Thus, it is difficult to know which one is better.
The only thing you can do is try tighter tolerances, e.g. 1e-10. Then both solutions should end up virtually identical. Your current tolerances are probably too loose, so small differences in the fine details of the two algorithms produce a visible difference in the solution.
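As a rough illustration of the tolerance check suggested above, here is a minimal sketch that runs solve_ivp twice on a stand-in problem with a known solution (a simple harmonic oscillator, chosen only for this example and not the pendulum system from the question) and compares the error at loose and tight tolerances. The names rhs, loose and tight are made up for the illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, y):
    # Stand-in test problem (assumption): y'' = -y, exact solution y = cos(t)
    return [y[1], -y[0]]

t_span = [0.0, 50.0]
t_eval = np.linspace(0.0, 50.0, 501)
y0 = [1.0, 0.0]

# Same tolerances as in the question
loose = solve_ivp(rhs, t_span, y0, method="RK45", t_eval=t_eval,
                  rtol=1e-3, atol=1e-6)
# Much tighter tolerances
tight = solve_ivp(rhs, t_span, y0, method="RK45", t_eval=t_eval,
                  rtol=1e-10, atol=1e-12)

exact = np.cos(t_eval)
err_loose = np.max(np.abs(loose.y[0] - exact))
err_tight = np.max(np.abs(tight.y[0] - exact))
print("max error, loose tolerances:", err_loose)
print("max error, tight tolerances:", err_tight)
```

For a system without an analytic solution, the same idea applies with the exact solution replaced by a second run at even tighter tolerances: if two runs at different tolerances agree, the result is probably trustworthy. For a chaotic or strongly sensitive system, agreement may only hold over a limited time span.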

Related

get strange result from scipy.signal.lsim when run twice

I run scipy.signal.lsim 10 times, and it seems that X0 is only used the first time. Why?
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal

t = np.linspace(0.0, 100, 100*100)
# ten first-order systems with time constants 1..10
transfun = []
for i in range(10):
    transfun.append(signal.lti([1], [1+i, 1]))
# noisy input signals with an offset of 50
y = []
for i in range(10):
    y.append(np.sin(2*np.pi*300*t) + np.random.normal(0, 1, 10000) + 50)
sensor_output = []
for i in range(10):
    tout, yout, xout = signal.lsim(transfun[i], y[i], t, X0=[50.0])
    sensor_output.append(yout)
fig = plt.figure()
for i in range(10):
    plt.subplot(10, 1, i+1)
    plt.plot(t, y[i])
    plt.plot(t, sensor_output[i])
plt.show()
lsim takes an initial state vector as an argument, not an initial output.
Transfer functions don't really have state vectors, but under the hood lsim converts the transfer function to a state-space realization (which does have a state vector) and uses that to simulate the system.
One problem is that a given transfer function has no unique realization. lsim doesn't say how it converts transfer functions to state-space realizations; given your results I took a guess that happened to work (see below), but it's not robust.
To solve this for general transfer functions (i.e., not just first-order), you'd need to work with a specific state-space realization and also specify more than just the initial output, or the problem is under-constrained (I guess a typical approach would be to require d(y)/dt = 0, and similarly for all higher derivatives).
Below is a quick-and-dirty fix for your problem, and a sketch of how to do this for first-order state-space realizations.
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
nex = 3
t = np.linspace(0, 40, 1001)
taus = [1, 5, 10]
transfun = [signal.lti([1],[tau,1])
            for tau in taus]
u = np.tile(50, t.shape)
yinit = 49
sensor_output = [signal.lsim(tf,u,t,X0=[yinit*tau])[1]
                 for tf, tau in zip(transfun, taus)]
fig=plt.figure()
for i in range(nex):
    plt.subplot(nex,1,i+1)
    plt.plot(t,u)
    plt.plot(t,sensor_output[i])
plt.savefig('img1.png')
# different SS realizations of the same TF need different x0
# to get the same initial y0
g = signal.tf2ss([1], [10, 1])
# create SS system with the same TF
k = 1.234
g2 = (g[0], k*g[1], g[2]/k, g[3])
# desired initial value
y0 = 321
#solve for initial state for the two SS systems
x0 = y0 / g[2]
x0_2 = y0 / g2[2]
output = [signal.lsim(g,u,t,X0=x0)[1],
          signal.lsim(g2,u,t,X0=x0_2)[1]]
fig=plt.figure()
for i,out in enumerate(output):
    plt.subplot(len(output),1,i+1)
    plt.plot(t,u)
    plt.plot(t,out)
plt.savefig('img2.png')
plt.show()
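To see why X0=[yinit*tau] happens to work in the quick fix above, it may help to inspect the realization that signal.tf2ss returns for a first-order transfer function. The sketch below assumes the same first-order form 1/(tau*s + 1); the values of tau and y_init are arbitrary choices for the illustration.

```python
import numpy as np
from scipy import signal

tau = 5.0      # time constant (arbitrary value for this sketch)
y_init = 49.0  # desired initial output

# State-space realization of 1 / (tau*s + 1)
A, B, C, D = signal.tf2ss([1.0], [tau, 1.0])

# Here y = C x (D is zero), so the initial state follows from the desired output
x0 = y_init / C[0, 0]

t = np.linspace(0.0, 40.0, 1001)
u = np.full(t.shape, 50.0)
tout, yout, xout = signal.lsim((A, B, C, D), u, t, X0=[x0])

print("C =", C, "x0 =", x0)
print("first output sample:", yout[0])
```

With this realization the output matrix is 1/tau, which is why scaling the desired output by tau gives the matching initial state. A different realization of the same transfer function would need a different scaling.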

scipy.special yields fluctuating result for confluent hypergeometric function

The scipy implementation of the confluent hypergeometric function gives me wrong results. This is a minimal code:
import matplotlib.pyplot as plt
import numpy as np
from scipy import special
x=np.arange(0,1,.001)
f=special.hyp1f1(30,60,-1/x)
plt.scatter(x,f,s=.05)
When I run it, it produces the following plot:
(plot: output of scipy.special.hyp1f1)
I wonder if there is a way to fix these fluctuations, which are definitely not correct. In fact, the function should be strictly positive in that range.
Starting from the explanation at scipy.special.hyp1f1, here is an attempt to approximate the function with a polynomial.
Apparently, hyp1f1(-1/x) works well between x=0 and about x=0.2. Note that at x exactly 0, the function isn't properly defined. The approximation with a 5th degree polynomial is much too large for x<0.4. With an 80th degree polynomial, the approximation seems correct for x>0.025 but quickly gets out of bounds for smaller x. (With more than 90 terms the polynomial can't be calculated in this way anymore.)
Probably the best solution would be to use a high degree polynomial for x>=0.1 and the original hyp1f1 when x is smaller.
import matplotlib.pyplot as plt
import numpy as np
from scipy import special
x = np.linspace(0.001, 1, 1000)
f = special.hyp1f1(30, 60, -1 / x)
plt.scatter(x, f, s=1, color='r', label='hyp1f1')
for terms in range(80, 1, -10):
    k10 = np.arange(terms)
    c10 = special.poch(30, k10) / (special.poch(60, k10) * special.factorial(k10))
    poly10 = np.poly1d(c10[::-1])
    plt.scatter(x, poly10(-1 / x), s=1, label=f'{terms} terms', color=plt.cm.Set1(terms / 80))
plt.ylim(-3.5, 3.7)
plt.legend(scatterpoints=10, ncol=3)
plt.show()
(plot: zoomed in)
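Another possible cross-check, which is not part of the answer above, is Kummer's transformation M(a, b, z) = exp(z) * M(b - a, b, -z). For negative z it moves the evaluation to a positive argument, where every term of the series is positive and no cancellation occurs. The sketch below restricts x to [0.01, 1] so that neither factor overflows or underflows; how accurate hyp1f1 is for these arguments may depend on the SciPy version, so treat it as a sanity check rather than a reference value.

```python
import numpy as np
from scipy import special

a, b = 30, 60
x = np.linspace(0.01, 1.0, 200)
z = -1.0 / x  # negative argument, as in the question

# Kummer's transformation: M(a, b, z) = exp(z) * M(b - a, b, -z)
f_kummer = np.exp(z) * special.hyp1f1(b - a, b, -z)

# Direct evaluation for comparison
f_direct = special.hyp1f1(a, b, z)

print("all Kummer values positive:", bool(np.all(f_kummer > 0)))
print("largest difference from direct evaluation:",
      np.max(np.abs(f_kummer - f_direct)))
```

If the two evaluations disagree in some range of x, the transformed one is the more plausible candidate there, since it avoids the alternating series.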

Scipy.curve_fit() vs. Matlab fit() weighted nonlinear least squares

I have a Matlab reference routine that I am trying to convert to numpy/scipy. I have encountered a curve fitting problem that I cannot solve in Python. So here is a simple example which demonstrates the problem. The data is completely synthetic and not part of the problem.
Let's say I'm trying to fit a straight-line model of noisy data -
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0.1075, 1.3668, 1.5482, 3.1724, 4.0638, 4.7385, 5.9133, 7.0685, 8.7157, 9.5539]
For the unweighted solution in Matlab, I would code
g = @(m, b, x)(m*x + b)
f = fittype(g)
bestfit = fit(x, y, g)
which produces a solution of bestfit.m = 1.048, bestfit.b = -0.09219
Running this data through scipy.optimize.curve_fit() produces identical results.
If instead the fit uses a decay function to reduce the impact of data points
dw = [0.7290, 0.5120, 0.3430, 0.2160, 0.1250, 0.0640, 0.0270, 0.0080, 0.0010, 0]
weightedfit = fit(x, y, g, 'Weights', dw)
This produces a slope of 0.944 and an offset of 0.1484.
I have not figured out how to conjure this result from scipy.optimize.curve_fit using the sigma parameter. If I pass the weights as provided to Matlab, the '0' causes a divide by zero exception. Clearly Matlab and scipy are thinking very differently about the meaning of the weights in the underlying optimization routine. Is there a simple way of converting between the two that allows me to provide a weighting function which produces identical results?
Ok, so after further investigation I can offer the answer, at least for this simple example.
import numpy as np
import scipy as sp
import scipy.optimize
def modelFun(x, m, b):
    return m * x + b
def testFit():
    w = np.diag([1.0, 1/0.7290, 1/0.5120, 1/0.3430, 1/0.2160, 1/0.1250, 1/0.0640, 1/0.0270, 1/0.0080, 1/0.0010])
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([0.1075, 1.3668, 1.5482, 3.1724, 4.0638, 4.7385, 5.9133, 7.0685, 8.7157, 9.5539])
    popt = sp.optimize.curve_fit(modelFun, x, y, sigma=w)
    print(popt[0])
    print(popt[1])
Which produces the desired result.
In order to force sp.optimize.curve_fit to minimize the same chisq metric as Matlab using the curve fitting toolbox, you must do two things:
1. Use the reciprocal of the weight factors.
2. Create a diagonal matrix from the new weight factors.
According to the scipy reference:
sigma : None or M-length sequence or MxM array, optional
Determines the uncertainty in ydata. If we define residuals as r = ydata - f(xdata, *popt), then the interpretation of sigma depends on its number of dimensions:
A 1-d sigma should contain values of standard deviations of errors in ydata. In this case, the optimized function is chisq = sum((r / sigma) ** 2).
A 2-d sigma should contain the covariance matrix of errors in ydata. In this case, the optimized function is chisq = r.T @ inv(sigma) @ r.
New in version 0.19.
None (default) is equivalent of 1-d sigma filled with ones.
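The two chisq definitions quoted above suggest that a weight w_i on the squared residual corresponds either to a 1-d sigma of 1/sqrt(w_i) or to a 2-d diagonal sigma with entries 1/w_i. The sketch below checks this on the data and weights from the question, assuming Matlab's 'Weights' multiply the squared residuals, and drops the point with zero weight because it cannot be expressed as a finite sigma. The closed-form weighted least-squares solution is included only as an independent check.

```python
import numpy as np
from scipy.optimize import curve_fit

def line(x, m, b):
    return m * x + b

x = np.arange(10, dtype=float)
y = np.array([0.1075, 1.3668, 1.5482, 3.1724, 4.0638,
              4.7385, 5.9133, 7.0685, 8.7157, 9.5539])
w = np.array([0.7290, 0.5120, 0.3430, 0.2160, 0.1250,
              0.0640, 0.0270, 0.0080, 0.0010, 0.0])

# A zero weight means the point contributes nothing, so leave it out
keep = w > 0
xk, yk, wk = x[keep], y[keep], w[keep]

# 1-d sigma: standard deviations, sigma_i = 1 / sqrt(w_i)
popt_1d, _ = curve_fit(line, xk, yk, sigma=1.0 / np.sqrt(wk))

# 2-d sigma: covariance matrix, diag(1 / w_i)
popt_2d, _ = curve_fit(line, xk, yk, sigma=np.diag(1.0 / wk))

# Independent check: closed-form weighted least squares
X = np.column_stack([xk, np.ones_like(xk)])
W = np.diag(wk)
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ yk)

print("1-d sigma:  ", popt_1d)
print("2-d sigma:  ", popt_2d)
print("closed form:", beta)
```

All three should agree, and with these weights the slope and offset should land close to the 0.944 and 0.1484 reported for the Matlab fit.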

How to define a loss function in pytorch with dependency to partial derivatives of the model w.r.t input?

After reading about how to solve an ODE with neural networks following the paper Neural Ordinary Differential Equations and the blog that uses the library JAX I tried to do the same thing with "plain" Pytorch but found a point rather "obscure": How to properly use the partial derivative of a function (in this case the model) w.r.t one of the input parameters.
To summarize the problem at hand, as shown in the blog, the goal is to solve the ODE y' = -2*x*y with the condition y(x=0) = 1 in the domain -2 <= x <= 2. Instead of using finite differences, the solution is replaced by a NN as y(x) = NN(x) with a single hidden layer of 10 nodes.
I managed to (more or less) replicate the blog with the following code
import torch
import torch.nn as nn
from torch import optim
import matplotlib.pyplot as plt
import numpy as np
# Define the NN model to solve the problem
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1,10)
        self.lin2 = nn.Linear(10,1)
    def forward(self, x):
        x = torch.sigmoid(self.lin1(x))
        x = torch.sigmoid(self.lin2(x))
        return x
model = Model()
# Define loss_function from the Ordinary differential equation to solve
def ODE(x,y):
    dydx, = torch.autograd.grad(y, x,
                                grad_outputs=y.data.new(y.shape).fill_(1),
                                create_graph=True, retain_graph=True)
    eq = dydx + 2.* x * y # y' = - 2x*y
    ic = model(torch.tensor([0.])) - 1. # y(x=0) = 1
    return torch.mean(eq**2) + ic**2
loss_func = ODE
# Define the optimization
# opt = optim.SGD(model.parameters(), lr=0.1, momentum=0.99,nesterov=True) # Equivalent to blog
opt = optim.Adam(model.parameters(),lr=0.1,amsgrad=True) # Got faster convergence with Adam using amsgrad
# Define reference grid
x_data = torch.linspace(-2.0,2.0,401,requires_grad=True)
x_data = x_data.view(401,1) # reshaping the tensor
# Iterative learning
epochs = 1000
for epoch in range(epochs):
    opt.zero_grad()
    y_trial = model(x_data)
    loss = loss_func(x_data, y_trial)
    loss.backward()
    opt.step()
    if epoch % 100 == 0:
        print('epoch {}, loss {}'.format(epoch, loss.item()))
# Plot Results
plt.plot(x_data.data.numpy(), np.exp(-x_data.data.numpy()**2), label='exact')
# y_trial holds the model output from the last epoch (y_data was never defined)
plt.plot(x_data.data.numpy(), y_trial.data.numpy(), label='approx')
plt.legend()
plt.show()
From here I manage to get the results as shown in the fig.
The problem is that, in the definition of the ODE functional, instead of passing (x,y) I would prefer to pass something like (x,fun) (where fun is my model), so that the partial derivative and specific evaluations of the model can be done with a call. So, something like
def ODE(x,fun):
    dydx, = "grad of fun w.r.t x as a function"
    eq = dydx(x) + 2.* x * fun(x) # y' = - 2x*y
    ic = fun( torch.tensor([0.]) ) - 1. # y(x=0) = 1
    return torch.mean(eq**2) + ic**2
Any ideas? Thanks in advance
EDIT:
After some trials I found a way to pass the model as an input but found another strange behavior... The new problem is to solve the ODE y'' = -2 with the BC y(x=-2) = -1 and y(x=2) = 1, for which the analytical solution is y(x) = -x^2+x/2+4
Let's modify a bit the previous code as:
import torch
import torch.nn as nn
from torch import optim
import matplotlib.pyplot as plt
import numpy as np
# Define the NN model to solve the equation
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1,10)
        self.lin2 = nn.Linear(10,1)
    def forward(self, x):
        y = torch.sigmoid(self.lin1(x))
        z = torch.sigmoid(self.lin2(y))
        return z
model = Model()
# Define loss_function from the Ordinary differential equation to solve
def ODE(x,fun):
    y = fun(x)
    dydx = torch.autograd.grad(y, x,
                               grad_outputs=y.data.new(y.shape).fill_(1),
                               create_graph=True, retain_graph=True)[0]
    d2ydx2 = torch.autograd.grad(dydx, x,
                                 grad_outputs=dydx.data.new(dydx.shape).fill_(1),
                                 create_graph=True, retain_graph=True)[0]
    eq = d2ydx2 + torch.tensor([ 2.]) # y'' = - 2
    bc1 = fun(torch.tensor([-2.])) - torch.tensor([-1.]) # y(x=-2) = -1
    bc2 = fun(torch.tensor([ 2.])) - torch.tensor([ 1.]) # y(x= 2) = 1
    return torch.mean(eq**2) + bc1**2 + bc2**2
loss_func = ODE
So, here I passed the model as an argument and managed to differentiate twice... so far so good. BUT, using the sigmoid function for this case is not only unnecessary but also gives a result that is far from the analytical one.
If I change the NN for:
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1,1)
        self.lin2 = nn.Linear(1,1)
    def forward(self, x):
        y = self.lin1(x)
        z = self.lin2(y)
        return z
In which case I would expect to optimize a double pass through two linear functions that would retrieve a 2nd order function ... I get the error:
RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.
Adding the option to the definition of dydx doesn't solve the problem, and adding it to d2ydx2 gives a NoneType definition.
Is there something wrong with the layers as they are?
Quick Solution:
add allow_unused=True to .grad functions. So, change
dydx = torch.autograd.grad(
y, x,
grad_outputs=y.data.new(y.shape).fill_(1),
create_graph=True, retain_graph=True)[0]
d2ydx2 = torch.autograd.grad(dydx, x, grad_outputs=dydx.data.new(
dydx.shape).fill_(1), create_graph=True, retain_graph=True)[0]
To
dydx = torch.autograd.grad(
y, x,
grad_outputs=y.data.new(y.shape).fill_(1),
create_graph=True, retain_graph=True, allow_unused=True)[0]
d2ydx2 = torch.autograd.grad(dydx, x, grad_outputs=dydx.data.new(
dydx.shape).fill_(1), create_graph=True, retain_graph=True, allow_unused=True)[0]
More explanation:
See what allow_unused does:
allow_unused (bool, optional): If ``False``, specifying inputs that were not
used when computing outputs (and therefore their grad is always zero)
is an error. Defaults to ``False``.
So, if you try to differentiate w.r.t. a variable that was not used to compute the value, it will give an error. Also, note that the error only occurs when the model consists solely of linear layers.
This is because with only linear layers you have y = W2*(W1*x + b1) + b2 = W*x + b, and dy/dx is not a function of x; it is simply W. So when you try to differentiate dy/dx w.r.t. x, it throws an error. The error goes away as soon as you use sigmoid because then dy/dx is a function of x. To avoid the error, either make sure dy/dx is a function of x or use allow_unused=True.
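One caveat, consistent with the NoneType problem mentioned in the question: with allow_unused=True, torch.autograd.grad returns None for an input that was not used, so the result still has to be replaced by zeros before it enters the loss. A minimal sketch of that handling follows; the helper name second_derivative and the two small models are made up for this illustration.

```python
import torch
import torch.nn as nn

def second_derivative(fun, x):
    """Return d2y/dx2 of fun at x, treating an unused input as a zero gradient."""
    y = fun(x)
    dydx = torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y),
                               create_graph=True)[0]
    d2ydx2 = torch.autograd.grad(dydx, x, grad_outputs=torch.ones_like(dydx),
                                 create_graph=True, allow_unused=True)[0]
    if d2ydx2 is None:
        # dy/dx does not depend on x (purely linear model): second derivative is zero
        d2ydx2 = torch.zeros_like(x)
    return d2ydx2

x = torch.linspace(-2.0, 2.0, 5).view(-1, 1).requires_grad_(True)

# Two stacked linear layers are still a linear map in x
linear = nn.Sequential(nn.Linear(1, 1), nn.Linear(1, 1))
# A nonlinearity makes dy/dx depend on x
curved = nn.Sequential(nn.Linear(1, 10), nn.Sigmoid(), nn.Linear(10, 1))

d2_linear = second_derivative(linear, x)
d2_curved = second_derivative(curved, x)
print(d2_linear.flatten())
print(d2_curved.flatten())
```

This also shows why the purely linear model cannot fit y'' = -2: its second derivative is identically zero, so the equation residual cannot be reduced by training and only the boundary terms can be fitted.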

Regression not possible for same y value

I want to run a regression analysis on the data below, where x1 and x2 produce the y value. In this case, however, the y value is the same for every row, so the regression does not work. Why? I need an explanation.
Your training set shows that the coefficients are all ~0 and the constant is 5. There's no more information in that dataset; you don't need regression to show that.
You did not specify what kind of regression you are running. Depending on the type of regression, you will need the matrices to be invertible, i.e. the columns must not be linearly related.
It seems to work using the normal equation (with the expected results):
import numpy as np
import matplotlib.pyplot as plt
input = np.array([
[2,3,5],
[1,2,5],
[4,2,5],
[1,7,5],
[1,9,5]
])
m = len(input)
X = np.array([np.ones(m), input[:, 0],input[:, 1]]).T # Add Constant to X
y = np.array(input[:, 2]).reshape(-1, 1) # Get the dependant values
betaHat = np.linalg.solve(X.T.dot(X), X.T.dot(y)) # Calculate coefficients
print(betaHat) # Show Constant and coefficients (in that order)
[[ 5.00000000e+00]
[ 5.29208238e-16]
[ 4.32685981e-17]]
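As a cross-check on the point about invertibility, the same fit can be done with np.linalg.lstsq, which also reports the rank of the design matrix. This is only a sketch on the five rows used above; it suggests that a constant y does not make the problem singular, since the rank depends on the x columns alone.

```python
import numpy as np

data = np.array([
    [2, 3, 5],
    [1, 2, 5],
    [4, 2, 5],
    [1, 7, 5],
    [1, 9, 5],
], dtype=float)

# Design matrix: constant column, x1, x2
X = np.column_stack([np.ones(len(data)), data[:, 0], data[:, 1]])
y = data[:, 2]

beta, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
print("coefficients (constant, x1, x2):", beta)
print("rank of design matrix:", rank)
```

A full rank of 3 means the coefficients are uniquely determined, and they come out as a constant of 5 with slopes of essentially zero.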