I have written a function that, given parameters, can apply a piecewise linear fit, with arbitrarily many piecewise sections, to some data.
I am trying to fit the function to my data using scipy.optimize.curve_fit, but I am receiving the warning "OptimizeWarning: Covariance of the parameters could not be estimated". I believe this may be because of the nested lambda functions I am using to define the piecewise sections.
Is there an easy way to tweak my code to get round this, or a different scipy optimisation function that might be more suitable?
import numpy as np
import scipy.optimize

#The piecewise function
def piecewise_linear(x, *params):
    #params = (c, *xbounds, *grads): one offset, N-1 breakpoints, N gradients
    N=len(params)/2
    if N.is_integer():N=int(N)
    else:raise(ValueError())
    c=params[0]
    xbounds=params[1:N]
    grads=params[N:]
    #First we define our conditions, which are true if x is a member of a given
    #bin.
    conditions=[]
    #first and last bins are a special case:
    cond0=lambda x: x<xbounds[0]
    condl=lambda x: x>=xbounds[-1]
    conditions.append(cond0(x))
    for i in range(len(xbounds)-1):
        cond=lambda x : (x >= xbounds[i]) & (x < xbounds[i+1])
        conditions.append(cond(x))
    conditions.append(condl(x))
    #Next we define our linear regression function for each bin. The offset
    #for each bin depends on where the previous bin ends, so we define
    #the regression functions recursively:
    functions=[]
    func0 = lambda x: grads[0]*x +c
    functions.append(func0)
    for i in range(len(grads)-1):
        func = (lambda j: lambda x: grads[j+1]*(x-xbounds[j])\
                          +functions[j](xbounds[j]))(i)
        functions.append(func)
    return np.piecewise(x,conditions,functions)
#Some data
x=np.arange(100)
y=np.array([*np.arange(0,19,1),*np.arange(20,59,2),\
*np.arange(60,20,-1),*np.arange(21,42,1)]) + np.random.randn(100)
#A first guess of parameters
cguess=0
boundguess=[20,30,50]
gradguess=[1,1,1,1]
p0=[cguess,*boundguess,*gradguess]
fit=scipy.optimize.curve_fit(piecewise_linear,x,y,p0=p0)
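One thing that may be worth checking before blaming the lambdas, offered as a hedged aside: np.piecewise returns an array with the same dtype as its input, and np.arange(100) is an integer array. The sketch below (with made-up slopes and a hypothetical helper two_segments) shows the output being truncated to integers. If the same happens in piecewise_linear, tiny parameter changes would be invisible to curve_fit's finite-difference Jacobian.

```python
import numpy as np

def two_segments(x):
    # Hypothetical two-piece line, only used to illustrate the dtype behaviour.
    return np.piecewise(
        x,
        [x < 2, x >= 2],
        [lambda v: 0.5 * v, lambda v: 0.5 * v + 0.25],
    )

x_int = np.arange(5)            # integer dtype, like np.arange(100) above
x_float = x_int.astype(float)   # same values stored as floats

out_int = two_segments(x_int)       # fractional parts are dropped
out_float = two_segments(x_float)   # fractional parts are kept

print(out_int.dtype, out_int)
print(out_float.dtype, out_float)
```

If this applies, passing x.astype(float) to curve_fit is a cheap thing to try.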
Here is example code that fits two straight lines to a curved data set with a breakpoint, where the line parameters and breakpoint are all fitted. This example uses scipy's Differential Evolution genetic algorithm to determine initial parameter estimates for the regression. That module uses the Latin Hypercube algorithm to ensure a thorough search of parameter space, which requires bounds within which to search. In this example those search bounds are derived from the data itself. Note that it is much easier to find ranges for the initial parameter estimates than to give specific values.
import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings
xData = numpy.array([19.1647, 18.0189, 16.9550, 15.7683, 14.7044, 13.6269, 12.6040, 11.4309, 10.2987, 9.23465, 8.18440, 7.89789, 7.62498, 7.36571, 7.01106, 6.71094, 6.46548, 6.27436, 6.16543, 6.05569, 5.91904, 5.78247, 5.53661, 4.85425, 4.29468, 3.74888, 3.16206, 2.58882, 1.93371, 1.52426, 1.14211, 0.719035, 0.377708, 0.0226971, -0.223181, -0.537231, -0.878491, -1.27484, -1.45266, -1.57583, -1.61717])
yData = numpy.array([0.644557, 0.641059, 0.637555, 0.634059, 0.634135, 0.631825, 0.631899, 0.627209, 0.622516, 0.617818, 0.616103, 0.613736, 0.610175, 0.606613, 0.605445, 0.603676, 0.604887, 0.600127, 0.604909, 0.588207, 0.581056, 0.576292, 0.566761, 0.555472, 0.545367, 0.538842, 0.529336, 0.518635, 0.506747, 0.499018, 0.491885, 0.484754, 0.475230, 0.464514, 0.454387, 0.444861, 0.437128, 0.415076, 0.401363, 0.390034, 0.378698])
def func(xArray, breakpoint, slopeA, offsetA, slopeB, offsetB):
    returnArray = []
    for x in xArray:
        if x < breakpoint:
            returnArray.append(slopeA * x + offsetA)
        else:
            returnArray.append(slopeB * x + offsetB)
    return returnArray
# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
    warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
    val = func(xData, *parameterTuple)
    return numpy.sum((yData - val) ** 2.0)
def generate_Initial_Parameters():
    # min and max used for bounds
    maxX = max(xData)
    minX = min(xData)
    maxY = max(yData)
    minY = min(yData)
    slope = 10.0 * (maxY - minY) / (maxX - minX) # times 10 for safety margin
    parameterBounds = []
    parameterBounds.append([minX, maxX]) # search bounds for breakpoint
    parameterBounds.append([-slope, slope]) # search bounds for slopeA
    parameterBounds.append([minY, maxY]) # search bounds for offsetA
    parameterBounds.append([-slope, slope]) # search bounds for slopeB
    parameterBounds.append([minY, maxY]) # search bounds for offsetB
    result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
    return result.x
# by default, differential_evolution finishes by polishing its best result with L-BFGS-B (polish=True), staying within the parameter bounds
geneticParameters = generate_Initial_Parameters()
# call curve_fit without passing bounds from genetic algorithm
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Parameters:', fittedParameters)
print()
modelPredictions = func(xData, *fittedParameters)
absError = modelPredictions - yData
SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)
print()
##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)
    # first the raw data as a scatter plot
    axes.plot(xData, yData, 'D')
    # create data for the fitted equation plot
    xModel = numpy.linspace(min(xData), max(xData))
    yModel = func(xModel, *fittedParameters)
    # now the model as a line plot
    axes.plot(xModel, yModel)
    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label
    plt.show()
    plt.close('all') # clean up after using pyplot
graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
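The element-by-element loop in func can also be written in vectorised form. A minimal sketch using numpy.where, with a hypothetical name func_vectorized and made-up parameter values; it should give the same numbers as the loop version, but as a NumPy array instead of a list:

```python
import numpy as np

def func_vectorized(xArray, breakpoint, slopeA, offsetA, slopeB, offsetB):
    # Same two-line model as func, evaluated for all x values at once.
    xArray = np.asarray(xArray, dtype=float)
    return np.where(xArray < breakpoint,
                    slopeA * xArray + offsetA,
                    slopeB * xArray + offsetB)

# Illustrative parameters only (not fitted values).
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = func_vectorized(x_demo, 2.0, 1.0, 0.0, -1.0, 4.0)
print(y_demo)
```

It takes the same arguments as func, so it could be passed to curve_fit and differential_evolution in the same way.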
This link indicates that the analogue of Matlab's lsqnonlin is LsqFit.jl in Julia. However, Matlab uses the notation
x = lsqnonlin(fun,x0)
but Julia uses the notation
fit = curve_fit(model, xdata, ydata, p0)
So what is the analogue of the x in Matlab notation if I'm using LsqFit in Julia? In Julia, the command
fieldnames(fit)
gives
5-element Array{Symbol,1}:
:dof
:param
:resid
:jacobian
:converged
where converged is a boolean type, and fit.param corresponds to the parameters vector p.
So where is the independent variable x? I want to solve for x in F(x)=0, where x and F are column vectors and F is a nonlinear function.
fit = curve_fit(model, xdata, ydata, p0)
is more akin to a call like
[x,resnorm,residual,exitflag,output,lambda,jacobian] = lsqnonlin(___)
in Matlab. The resulting fit object contains some similar solver metadata:
# fit is a composite type (LsqFitResult), with some interesting values:
# fit.dof: degrees of freedom
# fit.param: best fit parameters
# fit.resid: residuals = vector of residuals
# fit.jacobian: estimated Jacobian at solution
So, fit.param should be the same as x.
Here is an MWE demonstrating the proper usage for finding y(x) that minimizes squares for function F(y(x)) = 0. It is modified in that the p0 "parameters" are now actually the y-values. The x-values are meant to be the underlying grid (so notation F(x)=0 is not correct for curve_fit), and it should be that we are solving F(y(x))=0 as follows.
using LsqFit
using PyPlot
# example model, could be nlin function or diff eq
# we will solve the DE y' - y = 0 with initial condition y'(0) = 1
# by using function F(y) = y' - y as follows
F(xs, ys) = [1; diff(ys)/xs[2];] - ys; # column vector
# note: function F above assumes xs[1] = 0 so that xs[2] = dx
# xs: independent variables, these values never change
dx = 0.02; # make smaller to get more accurate
xs = [0:dx:2;]; # column vector
len = length(xs);
# ys: initial guess for solution function y(x), will be optimized
ys = exp.(xs) .+ rand(len,1) .- 0.5; # column vector (broadcast dots are required in Julia 1.x)
# modified usage for sum of squares to minimize F
fit = curve_fit(F,xs,0.0*xs[:],ys[:])
# note: original usage was
# fit = curve_fit(model, xdata, ydata, p0)
figure(1)
scatter(xs,ys,marker = "o") # initial noisy guess in blue
plot(xs,fit.param,color = "red") # optimized solution in red
plot(xs,exp(xs),linestyle = "--",color = "green") # exact sol
axis("tight")
display("sum of squares")
sum(abs.(F(xs,fit.param)).^2)
In Julia I used the package LeastSquaresOptim to replace Matlab's lsqnonlin function and find the best (three) parameters abc for a custom cost-function:
using LeastSquaresOptim
# create closure to fun since we need more args in cost-function
function fun(abc) myCostFunction(abc, otherArg1, otherArg2, otherArg3) end
p0 = [0.0, 0.0, 0.0] # mind the float values for starting point!
p = optimize(fun, p0, LevenbergMarquardt())
a = p.minimizer[1]
b = p.minimizer[2]
c = p.minimizer[3]
After reading about how to solve an ODE with neural networks following the paper Neural Ordinary Differential Equations and the blog that uses the library JAX I tried to do the same thing with "plain" Pytorch but found a point rather "obscure": How to properly use the partial derivative of a function (in this case the model) w.r.t one of the input parameters.
To summarize the problem at hand, as shown in the blog [2]: the goal is to solve the ODE y' = -2*x*y with the condition y(x=0) = 1 in the domain -2 <= x <= 2. Instead of using finite differences, the solution is replaced by a NN as y(x) = NN(x) with a single hidden layer of 10 nodes.
I managed to (more or less) replicate the blog with the following code
import torch
import torch.nn as nn
from torch import optim
import matplotlib.pyplot as plt
import numpy as np
# Define the NN model to solve the problem
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1,10)
        self.lin2 = nn.Linear(10,1)
    def forward(self, x):
        x = torch.sigmoid(self.lin1(x))
        x = torch.sigmoid(self.lin2(x))
        return x
model = Model()
# Define loss_function from the Ordinary differential equation to solve
def ODE(x,y):
    dydx, = torch.autograd.grad(y, x,
        grad_outputs=y.data.new(y.shape).fill_(1),
        create_graph=True, retain_graph=True)
    eq = dydx + 2.* x * y # y' = - 2x*y
    ic = model(torch.tensor([0.])) - 1. # y(x=0) = 1
    return torch.mean(eq**2) + ic**2
loss_func = ODE
# Define the optimization
# opt = optim.SGD(model.parameters(), lr=0.1, momentum=0.99,nesterov=True) # Equivalent to blog
opt = optim.Adam(model.parameters(),lr=0.1,amsgrad=True) # Got faster convergence with Adam using amsgrad
# Define reference grid
x_data = torch.linspace(-2.0,2.0,401,requires_grad=True)
x_data = x_data.view(401,1) # reshaping the tensor
# Iterative learning
epochs = 1000
for epoch in range(epochs):
    opt.zero_grad()
    y_trial = model(x_data)
    loss = loss_func(x_data, y_trial)
    loss.backward()
    opt.step()
    if epoch % 100 == 0:
        print('epoch {}, loss {}'.format(epoch, loss.item()))
# Plot Results
plt.plot(x_data.data.numpy(), np.exp(-x_data.data.numpy()**2), label='exact')
plt.plot(x_data.data.numpy(), y_trial.data.numpy(), label='approx') # y_trial is the model output from the last epoch
plt.legend()
plt.show()
From here I manage to get the results as shown in the fig.
The problem is that in the definition of the ODE functional, instead of passing (x,y) I would prefer to pass something like (x,fun) (where fun is my model), so that the partial derivative and specific evaluations of the model can be done with a call. So, something like
def ODE(x,fun):
    dydx, = "grad of fun w.r.t x as a function"
    eq = dydx(x) + 2.* x * fun(x) # y' = - 2x*y
    ic = fun( torch.tensor([0.]) ) - 1. # y(x=0) = 1
    return torch.mean(eq**2) + ic**2
Any ideas? Thanks in advance
EDIT:
After some trials I found a way to pass the model as an input but found another strange behavior... The new problem is to solve the ODE y'' = -2 with the BC y(x=-2) = -1 and y(x=2) = 1, for which the analytical solution is y(x) = -x^2+x/2+4
Let's modify a bit the previous code as:
import torch
import torch.nn as nn
from torch import optim
import matplotlib.pyplot as plt
import numpy as np
# Define the NN model to solve the equation
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1,10)
        self.lin2 = nn.Linear(10,1)
    def forward(self, x):
        y = torch.sigmoid(self.lin1(x))
        z = torch.sigmoid(self.lin2(y))
        return z
model = Model()
# Define loss_function from the Ordinary differential equation to solve
def ODE(x,fun):
    y = fun(x)
    dydx = torch.autograd.grad(y, x,
        grad_outputs=y.data.new(y.shape).fill_(1),
        create_graph=True, retain_graph=True)[0]
    d2ydx2 = torch.autograd.grad(dydx, x,
        grad_outputs=dydx.data.new(dydx.shape).fill_(1),
        create_graph=True, retain_graph=True)[0]
    eq = d2ydx2 + torch.tensor([ 2.]) # y'' = - 2
    bc1 = fun(torch.tensor([-2.])) - torch.tensor([-1.]) # y(x=-2) = -1
    bc2 = fun(torch.tensor([ 2.])) - torch.tensor([ 1.]) # y(x= 2) = 1
    return torch.mean(eq**2) + bc1**2 + bc2**2
loss_func = ODE
So, here I passed the model as argument and managed to derive twice... so far so good. BUT, using the sigmoid function for this case is not only not necessary but also gives a result that is far from the analytical one.
If I change the NN for:
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1,1)
        self.lin2 = nn.Linear(1,1)
    def forward(self, x):
        y = self.lin1(x)
        z = self.lin2(y)
        return z
In which case I would expect to optimize a double pass through two linear functions that would retrieve a 2nd order function ... I get the error:
RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.
Adding the option to the definition of dydx doesn't solve the problem, and adding it to d2ydx2 gives a NoneType definition.
Is there something wrong with the layers as they are?
Quick Solution:
add allow_unused=True to .grad functions. So, change
dydx = torch.autograd.grad(
y, x,
grad_outputs=y.data.new(y.shape).fill_(1),
create_graph=True, retain_graph=True)[0]
d2ydx2 = torch.autograd.grad(dydx, x, grad_outputs=dydx.data.new(
dydx.shape).fill_(1), create_graph=True, retain_graph=True)[0]
To
dydx = torch.autograd.grad(
y, x,
grad_outputs=y.data.new(y.shape).fill_(1),
create_graph=True, retain_graph=True, allow_unused=True)[0]
d2ydx2 = torch.autograd.grad(dydx, x, grad_outputs=dydx.data.new(
dydx.shape).fill_(1), create_graph=True, retain_graph=True, allow_unused=True)[0]
More explanation:
See what allow_unused does:
allow_unused (bool, optional): If ``False``, specifying inputs that were not
used when computing outputs (and therefore their grad is always zero)
is an error. Defaults to ``False``.
So, if you try to differentiate w.r.t. a variable that was not used to compute the value, it raises an error. Also note that the error only occurs when you use purely linear layers.
This is because when you use linear layers, you have y=W1*W2*x + b = Wx+b, and dy/dx is not a function of x; it is simply W. So when you try to differentiate dy/dx w.r.t. x, it throws an error. This error goes away as soon as you use sigmoid, because then dy/dx is a function of x. To avoid the error, either make sure dy/dx is a function of x or use allow_unused=True. Note that with allow_unused=True the gradient for an unused input comes back as None, so it has to be replaced by zeros before it is used in the loss.
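Written out for the two stacked linear layers (a worked derivation added for illustration; W_1, b_1 and W_2, b_2 denote the weights and biases of lin1 and lin2):

```latex
\begin{aligned}
y(x) &= W_2\,(W_1 x + b_1) + b_2 = (W_2 W_1)\,x + (W_2 b_1 + b_2),\\
\frac{dy}{dx} &= W_2 W_1 \quad \text{(independent of } x\text{)},\\
\frac{d^2 y}{dx^2} &= 0 .
\end{aligned}
```

Because dy/dx no longer depends on x, autograd finds no path from dydx back to x, which is what the RuntimeError reports.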
I am looking at the KalmanFilter from pykalman shown in examples:
pykalman documentation
Example 1
Example 2
and I am wondering
observation_covariance=100,
vs
observation_covariance=1,
the documentation states
observation_covariance R: e(t)^2 ~ Gaussian (0, R)
How should the value be set here correctly?
Additionally, is it possible to apply the Kalman filter without intercept in the above module?
The observation covariance shows how much error you assume to be in your input data. Kalman filter works fine on normally distributed data. Under this assumption you can use the 3-Sigma rule to calculate the covariance (in this case the variance) of your observation based on the maximum error in the observation.
The values in your question can be interpreted as follows:
Example 1
observation_covariance = 100
sigma = sqrt(observation_covariance) = 10
max_error = 3*sigma = 30
Example 2
observation_covariance = 1
sigma = sqrt(observation_covariance) = 1
max_error = 3*sigma = 3
So you need to choose the value based on your observation data. The more accurate the observation, the smaller the observation covariance.
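The same arithmetic in the other direction, as a small sketch: given the largest error you expect in an observation, the 3-sigma rule gives a candidate value for observation_covariance. The helper name covariance_from_max_error is made up for this illustration.

```python
def covariance_from_max_error(max_error):
    # 3-sigma rule: max_error = 3 * sigma, and the covariance is sigma squared.
    sigma = max_error / 3.0
    return sigma ** 2

print(covariance_from_max_error(30.0))  # the situation of Example 1
print(covariance_from_max_error(3.0))   # the situation of Example 2
```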
Another point: you can tune your filter by manipulating the covariance, but I think it's not a good idea. The higher the observation covariance value the weaker impact a new observation has on the filter state.
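To make the last statement concrete, here is a sketch of a single scalar Kalman measurement update written out by hand. This is the textbook update, not pykalman's internals, and the numbers are purely illustrative.

```python
def scalar_update(state, P, z, R, H=1.0):
    # One measurement update of a scalar Kalman filter.
    K = P * H / (H * P * H + R)              # Kalman gain
    new_state = state + K * (z - H * state)  # pull the state towards z
    new_P = (1.0 - K * H) * P                # reduced state variance
    return new_state, new_P, K

# Same prior (state 0, variance 1) and same observation z = 10,
# once with R = 1 and once with R = 100.
s_small_R, _, K_small_R = scalar_update(0.0, 1.0, 10.0, R=1.0)
s_large_R, _, K_large_R = scalar_update(0.0, 1.0, 10.0, R=100.0)
print(K_small_R, s_small_R)
print(K_large_R, s_large_R)
```

With the larger R the gain is much smaller, so the same observation moves the state far less.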
Sorry, I did not understand the second part of your question (about the Kalman Filter without intercept). Could you please explain what you mean?
You are trying to use a regression model and both intercept and slope belong to it.
---------------------------
UPDATE
I prepared some code and plots to answer your questions in detail. I used EWC and EWA historical data to stay close to the original article.
First of all, here is the code (pretty much the same as in the examples above, but with a different notation)
from pykalman import KalmanFilter
import numpy as np
import matplotlib.pyplot as plt
# reading data (quick and dirty)
Datum=[]
EWA=[]
EWC=[]
for line in open('data/dataset.csv'):
    f1, f2, f3 = line.split(';')
    Datum.append(f1)
    EWA.append(float(f2))
    EWC.append(float(f3))
n = len(Datum)
# Filter Configuration
# both slope and intercept have to be estimated
# transition_matrix
F = np.eye(2) # identity matrix because x_(k+1) = x_(k) + noise
# observation_matrix
# H_k = [EWA_k 1]
H = np.vstack([np.matrix(EWA), np.ones((1, n))]).T[:, np.newaxis]
# transition_covariance
Q = [[1e-4, 0],
[ 0, 1e-4]]
# observation_covariance
R = 1 # max error = 3
# initial_state_mean
X0 = [0,
0]
# initial_state_covariance
P0 = [[ 1, 0],
[ 0, 1]]
# Kalman-Filter initialization
kf = KalmanFilter(n_dim_obs=1, n_dim_state=2,
transition_matrices = F,
observation_matrices = H,
transition_covariance = Q,
observation_covariance = R,
initial_state_mean = X0,
initial_state_covariance = P0)
# Filtering
state_means, state_covs = kf.filter(EWC)
# Restore EWC based on EWA and estimated parameters
EWC_restored = np.multiply(EWA, state_means[:, 0]) + state_means[:, 1]
# Plots
plt.figure(1)
ax1 = plt.subplot(211)
plt.plot(state_means[:, 0], label="Slope")
plt.grid()
plt.legend(loc="upper left")
ax2 = plt.subplot(212)
plt.plot(state_means[:, 1], label="Intercept")
plt.grid()
plt.legend(loc="upper left")
# check the result
plt.figure(2)
plt.plot(EWC, label="EWC original")
plt.plot(EWC_restored, label="EWC restored")
plt.grid()
plt.legend(loc="upper left")
plt.show()
I could not retrieve data using pandas, so I downloaded them and read from the file.
Here you can see the estimated slope and intercept:
To test the estimated data I restored the EWC value from the EWA using the estimated parameters:
About the observation covariance value
By varying the observation covariance value you tell the Filter how accurate the input data is (normally you just describe your confidence in the observation using some datasheets or your knowledge about the system).
Here are estimated parameters and the restored EWC values using different observation covariance values:
You can see the filter follows the original function better with a bigger confidence in observation (smaller R). If the confidence is low (bigger R) the filter leaves the initial estimate (slope = 0, intercept = 0) very slowly and the restored function is far away from the original one.
About the frozen intercept
If you want to freeze the intercept for some reason, you need to change the whole model and all filter parameters.
In the normal case we had:
x = [slope; intercept] #estimation state
H = [EWA 1] #observation matrix
z = [EWC] #observation
Now we have:
x = [slope] #estimation state
H = [EWA] #observation matrix
z = [EWC-const_intercept] #observation
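A hand-written sketch of this slope-only model, run on synthetic data rather than on EWA/EWC. The loop is the textbook scalar filter, not pykalman code, and xs, zs and true_slope are made up for the illustration.

```python
const_intercept = 10.0
true_slope = 2.0                       # used only to generate synthetic data
xs = [float(k) for k in range(1, 51)]  # stands in for EWA
zs = [true_slope * x + const_intercept for x in xs]  # stands in for EWC

Q, R = 1e-4, 1.0       # transition and observation covariance
slope, P = 0.0, 1.0    # initial state mean and covariance

for x, z_raw in zip(xs, zs):
    z = z_raw - const_intercept        # observation with the intercept removed
    H = x                              # observation matrix is just EWA_k
    P = P + Q                          # predict step (transition matrix is 1)
    K = P * H / (H * P * H + R)        # Kalman gain
    slope = slope + K * (z - H * slope)
    P = (1.0 - K * H) * P

print(slope)
```

On this noise-free data the estimate should end up very close to true_slope, because the chosen intercept matches the one used to generate the data.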
Results:
Here is the code:
from pykalman import KalmanFilter
import numpy as np
import matplotlib.pyplot as plt
# only slope has to be estimated (it will be manipulated by the constant intercept) - mathematically incorrect!
const_intercept = 10
# reading data (quick and dirty)
Datum=[]
EWA=[]
EWC=[]
for line in open('data/dataset.csv'):
    f1, f2, f3 = line.split(';')
    Datum.append(f1)
    EWA.append(float(f2))
    EWC.append(float(f3))
n = len(Datum)
# Filter Configuration
# transition_matrix
F = 1 # identity matrix because x_(k+1) = x_(k) + noise
# observation_matrix
# H_k = [EWA_k]
H = np.matrix(EWA).T[:, np.newaxis]
# transition_covariance
Q = 1e-4
# observation_covariance
R = 1 # max error = 3
# initial_state_mean
X0 = 0
# initial_state_covariance
P0 = 1
# Kalman-Filter initialization
kf = KalmanFilter(n_dim_obs=1, n_dim_state=1,
transition_matrices = F,
observation_matrices = H,
transition_covariance = Q,
observation_covariance = R,
initial_state_mean = X0,
initial_state_covariance = P0)
# Creating the observation based on EWC and the constant intercept
z = EWC[:] # copy the list (not just assign the reference!)
z[:] = [x - const_intercept for x in z]
# Filtering
state_means, state_covs = kf.filter(z) # the estimation for the EWC data minus constant intercept
# Restore EWC based on EWA and estimated parameters
EWC_restored = np.multiply(EWA, state_means[:, 0]) + const_intercept
# Plots
plt.figure(1)
ax1 = plt.subplot(211)
plt.plot(state_means[:, 0], label="Slope")
plt.grid()
plt.legend(loc="upper left")
ax2 = plt.subplot(212)
plt.plot(const_intercept*np.ones((n, 1)), label="Intercept")
plt.grid()
plt.legend(loc="upper left")
# check the result
plt.figure(2)
plt.plot(EWC, label="EWC original")
plt.plot(EWC_restored, label="EWC restored")
plt.grid()
plt.legend(loc="upper left")
plt.show()
I am trying to use bvp4c to solve a system of 4 ODEs. The issue is that one of the boundaries is unknown.
Can bvp4c handle this? In my code L is the unknown I am solving for.
I get an error message printed below.
function mat4bvp
L = 8;
solinit = bvpinit(linspace(0,L,100),@mat4init);
sol = bvp4c(@mat4ode,@mat4bc,solinit);
sint = linspace(0,L);
Sxint = deval(sol,sint);
end
% ------------------------------------------------------------
function dtdpdxdy = mat4ode(s,y,L)
Lambda = 0.3536;
dtdpdxdy = [y(2)
-sin(y(1)) + Lambda*(L-s)*cos(y(1))
cos(y(1))
sin(y(1))];
end
% ------------------------------------------------------------
function res = mat4bc(ya,yb,L)
res = [ ya(1)
ya(2)
ya(3)
ya(4)
yb(1)];
end
% ------------------------------------------------------------
function yinit = mat4init(s)
yinit = [ cos(s)
0
0
0
];
end
Unfortunately I get the following error message:
>> mat4bvp
Not enough input arguments.
Error in mat4bvp>mat4ode (line 13)
-sin(y(1)) + Lambda*(L-s)*cos(y(1))
Error in bvparguments (line 105)
testODE = ode(x1,y1,odeExtras{:});
Error in bvp4c (line 130)
bvparguments(solver_name,ode,bc,solinit,options,varargin);
Error in mat4bvp (line 4)
sol = bvp4c(@mat4ode,@mat4bc,solinit);
One trick to transform a variable end point into a fixed one is to change the time scale. If x'(t)=f(t,x(t)) is the differential equation, set t=L*s, s from 0 to 1, and compute the associated differential equation for y(s)=x(L*s)
y'(s)=L*x'(L*s)=L*f(L*s,y(s))
The next trick to employ is to transform the global variable into a part of the differential equation by computing it as constant function. So the new system is
[ y'(s), L'(s) ] = [ L(s)*f(L(s)*s,y(s)), 0 ]
The value of L then appears as an additional free boundary value (left or right), which raises the number of variables, that is, the dimension of the state vector, to the number of boundary conditions (five here).
I do not have Matlab readily available, but in Python with the tools in scipy this can be implemented as
from math import sin, cos
import numpy as np
from scipy.integrate import solve_bvp, odeint
import matplotlib.pyplot as plt
# The original function with the interval length as parameter
def fun0(t, y, L):
    Lambda = 0.3536
    #print(t, y, L)
    return np.array([ y[1], -np.sin(y[0]) + Lambda*(L-t)*np.cos(y[0]), np.cos(y[0]), np.sin(y[0]) ])
# Wrapper function to apply both tricks to transform variable interval length to a fixed interval.
def fun1(s,y):
    L = y[-1]
    dydt = np.zeros_like(y)
    dydt[:-1] = L*fun0(L*s, y[:-1], L)
    return dydt
# Implement evaluation of the boundary condition residuals:
def bc(ya, yb):
    return [ ya[0],ya[1], ya[2], ya[3], yb[0] ]
# Define the initial mesh with 3 nodes:
x = np.linspace(0, 1, 3)
# This problem has multiple solutions. Try two initial guesses.
L_a=8
L_b=9
y_a = odeint(lambda y,t: fun1(t,y), [0,0,0,0,L_a], x)
y_b = odeint(lambda y,t: fun1(t,y), [0,0,0,0,L_b], x)
# Now we are ready to run the solver.
res_a = solve_bvp(fun1, bc, x, y_a.T)
res_b = solve_bvp(fun1, bc, x, y_b.T)
L_a = res_a.sol(0)[-1]
L_b = res_b.sol(0)[-1]
print("L_a=%.8f, L_b=%.8f" % ( L_a,L_b ))
# Plot the two found solutions. The solution are in a spline form, use this to produce a smooth plot.
x_plot = np.linspace(0, 1, 100)
y_plot_a = res_a.sol(x_plot)[0]
y_plot_b = res_b.sol(x_plot)[0]
plt.plot(L_a*x_plot, y_plot_a, label='L=%.8f'%L_a)
plt.plot(L_b*x_plot, y_plot_b, label='L=%.8f'%L_b)
plt.legend()
plt.xlabel("t")
plt.ylabel("y")
plt.grid(); plt.show()
which produces
Trying different initial values for L finds other solutions on quite different scales, among them
L=0.03195111
L=0.05256775
L=0.05846539
L=0.06888907
L=0.08231966
L=4.50411522
L=6.84868060
L=20.01725616
L=22.53189063
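Both tricks can be sanity-checked on a toy problem whose answer is known. The sketch below is not the problem from the question: it assumes the ODE x'' = -x with x(0) = 0 and x'(0) = 1, and asks for the end point L at which x(L) = 0.5. The exact solution is x = sin(t), so the smallest such L is pi/6. The names rhs, bc and guess are chosen for this illustration.

```python
import numpy as np
from scipy.integrate import solve_bvp

def rhs(s, y):
    # State is [x, v, L] on the fixed interval 0 <= s <= 1, with t = L*s.
    x, v, L = y
    return np.vstack([L * v,               # dx/ds = L * x'
                      -L * x,              # dv/ds = L * x'' = -L * x
                      np.zeros_like(L)])   # L is constant, so dL/ds = 0

def bc(ya, yb):
    # Three conditions for three state components.
    return np.array([ya[0],          # x(0) = 0
                     ya[1] - 1.0,    # x'(0) = 1
                     yb[0] - 0.5])   # x(L) = 0.5

s = np.linspace(0.0, 1.0, 11)
# Rough initial guess: straight line for x, constant v, and L close to 0.5.
guess = np.vstack([0.5 * s, np.ones_like(s), 0.5 * np.ones_like(s)])

sol = solve_bvp(rhs, bc, s, guess, tol=1e-8)
L_found = sol.y[2, 0]
print(sol.success, L_found, np.pi / 6)
```

As in the answer above, a different initial guess for L may converge to a different solution (for example 5*pi/6 here).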
I designed an optical system with an aspheric surface profile. I then had this lens manufactured and measured. I was given a cross-sectional graph from the measurement of the manufactured surface profile. (The surface has rotational symmetry.)
The formula being used to model said aspheric surface is:
How can I fit this generalized equation with my cross sectional curve to obtain corresponding alpha coefficients to the curve? (alpha coefficients are referring to those in the provided formula) I know the radius of curvature of the surface.
I have access to Python and Matlab (no toolboxes) to achieve this. I can also obtain digitized, tabulated data points from the curve.
Assuming you have an array of discrete r values and, for each value in this array, z(r), you want to fit a curve to estimate the parameters of an aspheric lens. I will use lmfit as mentioned here to show one way to do this using Python.
Importing the modules used for this:
import numpy as np
import matplotlib.pyplot as plt
from lmfit import Model, Parameters
Define the function of an aspheric lens:
def asphere_complete(x, r0, k, a2, a4, a6, a8, a10, a12):
    r_squared = x ** 2.
    z_even_r = r_squared * (a2 + (r_squared * (a4 + r_squared * (a6 + r_squared * (a8 + r_squared * (a10 + (r_squared * a12)))))))
    square_root_term = 1 - (1 + k) * ((x / r0) ** 2)
    zg = (x ** 2) / (r0 * (1 + np.sqrt(square_root_term)))
    return z_even_r + zg
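For reference, this is the expression that asphere_complete evaluates, read off from the code above. It reflects this answer's parametrisation, which may label the coefficients differently from the formula in the question.

```latex
z(x) = \frac{x^{2}}{r_0\left(1 + \sqrt{1 - (1+k)\,\dfrac{x^{2}}{r_0^{2}}}\right)}
       + a_2 x^{2} + a_4 x^{4} + a_6 x^{6} + a_8 x^{8} + a_{10} x^{10} + a_{12} x^{12}
```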
As you do not provide any data, I will use the following to create some example data, including artificial noise:
def generate_dummy_data(x, asphere_parameters, noise_sigma, seed=12345):
    np.random.seed(seed)
    return asphere_complete(x, **asphere_parameters) + noise_sigma * np.random.randn(x.shape[0])
The following function does the fitting and plots the resulting curve:
def fit_asphere(r, z, fit_parameters):
    # create two subplots to plot the original data and the fit in one plot and the residual in another
    fig, axarr = plt.subplots(1, 2, figsize=(10, 5))
    fit_plot = axarr[0]
    residuum_plot = axarr[1]
    # configure first plot:
    fit_plot.set_xlabel("r")
    fit_plot.set_ylabel("z")
    fit_plot.grid()
    # configure second plot:
    residuum_plot.set_xlabel("r")
    residuum_plot.set_ylabel(r"$\Delta$z")  # raw string avoids an invalid-escape warning
    residuum_plot.grid()
    # plot original data
    fit_plot.plot(r, z, label="Input")
    # create an lmfit model and the parameters
    function_model = Model(asphere_complete)
    # The fitting procedure may throw ValueErrors, if the radicand gets negative
    try:
        result = function_model.fit(z, fit_parameters, x=r)
        # To plot the resulting curve remove the parameters which were just used for the constraints
        opt_parameters = dict(result.values)
        opt_parameters.pop('r_max', None)
        opt_parameters.pop('radicand', None)
        # calculate z-values of fitted curve:
        z_fitted = asphere_complete(r, **opt_parameters)
        # calculate residual values
        z_residual = z - z_fitted
        # plot fit and residual:
        fit_plot.plot(r, z_fitted, label="Fit")
        residuum_plot.plot(r, z_residual, label="Residual")
        # legends:
        fit_plot.legend(loc="best")
        residuum_plot.legend(loc="best")
        print(result.fit_report())
    except ValueError as val_error:
        print("Fit Failed: ")
        print(val_error)
To set the parameters of the example data I use the Parameters object of lmfit:
if __name__ == "__main__":
    parameters_dummy = Parameters()
    parameters_dummy.add('r0', value=-34.4)
    parameters_dummy.add('k', value=-0.98)
    parameters_dummy.add('a2', value=0)
    parameters_dummy.add('a4', value=-9.67e-9)
    parameters_dummy.add('a6', value=1.59e-10)
    parameters_dummy.add('a8', value=-5.0e-12)
    parameters_dummy.add('a10', value=0)
    parameters_dummy.add('a12', value=-1.0e-19)
Create the example data:
r = np.linspace(0, 35, 1000)
z = generate_dummy_data(r, parameters_dummy, 0.00001)
The reason to use lmfit instead of scipy's curve_fit is that the radicand of the square root may become negative. We need to ensure:
1 - (1 + k) * (r / r0)**2 >= 0
Therefore, we need to define a constraint as mentioned here.
Let's start by defining the parameters we want to use in the fit. The base radius is added directly:
parameters = Parameters()
parameters.add('r0', value=-30, vary=True)
To obey the inequality, add a variable radicand which is not allowed to become less than zero. Instead of letting k take part in the fit normally, make it directly dependent on r0, r_max and radicand. We need to use r_max because the inequality is most problematic for the maximal r. Setting the left-hand side of the inequality at r = r_max equal to radicand and solving for k leads to
k = (r0 / r_max)**2 * (1 - radicand) - 1
which is used as expr below. I use a bool flag to switch the constraint on and off:
keep_radicand_safe = True
if keep_radicand_safe:
    r_max = np.max(r)
    parameters.add('r_max', r_max, vary=False)
    parameters.add('radicand', value=0.98, vary=True, min=0)
    parameters.add('k', expr='(r0/r_max)**2*(1-radicand)-1')
else:
    parameters.add('k', value=-0.98, vary=True)
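A quick check of the expression for k, as a sketch with illustrative numbers (r0 = -30 and r_max = 35 are the starting values used above; the helper names are made up). For any radicand >= 0, the k computed from the expr should reproduce exactly that radicand at r = r_max, and the term under the square root should be no smaller at smaller r.

```python
import math

def k_from_radicand(r0, r_max, radicand):
    # Same formula as the lmfit expr for k.
    return (r0 / r_max) ** 2 * (1 - radicand) - 1

def radicand_at(r, r0, k):
    # The term under the square root in asphere_complete.
    return 1 - (1 + k) * (r / r0) ** 2

r0, r_max = -30.0, 35.0
for radicand in (0.0, 0.5, 0.98):
    k = k_from_radicand(r0, r_max, radicand)
    recovered = radicand_at(r_max, r0, k)
    print(radicand, k, recovered)
    # The constrained k gives back the requested radicand at r_max ...
    assert math.isclose(recovered, radicand, abs_tol=1e-12)
    # ... and the radicand is at least as large for a smaller r.
    assert radicand_at(10.0, r0, k) >= recovered
```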
The remaining parameters are added directly:
parameters.add('a2', value=0, vary=False)
parameters.add('a4', value=0, vary=True)
parameters.add('a6', value=0, vary=True)
parameters.add('a8', value=0, vary=True)
parameters.add('a10', value=0, vary=False)
parameters.add('a12', value=0, vary=True)
Now we are ready to start and get our results:
fit_asphere(r, z, parameters)
plt.show()
On the console you should see the output:
[[Variables]]
r0: -34.3999435 +/- 6.1027e-05 (0.00%) (init = -30)
r_max: 35 (fixed)
radicand: 0.71508611 +/- 0.09385813 (13.13%) (init = 0.98)
k: -0.72477176 +/- 0.09066656 (12.51%) == '(r0/r_max)**2*(1-radicand)-1'
a2: 0 (fixed)
a4: 7.7436e-07 +/- 2.7872e-07 (35.99%) (init = 0)
a6: 2.5547e-10 +/- 6.3330e-11 (24.79%) (init = 0)
a8: -4.9832e-12 +/- 1.7115e-14 (0.34%) (init = 0)
a10: 0 (fixed)
a12: -9.8670e-20 +/- 2.0716e-21 (2.10%) (init = 0)
With the data I use above, you should see the fit fail if keep_radicand_safe is set to False.