I am using curve_fit to fit the step response of a first-order dynamic system in order to estimate the gain and time constant. I use two approaches. The first approach is to fit a curve generated from the model function in the time domain.
# define the first-order dynamics in the time domain
from numpy import exp, linspace
from scipy.optimize import curve_fit

def model(t, gain, tau):
    return gain * (1 - exp(-t / tau))

# define the time intervals
time_interval = linspace(1, 100, 100)

# generate the output using the model with gain = 10 and tau = 4
output = model(time_interval, 10, 4)

# fit to the output and estimate the parameters - gain and tau
par = curve_fit(model, time_interval, output)
Now checking par reveals an array of [10., 4.], which is perfect.
The second approach is to estimate the gain and time constant by fitting to the step response of an LTI system.
The LTI system is defined as a transfer function with a numerator and denominator.
# define the function as the step response of an LTI system.
# The argument x has no significance here; I have included it
# because curve_fit requires passing "x" data to the function.
from scipy.signal import lti

def model1(x, gain1, tau1):
    return lti(gain1, [tau1, 1]).step()[1]

# generate output using the above model
output1 = model1(0, 10, 4)

par1 = curve_fit(model1, 1, output1)
Now checking par1 reveals an array of [1.00024827, 0.01071004], which is wrong. What is wrong with my second approach here? Is there a more efficient way of estimating the transfer function coefficients from the data with curve_fit?
Thank you.
The first three arguments to curve_fit are the function to be fit,
the xdata and the ydata. You have passed xdata=1. Instead you should
give it the time values associated with output1.
One way to do that is to actually use the first argument in the function
model1, like you did in model(). For example:
import numpy as np
from scipy.signal import lti
from scipy.optimize import curve_fit
def model1(x, gain1, tau1):
    y = lti(gain1, [tau1, 1]).step(T=x)[1]
    return y
time_interval = np.linspace(1,100,100)
output1 = model1(time_interval, 10, 4)
par1 = curve_fit(model1, time_interval, output1)
I get [10., 4.] for the parameters, as expected.
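As a small usage note (not part of the original answer): curve_fit returns a tuple of the optimal parameters and their covariance matrix, so unpacking it keeps the result explicit:

popt, pcov = curve_fit(model1, time_interval, output1)
print(popt)   # approximately [10., 4.]
print(pcov)   # covariance of the parameter estimates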
I am trying to replicate in Python, roughly, what Excel's Solver would do.
I have a set of functions like this: P1 = f1(x), P2 = f2(x), Q1 = g1(x) and Q2 = g2(x).
I am trying to find the value of x such that P1 + P2 equals some target and Q1 + Q2 is minimized. Can this be done with scipy? I already know how to set up the P1 + P2 part using fsolve; I just don't know if the Q1 + Q2 requirement can be added. Any idea?
As suggested by joni, this is doable with scipy.optimize.minimize. You could define a function residual as follows:
def residual(x):
    # calculate/define Q1 = g1(x)
    # calculate/define Q2 = g2(x)
    res = Q1 + Q2
    return res
This function can then easily be minimized using a constrained algorithm from scipy.optimize.minimize:
import numpy as np
from scipy.optimize import minimize
x0 = 1 # just for example
res = minimize(residual, x0, method='trust-constr', constraints=your_constraints)
The constraint P1 + P2 = target must be defined and passed to the constraints argument as described here. You have to use a linear or nonlinear constraint depending on the form of your constraint.
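As a minimal sketch (the functions f1, f2, g1 and g2 below are placeholders standing in for your actual relations), the equality constraint can be expressed with scipy.optimize.NonlinearConstraint:

import numpy as np
from scipy.optimize import minimize, NonlinearConstraint

target = 5.0  # example target value for P1 + P2

def f1(x): return x**2        # placeholder for P1 = f1(x)
def f2(x): return 3.0*x       # placeholder for P2 = f2(x)
def g1(x): return (x - 1)**2  # placeholder for Q1 = g1(x)
def g2(x): return np.abs(x)   # placeholder for Q2 = g2(x)

def residual(x):
    # objective: Q1 + Q2
    return g1(x[0]) + g2(x[0])

# equality constraint P1 + P2 = target (lower bound equals upper bound)
con = NonlinearConstraint(lambda x: f1(x[0]) + f2(x[0]), target, target)

res = minimize(residual, x0=[1.0], method='trust-constr', constraints=[con])
print(res.x)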
Using either lmfit.Model or scipy.optimize.curve_fit I have to optimize a function whose output needs to be convolved with some experimental data before being fit to some other experimental data. To sum up, the workflow is something like this:
(1) Function A is defined (for example, a Gaussian function).
(2) The output of function A is convolved with an experimental signal called data B.
(3) The parameters of function A are optimized for the convolution mentioned in (2) to perfectly match some other experimental data called data C.
I am convolving the output of function A with data B using Fourier transforms as follows:
from scipy.fftpack import fft, ifft
def convolve(data_B, function_A):
    # multiply in the frequency domain and transform back to get the circular convolution
    convolved = ifft(fft(data_B) * fft(function_A)).real
    return convolved
How can I use lmfit.Model or scipy.optimize.curve_fit to fit "convolved" to data C?
EDIT: In response to the submitted answer, I have incorporated my convolution step into the equation used for the fit in the following manner:
# 1-component exponential distribution:
def ExpDecay_1(x, ampl1, tau1, y0, x0, args=(new_y_irf)): # new_y_irf is a list.
    h = np.zeros(x.size)
    lengthVec = len(new_y_decay)
    shift_1 = np.remainder(np.remainder(x - np.floor(x0) - 1, lengthVec) + lengthVec, lengthVec)
    shift_Incr1 = (1 - x0 + np.floor(x0)) * new_y_irf[shift_1.astype(int)]
    shift_2 = np.remainder(np.remainder(x - np.ceil(x0) - 1, lengthVec) + lengthVec, lengthVec)
    shift_Incr2 = (x0 - np.floor(x0)) * new_y_irf[shift_2.astype(int)]
    irf_shifted = (shift_Incr1 + shift_Incr2)
    irf_norm = irf_shifted / sum(irf_shifted)
    h = ampl1 * np.exp(-(x) / tau1)
    conv = ifft(fft(h) * fft(irf_norm)).real # This is the convolution step.
    return conv
However, when I try this:
gmodel = Model(ExpDecay_1)
I get this:
gmodel = Model(ExpDecay_1)
Traceback (most recent call last):
  File "", line 1, in
    gmodel = Model(ExpDecay_1)
  File "C:\Users\lopez\Anaconda3\lib\site-packages\lmfit\model.py", line 273, in __init__
    self._parse_params()
  File "C:\Users\lopez\Anaconda3\lib\site-packages\lmfit\model.py", line 477, in _parse_params
    if fpar.default == fpar.empty:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
EDIT 2: I managed to make it work as follows:
import pandas as pd
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
import numpy as np
from lmfit import Model
from scipy.fftpack import fft, ifft
def Test_fit2(x, arg=new_y_irf, data=new_y_decay, num_decay=1):
    IRF = arg
    DATA = data

    def Exp(x, ampl1=1.0, tau1=3.0): # This generates an exponential model.
        res = ampl1*np.exp(-x/tau1)
        return res

    def Conv(IRF, decay): # This convolves a model with the data (data = Instrument Response Function, IRF).
        conv = ifft(fft(decay) * fft(IRF)).real
        return conv

    if num_decay == 1: # If the user chooses to use a model equation with one exponential term.
        def fitting(x, ampl1=1.0, tau1=3.0):
            exponential = Exp(x, ampl1, tau1)
            convolved = Conv(IRF, exponential)
            return convolved
        modelling = Model(fitting)
        res = modelling.fit(DATA, x=new_x_decay, ampl1=1.0, tau1=2.0)

    if num_decay == 2: # If the user chooses to use a model equation with two exponential terms.
        def fitting(x, ampl1=1.0, tau1=3.0, ampl2=1.0, tau2=1.0):
            exponential = Exp(x, ampl1, tau1) + Exp(x, ampl2, tau2)
            convolved = Conv(IRF, exponential)
            return convolved
        modelling = Model(fitting)
        res = modelling.fit(DATA, x=new_x_decay, ampl1=1.0, tau1=2.0)

    if num_decay == 3: # If the user chooses to use a model equation with three exponential terms.
        def fitting(x, ampl1=1.0, tau1=3.0, ampl2=2.0, tau2=1.0, ampl3=3.0, tau3=5.0):
            exponential = Exp(x, ampl1, tau1) + Exp(x, ampl2, tau2) + Exp(x, ampl3, tau3)
            convolved = Conv(IRF, exponential)
            return convolved
        modelling = Model(fitting)
        res = modelling.fit(DATA, x=new_x_decay, ampl1=1.0, tau1=2.0)

    if num_decay == 4: # If the user chooses to use a model equation with four exponential terms.
        def fitting(x, ampl1=1.0, tau1=0.1, ampl2=2.0, tau2=1.0, ampl3=3.0, tau3=5.0, ampl4=1.0, tau4=10.0):
            exponential = Exp(x, ampl1, tau1) + Exp(x, ampl2, tau2) + Exp(x, ampl3, tau3) + Exp(x, ampl4, tau4)
            convolved = Conv(IRF, exponential)
            return convolved
        modelling = Model(fitting)
        res = modelling.fit(DATA, x=new_x_decay, ampl1=1.0, tau1=2.0)

    return res
It is always helpful to post a complete, minimal example of what you are trying to do. Without a complete example, only vague answers are possible.
You could simply do the convolution in your model function that is wrapped by lmfit.Model, passing in the kernel array to use in the convolution. Or you could create a convolution kernel and function and do the convolution as part of the modelling process, as described for example at https://lmfit.github.io/lmfit-py/examples/documentation/model_composite.html
I would imagine that the first approach is easier if the kernel is not actually meant to be changed during the fit, but it's hard to know that for sure without more details.
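For the first approach, a minimal sketch (assuming the names irf_kernel, t and data_C for your IRF, time axis and measured decay) is to declare the kernel as an extra independent variable, so that lmfit never treats it as a fit parameter:

import numpy as np
from lmfit import Model

def exp_conv(x, kernel, ampl=1.0, tau=3.0):
    # single exponential decay convolved (circularly, via FFT) with a fixed kernel
    decay = ampl * np.exp(-x / tau)
    return np.fft.ifft(np.fft.fft(decay) * np.fft.fft(kernel)).real

conv_model = Model(exp_conv, independent_vars=['x', 'kernel'])
# result = conv_model.fit(data_C, x=t, kernel=irf_kernel, ampl=1.0, tau=2.0)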
I'm trying to fit a sine function to my data. No errors are shown, but it doesn't seem to work.
import numpy as np
from scipy.optimize import curve_fit as cf
from matplotlib import pyplot as plt

def sin_fun(x, a, b):
    return a * np.sin(b * x)

p_opt, p_cov = cf(sin_fun, xdata, ydata)
print(p_opt)

plt.plot(xdata, sin_fun(xdata, *p_opt))
plt.scatter(xdata, ydata)
plt.show()
This is the output I am getting:
I have simulated your data. There are two problems with your code that explain why it isn't doing what you want. The first is that your sin_fun needs a y-offset parameter; otherwise the function will always be symmetrical about y = 0. The second is that the fit works better if you can provide curve_fit with a reasonable guess, which is done using the p0 argument. Have a look here:
from scipy.optimize import curve_fit as cf
import numpy as np
from matplotlib import pyplot as plt
# simulate your data
xdata = np.linspace(0, 25000, 256)
ydata = 15000 * np.sin(xdata/2000) + 22000
# add some noise
ydata += np.random.rand(xdata.size) * 2000
# sin function needs a y-offset -> c
def sin_fun(x, a, b, c):
    return a * np.sin(b * x) + c
# need a reasonable guess -> note that the guess is not quite right but curve_fit still works
p_opt,p_cov=cf(sin_fun,xdata,ydata, p0=(10000, 1/2500, 15000))
print(p_opt)
plt.plot(xdata,sin_fun(xdata,*p_opt))
plt.plot(xdata,ydata, 'r.', ms=1)
plt.show()
With these fixes you can get a good fit. You could also add a phase parameter to your function to help fit other sinusoids.
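If you would rather not hard-code the guess, one possible way to automate it (a sketch, not part of the original answer) is to take the offset from the mean, the amplitude from the data spread, and the angular frequency from the dominant FFT component of the mean-subtracted signal:

offset_guess = np.mean(ydata)
ampl_guess = (ydata.max() - ydata.min()) / 2
freqs = np.fft.rfftfreq(xdata.size, d=xdata[1] - xdata[0])
spectrum = np.abs(np.fft.rfft(ydata - offset_guess))
b_guess = 2 * np.pi * freqs[np.argmax(spectrum)]
p_opt, p_cov = cf(sin_fun, xdata, ydata, p0=(ampl_guess, b_guess, offset_guess))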
I'd like to build a GP with marginalized hyperparameters.
I have seen that this is possible with the HMC sampler provided in GPflow, from this notebook.
However, when I tried to run the following code as a first step of this (NOTE this is on gpflow 0.5, an older version), the returned samples are negative, even though the lengthscale and variance need to be positive (negative values would be meaningless).
import numpy as np
from matplotlib import pyplot as plt
import gpflow
from gpflow import hmc
X = np.linspace(-3, 3, 20)
Y = np.random.exponential(np.sin(X) ** 2)
Y = (Y - np.mean(Y)) / np.std(Y)
k = gpflow.kernels.Matern32(1, lengthscales=.2, ARD=False)
m = gpflow.gpr.GPR(X[:, None], Y[:, None], k)
m.kern.lengthscales.prior = gpflow.priors.Gamma(1., 1.)
m.kern.variance.prior = gpflow.priors.Gamma(1., 1.)
# don't want the likelihood variance to be a hyperparameter now, so it is fixed
m.likelihood.variance = 1e-6
m.likelihood.variance.fixed = True
m.optimize(maxiter=1000)
samples = m.sample(500)
print(samples)
Output:
[[-0.43764571 -0.22753325]
[-0.50418501 -0.11070128]
[-0.5932655 0.00821438]
[-0.70217714 0.05077999]
[-0.77745654 0.09362291]
[-0.79404456 0.13649446]
[-0.83989415 0.27118385]
[-0.90355789 0.29589641]
...
I don't know HMC sampling in much detail, but I would expect the sampled posterior hyperparameters to be positive. I've checked the code and it seems this may be related to the Log1pe transform, though I failed to figure it out myself.
Any hint on this?
It would be helpful if you specified which GPflow version you are using - especially given that, from the output you posted, it looks like you are using a really old version of GPflow (pre-1.0), and this is something that has been improved since.

What is happening here (in old GPflow) is that the sample() method returns a single S x P array, where S is the number of samples and P is the number of free parameters [e.g. for an M x M matrix parameter with a lower-triangular transform (such as the Cholesky factor of the covariance of the approximate posterior, q_sqrt), only M * (M + 1)/2 parameters are actually stored and optimised!]. These are the values in the unconstrained space, i.e. they can take any value whatsoever. Transforms (see the gpflow.transforms module) provide the mapping between this value (between plus/minus infinity) and the constrained value (e.g. gpflow.transforms.positive for lengthscales and variances).

In old GPflow, the model provides a get_samples_df() method that takes the S x P array returned by sample() and returns a pandas DataFrame with columns for all the trainable parameters, which would be what you want. Or, ideally, you would just use a recent version of GPflow, in which the HMC sampler directly returns the DataFrame!
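If you do stay on the old version, a minimal sketch of the workaround described above, continuing from the question's m and samples:

# convert the unconstrained S x P sample array into constrained parameter values
# (old, pre-1.0 GPflow API as described above)
sample_df = m.get_samples_df(samples)
print(sample_df.head())  # lengthscale and variance columns are now positive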
I am trying to fit a piecewise (otherwise linear) function to a set of experimental data. The data are such that there are only horizontal error bars and no vertical error bars. I am familiar with the scipy.optimize.curve_fit module, but that works when there are only vertical error bars on the dependent variable y. After searching for my specific need, I came across the following post, which explains the possibility of using the scipy.odr module when the error bars are on the independent variable x (Correct fitting with scipy curve_fit including errors in x?).
Attached is my version of the code, which tries to find the best-fit parameters using the ODR methodology. It actually draws the best-fit function, and it seems to be working. However, after changing the initial (educated guess) values and trying to extract the best-fit parameters, I get back the same guessed parameters I inserted initially. This means the method is not converging, and you can verify this by printing output.stopreason and getting
['Numerical error detected']
So, my question is whether this methodology is appropriate for a piecewise function like mine and, if not, whether there is another correct methodology to adopt in such cases.
from numpy import *
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator
from scipy.odr import ODR, Model, Data, RealData
x_array=array([8.2,8.6,9.,9.4,9.8,10.2,10.6,11.,11.4,11.8])
x_err_array=array([0.2]*10)
y_array=array([-2.05179545,-1.64998354,-1.49136169,-0.94200805,-0.60205999,0.,0.,0.,0.,0.])
y_err_array=array([0]*10)
# Linear Fitting Model
def func(beta, x):
    return piecewise(x, [x < beta[0]], [lambda x: beta[1]*x - beta[1]*beta[0], lambda x: 0.0])
data = RealData(x_array, y_array, x_err_array, y_err_array)
model = Model(func)
odr = ODR(data, model, [10.1,1.02])
odr.set_job(fit_type=0)
output = odr.run()
f, (ax1) = plt.subplots(1, sharex=True, sharey=True, figsize=(10,10))
ax1.errorbar(x_array, y_array, xerr = x_err_array, yerr = y_err_array, ecolor = 'blue', elinewidth = 3, capsize = 3, linestyle = '')
ax1.plot(x_array, func(output.beta, x_array), 'blue', linestyle = 'dotted', label='Best-Fit')
ax1.legend(loc='lower right', ncol=1, fontsize=12)
ax1.set_xlim([7.95, 12.05])
ax1.set_ylim([-2.1, 0.1])
ax1.yaxis.set_major_locator(MaxNLocator(prune='upper'))
ax1.set_ylabel('$y$', fontsize=12)
ax1.set_xlabel('$x$', fontsize=12)
ax1.set_xscale("linear", nonposx='clip')
ax1.set_yscale("linear", nonposy='clip')
ax1.get_xaxis().tick_bottom()
ax1.get_yaxis().tick_left()
f.subplots_adjust(top=0.98,bottom=0.14,left=0.14,right=0.98)
plt.setp([a.get_xticklabels() for a in f.axes[:-1]], visible=True)
plt.show()
An error of 0 for y is causing the problems. Make it small but not zero, e.g. 1e-16; doing so, the fit converges. It also converges if you omit y_err_array when defining RealData, but I am not sure what happens internally in that case.
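A minimal sketch of that fix, applied to the code above:

# replace the exactly-zero y errors with a tiny positive value so ODR does not
# run into numerical problems
y_err_array = array([1e-16]*10)
data = RealData(x_array, y_array, x_err_array, y_err_array)
output = ODR(data, model, [10.1, 1.02]).run()
print(output.stopreason)  # should no longer report 'Numerical error detected'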