How to set x of a function given a target and a constraint? - scipy

I am trying to replicate in Python, more or less, what Excel Solver does.
I have a set of functions like this: P1 = f1(x), P2 = f2(x), Q1 = g1(x) and Q2 = g2(x).
I am trying to find the value of x such that P1 + P2 equals some target while Q1 + Q2 is minimized. Can it be done with scipy? I already know how to handle the P1 + P2 part using fsolve; I just don't know whether the Q1 + Q2 requirement can be added. Any idea?

As suggested by joni, this is doable with scipy.optimize.minimize. You could define a function residual as follows:
def residual(x):
    # calculate/define Q1 = g1(x)
    # calculate/define Q2 = g2(x)
    res = Q1 + Q2
    return res
This function can then easily be minimized using a constrained algorithm from scipy.optimize.minimize:
import numpy as np
from scipy.optimize import minimize

x0 = 1  # just for example
res = minimize(residual, x0, method='trust-constr', constraints=your_constraints)
The constraint P1 + P2 = target must be defined and passed to the constraints argument as described here. You have to choose a linear or nonlinear constraint depending on the form of your constraint.
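For concreteness, here is a minimal sketch of how such an equality constraint could look, assuming f1 and f2 are your P-functions and target is the desired sum (both names are placeholders, not from the original question):
from scipy.optimize import NonlinearConstraint

def p_sum(x):
    return f1(x) + f2(x)  # P1 + P2 (f1, f2 are your own functions)

# enforce P1 + P2 == target by using the same value as lower and upper bound
your_constraints = NonlinearConstraint(p_sum, lb=target, ub=target)
res = minimize(residual, x0, method='trust-constr', constraints=your_constraints)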

Related

How to use `feedback` function in Matlab?

Matlab's feedback function is used to obtain the closed-loop transfer function of a system. Example:
sys = feedback(sys1,sys2) returns a model object sys for the negative feedback interconnection of model objects sys1 and sys2. To compute the closed-loop system with positive feedback, use sign = +1; for negative feedback, use -1.
My question arises when we have a system of the following type (a loop with plant G and controller C; the block diagram is not reproduced here):
According to these docs, we can use feedback to create the negative feedback loop with G and C.
sys = feedback(G*C,-1)
This is a source of confusion; shouldn't the above be sys = feedback(G*C,1,-1)? These are not the same.
However, looking at these docs, for a unit loop gain k, you can compute the closed-loop transfer function T using:
G = tf([.5 1.3],[1 1.2 1.6 0]);
T = feedback(G,1);
Why are we using 1 and not -1? This is still negative feedback, not positive feedback.
G = tf([.5 1.3],[1 1.2 1.6 0]);
T = feedback(G,1);
The 1 in feedback(G,1) represents sys2, and since the function is called with only two inputs, the default is negative unity feedback, per the following line of the documentation:
sys = feedback(sys1,sys2) returns a model object sys for the negative feedback interconnection of model objects sys1,sys2.
Consider the following script:
s = tf('s');
G = 1/s;
T1 = feedback(G,1)
T2 = feedback(G,1,-1)
T1 and T2 are the same.
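If you want to verify the same identity outside Matlab, here is a quick sketch using the python-control package (an assumption on my part, since the question is about Matlab; python-control's feedback() mirrors Matlab's, with sign defaulting to -1):
import control

s = control.tf('s')
G = 1 / s
T1 = control.feedback(G, 1)      # sign defaults to -1 (negative feedback)
T2 = control.feedback(G, 1, -1)  # explicit negative feedback
print(T1)  # 1 / (s + 1)
print(T2)  # identical to T1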

How to define a loss function in PyTorch that depends on the partial derivatives of the model w.r.t. the input?

After reading about how to solve an ODE with neural networks following the paper Neural Ordinary Differential Equations and the blog that uses the library JAX, I tried to do the same thing with "plain" PyTorch but found one point rather "obscure": how to properly use the partial derivative of a function (in this case the model) w.r.t. one of the input parameters.
To summarize the problem at hand: as shown in [2], the goal is to solve the ODE y' = -2*x*y with the condition y(x=0) = 1 in the domain -2 <= x <= 2 (the exact solution is y(x) = exp(-x^2)). Instead of using finite differences, the solution is replaced by a NN, y(x) = NN(x), with a single hidden layer of 10 nodes.
I managed to (more or less) replicate the blog with the following code:
import torch
import torch.nn as nn
from torch import optim
import matplotlib.pyplot as plt
import numpy as np

# Define the NN model to solve the problem
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1, 10)
        self.lin2 = nn.Linear(10, 1)

    def forward(self, x):
        x = torch.sigmoid(self.lin1(x))
        x = torch.sigmoid(self.lin2(x))
        return x

model = Model()

# Define loss_function from the ordinary differential equation to solve
def ODE(x, y):
    dydx, = torch.autograd.grad(y, x,
                                grad_outputs=y.data.new(y.shape).fill_(1),
                                create_graph=True, retain_graph=True)
    eq = dydx + 2. * x * y               # y' = -2xy
    ic = model(torch.tensor([0.])) - 1.  # y(x=0) = 1
    return torch.mean(eq**2) + ic**2

loss_func = ODE

# Define the optimization
# opt = optim.SGD(model.parameters(), lr=0.1, momentum=0.99, nesterov=True)  # equivalent to the blog
opt = optim.Adam(model.parameters(), lr=0.1, amsgrad=True)  # faster convergence with Adam using amsgrad

# Define reference grid
x_data = torch.linspace(-2.0, 2.0, 401, requires_grad=True)
x_data = x_data.view(401, 1)  # reshape the tensor to a column

# Iterative learning
epochs = 1000
for epoch in range(epochs):
    opt.zero_grad()
    y_trial = model(x_data)
    loss = loss_func(x_data, y_trial)
    loss.backward()
    opt.step()
    if epoch % 100 == 0:
        print('epoch {}, loss {}'.format(epoch, loss.item()))

# Plot results (y_trial holds the prediction from the last epoch)
plt.plot(x_data.data.numpy(), np.exp(-x_data.data.numpy()**2), label='exact')
plt.plot(x_data.data.numpy(), y_trial.data.numpy(), label='approx')
plt.legend()
plt.show()
From here I managed to get the results shown in the figure (plot of the exact vs. approximate solution, not reproduced here).
The problem is that in the definition of the ODE functional, instead of passing (x, y) I would prefer to pass something like (x, fun) (where fun is my model), so that the partial derivative and specific evaluations of the model can be done within the call. So, something like:
def ODE(x, fun):
    dydx = "grad of fun w.r.t x as a function"
    eq = dydx(x) + 2. * x * fun(x)     # y' = -2xy
    ic = fun(torch.tensor([0.])) - 1.  # y(x=0) = 1
    return torch.mean(eq**2) + ic**2
Any ideas? Thanks in advance
EDIT:
After some trials I found a way to pass the model as an input, but then hit another strange behavior... The new problem is to solve the ODE y'' = -2 with the BCs y(x=-2) = -1 and y(x=2) = 1, for which the analytical solution is y(x) = -x^2 + x/2 + 4.
Let's modify the previous code a bit:
import torch
import torch.nn as nn
from torch import optim
import matplotlib.pyplot as plt
import numpy as np

# Define the NN model to solve the equation
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1, 10)
        self.lin2 = nn.Linear(10, 1)

    def forward(self, x):
        y = torch.sigmoid(self.lin1(x))
        z = torch.sigmoid(self.lin2(y))
        return z

model = Model()

# Define loss_function from the ordinary differential equation to solve
def ODE(x, fun):
    y = fun(x)
    dydx = torch.autograd.grad(y, x,
                               grad_outputs=y.data.new(y.shape).fill_(1),
                               create_graph=True, retain_graph=True)[0]
    d2ydx2 = torch.autograd.grad(dydx, x,
                                 grad_outputs=dydx.data.new(dydx.shape).fill_(1),
                                 create_graph=True, retain_graph=True)[0]
    eq = d2ydx2 + torch.tensor([2.])                      # y'' = -2
    bc1 = fun(torch.tensor([-2.])) - torch.tensor([-1.])  # y(x=-2) = -1
    bc2 = fun(torch.tensor([ 2.])) - torch.tensor([ 1.])  # y(x= 2) =  1
    return torch.mean(eq**2) + bc1**2 + bc2**2

loss_func = ODE
So here I passed the model as an argument and managed to differentiate twice... so far so good. BUT using the sigmoid function in this case is not only unnecessary, it also gives a result far from the analytical one.
If I change the NN to:
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1, 1)
        self.lin2 = nn.Linear(1, 1)

    def forward(self, x):
        y = self.lin1(x)
        z = self.lin2(y)
        return z
in which case I would expect the optimization of a double pass through two linear layers to recover a 2nd-order function... I get the error:
RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.
Adding the option to the definition of dydx doesn't solve the problem, and adding it to d2ydx2 makes d2ydx2 a NoneType.
Is there something wrong with the layers as they are?
Quick solution:
Add allow_unused=True to the .grad functions. So, change
dydx = torch.autograd.grad(
    y, x,
    grad_outputs=y.data.new(y.shape).fill_(1),
    create_graph=True, retain_graph=True)[0]

d2ydx2 = torch.autograd.grad(dydx, x, grad_outputs=dydx.data.new(
    dydx.shape).fill_(1), create_graph=True, retain_graph=True)[0]
to
dydx = torch.autograd.grad(
    y, x,
    grad_outputs=y.data.new(y.shape).fill_(1),
    create_graph=True, retain_graph=True, allow_unused=True)[0]

d2ydx2 = torch.autograd.grad(dydx, x, grad_outputs=dydx.data.new(
    dydx.shape).fill_(1), create_graph=True, retain_graph=True, allow_unused=True)[0]
More explanation:
See what allow_unused does:
allow_unused (bool, optional): If ``False``, specifying inputs that were not
    used when computing outputs (and therefore their grad is always zero)
    is an error. Defaults to ``False``.
So if you try to differentiate w.r.t. a variable that was not used to compute the value, it will raise an error. Also, note that the error only occurs when you use linear layers.
This is because with purely linear layers you have y = W2*W1*x + b = W*x + b, and dy/dx is not a function of x; it is simply W. So when you try to differentiate dy/dx w.r.t. x, it throws an error. The error goes away as soon as you use sigmoid, because then dy/dx is a function of x. To avoid the error, either make sure dy/dx is a function of x or use allow_unused=True.
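To make this concrete, here is a minimal standalone sketch of the phenomenon (the tensors are hypothetical and only illustrate the graph structure):
import torch

w = torch.tensor([[3.]], requires_grad=True)  # plays the role of a linear layer's weight
x = torch.linspace(-1., 1., 5, requires_grad=True).view(-1, 1)
y = x @ w.t() + 1.                            # purely linear in x

# dy/dx equals w everywhere; its graph involves w but not x
dydx = torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y),
                           create_graph=True)[0]

# Without allow_unused=True, the next call raises
# "One of the differentiated Tensors appears to not have been used in the graph."
d2ydx2 = torch.autograd.grad(dydx, x, grad_outputs=torch.ones_like(dydx),
                             allow_unused=True)[0]
print(d2ydx2)  # None -- mathematically, d2y/dx2 is identically zero here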

ipython non-linear least squares with constraints equations

I am new to IPython and need to solve a specific curve-fitting problem; I have the concept, but my programming knowledge is still too limited. I have experimental data (x, y) to fit to an equation (curve fitting) with four coefficients (a, b, c, d). I would like to fix one of these coefficients (e.g. a) to a specific value and refit my experimental data (non-linear least squares). Coefficients b, c and d are not independent of one another; they are related by a system of equations.
Is it more appropriate to use curve_fit or lmfit?
I started this with curve_fit:
import numpy as np
from scipy.optimize import curve_fit

def fitfunc(x, a, b, c, d):
    return a + b*x + c/x + np.log10(x)*d

popt, fitcov = curve_fit(fitfunc, x, y)
or code like this with lmfit:
import numpy as np
from lmfit import minimize, Parameters, Parameter, report_fit

def fctmin(params, x, y):
    a = params['a'].value
    b = params['b'].value
    c = params['c'].value
    d = params['d'].value
    model = a + b*x + c/x + d*np.log10(x)
    return model - y

# create parameters
params = Parameters()
params.add('a', value=-89)
params.add('b', value=b)
params.add('c', value=c)
params.add('d', value=d)

# fit with least squares
result = minimize(fctmin, params, args=(x, y))

# calculate results
final = y + result.residual
report_fit(params)
I'll admit to being biased. curve_fit() is designed to simplify scipy.optimize.leastsq() by assuming that you are fitting y(x) data to a model for y(x, parameters), so the function you pass to curve_fit() is one that calculates the model for the values to be fit. lmfit is a bit more general and flexible in that your objective function returns the array to be minimized in the least-squares sense, so it has to return "model - data" instead of just "model".
But, lmfit has features that appear to do exactly what you want: fix one of the parameters in the model without having to rewrite the objective function.
That is, you could say
params.add('a', value=-89, vary=False)
and the parameter 'a' will stay fixed. To do that with curve_fit() you have to rewrite your model function, as sketched below.
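For comparison, here is a sketch of what "rewriting the model" means for curve_fit() (the function name is mine, and it assumes numpy imported as np):
A_FIXED = -89  # the value to hold 'a' at

def fitfunc_fixed_a(x, b, c, d):
    return A_FIXED + b*x + c/x + np.log10(x)*d

popt, pcov = curve_fit(fitfunc_fixed_a, x, y)  # only b, c, d are varied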
In addition, you say that b, c and d are related by equations, but don't give details. With lmfit, you might be able to include these equations as constraints. You have
params.add('b', value=b)
params.add('c', value=c)
params.add('d', value=d)
though I don't see a value for b. Even assuming there is a value, this creates three independent variables with the same starting value. You might mean "vary b, and force c and d to have the same value". lmfit can do this with:
params.add('b', value=10)
params.add('c', expr='b')
params.add('d', expr='c')
That will have one independent variable, and the value for c will be forced to the value of b (and d to c). You can use (approximately) any valid Python statement as a constraint expression, for example:
params.add('b', value=10)
params.add('c', expr='sqrt(b/10)')
params.add('d', expr='1-c')
I think that might be the sort of thing you're looking for.
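Putting the pieces together, a minimal end-to-end sketch might look like this (the starting values are hypothetical, and the expr strings stand in for your actual system of equations):
import numpy as np
from lmfit import minimize, Parameters, report_fit

def fctmin(params, x, y):
    a = params['a'].value
    b = params['b'].value
    c = params['c'].value
    d = params['d'].value
    return (a + b*x + c/x + d*np.log10(x)) - y

params = Parameters()
params.add('a', value=-89, vary=False)  # fixed, not varied in the fit
params.add('b', value=10)               # the only independent variable
params.add('c', expr='sqrt(b/10)')      # c tied to b
params.add('d', expr='1-c')             # d tied to c

result = minimize(fctmin, params, args=(x, y))
report_fit(params)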

BFGS Fails to Converge

The model I'm working on is a multinomial logit choice model. It's a very specific dataset, so other existing MNLogit libraries don't fit my data.
So basically, it's a very complex function which takes 11 parameters and returns a negative log-likelihood value. I then need to find the parameter values that minimize it using scipy.optimize.minimize.
Here are the problems I encounter with the different methods:
'Nelder-Mead': it works well and always gives me the correct answer. However, it's EXTREMELY slow. For another function with a more complicated setup, it takes 15 hours to get to the optimal point. At the same time, the same function takes only 1 hour in Matlab using fminunc (which uses BFGS by default).
'BFGS': this is the method used by Matlab. It works well for any simple function. However, for the function that I have, it always fails to converge and returns 'Desired error not necessarily achieved due to precision loss.' I've spent lots of time playing around with the options, but it still fails to converge.
'Powell': it quickly converges successfully but returns a wrong answer. The code is printed below (x0 is the correct answer; Nelder-Mead works for whatever initial value), and you can get the data here: https://www.dropbox.com/s/aap2dhor5jyxy94/data.csv
Thanks!
import pandas as pd
import numpy as np
from scipy.optimize import minimize

# https://www.dropbox.com/s/aap2dhor5jyxy94/data.csv
df = pd.read_csv('data.csv', index_col=0)
dfhh = df.hh
B = df.ix[:, 'b0':'b4'].values        # NT*5
P = df.ix[:, 'p1':'p4'].values        # NT*4
F = df.ix[:, 'f1':'f4'].values        # NT*4
SDV = df.ix[:, 'lagb1':'lagb4'].values

def Li(x):
    b1 = x[0]   # coeff on prices
    b2 = x[1]   # coeff on features
    a = x[2:7]  # take the next 4 values as alphas
    E = np.exp(a + b1*P + b2*F)     # (1*4) + (NT*4) + (NT*4): build an (NT*J) matrix of exp() terms
    E = np.insert(E, 0, 1, axis=1)  # NT*5
    denom = E.sum(1)
    return -np.log((B * E).sum(1) / denom).sum()

x0 = np.array([-32.31028223, 0.23965953, 0.84739154, 0.25418215, -3.38757007, -0.38036966])
np.random.seed(0)
x0 = x0 + np.random.rand(6)

minL = minimize(Li, x0, method='Nelder-Mead', options={'xtol': 1e-8, 'disp': True})
# minL = minimize(Li, x0, method='BFGS')
# minL = minimize(Li, x0, method='Powell', options={'xtol': 1e-12, 'ftol': 1e-12})
print minL
Update (03/07/14): simpler version of the code.
Now Powell works well with very small tolerances; however, Powell is slower than Nelder-Mead in this case. BFGS still fails to converge.

Fitting transfer function models in Scipy.signal

I am using curve_fit to fit the step response of a first-order dynamic system in order to estimate the gain and time constant. I use two approaches. The first approach is to fit, in the time domain, the curve generated from the function.
import numpy as np
from scipy.optimize import curve_fit

# define the first-order dynamics in the time domain
def model(t, gain, tau):
    return gain * (1 - np.exp(-t/tau))

# define the time intervals
time_interval = np.linspace(1, 100, 100)

# generate the output using the model with gain = 10 and tau = 4
output = model(time_interval, 10, 4)

# fit to output and estimate the parameters -- gain and tau
par = curve_fit(model, time_interval, output)
Now checking par reveals an array of 10 and 4, which is perfect.
The second approach is to estimate the gain and time constant by fitting to the step response of an LTI system.
The LTI system is defined as a transfer function with numerator and denominator.
# define the function as the step response of an LTI system.
# The argument x has no significance here; I have included it
# because curve_fit requires passing "x" data to the function.
def model1(x, gain1, tau1):
    return lti(gain1, [tau1, 1]).step()[1]

# generate output using the above model
output1 = model1(0, 10, 4)
par1 = curve_fit(model1, 1, output1)
Now checking par1 reveals an array of [1.00024827, 0.01071004], which is wrong. What is wrong with my second approach here? Is there a more efficient way of estimating the transfer function coefficients from the data with curve_fit?
Thank you
The first three arguments to curve_fit are the function to be fit, the xdata and the ydata. You have passed xdata=1. Instead, you should give it the time values associated with output1.
One way to do that is to actually use the first argument in the function model1, like you did in model(). For example:
import numpy as np
from scipy.signal import lti
from scipy.optimize import curve_fit

def model1(x, gain1, tau1):
    y = lti(gain1, [tau1, 1]).step(T=x)[1]
    return y

time_interval = np.linspace(1, 100, 100)
output1 = model1(time_interval, 10, 4)
par1 = curve_fit(model1, time_interval, output1)
I get [10., 4.] for the parameters, as expected.
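As a small usage note: curve_fit returns both the optimal parameters and their covariance matrix, so the fitted values can be unpacked explicitly:
popt, pcov = curve_fit(model1, time_interval, output1)
print(popt)  # array([10., 4.])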