Optimizing the convolution of a function with lmfit.Model or scipy.optimize.curve_fit - scipy

Using either lmfit.Model or scipy.optimize.curve_fit I have to optimize a function whose output needs to be convolved with some experimental data before being fit to some other experimental data. To sum up, the workflow is something like this:
(1) Function A is defined (for example, a Gaussian function).
(2) The output of function A is convolved with an experimental signal called data B.
(3) The parameters of function A are optimized for the convolution mentioned in (2) to perfectly match some other experimental data called data C.
I am convolving the output of function A with data B using Fourier transforms as follows:
from scipy.fftpack import fft, ifft

def convolve(data_B, function_A):
    # FFT-based (circular) convolution of the model output with data B.
    convolved = ifft(fft(data_B) * fft(function_A)).real
    return convolved
How can I use lmfit.Model or scipy.optimize.curve_fit to fit "convolved" to data C?
EDIT: In response to the submitted answer, I have incorporated my convolution step into the equation used for the fit in the following manner:
# 1-component exponential distribution:
def ExpDecay_1(x, ampl1, tau1, y0, x0, args=(new_y_irf)): # new_y_irf is a list.
    h = np.zeros(x.size)
    lengthVec = len(new_y_decay)
    shift_1 = np.remainder(np.remainder(x-np.floor(x0)-1, lengthVec) + lengthVec, lengthVec)
    shift_Incr1 = (1 - x0 + np.floor(x0))*new_y_irf[shift_1.astype(int)]
    shift_2 = np.remainder(np.remainder(x-np.ceil(x0)-1, lengthVec) + lengthVec, lengthVec)
    shift_Incr2 = (x0 - np.floor(x0))*new_y_irf[shift_2.astype(int)]
    irf_shifted = (shift_Incr1 + shift_Incr2)
    irf_norm = irf_shifted/sum(irf_shifted)
    h = ampl1*np.exp(-(x)/tau1)
    conv = ifft(fft(h) * fft(irf_norm)).real # This is the convolution step.
    return conv
However, when I try this:
gmodel = Model(ExpDecay_1)
I get this:
Traceback (most recent call last):
  File "", line 1, in
    gmodel = Model(ExpDecay_1)
  File "C:\Users\lopez\Anaconda3\lib\site-packages\lmfit\model.py", line 273, in __init__
    self._parse_params()
  File "C:\Users\lopez\Anaconda3\lib\site-packages\lmfit\model.py", line 477, in _parse_params
    if fpar.default == fpar.empty:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
EDIT 2: I managed to make it work as follows:
import pandas as pd
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
import numpy as np
from lmfit import Model
from scipy.fftpack import fft, ifft

def Test_fit2(x, arg=new_y_irf, data=new_y_decay, num_decay=1):
    IRF = arg
    DATA = data

    def Exp(x, ampl1=1.0, tau1=3.0): # This generates an exponential model.
        res = ampl1*np.exp(-x/tau1)
        return res

    def Conv(IRF, decay): # This convolves a model with the data (data = Instrument Response Function, IRF).
        conv = ifft(fft(decay) * fft(IRF)).real
        return conv

    if num_decay == 1: # If the user chooses to use a model equation with one exponential term.
        def fitting(x, ampl1=1.0, tau1=3.0):
            exponential = Exp(x, ampl1, tau1)
            convolved = Conv(IRF, exponential)
            return convolved
        modelling = Model(fitting)
        res = modelling.fit(DATA, x=new_x_decay, ampl1=1.0, tau1=2.0)

    if num_decay == 2: # If the user chooses to use a model equation with two exponential terms.
        def fitting(x, ampl1=1.0, tau1=3.0, ampl2=1.0, tau2=1.0):
            exponential = Exp(x, ampl1, tau1) + Exp(x, ampl2, tau2)
            convolved = Conv(IRF, exponential)
            return convolved
        modelling = Model(fitting)
        res = modelling.fit(DATA, x=new_x_decay, ampl1=1.0, tau1=2.0)

    if num_decay == 3: # If the user chooses to use a model equation with three exponential terms.
        def fitting(x, ampl1=1.0, tau1=3.0, ampl2=2.0, tau2=1.0, ampl3=3.0, tau3=5.0):
            exponential = Exp(x, ampl1, tau1) + Exp(x, ampl2, tau2) + Exp(x, ampl3, tau3)
            convolved = Conv(IRF, exponential)
            return convolved
        modelling = Model(fitting)
        res = modelling.fit(DATA, x=new_x_decay, ampl1=1.0, tau1=2.0)

    if num_decay == 4: # If the user chooses to use a model equation with four exponential terms.
        def fitting(x, ampl1=1.0, tau1=0.1, ampl2=2.0, tau2=1.0, ampl3=3.0, tau3=5.0, ampl4=1.0, tau4=10.0):
            exponential = Exp(x, ampl1, tau1) + Exp(x, ampl2, tau2) + Exp(x, ampl3, tau3) + Exp(x, ampl4, tau4)
            convolved = Conv(IRF, exponential)
            return convolved
        modelling = Model(fitting)
        res = modelling.fit(DATA, x=new_x_decay, ampl1=1.0, tau1=2.0)

    return res
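For reference, a hypothetical call to this function might look like the following (new_x_decay, new_y_decay and new_y_irf are assumed to be the experimental arrays already defined elsewhere, exactly as in the code above):

result = Test_fit2(new_x_decay, arg=new_y_irf, data=new_y_decay, num_decay=1)
print(result.fit_report())   # lmfit's summary of best-fit values and uncertainties

plt.plot(new_x_decay, new_y_decay, label='measured decay')
plt.plot(new_x_decay, result.best_fit, label='convolved fit')
plt.legend()
plt.show()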

It is always helpful to post a complete, minimal example of what you are trying to do. Without a complete example, only vague answers are possible.
You could simply do the convolution in your model function that is wrapped by lmfit.Model, passing in the kernel array to use in the convolution. Or you could create a convolution kernel and function, and do the convolution as part of the modeling process, as described for example at https://lmfit.github.io/lmfit-py/examples/documentation/model_composite.html
I would imagine that the first approach is easier if the kernel is not actually meant to be changed during the fit, but it's hard to know that for sure without more details.
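For concreteness, here is a minimal, self-contained sketch of that first approach. The arrays below (x, irf, data_C) are synthetic stand-ins for the time axis, data B and data C, not the poster's actual data:

import numpy as np
from scipy.fftpack import fft, ifft
from lmfit import Model

# Synthetic stand-ins for the experimental arrays.
x = np.linspace(0, 50, 512)
irf = np.exp(-0.5*((x - 5.0)/0.5)**2)   # a narrow Gaussian playing the role of data B
irf /= irf.sum()                        # normalize the kernel once, outside the fit

def exp_convolved(x, ampl=1.0, tau=3.0):
    # Single-exponential decay, circularly convolved with the fixed IRF via the FFT.
    # The IRF itself is not a fit parameter; only ampl and tau are varied.
    decay = ampl*np.exp(-x/tau)
    return ifft(fft(decay)*fft(irf)).real

# Fake "data C": the convolved model with known parameters plus a little noise.
rng = np.random.default_rng(0)
data_C = exp_convolved(x, ampl=2.0, tau=4.0) + rng.normal(0, 0.002, x.size)

gmodel = Model(exp_convolved)
result = gmodel.fit(data_C, x=x, ampl=1.0, tau=2.0)
print(result.fit_report())   # recovers ampl ~ 2 and tau ~ 4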

Related

Fitting a neural network with ReLUs to polynomial functions

Out of curiosity I am trying to fit a neural network with rectified linear units to polynomial functions.
For example, I would like to see how easy (or difficult) it is for a neural network to come up with an approximation for the function f(x) = x^2 + x. The following code should be able to do it, but does not seem to learn anything. When I run
using Base.Iterators: repeated
ENV["JULIA_CUDA_SILENT"] = true
using Flux
using Flux: throttle
using Random
f(x) = x^2 + x
x_train = shuffle(1:1000)
y_train = f.(x_train)
x_train = hcat(x_train...)
m = Chain(
    Dense(1, 45, relu),
    Dense(45, 45, relu),
    Dense(45, 1),
    softmax
)
function loss(x, y)
    Flux.mse(m(x), y)
end
evalcb = () -> @show(loss(x_train, y_train))
opt = ADAM()
@show loss(x_train, y_train)
dataset = repeated((x_train, y_train), 50)
Flux.train!(loss, params(m), dataset, opt, cb = throttle(evalcb, 10))
println("Training finished")
@show m([20])
it returns
loss(x_train, y_train) = 2.0100101f14
loss(x_train, y_train) = 2.0100101f14
loss(x_train, y_train) = 2.0100101f14
Training finished
m([20]) = Float32[1.0]
Anyone here sees how I could make the network fit f(x) = x^2 + x?
There seem to be a couple of things wrong with your attempt that have mostly to do with how you use your optimizer and treat your input -- nothing is wrong with Julia or Flux. The solution provided below does learn, but is by no means optimal.
It makes no sense to have a softmax output activation in a regression problem. Softmax is used in classification problems, where the output(s) of your model represent probabilities and therefore should be on the interval (0,1). Your polynomial clearly takes values outside this interval. Regression problems like this one usually use a linear output activation, which in Flux means defining no activation on the output layer.
The shape of your data matters. train! computes gradients for loss(d...) where d is a batch in your data. In your case a minibatch consists of 1000 samples, and this same batch is repeated 50 times. Neural nets are often trained with smaller batch sizes but a larger sample set. In the code I provided, all batches consist of different data.
For training neural nets, in general, it is advised to normalize your input. Your input takes values from 1 to 1000. My example applies a simple linear transformation to get the input data in the right range.
Normalization can also apply to the output. If the outputs are large, this can result in (too) large gradients and weight updates. Another approach is to lower the learning rate a lot.
using Flux
using Flux: @epochs
using Random
normalize(x) = x/1000
function generate_data(n)
    f(x) = x^2 + x
    xs = reduce(hcat, rand(n)*1000)
    ys = f.(xs)
    (normalize(xs), normalize(ys))
end
batch_size = 32
num_batches = 10000
data_train = Iterators.repeated(generate_data(batch_size), num_batches)
data_test = generate_data(100)
model = Chain(Dense(1,40, relu), Dense(40,40, relu), Dense(40, 1))
loss(x,y) = Flux.mse(model(x), y)
opt = ADAM()
ps = Flux.params(model)
Flux.train!(loss, ps, data_train, opt, cb = () -> @show loss(data_test...))

How to define a loss function in pytorch with dependency to partial derivatives of the model w.r.t input?

After reading about how to solve an ODE with neural networks, following the paper Neural Ordinary Differential Equations and the blog that uses the library JAX, I tried to do the same thing with "plain" PyTorch but found one point rather obscure: how to properly use the partial derivative of a function (in this case the model) w.r.t. one of the input parameters.
To summarize, the problem at hand (as shown in [2]) is to solve the ODE y' = -2*x*y with the condition y(x=0) = 1 on the domain -2 <= x <= 2. Instead of using finite differences, the solution is replaced by a NN, y(x) = NN(x), with a single layer of 10 nodes.
I managed to (more or less) replicate the blog with the following code
import torch
import torch.nn as nn
from torch import optim
import matplotlib.pyplot as plt
import numpy as np

# Define the NN model to solve the problem
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1,10)
        self.lin2 = nn.Linear(10,1)
    def forward(self, x):
        x = torch.sigmoid(self.lin1(x))
        x = torch.sigmoid(self.lin2(x))
        return x

model = Model()

# Define loss_function from the Ordinary differential equation to solve
def ODE(x,y):
    dydx, = torch.autograd.grad(y, x,
        grad_outputs=y.data.new(y.shape).fill_(1),
        create_graph=True, retain_graph=True)
    eq = dydx + 2.* x * y   # y' = - 2x*y
    ic = model(torch.tensor([0.])) - 1.   # y(x=0) = 1
    return torch.mean(eq**2) + ic**2

loss_func = ODE

# Define the optimization
# opt = optim.SGD(model.parameters(), lr=0.1, momentum=0.99, nesterov=True) # Equivalent to blog
opt = optim.Adam(model.parameters(), lr=0.1, amsgrad=True) # Got faster convergence with Adam using amsgrad

# Define reference grid
x_data = torch.linspace(-2.0, 2.0, 401, requires_grad=True)
x_data = x_data.view(401, 1) # reshaping the tensor

# Iterative learning
epochs = 1000
for epoch in range(epochs):
    opt.zero_grad()
    y_trial = model(x_data)
    loss = loss_func(x_data, y_trial)
    loss.backward()
    opt.step()
    if epoch % 100 == 0:
        print('epoch {}, loss {}'.format(epoch, loss.item()))

# Plot Results
plt.plot(x_data.data.numpy(), np.exp(-x_data.data.numpy()**2), label='exact')
plt.plot(x_data.data.numpy(), y_trial.data.numpy(), label='approx')
plt.legend()
plt.show()
From here I manage to get the results shown in the figure.
The problem is that, in the definition of the ODE functional, instead of passing (x, y) I would prefer to pass something like (x, fun) (where fun is my model), so that the partial derivative and specific evaluations of the model can be done with a call. So, something like
def ODE(x, fun):
    dydx, = "grad of fun w.r.t x as a function"
    eq = dydx(x) + 2.* x * fun(x)   # y' = - 2x*y
    ic = fun( torch.tensor([0.]) ) - 1.   # y(x=0) = 1
    return torch.mean(eq**2) + ic**2
Any ideas? Thanks in advance
EDIT:
After some trials I found a way to pass the model as an input, but found another strange behavior... The new problem is to solve the ODE y'' = -2 with the BCs y(x=-2) = -1 and y(x=2) = 1, for which the analytical solution is y(x) = -x^2 + x/2 + 4.
Let's modify the previous code a bit:
import torch
import torch.nn as nn
from torch import optim
import matplotlib.pyplot as plt
import numpy as np

# Define the NN model to solve the equation
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1,10)
        self.lin2 = nn.Linear(10,1)
    def forward(self, x):
        y = torch.sigmoid(self.lin1(x))
        z = torch.sigmoid(self.lin2(y))
        return z

model = Model()

# Define loss_function from the Ordinary differential equation to solve
def ODE(x, fun):
    y = fun(x)
    dydx = torch.autograd.grad(y, x,
        grad_outputs=y.data.new(y.shape).fill_(1),
        create_graph=True, retain_graph=True)[0]
    d2ydx2 = torch.autograd.grad(dydx, x,
        grad_outputs=dydx.data.new(dydx.shape).fill_(1),
        create_graph=True, retain_graph=True)[0]
    eq  = d2ydx2 + torch.tensor([ 2.])                    # y'' = - 2
    bc1 = fun(torch.tensor([-2.])) - torch.tensor([-1.])  # y(x=-2) = -1
    bc2 = fun(torch.tensor([ 2.])) - torch.tensor([ 1.])  # y(x= 2) = 1
    return torch.mean(eq**2) + bc1**2 + bc2**2

loss_func = ODE
So here I passed the model as an argument and managed to differentiate twice... so far so good. BUT, using the sigmoid function in this case is not only unnecessary, it also gives a result that is far from the analytical one.
If I change the NN to:
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1,1)
        self.lin2 = nn.Linear(1,1)
    def forward(self, x):
        y = self.lin1(x)
        z = self.lin2(y)
        return z
In which case I would expect to optimize a double pass through two linear functions that would retrieve a 2nd order function ... I get the error:
RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.
Adding the option to the definition of dydx doesn't solve the problem, and adding it to d2ydx2 gives a NoneType definition.
Is there something wrong with the layers as they are?
Quick solution:
Add allow_unused=True to the .grad calls. So, change
dydx = torch.autograd.grad(
    y, x,
    grad_outputs=y.data.new(y.shape).fill_(1),
    create_graph=True, retain_graph=True)[0]
d2ydx2 = torch.autograd.grad(dydx, x, grad_outputs=dydx.data.new(
    dydx.shape).fill_(1), create_graph=True, retain_graph=True)[0]
To
dydx = torch.autograd.grad(
    y, x,
    grad_outputs=y.data.new(y.shape).fill_(1),
    create_graph=True, retain_graph=True, allow_unused=True)[0]
d2ydx2 = torch.autograd.grad(dydx, x, grad_outputs=dydx.data.new(
    dydx.shape).fill_(1), create_graph=True, retain_graph=True, allow_unused=True)[0]
More explanation:
See what allow_unused does:
allow_unused (bool, optional): If ``False``, specifying inputs that were not
    used when computing outputs (and therefore their grad is always zero)
    is an error. Defaults to ``False``.
So, if you try to differentiate w.r.t. a variable that is not being used to compute the value, it will give an error. Also, note that the error only occurs when you use linear layers.
This is because when you use linear layers, you have y = W1*W2*x + b = W*x + b, and dy/dx is not a function of x; it is simply W. So when you try to differentiate dy/dx w.r.t. x, it throws an error. The error goes away as soon as you use sigmoid, because then dy/dx is a function of x. To avoid the error, either make sure dy/dx is a function of x or use allow_unused=True.
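As a minimal, self-contained illustration of this point (the tensors here are made up for the example, not taken from the question):

import torch

x = torch.linspace(-2.0, 2.0, 5, requires_grad=True).view(-1, 1)

# Purely linear model: y = x*w + b, so dy/dx is a constant that does not depend on x.
w = torch.tensor([[3.0]], requires_grad=True)
b = torch.tensor([1.0], requires_grad=True)
y = x @ w.t() + b

dydx = torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y),
                           create_graph=True, retain_graph=True)[0]

# dydx does not depend on x, so asking for d2y/dx2 without allow_unused raises
# "One of the differentiated Tensors appears to not have been used in the graph".
d2ydx2 = torch.autograd.grad(dydx, x, grad_outputs=torch.ones_like(dydx),
                             create_graph=True, retain_graph=True,
                             allow_unused=True)[0]
print(d2ydx2)   # None: the second derivative is identically zero here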

Merging two tensors by convolution in Keras

I'm trying to convolve two 1D tensors in Keras.
I get two inputs from other models:
x - of length 100
ker - of length 5
I would like to get the 1D convolution of x using the kernel ker.
I wrote a Lambda layer to do it:
import tensorflow as tf
from keras.layers import Input, Lambda
from keras.models import Model

def convolve1d(x):
    y = tf.nn.conv1d(value=x[0], filters=x[1], padding='VALID', stride=1)
    return y

x = Input(shape=(100,))
ker = Input(shape=(5,))
y = Lambda(convolve1d)([x,ker])
model = Model([x,ker], [y])
I get the following error:
ValueError: Shape must be rank 4 but is rank 3 for 'lambda_67/conv1d/Conv2D' (op: 'Conv2D') with input shapes: [?,1,100], [1,?,5].
Can anyone help me understand how to fix it?
It was much harder than I expected because Keras and TensorFlow don't expect any batch dimension in the convolution kernel, so I had to write the loop over the batch dimension myself, which requires specifying batch_shape instead of just shape in the Input layer. Here it is:
import numpy as np
import tensorflow as tf
import keras
from keras import backend as K
from keras import Input, Model
from keras.layers import Lambda

def convolve1d(x):
    input, kernel = x
    output_list = []
    if K.image_data_format() == 'channels_last':
        kernel = K.expand_dims(kernel, axis=-2)
    else:
        kernel = K.expand_dims(kernel, axis=0)
    for i in range(batch_size):  # Loop over batch dimension
        output_temp = tf.nn.conv1d(value=input[i:i+1, :, :],
                                   filters=kernel[i, :, :],
                                   padding='VALID',
                                   stride=1)
        output_list.append(output_temp)
        print(K.int_shape(output_temp))
    return K.concatenate(output_list, axis=0)

batch_size = 1  # fixed batch size, matching batch_input_shape below
batch_input_shape = (1, 100, 1)
batch_kernel_shape = (1, 5, 1)

x = Input(batch_shape=batch_input_shape)
ker = Input(batch_shape=batch_kernel_shape)
y = Lambda(convolve1d)([x,ker])
model = Model([x, ker], [y])

a = np.ones(batch_input_shape)
b = np.ones(batch_kernel_shape)
c = model.predict([a, b])
In the current state:
It doesn't work for inputs (x) with multiple channels.
If you provide several filters, you get as many outputs, each being the convolution of the input with the corresponding kernel.
From the given code it is difficult to tell what you mean when you say
is it possible
But if what you mean is to merge two layers and feed the merged layer to a convolution, then yes, it is possible.
x = Input(shape=(100,))
ker = Input(shape=(5,))
merged = keras.layers.concatenate([x,ker], axis=-1)
y = K.conv1d(merged, 'same')
model = Model([x,ker], y)
EDIT:
@user2179331 thanks for clarifying your intention. You are now using the Lambda class incorrectly, which is why the error message appears.
But what you are trying to do can be achieved using keras.backend layers.
Note, though, that when using lower-level layers you lose some higher-level abstraction. E.g. when using keras.backend.conv1d you need an input of shape (batch_size, width, channels) and a kernel of shape (kernel_size, input_channels, output_channels). So in your case let us assume that x has 1 channel (input_channels == 1) and y has the same number of channels (output_channels == 1).
So your code now can be refactored as follows
from keras import backend as K
def convolve1d(x,kernel):
y = K.conv1d(x,kernel, padding='valid', strides=1,data_format="channels_last")
return y
input_channels = 1
output_channels = 1
kernel_width = 5
input_width = 100
ker = K.variable(K.random_uniform([kernel_width,input_channels,output_channels]),K.floatx())
x = Input(shape=(input_width,input_channels)
y = convolve1d(x,ker)
I think I have understood what you mean. Given the (incorrect) example code below:
input_signal = Input(shape=(L), name='input_signal')
input_h = Input(shape=(N), name='input_h')
faded= Lambda(lambda x: tf.nn.conv1d(input, x))(input_h)
You want to convolve each signal vector with a different vector of fading coefficients.
The 'conv' operations in TensorFlow, e.g. tf.nn.conv1d, only support a fixed-value kernel. Therefore, the code above cannot run as you want.
I do not have a better idea either. The code you gave can run normally; however, it is too complex and not efficient. Another feasible but also inefficient way is to multiply by a Toeplitz matrix built from the shifted fading-coefficient vector (see the sketch below). When the signal vector is very long, the matrix will be extremely large.
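A minimal numpy/scipy sketch of that Toeplitz idea (the array names and sizes here are invented for the example):

import numpy as np
from scipy.linalg import toeplitz

signal = np.random.randn(100)   # one signal vector
ker = np.random.randn(5)        # one fading-coefficient vector

# Build a (104 x 100) Toeplitz matrix whose column j holds the kernel shifted
# down by j samples; multiplying by it performs a full 1-D convolution.
first_col = np.concatenate([ker, np.zeros(len(signal) - 1)])
first_row = np.concatenate([[ker[0]], np.zeros(len(signal) - 1)])
T = toeplitz(first_col, first_row)

conv_matmul = T @ signal
assert np.allclose(conv_matmul, np.convolve(signal, ker, mode='full'))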

I am training a keras neural net. I would like to have a custom loss function given by y_true*y_pred. Is this allowed?

This is a snippet of my model:
W1 = create_base_network(latent_dim)
input_a = Input(shape=(1,latent_dim))
input_b = Input(shape=(1,latent_dim))
x_a = encoder(input_a)
x_b = encoder(input_b)
processed_a = W1(x_a)
processed_b = W1(x_b)
del1 = Lambda(Delta1, output_shape=Delta1_output_shape)([processed_a, processed_b])
model = Model(input=[input_a, input_b], output=del1)
# train
rms = RMSprop()
model.compile(loss='kappa_delta_loss', optimizer=rms)
Basically, the neural net is getting a (pre-trained) encoder representation of the two inputs and computing the difference in prediction values for the two inputs by passing them through an MLP. This difference is Delta1, which is the y_pred of the network. I want the loss function to be y_pred*y_true. However, when I do that, I get the error 'Invalid objective: kappa_delta_loss'.
What am I doing wrong?
You almost answered the question yourself. Create your objective function like the ones in https://github.com/fchollet/keras/blob/master/keras/objectives.py, like this:
import theano
import theano.tensor as T

epsilon = 1.0e-9

def custom_objective(y_true, y_pred):
    '''Just another crossentropy'''
    y_pred = T.clip(y_pred, epsilon, 1.0 - epsilon)
    y_pred /= y_pred.sum(axis=-1, keepdims=True)
    cce = T.nnet.categorical_crossentropy(y_pred, y_true)
    return cce
and pass it to the compile argument:
model.compile(loss=custom_objective, optimizer='adadelta')
from https://github.com/fchollet/keras/issues/369
So you should create your custom loss function with two arguments, the first being the target and the second your prediction.
Assuming your output (y_pred) is a scalar, your custom objective could be
def custom_objective(y_true, y_pred):
    return K.dot(y_true, y_pred)
where K is the Keras backend (more generic than the Theano example above).
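Putting that together with the snippet from the question, a hedged sketch of how such a loss could be wired into compile (using K.mean of the element-wise product so it also works when the output is not strictly scalar; kappa_delta_loss is simply the name chosen here, and model and rms are the objects from the question):

from keras import backend as K

def kappa_delta_loss(y_true, y_pred):
    # Element-wise product averaged over the batch; for a scalar output per
    # sample this reduces to y_true*y_pred.
    return K.mean(y_true * y_pred)

# Pass the function object, not a string, to compile:
model.compile(loss=kappa_delta_loss, optimizer=rms)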

Fitting transfer function models in Scipy.signal

I am using curve_fit to fit the step response of a first-order dynamic system in order to estimate the gain and time constant. I use two approaches. The first approach is to fit the curve generated from the function in the time domain.
from numpy import exp, linspace
from scipy.optimize import curve_fit

# define the first order dynamics in the time domain
def model(t, gain, tau):
    return (gain*(1-exp(-t/tau)))

# define the time intervals
time_interval = linspace(1,100,100)

# generate the output using the model with gain = 10 and tau = 4
output = model(time_interval, 10, 4)

# fit to output and estimate parameters - gain and tau
par = curve_fit(model, time_interval, output)
Now checking par reveals an array of 10 and 4 which is perfect.
The second approach is to estimate the gain and time constant by fitting to the step response of an LTI system.
The LTI system is defined as a transfer function with numerator and denominator.
from scipy.signal import lti

# define function as a step response of an LTI system.
# The argument x has no significance here;
# I have included it because curve_fit requires passing "x" data to the function
def model1(x, gain1, tau1):
    return lti(gain1, [tau1,1]).step()[1]

# generate output using the above model
output1 = model1(0, 10, 4)
par1 = curve_fit(model1, 1, output1)
Now checking par1 reveals an array of [1.00024827, 0.01071004], which is wrong. What is wrong with my second approach here? Is there a more efficient way of estimating the transfer function coefficients from the data with curve_fit?
Thank you
The first three arguments to curve_fit are the function to be fit,
the xdata and the ydata. You have passed xdata=1. Instead you should
give it the time values associated with output1.
One way to do that is to actually use the first argument in the function
model1, like you did in model(). For example:
import numpy as np
from scipy.signal import lti
from scipy.optimize import curve_fit

def model1(x, gain1, tau1):
    y = lti(gain1, [tau1, 1]).step(T=x)[1]
    return y

time_interval = np.linspace(1,100,100)
output1 = model1(time_interval, 10, 4)
par1 = curve_fit(model1, time_interval, output1)
I get [10., 4.] for the parameters, as expected.
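As a small usage note continuing from the snippet above: curve_fit returns both the optimal parameters and their covariance, so the fitted gain and time constant can be unpacked like this:

popt, pcov = curve_fit(model1, time_interval, output1)
gain_fit, tau_fit = popt
print(gain_fit, tau_fit)        # approximately 10.0 and 4.0 for this synthetic data
print(np.sqrt(np.diag(pcov)))   # 1-sigma uncertainties of the fitted parameters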