How to define a loss function in pytorch with dependency to partial derivatives of the model w.r.t input? - neural-network

After reading about how to solve an ODE with neural networks, following the paper Neural Ordinary Differential Equations and the blog post that uses the JAX library, I tried to do the same thing with "plain" PyTorch but found one point rather obscure: how to properly use the partial derivative of a function (in this case the model) w.r.t. one of its inputs.
To summarize the problem at hand, as shown in the blog post, the goal is to solve the ODE y' = -2*x*y with the condition y(x=0) = 1 in the domain -2 <= x <= 2. Instead of using finite differences, the solution is replaced by a NN as y(x) = NN(x) with a single hidden layer of 10 nodes.
I managed to (more or less) replicate the blog with the following code:
import torch
import torch.nn as nn
from torch import optim
import matplotlib.pyplot as plt
import numpy as np
# Define the NN model to solve the problem
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1,10)
        self.lin2 = nn.Linear(10,1)
    def forward(self, x):
        x = torch.sigmoid(self.lin1(x))
        x = torch.sigmoid(self.lin2(x))
        return x
model = Model()
# Define loss_function from the Ordinary differential equation to solve
def ODE(x,y):
    dydx, = torch.autograd.grad(y, x,
                                grad_outputs=y.data.new(y.shape).fill_(1),
                                create_graph=True, retain_graph=True)
    eq = dydx + 2.* x * y # y' = - 2x*y
    ic = model(torch.tensor([0.])) - 1. # y(x=0) = 1
    return torch.mean(eq**2) + ic**2
loss_func = ODE
# Define the optimization
# opt = optim.SGD(model.parameters(), lr=0.1, momentum=0.99,nesterov=True) # Equivalent to blog
opt = optim.Adam(model.parameters(),lr=0.1,amsgrad=True) # Got faster convergence with Adam using amsgrad
# Define reference grid
x_data = torch.linspace(-2.0,2.0,401,requires_grad=True)
x_data = x_data.view(401,1) # reshaping the tensor
# Iterative learning
epochs = 1000
for epoch in range(epochs):
    opt.zero_grad()
    y_trial = model(x_data)
    loss = loss_func(x_data, y_trial)
    loss.backward()
    opt.step()
    if epoch % 100 == 0:
        print('epoch {}, loss {}'.format(epoch, loss.item()))
# Plot Results (y_trial holds the model output from the last epoch)
plt.plot(x_data.data.numpy(), np.exp(-x_data.data.numpy()**2), label='exact')
plt.plot(x_data.data.numpy(), y_trial.data.numpy(), label='approx')
plt.legend()
plt.show()
From here I manage to get the results as shown in the fig.
The problem is that, in the definition of the ODE functional, instead of passing (x,y) I would prefer to pass something like (x,fun) (where fun is my model), so that the partial derivative and specific evaluations of the model can be done with a call. So, something like
def ODE(x,fun):
    dydx, = "grad of fun w.r.t x as a function"
    eq = dydx(x) + 2.* x * fun(x) # y' = - 2x*y
    ic = fun( torch.tensor([0.]) ) - 1. # y(x=0) = 1
    return torch.mean(eq**2) + ic**2
Any ideas? Thanks in advance
EDIT:
After some trials I found a way to pass the model as an input but found another strange behavior... The new problem is to solve the ODE y'' = -2 with the BC y(x=-2) = -1 and y(x=2) = 1, for which the analytical solution is y(x) = -x^2+x/2+4
Let's modify the previous code a bit:
import torch
import torch.nn as nn
from torch import optim
import matplotlib.pyplot as plt
import numpy as np
# Define the NN model to solve the equation
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1,10)
        self.lin2 = nn.Linear(10,1)
    def forward(self, x):
        y = torch.sigmoid(self.lin1(x))
        z = torch.sigmoid(self.lin2(y))
        return z
model = Model()
# Define loss_function from the Ordinary differential equation to solve
def ODE(x,fun):
    y = fun(x)
    dydx = torch.autograd.grad(y, x,
                               grad_outputs=y.data.new(y.shape).fill_(1),
                               create_graph=True, retain_graph=True)[0]
    d2ydx2 = torch.autograd.grad(dydx, x,
                                 grad_outputs=dydx.data.new(dydx.shape).fill_(1),
                                 create_graph=True, retain_graph=True)[0]
    eq = d2ydx2 + torch.tensor([ 2.]) # y'' = - 2
    bc1 = fun(torch.tensor([-2.])) - torch.tensor([-1.]) # y(x=-2) = -1
    bc2 = fun(torch.tensor([ 2.])) - torch.tensor([ 1.]) # y(x= 2) = 1
    return torch.mean(eq**2) + bc1**2 + bc2**2
loss_func = ODE
So, here I passed the model as argument and managed to derive twice... so far so good. BUT, using the sigmoid function for this case is not only not necessary but also gives a result that is far from the analytical one.
If I change the NN for:
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(1,1)
        self.lin2 = nn.Linear(1,1)
    def forward(self, x):
        y = self.lin1(x)
        z = self.lin2(y)
        return z
In which case I would expect to optimize a double pass through two linear functions that would retrieve a 2nd order function ... I get the error:
RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.
Adding the option to the definition of dydx doesn't solve the problem, and adding it to d2ydx2 gives a NoneType definition.
Is there something wrong with the layers as they are?

Quick Solution:
Add allow_unused=True to the torch.autograd.grad calls. So, change
dydx = torch.autograd.grad(
    y, x,
    grad_outputs=y.data.new(y.shape).fill_(1),
    create_graph=True, retain_graph=True)[0]
d2ydx2 = torch.autograd.grad(dydx, x, grad_outputs=dydx.data.new(
    dydx.shape).fill_(1), create_graph=True, retain_graph=True)[0]
to
dydx = torch.autograd.grad(
    y, x,
    grad_outputs=y.data.new(y.shape).fill_(1),
    create_graph=True, retain_graph=True, allow_unused=True)[0]
d2ydx2 = torch.autograd.grad(dydx, x, grad_outputs=dydx.data.new(
    dydx.shape).fill_(1), create_graph=True, retain_graph=True, allow_unused=True)[0]
More explanation:
See what allow_unused does:
allow_unused (bool, optional): If ``False``, specifying inputs that were not
used when computing outputs (and therefore their grad is always zero)
is an error. Defaults to ``False``.
So, if you try to differentiate w.r.t. a variable that was not used to compute the value, it raises an error. Also, note that the error only occurs when the model consists only of linear layers.
This is because with linear layers you have y = W2*(W1*x + b1) + b2 = W*x + b, so dy/dx is not a function of x; it is simply W. So when you try to differentiate dy/dx w.r.t. x, it throws an error. The error goes away as soon as you use sigmoid, because then dy/dx is a function of x. To avoid the error, either make sure dy/dx is a function of x or use allow_unused=True. Note that with allow_unused=True the gradient returned for the unused input is None, so d2ydx2 has to be replaced by zeros explicitly before it is used in the loss.
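A minimal sketch of how the two points above could be combined: the model is passed as a callable, and a None gradient is treated as zero. The helper names derivative and ode_loss are hypothetical (they are not PyTorch functions), and the loss is the y'' = -2 problem from the question.

```python
import torch
import torch.nn as nn

def derivative(fun, x, order=1):
    """Return d^order fun / dx^order evaluated at x (hypothetical helper)."""
    y = fun(x)
    for _ in range(order):
        if not y.requires_grad:
            # y no longer depends on anything differentiable: derivative is 0
            return torch.zeros_like(x)
        (grad,) = torch.autograd.grad(
            y, x,
            grad_outputs=torch.ones_like(y),
            create_graph=True,   # keep the graph so we can differentiate again
            allow_unused=True,   # return None instead of raising if x is unused
        )
        # None means "y does not depend on x", i.e. the derivative is zero
        y = torch.zeros_like(x) if grad is None else grad
    return y

def ode_loss(x, fun):
    """Residual of y'' = -2 with y(-2) = -1 and y(2) = 1 (hypothetical helper)."""
    eq = derivative(fun, x, order=2) + 2.0
    bc1 = fun(torch.tensor([[-2.0]])) + 1.0
    bc2 = fun(torch.tensor([[2.0]])) - 1.0
    return torch.mean(eq ** 2) + (bc1 ** 2).sum() + (bc2 ** 2).sum()

x = torch.linspace(-2.0, 2.0, 41).view(-1, 1).requires_grad_(True)

# Purely linear model: the second derivative is identically zero, no error
linear_model = nn.Sequential(nn.Linear(1, 1), nn.Linear(1, 1))
loss = ode_loss(x, linear_model)
loss.backward()
print(loss.item())
```

Because two stacked linear layers are still linear in x, this model cannot represent the quadratic solution; the sketch only shows that the loss can be evaluated and back-propagated without the RuntimeError.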

Related

Interpolation with radial basis function in julia

I have found a few radial basis function packages in Julia, such as BasisExpansionFunction, Surrogates.jl, and ScatteredInterpolation.
However, I am unable to replicate the results from Python's scipy.interpolate.Rbf() function.
Python Example
from scipy.interpolate import Rbf
import numpy as np
xs = np.arange(10)
ys = xs**2 + np.sin(xs) + 1
interp_func = Rbf(xs, ys) # By default Rbf uses the multiquadric function
newarr = interp_func(np.arange(2.1, 3, 0.1))
print(newarr)
What is the correct approach to replicate the above example in Julia?
The first tutorial in Surrogates.jl shows how to make and interpolate a radial basis function.
using Surrogates
using LinearAlgebra
f = x -> x[1]*x[2]
lb = [1.0,2.0]
ub = [10.0,8.5]
x = sample(50,lb,ub,SobolSample())
y = f.(x)
my_radial_basis = RadialBasis(x,y,lb,ub)
#I want an approximation at (1.0,1.4)
approx = my_radial_basis((1.0,1.4))
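When comparing a Julia port against the Python numbers, it may help to spell out what a 1D multiquadric interpolant computes. The following is a NumPy-only sketch, not SciPy's implementation: the names multiquadric and fit_rbf are hypothetical, and epsilon is passed explicitly here (SciPy chooses its own default from the data, so the values will only match if the same epsilon is used on both sides).

```python
import numpy as np

def multiquadric(r, epsilon):
    # Multiquadric basis function: sqrt((r / epsilon)^2 + 1)
    return np.sqrt((r / epsilon) ** 2 + 1.0)

def fit_rbf(xs, ys, epsilon):
    """Solve for weights so that the interpolant passes through (xs, ys)."""
    r = np.abs(xs[:, None] - xs[None, :])            # pairwise node distances
    weights = np.linalg.solve(multiquadric(r, epsilon), ys)

    def interpolant(x_new):
        x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
        r_new = np.abs(x_new[:, None] - xs[None, :])  # distances to the nodes
        return multiquadric(r_new, epsilon) @ weights

    return interpolant

xs = np.arange(10, dtype=float)
ys = xs**2 + np.sin(xs) + 1
interp_func = fit_rbf(xs, ys, epsilon=1.0)   # epsilon=1.0 is an arbitrary choice
newarr = interp_func(np.arange(2.1, 3, 0.1))
print(newarr)
```

Any Julia package that exposes the basis function and its shape parameter can be checked against this by using the same basis and the same epsilon.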

gaussian process regression in multiple dimensions with GPflow

I would like to perform some multivariate regression using Gaussian process regression as implemented in GPflow version 2.
Installed with pip install gpflow==2.0.0rc1
Below is some example code that generates some 2D data, then attempts to fit it using GPR, and finally computes the difference
between the true input data and the GPR prediction.
Eventually I would like to extend to higher dimensions
and do tests against a validation set to check for over-fitting
and experiment with other kernels and "Automatic Relevance Determination"
but understanding how to get this to work is the first step.
Thanks!
The following code snippet will work in a jupyter notebook.
import gpflow
import numpy as np
import matplotlib
from gpflow.utilities import print_summary
%matplotlib inline
matplotlib.rcParams['figure.figsize'] = (12, 6)
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
def gen_data(X, Y):
    """
    make some fake data.
    X, Y are np.ndarrays with shape (N,) where
    N is the number of samples.
    """
    ys = []
    for x0, x1 in zip(X,Y):
        y = x0 * np.sin(x0*10)
        y = x1 * np.sin(x0*10)
        y += 1
        ys.append(y)
    return np.array(ys)
# generate some fake data
x = np.linspace(0, 1, 20)
X, Y = np.meshgrid(x, x)
X = X.ravel()
Y = Y.ravel()
z = gen_data(X, Y)
#note X.shape, Y.shape and z.shape
#are all (400,) for this case.
# if you would like to plot the data you can do the following
fig = plt.figure()
ax = Axes3D(fig)
ax.scatter(X, Y, z, s=100, c='k')
# had to set this
# to avoid the following error
# tensorflow.python.framework.errors_impl.InvalidArgumentError: Cholesky decomposition was not successful. The input might not be valid. [Op:Cholesky]
gpflow.config.set_default_positive_minimum(1e-7)
# setup the kernel
k = gpflow.kernels.Matern52()
# set up GPR model
# I think the shape of the independent data
# should be (400, 2) for this case
XY = np.column_stack([[X, Y]]).T
print(XY.shape) # this will be (400, 2)
m = gpflow.models.GPR(data=(XY, z), kernel=k, mean_function=None)
# optimise hyper-parameters
opt = gpflow.optimizers.Scipy()
def objective_closure():
    return - m.log_marginal_likelihood()
opt_logs = opt.minimize(objective_closure,
                        m.trainable_variables,
                        options=dict(maxiter=100)
                        )
# predict training set
mean, var = m.predict_f(XY)
print(mean.numpy().shape)
# (400, 400)
# I would expect this to be (400,)
# If it was then I could compute the difference
# between the true data and the GPR prediction
# `diff = mean - z`
# but because the shape is not as expected this of course
# won't work.
The shape of z must be (N, 1), whereas in your case it is (N,). However, this is a missing check in GPflow and not your fault.
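A small NumPy-only sketch of the reshape, using a random stand-in array in place of the real GP prediction (no GPflow call is made here). It also shows why subtracting a (N,) array from a (N, 1) array silently broadcasts to (N, N).

```python
import numpy as np

N = 400
z = np.random.rand(N)            # targets with shape (N,), as in the question
mean = np.random.rand(N, 1)      # stand-in for a prediction of shape (N, 1)

z_col = z.reshape(-1, 1)         # shape (N, 1): the layout the model expects

bad_diff = mean - z              # (N, 1) - (N,) broadcasts to (N, N)
good_diff = mean - z_col         # (N, 1) - (N, 1) stays (N, 1)

print(z.shape, z_col.shape, bad_diff.shape, good_diff.shape)
```

In the question's code this would mean passing z.reshape(-1, 1) in the data tuple instead of z.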

GP + Tensorflow training

I am trying to train a GPR model and a tensorflow model together. The training part has no issue. But for prediction using the trained model I receive a type error in a tf.placeholder op.
pred, uncp=sess.run([my, yv], feed_dict={X:xtr})
The code is similar to the 2nd example from https://gpflow.readthedocs.io/en/master/notebooks/advanced_usage.html
import numpy as np
import tensorflow as tf
import gpflow
float_type = gpflow.settings.float_type
gpflow.reset_default_graph_and_session()
def cnn_fn(x, output_dim):
    out= tf.layers.dense(inputs=x, units=output_dim, activation=tf.nn.relu)
    print(out)
    return out
N = 150
xtr = np.random.rand(N,1)
ytr = np.sin(12*xtr) + 0.66*np.cos(25*xtr) + np.random.randn(N,1)*0.1 + 3
xtr = np.random.rand(N,28)
print(xtr.shape, ytr.shape)
nepoch=50
gp_dim=xtr.shape[1]
print(gp_dim)
minibatch_size = 16
X = tf.placeholder(tf.float32, [None, gp_dim])
Y = tf.placeholder(tf.float32, [None, 1])
with tf.variable_scope('cnn'):
    f_X = tf.cast(cnn_fn(X, gp_dim), dtype=float_type)
k = gpflow.kernels.Matern52(gp_dim)
gp_model = gpflow.models.GPR(f_X, tf.cast(Y, dtype=float_type), k)
loss = -gp_model.likelihood_tensor
m, v = gp_model._build_predict(f_X)
my, yv = gp_model.likelihood.predict_mean_and_var(m, v)
with tf.variable_scope('adam'):
    opt_step = tf.train.AdamOptimizer(0.001).minimize(loss)
tf_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='adam')
tf_vars += tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='cnn')
## initialize
sess = tf.Session()
sess.run(tf.variables_initializer(var_list=tf_vars))
gp_model.initialize(session=sess)
for i in range(nepoch):
    shind=np.array(range(len(xtr)))
    np.random.shuffle(shind)
    for j in range(int(len(xtr)/minibatch_size)):
        ind=shind[j*minibatch_size: (j+1)*minibatch_size]
        sess.run(opt_step, feed_dict={X:xtr[ind], Y:ytr[ind]})
Executing the code above runs fine. But adding the following line gives an error:
pred, uncp=sess.run([my, yv], feed_dict={X:xtr})
with the following error:
<ipython-input-25-269715087df2> in <module>
----> 1 pred, uncp=sess.run([my, yv], feed_dict={X:xtr})
[...]
InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder_1' with dtype float and shape [?,1]
[[node Placeholder_1 (defined at <ipython-input-24-39ccf45cd248>:2) = Placeholder[dtype=DT_FLOAT, shape=[?,1], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
The reason your code fails is because you are not actually feeding in a value to one of the placeholders. This is easier to spot if you actually give them names:
X = tf.placeholder(tf.float32, [None, gp_dim], name='X')
Y = tf.placeholder(tf.float32, [None, 1], name='Y')
Tensorflow requires the entire compute graph to be well-defined, and the GPR model you are using depends on both X and Y. If you run the following line, it works fine:
pred, uncp = sess.run([my, yv], feed_dict={X: xtr, Y: ytr})
Update: as user1018464 pointed out in the comment, you are using the GPR model, in which the predictions directly depend on the training data (e.g. see equations (2.22) and (2.23) on page 16 of http://www.gaussianprocess.org/gpml/chapters/RW2.pdf). Hence you will need to pass in both xtr and ytr to make predictions.
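To make that dependence concrete, here is a NumPy-only sketch of the standard GP regression predictive mean, K(X*, X) (K(X, X) + noise * I)^-1 y. It is not GPflow code: the function names are hypothetical, and a squared-exponential kernel is assumed instead of the Matern52 kernel from the question, purely to keep the sketch short.

```python
import numpy as np

def se_kernel(a, b, lengthscale):
    # Squared-exponential kernel between the rows of a and b (an assumption;
    # the question uses Matern52)
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq_dist / lengthscale ** 2)

def gpr_predict_mean(x_train, y_train, x_test, noise=1e-2, lengthscale=0.1):
    """Predictive mean of GP regression; needs BOTH x_train and y_train."""
    K = se_kernel(x_train, x_train, lengthscale) + noise * np.eye(len(x_train))
    K_star = se_kernel(x_test, x_train, lengthscale)
    return K_star @ np.linalg.solve(K, y_train)

x_train = np.linspace(0.0, 1.0, 8)[:, None]       # shape (8, 1)
y_train = np.sin(2 * np.pi * x_train)             # shape (8, 1)
x_test = np.array([[0.25], [0.5]])

print(gpr_predict_mean(x_train, y_train, x_test))
```

Both x_train and y_train appear on the right-hand side, which is why the graph built from the GPR model needs values for both the X and Y placeholders even at prediction time.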
Other models such as SVGP represent the function through "inducing features", commonly "pseudo input/output" pairs that summarise the data, in which case you won't need to feed in the original input values at all (I got it wrong when I first answered).
You could set up the model as follows:
gp_model = gpflow.models.SVGP(f_X, tf.cast(Y, dtype=float_type), k, gpflow.likelihoods.Gaussian(), xtr.copy(), num_data=N)
Then pred, uncp=sess.run([my, yv], feed_dict={X:xtr}) works as expected.
If you want to predict at different points Xtest, you need to set up a separate placeholder, and reuse the cnn (note the reuse=True in the variable_scope with the same name), as in example 2 of the notebook:
Xtest = tf.placeholder(tf.float32, [None, gp_dim], name='Xtest')
with tf.variable_scope('cnn', reuse=True):
    f_Xtest = tf.cast(cnn_fn(Xtest, gp_dim), dtype=float_type)
Set up the model as before using f_X, but use f_Xtest in the call to _build_predict:
m, v = gp_model._build_predict(f_Xtest)
my, yv = gp_model.likelihood.predict_mean_and_var(m, v)
Now you need to pass in both X, Y, and Xtest into the session's run():
pred, uncp = sess.run([my, yv], feed_dict={X: xtr, Y: ytr, Xtest: xtest})
where xtest is the numpy array with points at which you want to predict.
GPflow manages TensorFlow sessions for you, and you don't need to create your own TF session when you use GPflow alone. In your case, tf.layers.dense creates
new variables, which should be initialized, and I would advise using the session that was created by GPflow. Essentially, you need to replace these lines
## initialize
sess = tf.Session()
sess.run(tf.variables_initializer(var_list=tf_vars))
gp_model.initialize(session=sess)
with:
sess = gpflow.get_default_session()
sess.run(tf.variables_initializer(var_list=tf_vars))
or wrap your code with default session context:
with tf.Session() as session:
    ... build and run

Optimizing a piecewise linear regression

I have written a function that, given parameters, can apply a piecewise linear fit, with arbitrarily many piecewise sections, to some data.
I am trying to fit the function to my data using scipy.optimize.curve_fit, but I am receiving an "OptimizeWarning: Covariance of the parameters could not be estimated" warning. I believe this may be because of the nested lambda functions I am using to define the piecewise sections.
Is there an easy way to tweak my code to get round this, or a different scipy optimisation function that might be more suitable?
import numpy as np
import scipy.optimize
#The piecewise function
def piecewise_linear(x, *params):
    N=len(params)/2
    if N.is_integer():N=int(N)
    else:raise(ValueError())
    c=params[0]
    xbounds=params[1:N]
    grads=params[N:]
    #First we define our conditions, which are true if x is a member of a given
    #bin.
    conditions=[]
    #first and last bins are a special case:
    cond0=lambda x: x<xbounds[0]
    condl=lambda x: x>=xbounds[-1]
    conditions.append(cond0(x))
    for i in range(len(xbounds)-1):
        cond=lambda x : (x >= xbounds[i]) & (x < xbounds[i+1])
        conditions.append(cond(x))
    conditions.append(condl(x))
    #Next we define our linear regression function for each bin. The offset
    #for each bin depends on where the previous bin ends, so we define
    #the regression functions recursively:
    functions=[]
    func0 = lambda x: grads[0]*x +c
    functions.append(func0)
    for i in range(len(grads)-1):
        func = (lambda j: lambda x: grads[j+1]*(x-xbounds[j])\
                +functions[j](xbounds[j]))(i)
        functions.append(func)
    return np.piecewise(x,conditions,functions)
#Some data
x=np.arange(100)
y=np.array([*np.arange(0,19,1),*np.arange(20,59,2),\
            *np.arange(60,20,-1),*np.arange(21,42,1)]) + np.random.randn(100)
#A first guess of parameters
cguess=0
boundguess=[20,30,50]
gradguess=[1,1,1,1]
p0=[cguess,*boundguess,*gradguess]
fit=scipy.optimize.curve_fit(piecewise_linear,x,y,p0=p0)
Here is example code that fits two straight lines to a curved data set with a breakpoint, where the line parameters and breakpoint are all fitted. This example uses scipy's Differential Evolution genetic algorithm to determine initial parameter estimates for the regression. That module uses the Latin Hypercube algorithm to ensure a thorough search of parameter space, which requires bounds within which to search. In this example those search bounds are derived from the data itself. Note that it is much easier to find ranges for the initial parameter estimates than to give specific values.
import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings
xData = numpy.array([19.1647, 18.0189, 16.9550, 15.7683, 14.7044, 13.6269, 12.6040, 11.4309, 10.2987, 9.23465, 8.18440, 7.89789, 7.62498, 7.36571, 7.01106, 6.71094, 6.46548, 6.27436, 6.16543, 6.05569, 5.91904, 5.78247, 5.53661, 4.85425, 4.29468, 3.74888, 3.16206, 2.58882, 1.93371, 1.52426, 1.14211, 0.719035, 0.377708, 0.0226971, -0.223181, -0.537231, -0.878491, -1.27484, -1.45266, -1.57583, -1.61717])
yData = numpy.array([0.644557, 0.641059, 0.637555, 0.634059, 0.634135, 0.631825, 0.631899, 0.627209, 0.622516, 0.617818, 0.616103, 0.613736, 0.610175, 0.606613, 0.605445, 0.603676, 0.604887, 0.600127, 0.604909, 0.588207, 0.581056, 0.576292, 0.566761, 0.555472, 0.545367, 0.538842, 0.529336, 0.518635, 0.506747, 0.499018, 0.491885, 0.484754, 0.475230, 0.464514, 0.454387, 0.444861, 0.437128, 0.415076, 0.401363, 0.390034, 0.378698])
def func(xArray, breakpoint, slopeA, offsetA, slopeB, offsetB):
    returnArray = []
    for x in xArray:
        if x < breakpoint:
            returnArray.append(slopeA * x + offsetA)
        else:
            returnArray.append(slopeB * x + offsetB)
    return returnArray
# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
    warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
    val = func(xData, *parameterTuple)
    return numpy.sum((yData - val) ** 2.0)
def generate_Initial_Parameters():
    # min and max used for bounds
    maxX = max(xData)
    minX = min(xData)
    maxY = max(yData)
    minY = min(yData)
    slope = 10.0 * (maxY - minY) / (maxX - minX) # times 10 for safety margin
    parameterBounds = []
    parameterBounds.append([minX, maxX]) # search bounds for breakpoint
    parameterBounds.append([-slope, slope]) # search bounds for slopeA
    parameterBounds.append([minY, maxY]) # search bounds for offsetA
    parameterBounds.append([-slope, slope]) # search bounds for slopeB
    parameterBounds.append([minY, maxY]) # search bounds for offsetB
    result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
    return result.x
# by default, differential_evolution finishes by polishing the best solution with a local minimizer (L-BFGS-B) within the parameter bounds
geneticParameters = generate_Initial_Parameters()
# call curve_fit without passing bounds from genetic algorithm
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Parameters:', fittedParameters)
print()
modelPredictions = func(xData, *fittedParameters)
absError = modelPredictions - yData
SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)
print()
##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)
    # first the raw data as a scatter plot
    axes.plot(xData, yData, 'D')
    # create data for the fitted equation plot
    xModel = numpy.linspace(min(xData), max(xData))
    yModel = func(xModel, *fittedParameters)
    # now the model as a line plot
    axes.plot(xModel, yModel)
    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label
    plt.show()
    plt.close('all') # clean up after using pyplot
graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
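For the many-segment model in the question, the nested lambdas can be avoided altogether. The following is a sketch, not a drop-in guarantee: it keeps the question's parameter layout (intercept c, then the breakpoints, then one slope per segment), assumes the breakpoints are sorted in increasing order, and uses np.searchsorted to pick the segment for each x. Whether it removes the OptimizeWarning depends on the data and the starting values.

```python
import numpy as np

def piecewise_linear(x, *params):
    """Continuous piecewise-linear model.

    params = [c, b_1, ..., b_{n-1}, g_0, ..., g_{n-1}] for n segments, with
    breakpoints b_k assumed sorted in increasing order.
    """
    if len(params) < 2 or len(params) % 2 != 0:
        raise ValueError("expected 2*n parameters for n segments")
    n = len(params) // 2
    c = params[0]
    breakpoints = np.asarray(params[1:n], dtype=float)   # n - 1 values
    slopes = np.asarray(params[n:], dtype=float)         # n values
    x = np.asarray(x, dtype=float)

    # Left end of each segment; the first segment is anchored at x = 0, y = c
    anchors_x = np.concatenate(([0.0], breakpoints))
    # Value of the model at each anchor, accumulated segment by segment
    anchors_y = c + np.concatenate(
        ([0.0], np.cumsum(slopes[:-1] * np.diff(anchors_x))))

    # Segment index for every x: number of breakpoints that are <= x
    segment = np.searchsorted(breakpoints, x, side="right")
    return anchors_y[segment] + slopes[segment] * (x - anchors_x[segment])

# Intercept 0, breakpoints at 20, 30, 50 and slopes 1, 2, -1, 1
print(piecewise_linear(np.array([10, 25, 40, 60]), 0, 20, 30, 50, 1, 2, -1, 1))
```

Since the signature is still (x, *params), it can be passed to scipy.optimize.curve_fit with the same p0 list as in the question.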

Merging two tensors by convolution in Keras

I'm trying to convolve two 1D tensors in Keras.
I get two inputs from other models:
x - of length 100
ker - of length 5
I would like to get the 1D convolution of x using the kernel ker.
I wrote a Lambda layer to do it:
import tensorflow as tf
def convolve1d(x):
    y = tf.nn.conv1d(value=x[0], filters=x[1], padding='VALID', stride=1)
    return y
x = Input(shape=(100,))
ker = Input(shape=(5,))
y = Lambda(convolve1d)([x,ker])
model = Model([x,ker], [y])
I get the following error:
ValueError: Shape must be rank 4 but is rank 3 for 'lambda_67/conv1d/Conv2D' (op: 'Conv2D') with input shapes: [?,1,100], [1,?,5].
Can anyone help me understand how to fix it?
It was much harder than I expected because Keras and TensorFlow don't expect any batch dimension in the convolution kernel, so I had to write the loop over the batch dimension myself, which requires specifying batch_shape instead of just shape in the Input layer. Here it is:
import numpy as np
import tensorflow as tf
import keras
from keras import backend as K
from keras import Input, Model
from keras.layers import Lambda
def convolve1d(x):
    input, kernel = x
    batch_size = K.int_shape(input)[0] # static batch size, known because batch_shape is given
    output_list = []
    if K.image_data_format() == 'channels_last':
        kernel = K.expand_dims(kernel, axis=-2)
    else:
        kernel = K.expand_dims(kernel, axis=0)
    for i in range(batch_size): # Loop over batch dimension
        output_temp = tf.nn.conv1d(value=input[i:i+1, :, :],
                                   filters=kernel[i, :, :],
                                   padding='VALID',
                                   stride=1)
        output_list.append(output_temp)
    print(K.int_shape(output_temp))
    return K.concatenate(output_list, axis=0)
batch_input_shape = (1, 100, 1)
batch_kernel_shape = (1, 5, 1)
x = Input(batch_shape=batch_input_shape)
ker = Input(batch_shape=batch_kernel_shape)
y = Lambda(convolve1d)([x,ker])
model = Model([x, ker], [y])
a = np.ones(batch_input_shape)
b = np.ones(batch_kernel_shape)
c = model.predict([a, b])
In the current state :
It doesn't work for inputs (x) with multiple channels.
If you provide several filters, you get as many outputs, each being the convolution of the input with the corresponding kernel.
From the given code it is difficult to tell what you mean when you say
is it possible
But if what you mean is to merge two layers and feed the merged layer to a convolution, yes, it is possible.
x = Input(shape=(100,))
ker = Input(shape=(5,))
merged = keras.layers.concatenate([x,ker], axis=-1)
y = K.conv1d(merged, 'same')
model = Model([x,ker], y)
EDIT:
@user2179331 thanks for clarifying your intention. You are using the Lambda class incorrectly, which is why the error message is showing.
But what you are trying to do can be achieved using the keras.backend functions.
Note, though, that when using lower-level functions you lose some higher-level abstraction. E.g. when using keras.backend.conv1d you need an input of shape (BATCH_SIZE, width, channels) and a kernel of shape (kernel_size, input_channels, output_channels). So in your case let us assume that x has 1 channel (input channels == 1) and y also has the same number of channels (output channels == 1).
So your code can now be refactored as follows
from keras import backend as K
def convolve1d(x,kernel):
    y = K.conv1d(x,kernel, padding='valid', strides=1,data_format="channels_last")
    return y
input_channels = 1
output_channels = 1
kernel_width = 5
input_width = 100
ker = K.variable(K.random_uniform([kernel_width,input_channels,output_channels]),K.floatx())
x = Input(shape=(input_width,input_channels))
y = convolve1d(x,ker)
I think I have understood what you mean. Given the wrong example code below:
input_signal = Input(shape=(L), name='input_signal')
input_h = Input(shape=(N), name='input_h')
faded= Lambda(lambda x: tf.nn.conv1d(input, x))(input_h)
You want to convolve each signal vector with a different fading-coefficient vector.
The convolution operations in TensorFlow, e.g. tf.nn.conv1d, only support a single kernel that is shared across the whole batch. Therefore, the code above cannot run as you want.
I don't have a good solution either. The code you gave runs correctly; however, it is complex and inefficient. In my view, another feasible but also inefficient way is to multiply by the Toeplitz matrix whose rows are shifted copies of the fading-coefficient vector. When the signal vector is long, the matrix becomes extremely large.
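The Toeplitz idea can be sketched in plain NumPy to check the math before porting it to Keras. This is an illustration only: the function names are hypothetical, the sliding-window variant needs NumPy 1.20 or later, and the result is a cross-correlation without kernel flipping, which is the operation that tf.nn.conv1d performs with padding='VALID'.

```python
import numpy as np

def toeplitz_matrix(kernel, signal_length):
    """Matrix T such that T @ signal is the 'valid' correlation with kernel."""
    n = len(kernel)
    out_len = signal_length - n + 1
    T = np.zeros((out_len, signal_length))
    for i in range(out_len):
        T[i, i:i + n] = kernel        # each row is the kernel shifted by i
    return T

def batched_valid_correlation(signals, kernels):
    """Correlate signals[b] with kernels[b] for every b, without a Python loop."""
    n = kernels.shape[1]
    # windows has shape (batch, out_len, n): all length-n slices of each signal
    windows = np.lib.stride_tricks.sliding_window_view(signals, n, axis=1)
    return np.einsum("bwn,bn->bw", windows, kernels)

rng = np.random.default_rng(0)
signals = rng.normal(size=(3, 100))   # batch of 3 signals of length 100
kernels = rng.normal(size=(3, 5))     # one length-5 kernel per signal

out = batched_valid_correlation(signals, kernels)
print(out.shape)
```

The sliding-window form avoids building the (96, 100) matrix per sample, which is the memory problem mentioned above; the same gather-then-multiply pattern is one possible way to express a per-sample kernel in a framework that only supports a shared kernel.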