How to use the groups parameter in the PyTorch conv2d function

I am trying to compute a per-channel gradient image in PyTorch. To do this, I want to perform a standard 2D convolution with a Sobel filter on each channel of an image. I am using the torch.nn.functional.conv2d function for this.
In my minimal working example below, I get an error:
import torch
import torch.nn.functional as F
filters = torch.autograd.Variable(torch.randn(1,1,3,3))
inputs = torch.autograd.Variable(torch.randn(1,3,10,10))
out = F.conv2d(inputs, filters, padding=1)
RuntimeError: Given groups=1, weight[1, 1, 3, 3], so expected
input[1, 3, 10, 10] to have 1 channels, but got 3 channels instead
This suggests that groups needs to be 3. However, when I set groups=3, I get a different error:
import torch
import torch.nn.functional as F
filters = torch.autograd.Variable(torch.randn(1,1,3,3))
inputs = torch.autograd.Variable(torch.randn(1,3,10,10))
out = F.conv2d(inputs, filters, padding=1, groups=3)
RuntimeError: invalid argument 4: out of range at
/usr/local/src/pytorch/torch/lib/TH/generic/THTensor.c:440
When I check that code snippet in the THTensor class, it refers to a bunch of dimension checks, but I don't know where I'm going wrong.
What does this error mean? How can I perform my intended convolution with this conv2d function? I believe I am misunderstanding the groups parameter.

If you want to apply a per-channel convolution, the number of output channels must equal the number of input channels, because each input channel produces its own output channel. With groups=3, conv2d expects a weight of shape (out_channels, in_channels / groups, kH, kW), so the filter tensor has to be (3, 1, 3, 3) rather than (1, 1, 3, 3).
In short, this will work:
import torch
import torch.nn.functional as F
filters = torch.autograd.Variable(torch.randn(3,1,3,3))
inputs = torch.autograd.Variable(torch.randn(1,3,10,10))
out = F.conv2d(inputs, filters, padding=1, groups=3)
whereas filters of size (2, 1, 3, 3) or (1, 1, 3, 3) will not work.
You can also make the number of output channels a multiple of the number of input channels. This is useful when you want several convolution filters for each input channel.
However, this only makes sense if it is an exact multiple. In the PyTorch version this answer was written for, a non-multiple fell back to the closest smaller multiple, so a filter of size (4, 1, 3, 3) or (5, 1, 3, 3) produced 3 output channels. Recent PyTorch releases instead raise an error when the number of output channels is not divisible by groups.
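The shape rule behind these cases can be sketched in plain Python. This is only an illustration of the documented weight layout (out_channels, in_channels / groups, kH, kW); `grouped_weight_shape` is a hypothetical helper, not a PyTorch function, and its divisibility checks mirror what recent PyTorch releases enforce.

```python
def grouped_weight_shape(in_channels, out_channels, groups, kernel_size):
    """Return the weight shape a grouped 2D convolution expects.

    Hypothetical helper for illustration only; it is not part of PyTorch.
    """
    if in_channels % groups != 0:
        raise ValueError("in_channels must be divisible by groups")
    if out_channels % groups != 0:
        raise ValueError("out_channels must be divisible by groups")
    k_h, k_w = kernel_size
    return (out_channels, in_channels // groups, k_h, k_w)


# Per-channel case from the answer: 3 input channels, one filter each.
print(grouped_weight_shape(3, 3, groups=3, kernel_size=(3, 3)))  # (3, 1, 3, 3)

# Two filters per input channel.
print(grouped_weight_shape(3, 6, groups=3, kernel_size=(3, 3)))  # (6, 1, 3, 3)
```

Asking for a single output channel with groups=3, as in the question, fails the divisibility check, which is why the (1, 1, 3, 3) filter is rejected.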

Why does the HMC sampler return negative values for hyperparameters that need to be positive? [older GPflow versions before 1.0]

I'd like to build a GP with marginalized hyperparameters.
I have seen that this is possible with the HMC sampler provided in GPflow, as shown in this notebook.
However, when I tried to run the following code as a first step of this (NOTE this is on gpflow 0.5, an older version), the returned samples are negative, even though the lengthscale and variance need to be positive (negative values would be meaningless).
import numpy as np
from matplotlib import pyplot as plt
import gpflow
from gpflow import hmc
X = np.linspace(-3, 3, 20)
Y = np.random.exponential(np.sin(X) ** 2)
Y = (Y - np.mean(Y)) / np.std(Y)
k = gpflow.kernels.Matern32(1, lengthscales=.2, ARD=False)
m = gpflow.gpr.GPR(X[:, None], Y[:, None], k)
m.kern.lengthscales.prior = gpflow.priors.Gamma(1., 1.)
m.kern.variance.prior = gpflow.priors.Gamma(1., 1.)
# keep the likelihood variance fixed so it is not sampled as a hyperparameter
m.likelihood.variance = 1e-6
m.likelihood.variance.fixed = True
m.optimize(maxiter=1000)
samples = m.sample(500)
print(samples)
Output:
[[-0.43764571 -0.22753325]
[-0.50418501 -0.11070128]
[-0.5932655 0.00821438]
[-0.70217714 0.05077999]
[-0.77745654 0.09362291]
[-0.79404456 0.13649446]
[-0.83989415 0.27118385]
[-0.90355789 0.29589641]
...
I don't know much about HMC sampling in detail, but I would expect the sampled posterior hyperparameters to be positive. I've checked the code and it seems this may be related to the Log1pe transform, though I failed to figure it out myself.
Any hint on this?
It would be helpful if you specified which GPflow version you are using, especially since the output you posted suggests a really old version of GPflow (pre-1.0), and this is something that has been improved since.
What is happening here (in old GPflow) is that the sample() method returns a single S x P array, where S is the number of samples and P is the number of free parameters. For example, for an M x M matrix parameter with a lower-triangular transform (such as the Cholesky factor of the covariance of the approximate posterior, q_sqrt), only M * (M + 1)/2 parameters are actually stored and optimised. These are the values in the unconstrained space, i.e. they can take any real value. Transforms (see the gpflow.transforms module) provide the mapping between this unconstrained value (anywhere between minus and plus infinity) and the constrained value (e.g. gpflow.transforms.positive for lengthscales and variances).
In old GPflow, the model provides a get_samples_df() method that takes the S x P array returned by sample() and returns a pandas DataFrame with columns for all the trainable parameters, which is what you want. Or, ideally, you would just use a recent version of GPflow, in which the HMC sampler directly returns the DataFrame!
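To see why negative raw samples are not a problem, here is a minimal sketch of the unconstrained-to-positive mapping. It assumes the positive transform behaves like a softplus, log(1 + exp(x)), as the Log1pe name mentioned in the question suggests; the actual GPflow transform may differ in details such as a small lower bound. `softplus` is a name chosen for this illustration.

```python
import math


def softplus(x):
    """Map an unconstrained real number to a positive one: log(1 + exp(x)).

    Assumed form of the positive transform; fine for moderate x,
    a numerically stabler variant would be needed for very large x.
    """
    return math.log1p(math.exp(x))


# First rows of the unconstrained samples printed in the question.
unconstrained = [(-0.43764571, -0.22753325), (-0.50418501, -0.11070128)]

constrained = [tuple(softplus(v) for v in row) for row in unconstrained]
for row in constrained:
    print(row)  # every entry is strictly positive
```

In practice get_samples_df() applies the model's own transforms, so there is no need to do this mapping by hand.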

How to use scipy signal for MIMO systems

I am looking for a way to simulate the output of a system for various input signals. To be more precise, I have a system defined by its transfer function H that takes one input and has one output. I generated several signals (stored in a numpy array). What I would like to do is get the response of the system to each input signal without using a for loop. Is there a way to do this? Below is the code I have written so far.
from __future__ import division
import numpy as np
from scipy import signal
nbr_inputs = 5
t_in = np.arange(0,10,0.2)
dim = (nbr_inputs, len(t_in))
x = np.cumsum(np.random.normal(0,2e-3, dim), axis=1)
H = signal.TransferFunction([1, 3, 3], [1, 2, 1])
t_out, y, _ = signal.lsim(H, x[0], t_in) # here, I would just like to simply write x
thanks for your help
This is not a MIMO system; it is a SISO system to which you apply multiple input signals.
You can create a MIMO system and apply your inputs all at once; they will be computed channel by channel but simultaneously. However, you can't use scipy.signal.lsim for MIMO systems yet. You can use other options such as python-control (if you have the slycot extension; otherwise, again, no MIMO) or harold if you have Python 3.6 or greater (disclaimer: I'm the author).
import numpy as np
from harold import *
import matplotlib.pyplot as plt
nbr_inputs = 5
t_in = np.arange(0,10,0.2)
dim = (nbr_inputs, len(t_in))
x = np.cumsum(np.random.normal(0,2e-3, dim), axis=1)
# Forming a 1x5 system, common denominator will be completed automatically
H = Transfer([[[1, 3, 3]]*nbr_inputs], [1, 2, 1])
The keyword per_channel=True applies the first input to the first channel, the second input to the second channel, and so on. Otherwise the combined response is returned. You can check the shapes by playing around with it to see what I mean.
# Notice it is x.T below -> input shape = <num samples>, <num inputs>
y, t = simulate_linear_system(H, x.T, t_in, per_channel=True)
plt.plot(t, y)
This gives a plot with one response curve per input signal (figure not included).

Padding with even kernel size in a convolutional layer in Keras (Theano)

I need to know how data is padded in a 1D convolutional layer using Keras with Theano as the backend. I use "same" padding.
Assume we have an output_length of 8 and a kernel_size of 4. According to the original Keras code we have padding of 8//4 == 2. However, when adding two zeros at the left and the right end of my horizontal data, I could compute 9 convolutions instead of 8.
Can somebody explain to me how the data is padded? Where are zeros added, and how do I compute the number of padding values on the right and left side of my data?
How to test the way keras pads the sequences:
A very simple test you can do is to create a model with a single convolutional layer, enforce its weights to be 1 and its biases to be 0, and give it an input with ones to see the output:
from keras.layers import *
from keras.models import Model
import numpy as np
#creating the model
inp = Input((8,1))
out = Conv1D(filters=1,kernel_size=4,padding='same')(inp)
model = Model(inp,out)
#adjusting the weights
ws = model.layers[1].get_weights()
ws[0] = np.ones(ws[0].shape) #weights
ws[1] = np.zeros(ws[1].shape) #biases
model.layers[1].set_weights(ws)
#predicting the result for a sequence with 8 elements
testData=np.ones((1,8,1))
print(model.predict(testData))
The output of this code is:
[[[ 2.] #a result 2 shows only 2 of the 4 kernel frames were activated
[ 3.] #a result 3 shows only 3 of the 4 kernel frames were activated
[ 4.] #a result 4 shows the full kernel was used
[ 4.]
[ 4.]
[ 4.]
[ 4.]
[ 3.]]]
So we can conclude that:
Keras adds the padding before performing the convolutions, not after, so the results at the edges are not zero.
Keras distributes the padding as equally as possible, and when the total number of padding values is odd, the extra zero goes at the beginning.
So, it made the input data look like this before applying the convolutions:
[0,0,1,1,1,1,1,1,1,1,0]
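The padding observed in this test can be reproduced with a short plain-Python sketch. It assumes the split seen above (kernel_size - 1 zeros in total, with the extra zero at the start); other backends or versions may split the padding differently. `pad_same_1d` and `conv1d_valid` are hypothetical helper names, not Keras functions.

```python
def pad_same_1d(seq, kernel_size):
    """Zero-pad so that a stride-1 'valid' convolution returns len(seq) outputs."""
    total = kernel_size - 1
    right = total // 2       # smaller half goes at the end
    left = total - right     # the extra zero (if any) goes at the start
    return [0] * left + list(seq) + [0] * right


def conv1d_valid(seq, kernel):
    """Slide the kernel over seq without any further padding."""
    k = len(kernel)
    return [
        sum(s * w for s, w in zip(seq[i:i + k], kernel))
        for i in range(len(seq) - k + 1)
    ]


padded = pad_same_1d([1] * 8, kernel_size=4)
print(padded)                         # [0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0]
print(conv1d_valid(padded, [1] * 4))  # [2, 3, 4, 4, 4, 4, 4, 3]
```

The second line matches the values printed by model.predict in the test above.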

In Tensorflow, What kind of neural network should I use?

I am working through the TensorFlow tutorial to learn what TF is, but I am confused about which neural network I should use in my work.
I am looking at a single-layer neural network, CNN, RNN, and LSTM RNN.
There is a sensor which measures something and reports the result as one of two boolean values. Here, they are Blue and Red, like this:
The sensor gives a result value every 5 minutes. If we pile up the values for each color, we can see some patterns:
The number inside each circle is the position of that result in the sequence of sensor values (107 was given right after 106). If you look at 122 to 138, you can see a decalcomanie-like (mirror-image) pattern.
I want to predict the next boolean value before the sensor reports it. I could do supervised learning using past results, but I'm not sure which neural network or method is suitable. Since this task needs to find patterns in past results (it has to see context) and remember them, maybe an LSTM RNN (long short-term memory recurrent neural network) would be suitable. Could you tell me which one is right?
So it sounds like you need to process a sequence of images. You could actually use a CNN and an RNN together. I did this a month ago when I was training a network to swipe left or right on Tinder using the sequence of profile pictures. What you would do is pass all of the images through a CNN and then into the RNN. Below is part of the code for my Tinder bot. See how I distribute the convolutions over the sequence and then push the result through the RNN. Finally, I put a softmax classifier on the last time step to make the prediction; in your case, however, I think you will distribute the prediction over time, since you want the next item in the sequence.
# Fragment from inside a model class; assumes `import numpy as np` and
# `import tensorflow as tf`, and uses the pre-1.0 TensorFlow API.
self.input_tensor = tf.placeholder(tf.float32, (None, self.max_seq_len, self.img_height, self.img_width, 3), 'input_tensor')
self.expected_classes = tf.placeholder(tf.int64, (None,))
self.is_training = tf.placeholder_with_default(False, None, 'is_training')
self.learning_rate = tf.placeholder(tf.float32, None, 'learning_rate')
self.tensors = {}
activation = tf.nn.elu
rnn = tf.nn.rnn_cell.LSTMCell(256)
with tf.variable_scope('series') as scope:
    state = rnn.zero_state(tf.shape(self.input_tensor)[0], tf.float32)
    # tf.unpack was renamed tf.unstack in TensorFlow 1.0
    for t, img in enumerate(reversed(tf.unpack(self.input_tensor, axis = 1))):
        # per_image_whitening was later renamed per_image_standardization
        y = tf.map_fn(tf.image.per_image_whitening, img)
        features = 48
        # three blocks of conv -> conv -> max-pool, applied to every time step
        for c_layer in range(3):
            with tf.variable_scope('pool_layer_%d' % c_layer):
                with tf.variable_scope('conv_1'):
                    filter = tf.get_variable('filter', (3, 3, y.get_shape()[-1].value, features))
                    b = tf.get_variable('b', (features,))
                    y = tf.nn.conv2d(y, filter, (1, 1, 1, 1), 'SAME') + b
                    y = activation(y)
                    self.tensors['img_%d_conv_%d' % (t, 2 * c_layer)] = y
                with tf.variable_scope('conv_2'):
                    filter = tf.get_variable('filter', (3, 3, y.get_shape()[-1].value, features))
                    b = tf.get_variable('b', (features,))
                    y = tf.nn.conv2d(y, filter, (1, 1, 1, 1), 'SAME') + b
                    y = activation(y)
                    self.tensors['img_%d_conv_%d' % (t, 2 * c_layer + 1)] = y
                y = tf.nn.max_pool(y, (1, 3, 3, 1), (1, 3, 3, 1), 'SAME')
                self.tensors['pool_%d' % c_layer] = y
                features *= 2
        print(y.get_shape())
        # flatten the CNN features and feed them to the LSTM for this time step
        with tf.variable_scope('rnn'):
            y = tf.reshape(y, (-1, np.prod(y.get_shape().as_list()[1:])))
            y, state = rnn(y, state)
            self.tensors['rnn_%d' % t] = y
        # share the CNN and RNN weights across all time steps
        scope.reuse_variables()
with tf.variable_scope('output_classifier'):
    W = tf.get_variable('W', (y.get_shape()[-1].value, 2))
    b = tf.get_variable('b', (2,))
    # tf.select was renamed tf.where in TensorFlow 1.0
    y = tf.nn.dropout(y, tf.select(self.is_training, 0.5, 1.0))
    y = tf.matmul(y, W) + b
    self.tensors['classifier'] = y
Yes, an RNN (recurrent neural network) fits the task of accumulating state along a sequence in order to predict its next element. LSTM (long short-term memory) is a particular design for the recurrent pieces of the network that has turned out to be very successful in avoiding numerical challenges from long-lasting recurrences; see colah's much-cited blog post for more. (Alternatives to the LSTM cell design exist, but I would only fine-tune that much later, possibly never.)
The TensorFlow RNN codelab explains LSTM RNNs for the case of language models, which predict the (n+1)-st word of a sentence from the preceding n words, for each n (like for each timestep in your series of measurements). Your case is simpler than language models in that you only have two words (red and blue), so if you read anything about embeddings of words, ignore it.
You also mentioned other types of neural networks. These are not aimed at accumulating state along a sequence, such as your boolean sequence of red/blue inputs. However, your second image suggests that there might be a pattern in the sequence of counts of successive red/blue values. You could try using the past k counts as input to a plain feed-forward (i.e., non-recurrent) neural network that predicts the probability of the next measurement having the same color as the current one. Maybe that works with a single layer, or maybe two or even three work better; experimentation will tell. This is a less fancy approach than an RNN, but if it works well enough, it gives you a simpler solution with fewer technicalities to worry about.
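As a rough sketch of the input preparation for that feed-forward approach, the colors can be collapsed into run lengths and then cut into windows of k past counts. The sequence below is made up, and `run_lengths` and `make_windows` are hypothetical helper names. The target used here is the next run length, which is only one possible setup; predicting whether the next reading keeps the current color would use the same input windows.

```python
from itertools import groupby


def run_lengths(colors):
    """Collapse a color sequence into the lengths of its consecutive runs."""
    return [len(list(group)) for _, group in groupby(colors)]


def make_windows(counts, k):
    """Pair each window of k past run lengths with the run length that follows."""
    return [(counts[i:i + k], counts[i + k]) for i in range(len(counts) - k)]


sequence = "RRBBBRBBRRR"  # made-up readings: R = red, B = blue
counts = run_lengths(sequence)
print(counts)  # [2, 3, 1, 2, 3]

for features, target in make_windows(counts, k=2):
    print(features, "->", target)
```

Each (features, target) pair is one training example for the plain network; k is a hyperparameter to experiment with.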
CNNs (convolutional neural networks) would not be my first choice here. These aim to discover a set of fixed-scale features at various places in the input, for example, some texture or curved edge anywhere in an image. But you only want to predict one next item that extends your input sequence. A plain neural network (see above) may discover useful patterns on the k previous values, and training it with all earlier partial sequences will help it find those patterns. The CNN approach would help to discover them during prediction at long-gone parts of the input; I have no intuition why that would help.

Passing Individual Channels of Tensors to Layers in Keras

I am trying to emulate something equivalent to a SeparableConvolution2D layer for the Theano backend (it already exists for the TensorFlow backend). As the first step, what I need to do is pass ONE channel from a tensor into the next layer. So say I have a 2D convolution layer called conv1 with 16 filters, which produces an output with shape (batch_size, 16, height, width). I need to select the subtensor with shape (:, 0, :, :) and pass it to the next layer. Simple enough, right?
This is my code:
from keras import backend as K
image_input = Input(batch_shape = (batch_size, 1, height, width ), name = 'image_input' )
conv1 = Convolution2D(16, 3, 3, name='conv1', activation = 'relu')(image_input)
conv2_input = K.reshape(conv1[:,0,:,:] , (batch_size, 1, height, width))
conv2 = Convolution2D(16, 3, 3, name='conv1', activation = 'relu')(conv2_input)
This throws:
Exception: You tried to call layer "conv1". This layer has no information about its expected input shape, and thus cannot be built. You can build it manually via: layer.build(batch_input_shape)
Why does the layer not have the required shape information? I'm using reshape from the theano backend. Is this the right way of passing individual channels to the next layer?
I asked this question on the keras-user group and I got an answer there:
https://groups.google.com/forum/#!topic/keras-users/bbQ5CbVXT1E
Quoting it:
You need to use a Lambda layer, like: Lambda(lambda x: x[:, 0:1, :, :], output_shape=lambda x: (x[0], 1, x[2], x[3]))
Note that such a manual implementation of a separable convolution would be horribly inefficient. The correct solution is to use the TensorFlow backend.
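The slice 0:1 in the quoted answer, rather than the plain index 0 used in the question, is what keeps the channel axis so that no reshape is needed. A minimal NumPy illustration, assuming the backend tensors follow the same basic indexing rules; the array sizes are made up:

```python
import numpy as np

# Dummy activations in Theano ordering: (batch_size, channels, height, width).
x = np.zeros((4, 16, 26, 26))

print(x[:, 0, :, :].shape)    # (4, 26, 26): an integer index drops the channel axis
print(x[:, 0:1, :, :].shape)  # (4, 1, 26, 26): a slice keeps a channel axis of size 1

# Same shape function as the output_shape argument in the quoted answer.
output_shape = lambda s: (s[0], 1, s[2], s[3])
print(output_shape(x.shape))  # (4, 1, 26, 26)
```

The output_shape function simply tells Keras the shape that the slicing produces, since it cannot be inferred from an arbitrary lambda on the Theano backend.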