How to merge two tensors at the beginning of a network in Torch?

Given the following beginning of a network
local net = nn.Sequential()
net:add(nn.SpatialConvolution(3, 64, 4, 4, 2, 2, 1, 1))
with an input tensor input
local input = torch.Tensor(batchSize, 3, 64, 64)
-- during training
local output = net:forward(input)
I want to modify the network to accept a second tensor cond as input
local cond = torch.Tensor(batchSize, 1000, 1, 1)
-- during training
local output = net:forward({input, cond})
I modified the network by adding a JoinTable before the SpatialConvolution is added, like so:
local net = nn.Sequential()
net:add(nn.JoinTable(2, 4))
net:add(nn.SpatialConvolution(3, 64, 4, 4, 2, 2, 1, 1))
This is not working because the two tensors have different sizes in dimensions 2, 3, and 4. Giving the cond tensor a size of (batchSize, 1000, 64, 64) is not an option since it's a waste of memory.
Is there any best practice for merging two different tensors at the beginning of a network so that they can be fed into the first layer?

There is no such thing as "merging" tensors that do not have compatible shapes. You should simply pass a table of tensors, start your network with a SelectTable operation, and work with nngraph rather than a plain Sequential. In particular, how would you expect SpatialConvolution to work on such an odd "tensor" that "narrows down" to your cond? There is no well-defined mathematical operation for this use case, so you have to be more specific about what should happen to each input (which you can express with nngraph and SelectTable).
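To make the shape argument concrete, here is a minimal sketch in plain Python (not Torch) of the rule that concatenation along one dimension requires every other dimension to match. The helper `can_join` and the batch size of 16 are assumptions made for illustration only. Torch dimensions are 1-based, so dimension 2 corresponds to index 1 here.

```python
def can_join(shape_a, shape_b, dim):
    """Return True if two shapes can be concatenated along the 1-based `dim`."""
    if len(shape_a) != len(shape_b):
        return False
    axis = dim - 1  # convert the Torch-style 1-based dimension to a 0-based index
    # Every dimension except the join axis must agree.
    return all(a == b
               for i, (a, b) in enumerate(zip(shape_a, shape_b))
               if i != axis)


batch_size = 16  # assumed value, any batch size gives the same answer
input_shape = (batch_size, 3, 64, 64)
cond_shape = (batch_size, 1000, 1, 1)

print(can_join(input_shape, cond_shape, dim=2))                     # spatial sizes differ
print(can_join(input_shape, (batch_size, 1000, 64, 64), dim=2))     # the memory-hungry variant
```

The first check fails because dimensions 3 and 4 differ (64 vs. 1), which is why the two inputs have to be handled separately before anything is joined.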

Related

What's the purpose of nb_epoch in Keras's fit_generator?

It seems like I could get the exact same result by making num_samples bigger and keeping nb_epoch=1. I thought the purpose of multiple epochs was to iterate over the same data multiple times, but Keras doesn't reinstantiate the generator at the end of each epoch. It just keeps going. For example training this autoencoder:
import numpy as np
from keras.layers import (Convolution2D, MaxPooling2D,
                          UpSampling2D, Activation)
from keras.models import Sequential
rand_imgs = [np.random.rand(1, 100, 100, 3) for _ in range(1000)]
def keras_generator():
    i = 0
    while True:
        print(i)
        rand_img = rand_imgs[i]
        i += 1
        yield (rand_img, rand_img)
layers = ([
    Convolution2D(20, 5, 5, border_mode='same',
                  input_shape=(100, 100, 3), activation='relu'),
    MaxPooling2D((2, 2), border_mode='same'),
    Convolution2D(3, 5, 5, border_mode='same', activation='relu'),
    UpSampling2D((2, 2)),
    Convolution2D(3, 5, 5, border_mode='same', activation='relu')])
autoencoder = Sequential()
for layer in layers:
    autoencoder.add(layer)
gen = keras_generator()
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
history = autoencoder.fit_generator(gen, samples_per_epoch=100, nb_epoch=2)
It seems like I get the same result with (samples_per_epoch=100, nb_epoch=2) as I do for (samples_per_epoch=200, nb_epoch=1). Am I using fit_generator as intended?
Yes, you are right that when using fit_generator these two approaches are equivalent. However, there are a variety of reasons why keeping epochs is reasonable:
Logging: in this case an epoch is the amount of data after which you want to log some important training statistics (e.g. the time or the loss at the end of the epoch).
Keeping the directory structure when you use a generator to load data from your hard disk: when you know how many files you have in your directory, you can adjust batch_size and nb_epoch so that an epoch comprises going through every example in your dataset.
Keeping the structure of the data when using a flow generator: when you have, for example, a set of pictures loaded into Python and you want to use Keras's ImageDataGenerator to apply different kinds of data transformations, setting batch_size and nb_epoch so that an epoch comprises going through every example in your dataset might help you keep track of the progress of your training process.
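The equivalence claimed in this answer can be illustrated without Keras. The sketch below is an assumption-laden stand-in, not Keras's actual implementation: the hypothetical `consume` function mimics a training loop that pulls a fixed number of samples per epoch from a generator that is never re-created, and it ignores details such as the background queue that fit_generator uses.

```python
from itertools import count, islice


def consume(gen, samples_per_epoch, nb_epoch):
    """Draw `samples_per_epoch` items per epoch from a generator that keeps
    running across epochs, and return everything that was seen."""
    seen = []
    for _ in range(nb_epoch):
        # islice takes exactly samples_per_epoch items and leaves the rest
        seen.extend(islice(gen, samples_per_epoch))
    return seen


# count() plays the role of the endless data generator (0, 1, 2, ...)
two_short_epochs = consume(count(), samples_per_epoch=100, nb_epoch=2)
one_long_epoch = consume(count(), samples_per_epoch=200, nb_epoch=1)

print(two_short_epochs == one_long_epoch)
```

Both calls see the same 200 samples in the same order, so the only difference between the two settings is where the epoch boundary (and therefore logging and callbacks) falls.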

Get desired output from convolution (GAN)

I'm trying to code a GAN model for CIFAR-10.
But I have a problem.
How do I get the desired output (3x32x32) from a convolutional network?
My model is based on one I found for MNIST:
model = Sequential()
model.add(Dense(input_dim=100, output_dim=1024))
model.add(Activation('tanh'))
model.add(Dense(128*7*7))
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(Reshape((128, 7, 7), input_shape=(128*7*7,)))
model.add(UpSampling2D(size=(2, 2)))
model.add(Convolution2D(64, 5, 5, border_mode='same'))
model.add(Activation('tanh'))
model.add(UpSampling2D(size=(2, 2)))
model.add(Convolution2D(3, 5, 5, border_mode='same'))
So, from there, I have an output of 3x28x28.
Do you know how I can get 3x32x32?
Thanks!
You can either use padding layers (https://keras.io/layers/convolutional/#zeropadding2d) and then apply convolutions to get a sensible output, or add another UpSampling2D and then apply successive convolutions with border_mode='valid' to get the correct output size. You can do the convolutions earlier so that you do not need as many.
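The size arithmetic behind both suggestions can be checked with a few lines of plain Python. This is only a sketch under stated assumptions: the helper names are hypothetical, all convolutions use stride 1, and the padding of 2 pixels per side and the 5x5 kernel for the 'valid' convolutions are example values, not something the answer prescribes.

```python
def upsample(size, factor=2):
    """Spatial size after nearest-neighbour upsampling."""
    return size * factor


def same_conv(size, kernel):
    """A stride-1 convolution in 'same' mode keeps the spatial size."""
    return size


def valid_conv(size, kernel):
    """A stride-1 convolution in 'valid' mode shrinks the size by kernel - 1."""
    return size - kernel + 1


def zero_pad(size, pad):
    """Zero padding adds `pad` pixels on each side."""
    return size + 2 * pad


# The model from the question: 7 -> 14 -> 28
size = 7
size = same_conv(upsample(size), 5)
size = same_conv(upsample(size), 5)
mnist_size = size

# Option 1: pad the 28x28 maps by 2 pixels per side
padded_size = zero_pad(mnist_size, 2)

# Option 2: upsample once more, then shrink with 5x5 'valid' convolutions
size = upsample(mnist_size)
n_valid_convs = 0
while size > 32:
    size = valid_conv(size, 5)
    n_valid_convs += 1

print(mnist_size, padded_size, size, n_valid_convs)
```

With these example values the second option needs several 'valid' convolutions after the extra upsampling, which is why the answer suggests moving some convolutions earlier in the network.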

How to set proper arguments to build keras Convolution2D NN model [Text Classification]?

I am trying to use a 2D CNN to do text classification on Chinese articles and have trouble setting the arguments of Keras's Convolution2D. I know the basic flow of Convolution2D for images, but I am stuck applying it to my dataset with Keras.
Input data
My data is 9800 Chinese articles; the maximum sentence length is 6810, with a word2vec size of 200.
So the input shape is `(9800, 1, 6810, 200)`
Code for building model
MAX_FEATURES = 6810
# I just picked the number of filters arbitrarily; maybe this is the problem?
nb_filter = 128
input_shape = (1, 6810, 200)
# each word is 200 (word2vec size)
embedding_size = 200
# 3 word length
n_gram = 3
# so stride here is embedding_size*n_gram
model = Sequential()
model.add(Convolution2D(nb_filter, n_gram, embedding_size, border_mode='valid', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(100, 1), border_mode='valid'))
model.add(Dropout(0.5))
model.add(Activation('relu'))
model.add(Flatten())
model.add(Dense(hidden_dims))
model.add(Dropout(0.5))
model.add(Activation('relu'))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
# X is (9800, 1, 6810, 200)
model.fit(X, y, batch_size=32,
          nb_epoch=5,
          validation_split=0.1)
Question 1. I have a problem setting the Convolution2D arguments. My research is below.
The official docs do not contain an example of a 2D CNN for text classification (though they have one for a 1D CNN).
The Convolution2D definition is here https://keras.io/layers/convolutional/:
keras.layers.convolutional.Convolution2D(nb_filter, nb_row, nb_col, init='glorot_uniform', activation=None, weights=None, border_mode='valid', subsample=(1, 1), dim_ordering='default', W_regularizer=None, b_regularizer=None, activity_regularizer=None, W_constraint=None, b_constraint=None, bias=True)
nb_filter: Number of convolution filters to use.
nb_row: Number of rows in the convolution kernel.
nb_col: Number of columns in the convolution kernel.
border_mode: 'valid', 'same' or 'full'. ('full' requires the Theano backend.)
Some research about the arguments:
This issue https://github.com/fchollet/keras/issues/233 is about 2D CNNs for text classification. I read all the comments and picked these:
(1) https://github.com/fchollet/keras/issues/233#issuecomment-117427013
model.add(Convolution2D(nb_filter=N_FILTERS, stack_size=1, nb_row=FIELD_SIZE,
nb_col=1, subsample=(STRIDE, 1)))
(2) https://github.com/fchollet/keras/issues/233#issuecomment-117700913
sequential.add(Convolution2D(nb_feature_maps, 1, n_gram, embedding_size))
But these seem to differ from the current Keras version, and the argument names used by different people are a mess (I hope Keras gets an easily understandable explanation of the arguments).
Another comment I found about the current API:
https://github.com/fchollet/keras/issues/1665#issuecomment-181181000
The current API is as below:
keras.layers.convolutional.Convolution2D(nb_filter, nb_row, nb_col, init='glorot_uniform', activation='linear', weights=None, border_mode='valid', subsample=(1, 1), dim_ordering='th', W_regularizer=None, b_regularizer=None, activity_regularizer=None, W_constraint=None, b_constraint=None)
So (36,1,7,7) seems the reason, the correct arguments would be (36,7,7,...).
Based on the research above and my understanding of convolution, Convolution2D creates a (nb_filter, nb_row, nb_col) filter, slides it with a stride to get one filter result, repeats the sliding, and finally combines the results into an array with shape (1, one_sample_article_length[6810] / nb_filter) that goes to the next layer. Is that right? Does my code above set nb_row and nb_col correctly?
Question 2. What are the proper MaxPooling2D arguments? (For my dataset or in general, either is OK.)
I referred to this issue https://github.com/fchollet/keras/issues/233#issuecomment-117427013 to set the argument; there are two variants:
MaxPooling2D(poolsize=(((nb_features - FIELD_SIZE) / STRIDE) + 1, 1))
MaxPooling2D(poolsize=(maxlen - n_gram + 1, 1))
I have no idea why they calculate the MaxPooling2D argument like that.
Question 3. Any recommendations for batch_size and nb_epoch for this kind of text classification? I have no idea at all.
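On the two pool sizes quoted under Question 2: both expressions equal the number of rows a 'valid' convolution produces, so pooling with that size reduces each feature map to a single value (max-over-time pooling). The sketch below works through that arithmetic for the numbers in this question. It assumes channels-first ('th') ordering, a convolution stride of 1, and non-overlapping pooling; the helper names are hypothetical.

```python
def valid_conv_len(length, kernel, stride=1):
    """Output length of a 'valid' convolution along one axis."""
    return (length - kernel) // stride + 1


def pool_len(length, pool):
    """Output length of non-overlapping 'valid' max pooling along one axis."""
    return length // pool


maxlen, embedding_size, n_gram, nb_filter = 6810, 200, 3, 128

# Convolution2D(nb_filter, n_gram, embedding_size, border_mode='valid')
conv_rows = valid_conv_len(maxlen, n_gram)                   # positions of the n-gram window
conv_cols = valid_conv_len(embedding_size, embedding_size)   # kernel spans the whole embedding
conv_shape = (nb_filter, conv_rows, conv_cols)

# MaxPooling2D(pool_size=(100, 1)) as in the question
rows_after_pool_100 = pool_len(conv_rows, 100)

# MaxPooling2D(pool_size=(maxlen - n_gram + 1, 1)) as in the quoted issue
global_pool_size = maxlen - n_gram + 1
rows_after_global_pool = pool_len(conv_rows, global_pool_size)

print(conv_shape, rows_after_pool_100, global_pool_size, rows_after_global_pool)
```

Under these assumptions the pool size from the issue matches the convolution output exactly, leaving one value per filter, whereas a pool size of 100 leaves many rows per filter for the Flatten layer.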

Deconv implementation in keras output_shape issue

I am implementing the following colorization model, originally written in Caffe. I am confused about the output_shape parameter to supply in Keras.
model.add(Deconvolution2D(256,4,4,border_mode='same',
                          output_shape=(None,3,14,14),subsample=(2,2),dim_ordering='th',name='deconv_8.1'))
I have added a dummy output_shape parameter. But how can I determine the output parameter? In the Caffe model the layer is defined as:
layer {
  name: "conv8_1"
  type: "Deconvolution"
  bottom: "conv7_3norm"
  top: "conv8_1"
  convolution_param {
    num_output: 256
    kernel_size: 4
    pad: 1
    dilation: 1
    stride: 2
  }
}
If I do not supply this parameter, the code gives a parameter error, but I cannot work out what I should supply as output_shape.
P.S. I already asked on the Data Science forum page with no response, maybe due to the small user base.
What output shape does the Caffe deconvolution layer produce?
For this colorization model in particular, you can simply refer to page 24 of their paper (which is linked from their GitHub page).
So basically the output shape of this deconvolution layer in the original model is [None, 56, 56, 128]. This is what you want to pass to Keras as output_shape. The only problem is that, as I mention in the section below, Keras doesn't really use this parameter to determine the output shape, so you need to run a dummy prediction to find what your other parameters need to be in order to get what you want.
More generally, the Caffe source code for computing its Deconvolution layer output shape is:
const int kernel_extent = dilation_data[i] * (kernel_shape_data[i] - 1) + 1;
const int output_dim = stride_data[i] * (input_dim - 1)
    + kernel_extent - 2 * pad_data[i];
Which, with a dilation argument equal to 1, reduces to just:
const int output_dim = stride_data[i] * (input_dim - 1)
    + kernel_shape_data[i] - 2 * pad_data[i];
Note that this matches the Keras documentation when the parameter a is zero:
Formula for calculation of the output shape 3, 4: o = s (i - 1) + a + k - 2p
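As a quick check of the formula, here is a minimal Python sketch. The function `deconv_output_dim` is a hypothetical helper that merges the Caffe expression (with dilation) and the Keras expression (with the extra term a) into one place for illustration; neither library exposes it. The 28-pixel input size for the Caffe layer is an assumption, since the question does not state the size of conv7_3norm.

```python
def deconv_output_dim(input_dim, kernel, stride, pad, dilation=1, a=0):
    """Output size along one spatial axis of a transposed convolution:
    o = s * (i - 1) + a + kernel_extent - 2 * p"""
    kernel_extent = dilation * (kernel - 1) + 1
    return stride * (input_dim - 1) + a + kernel_extent - 2 * pad


# Caffe layer from the question: kernel_size 4, stride 2, pad 1, dilation 1,
# applied to an assumed 28x28 input
caffe_out = deconv_output_dim(28, kernel=4, stride=2, pad=1)

# Keras docstring example: 3x3 kernel, stride 1, 'valid' (no padding), 12x12 input
keras_out = deconv_output_dim(12, kernel=3, stride=1, pad=0)

print(caffe_out, keras_out)
```

With kernel 4, stride 2 and pad 1 the layer exactly doubles its input size, so a 56x56 output corresponds to a 28x28 input under this formula.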
How to verify actual output shape with your Keras backend
This is tricky, because the actual output shape depends on the backend implementation and configuration. Keras is currently unable to find it on its own. So you actually have to execute a prediction on some dummy input to find the actual output shape. Here's an example of how to do this from the Keras docs for Deconvolution2D:
To pass the correct `output_shape` to this layer,
one could use a test model to predict and observe the actual output shape.
# Examples
```python
# apply a 3x3 transposed convolution with stride 1x1 and 3 output filters on a 12x12 image:
model = Sequential()
model.add(Deconvolution2D(3, 3, 3, output_shape=(None, 3, 14, 14), border_mode='valid', input_shape=(3, 12, 12)))
# Note that you will have to change the output_shape depending on the backend used.
# we can predict with the model and print the shape of the array.
dummy_input = np.ones((32, 3, 12, 12))
# For TensorFlow dummy_input = np.ones((32, 12, 12, 3))
preds = model.predict(dummy_input)
print(preds.shape)
# Theano GPU: (None, 3, 13, 13)
# Theano CPU: (None, 3, 14, 14)
# TensorFlow: (None, 14, 14, 3)
```
Reference: https://github.com/fchollet/keras/blob/master/keras/layers/convolutional.py#L507
Also, you might be curious why the output_shape parameter apparently doesn't really define the output shape. According to the post "Deconvolution2D layer in keras", this is why:
Back to Keras and how the above is implemented. Confusingly, the output_shape parameter is actually not used for determining the output shape of the layer, and instead they try to deduce it from the input, the kernel size and the stride, while assuming only valid output_shapes are supplied (though it's not checked in the code to be the case). The output_shape itself is only used as input to the backprop step. Thus, you must also specify the stride parameter (subsample in Keras) in order to get the desired result (which could've been determined by Keras from the given input shape, output shape and kernel size).

Average saved output from multiple runs of function

I have a function that has 11 input parameters.
MyFunction(40, 40, 1, 1, 1, 5, 0, 1, 0, 1500, 'MyFile');
The input parameter 'MyFile', when passed to MyFunction, saves a text file using the save command that has 6 columns and as many rows as the 10th input parameter (e.g. 1500). I usually then load these files back into MATLAB when I am ready to analyze different runs.
I'd like to run MyFunction m times and ultimately have 'MyFile' be a measure of central tendency (e.g. mean or median) of those m runs.
m = 10;
for i = 1:m
    MyFunction(40, 40, 1, 1, 1, 5, 0, 1, 0, 1500, 'MyFile');
end
I could use the for-loop to generate a new 'MyFile' name for each iteration (e.g. MyFile1, MyFile2,...,MyFileM) with something like MyFile = sprintf('MyFile%d', i); and then load all of the MyFiles back into MATLAB, take their average, and save it as an UltimateMyFile, but this seems cumbersome. Is there a better method to average these output files more directly? Should I store the files as an object, use dlmwrite, or -append?
Thanks.
Since you are trying to find the median, you need access to all the data.
You can define a 3-dimensional array, say
data = zeros(1500,6,m);
and then at each step of the for loop update it:
data(:,:,i) = MyFunction(40, 40, 1, 1, 1, 5, 0, 1, 0, 1500);
Of course, you will need to redefine your function to return the right output.
However, if you need to access the data at some other time, then you are better off writing it to a file and reading it from there.
If you are only interested in the average, you can keep a running total as each case is analyzed and then just divide it by the number of cases (m).
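The two strategies in this answer (keep every run for the median, or keep only a running total for the mean) are sketched below in Python rather than MATLAB, purely as an illustration. `my_function` is a hypothetical stand-in that returns random numbers in place of the real MyFunction output, and the seeds are only there to make the sketch reproducible.

```python
import random
from statistics import median


def my_function(n_rows, n_cols=6, seed=None):
    """Hypothetical stand-in for MyFunction: returns an n_rows x n_cols table."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n_cols)] for _ in range(n_rows)]


m, n_rows, n_cols = 10, 1500, 6

runs = []                                          # every run is kept, needed for the median
total = [[0.0] * n_cols for _ in range(n_rows)]    # running total, enough for the mean

for i in range(m):
    out = my_function(n_rows, n_cols, seed=i)
    runs.append(out)
    for r in range(n_rows):
        for c in range(n_cols):
            total[r][c] += out[r][c]

# Element-wise mean from the running total
mean_table = [[value / m for value in row] for row in total]

# Element-wise median needs all m values of each cell at once
median_table = [[median(run[r][c] for run in runs) for c in range(n_cols)]
                for r in range(n_rows)]

print(len(mean_table), len(mean_table[0]), len(median_table), len(median_table[0]))
```

The running total uses memory for a single table regardless of m, while the median requires all m tables to be held (or re-read from files) at the same time.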