Keras: How to merge a dense layer and an embedding layer

I use Keras and I am trying to concatenate two different layers into one vector (the first values of the vector would be the values of the first layer, and the rest would be the values of the second layer).
One of these layers is a Dense layer and the other is an Embedding layer.
I know how to merge two embedding layers or two dense layers, but I don't know how to merge an embedding layer and a dense layer (there is a dimensionality problem).
A simple example would be like this:
L_branch = Sequential()
L_branch.add(Dense(10, input_shape=(4,), activation='relu'))
L_branch.add(BatchNormalization())

R_branch = Sequential()
R_branch.add(Embedding(1000, 64, input_length=5))

final_branch.add(Merge([L_branch, R_branch], mode='concat'))
But this does not work, because you can't merge layers with different dimensionalities.
PS: Sorry, English is not my native language; I hope you understand my problem.
Best regards.

Use a Flatten layer. The Embedding layer outputs a 3D tensor of shape (batch, 5, 64); Flatten turns it into (batch, 320), which can then be concatenated with the (batch, 10) output of the Dense branch.
L_branch = Sequential()
L_branch.add(Dense(10, input_shape=(4,), activation='relu'))
L_branch.add(BatchNormalization())

R_branch = Sequential()
R_branch.add(Embedding(1000, 64, input_length=5))
R_branch.add(Flatten()) # <--

final_branch = Sequential() # <--
final_branch.add(Merge([L_branch, R_branch], mode='concat'))
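Note that the Merge layer with mode='concat' is the old Keras 1 API and was removed in Keras 2. A minimal sketch of the same model with the functional API and Concatenate (layer sizes taken from the question) would be:

from keras.layers import Input, Dense, BatchNormalization, Embedding, Flatten, Concatenate
from keras.models import Model

dense_in = Input(shape=(4,))
x = Dense(10, activation='relu')(dense_in)
x = BatchNormalization()(x)

emb_in = Input(shape=(5,))
e = Embedding(1000, 64, input_length=5)(emb_in)
e = Flatten()(e)               # (batch, 5, 64) -> (batch, 320)

merged = Concatenate()([x, e]) # (batch, 330)
model = Model(inputs=[dense_in, emb_in], outputs=merged)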

Related

What is wrong with my siamese network? Why does it output the same value (approx. 0.5) irrespective of the input pairs?

I'm trying to build a Siamese network for the https://www.kaggle.com/moltean/fruits dataset. I've picked 10 images per class from this dataset; there are a total of 131 classes. I'm using the model below to train my network, but it fails to converge. I saw a strange behaviour: after 3000 epochs my result is 0.5000003 irrespective of the input pair I give, and my loss stops at 0.61. The specifications of the network are as given in the paper. I tried changing the following things:
Changing the Dense layer activation to ReLU
Importing 'ImageNet' weights for ResNet50
Increasing and decreasing the learning rate
I also checked the batch inputs to see if the correct input pair (x) is paired with the correct y value. However, I think I'm doing something basically wrong. Glad if you could help me. Thank you :)
The notebook is hosted on Kaggle: https://www.kaggle.com/krishnaprasad96/siamese-network.
If you have doubts about how certain parts of the code work, refer to https://medium.com/@krishnaprasad_54871/siamese-networks-line-by-line-explanation-for-beginners-55b8be1d2fc6
# Building the siamese model
import keras
from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import Adam
from keras.regularizers import l2
from keras.utils import plot_model

input_shape = (100, 100, 3)
left_input = Input(input_shape)
right_input = Input(input_shape)

W_init = keras.initializers.RandomNormal(mean=0.0, stddev=1e-2)
b_init = keras.initializers.RandomNormal(mean=0.5, stddev=1e-2)

model = keras.models.Sequential([
    keras.layers.Conv2D(64, (10, 10), activation='relu', input_shape=input_shape,
                        kernel_initializer=W_init, kernel_regularizer=l2(2e-4)),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(128, (7, 7), activation='relu', kernel_initializer=W_init,
                        bias_initializer=b_init, kernel_regularizer=l2(2e-4)),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(128, (4, 4), activation='relu', kernel_initializer=W_init,
                        bias_initializer=b_init, kernel_regularizer=l2(2e-4)),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(256, (4, 4), activation='relu', kernel_initializer=W_init,
                        bias_initializer=b_init, kernel_regularizer=l2(2e-4)),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(4096, activation='sigmoid', kernel_initializer=W_init,
                       bias_initializer=b_init, kernel_regularizer=l2(1e-3))
])

encoded_l = model(left_input)
encoded_r = model(right_input)

subtracted = keras.layers.Subtract()([encoded_l, encoded_r])
prediction = Dense(1, activation='sigmoid', bias_initializer=b_init)(subtracted)

siamese_net = Model(inputs=[left_input, right_input], outputs=prediction)
optimizer = Adam(learning_rate=0.0006)
siamese_net.compile(loss='binary_crossentropy', optimizer=optimizer)
plot_model(siamese_net, show_shapes=True, show_layer_names=True)
I have seen the notebook on Kaggle; thanks for all the information. But it seems that the training/validation split is wrong: this model trains on the first 91 classes only, so what about the remaining 40 classes? The train and validation sets should come from the same classes. Suppose I have 10 images in a class: I can use 8 images for training and 2 images for validation. The train/validation split should be over images, not over classes; a per-class split could look like the sketch below. Also, I couldn't see the testing script; it would be a great help if you could provide that as well.
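A minimal sketch of such a per-class split (split_per_class, root_dir, and the one-directory-per-class layout are my assumptions, not from the notebook):

import os, random

def split_per_class(root_dir, val_per_class=2):
    # Assumes one sub-directory per class, each containing that class's images
    train, val = [], []
    for cls in os.listdir(root_dir):
        images = os.listdir(os.path.join(root_dir, cls))
        random.shuffle(images)
        val += [(cls, f) for f in images[:val_per_class]]
        train += [(cls, f) for f in images[val_per_class:]]
    return train, val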

Bad performance of a Keras model

I am working with the Boston housing price dataset. I have my X and Y with a shape of (506, 13). Then I define my model:
from keras.models import Sequential
from keras.layers import Dense, Dropout

def basic_model_1():
    t_model = Sequential()
    t_model.add(Dense(13, activation="tanh", input_dim=13))
    t_model.add(Dense(10, activation="tanh"))
    t_model.add(Dropout(0.2))
    t_model.add(Dense(6, activation="tanh"))
    t_model.add(Dense(3, activation="tanh"))
    t_model.add(Dense(1))
    print(t_model.summary())
    t_model.compile(loss='mean_squared_error',
                    optimizer='adam',
                    metrics=['accuracy'])
    t_model.fit(X, Y, epochs=200, batch_size=10, validation_split=0.20)
    return t_model
When I run this model, I get pretty bad performance: val_acc is 0.0098. I changed the activation function to sigmoid or relu; performance increases only slightly. What do I need to do to increase model performance?
In my opinion you could:
1) Add more neurons to every layer (try powers of 2, e.g. 64, 128, 256).
2) Add more dropout layers, one after every Dense layer.
3) Add much more data.
There is nothing wrong with your model architecture. The only thing I would suggest is to use kernel_initializer='normal' and activation='relu' in all Dense layers (especially in the output layer) since it's a regression model. Note also that 'accuracy' is not a meaningful metric for regression; monitor the loss (MSE) or mean absolute error instead.
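A minimal sketch with those changes applied (the layer sizes are my choice, not from the answer):

from keras.models import Sequential
from keras.layers import Dense, Dropout

def basic_model_2():
    t_model = Sequential()
    t_model.add(Dense(64, kernel_initializer='normal', activation='relu', input_dim=13))
    t_model.add(Dropout(0.2))
    t_model.add(Dense(32, kernel_initializer='normal', activation='relu'))
    t_model.add(Dropout(0.2))
    t_model.add(Dense(1, kernel_initializer='normal'))  # linear output for regression
    t_model.compile(loss='mean_squared_error', optimizer='adam',
                    metrics=['mae'])  # MAE instead of accuracy
    return t_model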

How to build a recurrent neural net in Keras where each input goes through a layer first?

I'm trying to build a neural net in Keras that looks like this (diagram omitted):
x_1, x_2, ... are input vectors that undergo the same transformation f. f is itself a layer whose parameters must be learned. The sequence length n varies across instances.
I'm having trouble understanding two things here:
What should the input look like?
I'm thinking of a 2D tensor with shape (number_of_x_inputs, x_dimension), where x_dimension is the length of a single vector x. Can such a 2D tensor have a variable shape? I know tensors can have variable shapes for batch processing, but I don't know if that helps me here.
How do I pass each input vector through the same transformation before feeding it to the RNN layer?
Is there a way to sort of extend for example a GRU so that an f layer is added before going through the actual GRU cell?
I'm not an expert, but I hope this helps.
Question 1:
Vectors x1, x2, ... xn can have different shapes, but I'm not sure whether the instances of x1 can have different shapes. When I have different lengths, I usually pad the short sequences with 0s, as in the sketch below.
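For example, with Keras' built-in padding helper (the toy sequences are just an illustration):

from keras.preprocessing.sequence import pad_sequences

seqs = [[1, 2, 3], [4, 5], [6]]
padded = pad_sequences(seqs, maxlen=5, padding='post')  # pads with 0s up to length 5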
Question 2:
I'm not sure about extending a GRU, but I would do something like this:
from keras.layers import Input, Conv1D, LSTM, Dense, concatenate
from keras.models import Model

x_dims = [50, 40, 30, 20, 10]
n = 5

def network():
    shared_f = Conv1D(5, 3, activation='relu')
    shared_LSTM = LSTM(10)
    inputs = []
    to_concat = []
    for i in range(n):
        x_i = Input(shape=(x_dims[i], 1), name='x_' + str(i))
        inputs.append(x_i)
        step1 = shared_f(x_i)
        to_concat.append(shared_LSTM(step1))
    merged = concatenate(to_concat)
    final = Dense(2, activation='softmax')(merged)
    model = Model(inputs=inputs, outputs=[final])
    model.compile(loss='mse', optimizer='adam', metrics=['accuracy'])
    return model

m = network()
In this example, I used a Conv1D as the shared f transformation, but you could use something else (an Embedding, etc.). If f should instead be applied to every timestep of a single variable-length sequence, TimeDistributed is the idiomatic way; see the sketch below.
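A minimal sketch of that alternative, assuming 10-dimensional input vectors and an 8-unit Dense f (both sizes are my assumptions): TimeDistributed applies the same Dense layer to every timestep before the GRU.

from keras.layers import Input, TimeDistributed, Dense, GRU
from keras.models import Model

inp = Input(shape=(None, 10))                           # variable-length sequence of 10-d vectors
f = TimeDistributed(Dense(8, activation='relu'))(inp)   # same Dense weights applied at each timestep
out = GRU(16)(f)
model = Model(inputs=inp, outputs=out)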

PyTorch: How to convert pretrained FC layers in a CNN to Conv layers

I want to convert a pre-trained CNN (like VGG-16) to a fully convolutional network in PyTorch. How can I do so?
You can do that as follows (see the comments for a description):
import torch
import torch.nn as nn
from torchvision import models

# 1. LOAD PRE-TRAINED VGG16
model = models.vgg16(pretrained=True)

# 2. GET CONV LAYERS
features = model.features

# 3. GET FULLY CONNECTED LAYERS (drop the final classification layer)
fcLayers = nn.Sequential(
    *list(model.classifier.children())[:-1]
)

# 4. CONVERT FULLY CONNECTED LAYERS TO CONVOLUTIONAL LAYERS

### convert the first fc layer to a conv layer with a 512x7x7 kernel
fc = fcLayers[0].state_dict()
in_ch = 512
out_ch = fc["weight"].size(0)

firstConv = nn.Conv2d(in_ch, out_ch, kernel_size=7)

### copy the weights from the fc layer, reshaped into conv form
firstConv.load_state_dict({"weight": fc["weight"].view(out_ch, in_ch, 7, 7),
                           "bias": fc["bias"]})

# CREATE A LIST OF CONVS
convList = [firstConv]

# Similarly convert the remaining linear layers to 1x1 conv layers
for module in fcLayers[1:]:
    if isinstance(module, nn.Linear):
        fc = module.state_dict()
        in_ch = fc["weight"].size(1)
        out_ch = fc["weight"].size(0)
        conv = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        conv.load_state_dict({"weight": fc["weight"].view(out_ch, in_ch, 1, 1),
                              "bias": fc["bias"]})
        convList += [conv]
    else:
        # keep the other layers (ReLU, Dropout) as they are
        convList += [module]

# Set the conv layers as an nn.Sequential module
convLayers = nn.Sequential(*convList)
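Hypothetical usage (the chaining and the input size are my own sketch, not part of the answer): joining the original feature extractor with the converted layers gives a fully convolutional network that produces a spatial grid of 4096-d descriptors.

fcn = nn.Sequential(features, convLayers)
with torch.no_grad():
    out = fcn(torch.randn(1, 3, 224, 224))  # -> torch.Size([1, 4096, 1, 1])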

How to have parallel convolutional layers in keras?

I am a little new to neural networks and Keras. I have some images of size 6*7, and the size of the filter is 15. I want to have several filters, train a convolutional layer separately on each, and then combine them. I have looked at one example here:
model = Sequential()
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
                        border_mode='valid',
                        input_shape=input_shape))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Dropout(0.25))
model.add(Flatten(input_shape=input_shape))
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('tanh'))
This model works with one filter. Can anybody give me some hints on how to modify the model to work with parallel convolutional layers?
Thanks
Here is an example of designing a network of parallel convolution and subsampling layers in Keras 2. I hope this resolves your problem.
import keras
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Model
from keras.utils import plot_model

rows, cols = 100, 15
num_classes = 10  # set this to your number of classes

def create_convnet(img_path='network_image.png'):
    input_shape = Input(shape=(rows, cols, 1))

    tower_1 = Conv2D(20, (100, 5), padding='same', activation='relu')(input_shape)
    tower_1 = MaxPooling2D((1, 11), strides=(1, 1), padding='same')(tower_1)

    tower_2 = Conv2D(20, (100, 7), padding='same', activation='relu')(input_shape)
    tower_2 = MaxPooling2D((1, 9), strides=(1, 1), padding='same')(tower_2)

    tower_3 = Conv2D(20, (100, 10), padding='same', activation='relu')(input_shape)
    tower_3 = MaxPooling2D((1, 6), strides=(1, 1), padding='same')(tower_3)

    merged = keras.layers.concatenate([tower_1, tower_2, tower_3], axis=1)
    merged = Flatten()(merged)

    out = Dense(200, activation='relu')(merged)
    out = Dense(num_classes, activation='softmax')(out)

    model = Model(input_shape, out)
    plot_model(model, to_file=img_path)
    return model
plot_model will save an image of the network to network_image.png (the rendered image is omitted here).
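Hypothetical usage of this function (the compile settings are my assumption, not from the answer):

model = create_convnet()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()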
My approach is to create another model that defines all the parallel convolution and pooling operations and concatenates all the parallel result tensors into a single output tensor. You can then add this parallel model graph to your sequential model just like a layer. Here is my solution; I hope it solves your problem.
# variable initialization
from keras import Input, Model, Sequential
from keras.layers import Conv2D, MaxPooling2D, Concatenate, Activation, Dropout, Flatten, Dense

nb_filters = 100
kernel_size = {}
kernel_size[0] = [3, 3]
kernel_size[1] = [4, 4]
kernel_size[2] = [5, 5]
input_shape = (32, 32, 3)
pool_size = (2, 2)
nb_classes = 2
no_parallel_filters = 3

# create a separate model graph for parallel processing with different filter sizes
# apply 'same' padding so that every branch produces an output tensor of the same
# size for concatenation, then concat all parallel outputs
inp = Input(shape=input_shape)
convs = []
for k_no in range(len(kernel_size)):
    conv = Conv2D(nb_filters, tuple(kernel_size[k_no]),
                  padding='same',
                  activation='relu',
                  input_shape=input_shape)(inp)
    pool = MaxPooling2D(pool_size=pool_size)(conv)
    convs.append(pool)

if len(kernel_size) > 1:
    out = Concatenate()(convs)
else:
    out = convs[0]

conv_model = Model(inputs=inp, outputs=out)

# add the created model graph to a sequential model
model = Sequential()
model.add(conv_model)  # add the model just like a layer
model.add(Conv2D(nb_filters, tuple(kernel_size[1])))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('tanh'))
For more information, refer to this similar question: Combining the outputs of multiple models into one model