How to connect some nodes directly to the output layer in Keras - neural-network

How can I build a network that is not fully connected in Keras? I am trying to make a network in which some input-layer nodes are connected directly to the output layer rather than to the hidden layer. Is there an easy way to do this in Keras?
Thanks!

Yes, this is possible. The easiest way to do this is to specify two inputs:
in_1 = Input(...)
in_2 = Input(...)
hidden = Dense(...)(in_1)
# combine secondary inputs and hidden outputs
# (merge is the Keras 1 API; in Keras 2 and later use keras.layers.concatenate([in_2, hidden]))
in_2_and_hidden = merge([in_2, hidden], mode='concat')
# feed combined vector to output
output = Dense(...)(in_2_and_hidden)
The documentation is better at explaining what merge does in detail. The general idea of multiple inputs and the functional model can be read here.
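To make the data flow concrete, here is a framework-free sketch in plain Python of what the two-input wiring computes. The helper `dense`, the `params` dictionary, the weights, and the layer sizes are all made up for illustration and are not part of Keras; the point is only that `in_2` skips the hidden layer and reaches the output layer through the concatenation.

```python
import math

def dense(x, weights, biases, activation=None):
    """Fully connected layer: one output per row of `weights`."""
    out = [sum(w * v for w, v in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [activation(o) for o in out] if activation else out

def forward(in_1, in_2, params):
    # in_1 goes through the hidden layer; in_2 bypasses it.
    hidden = dense(in_1, params["W_h"], params["b_h"], math.tanh)
    # Concatenate the bypass inputs with the hidden activations.
    in_2_and_hidden = in_2 + hidden
    # The output layer sees both in_2 and the hidden activations.
    return dense(in_2_and_hidden, params["W_out"], params["b_out"])

# Hypothetical sizes: 2 inputs into the hidden layer, 1 bypass input,
# 2 hidden units, 1 output unit.
params = {
    "W_h": [[0.5, -0.2], [0.1, 0.3]],
    "b_h": [0.0, 0.1],
    "W_out": [[1.0, 0.4, -0.6]],  # columns: [in_2, hidden_0, hidden_1]
    "b_out": [0.05],
}
y = forward([1.0, 2.0], [3.0], params)
```

Changing `in_2` alters the output only through the first column of `W_out`, which is exactly the "connected to the output layer but not to the hidden layer" behaviour asked about.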

Related

NSLocalizedDescription = "The size of the output layer 'Identity' in the neural network does not match the number of classes in the classifier."

I just created a model that does binary classification and ends with a Dense layer of 1 unit with sigmoid activation. However, I now get this error when I try to convert it to Core ML.
I tried changing the number of units to 2 and the activation to softmax, but it still didn't work.
import coremltools as ct
from PIL import Image
# 1. define the input type (scale pixel values to [0, 1])
image_input = ct.ImageType(scale=1/255)
# 2. give the classifier its class labels
classifier_config = ct.ClassifierConfig(class_labels=[0, 1]) #ERROR here
# 3. convert the model
coreml_model = ct.convert("mask_detection_model_surgical_mask.h5",
                          inputs=[image_input], classifier_config=classifier_config)
# 4. load and resize an example image
example_image = Image.open("Unknown3.jpg").resize((256, 256))
# make a prediction using Core ML
# (mymodel is not defined in this snippet; presumably the original Keras model)
out_dict = coreml_model.predict({mymodel.input_names[0]: example_image})
print(out_dict["classLabels"])
# save to disk
#coreml_model.save("FINALLY.mlmodel")
I found the answer to my question.
Use a final Dense layer with 2 units and softmax activation, with either loss='binary_crossentropy' or loss='categorical_crossentropy'.
Good luck to hundreds of people who posted a similar question but received no answer.
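One way to see why a two-unit softmax head can stand in for a one-unit sigmoid head: a softmax over the logits [0, z] gives class 1 exactly the probability sigmoid(z), so the two-output form carries the same information while providing one output per class label. The check below is plain Python and independent of Keras and Core ML; the function names are chosen here for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    # Subtract the maximum for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def two_class_probs(z):
    """Two-class probabilities equivalent to a single sigmoid logit z."""
    return softmax([0.0, z])

# The class-1 probability matches the sigmoid for a range of logits.
for z in (-3.0, -0.5, 0.0, 1.2, 4.0):
    p0, p1 = two_class_probs(z)
    assert abs(p1 - sigmoid(z)) < 1e-12
    assert abs(p0 + p1 - 1.0) < 1e-12
```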

merge different models with different inputs Keras

I would like to train two different Conv models in Keras with different input dimensions.
I have:
input_size=4
input_sizeB=6
model = Sequential()
model.add(Conv2D(filters=10, input_shape=(1, time_steps, input_size),
                 kernel_size=(24, 3), activation='relu',
                 data_format='channels_first',
                 kernel_regularizer=regularizers.l2(0.001)))
model.add(Flatten())
# note: add() returns None, so A and B do not hold the layer outputs
A = model.add(Dense(25, activation='tanh',
                    kernel_regularizer=regularizers.l2(0.003)))
model2 = Sequential()
model2.add(Conv2D(filters=10, input_shape=(1, time_steps, input_sizeB),
                  kernel_size=(24, 3), activation='relu',
                  data_format='channels_first',
                  kernel_regularizer=regularizers.l2(0.001)))
model2.add(Flatten())
B = model2.add(Dense(25, activation='tanh',
                     kernel_regularizer=regularizers.l2(0.003)))
Now I would like to merge the two dense layers at the end of both Conv nets.
How should I do this?
Using the Sequential API, you can use the Merge layer (doc) as follows:
merged_layer = Merge([model, model2], mode='concat') # mode='sum', 'ave', etc.
merged_model = Sequential()
merged_model.add(merged_layer)
Note that this may raise a deprecation warning, since the Sequential Merge layer is being phased out; depending on your Keras version, the code should still work. Alternatively, consider the Functional API, which offers more flexibility in that regard; cf. the predefined merge layers Keras provides for each operation (doc). An example:
merged_layer = Concatenate()([model.output, model2.output])
merged_model = Model([model.input, model2.input], merged_layer)

Using hidden activations in loss function

I want to create a custom loss function for a double-input double-output model in Keras that:
minimizes the reconstruction error of two autoencoders;
maximizes the correlation of the bottleneck features of the autoencoders.
For this I need to pass to the loss function:
both inputs;
both outputs / reconstructions;
output of intermediate layers for both (hidden activations).
I know I can pass both inputs and outputs to Model, but am struggling to find a way to pass the hidden activations.
I could create two new Models that have the output of the intermediate layers and pass that to loss, like:
intermediate_layer_model1 = Model(input=input1, output=autoencoder.get_layer('encoded1').output)
intermediate_layer_model2 = Model(input=input2, output=autoencoder.get_layer('encoded2').output)
autoencoder.compile(optimizer='adadelta', loss=loss(intermediate_layer_model1, intermediate_layer_model2))
But still, I would need to find a way to match the y_true in loss to the correct intermediate model.
What is the right way to approach this?
Edit
Here's an approach that I think should work. Simplified:
# autoencoder 1
input1 = Input(shape=(input_dim,))
encoded1 = Dense(encoding_dim, activation='relu', name='encoded1')(input1)
decoded1 = Dense(input_dim, activation='sigmoid', name='decoded1')(encoded1)
# autoencoder 2
input2 = Input(shape=(input_dim,))
encoded2 = Dense(encoding_dim, activation='relu', name='encoded2')(input2)
decoded2 = Dense(input_dim, activation='sigmoid', name='decoded2')(encoded2)
# merge encodings
merge_layer = merge([encoded1, encoded2], mode='concat', name='merge', concat_axis=1)
model = Model(input=[input1, input2], output=[decoded1, decoded2, merge_layer])
model.compile(optimizer='rmsprop', loss={
'decoded1': 'binary_crossentropy',
'decoded2': 'binary_crossentropy',
'merge': correlation,
})
Then in correlation I can split y_pred and do the calculations.
How about:
Defining a single model with multiple outputs (make sure the coding and reconstruction layers are named properly):
duo_model = Model(input=input, output=[coding_layer, reconstruction_layer])
Compiling your model with two different losses (or even performing a loss reweighting):
duo_model.compile(optimizer='rmsprop',
loss={'coding_layer': correlation_loss,
'reconstruction_layer': 'mse'})
Taking your final models as:
encoder = Model(input=input, output=[coding_layer])
autoencoder = Model(input=input, output=[reconstruction_layer])
After proper compilation this should do the job.
When it comes to defining a proper correlation loss function, there are two cases:
when the coding layer and your output layer have the same dimension, you could easily use the predefined cosine_proximity function from the Keras library.
when the coding layer has a different dimensionality, you should first embed the coding vector and the reconstruction vector into the same space and then compute the correlation there. Remember that this embedding should be either a Keras layer / function or a Theano / TensorFlow operation (depending on which backend you are using). Of course, you can compute both the embedding and the correlation as part of one loss function.
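As a reference for what a correlation loss could compute, here is a plain-Python sketch of the negative Pearson correlation between two equally sized feature vectors. It is only an assumption about what `correlation_loss` might look like; in a real model the same arithmetic would have to be written with backend tensor operations, and the names `neg_correlation` and `eps` are chosen here for illustration.

```python
import math

def neg_correlation(a, b, eps=1e-8):
    """Negative Pearson correlation of two equal-length sequences.

    Minimizing this value maximizes the correlation. `eps` guards
    against division by zero when one input is constant.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Center both vectors.
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    # Covariance term and the product of the two norms.
    cov = sum(x * y for x, y in zip(da, db))
    norm = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return -cov / (norm + eps)
```

Perfectly correlated inputs give a value close to -1, perfectly anti-correlated inputs a value close to +1, so a lower loss means more strongly correlated bottleneck features.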

How to reuse an existing neural network to train a new one using TensorFlow?

I want to train a new neural network using TensorFlow by reusing the lower layers of an existing neural network (which is already trained). I want to drop the top layers of the existing network and replace them with new layers, and I also want to lock the lowest layers to prevent backpropagation from modifying them. Here's a little ascii art to summarize this:
*Original model*          *New model*
Output Layer              Output Layer (new)
      |                         |
Hidden Layer 3            Hidden Layer 3 (copied)
      |            ==>          |
Hidden Layer 2            Hidden Layer 2 (copied+locked)
      |                         |
Hidden Layer 1            Hidden Layer 1 (copied+locked)
      |                         |
Inputs                    Inputs
What's a good way to do this?
Edit
My original network was created like this:
X = tf.placeholder(tf.float32, shape=(None, 500), name="X")
y = tf.placeholder(tf.int64, shape=(None), name="y")
hidden1 = fully_connected(X, 300, scope="hidden1")
hidden2 = fully_connected(hidden1, 100, scope="hidden2")
hidden3 = fully_connected(hidden2, 50, scope="hidden3")
output = fully_connected(hidden3, 5, activation_fn=None, scope="output")
xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=output)
loss = tf.reduce_mean(xentropy, name="loss")
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
training_op = optimizer.minimize(loss)
init = tf.initialize_all_variables()
saver = tf.train.Saver()
# ... Train then save the network using the saver
What's the code that would load this network, lock the 2 lower hidden layers, and replace the output layer? If possible, it would be great to be able to cache the output of the top locked layer (hidden2) for each input, to speed up training.
Extra details
I looked at retrain.py and the corresponding How-To (a very interesting read). The code basically loads the original model, then it computes the output of the bottleneck layer (ie. the last hidden layer before the output layer) for each input. Then it creates a brand new model and trains it using the bottleneck outputs as inputs. This basically answers my question for the copied+locked layers: I just need to run the original model on the whole training set and store the output of the top-most locked layer. But I don't know how to handle the copied but unlocked (ie. trainable) layers (eg. Hidden Layer 3 in my diagram).
Thanks!
TensorFlow gives you fine-grained control over the set of parameters (Variables) you update in every training step. For instance, in your model, suppose the layers are all fully connected layers. Then you would have a weights parameter and a biases parameter for each layer. Let's say you have the corresponding Variable objects in W1, b1, W2, b2, W3, b3, Woutput and boutput. Assuming you are using the Optimizer interface, and assuming that loss is the value you want to minimize, you can train only Hidden Layer 3 and the output layer by doing the following:
opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)
grads_and_vars = opt.compute_gradients(loss, var_list=[W3, b3, Woutput, boutput])
train_op = opt.apply_gradients(grads_and_vars)
NOTE: opt.minimize(loss, var_list=var_list) does the equivalent of the above, but I split it in two to illustrate the details. (var_list must be passed by keyword, because the second positional argument of minimize is global_step.)
opt.compute_gradients computes the gradients with respect to a specific set of your model parameters, and you have full control over what you consider to be your model parameters. Note that you have to initialize the Hidden Layer 3 parameters from the older model, and the output layer parameters randomly. You can do so by restoring your new model from the original model, which copies all the parameters from the original model, and then adding extra tf.assign operations to initialize the output layer parameters randomly.
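The effect of restricting `var_list` can be illustrated without TensorFlow. The toy below is a sketch under made-up assumptions: a two-parameter model `y_hat = w2 * (w1 * x)`, hand-derived gradients, and hypothetical function and variable names throughout. Gradients are available for every parameter, but only the parameters named in `var_list` are updated, so `w1` keeps its pretrained value.

```python
def compute_gradients(params, data):
    """Hand-derived mean-squared-error gradients for y_hat = w2 * (w1 * x)."""
    n = len(data)
    w1, w2 = params["w1"], params["w2"]
    g1 = sum(2.0 * (w2 * w1 * x - y) * w2 * x for x, y in data) / n
    g2 = sum(2.0 * (w2 * w1 * x - y) * w1 * x for x, y in data) / n
    return {"w1": g1, "w2": g2}

def train(params, data, var_list, lr=0.05, steps=200):
    """Plain gradient descent that only touches the names in `var_list`."""
    params = dict(params)  # leave the caller's dictionary untouched
    for _ in range(steps):
        grads = compute_gradients(params, data)
        for name in var_list:  # parameters outside var_list stay locked
            params[name] -= lr * grads[name]
    return params

# "Pretrained" values: w1 plays the locked lower layer, w2 the trainable top layer.
pretrained = {"w1": 2.0, "w2": 0.1}
# Targets follow y = 6 * x, so with w1 fixed at 2 the best w2 is 3.
data = [(x, 6.0 * x) for x in (0.5, 1.0, 1.5)]
trained = train(pretrained, data, var_list=["w2"])
```

Because `w1` never changes here, the intermediate values `w1 * x` could also be computed once and reused across steps, which is the caching idea raised in the question.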

matconvnet classification training last layer (softmax)?

I would like to retrain the vgg-imagenet-f network to do classification (rather than direct image comparison, which is what I have done with my own network).
The downloaded network however is a deployment net, and doesn't have a loss layer included. As I've not done classification training before, I'm a bit stumped as to how to design this last layer. I expect it will be something like this:
layer.name = 'loss' ;
layer.type = 'custom' ;
layer.forward = @forward ;
layer.backward = @backward ;
layer.class = [] ;
but I don't know what my @forward and @backward functions should be. Should they be softmax?
Of note, I have an imdb with about 10k images, corresponding labels, and an ID element with unique numbers running from 1 to 10k.
Thanks for any help, or any links to a sample of the way one should construct this layer in matconvnet/matlab!
You could implement your own network and adjust the filters accordingly. Since you want to 'retrain' vgg instead of initializing the weights with random numbers, you can adapt your classification network using the trained filters from the downloaded network. The last layer could be softmaxloss:
http://www.vlfeat.org/matconvnet/mfiles/vl_nnsoftmaxloss/
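For reference, the computation a softmax loss layer performs can be sketched as follows. This is plain Python rather than MATLAB and is not MatConvNet's implementation; the function names and the single-sample, list-based interface are assumptions made for illustration. The forward pass returns the negative log-probability of the true class, and the backward pass returns the gradient of that loss with respect to the input scores.

```python
import math

def softmax_loss_forward(scores, label):
    """Loss for one sample: -log(softmax(scores)[label])."""
    m = max(scores)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[label]

def softmax_loss_backward(scores, label):
    """Gradient of the loss w.r.t. scores: softmax(scores) - one_hot(label)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    grad = [e / total for e in exps]
    grad[label] -= 1.0
    return grad
```

With uniform scores over four classes, for example, the loss equals log(4) and the gradient is 0.25 for every wrong class and -0.75 for the true class.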