tensorflow, Transfert learning by removing and adding layer - neural-network

I try to use a model already trained, remove the output layer and replace it by a new not train one.
I was using this code one years ago but now it don't work anymore!!?
model2= Model(inputs=source_model.input, outputs=source_model.layers[-2].output) # -2
source_model.summary()
model2.summary()
new_mod = Sequential()
new_mod.add(model2)
# add/replace layer
new_mod.add(Dense(ny_train.shape[1], activation='linear',
kernel_initializer=initializers.he_uniform(),
name='new_final_output'))
I get this error at line "new_mod.add(model2)":
AssertionError: Could not compute output KerasTensor(type_spec=TensorSpec(shape=(None,
50), dtype=tf.float32, name=None), name='dropout_8/Identity:0', description="created by
layer 'dropout_8'")
Extra-question:
It is possible to do that in a functionnal way?

Related

NSLocalizedDescription = "The size of the output layer 'Identity' in the neural network does not match the number of classes in the classifier."

I just created a model that does a binary classification and has a dense layer of 1 unit at the end. I used Sigmoid activation. However, I get this error now when I wanna convert it to CoreML.
I tried to change the number of units to 2 and activation to softmax but still didn't work.
import coremltools as ct
#1. define input size
image_input = ct.ImageType(scale=1/255)
#2. give classifier
classifier_config = coremltools.ClassifierConfig(class_labels=[0, 1]) #ERROR here
#3. convert the model
coreml_model = coremltools.convert("mask_detection_model_surgical_mask.h5",
inputs=[image_input], classifier_config=classifier_config)
#4. load and resize an example image
example_image = Image.open("Unknown3.jpg").resize((256, 256))
# Make a prediction using Core ML
out_dict = coreml_model.predict({mymodel.input_names[0]: example_image})
print(out_dict["classLabels"])
# save to disk
#coreml_model.save("FINALLY.mlmodel")
I found the answer to my question.
Use Softmax activation and 2 Dense units as the final layer with either loss='binary_crossentropy' or `loss='categorical_crossentropy'
Good luck to hundreds of people who posted a similar question but received no answer.

implementing a MLP model in keras for timeseries prediction but the model doesn't train well

I'm trying to come up with a MLP model for timeseries prediction following this blog post. I have 138 timeseries with a lookback_window=28 (splitted as 50127 timeseries for traing and 24255 timeseries for validation). I need to predict the next value (timesteps=28, n_features=1). I started from a 3 layer network but it didn't train well. I tried to make the network deeper by adding more layers/more hunits, but it doesn't improve. In the picture, you can see the result of prediction of the following model Here is my model code:
inp = Input(batch_shape=(batch_size, lookback_window))
first_layer = Dense(1000, input_dim=28, activation='relu')(inp)
snd_layer = Dense(500)(first_layer)
thirs_layer = Dense(250)(snd_layer)
tmp = Dense(100)(thirs_layer)
tmp2 = Dense(50)(tmp)
tmp3 = Dense(25)(tmp2)
out = Dense(1)(tmp3)
model = Model(inp, out)
model.compile(loss='mean_squared_error', optimizer='adam')
history = model.fit(train_data, train_y,
epochs=1000,
batch_size=539,
validation_data=(validation_data, validation_y),
verbose=1,
shuffle=False)
What am I missing? How can I improve it?
The main thing I noticed is that you are not using non-linearities in your layers. I would use relus for the hidden layers and linear layer for the final layer in case you want values larger than 1 / -1 to be possible. If you do not want them to be possible use tanh. By increasing the data you make the problem harder and therefore your mostly linear model is underfitting severely.
I managed to get better results by the following changes:
Using RMSprop instead of Adam with lr=0.001, and as #TommasoPasini mentioned added them to all Dense layers (expect the last one). It improves the results a lot!
epochs= 3000 instead of 1000.
But now I think it is overfitting. Here are the plots of the results and the validation and train loss:

concatenate error during model import

I have this problem: I concatenate correctly some outputs from different conv layers. I train the model. Everything is fine up to the training. I save json model and weights.
When I load the model from another script using json API, I have this error:
ValueError: Concatenate layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 12, 12, 512), (None, None, None, 512)]
This happens only when I import the saved model.
More details: I have an output from a Conv2 (12, 12, 512) and from another one that matches. The proof is the fact that I can train without any problem (and actually I plot the summary and all the output_shape are correctly defined... no 'None' in shapes but the batch size).
I also tried to add in keras.json image_data_format. But I still have this issue.
I use keras '2.1.2' and tensorflow '1.2.0'. I use the functional API of keras.
Any suggestions?

Caffe - MNSIT - How do I use the network on a single image?

I'm using Caffe (http://caffe.berkeleyvision.org/) for image classification. I'm using it on Windows and everything seems to be compiling just fine.
To start learning I followed the MNIST tutorial (http://caffe.berkeleyvision.org/gathered/examples/mnist.html). I downloaded the data and ran ..\caffe.exe train --solver=...examples\mnist\lenet_solver.prototxt. It ran 10.000 iterations, printed that the accuracy was 98.5, and generated two files: lenet_iter_10000.solverstate, and lenet_iter_10000.caffemodel.
So, I though it would be funny to try to classify my own image, it should be easy right?.
I can find resources such as: https://software.intel.com/en-us/articles/training-and-deploying-deep-learning-networks-with-caffe-optimized-for-intel-architecture#Examples telling how to prepare, train and time my model. But each time a tutorial/article comes to actually putting a single instance into the CNN, they skip to the next point and tell to download some new model. Some resources tell to use the classifier.bin/.exe, but this file takes a imagenet_mean.binaryproto or similar for mnist. I have no idea where to find or generated this file.
So in short: When I have trained a CNN using Caffe, how to I input a single image and get the output using the files I already have?
Update: Based on the help, I got the Net to recognize an image but the recognition is not correct even if the network had an accuracy of 99.0%. I used the following python code to recognice an image:
NET_FILE = 'deploy.prototxt'
MODEL_FILE = 'lenet_iter_10000.caffemodel'
net = caffe.Net(NET_FILE, MODEL_FILE, caffe.TEST)
im = Image.open("img4.jpg")
in_ = np.array(im, dtype=np.float32)
net.blobs['data'].data[...] = in_
out = net.forward() # Run the network for the given input image
print out;
I'm not sure if I format the image correctly for the MNIST example. The image is a 28x28 grayscale image with a basic 4. Do I have to do more transformations on the image?
The network (deploy) looks like this (start and end):
input: "data"
input_shape {
dim: 1 # batchsize
dim: 1 # number of colour channels - rgb
dim: 28 # width
dim: 28 # height
}
....
layer {
name: "loss"
type: "Softmax"
bottom: "ip2"
top: "loss"
}
If I understand the question correctly, you have a trained model and you want to test the model using your own input images. There are many ways to do this.
One method I commonly use is to run a python script similar to what I have here.
Just keep in mind that you have to build python in caffe using make pycaffe and point to the folder by editing the line sys.path.append('../../../python')
Also edit the following lines to your model filenames.
NET_FILE = 'deploy.prototxt'
MODEL_FILE = 'fcn8s-heavy-pascal.caffemodel'
Edit the following line. Instead of score you should use the last layer of your network to get the output.
out = net.blobs['score'].data
You need to create a deploy.prototxt file from your original network.prototxt file. The data layer has to look like this:
input: "data"
input_shape {
dim: 1
dim: [channles]
dim: [width]
dim: [height]
}
where you replace [channels], [width], and [height] with the correct values of your image.
You also need to remove any layers which get the "label" as its bottom input (this would usually be only your loss layer).
Then you can use this deploy.prototxt file to test your inputs using MATLAB or PYTHON.

Shape mismatch when using combining layers in Caffe

I'm using the Caffe library for training a convolutional neural network (CNN). However, I'm getting the following error when using the concat layer to combine the output from two convolutional layers before applying it to a inner_product layer.
F1023 15:14:03.867435 2660 net.cpp:788] Check failed: target_blobs[j]->shape() == source_blob->shape() Cannot share param 0 weights from layer 'fc1'; shape mismatch. Source param shape is 400 800 (320000); target param shape is 400 400 (160000)
As far as I know I am using the concat layer in the exact same way as in BVLC_GoogLeNet. The concat layer can be found in my train.prototxt at pastebin under the name combined. The dimensions of my input blob is 256x8x7x24, where the data format in Caffe is batch_size x channels x height x width. I've tried training both using the pycaffe interface and the console. I get the same error. Below is code for training using the console.
solver_path = CAFFE_ROOT+'build/tools/caffe train -solver '
model_path = self.run_dir+'models/solver.prototxt'
log_path = self.run_dir+'models/training.log'
p = subprocess.Popen("GLOG_logtostderr=1 {} {} 2> {}".format(solver_path, model_path, log_path), shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
What is the meaning of this error? And how can it be resolved?
Update
As mentioned in the comments the log contains nothing else than the error. The stack trace for the error is the following:
# 0x7f231886e267 caffe::Net<>::ShareTrainedLayersWith()
# 0x7f231885c338 caffe::Solver<>::Test()
# 0x7f231885cc3e caffe::Solver<>::TestAll()
# 0x7f231885cd79 caffe::Solver<>::Step()
# 0x7f231885d6c5 caffe::Solver<>::Solve()
# 0x408d2b train()
# 0x4066f1 main
It should also be noted that my solver and code works fine for training the exact same CNN with only 1 "path" along the network, i.e. without the CONCAT layer.
I believe the issue you're having is that your train net has been updated to have a concat layer while your test net hasn't.
It would explain the 400x400 vs 400x800 issue you're having considering your concat merges two 400x400 layers. I can't know for certain without being able to see your test net.