Shape mismatch when combining layers in Caffe - neural-network

I'm using the Caffe library for training a convolutional neural network (CNN). However, I'm getting the following error when using the concat layer to combine the output from two convolutional layers before feeding it into an inner_product layer.
F1023 15:14:03.867435 2660 net.cpp:788] Check failed: target_blobs[j]->shape() == source_blob->shape() Cannot share param 0 weights from layer 'fc1'; shape mismatch. Source param shape is 400 800 (320000); target param shape is 400 400 (160000)
As far as I know I am using the concat layer in exactly the same way as in BVLC_GoogLeNet. The concat layer can be found in my train.prototxt on pastebin under the name combined. The dimensions of my input blob are 256x8x7x24, where the data format in Caffe is batch_size x channels x height x width. I've tried training both through the pycaffe interface and from the console, and I get the same error either way. Below is the code for training from the console.
import subprocess

solver_path = CAFFE_ROOT + 'build/tools/caffe train -solver '
model_path = self.run_dir + 'models/solver.prototxt'
log_path = self.run_dir + 'models/training.log'
p = subprocess.Popen("GLOG_logtostderr=1 {} {} 2> {}".format(solver_path, model_path, log_path),
                     shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
What is the meaning of this error? And how can it be resolved?
Update
As mentioned in the comments, the log contains nothing other than the error. The stack trace for the error is the following:
# 0x7f231886e267 caffe::Net<>::ShareTrainedLayersWith()
# 0x7f231885c338 caffe::Solver<>::Test()
# 0x7f231885cc3e caffe::Solver<>::TestAll()
# 0x7f231885cd79 caffe::Solver<>::Step()
# 0x7f231885d6c5 caffe::Solver<>::Solve()
# 0x408d2b train()
# 0x4066f1 main
It should also be noted that my solver and code work fine for training the exact same CNN with only one "path" through the network, i.e. without the concat layer.

I believe the issue you're having is that your train net has been updated to include a concat layer while your test net hasn't.
That would explain the 400x400 vs 400x800 mismatch: your concat merges two 400-channel blobs, so fc1 in the train net expects an 800-channel input while fc1 in the test net still expects 400. I can't know for certain without being able to see your test net. A sketch of the layer that must appear in both nets follows.
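For reference, a minimal Concat layer definition like the one below would need to appear identically in both the train and the test prototxt (the bottom blob names here are hypothetical; your layer is named combined):

layer {
  name: "combined"
  type: "Concat"
  bottom: "fc_a"  # hypothetical names for the two 400-channel inputs
  bottom: "fc_b"
  top: "combined"
  concat_param { axis: 1 }  # concatenate along the channel axis
}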

Related

NSLocalizedDescription = "The size of the output layer 'Identity' in the neural network does not match the number of classes in the classifier."

I just created a model that does binary classification and has a dense layer of 1 unit at the end, using a Sigmoid activation. However, I now get this error when I want to convert it to CoreML.
I tried changing the number of units to 2 and the activation to softmax, but it still didn't work.
import coremltools as ct
from PIL import Image

# 1. define the input type
image_input = ct.ImageType(scale=1/255)
# 2. give the classifier config
classifier_config = ct.ClassifierConfig(class_labels=[0, 1]) # ERROR here
# 3. convert the model
coreml_model = ct.convert("mask_detection_model_surgical_mask.h5",
                          inputs=[image_input], classifier_config=classifier_config)
# 4. load and resize an example image
example_image = Image.open("Unknown3.jpg").resize((256, 256))
# make a prediction using Core ML (mymodel is the original Keras model, loaded elsewhere)
out_dict = coreml_model.predict({mymodel.input_names[0]: example_image})
print(out_dict["classLabels"])
# save to disk
# coreml_model.save("FINALLY.mlmodel")
I found the answer to my question.
Use a Softmax activation and 2 Dense units as the final layer, with either loss='binary_crossentropy' or loss='categorical_crossentropy'; a sketch follows below.
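A minimal Keras sketch of that final layer (the Flatten stand-in and input shape are hypothetical, not taken from the question):

from keras.models import Sequential
from keras.layers import Flatten, Dense

# hypothetical stand-in for the real convolutional feature extractor
model = Sequential([
    Flatten(input_shape=(256, 256, 3)),
    Dense(2, activation='softmax'),  # 2 units + softmax instead of Dense(1, sigmoid)
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',  # labels must then be one-hot encoded
              metrics=['accuracy'])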
Good luck to the hundreds of people who posted a similar question but received no answer.

pretrained densenet/vgg16/resnet50 + gp does not train on cifar10 data

I'm trying to train a hybrid model with a GP on top of a pre-trained CNN (DenseNet, VGG or ResNet) on CIFAR10 data, mimicking the ex2 function in the GPflow documentation. But the test accuracy is always between 0.1 and 0.2, which essentially means random guessing (the Wilson+2016 paper shows a hybrid model for CIFAR10 data should get an accuracy of about 0.7). Could anyone give me a hint of what could be wrong?
I've tried the same code with simpler CNN models (2 conv layers or 4 conv layers) and both give reasonable results. I've tried different Keras applications: DenseNet121, VGG16, ResNet50; none of them works. I've also tried freezing the weights in the pre-trained models; still not working.
from keras.applications.densenet import DenseNet121
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

def cnn_dn(output_dim):
    base_model = DenseNet121(weights='imagenet', include_top=False, input_shape=(32, 32, 3))
    bout = base_model.output
    fcl = GlobalAveragePooling2D()(bout)
    # for layer in base_model.layers:
    #     layer.trainable = False
    output = Dense(output_dim, activation='relu')(fcl)
    md = Model(inputs=base_model.input, outputs=output)
    return md
# add gp on top, reference: ex2() function in
# https://nbviewer.jupyter.org/github/GPflow/GPflow/blob/develop/doc/source/notebooks/tailor/gp_nn.ipynb
# needs to slightly change the build-graph part because Keras variable
# sharing is not the same as in TensorFlow
# ......
## build graph
with tf.variable_scope('cnn'):
    md = cnn_dn(gp_dim)
    f_X = tf.cast(md(X), dtype=float_type)
    f_Xtest = tf.cast(md(Xtest), dtype=float_type)
# ......
## predict
res = np.argmax(sess.run(my, feed_dict={Xtest: xts}), 1).reshape(yts.shape)
correct = res == yts.astype(int)
print(np.average(correct.astype(float)))
I finally figured out that the solution is to train for more iterations. In the original code I used just 50 iterations, as in the ex2() function for the MNIST data, and that is not enough for the more complicated networks and the CIFAR10 data. Adjusting some hyper-parameters (e.g. learning rate and activation function) also helps.
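As a rough sketch of what that change looks like (the names loss, X, Y, xtr, ytr and sess mirror the notebook's TF1 graph and are assumptions, not the exact code):

# run thousands of Adam steps instead of the 50 used for MNIST in ex2()
opt_step = tf.train.AdamOptimizer(1e-3).minimize(loss)
sess.run(tf.global_variables_initializer())
for i in range(5000):
    sess.run(opt_step, feed_dict={X: xtr, Y: ytr})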

an error in caffe train

Hi everybody. I'd like to use Caffe to train a 5-class detection task with "SSD: Single Shot MultiBox Detector", so I changed num_classes from 21 to 6. However, I get the following error:
"Check failed: num_priors_ * num_classes_ == bottom[1]->channels() (52392 vs. 183372) Number of priors must match number of confidence predictions."
I can understand this error, and I found that 52392/6 = 183372/21 = 8732, the number of priors. So although I changed num_classes to 6, the number of confidence predictions is still 183372, i.e. it still assumes 21 classes. How can I solve this problem? Thank you very much!
Since SSD depends on the number of labels not only for the classification output, but also for the bounding-box prediction, you would need to change num_output in several other places in the model.
I would strongly suggest you not do that manually, but rather use the Python scripts provided in the 'examples/ssd' folder. For instance, you can change line 277 in 'examples/ssd/ssd_pascal_speed.py' to:
num_classes = 6 # 5 foreground classes + background, instead of 21
And then use the model files this script provides.
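As a sanity check of the numbers in the error message (a small standalone Python calculation, not part of the SSD scripts):

num_priors = 183372 // 21  # 8732 priors in the original 21-class model
print(num_priors * 6)      # 52392 confidence outputs expected for 6 classes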

Caffe - MNIST - How do I use the network on a single image?

I'm using Caffe (http://caffe.berkeleyvision.org/) for image classification. I'm using it on Windows and everything seems to be compiling just fine.
To start learning I followed the MNIST tutorial (http://caffe.berkeleyvision.org/gathered/examples/mnist.html). I downloaded the data and ran ..\caffe.exe train --solver=...examples\mnist\lenet_solver.prototxt. It ran 10,000 iterations, printed that the accuracy was 98.5%, and generated two files: lenet_iter_10000.solverstate and lenet_iter_10000.caffemodel.
So, I thought it would be fun to try to classify my own image; it should be easy, right?
I can find resources such as https://software.intel.com/en-us/articles/training-and-deploying-deep-learning-networks-with-caffe-optimized-for-intel-architecture#Examples telling how to prepare, train and time my model. But each time a tutorial/article comes to actually putting a single instance into the CNN, it skips to the next point and tells you to download some new model. Some resources tell you to use classifier.bin/.exe, but that executable takes an imagenet_mean.binaryproto file or similar for MNIST, and I have no idea where to find or how to generate this file.
So in short: when I have trained a CNN using Caffe, how do I input a single image and get the output using the files I already have?
Update: Based on the help, I got the Net to recognize an image, but the recognition is not correct even though the network had an accuracy of 99.0%. I used the following Python code to recognize an image:
import caffe
import numpy as np
from PIL import Image

NET_FILE = 'deploy.prototxt'
MODEL_FILE = 'lenet_iter_10000.caffemodel'
net = caffe.Net(NET_FILE, MODEL_FILE, caffe.TEST)

im = Image.open("img4.jpg")
in_ = np.array(im, dtype=np.float32)
net.blobs['data'].data[...] = in_
out = net.forward() # run the network on the given input image
print out
I'm not sure if I'm formatting the image correctly for the MNIST example. The image is a 28x28 grayscale image of a basic 4. Do I have to do more transformations on the image?
The network (deploy) looks like this (start and end):
input: "data"
input_shape {
dim: 1 # batchsize
dim: 1 # number of colour channels - rgb
dim: 28 # width
dim: 28 # height
}
....
layer {
  name: "loss"
  type: "Softmax"
  bottom: "ip2"
  top: "loss"
}
If I understand the question correctly, you have a trained model and you want to test it using your own input images. There are many ways to do this.
One method I commonly use is to run a python script similar to what I have here.
Just keep in mind that you have to build the Python interface of Caffe using make pycaffe and point to the folder by editing the line sys.path.append('../../../python'), as in the sketch below.
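For example, the top of the script would look something like this (the relative path is whatever points at your caffe/python folder):

import sys
sys.path.append('../../../python') # path to the caffe/python folder
import caffe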
Also edit the following lines to your model filenames.
NET_FILE = 'deploy.prototxt'
MODEL_FILE = 'fcn8s-heavy-pascal.caffemodel'
Then edit the following line: instead of score, you should use the name of the last layer of your network to get the output.
out = net.blobs['score'].data
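For the MNIST deploy net shown in the question, whose final Softmax layer's top blob is named 'loss', that sketch would become:

out = net.blobs['loss'].data
print out.argmax() # index of the most probable digit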
You need to create a deploy.prototxt file from your original network.prototxt file. The data layer has to look like this:
input: "data"
input_shape {
dim: 1
dim: [channles]
dim: [width]
dim: [height]
}
where you replace [channels], [width], and [height] with the correct values of your image.
You also need to remove any layers which take the "label" blob as a bottom input (this would usually be only your loss layer).
Then you can use this deploy.prototxt file to test your inputs using MATLAB or Python, as sketched below.
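For instance, for the MNIST LeNet network, the training net's final SoftmaxWithLoss layer (which takes "ip2" and "label" as bottoms) would be replaced in deploy.prototxt by a plain Softmax layer, e.g.:

layer {
  name: "prob"
  type: "Softmax"
  bottom: "ip2"
  top: "prob"
}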

matconvnet classification training last layer (softmax)?

I would like to retrain the vgg-imagenet-f network to do classification (rather than direct image comparison, which is what I have done with my own network).
The downloaded network, however, is a deployment net and doesn't have a loss layer included. As I've not done classification training before, I'm a bit stumped as to how to design this last layer. I expect it will be something like this:
layer.name = 'loss' ;
layer.type = 'custom' ;
layer.forward = @forward ;
layer.backward = @backward ;
layer.class = [] ;
but I don't know what my @forward and @backward functions should be. Should they be softmax?
Of note, I have an imdb with about 10k images, corresponding labels, and an ID element with unique numbers running from 1 to 10k.
Thanks for any help, or any links to a sample of the way one should construct this layer in matconvnet/matlab!
You could implement your own network, adjusting the filters accordingly. Since you want to 'retrain' VGG instead of initializing the weights with random numbers, you can adapt your classification network using the trained filters from the downloaded network. The last layer could be a softmaxloss layer:
http://www.vlfeat.org/matconvnet/mfiles/vl_nnsoftmaxloss/
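A rough MatConvNet sketch of that adaptation (the .mat file name and class count are assumptions, and older releases store the parameters in filters/biases fields rather than weights):

% load the deployment net and adapt it for fine-tuning
net = load('imagenet-vgg-f.mat') ;
nClasses = 10 ;  % your number of classes
% shrink the last fully-connected layer from 1000 ImageNet classes to nClasses
net.layers{end-1}.weights{1} = 0.01 * randn(1, 1, 4096, nClasses, 'single') ;
net.layers{end-1}.weights{2} = zeros(1, nClasses, 'single') ;
% replace the final 'softmax' layer with a softmaxloss layer for training
net.layers{end} = struct('type', 'softmaxloss') ;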