I trained a neural network model using Caffe:
/home/f/caffe-master/build/tools/caffe train -solver=/media/my_solver.prototxt
I then scored the learned model on the validation set:
/home/f/caffe-master/build/tools/caffe test -model=/media/my_train_test.prototxt
-weights model.caffemodel -iterations 100
But how can I get the labels predicted by the trained neural network model in Caffe?
I know I can use the Python or Matlab bindings for that purpose, but I am curious whether the predicted labels can be obtained directly through Caffe's command-line interface.
It does not seem to be mentioned in the official Caffe tutorial on interfaces, and looking at caffe's help didn't help:
> f#f-VirtualBox:~/caffe/caffe-master/build/tools$ ./caffe
caffe: command line brew
usage: caffe <command> <args>
commands:
train train or finetune a model
test score a model
device_query show GPU diagnostic information
time benchmark model execution time
Flags from /home/f/caffe-master/tools/caffe.cpp:
-gpu (Run in GPU mode on given device ID.) type: int32 default: -1
-iterations (The number of iterations to run.) type: int32 default: 50
-model (The model definition protocol buffer text file..) type: string
default: ""
-snapshot (Optional; the snapshot solver state to resume training.)
type: string default: ""
-solver (The solver definition protocol buffer text file.) type: string
default: ""
-weights (Optional; the pretrained weights to initialize finetuning. Cannot
be set simultaneously with snapshot.) type: string default: ""
If you don't want to go through Python, you can add an HDF5_OUTPUT layer: it will save the predicted outputs in an HDF5 file.
Otherwise, if you feel like digging into the code, you could print or save bottom_data_vector[k].second at around https://github.com/BVLC/caffe/blob/master/src/caffe/layers/accuracy_layer.cpp#L74
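Whichever way the scores are dumped, the predicted label for each sample is the index of the largest score in its row. A minimal sketch of that last step, assuming the scores have already been read into a list of rows (the values below are made up for illustration, and predicted_labels is a hypothetical helper, not part of Caffe):

```python
def predicted_labels(score_rows):
    """Return the index of the highest score in each row (the argmax)."""
    return [max(range(len(row)), key=row.__getitem__) for row in score_rows]


# Hypothetical scores for three samples over three classes.
scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
]
labels = predicted_labels(scores)
print(labels)
```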
I just created a model that does binary classification and has a dense layer of 1 unit at the end, with a Sigmoid activation. However, I now get an error when I try to convert it to CoreML.
I tried changing the number of units to 2 and the activation to softmax, but it still didn't work.
import coremltools as ct
from PIL import Image
#1. define input size
image_input = ct.ImageType(scale=1/255)
#2. give classifier (use the ct alias consistently; coremltools itself is not bound)
classifier_config = ct.ClassifierConfig(class_labels=[0, 1]) #ERROR here
#3. convert the model
coreml_model = ct.convert("mask_detection_model_surgical_mask.h5",
                          inputs=[image_input], classifier_config=classifier_config)
#4. load and resize an example image
example_image = Image.open("Unknown3.jpg").resize((256, 256))
# Make a prediction using Core ML
# (mymodel is assumed to be the Keras model loaded elsewhere)
out_dict = coreml_model.predict({mymodel.input_names[0]: example_image})
print(out_dict["classLabels"])
# save to disk
#coreml_model.save("FINALLY.mlmodel")
I found the answer to my question.
Use a Softmax activation and 2 Dense units as the final layer, with either loss='binary_crossentropy' or loss='categorical_crossentropy'.
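For anyone wondering whether this changes the classifier: a 2-unit softmax can represent the same probabilities as a 1-unit sigmoid, since softmax over the logits [0, z] gives sigmoid(z) for the second class. A small sketch of that identity in plain Python (the logit value is arbitrary, and the helper names are only for illustration):

```python
import math


def sigmoid(z):
    """Probability of the positive class from a single logit."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """Normalized class probabilities from a list of logits."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


z = 1.5  # arbitrary logit chosen for illustration
p_sigmoid = sigmoid(z)
p_softmax = softmax([0.0, z])
print(p_sigmoid, p_softmax[1])
```

With two outputs there is one probability per entry in class_labels, which is presumably why the converter accepts this form.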
Good luck to hundreds of people who posted a similar question but received no answer.
I'm using Caffe (http://caffe.berkeleyvision.org/) for image classification. I'm using it on Windows and everything seems to be compiling just fine.
To start learning I followed the MNIST tutorial (http://caffe.berkeleyvision.org/gathered/examples/mnist.html). I downloaded the data and ran ..\caffe.exe train --solver=...examples\mnist\lenet_solver.prototxt. It ran 10,000 iterations, printed that the accuracy was 98.5, and generated two files: lenet_iter_10000.solverstate and lenet_iter_10000.caffemodel.
So I thought it would be fun to try to classify my own image; it should be easy, right?
I can find resources such as https://software.intel.com/en-us/articles/training-and-deploying-deep-learning-networks-with-caffe-optimized-for-intel-architecture#Examples that explain how to prepare, train and time my model. But each time a tutorial or article gets to actually feeding a single instance into the CNN, it skips to the next point and says to download some new model. Some resources say to use classifier.bin/.exe, but that tool takes an imagenet_mean.binaryproto, or a similar file for MNIST. I have no idea where to find or how to generate this file.
So in short: once I have trained a CNN using Caffe, how do I input a single image and get the output using the files I already have?
Update: Based on the help, I got the net to recognize an image, but the recognition is not correct even though the network had an accuracy of 99.0%. I used the following Python code to recognize an image:
import caffe
import numpy as np
from PIL import Image

NET_FILE = 'deploy.prototxt'
MODEL_FILE = 'lenet_iter_10000.caffemodel'
net = caffe.Net(NET_FILE, MODEL_FILE, caffe.TEST)
im = Image.open("img4.jpg")
in_ = np.array(im, dtype=np.float32)
net.blobs['data'].data[...] = in_
out = net.forward() # Run the network for the given input image
print(out)
I'm not sure whether I am formatting the image correctly for the MNIST example. The image is a 28x28 grayscale image of a basic 4. Do I have to apply more transformations to the image?
The network (deploy) looks like this (start and end):
input: "data"
input_shape {
dim: 1 # batchsize
dim: 1 # number of colour channels (1 = grayscale)
dim: 28 # height
dim: 28 # width
}
....
layer {
name: "loss"
type: "Softmax"
bottom: "ip2"
top: "loss"
}
If I understand the question correctly, you have a trained model and you want to test the model using your own input images. There are many ways to do this.
One method I commonly use is to run a python script similar to what I have here.
Keep in mind that you have to build the Python bindings with make pycaffe and point the script to them by editing the line sys.path.append('../../../python').
Also change the following lines to your own model filenames:
NET_FILE = 'deploy.prototxt'
MODEL_FILE = 'fcn8s-heavy-pascal.caffemodel'
Then edit the following line: instead of score, use the name of the last layer of your network to get the output.
out = net.blobs['score'].data
You need to create a deploy.prototxt file from your original network.prototxt file. The data layer has to look like this:
input: "data"
input_shape {
dim: 1
dim: [channels]
dim: [height]
dim: [width]
}
where you replace [channels], [height], and [width] with the correct values for your image (Caffe orders blobs as batch x channels x height x width).
You also need to remove any layers that take "label" as a bottom input (this would usually be only your loss layer).
Then you can use this deploy.prototxt file to test your inputs using MATLAB or Python.
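For the MNIST example specifically, the input usually has to be prepared the same way as during training. A sketch of that step, assuming the net was trained with the tutorial's scale: 0.00390625 (1/256) transform; to_caffe_input is a hypothetical helper, and the random array only stands in for a real 28x28 grayscale image:

```python
import numpy as np


def to_caffe_input(gray_image, scale=0.00390625):
    """Turn an H x W grayscale array into a 1 x 1 x H x W float32 blob."""
    arr = np.asarray(gray_image, dtype=np.float32) * scale
    if arr.ndim != 2:
        raise ValueError("expected a 2-D grayscale image")
    # Caffe blobs are ordered batch x channels x height x width.
    return arr[np.newaxis, np.newaxis, :, :]


# Stand-in for np.array(Image.open(...)): random pixel values in 0..255.
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(28, 28))
blob = to_caffe_input(fake_image)
print(blob.shape, blob.dtype)

# Hypothetical softmax output for one image; the prediction is its argmax.
probs = np.array([[0.01, 0.02, 0.02, 0.05, 0.80, 0.02, 0.02, 0.02, 0.02, 0.02]])
predicted_digit = int(probs.argmax(axis=1)[0])
print(predicted_digit)
```

The resulting blob could then be assigned with net.blobs['data'].data[...] = blob. Note also that MNIST digits are light strokes on a dark background, so a dark-on-white drawing may need to be inverted first.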
I'm using the Caffe library for training a convolutional neural network (CNN). However, I'm getting the following error when using the concat layer to combine the output from two convolutional layers before feeding it to an inner_product layer.
F1023 15:14:03.867435 2660 net.cpp:788] Check failed: target_blobs[j]->shape() == source_blob->shape() Cannot share param 0 weights from layer 'fc1'; shape mismatch. Source param shape is 400 800 (320000); target param shape is 400 400 (160000)
As far as I know, I am using the concat layer in exactly the same way as in BVLC_GoogLeNet. The concat layer can be found in my train.prototxt at pastebin under the name combined. The dimensions of my input blob are 256x8x7x24, where the data format in Caffe is batch_size x channels x height x width. I've tried training using both the pycaffe interface and the console, and I get the same error. Below is the code for training using the console.
solver_path = CAFFE_ROOT+'build/tools/caffe train -solver '
model_path = self.run_dir+'models/solver.prototxt'
log_path = self.run_dir+'models/training.log'
p = subprocess.Popen("GLOG_logtostderr=1 {} {} 2> {}".format(solver_path, model_path, log_path), shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
What is the meaning of this error? And how can it be resolved?
Update
As mentioned in the comments, the log contains nothing other than the error. The stack trace for the error is the following:
# 0x7f231886e267 caffe::Net<>::ShareTrainedLayersWith()
# 0x7f231885c338 caffe::Solver<>::Test()
# 0x7f231885cc3e caffe::Solver<>::TestAll()
# 0x7f231885cd79 caffe::Solver<>::Step()
# 0x7f231885d6c5 caffe::Solver<>::Solve()
# 0x408d2b train()
# 0x4066f1 main
It should also be noted that my solver and code work fine for training the exact same CNN with only 1 "path" along the network, i.e. without the CONCAT layer.
I believe the issue you're having is that your train net has been updated to have a concat layer while your test net hasn't.
It would explain the 400x400 vs 400x800 issue you're having considering your concat merges two 400x400 layers. I can't know for certain without being able to see your test net.
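The arithmetic behind that guess can be sketched as follows. The branch shape here is hypothetical (the real one depends on the prototxt); the sketch only assumes that concat joins blobs along the channel axis and that an inner product layer's weights are num_output x (product of the non-batch dimensions of its bottom):

```python
import math


def concat_shape(shapes, axis=1):
    """Shape after concatenating blobs along `axis` (other dims must match)."""
    result = list(shapes[0])
    for shape in shapes[1:]:
        for i, (a, b) in enumerate(zip(result, shape)):
            if i != axis and a != b:
                raise ValueError("blobs differ outside the concat axis")
        result[axis] += shape[axis]
    return tuple(result)


def inner_product_weight_shape(bottom_shape, num_output):
    """Weights are num_output x (product of all non-batch dimensions)."""
    return (num_output, math.prod(bottom_shape[1:]))


branch = (256, 400, 1, 1)  # hypothetical: each branch flattens to 400 features
with_concat = concat_shape([branch, branch])  # net that has the concat layer
without_concat = branch                       # net that lacks it
print(inner_product_weight_shape(with_concat, 400))
print(inner_product_weight_shape(without_concat, 400))
```

The two printed shapes correspond to the "400 800" source and "400 400" target shapes in the error message.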
I am trying to run a Caffe experiment. I am using the following loss layer in my Train.prototxt:
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label"
include {
phase: TRAIN
}
}
I see the following configuration displayed when training starts:
I0923 21:19:13.101313 26423 net.cpp:410] loss <- ip2
I0923 21:19:13.101323 26423 net.cpp:410] loss <- label
I0923 21:19:13.101339 26423 net.cpp:368] loss -> (automatic)
I have not given a top parameter in the loss layer.
What exactly does "(automatic)" in loss -> (automatic) mean here?
Thanks in advance!
Caffe layers, including loss layers, produce Blobs (4-D arrays) as the output of their computations. If you don't set a Blob name through the top parameter, the corresponding Blob will be added to the "output" of the net.
This means that if you call the Net::forward() method, it will return a list of Blobs, i.e., the ones that are not bound as the input of another layer.
When you call the Caffe training tool, it automatically prints such Blobs to the screen. This way you can follow the value of the loss or accuracy during training.
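The idea of "Blobs that are not the input of another layer" can be sketched in a few lines. This is a simplified model of the bookkeeping, not Caffe's actual code; the layer list is hypothetical apart from the ip2, label and loss names from the question:

```python
def net_outputs(layers):
    """Return top blobs that no later layer consumes as a bottom."""
    available = []
    for layer in layers:
        for bottom in layer["bottoms"]:
            if bottom in available:
                available.remove(bottom)  # consumed by this layer
        for top in layer["tops"]:
            if top not in available:
                available.append(top)     # produced, not yet consumed
    return available


layers = [
    {"name": "mnist", "bottoms": [], "tops": ["data", "label"]},
    {"name": "ip2", "bottoms": ["data"], "tops": ["ip2"]},
    # The loss layer declares no top, so one is created automatically.
    {"name": "loss", "bottoms": ["ip2", "label"], "tops": ["(automatic)"]},
]
outputs = net_outputs(layers)
print(outputs)
```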
I have an unlabeled dataset that I want to classify with my newly trained NaiveBayes classifier in Weka. In Classify mode, if I choose the Supplied test set option, Weka accepts the test set only if it is labelled, then evaluates it and reports the accuracy.
But what I want is to train on a train.csv or train.arff file, then give it a new, unseen and unlabelled test.csv or test.arff file and have it assign labels based on the classes in the training file. If I provide an unlabelled file as the test file, Weka gives:
ERROR: Train and Test set not compatible
Sample format of my Train and test files are as below:
Train.csv file:
article story .......hockey class
1 0 ...... 0 politics
0 0 .......1 sports
.
.
.
.
. sports
and Test.csv file:
article story .......hockey class
0 1 ...... 0
1 0 .......1
.
.
.
.
.
So how do I classify an unlabelled dataset in Weka using the NaiveBayes classifier?
It seems you are missing the class label. Weka requires the training and test sets to have exactly the same attributes in the same order. Now there are two cases:
You know the classes of your test set
The performance is calculated by comparing the actual class labels with the predicted ones. You need to supply the class labels in your test set like you did in your training set.
You DON'T know the classes of your test set
To calculate a performance, Weka needs to compare the predicted classes with the actual classes. If you don't have the actual classes, you cannot calculate the performance. You can only predict classes.
You have to add a class label with missing values for your test instances if you just want prediction.
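One way to produce such a test file is to write the class column as ? (ARFF's marker for a missing value) on every row. A sketch, assuming numeric attributes and the class values from the training file; unlabeled_arff and the relation name are hypothetical:

```python
def unlabeled_arff(relation, attributes, class_values, rows):
    """Build ARFF text whose class column is '?' (missing) on every row."""
    lines = ["@relation " + relation, ""]
    lines += ["@attribute {} numeric".format(name) for name in attributes]
    lines.append("@attribute class {" + ",".join(class_values) + "}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(v) for v in row) + ",?")
    return "\n".join(lines) + "\n"


text = unlabeled_arff(
    "news_test",                     # hypothetical relation name
    ["article", "story", "hockey"],  # attribute names from the question
    ["politics", "sports"],          # class values seen in training
    [[0, 1, 0], [1, 0, 1]],
)
print(text)
```

The attribute declarations, including the class values, should match the training file so that Weka treats the two sets as compatible.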
Even if your test set is labelled, Weka will not look at the labels at first. It will take the classifier you built from the training data and apply it to the test set you supply. The classifier predicts the class of each instance, and Weka then keeps track of whether each classification is correct or incorrect. So what you are doing here is exactly what you are trying to achieve. The error says that the training and test sets are not compatible, I believe because you have removed the "class" label from the test set. Don't worry. Keep it as it is, and the accuracy you get from Weka is the actual performance of the classifier. Hope that helps.
You can't leave the class field completely empty; you need at least one instance of each class label in it (as a kind of "clue" for Weka):
article story .......hockey class
0 1 ...... 0 politics
1 0 .......1 sport
1 1 .......1 ?
1 1 .......1 ?
The first two rows give Weka an example of each class. You can then predict as many instances with no class (?) as you like using your trained model.