I started with Caffe, and the MNIST example ran well.
My training data and labels are saved in data.mat: 300 training samples with 30 features each, and labels in {-1, +1}.
However, I don't quite understand how to use Caffe with my own dataset.
Is there a step-by-step tutorial that can teach me?
Many thanks! Any advice would be appreciated!
I think the most straightforward way to transfer data from Matlab to Caffe is via an HDF5 file.
First, save your data in Matlab in an HDF5 file using hdf5write. I assume your training data is stored in a variable named X of size 300-by-30 and the labels are stored in y, a 300-by-1 vector:
hdf5write('my_data.h5', '/X', ...
  single( permute(reshape(X,[300, 30, 1, 1]),[4:-1:1]) ) );
hdf5write('my_data.h5', '/label', ...
  single( permute(reshape(y,[300, 1, 1, 1]),[4:-1:1]) ), ...
  'WriteMode', 'append' );
Note that the data is saved as a 4D array: the first dimension is the number of samples, the second is the feature dimension, and the last two are 1 (representing no spatial dimensions). Also note that the names given to the datasets in the HDF5 file are "X" and "label"; these names should be used as the "top" blobs of the input data layer.
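Why permute? Matlab stores arrays in column-major order, whereas Caffe reads the HDF5 data in row-major order, so the dimension order has to be reversed. Please see this answer for a fuller explanation.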
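The effect of the permute can be checked without Matlab. The NumPy sketch below is only an illustration, assuming that Matlab lays arrays out column-major and that the file is read back row-major with the dimension order reversed. The helper name matlab_memory_layout and the toy sizes (4 samples, 3 features) are made up for the example.

```python
import numpy as np

def matlab_memory_layout(X):
    """Simulate the memory order of permute(reshape(X,[N,D,1,1]),[4:-1:1]) in Matlab."""
    n, d = X.shape
    permuted = X.reshape(n, d, 1, 1).transpose(3, 2, 1, 0)  # shape (1, 1, d, n)
    return permuted.flatten(order="F")  # column-major, the way Matlab stores arrays

X = np.arange(12, dtype=np.float32).reshape(4, 3)  # toy data: 4 samples, 3 features

# With the permute: a row-major reader sees N x C x H x W = 4 x 3 x 1 x 1.
as_read_by_caffe = matlab_memory_layout(X).reshape(4, 3, 1, 1)

# Without the permute: the same reader sees the dimensions reversed (1 x 1 x 3 x 4).
without_permute = X.flatten(order="F").reshape(1, 1, 3, 4)
```

In this toy case as_read_by_caffe holds one sample per row, while without_permute holds the transpose of X, which is why the permute is needed.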
You also need to prepare a text file listing the names of all HDF5 files you are using (in your case, only my_data.h5). The file /path/to/list/file.txt should have a single line containing the path to my_data.h5.
Now you can add an input data layer to your train_val.prototxt:
layer {
  type: "HDF5Data"
  name: "data"
  top: "X" # note: same name as in HDF5
  top: "label" #
  hdf5_data_param {
    source: "/path/to/list/file.txt"
    batch_size: 20
  }
  include { phase: TRAIN }
}
For more information regarding the HDF5 input layer, see this answer.
Good evening,
I'm new to coding CNNs.
I have the ShanghaiTech crowd counting dataset, which contains (besides the images) .mat files that I believe hold the ground truth (the counts) for the images.
I tried printing the content of one .mat file in Python; here is what I get:
{'image_info': array([[array([[(array([[ 855.32345978, 590.49587357],
[ 965.5908524 , 472.79472415],
[ 937.09478464, 400.93507502],
[ 42.5852337 , 359.87860699],
[1017.48233659, 8.99748811],
[1017.48233659, 23.31916643]]), array([[920]], dtype=uint16))]],
dtype=[('location', 'O'), ('number', 'O')])]], dtype=object), '__version__': '1.0', '__header__': 'MATLAB 5.0 MAT-file, Platform: PCWIN64, Created on: Fri Nov 18 20:06:05 2016', '__globals__': []}
Each .mat file corresponds to one image.
I know that at some point in a CNN we need to calculate the error between the network's result and the ground truth, but I don't understand the structure and content of these .mat files.
Can someone explain what is in these files and how that content is used in crowd estimation?
So I got the answer.
The data in the .mat file presented in the question contains (at least as far as we are interested) two arrays.
The first one:
array([[ 855.32345978, 590.49587357],
[ 965.5908524 , 472.79472415],
[ 937.09478464, 400.93507502],
[ 42.5852337 , 359.87860699],
[1017.48233659, 8.99748811],
[1017.48233659, 23.31916643]])
is an N-by-2 array: the 2 corresponds to the X and Y coordinates of each target object, and N is the number of target objects (the ground truth).
The second array contains the ground-truth count itself (920 in this example).
The data was extracted from the .mat file with scipy.io.loadmat, which returns a dictionary. Getting to the ground-truth count inside it was quite tricky, but it went like this:
import os
import scipy.io
matContent = scipy.io.loadmat(os.path.join(gtPath, gtList[1]))  # returns a dictionary
gt = matContent['image_info'][0][0][0][0][1]  # the ground-truth number
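The chain of [0] indexes above can be wrapped in a small helper. This is a minimal sketch, assuming every file has the same nesting as the dump in the question (a 1x1 object array holding a 1x1 record with 'location' and 'number' fields). The name parse_ground_truth is hypothetical, and the toy dictionary only imitates what scipy.io.loadmat returns so the example runs without the dataset.

```python
import numpy as np

def parse_ground_truth(mat_content):
    """Return (locations, count) from a dict shaped like the one printed in the question."""
    record = mat_content['image_info'][0, 0][0, 0]  # unwrap the two 1x1 wrappers
    locations = np.asarray(record['location'], dtype=np.float64)  # N x 2 coordinates
    count = int(record['number'][0, 0])  # scalar count
    return locations, count

# Toy stand-in for the loadmat output (made-up values, same nesting as the dump).
points = np.array([[855.3, 590.5], [965.6, 472.8], [937.1, 400.9]])
inner = np.empty((1, 1), dtype=[('location', 'O'), ('number', 'O')])
inner['location'][0, 0] = points
inner['number'][0, 0] = np.array([[len(points)]], dtype=np.uint16)
image_info = np.empty((1, 1), dtype=object)
image_info[0, 0] = inner
fake_mat = {'image_info': image_info}

locations, count = parse_ground_truth(fake_mat)
```

With a real file, the dictionary returned by scipy.io.loadmat would be passed in place of fake_mat.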
I'm using Caffe (http://caffe.berkeleyvision.org/) for image classification. I'm using it on Windows and everything seems to compile just fine.
To start learning I followed the MNIST tutorial (http://caffe.berkeleyvision.org/gathered/examples/mnist.html). I downloaded the data and ran ..\caffe.exe train --solver=...examples\mnist\lenet_solver.prototxt. It ran 10,000 iterations, printed that the accuracy was 98.5, and generated two files: lenet_iter_10000.solverstate and lenet_iter_10000.caffemodel.
So I thought it would be fun to try to classify my own image. It should be easy, right?
I can find resources such as https://software.intel.com/en-us/articles/training-and-deploying-deep-learning-networks-with-caffe-optimized-for-intel-architecture#Examples that explain how to prepare, train and time my model. But each time a tutorial or article gets to actually feeding a single instance into the CNN, it skips to the next point and says to download some new model. Some resources say to use classifier.bin/.exe, but this program takes an imagenet_mean.binaryproto file, or a similar one for MNIST. I have no idea where to find or how to generate this file.
So in short: once I have trained a CNN using Caffe, how do I input a single image and get the output using the files I already have?
Update: Based on the help, I got the net to process an image, but the recognition is not correct even though the network had an accuracy of 99.0%. I used the following Python code to recognize an image:
import caffe
import numpy as np
from PIL import Image

NET_FILE = 'deploy.prototxt'
MODEL_FILE = 'lenet_iter_10000.caffemodel'
net = caffe.Net(NET_FILE, MODEL_FILE, caffe.TEST)
im = Image.open("img4.jpg")
in_ = np.array(im, dtype=np.float32)
net.blobs['data'].data[...] = in_
out = net.forward()  # Run the network for the given input image
print(out)
I'm not sure whether I format the image correctly for the MNIST example. The image is a 28x28 grayscale image of a basic 4. Do I have to apply more transformations to the image?
The network (deploy) looks like this (start and end):
input: "data"
input_shape {
  dim: 1 # batch size
  dim: 1 # number of colour channels
  dim: 28 # height
  dim: 28 # width
}
layer {
  name: "loss"
  type: "Softmax"
  bottom: "ip2"
  top: "loss"
}
If I understand the question correctly, you have a trained model and you want to test it using your own input images. There are many ways to do this.
One method I commonly use is to run a Python script similar to what I have here.
Just keep in mind that you have to build the Python interface of Caffe using make pycaffe and point to its folder by editing the line sys.path.append('../../../python').
Also change the following lines to your model filenames:
NET_FILE = 'deploy.prototxt'
MODEL_FILE = 'fcn8s-heavy-pascal.caffemodel'
Also edit the following line: instead of score, use the name of the last layer of your network to get the output.
out = net.blobs['score'].data
You need to create a deploy.prototxt file from your original network.prototxt file. The data layer has to look like this:
input: "data"
input_shape {
  dim: 1
  dim: [channels]
  dim: [height]
  dim: [width]
}
where you replace [channels], [height], and [width] with the correct values for your image (Caffe blobs are ordered batch, channels, height, width).
You also need to remove any layers that take "label" as a bottom input (this would usually be only your loss layer).
Then you can use this deploy.prototxt file to test your inputs using MATLAB or Python.
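For the MNIST case in the question's update, the input probably also needs the same preprocessing that was used during training. The sketch below is a guess under stated assumptions: that the net was trained with the scale: 0.00390625 (1/256) transform from the stock lenet_train_test.prototxt, and that the digits are bright strokes on a dark background, as in MNIST. The helper names prepare_mnist_input and predicted_class are hypothetical, and the pycaffe calls appear only in comments so the example runs without Caffe.

```python
import numpy as np

def prepare_mnist_input(gray_image, scale=1.0 / 256, invert=False):
    """Turn a 28x28 grayscale image (values 0..255) into a 1x1x28x28 float32 blob."""
    img = np.asarray(gray_image, dtype=np.float32)
    if img.shape != (28, 28):
        raise ValueError("expected a 28x28 single-channel image, got %r" % (img.shape,))
    if invert:
        img = 255.0 - img  # use this for dark digits drawn on a bright background
    return (img * scale).reshape(1, 1, 28, 28)  # batch x channels x height x width

def predicted_class(probabilities):
    """Index of the largest score in the Softmax output for a single image."""
    return int(np.argmax(np.asarray(probabilities).ravel()))

# Toy usage with synthetic data. With pycaffe one would instead do (not run here):
#   net.blobs['data'].data[...] = prepare_mnist_input(np.array(Image.open("img4.jpg")))
#   digit = predicted_class(net.forward()['loss'])
blob = prepare_mnist_input(np.full((28, 28), 255, dtype=np.uint8))
digit = predicted_class([0.01, 0.02, 0.9, 0.07])
```

If the predictions are still wrong, comparing the value range and polarity of the input with a sample from the training set is a quick sanity check.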
I have had a hard time using Caffe with HDF5 for image classification and regression tasks. For some reason, training on HDF5 data always fails right at the beginning: the train and test loss very quickly drop to close to zero. I tried all the usual tricks, such as reducing the learning rate and adding ReLU and dropout, but nothing worked, so I started to suspect that the HDF5 data I am feeding to Caffe is wrong.
So currently I am working with a public dataset (the Oxford 102 category flower dataset, which also has public code). First I tried the ImageData and LMDB layers for classification, and they both worked very well. Finally I used the HDF5 data layer for fine-tuning; the training prototxt is unchanged except for the data layer, which uses HDF5 instead. Again, at the start of learning the loss drops from 5 to 0.14 at iteration 60 and to 0.00146 at iteration 100, which seems to confirm that the HDF5 data is incorrect.
I have two image-and-label-to-HDF5 snippets on GitHub. Both seem to generate the HDF5 dataset, but for some reason these datasets do not seem to work with Caffe.
I wonder whether anything is wrong with this data, or what would make this example run with HDF5. If you have any HDF5 examples for classification or regression, they would help me a lot.
One snippet is shown below:
from __future__ import print_function
import caffe
import h5py
import numpy as np

def generateHDF5FromText2(label_num):
    print('\nplease wait...')
    HDF5_FILE = ['hdf5_train.h5', 'hdf5_test1.h5']
    # files listing the training and testing image paths and labels
    LIST_FILE = ['train.txt', 'test.txt']
    for kk, list_file in enumerate(LIST_FILE):
        # read train.txt or test.txt and collect all image paths and labels
        path_list = []
        label_list = []
        with open(list_file, buffering=1) as hosts_file:
            for line in hosts_file:
                line = line.rstrip()
                array = line.split(' ')
                lab = int(array[1])
                path_list.append(array[0])
                label_list.append(lab)
        print(len(path_list), len(label_list))
        # init the temp data and labels storage for HDF5
        datas = np.zeros((len(path_list), 3, 227, 227), dtype='f4')
        labels = np.zeros((len(path_list), 1), dtype="f4")
        for ii, _file in enumerate(path_list):
            # feed the image and label data to the temp storage
            img = caffe.io.load_image(_file)
            img = caffe.io.resize(img, (227, 227, 3))  # resize to fixed size
            img = np.transpose(img, (2, 0, 1))  # H x W x C -> C x H x W
            datas[ii] = img
            labels[ii] = int(label_list[ii])
        # store the temp data and labels in the HDF5 file
        with h5py.File("/data2/" + HDF5_FILE[kk], 'w') as f:
            f['data'] = datas
            f['label'] = labels
One input transformation that seems to happen in the original net and is missing from your HDF5 creation is mean subtraction.
You should obtain the mean file (it looks like "imagenet_mean.binaryproto" in your example), read it into Python, and subtract it from each image.
BTW, the mean file can give you a clue as to the scale of the input image (whether pixel values should be in the [0..1] range or [0..255]).
You might find caffe.io useful for converting the binaryproto to a numpy array.
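The subtraction itself could look like the following NumPy sketch. It assumes the mean has already been loaded into an array (for example with pycaffe's caffe.io.blobproto_to_array, which is not called here) and that the images are already on the same scale and channel order as the mean. The helper name subtract_mean and all values are made up for illustration.

```python
import numpy as np

def subtract_mean(images, mean):
    """Subtract a per-pixel (C x H x W) or per-channel (C,) mean from N x C x H x W images."""
    images = np.asarray(images, dtype=np.float32)
    mean = np.asarray(mean, dtype=np.float32)
    if mean.ndim == 1:
        mean = mean[:, np.newaxis, np.newaxis]  # broadcast one value per channel
    return images - mean

# Hypothetical toy batch: 2 images, 3 channels, 4x4 pixels, values on a 0..255 scale.
datas = np.full((2, 3, 4, 4), 128.0, dtype=np.float32)
channel_mean = np.array([104.0, 117.0, 123.0], dtype=np.float32)
centered = subtract_mean(datas, channel_mean)
```

As far as I know, caffe.io.load_image returns RGB values in [0, 1], so if the mean file is on a 0..255 BGR scale the images would need rescaling and channel swapping before this step.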
I am trying to run a Caffe experiment. I am using the following loss layer in my Train.prototxt:
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  include {
    phase: TRAIN
  }
}
I see the following configuration displayed when training starts:
I0923 21:19:13.101313 26423 net.cpp:410] loss <- ip2
I0923 21:19:13.101323 26423 net.cpp:410] loss <- label
I0923 21:19:13.101339 26423 net.cpp:368] loss -> (automatic)
I have not given a top parameter in the loss layer.
What exactly does (automatic) in loss -> (automatic) mean here?
Thanks in advance!
Caffe layers, including loss layers, produce Blobs (4-D arrays) as the output of their computations. If you don't set a Blob name through the top parameter, Caffe creates the top Blob automatically (hence "(automatic)" in the log) and adds it to the "output" of the net.
This means that, if you call the Net::forward() method, it will return a list of Blobs, namely the ones that are not bound as the input of another layer.
When you call the Caffe training tool, it automatically prints such Blobs to the screen. This way you can follow the value of the loss or accuracy during training.
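The idea of "Blobs that are not bound as the input of another layer" can be sketched in a few lines of plain Python. This is a simplified model and not Caffe's actual code: each layer is a hypothetical (name, bottoms, tops) tuple, and the function net_outputs only tracks which blobs are produced and never consumed.

```python
def net_outputs(layers):
    """Return the blobs that remain unconsumed after wiring up all layers in order.

    `layers` is a list of (name, bottoms, tops) tuples, a simplified stand-in
    for the layer definitions in a prototxt.
    """
    available = []
    for _name, bottoms, tops in layers:
        for blob in bottoms:
            if blob in available:
                available.remove(blob)  # consumed as an input of this layer
        for blob in tops:
            if blob not in available:
                available.append(blob)  # produced, not yet consumed
    return available

toy_net = [
    ("mnist", [], ["data", "label"]),
    ("ip2", ["data"], ["ip2"]),
    ("loss", ["ip2", "label"], ["(automatic)"]),  # no explicit top was given
]
outputs = net_outputs(toy_net)
```

In this toy net the only leftover blob is the automatically created top of the loss layer, which matches the value the training tool prints.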
I have done PCA on 21 images of the same person in different conditions. The last step of the PCA is the projection of the original data: signals=PC'*data. The size of signals is 21*21, and now I want to write it to a CSV file with a label of +1. Please guide me on how to do this in Matlab. I tried csvwrite, but it does not write the labels, only the data.
for i=1:length(signals)
    for j=1:1
        label(i,j) = 1;  % +1 label for every row
    end
end
csvwrite('f1.csv',[label signals]);