How to feed image data to Caffe via HDF5, or existing examples? - neural-network

I've had a hard time getting Caffe to work with HDF5 for image classification and regression tasks. For some reason, training on HDF5 always fails right at the beginning: the test and train loss very quickly drop to close to zero. After trying all the usual tricks, such as reducing the learning rate and adding ReLU and dropout, nothing helped, so I started to suspect that the HDF5 data I am feeding to Caffe is wrong.
So I am currently working with a well-known dataset (the Oxford 102 category flower dataset, which also has public code available). I first tried the ImageData and LMDB layers for classification, and they both worked very well. Finally I used an HDF5 data layer for fine-tuning; the training prototxt is unchanged except for the data layer, which uses HDF5 instead. Again, at the start of training the loss drops from 5 to 0.14 at iteration 60 and to 0.00146 at iteration 100, which seems to confirm that the HDF5 data is incorrect.
I have two image-and-label-to-HDF5 snippets from GitHub. Both of them seem to generate the HDF5 dataset, but for some reason these datasets do not work with Caffe.
I wonder whether anything is wrong with this data, or what it would take to make this example run with HDF5. If you have any HDF5 examples for classification or regression, they would be very helpful to me.
One snippet is shown below:
import numpy as np
import h5py
import caffe

def generateHDF5FromText2(label_num):
    print '\nplease wait...'

    HDF5_FILE = ['hdf5_train.h5', 'hdf5_test1.h5']
    # text files listing the training and testing image paths and labels
    LIST_FILE = ['train.txt', 'test.txt']

    for kk, list_file in enumerate(LIST_FILE):
        # read train.txt or test.txt and collect all image paths and labels
        path_list = []
        label_list = []
        with open(list_file, buffering=1) as hosts_file:
            for line in hosts_file:
                line = line.rstrip()
                array = line.split(' ')
                lab = int(array[1])
                label_list.append(lab)
                path_list.append(array[0])
        print len(path_list), len(label_list)

        # allocate temporary arrays for the HDF5 data and labels
        datas = np.zeros((len(path_list), 3, 227, 227), dtype='f4')
        labels = np.zeros((len(path_list), 1), dtype='f4')

        for ii, _file in enumerate(path_list):
            # load each image, resize it and convert to C x H x W
            img = caffe.io.load_image(_file)
            img = caffe.io.resize(img, (227, 227, 3))  # resize to fixed size
            img = np.transpose(img, (2, 0, 1))
            datas[ii] = img
            labels[ii] = int(label_list[ii])

        # write the data and labels into the HDF5 file
        with h5py.File("/data2/" + HDF5_FILE[kk], 'w') as f:
            f['data'] = datas
            f['label'] = labels

One input transformation that happens in the original net, and is missing from your HDF5 creation, is mean subtraction.
You should obtain the mean file (it looks like "imagenet_mean.binaryproto" in your example), read it into Python and subtract it from each image.
By the way, the mean file can give you a clue about the scale of the input image (whether pixel values should be in the [0..1] range or in [0..255]).
You might find caffe.io useful for converting the binaryproto to a numpy array.
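For example, a minimal sketch of reading the mean file and using it during HDF5 creation might look like the code below. The file names are taken from your description, but the 255 rescaling is an assumption: caffe.io.load_image returns floats in [0..1], so if your mean was computed on [0..255] images you need to bring the data to the same scale before subtracting (and crop or resize the mean if its spatial size differs from 227x227).
import numpy as np
import caffe
from caffe.proto import caffe_pb2

# Read the binaryproto mean file into a numpy array of shape (channels, height, width).
blob = caffe_pb2.BlobProto()
with open('imagenet_mean.binaryproto', 'rb') as f:
    blob.ParseFromString(f.read())
mean = caffe.io.blobproto_to_array(blob)[0]  # drop the leading num dimension

# Inside the HDF5 creation loop, after loading and resizing each image:
img = caffe.io.load_image(_file)              # H x W x C, float values in [0..1]
img = caffe.io.resize(img, (227, 227, 3))
img = np.transpose(img, (2, 0, 1)) * 255.0    # back to [0..255] if the mean uses that scale (assumption)
img -= mean[:, :227, :227]                    # align the mean to the input size (assumption: simple crop)
datas[ii] = img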

Related

How to divide a drone image dataset into train, test and validation parts for Faster R-CNN in Matlab 2018b

I have 297 grayscale images and I would like to divide them into 3 parts (train, test and validation).
Of course, I tried some sample code, for example the following from MathWorks (Object Detection Using Faster R-CNN Deep Learning):
% Split data into a training and test set.
idx = floor(0.6 * height(vehicleDataset));
trainingData = vehicleDataset(1:idx,:);
testData = vehicleDataset(idx:end,:);
But Matlab 2018a shows the following error:
Error: "Undefined function 'height' for input arguments of type 'struct'."
I would like to detect objects in images using the Faster R-CNN method and determine their locations in the images.
Suppose your images are saved in the path "C:\Users\Student\Desktop\myImages"
First, create an imageDataStore object to manage a collection of image files.
datapath = "C:\Users\Student\Desktop\myImages";
imds = imageDatastore(datapath); % You may look at the documentation for customizations.
[trainds,testds,valds] = splitEachLabel(imds,.6,.2); % Let's say 60% of the data for training, 20% for testing and 20% for validation
Now you have the training data in the variable trainds and the test data in the variable testds.
You can retrieve any image using readimage, for example the 5th image from the training set:
im = readimage(trainds,5);

Feature extraction from AlexNet fc7 layer in MATLAB

I have this AlexNet model in MATLAB:
net = alexnet;
layers = net.Layers;
layers(end-2) = fullyConnectedLayer(numClasses);
layers(end) = classificationLayer;
I'm using it to learn features from sequences of frames from videos of different classes. So I need to extract the learned features from the 'fc7' layer of this model, save them as a vector, and pass that vector to an LSTM layer.
The training process for transfer learning with this model works fine.
I divided my data set into x_train and x_test sets using splitEachLabel() on my imageDatastore(), and used augmentedImageSource() to resize all the images for the network. Everything ok!
But when I try to use the snippet of code shown below to resize images from my imageDatastore so they can be read by the activations() function, in order to save the features as a vector, I get an error:
imageSize = [227 227 3];
auimds = augmentedImageSource(imageSize, imds, 'ColorPreprocessing', 'gray2rgb');
Function activations:
layer = 'fc7';
fclayer = activations(mynet, auimds, layer,'OutputAs','columns');
The error:
Error using SeriesNetwork>iDataDispatcher (line 1113)
For an image input layer, the input data for predict must be a single image, a 4D array of images, or an imageDatastore with the correct size.
Error in SeriesNetwork/activations (line 791)
dispatcher = iDataDispatcher( X, miniBatchSize, precision, ...
Someone help me, please!
Thanks for the support!
Did you check the input size of that layer? The error you are getting is related to the input size of the current layer. Can you check your mynet structure and the input size of its fc7 layer in your workspace in Matlab?

Retrieve data from .rec binary file

The question may be naive, but answers could help me.
A measurement is recorded in binary format, with a header that contains all information about the data, followed by the data itself (i.e. a series of doubles).
The measurement data can be exported in csv format from the application, but that takes ages.
What do you have to pay attention to when trying to read data from a binary file? Is this process even feasible in Matlab (importing it as an array) or LabVIEW (exporting as .txt, maybe)?
The binary .rec file format may refer to various things (the audio/video encoding format of Topfield based on MPEG4-TS, a proprietary audio encoding, and even MRI scanners from Philips) ...
If it refers to the MRI scanner, you may find a ready-made reader on the File Exchange: Matlab PAR REC Reader
If it refers to something else, you may parse the binary file header and data yourself using the low-level routine: fread
Edit
Not knowing the exact file format of your recorded sensor displacement, here is a dummy example using fread to read a large rec file block-by-block, supposing the header contains just the length of the data and that the data is just a series of double values:
function [] = DummyReadRec()
%[
    % Open rec file for reading
    [fid, errmsg] = fopen('dummy.rec', 'r');
    if (fid < 0), error(errmsg); end
    cuo = onCleanup(@()fclose(fid));

    % Read header (here supposing it is only an integer giving the length of the data)
    reclength = fread(fid, 1, 'uint32');

    % Read data block-by-block (here supposing it is only double values)
    MAX_BLOCK_LENGTH = 512;
    blockCount = ceil(reclength / MAX_BLOCK_LENGTH);
    for bi = 1:blockCount,
        % Will read a maximum of 'MAX_BLOCK_LENGTH' values (or fewer on the last block)
        [recdata, siz] = fread(fid, [1 MAX_BLOCK_LENGTH], 'double');

        % Do something with this block (fft or whatever)
        offset = (bi-1)*MAX_BLOCK_LENGTH;
        position = (offset+1):(offset+siz);
        plot(position, 20*log10(abs(fft(recdata))));
        drawnow();
    end
%]
end
The answer is going to depend on the format of your binary file and how large it is.
I have done many conversions of various binary files, all with differing layouts. If the file fits into memory, you can just use fread, as long as you know the layout of the binary file. Below is an example of reading a header and a simple data block; it would of course have to be modified depending on the layout of your file. Depending on the recording equipment and computer type, you may also need to make use of the machinefmt ('ieee-le' or 'ieee-be') option of fread ... that has burned me before.
%Open the File for reading
fid = fopen(yourRECfile,'r');
%Read the Header ... your layout will be different
header.MajorRel = fread(fid,1,'uint16'); %Major File Rev #
header.MinorRel = fread(fid,1,'uint16'); %Minor File Rev #
header.IRIGStart = fread(fid,1,'double'); %Start time in secs
header.Flags = fread(fid,1,'uint32'); %Flags
%Read everything else from there until end of file as a series of doubles.
data = fread(fid,inf,'double');
fclose(fid);
If the file does not fit into memory you will either need to process it in blocks or look into using memmapfile.

Caffe - MNIST - How do I use the network on a single image?

I'm using Caffe (http://caffe.berkeleyvision.org/) for image classification. I'm using it on Windows and everything seems to compile just fine.
To start learning I followed the MNIST tutorial (http://caffe.berkeleyvision.org/gathered/examples/mnist.html). I downloaded the data and ran ..\caffe.exe train --solver=...examples\mnist\lenet_solver.prototxt. It ran 10,000 iterations, printed that the accuracy was 98.5%, and generated two files: lenet_iter_10000.solverstate and lenet_iter_10000.caffemodel.
So I thought it would be fun to try to classify my own image; it should be easy, right?
I can find resources such as: https://software.intel.com/en-us/articles/training-and-deploying-deep-learning-networks-with-caffe-optimized-for-intel-architecture#Examples telling me how to prepare, train and time my model. But each time a tutorial/article gets to actually putting a single instance into the CNN, it skips to the next point or tells me to download some new model. Some resources say to use classifier.bin/.exe, but that tool takes an imagenet_mean.binaryproto or a similar file for MNIST, and I have no idea where to find or how to generate this file.
So in short: when I have trained a CNN using Caffe, how do I input a single image and get the output using the files I already have?
Update: Based on the help, I got the Net to recognize an image, but the recognition is not correct even though the network had an accuracy of 99.0%. I used the following Python code to recognize an image:
import numpy as np
import caffe
from PIL import Image

NET_FILE = 'deploy.prototxt'
MODEL_FILE = 'lenet_iter_10000.caffemodel'
net = caffe.Net(NET_FILE, MODEL_FILE, caffe.TEST)

im = Image.open("img4.jpg")
in_ = np.array(im, dtype=np.float32)
net.blobs['data'].data[...] = in_

out = net.forward()  # Run the network for the given input image
print out
I'm not sure if I am formatting the image correctly for the MNIST example. The image is a 28x28 grayscale image of a handwritten 4. Do I have to do more transformations on the image?
The network (deploy) looks like this (start and end):
input: "data"
input_shape {
dim: 1 # batchsize
dim: 1 # number of colour channels - rgb
dim: 28 # width
dim: 28 # height
}
....
layer {
name: "loss"
type: "Softmax"
bottom: "ip2"
top: "loss"
}
If I understand the question correctly, you have a trained model and you want to test the model using your own input images. There are many ways to do this.
One method I commonly use is to run a python script similar to what I have here.
Just keep in mind that you have to build the Python interface of Caffe using make pycaffe and point to the folder by editing the line sys.path.append('../../../python')
Also edit the following lines to match your model filenames:
NET_FILE = 'deploy.prototxt'
MODEL_FILE = 'fcn8s-heavy-pascal.caffemodel'
Edit the following line as well. Instead of score, you should use the last layer of your network to get the output:
out = net.blobs['score'].data
You need to create a deploy.prototxt file from your original network.prototxt file. The data layer has to look like this:
input: "data"
input_shape {
dim: 1
dim: [channles]
dim: [width]
dim: [height]
}
where you replace [channels], [width], and [height] with the correct values for your image.
You also need to remove any layers which take "label" as a bottom input (this would usually be only your loss layer).
Then you can use this deploy.prototxt file to test your inputs using MATLAB or Python.
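For reference, here is a minimal Python sketch of such a test, assuming the deploy file from the question above (a single 1x1x28x28 grayscale input whose softmax top is named "loss"), an image file name of my choosing, and that training used the MNIST tutorial's input scaling of 1/256 (transform_param { scale: 0.00390625 }); adjust these assumptions to match your own setup.
import numpy as np
import caffe
from PIL import Image

net = caffe.Net('deploy.prototxt', 'lenet_iter_10000.caffemodel', caffe.TEST)

# Load the image, force it to grayscale and 28x28 to match the input blob.
im = Image.open('img4.jpg').convert('L').resize((28, 28))
in_ = np.array(im, dtype=np.float32)

# MNIST digits are white on a black background; if your image is black-on-white,
# invert it first, e.g. in_ = 255.0 - in_
in_ *= 0.00390625  # same scaling as during training (assumed)

# The data blob is 1 x 1 x 28 x 28, so add batch and channel dimensions.
net.blobs['data'].data[...] = in_[np.newaxis, np.newaxis, :, :]

out = net.forward()
prob = out['loss']  # softmax output, named "loss" in the deploy snippet above
print 'predicted digit:', prob.argmax()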

MATLAB loading and saving a single image from a 32 bit tiff stack

I'm using MATLAB R2011a (student version). I have some image stacks saved as 32-bit TIFFs, some with over 1000 frames. I would like to be able to pull out a specific frame from the stack and save it as a 32-bit TIFF, or some readable format where there would be no data loss from the original. Currently my code looks like this:
clear, clc;
k=163;
image=('/Users/me/Filename.tiff');
A = uint8(imread(image, k));
B=A(:,:,1);
J=imadjust(B,stretchlim(B),[]);
imwrite(J,'/Users/me/163.tif','tif');
(I'm assuming that reading it as 8-bit, and the way I'm saving it, are not the best way to do this.)
Either way, this code works for a seemingly random number of frames (for example, in one file.tiff the above code works for frames 1-165 but none of the frames after 165; for a different file.tiff the code works for frames 1-8 but none of the frames after 8). I'm also getting a strange horizontal line in the output image when it does work. When it fails, I get this error:
??? Error using ==> rtifc
Invalid TIFF image index specified.
Error in ==> readtif at 52
[X, map, details] = rtifc(args);
Error in ==> imread at 443
[X, map] = feval(fmt_s.read, filename, extraArgs{:});
Thanks!
The best way (in my opinion) to handle TIFF stacks is to use the Tiff class, which has been available for a few years now. I must admit that I don't know much about OOP, but I managed to understand enough to load a TIFF stack and manipulate it. That's the kind of simple demo I wish I had seen a year ago, haha.
In the following example I load a single stack and store it all into a 3D array. I use imfinfo to fetch info about the images, notably the number of images per stack and the actual image dimensions. If you want, you can choose to load only one image using appropriate indices. Please try the code below and play around with it; you'll see what I mean.
clear
clc
%// Get tiff files you wish to open
hFiles = dir('*.tif');
%// Here I only have 1 multi-tiff file containing 30 images. Hence hInfo is a 30x1 structure.
hInfo = imfinfo(hFiles(1).name);
%// Set parameters.
ImageHeight = hInfo(1).Height;
ImageWidth = hInfo(1).Width;
SliceNumber = numel(hInfo);
%// Open Tiff object
Stack_TiffObject = Tiff(hFiles.name,'r');
%// Initialize array containing your images.
ImageMatrix = zeros(ImageHeight,ImageWidth,SliceNumber,'uint32');
for k = 1:SliceNumber
    %// Loop through each image
    Stack_TiffObject.setDirectory(k)
    %// Put it in the array
    ImageMatrix(:,:,k) = Stack_TiffObject.read();
end
%// Close the Tiff object
Stack_TiffObject.close
Hope that helps.