I have a trained Caffe model with the following files:
deploy.prototxt
mean_value.txt
model.caffemodel
How do I feed more images into it to increase its accuracy?
Retraining with more images to increase accuracy
Try fine-tuning, with your caffemodel as the starting weights.
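In Caffe, fine-tuning means continuing training while initializing the weights from your existing caffemodel instead of from scratch. A minimal pycaffe sketch; the solver.prototxt (pointing at a train/val net fed with your new images) is an assumed file you would still need to write:

```python
import caffe

caffe.set_mode_gpu()  # or caffe.set_mode_cpu()

# 'solver.prototxt' is an assumed solver definition for the fine-tuning run.
solver = caffe.SGDSolver('solver.prototxt')

# Initialize from the existing weights instead of random initialization;
# layers whose names match keep their trained parameters.
solver.net.copy_from('model.caffemodel')
solver.solve()
```

The command-line equivalent is `caffe train -solver solver.prototxt -weights model.caffemodel`. Use a lower learning rate than in the original training so the new data refines, rather than overwrites, what the model has already learned.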
I am a new MATLAB user and I would be grateful if you could help me. I have converted a set of time series into pictorial representations using the CWT (continuous wavelet transform) and trained a deep learning network with quite reasonable accuracy. I have used classify to check the trained network's performance on a single image. Now I want to apply it to a series of images consecutively extracted from the main time series, so how should I use classify in this case?
Regards
I have 3 classes (50k images for training, 12k for validation).
Using pretrained VGG16 and ResNet50, freezing the models and training only a dense layer on top, I reach a validation accuracy of 99%.
Should I fine-tune to improve the features by unfreezing some layers, or should I use the features as they are?
Also, is VGG16 a better feature extractor than ResNet50, or should I use the features from ResNet?
Thanks!
It depends on your problem domain. If you are fine-tuning the pretrained model for the same problem domain and the training data size is small, then what you have done is correct.
Maybe if you freeze only the first layers, which are well trained for general feature extraction (edges, blobs, shapes, etc.), and fine-tune the rest, you can boost your performance. It is also recommended to apply data augmentation if you do this, to avoid overfitting.
I encourage you to check the following tutorial on Transfer Learning for more details:
http://cs231n.github.io/transfer-learning/
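As a hedged sketch of the partial-unfreezing idea (the 224x224 input size, the choice to unfreeze only VGG16's block5, and the small learning rate are all assumptions you would adapt to your data):

```python
import tensorflow as tf

# Pretrained VGG16 without its classifier head; global average pooling on top.
base = tf.keras.applications.VGG16(weights='imagenet', include_top=False,
                                   input_shape=(224, 224, 3), pooling='avg')

# Freeze the early blocks (generic edges/blobs/shapes); unfreeze the last block.
for layer in base.layers:
    layer.trainable = layer.name.startswith('block5')

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(3, activation='softmax'),  # the 3 classes above
])

# A small learning rate so fine-tuning does not wipe out pretrained features.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```

A common recipe is to train the dense layer with everything frozen first, then unfreeze block5 and continue for a few epochs while monitoring validation accuracy.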
So here is the setup: I have a set of images (labeled train and test) and I want to train a conv net that tells me whether or not a specific object is present in the image.
To do this, I followed the TensorFlow tutorial on MNIST and trained a simple conv net on 128x128 crops reduced to the area of interest (the object). The architecture is as follows: three successive blocks, each consisting of 2 conv layers and 1 max-pool down-sampling layer, followed by one fully connected softmax layer (with two classes, 0 and 1, for whether the object is present or not).
I implemented it using TensorFlow, and it works quite well, but since I have enough computing power I was wondering how I could increase the complexity of the classifier:
- adding more layers?
- adding more channels at each layer? (currently 32, 64, 128, and 1024 for the fully connected layer)
- anything else?
But the most important part is that I now want to detect this same object in larger images (roughly 600x600, whereas the object should be around 100x100).
I was wondering how I could use the previously trained "small" network, built for small images, to pretrain a larger network on the large images. One option would be to classify the image with a sliding window of size 128x128 and scan the whole image, but if possible I would like to try training a whole network on it.
Any suggestions on how to proceed? Or an article/resource tackling this kind of problem? (I am really new to deep learning, so sorry if this is a stupid question...)
Thanks!
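For reference, a minimal TensorFlow/Keras sketch of the architecture described in the question; the 3-channel input, 3x3 kernels, and 'same' padding are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model(input_shape=(128, 128, 3)):
    model = tf.keras.Sequential([tf.keras.Input(shape=input_shape)])
    # Three blocks of (2 conv layers + 1 max-pool), with 32/64/128 channels.
    for filters in (32, 64, 128):
        model.add(layers.Conv2D(filters, 3, padding='same', activation='relu'))
        model.add(layers.Conv2D(filters, 3, padding='same', activation='relu'))
        model.add(layers.MaxPooling2D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(1024, activation='relu'))   # fully connected layer
    model.add(layers.Dense(2, activation='softmax'))   # object present / absent
    return model
```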
I suggest that you continue reading on the field overall. Useful search terms include CNN, image classification, neural net, AlexNet, GoogLeNet, and ResNet. These will return many articles, online classes and lectures, and other materials to help you learn about classification with neural nets.
Don't just add layers or filters: the complexity of the topology (net design) must be fitted to the task, and a net that is too complex will overfit the training data. The one you've been using is probably LeNet; the three I cite above were built for the ImageNet image classification contest.
Since you are working on images, I would suggest using a pretrained image classification network (like VGG, AlexNet, etc.) and fine-tuning it on your 128x128 image data. In my experience, unless you have a very large data set, a fine-tuned network will give higher accuracy and also save training time. After building a good image classifier on your data set, you can use any popular algorithm to generate region proposals from the image. Then pass each region proposal to the classification network one by one and check whether the network classifies it as positive or negative. If it is classified as positive, your object is most probably present in that region; otherwise it is not. If the classifier marks many region proposals as containing the object, you can use a non-maximum suppression algorithm to reduce the number of positive proposals.
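A hedged NumPy sketch of the scan-and-suppress idea; `classify_patch` is a hypothetical wrapper that runs one 128x128 crop through your trained network and returns the positive-class score:

```python
import numpy as np

def sliding_window_detect(image, classify_patch, win=128, stride=32, thresh=0.5):
    """Scan `image` with a win x win window; collect (x, y, score) detections."""
    detections = []
    for y in range(0, image.shape[0] - win + 1, stride):
        for x in range(0, image.shape[1] - win + 1, stride):
            score = classify_patch(image[y:y + win, x:x + win])
            if score > thresh:
                detections.append((x, y, score))
    return detections

def nms(detections, win=128, iou_thresh=0.3):
    """Greedy non-maximum suppression for equal-sized square windows."""
    dets = sorted(detections, key=lambda d: d[2], reverse=True)
    kept = []
    for x, y, s in dets:
        suppressed = False
        for kx, ky, _ in kept:
            iw = max(0, win - abs(x - kx))  # overlap width
            ih = max(0, win - abs(y - ky))  # overlap height
            iou = (iw * ih) / (2.0 * win * win - iw * ih)
            if iou > iou_thresh:
                suppressed = True
                break
        if not suppressed:
            kept.append((x, y, s))
    return kept
```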
When using an AlexNet neural network, be it with Caffe or CNTK, it needs a mean file as input. What is this mean file for? How does it affect the training? How is it generated, from the training samples only?
Mean subtraction removes the DC component from the images. It has the geometric interpretation of centering the cloud of data around the origin along every dimension, and it reduces the correlation between images, which improves training. From my experience I can say that it improves training accuracy significantly. It is computed from the training data; computing the mean from the test data makes no sense.
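A minimal NumPy sketch of how such a mean is computed and applied (the `train_images` array here is stand-in data, and a per-channel mean is a common simplification of Caffe's per-pixel mean file):

```python
import numpy as np

# Stand-in for the real training set: an (N, H, W, C) array of images.
train_images = np.random.rand(1000, 227, 227, 3).astype(np.float32)

# Per-channel mean computed over the training set only.
mean = train_images.mean(axis=(0, 1, 2))

def preprocess(image):
    """Center an image with the training-set mean before feeding the net."""
    return image - mean

# At test time the SAME mean is subtracted; it is never recomputed from test data.
test_image = np.random.rand(227, 227, 3).astype(np.float32)
centered = preprocess(test_image)
```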
I was wondering if it is possible to perform deconvolution of images in Caffe using the point spread function of the objective at a given focal point, something along the lines of this approach.
If yes, what would be the best way to proceed?
It is possible to deconvolve images using Caffe (and CNN in general), but the approach may not be as general as you hope it to be.
CNNs can take a blurry image as input and output a sharp image. As the networks are convolutional, the input can be of any size. This can be done easily in Caffe using Convolution layers and a EuclideanLoss layer. Optionally, you can experiment with adding some pooling and deconvolution layers.
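As a hedged illustration, such a network could be sketched with pycaffe's NetSpec; the layer sizes below are assumptions loosely following SRCNN-style restoration nets, not a prescribed design:

```python
import caffe
from caffe import layers as L

n = caffe.NetSpec()
# Blurry input and sharp target; 128x128 grayscale crops are an assumption.
n.blurry = L.Input(shape=dict(dim=[1, 1, 128, 128]))
n.sharp = L.Input(shape=dict(dim=[1, 1, 128, 128]))
n.conv1 = L.Convolution(n.blurry, num_output=64, kernel_size=9, pad=4,
                        weight_filler=dict(type='gaussian', std=0.01))
n.relu1 = L.ReLU(n.conv1, in_place=True)
n.conv2 = L.Convolution(n.relu1, num_output=32, kernel_size=5, pad=2,
                        weight_filler=dict(type='gaussian', std=0.01))
n.relu2 = L.ReLU(n.conv2, in_place=True)
# The last layer predicts the sharp image; EuclideanLoss compares it to the target.
n.pred = L.Convolution(n.relu2, num_output=1, kernel_size=5, pad=2,
                       weight_filler=dict(type='gaussian', std=0.01))
n.loss = L.EuclideanLoss(n.pred, n.sharp)

with open('deblur_train.prototxt', 'w') as f:
    f.write(str(n.to_proto()))
```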
CNNs can be trained to deconvolve images for a specific blur PSF, as in your link (see [Xu et al.: Deep Convolutional Neural Network for Image Deconvolution. NIPS 2014]). This works well, but you have to re-train the CNN for each new PSF (which takes a lot of time).
I've tried to train CNNs to do blind deconvolution (where the PSF is not known) and it works very well for text documents. You can get trained nets and Python/Caffe scripts at [Hradiš et al.: Convolutional Neural Networks for Direct Text Deblurring. BMVC 2015]. This approach could work for other types of images, but it would not work for unrestricted photographs and diverse blurs; for general photos, I would guess it could work for a small range of blurs.
Another possibility is to do inverse filtering (e.g. using a Wiener filter) and process the output with a CNN. The advantage is that you can compute the inverse filter for a new PSF very quickly while the CNN stays the same. [Schuler et al.: A machine learning approach for non-blind image deconvolution. CVPR 2013]
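A hedged sketch of that two-stage idea using scikit-image's Wiener deconvolution (the box-blur PSF and the `balance` value are assumptions, and the CNN cleanup stage is left as a placeholder):

```python
import numpy as np
from scipy.signal import convolve2d
from skimage import restoration

rng = np.random.default_rng(0)
sharp = rng.random((128, 128))           # stand-in for a sharp image
psf = np.ones((5, 5)) / 25.0             # assumed 5x5 box-blur PSF
blurred = convolve2d(sharp, psf, mode='same', boundary='wrap')

# Inverse-filtering stage: cheap to redo for every new PSF.
deconvolved = restoration.wiener(blurred, psf, balance=0.1)

# A trained CNN would then be run on `deconvolved` to remove ringing and noise;
# the CNN itself stays the same when the PSF changes.
```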