I'm replicating the steps in
http://caffe.berkeleyvision.org/gathered/examples/finetune_flickr_style.html
I want to change the network to VGG model which is obtained at
http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel
Does it suffice to simply substitute the weights parameter as follows?
./build/tools/caffe train -solver models/finetune_flickr_style/solver.prototxt -weights VGG_ILSVRC_16_layers.caffemodel -gpu 0
Or do I need to adjust learning rates and iteration counts, i.e., does the model come with its own prototxt files?
There needs to be a one-to-one correspondence between the weights of the network you want to train and the weights you use for initializing/fine-tuning: the architectures of the old and new models have to match.
VGG-16 has a different architecture than the model described by models/finetune_flickr_style/train_val.prototxt (FlickrStyleCaffeNet), which is the network the solver will try to optimize. Even if training doesn't crash, the weights you've loaded won't have any meaning in the new network.
The VGG-16 network is described by the deploy.prototxt file linked from its entry in Caffe's Model Zoo.
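In practice that means writing a train_val.prototxt that reproduces the VGG-16 layer definitions, renaming the last InnerProduct layer so it is re-initialized for the 20 style classes, and only then pointing -weights at the VGG caffemodel. As a sanity check, a small pycaffe sketch (file names here are assumptions, adjust them to your setup) can show which layers -weights will actually initialize:

import caffe

# assumed file names; note that instantiating train_val.prototxt
# requires its data sources (the LMDBs) to exist on disk
vgg = caffe.Net('VGG_ILSVRC_16_layers_deploy.prototxt',
                'VGG_ILSVRC_16_layers.caffemodel', caffe.TEST)
new = caffe.Net('models/finetune_flickr_style/train_val.prototxt', caffe.TEST)

# -weights copies parameters by layer name: a layer is initialized
# from VGG only if both the name and the blob shapes agree
for name in new.params:
    if name not in vgg.params:
        print(name, '-> trained from scratch')
    elif all(p.data.shape == q.data.shape
             for p, q in zip(new.params[name], vgg.params[name])):
        print(name, '-> initialized from VGG')
    else:
        print(name, '-> shape mismatch, Caffe will abort')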
Related
Let's say I have a neural network (NN) that is trained to recognize cats in images. Is there a way to update my NN to recognize dogs as well?
More generally, my question is about a way to extend a NN by loading something like a library of learned patterns.
This is generally known as transfer learning: you train a neural network on a large dataset (like ImageNet), then use the feature vector produced by the final convolutional layers to train another classifier (a multiclass SVM, for example). This works even if the new objects are different from the original classes.
Another way is to take a pretrained network and retrain only the classifier part (the fully connected layers). That is still much faster than training a network from scratch.
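A rough sketch of the first approach, assuming pycaffe and scikit-learn, with placeholder file names, the 'fc7' blob as the feature layer, and hypothetical train_images/train_labels/test_images arrays:

import numpy as np
import caffe
from sklearn.svm import LinearSVC  # scikit-learn assumed installed

# assumed file names; any ImageNet-pretrained model works the same way
net = caffe.Net('deploy.prototxt', 'pretrained.caffemodel', caffe.TEST)

def extract_features(images):
    """Return the fc7 activations for a list of preprocessed images."""
    feats = []
    for im in images:  # each im already shaped like the 'data' blob
        net.blobs['data'].data[...] = im
        net.forward()
        feats.append(net.blobs['fc7'].data[0].copy())
    return np.array(feats)

# the SVM learns the new classes (cats vs. dogs vs. anything else)
# on top of the frozen ImageNet features
clf = LinearSVC().fit(extract_features(train_images), train_labels)
predictions = clf.predict(extract_features(test_images))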
I was trying to implement the siamese network described in "Seeing the Forest from the Trees: A Holistic Approach to Near-infrared Heterogeneous Face Recognition" (CVPRW 2016). The approach initializes the two channels of the siamese network with the same pretrained weights of a single-channel model. This is straightforward in Caffe when the weights are shared, but I want to implement it so that the weights are not shared: the two channels have to train together under the contrastive loss described in the paper without tying their weights, while still starting from identical initializations. I couldn't find a way to do this in Caffe. Does anyone have suggestions for neat approaches or hacks?
Thanks.
You can load the source model and the siamese destination model in Python using pycaffe:
import caffe

netSrc = caffe.Net('deploySrc.prototxt',
                   'src.caffemodel',
                   caffe.TEST)
netDst = caffe.Net('deployDst.prototxt',
                   'dst.caffemodel',
                   caffe.TEST)
Then you can copy weights layer by layer from source to destination. Say you want to copy layer conv from the source into the convA and convB copies in the siamese network:
# net.params is a view of the underlying blobs, so copy the blob data
# (index 0: weights, index 1: biases) rather than reassigning the
# dictionary entries, which would not touch the network itself
for i in range(len(netSrc.params['conv'])):
    netDst.params['convA'][i].data[...] = netSrc.params['conv'][i].data
    netDst.params['convB'][i].data[...] = netSrc.params['conv'][i].data
Then save the result to a new caffemodel:
netDst.save('dstInitialized.caffemodel')
Related: How to Create CaffeDB training data for siamese networks out of image directory
If I have N labels, how can I enforce that the feature vector of size N right before the contrastive loss layer represents some kind of probability for each class? Or does that come automatically with the siamese net design?
If you only use contrastive loss in a siamese network, there is no way to force the net to classify into the correct label, because the net is trained using only "same/not same" information and does not know the semantics of the different classes.
What you can do is train with multiple loss layers.
You should aim to train a feature representation that is rich enough for your domain: looking at the trained feature vector of some input (in some high dimension), you should be able to easily classify that input into the correct class. Moreover, given the feature representations of two inputs, one should be able to easily say whether they are "same" or "not same".
Therefore, I recommend training your deep network with two loss layers whose "bottom" is the output of one of the "InnerProduct" layers. One loss is the contrastive loss. The other branch should have another "InnerProduct" layer with num_output: N followed by a "SoftmaxWithLoss" layer.
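Here is a minimal NetSpec sketch of that two-loss layout; the dummy inputs, blob shapes, layer names and N = 10 are assumptions standing in for your real data layers and convolutional trunks:

import caffe
from caffe import layers as L

n = caffe.NetSpec()
# dummy inputs stand in for your real data layers: the two branches,
# a same/not-same indicator, and a class label
n.data, n.data_p, n.sim, n.label = L.DummyData(
    shape=[dict(dim=[32, 1, 28, 28]), dict(dim=[32, 1, 28, 28]),
           dict(dim=[32]), dict(dim=[32])],
    ntop=4)

# the convolutional trunks of each branch would go here; for brevity
# the feature InnerProduct layers attach directly to the inputs
n.feat = L.InnerProduct(n.data, num_output=256)
n.feat_p = L.InnerProduct(n.data_p, num_output=256)

# loss 1: contrastive loss trained on same/not-same pairs
n.loss_c = L.ContrastiveLoss(n.feat, n.feat_p, n.sim,
                             contrastive_loss_param=dict(margin=1.0))

# loss 2: classification head with num_output: N and SoftmaxWithLoss
n.fc_class = L.InnerProduct(n.feat, num_output=10)  # N = 10 assumed
n.loss_s = L.SoftmaxWithLoss(n.fc_class, n.label)

print(n.to_proto())  # prototxt text you can merge into your train net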
A similar concept was used in this work:
Sun, Chen, Wang and Tang, "Deep Learning Face Representation by Joint Identification-Verification," NIPS 2014.
I have trained a neural network on a particular time series in MATLAB and saved the network. If I now want to simulate the network with different parameters, such as the number of neurons, the number of hidden layers, the transfer functions, the learning rate, or the momentum coefficient, can I do so without training the network again?
If not, what are the criteria for selecting the best parameters for my neural network?
How should I configure my neural network in MATLAB to do all this?
No. Saving the model writes the whole thing to file: the weights, the activation functions, and the entire layer structure. Changing any of those parameters defines a different network, which must be trained from scratch. What you can do is train several networks with different parameters, save each one, and later compare them on held-out (validation) data to see which is better.
Check this as well: http://people.cs.umass.edu/~btaylor/publications/PSI000008.pdf
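The comparison step is just a model-selection loop. A minimal sketch, written in Python for illustration (train_model and evaluate are hypothetical helpers; in MATLAB the same loop would wrap fitnet/train and score each candidate on a validation split):

# search a small grid of hyperparameters, keeping the network that
# scores best on the held-out validation data
best_score, best_params = float('-inf'), None
for hidden in (5, 10, 20):       # candidate hidden-layer sizes
    for lr in (0.01, 0.1):       # candidate learning rates
        model = train_model(train_x, train_y, hidden, lr)  # hypothetical
        score = evaluate(model, val_x, val_y)              # hypothetical
        if score > best_score:
            best_score, best_params = score, (hidden, lr)
print('best parameters on validation data:', best_params, best_score)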
Is it possible to train a network with two inputs, where one is the data and the other is a constant that we define?
We train the network on one set of data with the second input set to 10, for example,
then, once it has converged, we train on another set of data with the second input set to 20.
Now, if I feed in test data with the second input set to 15, will the network automatically interpolate between the two learned states?
If not, how do I achieve what I described above, i.e. interpolate between two training states?
Thanks a lot
Jeff
It is possible to add another input as a parameter to the neural network, but I am unsure what benefit you are trying to achieve by adding it.
You would need to train the neural network with this input included so that it can estimate values between the two trained regimes. Training each network individually and then expecting values in between will not work by itself; that would require training a second network that interpolates between their outputs.
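A minimal sketch of the single-network approach, assuming numpy and hypothetical data_a/data_b arrays (with targets_a/targets_b) collected under the two parameter settings:

import numpy as np

# hypothetical arrays: data_a was collected with the constant set to 10,
# data_b with it set to 20; each row is one training sample
X_a = np.hstack([data_a, np.full((len(data_a), 1), 10.0)])
X_b = np.hstack([data_b, np.full((len(data_b), 1), 20.0)])

X = np.vstack([X_a, X_b])                  # one net sees both regimes
y = np.concatenate([targets_a, targets_b])

# train a single network on (X, y) in any framework; at test time,
# append 15.0 to a sample to ask for behaviour between the two regimes
x_test = np.hstack([sample, [15.0]])
# whether the net interpolates sensibly depends on the mapping being
# smooth in that parameter; interpolation is not guaranteed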
If you are trying to modularise each trained neural network for specific roles or classifications, and these classifications represent some kind of continuous relationship (for example, weather predictions that specialise in no rain, light rain, moderate rain and heavy rain), then perhaps this input could be used to encourage the output of a particular network.
If you would like to weight the networks so that some have more influence than others, an ensemble approach can assign a different weight to each network (both static and dynamic weighting schemes exist). If you just want to map the differences between the two trained networks, a linear or non-linear function can be fitted between them to describe how the weights change from one network to the other.