When fine-tuning a DNN built on a pretrained model, does it change the weights of the downloaded pretrained model on my computer? - ResNet

I do not know whether fine-tuning a pretrained model changes the weight values in the downloaded pretrained model itself, or only the weights of my own network.
I designed a DNN with the help of the ResNet50 pretrained model and tried to fine-tune it, so I set trainable to True. Did this change the weights of the downloaded ResNet model on my computer, or are these changes local to my model?

Related

What is Fine Tuning in reference to Neural Network?

I'm going through a few research papers based on neural networks, where I came across the term Fine Tuning of a pre-trained CNN. What does it actually do?
Pre-trained:
First we have to understand what a pre-trained model is. Pre-trained models are models whose weights have already been trained by someone on a dataset; e.g., VGG16 is trained on ImageNet. Now suppose we want to classify ImageNet images: if we use pre-trained VGG16, we can classify them easily, because VGG16 is already trained to classify ImageNet objects and we don't need to train it again.
Fine-Tuning:
Now suppose I want to classify CIFAR-10 (10 classes) with VGG16 (1000 classes) and I want to use a pre-trained model for this work. I have a model trained on ImageNet, which has 1000 classes, so I will replace the last layer with a dense layer of 10 neurons and softmax activation, because now I want to classify 10 classes, not 1000. In other words, I fine-tune (change according to my need) the model, after which I can reuse VGG16 pre-trained on ImageNet. Changing a pre-trained model according to our need is known as fine-tuning.
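As a concrete illustration of swapping the classifier head, here is a minimal sketch; the (32, 32, 3) input shape for CIFAR-10 and the choice to freeze the convolutional base are my own assumptions, not part of the explanation above.
from keras.applications.vgg16 import VGG16
from keras.models import Sequential
from keras.layers import Flatten, Dense
# Load VGG16 without its original 1000-class classifier (assumed CIFAR-10 input size)
base = VGG16(include_top=False, weights='imagenet', input_shape=(32, 32, 3))
base.trainable = False  # keep the pre-trained weights frozen at first
# Add a new 10-class head for CIFAR-10
model = Sequential()
model.add(base)
model.add(Flatten())
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])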
Transfer Learning:
The whole concept of taking a pre-trained model and using it to classify our own dataset by fine-tuning it is known as transfer learning.
Transfer-learning example (using a pre-trained model and fine-tuning it for my own dataset):
Here I am using DenseNet pre-trained on ImageNet and fine-tuning it, because I want to use the DenseNet model to classify images in my dataset. My dataset has 5 classes, so I am adding a final dense layer with 5 neurons.
import keras
from keras.models import Sequential

model = Sequential()
dense_model = keras.applications.densenet.DenseNet121(include_top=False, weights='imagenet',
                                                      input_tensor=None, input_shape=(224, 224, 3),
                                                      pooling=None, classes=1000)
# Freeze the pre-trained weights so only the new top layers are trained
dense_model.trainable = False
dense_model.summary()
# Add the DenseNet convolutional base model
model.add(dense_model)
# Add new layers on top
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Dense(5, activation='softmax'))
model.summary()
Pre-trained model link:
https://www.kaggle.com/sohaibanwaar1203/pretrained-densenet
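To actually train the new top layers, the model above still has to be compiled and fit; a minimal sketch, where train_images and train_labels are placeholders for your own 5-class data (labels one-hot encoded):
# Hypothetical training step for the model built above
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=10, batch_size=32, validation_split=0.1)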
Now what if I want to change the hyper-parameters of the pre-trained model? I want to check which optimizer, loss function, number of layers, and number of neurons work best on my dataset. To do this, I optimize these parameters; this is known as hyper-parameter optimization.
Hyper-parameter Optimization:
If you know a bit about neural networks, you will know that we pick many numbers for our network more or less arbitrarily, e.g. the number of dense layers, the number of dense units, the activations, the dropout percentage. We don't know in advance whether a network with 3 layers or one with 6 layers will perform better on our data, so we experiment to find the best numbers for our model. This experimentation to find the best settings is also often called fine-tuning. There are techniques to optimize the model for us, such as Grid Search and Random Search. Below is code with which you can optimize your model's parameters.
import math
import keras
from keras import losses, optimizers, regularizers
from keras.models import Sequential, Model
from keras.layers import Dense, Dropout, Activation, BatchNormalization
from keras.callbacks import EarlyStopping
from keras.wrappers.scikit_learn import KerasRegressor, KerasClassifier
from sklearn.model_selection import RandomizedSearchCV, KFold
from sklearn.metrics import make_scorer

def Randomized_Model(lr=0.0001, dropout=0.5, optimizer='Adam', loss='mean_squared_error',
                     activation="relu", clipnorm=0.1,
                     decay=1e-2, momentum=0.5, l1=0.01, l2=0.001):
    # Note: dropout, clipnorm, l1 and l2 are accepted for the search but not used in this simple model
    # Map the loss-function name to the corresponding Keras loss
    if loss == 'mean_squared_error':
        loss = losses.mean_squared_error
    if loss == "poisson":
        loss = keras.losses.poisson
    if loss == "mean_absolute_error":
        loss = keras.losses.mean_absolute_error
    if loss == "mean_squared_logarithmic_error":
        loss = keras.losses.mean_squared_logarithmic_error
    if loss == "binary_crossentropy":
        loss = keras.losses.binary_crossentropy
    if loss == "hinge":
        loss = keras.losses.hinge

    # Map the optimizer name to the corresponding Keras optimizer
    opt = keras.optimizers.Adam(lr=lr, decay=decay, beta_1=0.9, beta_2=0.999)
    if optimizer == "Adam":
        opt = keras.optimizers.Adam(lr=lr, decay=decay, beta_1=0.9, beta_2=0.999)
    if optimizer == "Adagrad":
        opt = keras.optimizers.Adagrad(lr=lr, epsilon=None, decay=decay)
    if optimizer == "sgd":
        opt = keras.optimizers.SGD(lr=lr, momentum=momentum, decay=decay, nesterov=False)
    if optimizer == "RMSprop":
        opt = keras.optimizers.RMSprop(lr=lr, rho=0.9, epsilon=None, decay=0.0)
    if optimizer == "Adamax":
        opt = keras.optimizers.Adamax(lr=lr, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0)

    # Simple sequential regression model
    model = Sequential()
    model.add(Dense(units=64, input_dim=30, activation=activation))
    model.add(Dense(units=32, activation=activation))
    model.add(Dense(units=8, activation=activation))
    model.add(Dense(units=1))
    model.compile(loss=loss, optimizer=opt)
    return model

# Search space for the random search
params = {'lr': (0.0001, 0.01, 0.0009, 0.001, 0.002),
          'epochs': [50, 100, 25],
          'dropout': (0, 0.2, 0.4, 0.8),
          'optimizer': ['Adam', 'Adagrad', 'sgd', 'RMSprop', 'Adamax'],
          'loss': ["mean_squared_error", "hinge", "mean_absolute_error",
                   "mean_squared_logarithmic_error", "poisson"],
          'activation': ["relu", "selu", "linear", "sigmoid"],
          'clipnorm': (0.0, 0.5, 1),
          'decay': (1e-6, 1e-4, 1e-8),
          'momentum': (0.9, 0.5, 0.2),
          'l1': (0.01, 0.001, 0.0001),
          'l2': (0.01, 0.001, 0.0001),
          }

# Wrap the Keras model so it can be used in scikit-learn's randomized search CV
model = KerasRegressor(build_fn=Randomized_Model, epochs=30, batch_size=3, verbose=1)
RandomizedSearchfit = RandomizedSearchCV(estimator=model, cv=KFold(3), param_distributions=params,
                                         verbose=1, n_iter=10, n_jobs=1)
# Fit the search on your own data (X = features, Y = targets)
RandomizedSearch_result = RandomizedSearchfit.fit(X, Y)
Now give your X and Y to this model and it will find the best parameters from those you listed in the params dictionary; you can then read the result back from the fitted search object, as shown below. You can also check fine-tuning of a CNN in my other notebook, where I use the Talos library to fine-tune the model, and in another notebook where I use scikit-learn (randomized and grid search) to fine-tune a model.
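Once the search has finished, the best combination can be read back from the fitted search object; a short follow-up sketch using standard scikit-learn attributes (the final rebuild step is illustrative):
# Inspect the outcome of the random search
print("Best score:", RandomizedSearch_result.best_score_)
print("Best parameters:", RandomizedSearch_result.best_params_)
# Rebuild the model with the best parameters ('epochs' is a fit-time parameter, so drop it here)
best_params = {k: v for k, v in RandomizedSearch_result.best_params_.items() if k != 'epochs'}
best_model = Randomized_Model(**best_params)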
Fine-tuning usually refers to the last step of a more complex NN training procedure, in which you only slightly modify a pre-trained network, usually to improve performance on a specific domain or to re-use a good input representation in a different task.
It is often mentioned in the context of transfer learning. E.g., for image recognition, it may mean that you take a network that was trained to recognize the 1k ImageNet classes, and only "fine-tune" the last layer on your task-specific (smaller and presumably simpler) dataset.

How to deploy a PyTorch-based GAN model to Android

I have trained and tested a PyTorch-based GAN model and now I want to deploy it to Android.
I have read about converting a CNN model to ONNX format and then to Caffe2,
but I don't know how to do this when there is more than one model, i.e. generator, discriminator, and encoder-decoder models.
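For on-device inference you typically only need the generator, so one option is to export each sub-model you actually need as its own ONNX file; a minimal sketch, where Generator, the latent size, and the file names are placeholders for your own code:
# Hypothetical export of just the generator to ONNX
import torch
generator = Generator()                    # your trained generator class (placeholder)
generator.load_state_dict(torch.load('generator.pth', map_location='cpu'))
generator.eval()
dummy_input = torch.randn(1, 100)          # example input with the generator's expected latent shape
torch.onnx.export(generator, dummy_input, 'generator.onnx')
The discriminator is usually only needed during training, so it often does not have to be shipped at all; if the encoder-decoder is needed at inference time, it can be exported the same way as a separate ONNX file.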

Change optimizer algorithm in Keras

If you change the optimizer in Keras, you need to compile your model again. This compilation overrides the learned weights of the network. I know how to save weights, but I do not know how to restore them into a network. Can someone please help?
Here is a YouTube video that explains exactly what you're wanting to do: Save and load a Keras model
How you load the model weights is going to depend on how you saved the model or model weights. There are three different saving methods that Keras makes available. These are described in the video link above (with examples), as well as below.
The model.save('my_model.h5') function saves:
The architecture of the model, allowing you to re-create the model.
The weights of the model.
The training configuration (loss, optimizer).
The state of the optimizer, allowing you to resume training exactly where you left off.
To load this saved model, you would use the following:
from keras.models import load_model
new_model = load_model('my_model.h5')
The model.to_json() function only saves the architecture of the model; it does not save the weights. To load a model saved this way, you would use the following:
json_string = model.to_json()
from keras.models import model_from_json
model = model_from_json(json_string)
The model.save_weights('my_model_weights.h5') function only saves the weights of the model. To load these saved weights to a model, you would use the following:
model.load_weights('my_model_weights.h5')
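Applied to the question above (changing the optimizer without losing the trained weights), a minimal sketch would be to save the weights before switching optimizers and load them back after recompiling; the file name, optimizer, and loss are placeholders:
# Hypothetical workflow: keep trained weights while switching to a new optimizer
model.save_weights('trained_weights.h5')   # snapshot the learned weights
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
model.load_weights('trained_weights.h5')   # restore the learned weights into the recompiled model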

How to improve the features from a convnet for image retrieval?

I have 3 classes (50k images for training, 12k for validation).
By using pretrained VGG16 and ResNet50, freezing the models and only training a dense layer on top, I reach a validation accuracy of 99%.
Should I fine-tune to improve the features by unfreezing some layers, or should I use the features as they are?
Also, is VGG16 a better feature extractor than ResNet50, or should I use the features from ResNet?
Thanks!
It depends on your problem domain. If you are fine-tuning the pretrained model for the same problem domain and the training data size is small, then what you have done is correct.
You may be able to boost your performance by freezing only the first layers, which are well trained for general feature extraction (edges, blobs, shapes, etc.), and fine-tuning the rest; a sketch follows below. It is also recommended to apply data augmentation if you do this, to avoid overfitting.
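A minimal sketch of this partial unfreezing in Keras; the cut-off index, input shape, and the use of VGG16 as the base are illustrative assumptions, not a prescription:
# Hypothetical partial fine-tuning: freeze the early layers, unfreeze the later ones
from keras.applications.vgg16 import VGG16
base = VGG16(include_top=False, weights='imagenet', input_shape=(224, 224, 3))
for layer in base.layers[:15]:   # keep the generic low-level filters frozen
    layer.trainable = False
for layer in base.layers[15:]:   # let the more task-specific layers adapt
    layer.trainable = True
# Rebuild your dense classifier on top of `base` and recompile with a small
# learning rate so the pre-trained weights are only nudged, not destroyed.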
I encourage you to check the following tutorial on Transfer Learning for more details:
http://cs231n.github.io/transfer-learning/

Fine-tuning with VGG on Caffe

I'm replicating the steps in
http://caffe.berkeleyvision.org/gathered/examples/finetune_flickr_style.html
I want to change the network to VGG model which is obtained at
http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel
Does it suffice to simply substitute the model parameter as follows?
./build/tools/caffe train -solver models/finetune_flickr_style/solver.prototxt -weights VGG_ISLVRC_16_layers.caffemodel -gpu 0
Or do I need to adjust learning rates, iterations, etc.? I.e., does it come with separate prototxt files?
There needs to be a 1-1 correspondence between the weights of the network you want to train and the weights you use for initializing/fine-tuning. The architectures of the old and new models have to match.
VGG-16 has a different architecture from the model described by models/finetune_flickr_style/train_val.prototxt (FlickrStyleCaffeNet), which is the network the solver will try to optimize. Even if it doesn't crash, the weights you've loaded won't have any meaning in the new network.
The VGG-16 network is described in the deploy.prototxt file on its page in Caffe's Model Zoo.
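One way to see how much of the VGG weights would actually be picked up is to compare the layer names in the two prototxt files, since Caffe copies weights into layers whose names match and re-initializes the rest; a rough sketch using the protobuf definitions shipped with pycaffe (the prototxt paths are placeholders):
# Hypothetical check: list the layer names defined in each prototxt
from caffe.proto import caffe_pb2
from google.protobuf import text_format

def layer_names(prototxt_path):
    net = caffe_pb2.NetParameter()
    with open(prototxt_path) as f:
        text_format.Merge(f.read(), net)
    # older prototxts use the 'layers' field, newer ones use 'layer'
    return {l.name for l in (net.layer or net.layers)}

vgg = layer_names('VGG_ILSVRC_16_layers_deploy.prototxt')                 # placeholder path
flickr = layer_names('models/finetune_flickr_style/train_val.prototxt')
print("Layers that would inherit VGG weights:", sorted(vgg & flickr))
print("Layers that would be randomly re-initialized:", sorted(flickr - vgg))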