Saving and retrieving the parameters of a Gpflow model - gpflow

I am currently implementing an algorithm with GPflow using GPR. I wanted to save the parameters after the GPR training and load the model for testing. Does anyone knows the command?

GPflow has a page with tips & tricks now. You can follow the link where you will find the answer on your question. But, I'm going to paste MWE here as well:
Let's say you want to store GPR model, you can do it with gpflow.Saver():
kernel = gpflow.kernels.RBF(1)
x = np.random.randn(100, 1)
y = np.random.randn(100, 1)
model = gpflow.models.GPR(x, y, kernel)
filename = "/tmp/gpr.gpflow"
path = Path(filename)
if path.exists():
path.unlink()
saver = gpflow.saver.Saver()
saver.save(filename, model)
To load it back you have to use either this solution:
with tf.Graph().as_default() as graph, tf.Session().as_default():
model_copy = saver.load(filename)
or if you want to load the model in the same session where you stored it before, you need to apply some tricks:
ctx_for_loading = gpflow.saver.SaverContext(autocompile=False)
model_copy = saver.load(filename, context=ctx_for_loading)
model_copy.clear()
model_copy.compile()
UPDATE 1 June 2020:
GPflow 2.0 doesn't provide custom saver. It relies on TensorFlow checkpointing and tf.saved_model. You can find examples here: GPflow intro.

One option that I employ for gpflow models is to just save and load the trainables. It assumes you have a function that builds and compiles the model.
I show this in the following, by saving the variables to an hdf5 file.
import h5py
def _load_model(model, load_file):
"""
Load a model given by model path
"""
vars = {}
def _gather(name, obj):
if isinstance(obj, h5py.Dataset):
vars[name] = obj[...]
with h5py.File(load_file) as f:
f.visititems(_gather)
model.assign(vars)
def _save_model(model, save_file):
vars = model.read_trainables()
with h5py.File(save_file) as f:
for name, value in vars.items():
f[name] = value

Related

Saving a trained Detectron2 model and making predictions on a single image

I am new to detectron2 and this is my first project. After reading the docs and using the tutorials as a guide, I trained my model on the custom dataset and performed the evaluation.
I would now like to make predictions on images I receive via an API by loading this saved model. I could not find any reading materials that could help me with this task.
To save my model, I have used this link as a reference - https://detectron2.readthedocs.io/en/latest/tutorials/models.html
I am able to save my trained model using the following code-
from detectron2.modeling import build_model
model = build_model(cfg) # returns a torch.nn.Module
from detectron2.checkpoint import DetectionCheckpointer
checkpointer = DetectionCheckpointer(model, save_dir="output")
checkpointer.save("model_final") # save to output/model_final.pth
But I am still confused as to how I can go about implementing what I want. I could use some guidance on what my next steps should be. Would be extremely grateful to anyone who can help.
for a single image, create a list of data. Put image path in the file_name as below:
test_data = [{'file_name': '.../image_1jpg',
'image_id': 10}]
Then do run the following:
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from detectron2.data import MetadataCatalog
from detectron2.utils.visualizer import Visualizer, ColorMode
import matplotlib.pyplot as plt
import cv2.cv2 as cv2
test_data = [{'file_name': '.../image_1jpg',
'image_id': 10}]
cfg = get_cfg()
cfg.merge_from_file("model config")
cfg.MODEL.WEIGHTS = "model_final.pth" # path for final model
predictor = DefaultPredictor(cfg)
im = cv2.imread(test_data[0]["file_name"])
outputs = predictor(im)
v = Visualizer(im[:, :, ::-1],
metadata=MetadataCatalog.get(cfg.DATASETS.TRAIN[0]),
scale=0.5,
instance_mode=ColorMode.IMAGE_BW)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
img = cv2.cvtColor(out.get_image()[:, :, ::-1], cv2.COLOR_RGBA2RGB)
plt.imshow(img)
This will show the prediction for the single image

LSTM neural network with two sources of data

I have the following configuration: One lstm network that receives a text with n-grams with size 2. Below a simple schematic:
After some tests, I noticed that for some classes I have an significant incrise on accuracy when I use ngrams with size 3. Now I want to train a new LSTM neural network with both ngram sizes at same time, like the following schematic:
How can I provide the data and build this model, using keras to perform this task?
I assume you already have a function to split words into n-grams, as you already have the 2-grams and 3-grams model working? Therefor I just construct a one-sample example of the word "cool" for a working example. I had to use embedding for my example, as an LSTM layer with 26^3=17576 nodes was a little too much for my computer to handle. I expect you did the same in your 3-grams code?
Below is a complete working example:
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense, concatenate
from tensorflow.keras.models import Model
import numpy as np
# c->2 o->14 o->14 l->11
np_2_gram_in = np.array([[26*2+14,26*14+14,26*14+11]])#co,oo,ol
np_3_gram_in = np.array([[26**2*2+26*14+14,26**2*14+26*14+26*11]])#coo,ool
np_output = np.array([[1]])
output_shape=1
lstm_2_gram_embedding = 128
lstm_3_gram_embedding = 192
inputs_2_gram = Input(shape=(None,))
em_input_2_gram = Embedding(output_dim=lstm_2_gram_embedding, input_dim=26**2)(inputs_2_gram)
lstm_2_gram = LSTM(lstm_2_gram_embedding)(em_input_2_gram)
inputs_3_gram = Input(shape=(None,))
em_input_3_gram = Embedding(output_dim=lstm_3_gram_embedding, input_dim=26**3)(inputs_3_gram)
lstm_3_gram = LSTM(lstm_3_gram_embedding)(em_input_3_gram)
concat = concatenate([lstm_2_gram, lstm_3_gram])
output = Dense(output_shape,activation='sigmoid')(concat)
model = Model(inputs=[inputs_2_gram, inputs_3_gram], outputs=[output])
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit([np_2_gram_in, np_3_gram_in], [np_output], epochs=5)
model.predict([np_2_gram_in,np_3_gram_in])

How to implement exponentially decay learning rate in Keras by following the global steps

Look at the following example
# encoding: utf-8
import numpy as np
import pandas as pd
import random
import math
from keras import Sequential
from keras.layers import Dense, Activation
from keras.optimizers import Adam, RMSprop
from keras.callbacks import LearningRateScheduler
X = [i*0.05 for i in range(100)]
def step_decay(epoch):
initial_lrate = 1.0
drop = 0.5
epochs_drop = 2.0
lrate = initial_lrate * math.pow(drop,
math.floor((1+epoch)/epochs_drop))
return lrate
def build_model():
model = Sequential()
model.add(Dense(32, input_shape=(1,), activation='relu'))
model.add(Dense(1, activation='linear'))
adam = Adam(lr=0.5)
model.compile(loss='mse', optimizer=adam)
return model
model = build_model()
lrate = LearningRateScheduler(step_decay)
callback_list = [lrate]
for ep in range(20):
X_train = np.array(random.sample(X, 10))
y_train = np.sin(X_train)
X_train = np.reshape(X_train, (-1,1))
y_train = np.reshape(y_train, (-1,1))
model.fit(X_train, y_train, batch_size=2, callbacks=callback_list,
epochs=1, verbose=2)
In this example, the LearningRateSchedule does not change the learning rate at all because in each iteration of ep, epoch=1. Thus the learning rate is just const (1.0, according to step_decay). In fact, instead of setting epoch>1 directly, I have to do outer loop as shown in the example, and insider each loop, I just run 1 epoch. (This is the case when I implement deep reinforcement learning, instead of supervised learning).
My question is how to set an exponentially decay learning rate in my example and how to get the learning rate in each iteration of ep.
You can actually pass two arguments to the LearningRateScheduler.
According to Keras documentation, the scheduler is
a function that takes an epoch index as input (integer, indexed from
0) and current learning rate and returns a new learning rate as output
(float).
So, basically, simply replace your initial_lr with a function parameter, like so:
def step_decay(epoch, lr):
# initial_lrate = 1.0 # no longer needed
drop = 0.5
epochs_drop = 2.0
lrate = lr * math.pow(drop,math.floor((1+epoch)/epochs_drop))
return lrate
The actual function you implement is not exponential decay (as you mention in your title) but a staircase function.
Also, you mention your learning rate does not change inside your loop. That's true because you set model.fit(..., epochs=1,...) and your epochs_drop = 2.0 at the same time. I am not sure this is your desired case or not. You are providing a toy example and it's not clear in that case.
I would like to add the more common case where you don't mix a for loop with fit() and just provide a different epochs parameter in your fit() function. In this case you have the following options:
First of all keras provides a decaying functionality itself with the predefined optimizers. For example in your case Adam() the actual code is:
lr = lr * (1. / (1. + self.decay * K.cast(self.iterations, K.dtype(self.decay))))
which is not exactly exponential either and it's somehow different than tensorflow's one. Also, it's used only when decay > 0.0 as it's obvious.
To follow the tensorflow convention of exponential decay you should implement:
decayed_learning_rate = learning_rate * ^ (global_step / decay_steps)
Depending on your needs you could choose to implement a Callback subclass and define a function within it (see 3rd bullet below) or use LearningRateScheduler which is actually exactly this with some checking: a Callback subclass which updates the learning rate at each epoch end.
If you want a finer handling of your learning rate policy (per batch for example) you would have to implement your subclass since as far as I know there is no implemented subclass for this task. The good part is that it's super easy:
Create a subclass
class LearningRateExponentialDecay(Callback):
and add the __init__() function which will initialize your instance with all needed parameters and also create a global_step variables to keep track of the iterations (batches):
def __init__(self, init_learining_rate, decay_rate, decay_steps):
self.init_learining_rate = init_learining_rate
self.decay_rate = decay_rate
self.decay_steps = decay_steps
self.global_step = 0
Finally, add the actual function inside the class:
def on_batch_begin(self, batch, logs=None):
actual_lr = float(K.get_value(self.model.optimizer.lr))
decayed_learning_rate = actual_lr * self.decay_rate ^ (self.global_step / self.decay_steps)
K.set_value(self.model.optimizer.lr, decayed_learning_rate)
self.global_step += 1
The really cool part is the if you want the above subclass to update every epoch you could use on_epoch_begin(self, epoch, logs=None) which nicely has epoch as parameter to it's signature. This case is even easier as you could skip global step altogether (no need to keep track of it now unless you want a fancier way to apply your decay) and use epoch in it's place.

Pytorch: NN function approximator, 2 in 1 out

[Please be aware of the Edit History below, as the major problem statement has changed.]
We are trying to implement a neural network in pytorch, that approximates a function f(x,y)=z. So there are two real numbers as input and one as ouput, we therefore want 2 nodes in the input layer and one in the output layer. We constructed a test set of 5050 samples and had pretty good results for that task in Keras with Tensorflow backend, with 3 hidden layers with a configuration of the nodes like: 2(in) - 4 - 16 - 4 - 1(out); and ReLU activation functions on all hidden layers, linear on in- and output.
Now in Pytorch we tried to implement a similar network but our loss function still literally explodes: It changes in the first few steps and converges then to some value around 10^7. In Keras we had an error around 10 percent. We already tried different network configurations without any improvement. Maybe someone could have a look on our code and suggest any change?
To explain: tr_data is a list, containing 5050 2*1 numpy arrays which are the inputs for the network. tr_labels is a list, containing 5050 numbers which are the outputs we want to learn. loadData() just load those two lists.
import torch.nn as nn
import torch.nn.functional as F
BATCH_SIZE = 5050
DIM_IN = 2
DIM_HIDDEN_1 = 4
DIM_HIDDEN_2 = 16
DIM_HIDDEN_3 = 4
DIM_OUT = 1
LEARN_RATE = 1e-4
EPOCH_NUM = 500
class Net(nn.Module):
def __init__(self):
#super(Net, self).__init__()
super().__init__()
self.hidden1 = nn.Linear(DIM_IN, DIM_HIDDEN_1)
self.hidden2 = nn.Linear(DIM_HIDDEN_1, DIM_HIDDEN_2)
self.hidden3 = nn.Linear(DIM_HIDDEN_2, DIM_HIDDEN_3)
self.out = nn.Linear(DIM_HIDDEN_3, DIM_OUT)
def forward(self, x):
x = F.relu(self.hidden1(x))
x = F.tanh(self.hidden2(x))
x = F.tanh(self.hidden3(x))
x = self.out(x)
return x
model = Net()
loss_fn = nn.MSELoss(size_average=False)
optimizer = torch.optim.Adam(model.parameters(), lr=LEARN_RATE)
tr_data,tr_labels = loadData()
tr_data_torch = torch.zeros(BATCH_SIZE, DIM_IN)
tr_labels_torch = torch.zeros(BATCH_SIZE, DIM_OUT)
for i in range(BATCH_SIZE):
tr_data_torch[i] = torch.from_numpy(tr_data[i])
tr_labels_torch[i] = tr_labels[i]
for t in range(EPOCH_NUM):
labels_pred = model(tr_data_torch)
loss = loss_fn(labels_pred, tr_labels_torch)
#print(t, loss.item())
optimizer.zero_grad()
loss.backward()
optimizer.step()
I have to say, those are our first steps in Pytorch, so please forgive me if there are some obvious, dumb mistakes. I appreciate any help or hint,
Thank you!
EDIT 1 ------------------------------------------------------------------
Following the comments and answers, we improved our code. The Loss function has now for the first time reasonable values, around 250. Our new class definition looks like:
class Net(nn.Module):
def __init__(self):
super(Net, self).__init__()
#super().__init__()
self.hidden1 = nn.Sequential(nn.Linear(DIM_IN, DIM_HIDDEN_1), nn.ReLU())
self.hidden2 = nn.Sequential(nn.Linear(DIM_HIDDEN_1, DIM_HIDDEN_2), nn.ReLU())
self.hidden3 = nn.Sequential(nn.Linear(DIM_HIDDEN_2, DIM_HIDDEN_3), nn.ReLU())
self.out = nn.Linear(DIM_HIDDEN_3, DIM_OUT)
def forward(self, x):
x = self.hidden1(x)
x = self.hidden2(x)
x = self.hidden3(x)
x = self.out(x)
return x
and the loss function:
loss_fn = nn.MSELoss(size_average=True, reduce=True)
As we stated before, we already had far more satisfying results in keras with tensorflow backend. The loss function was around 30, with a similar network configuration. I share the essential parts(!) of our keras code here:
model = Sequential()
model.add(Dense(4, activation="linear", input_shape=(2,)))
model.add(Dense(16, activation="relu"))
model.add(Dense(4, activation="relu"))
model.add(Dense(1, activation="linear" ))
model.summary()
model.compile ( loss="mean_squared_error", optimizer="adam", metrics=["mse"] )
history=model.fit ( np.array(tr_data), np.array(tr_labels), \
validation_data = ( np.array(val_data), np.array(val_labels) ),
batch_size=50, epochs=200, callbacks = [ cbk ] )
Thank your already for all the help! If anybody still has suggestions to improve the network, we would be happy about it. As somebody already asked for the data, we want to share a pickle file here:
https://mega.nz/#!RDYxSYLY!P4a9mEDtZ7A5Bl7ZRjRk8EzLXQt2gyURa3wN3NCWFPA
together with the code to access it:
import pickle
f=open("data.pcl","rb")
tr_data=pickle.load ( f )
tr_labels=pickle.load ( f )
val_data=pickle.load ( f )
val_labels=pickle.load ( f )
f.close()
It should be interesting for you to point out the differences between torch.nn and torch.nn.functional (see here). Essentially, it might be that your backpropagation graph might be executed not 100% correct due to a different specification.
As pointed out by previous commenters, I would suggest to define your layers including the activations. My personal favorite way is to use nn.Sequential(), which allows you to specify multiple opeations chained together, like so:
self.hidden1 = nn.Sequential(nn.Linear(DIM_IN, DIM_HIDDEN1), nn.ReLU())
and then simply calling self.hidden1 later (without wrapping it in F.relu()).
May I also ask why you do not call the commented super(Net, self).__init__() (which is the generally recommended way)?
Additionally, if that should not fix the problem, can you maybe just share the code for Keras in comparison?

Keras 2.0.0: how to access and modify the training data in the callback function "on_epoch_end()"?

I am using Keras 2.0.0 with Theano.
I would like to update the training data between each epoch. I can do it in a for loop using nb_epochs=1 but it would be much more elegant using the on_epoch_end callback.
Here is my tentative code, based on a Keras 1 example(blog post):
class callback_change_X_train(keras.callbacks.Callback):
def on_epoch_end(self, epoch, logs={}):
X_train = my_function_to_update_X_train(...)
self.model.training_data[0] = X_train
Unfortunately, it seems that self.model.training_data does not exist anymore.
Any help much appreciated!
Just ran into this problem myself and solved it using the model.fit_generator method instead of the standard fit method. You can perform some modification to the data at each epoch this way. I'm not sure if you can get access to the model object itself this way, but for my problem I just needed to shuffle the data along one of the inner axes.
Example code:
def shuffle_gen(x_train, y_train):
while True:
shuffle_inner_axis(x_train, y_train)
for b_i in range(steps_per_epoch):
yield(x_train[b_i * batch_size:(b_i + 1) * batch_size],
y_train[b_i * batch_size:(b_i + 1) * batch_size])
return
model.fit_generator(shuffle_gen(x_train, y_train), steps_per_epoch, n_epochs)