Initialize GPFlow model with empty X and Y

I'm using GPFlow for multidimensional regression and want to compare various kernels, starting with an empty X and Y set. But it seems that the library needs a set containing value pairs. I thought about initializing with a point far away from my input space, but that point would then be included when optimizing my hyperparameters.
Is there any solution I'm missing, or a workaround?
Thanks for your help!
This is some standard code to initialize my model:
import gpflow
import numpy as np

k = gpflow.kernels.RBF(input_dim=1, lengthscales=1, variance=1)
x_sample = np.array([])
y_sample = np.array([])
model = gpflow.models.GPR(x_sample, y_sample, kern=k)
which leads to the following error:
IndexError: tuple index out of range
And the following snippet leads to:
model = gpflow.models.GPR(kern=k)
TypeError: __init__() missing 2 required positional arguments: 'X' and 'Y'
It would be great if someone had an idea of how I could initialize my model with an empty set.

The library can deal with an empty X and Y set, but you have to respect the required shapes: both X and Y need to have ndim == 2. With x_sample = np.array([]), you get x_sample.shape == (0,) and x_sample.ndim == 1. Instead, set x_sample = np.empty((0, 1)) (and likewise y_sample = np.empty((0, 1))); then ndim == 2 and the shape is (0, 1), i.e. zero rows and one column each, matching the kernel's input_dim=1 and a single output dimension.
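Applied to the snippet from the question (assuming GPflow 1.x, where gpflow.models.GPR takes X, Y and kern), a minimal sketch of the empty-data initialization could look like this:
import gpflow
import numpy as np

k = gpflow.kernels.RBF(input_dim=1, lengthscales=1, variance=1)
# zero rows, one column each: ndim == 2, as GPR expects
x_sample = np.empty((0, 1))
y_sample = np.empty((0, 1))
model = gpflow.models.GPR(x_sample, y_sample, kern=k)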
(Obviously with no data it does not make sense to optimize your hyperparameters, and there isn't anything you can do with the model, really; if you want to just compare kernels you don't need to construct a model to compute the kernel matrices... but that depends more specifically on what you want to achieve!)
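If the goal is only to compare kernels on some candidate inputs, you can evaluate the kernel matrices directly without building a GPR model at all. In GPflow 1.x the kernel objects expose compute_K_symm / compute_K for this (an assumption about your version; the method names may differ in other releases). A small sketch:
import gpflow
import numpy as np

X = np.linspace(0, 1, 5).reshape(-1, 1)   # 5 test inputs, shape (5, 1)
k_rbf = gpflow.kernels.RBF(input_dim=1)
k_mat = gpflow.kernels.Matern32(input_dim=1)

K_rbf = k_rbf.compute_K_symm(X)   # 5x5 kernel matrix under the RBF kernel
K_mat = k_mat.compute_K_symm(X)   # same inputs under a Matern-3/2 kernel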

Related

Convergence when utilizing scipy.odr module to find best-fit parameters when there is only horizontal errorbars

I am trying to fit a piecewise (otherwise linear) function to a set of experimental data. The form of the data is such that there are only horizontal error bars and no vertical error bars. I am familiar with the scipy.optimize.curve_fit module, but that works only when the error bars correspond to the dependent variable y. After searching for my specific need, I came across the following post, which explains the possibility of using the scipy.odr module when the error bars are those of the independent variable x. (Correct fitting with scipy curve_fit including errors in x?)
Attached is my version of the code, which tries to find the best-fit parameters using the ODR methodology. It actually draws a best-fit function and seems to be working. However, after changing the initial (educated guess) values and trying to extract the best-fit parameters, I get back the same guessed parameters I inserted initially. This means that the method is not converging, which you can verify by printing output.stopreason and getting
['Numerical error detected']
So, my question is whether this methodology is consistent with my function being piecewise, and if not, whether there is another, correct methodology to adopt in such cases?
from numpy import *
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator
from scipy.odr import ODR, Model, Data, RealData
x_array=array([8.2,8.6,9.,9.4,9.8,10.2,10.6,11.,11.4,11.8])
x_err_array=array([0.2]*10)
y_array=array([-2.05179545,-1.64998354,-1.49136169,-0.94200805,-0.60205999,0.,0.,0.,0.,0.])
y_err_array=array([0]*10)
# Piecewise linear fitting model: slope beta[1] up to the breakpoint beta[0], then flat at 0
def func(beta, x):
    return piecewise(x, [x < beta[0]], [lambda x: beta[1]*x - beta[1]*beta[0], lambda x: 0.0])
data = RealData(x_array, y_array, x_err_array, y_err_array)
model = Model(func)
odr = ODR(data, model, [10.1,1.02])
odr.set_job(fit_type=0)
output = odr.run()
f, (ax1) = plt.subplots(1, sharex=True, sharey=True, figsize=(10,10))
ax1.errorbar(x_array, y_array, xerr = x_err_array, yerr = y_err_array, ecolor = 'blue', elinewidth = 3, capsize = 3, linestyle = '')
ax1.plot(x_array, func(output.beta, x_array), 'blue', linestyle = 'dotted', label='Best-Fit')
ax1.legend(loc='lower right', ncol=1, fontsize=12)
ax1.set_xlim([7.95, 12.05])
ax1.set_ylim([-2.1, 0.1])
ax1.yaxis.set_major_locator(MaxNLocator(prune='upper'))
ax1.set_ylabel('$y$', fontsize=12)
ax1.set_xlabel('$x$', fontsize=12)
ax1.set_xscale("linear", nonposx='clip')
ax1.set_yscale("linear", nonposy='clip')
ax1.get_xaxis().tick_bottom()
ax1.get_yaxis().tick_left()
f.subplots_adjust(top=0.98,bottom=0.14,left=0.14,right=0.98)
plt.setp([a.get_xticklabels() for a in f.axes[:-1]], visible=True)
plt.show()
An error of 0 for y is causing problems. Make it small but not zero, e.g. 1e-16; with that change the fit converges. It also converges if you omit y_err_array when defining RealData, but I am not sure what happens internally in that case.
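A minimal sketch of that change against the code above (keeping everything else the same; 1e-16 is just an arbitrarily small value):
# Replace the zero y-errors with a tiny positive value so the ODR weighting stays finite
y_err_array = array([1e-16]*10)
data = RealData(x_array, y_array, x_err_array, y_err_array)

# Alternatively, omit the y-errors entirely:
# data = RealData(x_array, y_array, sx=x_err_array)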

Matlab: Change variable resolution and names for viewing regression trees

Using treeMine = fitctree(....) I can generate a decision tree, but the tree is very big and therefore conveys its information poorly when displayed with view(treeMine,'Mode','Graph').
My question is therefore whether it is possible to change the variable names x1-x9 to other names to make the tree human-readable, and whether I can force the numbers to be displayed in engineering notation, e.g. 10e3.
Does anybody know how this can be done?
Minimal Example
A minimal example uses MATLAB's own car data:
load carsmall
idxNaN = isnan(MPG + Weight);
X = Weight(~idxNaN);
Y = MPG(~idxNaN);
n = numel(X);
rng(1) % For reproducibility
idxTrn = false(n,1);
idxTrn(randsample(n,round(0.5*n))) = true; % Training set logical indices
idxVal = idxTrn == false; % Validation set logical indices
Mdl = fitrtree(X(idxTrn),Y(idxTrn));
view(Mdl,'Mode','graph')
How do you then specify the value resolution and the variable names?
About the names: it's a bit of a poor example because you use only one predictor (weight), but you can change the name with the 'PredictorNames' name-value pair, e.g.
Mdl = fitrtree(X(idxTrn),Y(idxTrn),'PredictorNames',{'weight'});
If you were to use more predictors you just have to add more elements to the cell array, e.g.
'PredictorNames',{'weight','age','women'}
I don't know about the number formatting, though.

How do I actually execute a saved TensorFlow model?

Tensorflow newbie here. I'm trying to build an RNN. My input data is a set of vector instances of size instance_size representing the (x,y) positions of a set of particles at each time step. (Since the instances already have semantic content, they do not require an embedding.) The goal is to learn to predict the positions of the particles at the next step.
Following the RNN tutorial and slightly adapting the included RNN code, I create a model more or less like this (omitting some details):
inputs = self._input_data = tf.placeholder(tf.float32, [batch_size, num_steps, instance_size])
self._targets = tf.placeholder(tf.float32, [batch_size, num_steps, instance_size])
with tf.variable_scope("lstm_cell", reuse=True):
    lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(hidden_size, forget_bias=0.0)
    if is_training and config.keep_prob < 1:
        lstm_cell = tf.nn.rnn_cell.DropoutWrapper(
            lstm_cell, output_keep_prob=config.keep_prob)
cell = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * config.num_layers)
self._initial_state = cell.zero_state(batch_size, tf.float32)
from tensorflow.models.rnn import rnn
inputs = [tf.squeeze(input_, [1])
          for input_ in tf.split(1, num_steps, inputs)]
outputs, state = rnn.rnn(cell, inputs, initial_state=self._initial_state)
output = tf.reshape(tf.concat(1, outputs), [-1, hidden_size])
softmax_w = tf.get_variable("softmax_w", [hidden_size, instance_size])
softmax_b = tf.get_variable("softmax_b", [instance_size])
logits = tf.matmul(output, softmax_w) + softmax_b
loss = position_squared_error_loss(
    tf.reshape(logits, [-1]),
    tf.reshape(self._targets, [-1]),
)
self._cost = cost = tf.reduce_sum(loss) / batch_size
self._final_state = state
Then I create a saver = tf.train.Saver(), iterate over the data to train it using the given run_epoch() method, and write out the parameters with saver.save(). So far, so good.
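Roughly like this (a sketch; run_epoch, the model object m, max_epochs and the data come from the tutorial code, and the checkpoint path is just an example):
saver = tf.train.Saver()
for epoch in range(max_epochs):
    run_epoch(session, m, train_data, m.train_op)   # training loop from the tutorial
saver.save(session, "/tmp/particle_rnn.ckpt")       # write the trained variables to disk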
But how do I actually use the trained model? The tutorial stops at this point. From the docs on tf.train.Saver.restore(), in order to read back in the variables, I need to either set up exactly the same graph I was running when I saved the variables out, or selectively restore particular variables. Either way, that means my new model will require inputs of size batch_size x num_steps x instance_size. However, all I want now is to do a single forward pass through the model on an input of size num_steps x instance_size and read out a single instance_size-sized result (the prediction for the next time step); in other words, I want to create a model that accepts a different-size tensor than the one I trained on. I can kludge it by passing the existing model my intended data batch_size times, but that doesn't seem like a best practice. What's the best way to do this?
You have to create a new graph that has the same structure but with batch_size = 1, and import the saved variables with tf.train.Saver.restore(). You can take a look at how they define multiple models with variable batch size in ptb_word_lm.py: https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/rnn/ptb/ptb_word_lm.py
So you can have a separate file, for instance, where you instantiate the graph with the batch_size you want and then restore the saved variables. Then you can execute your graph.
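A rough sketch of that idea against the 0.x-era API used in the question (PredictionModel stands in for your own model-building class, i.e. the code from the question wrapped so that batch_size can be passed in; the checkpoint path and attribute names are placeholders):
import tensorflow as tf

with tf.Graph().as_default(), tf.Session() as sess:
    # Rebuild the same graph structure, but sized for a single sequence
    with tf.variable_scope("model", reuse=None):
        m = PredictionModel(is_training=False, batch_size=1, num_steps=num_steps)

    # Load the trained weights into the freshly built batch_size=1 graph
    saver = tf.train.Saver()
    saver.restore(sess, "/tmp/particle_rnn.ckpt")

    # One forward pass: feed a single (1, num_steps, instance_size) input
    prediction = sess.run(m.logits,
                          feed_dict={m.input_data: my_input[None, ...]})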

trying to understand scipy's optimize.minimize function, getting indexerror

I'm trying to write code that will optimize a multivariate function using scipy's optimize.minimize function, but it keeps returning an IndexError, and I'm not sure where to go from here.
The code is this:
import numpy as np
from scipy import optimize

revcoeff = coefficients[::-1]
xdot = np.zeros(0)
normfeat1 = normfeat1.reshape(-1,1)
xdot = np.append(normfeat1, normfeat2.reshape(-1,1), axis=1)
a = revcoeff[1:3]
b = xdot[0, :]
seed = np.zeros(5) #does seed need to be the coefficients? not sure
fun = lambda x: np.multiply((1/666), np.power(np.sum(np.dot(a, xdot[x, :])-medianv[x]),2)) #costfunction
optsol = optimize.minimize(fun, seed)
where there are two features I'm using in my nearest neighbors algorithm. Coefficients for the fitted regression model are given into the array "coefficients".
What I'm having trouble understanding is 1) why my code is throwing a "IndexError: arrays used as indicies must be of integer or boolean type"....and also partially I'm confused by the optimize.minimize function itself. It takes in two input values, the function and x0 (an ndarray with initial guesses). What should x0 be, the coefficients values? Or do I pick random values, and how many are necessary?
np.zeros() does not return integers by default. Try, for example, np.zeros(5, dtype=int) instead. It won't solve all of the problems with your code, though. You'll see some other error message.
Also, notice, that 1/666 returns 0 instead of 0.00150150. You probably want 1/666.0.
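To illustrate both points in isolation (a standalone sketch, not tied to the variables from the question):
import numpy as np

idx = np.zeros(5)              # dtype is float64 by default
a = np.arange(10)
# a[idx]                       # IndexError: arrays used as indices must be of integer (or boolean) type
idx = np.zeros(5, dtype=int)   # integer indices work
print(a[idx])                  # [0 0 0 0 0]

print(1 / 666)                 # 0 under Python 2 integer division
print(1 / 666.0)               # ~0.0015015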
It would be helpful if you could clean up your code as half of it is of no use.

MatLab BayesNetToolbox parameter learning

My question is specific to the "learn_params()" function of the BayesNetToolbox in MatLab. In the user manual, "learn_params()" is stated to be suitable for use only if the input data is fully observed. I have tried it with a partially observed dataset where I represented unobserved values as NaN's.
It seems like "learn_params()" can deal with NaNs and with node state combinations that do not occur in the dataset. When I apply Dirichlet priors to smooth the zero values, I get 'sensible' MLE distributions for all nodes. I have copied the script where I do this below.
Can someone clarify whether what I am doing makes sense, or whether I am missing something, i.e. the reason why "learn_params()" cannot be used with partially observed data?
The MatLab Script where I test this is here:
% Incomplete dataset (where NaN's are unobserved)
Age = [1,2,2,NaN,3,3,2,1,NaN,2,1,1,3,NaN,2,2,1,NaN,3,1];
TNMStage = [2,4,2,3,NaN,1,NaN,3,1,4,3,NaN,2,4,3,4,1,NaN,2,4];
Treatment = [2,3,3,NaN,2,NaN,4,4,3,3,NaN,2,NaN,NaN,4,2,NaN,3,NaN,4];
Survival = [1,2,1,2,2,1,1,1,1,2,2,1,2,2,1,2,1,2,2,1];
matrixdata = [Age;TNMStage;Treatment;Survival];
node_sizes =[3,4,4,2];
% Enter the variablesmap
keys = {'Age', 'TNM','Treatment', 'Survival'};
v= 1:1:length(keys);
VariablesMap = containers.Map(keys,v);
% create the dag and the bnet
N = length(node_sizes); % Instead of entering it manually
dag2 = zeros(N,N);
dag2(VariablesMap('Treatment'),VariablesMap('Survival')) = 1;
bnet23 = mk_bnet(dag2, node_sizes);
draw_graph(bnet23.dag);
dirichletweight=1;
% define the CPD priors you want to use
bnet23.CPD{VariablesMap('Age')} = tabular_CPD(bnet23, VariablesMap('Age'), 'prior_type', 'dirichlet','dirichlet_type', 'unif', 'dirichlet_weight', dirichletweight);
bnet23.CPD{VariablesMap('TNM')} = tabular_CPD(bnet23, VariablesMap('TNM'), 'prior_type', 'dirichlet','dirichlet_type', 'unif', 'dirichlet_weight', dirichletweight);
bnet23.CPD{VariablesMap('Treatment')} = tabular_CPD(bnet23, VariablesMap('Treatment'), 'prior_type', 'dirichlet','dirichlet_type', 'unif','dirichlet_weight', dirichletweight);
bnet23.CPD{VariablesMap('Survival')} = tabular_CPD(bnet23, VariablesMap('Survival'), 'prior_type', 'dirichlet','dirichlet_type', 'unif','dirichlet_weight', dirichletweight);
% Find MLEs from incomplete data with Dirichlet prior CPDs
bnet24 = learn_params(bnet23, matrixdata);
% Look at the new CPT values after parameter estimation has been carried out
CPT24 = cell(1,N);
for i=1:N
    s = struct(bnet24.CPD{i}); % violate object privacy
    CPT24{i} = s.CPT;
end
According to my understanding of the BNT documentation, you need to make a couple of changes:
Missing values should be represented as empty cells instead of NaN values.
The learn_params_em function is the only one that supports missing values.
My previous response was incorrect, as I mis-recalled which of the BNT learning functions had support for missing values.