Keras + IndexError - neural-network

I am very new to keras. Trying to build a binary classifier for an NLP task. (My code is motivated from imdb example - https://github.com/fchollet/keras/blob/master/examples/imdb_cnn.py)
Below is my code snippet:
max_features = 30
maxlen = 30
batch_size = 32
embedding_dims = 30
nb_filter = 250
filter_length = 3
hidden_dims = 250
nb_epoch = 3
(Train_X, Train_Y, Test_X, Test_Y) = load_and_split_data()
model = Sequential()
model.add(Embedding(max_features, embedding_dims, input_length=maxlen))
model.add(Convolution1D(nb_filter=nb_filter,filter_length=filter_length,border_mode="valid",activation="relu",subsample_length=1))
model.add(MaxPooling1D(pool_length=2))
model.add(Flatten())
model.add(Dense(hidden_dims))
model.add(Activation('relu'))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop', class_mode="binary")
fitlog = model.fit(Train_X, Train_Y, batch_size=batch_size, nb_epoch=nb_epoch, show_accuracy=True, verbose=2)
When I run model.fit(), I get the following error:
/.virtualenvs/nnet/lib/python2.7/site-packages/theano/compile/function_module.pyc in __call__(self, *args, **kwargs)
857 t0_fn = time.time()
858 try:
--> 859 outputs = self.fn()
860 except Exception:
861 if hasattr(self.fn, 'position_of_error'):
IndexError: One of the index value is out of bound. Error code: 65535.\n
Apply node that caused the error: GpuAdvancedSubtensor1(<CudaNdarrayType(float32, matrix)>, Elemwise{Cast{int64}}.0)
Toposort index: 47
Inputs types: [CudaNdarrayType(float32, matrix), TensorType(int64, vector)]
Inputs shapes: [(30, 30), (3840,)]
Inputs strides: [(30, 1), (8,)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[GpuReshape{3}(GpuAdvancedSubtensor1.0, MakeVector{dtype='int64'}.0)]]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
Can you please help me resolve this ?

You need to Pad the imdb sequences you are using, add those lines:
from keras.preprocessing import sequence
Train_X = sequence.pad_sequences(Train_X, maxlen=maxlen)
Test_X = sequence.pad_sequences(Test_X, maxlen=maxlen)
Before building the actual model.

Related

A bug is encountered with the "Maskable PPO" training with custom Env setup

I encountered an error while using SB3-contrib Maskable PPO action masking algorithm.
File ~\anaconda3\lib\site-packages\sb3_contrib\common\maskable\distributions.py:231, in MaskableMultiCategoricalDistribution.apply_masking(self, masks)
228 masks = th.as_tensor(masks)
230 # Restructure shape to align with logits
--> 231 masks = masks.view(-1, sum(self.action_dims))
233 # Then split columnwise for each discrete action
234 split_masks = th.split(masks, tuple(self.action_dims), dim=1)
RuntimeError: shape '[-1, 1600]' is invalid for input of size 800
I am running learning progamme with an action being a MultiBinary space with 800 selections of 0, 1.
The action space is defined as below:
self.action_space = spaces.MultiBinary(800)
Within the custom environment class, an "action_mask" function was created such that it returns a List of 800 boolean values.
Now, when I follow the document and start to train the model, the error message pops:
from sb3_contrib import MaskablePPO
from Equities_RL_Env import Equities_RL_Env
import time
from sb3_contrib.common.maskable.utils import get_action_masks
models_dir = f"models/V1 31-Jul/"
logdir = f"logs/{time.strftime('%d %b %Y %H-%M',time.localtime())}/"
if not os.path.exists(models_dir):
os.makedirs(models_dir)
if not os.path.exists(logdir):
os.makedirs(logdir)
env = Equities_RL_Env(Normalize_frame(historical_frame), pf)
env.reset()
model = MaskablePPO('MlpPolicy', env, verbose=1, tensorboard_log=logdir)
TIMESTEPS = 1000
iters = 0
while iters <= 1000000:
iters += 1
model.learn(total_timesteps=TIMESTEPS, reset_num_timesteps=False, tb_log_name=f"PPO")
model.save(f"{models_dir}/{TIMESTEPS*iters}")
May I know is there a way to define that shape within the custom environment?

Talos --> TypeError: __init__() got an unexpected keyword argument 'grid_downsample'

I am trying to run a hyperparameters optimization with Talos. As I have a lot of parameters to test, I want to use a 'grid_downsample' argument that will select 30% of all possible hyperparameters combinations. However when I run my code I get: TypeError: __init__() got an unexpected keyword argument 'grid_downsample'
I tested the code below without the 'grid_downsample' option and with less hyperparameters.
#load data
data = pd.read_csv('data.txt', sep="\t", encoding = "latin1")
# split into input (X) and output (y) variables
Y = np.array(data['Y'])
data_bis = data.drop(['Y'], axis = 1)
X = np.array(data_bis)
p = {'activation':['relu'],
'optimizer': ['Nadam'],
'first_hidden_layer': [12],
'second_hidden_layer': [12],
'batch_size': [20],
'epochs': [10,20],
'dropout_rate':[0.0, 0.2]}
def dnn_model(x_train, y_train, x_val, y_val, params):
model = Sequential()
#input layer
model.add(Dense(params['first_hidden_layer'], input_shape=(1024,)))
model.add(Dropout(params['dropout_rate']))
model.add(Activation(params['activation']))
#hidden layer 2
model.add(Dense(params['second_hidden_layer']))
model.add(Dropout(params['dropout_rate']))
model.add(Activation(params['activation']))
# output layer with one node
model.add(Dense(1))
model.add(Activation(params['activation']))
# Compile model
model.compile(loss='binary_crossentropy', optimizer=params['optimizer'], metrics=['accuracy'])
out = model.fit(x_train, y_train,
batch_size=params['batch_size'],
epochs=params['epochs'],
validation_data=[x_val, y_val],
verbose=0)
return out, model
scan_object = ta.Scan(X, Y, model=dnn_model, params=p, experiment_name="test")
reporting = ta.Reporting(scan_object)
report = reporting.data
report.to_csv('./Random_search/dnn/report_talos.txt', sep = '\t')
This code works well. If I change the scan_object as the end to: scan_object = ta.Scan(X, Y, model=dnn_model, grid_downsample=0.3, params=p, experiment_name="test"), it gives me the error: TypeError: __init__() got an unexpected keyword argument 'grid_downsample' while I was expecting to have the same results format as a normal grid search but with less combinations. What am I missing? Did the name of the argument change? I'm using Talos 0.6.3 in a conda environment.
Thank you!
might be too late for you now but they've switched it to fraction_limit. It would give this for you
scan_object = ta.Scan(X, Y, model=dnn_model, params=p, experiment_name="test", fraction_limit = 0.1)
Sadly, the doc isn't well updated
Check out their examples on GitHub:
https://github.com/autonomio/talos/blob/master/examples/Hyperparameter%20Optimization%20with%20Keras%20for%20the%20Iris%20Prediction.ipynb

Callbackfunction modelcheckpoint causes error in keras

I seem to get this error when I am using the callback function modelcheckpoint..
I read from a github issue that the solution would be make use of model.get_weight, but I am implicitly only storing that since i am only storing the one with best weight.
Keras only seem to save weights using h5, which make me question is there any other way to do store them using the eras API, if so how? If not, how do i store it?
Made an example to recreate the problem:
#!/usr/bin/python
import glob, os
import sys
from os import listdir
from os.path import isfile, join
import numpy as np
import warnings
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from keras.utils import np_utils
from keras import metrics
import keras
from keras import backend as K
from keras.models import Sequential
from keras.optimizers import SGD, Adam
from keras.layers.core import Dense, Activation, Lambda, Reshape,Flatten
from keras.layers import Conv1D,Conv2D,MaxPooling2D, MaxPooling1D, Reshape
#from keras.utils.visualize_util import plot
from keras.models import Model
from keras.layers import Input, Dense
from keras.layers.merge import Concatenate, Add
import h5py
import random
import tensorflow as tf
import math
from keras.callbacks import CSVLogger
from keras.callbacks import ModelCheckpoint
if len(sys.argv) < 5:
print "Missing Arguments!"
print "python keras_convolutional_feature_extraction.py <workspace> <totale_frames> <fbank-dim> <window-height> <batch_size>"
print "Example:"
print "python keras_convolutional_feature_extraction.py deltas 15 40 5 100"
sys.exit()
total_frames = int(sys.argv[2])
total_frames_with_deltas = total_frames*3
dim = int(sys.argv[3])
window_height = int(sys.argv[4])
inserted_batch_size = int(sys.argv[5])
stride = 1
splits = ((dim - window_height)+1)/stride
#input_train_data = "/media/carl/E2302E68302E443F/"+str(sys.argv[1])+"/fbank/org_train_total_frames_"+str(total_frames)+"_dim_"+str(dim)+"_winheig_"+str(window_height)+"_batch_"+str(inserted_batch_size)+"_fws_input"
#output_train_data ="/media/carl/E2302E68302E443F/"+str(sys.argv[1])+"/fbank/org_train_total_frames_"+str(total_frames)+"_dim_"+str(dim)+"_winheig_"+str(window_height)+"_batch_"+str(inserted_batch_size)+"_fws_output"
#input_test_data = "/media/carl/E2302E68302E443F/"+str(sys.argv[1])+"/fbank/org_test_total_frames_"+str(total_frames)+"_dim_"+str(dim)+"_winheig_"+str(window_height)+"_batch_"+str(1)+"_fws_input"
#output_test_data = "/media/carl/E2302E68302E443F/"+str(sys.argv[1])+"/fbank/org_test_total_frames_"+str(total_frames)+"_dim_"+str(dim)+"_winheig_"+str(window_height)+"_batch_"+str(1)+"_fws_output"
#train_files =[f for f in listdir(input_train_data) if isfile(join(input_train_data, f))]
#test_files =[f for f in listdir(input_test_data) if isfile(join(input_test_data, f))]
#print len(train_files)
np.random.seed(100)
print "hallo"
def train_generator():
while True:
# input = random.choice(train_files)
# h5f = h5py.File(input_train_data+'/'+input, 'r')
# train_input = h5f['train_input'][:]
# train_output = h5f['train_output'][:]
# h5f.close()
train_input = np.random.randint(100,size=((inserted_batch_size,splits*total_frames_with_deltas,window_height,3)))
train_list_list = []
train_input = train_input.reshape((inserted_batch_size,splits*total_frames_with_deltas,window_height,3))
train_input_list = np.split(train_input,splits*total_frames_with_deltas,axis=1)
for i in range(len(train_input_list)):
train_input_list[i] = train_input_list[i].reshape(inserted_batch_size,window_height,3)
#for i in range(len(train_input_list)):
# train_input_list[i] = train_input_list[i].reshape(inserted_batch_size,33,window_height,1,3)
train_output = np.random.randint(5, size = (1,total_frames,5))
middle = int(math.ceil(total_frames/2))
train_output = train_output[:,middle:middle+1,:].reshape((inserted_batch_size,1,5))
#print train_output.shape
#print len(train_input_list)
#print train_input_list[0].shape
yield (train_input_list, train_output)
print "hallo"
def test_generator():
while True:
# input = random.choice(test_files)
# h5f = h5py.File(input_test_data+'/'+input, 'r')
# test_input = h5f['test_input'][:]
# test_output = h5f['test_output'][:]
# h5f.close()
test_input = np.random.randint(100,size=((inserted_batch_size,splits*total_frames_with_deltas,window_height,3)))
test_input = test_input.reshape((inserted_batch_size,splits*total_frames_with_deltas,window_height,3))
test_input_list = np.split(test_input,splits*total_frames_with_deltas,axis=1)
#test_input_list = np.split(test_input,45,axis=3)
for i in range(len(test_input_list)):
test_input_list[i] = test_input_list[i].reshape(inserted_batch_size,window_height,3)
#for i in range(len(test_input_list)):
# test_input_list[i] = test_input_list[i].reshape(inserted_batch_size,33,window_height,1,3)
test_output = np.random.randint(5, size = (1,total_frames,5))
middle = int(math.ceil(total_frames/2))
test_output = test_output[:,middle:middle+1,:].reshape((inserted_batch_size,1,5))
yield (test_input_list, test_output)
print "hallo"
def fws():
#print "Inside"
# Params:
# batch , lr, decay , momentum, epochs
#
#Input shape: (batch_size,40,45,3)
#output shape: (1,15,50)
# number of unit in conv_feature_map = splitd
next(train_generator())
model_output = []
list_of_input = [Input(shape=(8,3)) for i in range(splits*total_frames_with_deltas)]
output = []
#Conv
skip = total_frames_with_deltas
for steps in range(total_frames_with_deltas):
conv = Conv1D(filters = 100, kernel_size = 8)
column = 0
for _ in range(splits):
#print "column " + str(column) + "steps: " + str(steps)
output.append(conv(list_of_input[(column*skip)+steps]))
column = column + 1
#print len(output)
#print splits*total_frames_with_deltas
conv = []
for section in range(splits):
column = 0
skip = splits
temp = []
for _ in range(total_frames_with_deltas):
temp.append(output[((column*skip)+section)])
column = column + 1
conv.append(Add()(temp))
#print len(conv)
output_conc = Concatenate()(conv)
#print output_conc.get_shape
output_conv = Reshape((splits, -1))(output_conc)
#print output_conv.get_shape
#Pool
pooled = MaxPooling1D(pool_size = 6, strides = 2)(output_conv)
reshape = Reshape((1,-1))(pooled)
#Fc
dense1 = Dense(units = 1024, activation = 'relu', name = "dense_1")(reshape)
#dense2 = Dense(units = 1024, activation = 'relu', name = "dense_2")(dense1)
dense3 = Dense(units = 1024, activation = 'relu', name = "dense_3")(dense1)
final = Dense(units = 5, activation = 'relu', name = "final")(dense3)
model = Model(inputs = list_of_input , outputs = final)
sgd = SGD(lr=0.1, decay=1e-1, momentum=0.9, nesterov=True)
model.compile(loss="categorical_crossentropy", optimizer=sgd , metrics = ['accuracy'])
print "compiled"
model_yaml = model.to_yaml()
with open("model.yaml", "w") as yaml_file:
yaml_file.write(model_yaml)
print "Model saved!"
log= CSVLogger('/home/carl/kaldi-trunk/dnn/experimental/yesno_cnn_50_training_total_frames_'+str(total_frames)+"_dim_"+str(dim)+"_window_height_"+str(window_height)+".csv")
filepath='yesno_cnn_50_training_total_frames_'+str(total_frames)+"_dim_"+str(dim)+"_window_height_"+str(window_height)+"weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_weights_only=True, mode='max')
print "log"
#plot_model(model, to_file='model.png')
print "Fit"
hist_current = model.fit_generator(train_generator(),
steps_per_epoch=444,#len(train_files),
epochs = 10000,
verbose = 1,
validation_data = test_generator(),
validation_steps=44,#len(test_files),
pickle_safe = True,
workers = 4,
callbacks = [log,checkpoint])
fws()
Execute the script by: python name_of_script.py yens 50 40 8 1
which give me a full traceback:
full traceback
Error:
carl#ca-ThinkPad-T420s:~/Dropbox$ python mini.py yesno 50 40 8 1
Using TensorFlow backend.
Couldn't import dot_parser, loading of dot files will not be possible.
hallo
hallo
hallo
compiled
Model saved!
log
Fit
/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py:2252: UserWarning: Expected no kwargs, you passed 1
kwargs passed to function are ignored with Tensorflow backend
warnings.warn('\n'.join(msg))
Epoch 1/10000
2017-05-26 13:01:45.851125: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-26 13:01:45.851345: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-26 13:01:45.851392: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
443/444 [============================>.] - ETA: 4s - loss: 100.1266 - acc: 0.3138Epoch 00000: saving model to yesno_cnn_50_training_total_frames_50_dim_40_window_height_8weights-improvement-00-0.48.hdf5
Traceback (most recent call last):
File "mini.py", line 205, in <module>
File "mini.py", line 203, in fws
File "/usr/local/lib/python2.7/dist-packages/keras/legacy/interfaces.py", line 88, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1933, in fit_generator
callbacks.on_epoch_end(epoch, epoch_logs)
File "/usr/local/lib/python2.7/dist-packages/keras/callbacks.py", line 77, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "/usr/local/lib/python2.7/dist-packages/keras/callbacks.py", line 411, in on_epoch_end
self.model.save_weights(filepath, overwrite=True)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 2503, in save_weights
save_weights_to_hdf5_group(f, self.layers)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 2746, in save_weights_to_hdf5_group
f.attrs['layer_names'] = [layer.name.encode('utf8') for layer in layers]
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-4rPeHA-build/h5py/_objects.c:2684)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-4rPeHA-build/h5py/_objects.c:2642)
File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/attrs.py", line 93, in __setitem__
self.create(name, data=value, dtype=base.guess_dtype(value))
File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/attrs.py", line 183, in create
attr = h5a.create(self._id, self._e(tempname), htype, space)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-4rPeHA-build/h5py/_objects.c:2684)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-4rPeHA-build/h5py/_objects.c:2642)
File "h5py/h5a.pyx", line 47, in h5py.h5a.create (/tmp/pip-4rPeHA-build/h5py/h5a.c:1904)
RuntimeError: Unable to create attribute (Object header message is too large)
If you look at the amount of data Keras is trying to save under layer_names attribute (inside the output HDF5 file being create), you will find that it takes more than 64kB.
np.asarray([layer.name.encode('utf8') for layer in model.layers]).nbytes
>> 77100
I quote from https://support.hdfgroup.org/HDF5/faq/limits.html:
Is there an object header limit and how does that affect HDF5 ?
There is a limit (in HDF5-1.8) of the object header, which is 64 KB.
The datatype for a dataset is stored in the object header, so there is
therefore a limit on the size of the datatype that you can have. (See
HDFFV-1089)
The code above was (almost entirely) copied from the traceback:
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 2746, in save_weights_to_hdf5_group
f.attrs['layer_names'] = [layer.name.encode('utf8') for layer in layers]
I am using numpy asarray method to get the figure fast but h5py gets similar figure (I guess), see https://github.com/h5py/h5py/blob/master/h5py/_hl/attrs.py#L102 if you want to find exact figure.
Anyway, either you will need to implement your own methods for saving/loading of the weights (or use existing workarounds), or you need to give a really short name to ALL the layers inside your model :), something like this:
list_of_input = [Input(shape=(8,3), name=('i%x' % i)) for i in range(splits*total_frames_with_deltas)]
conv = Conv1D(filters = 100, kernel_size = 8, name='cv%x' % steps)
conv.append(Add(name='add%x' % section)(temp))
output_conc = Concatenate(name='ct')(conv)
output_conv = Reshape((splits, -1), name='rs1')(output_conc)
pooled = MaxPooling1D(pool_size = 6, strides = 2, name='pl')(output_conv)
reshape = Reshape((1,-1), name='rs2')(pooled)
dense1 = Dense(units = 1024, activation = 'relu', name = "d1")(reshape)
dense2 = Dense(units
= 1024, activation = 'relu', name = "d2")(dense1)
dense3 = Dense(units = 1024, activation = 'relu', name = "d3")(dense1)
final = Dense(units = 5, activation = 'relu', name = "fl")(dense3)
You mustn't forget to name all the layers because the (numpy) string array into which the layer names are converted is using the size of the longest string for each individual string in it when it is saved!
After renaming the layers as proposed above (which takes almost 26kB) the model is saved successfully. Hope this elaborate answer helps someone.
Update: I have just made a PR to Keras which should fix the issue without implementing any custom loading/saving methods, see 7508
A simple solution, albeit possibly not the most elegant, could be to run a while loop with epochs = 1.
Get the weights at the end of every epoch together with the accuracy and the loss
Save the weights to file 1 with model.get_weight
if accuracy is greater than at the previous epoch (i.e. loop), store the weights to a different file (file 2)
Run the loop again loading the weights from file 1
Break the loops setting a manual early stopping so that it breaks if the loss does not improve for a certain number of loops
You can use get_weights() together with numpy.save.
It's not the best solution, because it will save several files, but it actually works.
The problem is that you won't have the "optimizer" saved with the current states. But you can perhaps work around that by using smaller learning rates after loading.
Custom callback using numpy.save:
def myCallback(epoch,logs):
global storedLoss
#do your comparisons here using the "logs" var.
print(logs)
if (logs['loss'] < storedLoss):
storedLoss = logs['loss']
for i in range(len(model.layers)):
WandB = model.layers[i].get_weights()
if len (WandB) > 0: #necessary because some layers have no weights
np.save("W" + "-" + str(i), WandB[0],False)
np.save("B" + "-" + str(i), WandB[1],False)
#remember that get and set weights use a list: [weights,biases]
#it may happen (not sure) that there is no bias, and thus you may have to check it (len(WandB)==1).
The logs var brings a dictionary with named metrics, such as "loss", and "accuracy", if you used it.
You can store the losses within the callback in a global var, and compare if each loss is better or worse than the last.
When fitting, use the lambda callback:
from keras.callbacks import LambdaCallback
model.fit(...,callbacks=[LambdaCallback(on_epoch_end=myCallback)])
In the example above, I used the LambdaCallback, which has more possibilities than just on_epoch_end.
For loading, do a similar loop:
#you have to create the model first and then set the layers
def loadModel(model):
for i in range(len(model.layers)):
WandBForCheck = model.layers[i].get_weights()
if len (WandBForCheck) > 0: #necessary because some layers have no weights
W = np.load(Wfile + str(i))
B = np.load(Bfile + str(i))
model.layers[i].set_weights([W,B])
See follow-up at https://github.com/fchollet/keras/issues/6766 and https://github.com/farizrahman4u/keras-contrib/pull/90.
I saw the YAML and the root cause is probably that you have so many Inputs. A few Inputs with many dimensions is preferred to many Inputs, especially if you can use scanning and batch operations to do everything efficiently.
Now, ignoring that entirely, here is how you can save and load your model if it has too much stuff to save as JSON efficiently:
You can pass save_weights_only=True. That won't save optimizer weights, so isn't a great solution.
Just put together a PR for saving model weights and optimizer weights but not configuration. When you want to load, first instantiate and compile the model as you did when you were going to train it, then use load_all_weights to load the model and optimizer weights into that model. I'll try to merge it soon so you can use it from the master branch.
You could use it something like this:
from keras.callbacks import LambdaCallback
from keras_contrib.utils.save_load_utils import save_all_weights, load_all_weights
# do some stuff to create and compile model
# use `save_all_weights` as a callback to checkpoint your model and optimizer weights
model.fit(..., callbacks=[LambdaCallback(on_epoch_end=lambda epoch, logs: save_all_weights(model, "checkpoint-{:05d}.h5".format(epoch))])
# use `load_all_weights` to load model and optimizer weights into an existing model
# if not compiled (no `model.optimizer`), this will just load model weights
load_all_weights(model, 'checkpoint-1337.h5')
So I don't endorse the model, but if you want to get it to save and load anyways this should probably work for you.
As a side note, if you want to save weights in a different format, something like this would work.
pickle.dump([K.get_value(w) for w in model.weights], open( "save.p", "wb" ) )
Cheers
Your model architecture must be too large to be saved.
USE get_weights AND set_weights TO SAVE AND LOAD MODEL, RESPECTIVELY.
Do not use callback model checkpoint. just once the training ends, save its weights with pickle.
Have a look at this link: Unable to save DataFrame to HDF5 ("object header message is too large")

Tensorflow: Cannot interpret feed_dict key as Tensor

I am trying to build a neural network model with one hidden layer (1024 nodes). The hidden layer is nothing but a relu unit. I am also processing the input data in batches of 128.
The inputs are images of size 28 * 28. In the following code I get the error in line
_, c = sess.run([optimizer, loss], feed_dict={x: batch_x, y: batch_y})
Error: TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("Placeholder_64:0", shape=(128, 784), dtype=float32) is not an element of this graph.
Here is the code I have written
#Initialize
batch_size = 128
layer1_input = 28 * 28
hidden_layer1 = 1024
num_labels = 10
num_steps = 3001
#Create neural network model
def create_model(inp, w, b):
layer1 = tf.add(tf.matmul(inp, w['w1']), b['b1'])
layer1 = tf.nn.relu(layer1)
layer2 = tf.matmul(layer1, w['w2']) + b['b2']
return layer2
#Initialize variables
x = tf.placeholder(tf.float32, shape=(batch_size, layer1_input))
y = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
w = {
'w1': tf.Variable(tf.random_normal([layer1_input, hidden_layer1])),
'w2': tf.Variable(tf.random_normal([hidden_layer1, num_labels]))
}
b = {
'b1': tf.Variable(tf.zeros([hidden_layer1])),
'b2': tf.Variable(tf.zeros([num_labels]))
}
init = tf.initialize_all_variables()
train_prediction = tf.nn.softmax(model)
tf_valid_dataset = tf.constant(valid_dataset)
tf_test_dataset = tf.constant(test_dataset)
model = create_model(x, w, b)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(model, y))
optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
#Process
with tf.Session(graph=graph1) as sess:
tf.initialize_all_variables().run()
total_batch = int(train_dataset.shape[0] / batch_size)
for epoch in range(num_steps):
loss = 0
for i in range(total_batch):
batch_x, batch_y = train_dataset[epoch * batch_size:(epoch+1) * batch_size, :], train_labels[epoch * batch_size:(epoch+1) * batch_size,:]
_, c = sess.run([optimizer, loss], feed_dict={x: batch_x, y: batch_y})
loss = loss + c
loss = loss / total_batch
if epoch % 500 == 0:
print ("Epoch :", epoch, ". cost = {:.9f}".format(avg_cost))
print("Minibatch accuracy: %.1f%%" % accuracy(predictions, batch_labels))
valid_prediction = tf.run(tf_valid_dataset, {x: tf_valid_dataset})
print("Validation accuracy: %.1f%%" % accuracy(valid_prediction.eval(), valid_labels))
test_prediction = tf.run(tf_test_dataset, {x: tf_test_dataset})
print("TEST accuracy: %.1f%%" % accuracy(test_prediction.eval(), test_labels))
This worked for me
from keras import backend as K
and after predicting my data i inserted this part of code
then i had again loaded the model.
K.clear_session()
i faced this problem in production server,
but in my pc it was running fine
...........
from keras import backend as K
#Before prediction
K.clear_session()
#After prediction
K.clear_session()
Variable x is not in the same graph as model, try to define all of these in the same graph scope. For example,
# define a graph
graph1 = tf.Graph()
with graph1.as_default():
# placeholder
x = tf.placeholder(...)
y = tf.placeholder(...)
# create model
model = create(x, w, b)
with tf.Session(graph=graph1) as sess:
# initialize all the variables
sess.run(init)
# then feed_dict
# ......
If you use django server, just runserver with --nothreading
for example:
python manage.py runserver --nothreading
I had the same issue with flask. adding --without-threads flag to flask run or threaded=False to app.run() fixed it
In my case, I was using loop while calling in CNN multiple times, I fixed my problem by doing the following:
# Declare this as global:
global graph
graph = tf.get_default_graph()
# Then just before you call in your model, use this
with graph.as_default():
# call you models here
Note: In my case too, the app ran fine for the first time and then gave the error above. Using the above fix solved the problem.
Hope that helps.
The error message TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("...", dtype=dtype) is not an element of this graph can also arise in case you run a session outside of the scope of its with statement. Consider:
with tf.Session() as sess:
sess.run(logits, feed_dict=feed_dict)
sess.run(logits, feed_dict=feed_dict)
If logits and feed_dict are defined properly, the first sess.run command will execute normally, but the second will raise the mentioned error.
You can also experience this while working on notebooks hosted on online learning platforms like Coursera. So, implementing following code could help get over with the issue.
Implement this at the topmost block of Notebook file:
from keras import backend as K
K.clear_session()
Similar to #javan-peymanfard and #hmadali-shafiee, I ran into this issue when loading the model in an API. I was using FastAPI with uvicorn. To fix the issue I just set the API function definitions to async similar to this:
#app.post('/endpoint_name')
async def endpoint_function():
# Do stuff here, including possibly (re)loading the model

Keras: What is the correct data format for recurrent networks?

I am trying to build a recurrent network which classifies sequences (multidimensional data streams). I must be missing something, since while running my code:
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Activation
import numpy as np
ils = 10 # input layer size
ilt = 11 # input layer time steps
hls = 12 # hidden layer size
nhl = 2 # number of hidden layers
ols = 1 # output layer size
p = 0.2 # dropout probability
f_a = 'relu' # activation function
opt = 'rmsprop' # optimizing function
#
# Building the model
#
model = Sequential()
# The input layer
model.add(LSTM(hls, input_shape=(ilt, ils), return_sequences=True))
model.add(Activation(f_a))
model.add(Dropout(p))
# Hidden layers
for i in range(nhl - 1):
model.add(LSTM(hls, return_sequences=True))
model.add(Activation(f_a))
model.add(Dropout(p))
# Output layer
model.add(LSTM(ols, return_sequences=False))
model.add(Activation('softmax'))
model.compile(optimizer=opt, loss='binary_crossentropy')
#
# Making test data and fitting the model
#
m_train, n_class = 1000, 2
data = np.array(np.random.random((m_train, ilt, ils)))
labels = np.random.randint(n_class, size=(m_train, 1))
model.fit(data, labels, nb_epoch=10, batch_size=32)
I get output (truncated):
Using Theano backend.
line 611, in __call__
node = self.make_node(*inputs, **kwargs)
File "/home/koala/.local/lib/python2.7/site-packages/theano/scan_module/scan_op.py", line 430, in make_node
new_inputs.append(format(outer_seq, as_var=inner_seq))
File "/home/koala/.local/lib/python2.7/site-packages/theano/scan_module/scan_op.py", line 422, in format
rval = tmp.filter_variable(rval)
File "/home/koala/.local/lib/python2.7/site-packages/theano/tensor/type.py", line 233, in filter_variable
self=self))
TypeError: Cannot convert Type TensorType(float32, 3D) (of Variable Subtensor{:int64:}.0) into Type TensorType(float32, (False, False, True)). You can try to manually convert Subtensor{:int64:}.0 into a TensorType(float32, (False, False, True)).
Is this a problem with the data format at all.
For me the problem was fixed when I went and tried it on my real dataset. The difference being that in the real dataset I have more than 1 label. So an example of dataset on which this code works is:
(...)
ols = 2 # Output layer size
(...)
m_train, n_class = 1000, ols
data = np.array(np.random.random((m_train, ilt, ils)))
labels = np.random.randint(n_class, size=(m_train, 1))
# Make labels onehot
onehot_labels = np.zeros(shape=(labels.shape[0], ols))
onehot_labels[np.arange(labels.shape[0]), labels.astype(np.int)] = 1