Keras IndexError: indices are out-of-bounds - neural-network

I'm new to Keras and I'm trying to train a binary MLP on a dataset, but I keep getting an indices out-of-bounds error with no idea why.
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD
model = Sequential()
model.add(Dense(64, input_dim=20, init='uniform', activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop')
model.fit(trainx, trainy, nb_epoch=20, batch_size=16) # THROWS INDICES ERROR
Error:
model.fit(trainx, trainy, nb_epoch=20, batch_size=16)
Epoch 1/20
Traceback (most recent call last):
  File "<ipython-input-6-c81bd7606eb0>", line 1, in <module>
    model.fit(trainx, trainy, nb_epoch=20, batch_size=16)
  File "C:\Users\Thiru\Anaconda3\lib\site-packages\keras\models.py", line 646, in fit
    shuffle=shuffle, metrics=metrics)
  File "C:\Users\Thiru\Anaconda3\lib\site-packages\keras\models.py", line 271, in _fit
    ins_batch = slice_X(ins, batch_ids)
  File "C:\Users\Thiru\Anaconda3\lib\site-packages\keras\models.py", line 65, in slice_X
    return [x[start] for x in X]
  File "C:\Users\Thiru\Anaconda3\lib\site-packages\keras\models.py", line 65, in <listcomp>
    return [x[start] for x in X]
  File "C:\Users\Thiru\Anaconda3\lib\site-packages\pandas\core\frame.py", line 1963, in __getitem__
    return self._getitem_array(key)
  File "C:\Users\Thiru\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2008, in _getitem_array
    return self.take(indexer, axis=1, convert=True)
  File "C:\Users\Thiru\Anaconda3\lib\site-packages\pandas\core\generic.py", line 1371, in take
    convert=True, verify=True)
  File "C:\Users\Thiru\Anaconda3\lib\site-packages\pandas\core\internals.py", line 3619, in take
    indexer = maybe_convert_indices(indexer, n)
  File "C:\Users\Thiru\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1750, in maybe_convert_indices
    raise IndexError("indices are out-of-bounds")
IndexError: indices are out-of-bounds
Does anyone have any idea why this is happening? I'm able to run other models just fine.

Answer from the comments: trainx and trainy should be NumPy arrays. You can convert a DataFrame to a NumPy array using the as_matrix() method (or .values on newer pandas, where as_matrix() is deprecated). I also faced this issue; it's unfortunate that Keras does not give a meaningful error message.

I came here looking for a resolution to the same issue with auto-sklearn and a pandas DataFrame. The solution is to pass the X DataFrame as X.values, i.e. fit(X.values, y).

From the official Keras documentation:
Keras models are trained on Numpy arrays of input data and labels. For training a model, you will typically use the fit function.
To convert a pandas DataFrame to a NumPy array, you can use np.array(dataframe). For example:
x_train = np.array(x_train)
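Putting that together for the original snippet, a minimal sketch of the fix (assuming trainx and trainy are the pandas objects the traceback suggests):
import numpy as np
# Hand Keras plain NumPy arrays instead of pandas DataFrames/Series.
trainx = np.asarray(trainx)   # equivalently trainx.values, or trainx.as_matrix() on old pandas
trainy = np.asarray(trainy)
model.fit(trainx, trainy, nb_epoch=20, batch_size=16)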

Related

Python Keras MLP for Multi-class classification value error while model fit

Getting a value error while running the Keras multi-class classification model using the code below:
from keras.models import Sequential
from keras.layers import Dense
from keras import optimizers

model2 = Sequential()
model2.add(Dense(200, input_shape=(4132,), activation='relu'))
model2.add(Dense(200, activation='relu'))
model2.add(Dense(31, activation='softmax'))
SGD = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model2.compile(optimizer=SGD,
               loss='categorical_crossentropy',
               metrics=['accuracy'])
model2.fit(x_train, y_train, epochs=100, verbose=2)  # ---> Error on this line
Error:
Train Shape: (4132, 49)
Test Shape: (1033, 49)
Traceback (most recent call last):
File "ANN.py", line 213, in <module>
model2.fit(x_train, y_train, epochs=100, verbose=2)
File "C:\Users\C256121\AppData\Local\Programs\Python\Python36\lib\site-package
s\keras\models.py", line 960, in fit
validation_steps=validation_steps)
File "C:\Users\C256121\AppData\Local\Programs\Python\Python36\lib\site-package
s\keras\engine\training.py", line 1574, in fit
batch_size=batch_size)
File "C:\Users\C256121\AppData\Local\Programs\Python\Python36\lib\site-package
s\keras\engine\training.py", line 1407, in _standardize_user_data
exception_prefix='input')
File "C:\Users\C256121\AppData\Local\Programs\Python\Python36\lib\site-package
s\keras\engine\training.py", line 128, in _standardize_input_data
arrays[i] = array
ValueError: could not broadcast input array from shape (49,1) into shape (49)
I have 31 classes in the target variable. Please help.

keras neural network architecture incorrect

Here is a simple neural network that contains 3 input values and 3 output values.
The error:
ValueError: Error when checking model target: expected dense_78 to have shape (None, 3) but got array with shape (3, 1)
is thrown when I execute this network. I've set the final layer to have 3 possible outputs, which matches the number of labels:
model.add(Dense(3, activation='softmax'))
I haven't architected this network correctly; where is my mistake?
data = ([[ 0.29365378],
[ 0.27958957],
[ 0.27946938]])
labels = [[1], [2], [3]]
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
model = Sequential()
model.add(Dense(64, activation='relu', input_dim=1))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(3, activation='softmax'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])
model.fit(data, labels,
          epochs=20,
          batch_size=32)
A Dense(3...) will give you three outputs per sample.
The output of a Dense(3...) has shape (BatchSize, 3), or (None, 3) as Keras reports it.
If you want one of 3 possible classes for each sample, then you must have labels with shape (BatchSize, 3); in your case the batch size also happens to be 3.
You must format your labels in one-hot vectors:
class 1 = [1,0,0]
class 2 = [0,1,0]
class 3 = [0,0,1]
The to_categorical function in keras.utils can help you transform numerical classes into one-hot vectors.
If you have three samples, you must have labels as:
labels = [[1,0,0],[0,1,0],[0,0,1]]
Three samples, each with three possible classes: the first sample is class 1, the second class 2 and the third class 3.
This has shape (3,3) which will match the (None,3) demanded by Dense(3...).
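A minimal sketch of that transformation (assuming integer labels 1-3 as in the question; they need to be shifted to start at 0 before one-hot encoding, and on older Keras versions the keyword is nb_classes rather than num_classes):
import numpy as np
from keras.utils import to_categorical

labels = np.array([1, 2, 3]) - 1                  # classes must be 0-indexed: 0, 1, 2
one_hot = to_categorical(labels, num_classes=3)
# one_hot is [[1,0,0],[0,1,0],[0,0,1]], shape (3, 3), matching Dense(3, activation='softmax')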

Why does my CIFAR 100 CNN model mainly predict two classes?

I am currently trying to get a decent score (> 40% accuracy) with Keras on CIFAR-100. However, I'm experiencing weird behaviour from the CNN model: it tends to predict a few classes (2 to 5 of them) much more often than the others.
The pixel at position (i, j) of the confusion matrix contains the count of how many elements of the validation set from class i were predicted to be of class j. The diagonal therefore contains the correct classifications; everything else is an error. The two vertical bars indicate that the model often predicts those classes even when they are not the true class.
CIFAR 100 is perfectly balanced: All 100 classes have 500 training samples.
Why does the model tend to predict some classes MUCH more often than other classes? How can this be fixed?
The code
Running this takes a while.
#!/usr/bin/env python
from __future__ import print_function
from keras.datasets import cifar100
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils
from sklearn.model_selection import train_test_split
import numpy as np
batch_size = 32
nb_classes = 100
nb_epoch = 50
data_augmentation = True
# input image dimensions
img_rows, img_cols = 32, 32
# The CIFAR-100 images are RGB.
img_channels = 3
# The data, shuffled and split between train and test sets:
(X, y), (X_test, y_test) = cifar100.load_data()
X_train, X_val, y_train, y_val = train_test_split(X, y,
                                                  test_size=0.20,
                                                  random_state=42)
# Shuffle training data
perm = np.arange(len(X_train))
np.random.shuffle(perm)
X_train = X_train[perm]
y_train = y_train[perm]
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_val.shape[0], 'validation samples')
print(X_test.shape[0], 'test samples')
# Convert class vectors to binary class matrices.
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
Y_val = np_utils.to_categorical(y_val, nb_classes)
model = Sequential()
model.add(Convolution2D(32, 3, 3, border_mode='same',
                        input_shape=X_train.shape[1:]))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Convolution2D(64, 3, 3, border_mode='same'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(1024))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
X_train = X_train.astype('float32')
X_val = X_val.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_val /= 255
X_test /= 255
if not data_augmentation:
    print('Not using data augmentation.')
    model.fit(X_train, Y_train,
              batch_size=batch_size,
              nb_epoch=nb_epoch,
              validation_data=(X_val, y_val),
              shuffle=True)
else:
    print('Using real-time data augmentation.')
    # This will do preprocessing and realtime data augmentation:
    datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=0,  # randomly rotate images in the range (degrees, 0 to 180)
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images
        vertical_flip=False)  # randomly flip images

    # Compute quantities required for featurewise normalization
    # (std, mean, and principal components if ZCA whitening is applied).
    datagen.fit(X_train)

    # Fit the model on the batches generated by datagen.flow().
    model.fit_generator(datagen.flow(X_train, Y_train,
                                     batch_size=batch_size),
                        samples_per_epoch=X_train.shape[0],
                        nb_epoch=nb_epoch,
                        validation_data=(X_val, Y_val))
model.save('cifar100.h5')
Visualization code
#!/usr/bin/env python
"""Analyze a cifar100 keras model."""
from keras.models import load_model
from keras.datasets import cifar100
from sklearn.model_selection import train_test_split
import numpy as np
import json
import io
import matplotlib.pyplot as plt
try:
    to_unicode = unicode
except NameError:
    to_unicode = str
n_classes = 100
def plot_cm(cm, zero_diagonal=False):
    """Plot a confusion matrix."""
    n = len(cm)
    size = int(n / 4.)
    fig = plt.figure(figsize=(size, size), dpi=80, )
    plt.clf()
    ax = fig.add_subplot(111)
    ax.set_aspect(1)
    res = ax.imshow(np.array(cm), cmap=plt.cm.viridis,
                    interpolation='nearest')
    width, height = cm.shape
    fig.colorbar(res)
    plt.savefig('confusion_matrix.png', format='png')
# Load model
model = load_model('cifar100.h5')
# Load validation data
(X, y), (X_test, y_test) = cifar100.load_data()
X_train, X_val, y_train, y_val = train_test_split(X, y,
                                                  test_size=0.20,
                                                  random_state=42)
# Calculate confusion matrix
y_val_i = y_val.flatten()
y_val_pred = model.predict(X_val)
y_val_pred_i = y_val_pred.argmax(1)
cm = np.zeros((n_classes, n_classes), dtype=np.int)
for i, j in zip(y_val_i, y_val_pred_i):
    cm[i][j] += 1
acc = sum([cm[i][i] for i in range(100)]) / float(cm.sum())
print("Validation accuracy: %0.4f" % acc)
# Create plot
plot_cm(cm)
# Serialize confusion matrix
with io.open('cm.json', 'w', encoding='utf8') as outfile:
    str_ = json.dumps(cm.tolist(),
                      indent=4, sort_keys=True,
                      separators=(',', ':'), ensure_ascii=False)
    outfile.write(to_unicode(str_))
Red herrings
tanh
I've replaced tanh with relu. The history CSV looks OK, but the visualization shows the same problem.
Please also note that the validation accuracy here is only 3.44%.
Dropout + tanh + border mode
Removing dropout, replacing tanh with relu, and setting border mode to same everywhere: history CSV.
The visualization code still reports a much lower accuracy (8.50% this time) than the Keras training code.
Q & A
The following is a summary of the comments:
The data is evenly distributed over the classes. So there is no "over training" of those two classes.
Data augmentation is used, but without data augmentation the problem persists.
The visualization is not the problem.
If you get good accuracy during training and validation, but not when testing, make sure you do exactly the same preprocessing on your dataset in both cases.
Here is what you have when training:
X_train /= 255
X_val /= 255
X_test /= 255
But there is no such code when predicting for your confusion matrix. Adding the same scaling to the prediction path:
X_val /= 255.
gives a nice-looking confusion matrix.
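Concretely, in the visualization script this just means repeating the training preprocessing before predict (a sketch against the posted code):
# Apply exactly the same preprocessing as during training
X_val = X_val.astype('float32')
X_val /= 255.
y_val_pred = model.predict(X_val)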
I don't have a good feeling about this part of the code:
model.add(Dense(1024))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
The rest of the model is full of relus, but here there is a tanh.
tanh sometimes vanishes or explodes (it saturates at -1 and 1), which might lead to your two-class over-importance.
The Keras CIFAR-10 example basically uses the same architecture (the dense-layer sizes might differ), but it also uses a relu there (no tanh at all). The same goes for this external Keras-based CIFAR-100 code.
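For illustration, the classifier head with the tanh swapped for a relu (a sketch of the suggested change only, not a guaranteed fix):
model.add(Flatten())
model.add(Dense(1024))
model.add(Activation('relu'))   # was 'tanh'
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))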
One important part of the problem was that my ~/.keras/keras.json was
{
    "image_dim_ordering": "th",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}
Hence I had to change image_dim_ordering to tf. This leads to a different confusion matrix and an accuracy of 12.73%. Obviously, there is still a problem, as the validation history gave 45.1% accuracy.
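For reference, the corrected ~/.keras/keras.json with the TensorFlow dimension ordering (the other keys stay as they were):
{
    "image_dim_ordering": "tf",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}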
I don't see you doing mean-centering, not even in datagen. I suspect this is the main cause. To do mean-centering with ImageDataGenerator, set featurewise_center = True (see the sketch after these suggestions). Another way is to subtract the ImageNet mean from each RGB pixel; the mean vector to be subtracted is [103.939, 116.779, 123.68].
Make all activations relus, unless you have a specific reason to have a single tanh.
Remove the two dropouts of 0.25 and see what happens. If you want to apply dropout to a convolution layer, it is better to use SpatialDropout2D. It has somehow been removed from the Keras online documentation, but you can find it in the source.
You have two conv layers with border mode same and two with valid. There is nothing wrong with this, but it would be simpler to keep all conv layers at same and control the size with the max-pooling layers alone.
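A sketch of the mean-centering suggestion via ImageDataGenerator (datagen.fit computes the dataset statistics that are then subtracted from every batch; the std normalization line is optional and only an assumption here):
datagen = ImageDataGenerator(
    featurewise_center=True,             # subtract the dataset mean from every input
    featurewise_std_normalization=True,  # optionally also divide by the dataset std
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True)
datagen.fit(X_train)  # computes the mean/std used by flow()
# Remember to apply the same normalization to validation/test data
# (e.g. via datagen.standardize) before predicting.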

How to input data into Keras? Specifically what is the x_train and y_train if I have more than 2 columns?

How can I input data into Keras? What is the structure? Specifically, what are x_train and y_train if I have more than 2 columns?
This is the data I want to input (an Excel sheet with feature columns A - N and a winner column):
I am trying to define X_train for the Multi-Layer Perceptron example in the Keras documentation (http://keras.io/examples/). Here is the code:
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD
model = Sequential()
model.add(Dense(64, input_dim=20, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(2, init='uniform'))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
model.fit(X_train, y_train, nb_epoch=20, batch_size=16)
score = model.evaluate(X_test, y_test, batch_size=16)
EDIT (additional information):
Looking here: What is data type for Python Keras deep learning package?
Keras uses numpy arrays containing the theano.config.floatX floating point type. This can be configured in your .theanorc file. Typically, it will be float64 for CPU computations and float32 for GPU computations, although you can also set it to float32 when working on the CPU if you prefer. You can create a zero-filled array of the proper type by the command
X = numpy.zeros((4,3), dtype=theano.config.floatX)
Question: step 1 looks like creating a floating-point NumPy array from my data in the Excel file above. What do I do with the winner column?
It all depends on your needs.
It looks like you want to predict the winner based on the parameters shown in columns A - N. Then you should define input_dim to be 14, and X_train should be an (N, 14) NumPy array like this:
[
[9278, 37.9, ...],
[18594, 36.3, ...],
...
]
It seems that your prediction set only contains two items (the two presidential candidates), so you should encode the answer Y_train as an (N, 2) NumPy array like this:
[
[1, 0],
[1, 0],
...
[0, 1],
[0, 1],
...
]
where [1,0] indicates that Barack Obama is the winner and vice versa.
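A minimal sketch of building those arrays (assuming the spreadsheet has been read into a pandas DataFrame df whose first 14 columns are the features and whose last column is named 'winner'; the file name and column layout are illustrative, not from the original post):
import numpy as np
import pandas as pd

df = pd.read_excel('election_data.xlsx')               # hypothetical file name

X_train = df.iloc[:, :14].values.astype('float32')     # columns A - N, shape (N, 14)

# One-hot encode the two possible winners into shape (N, 2)
Y_train = pd.get_dummies(df['winner']).values.astype('float32')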

Test Data Prediction Error in SciPy sparse matrix

I input data in LIBSVM format like this into a SciPy sparse matrix. The training set is multi-label and multi-class as described in this question I asked:
Understanding format of data in scikit-learn
from sklearn.datasets import load_svmlight_file
X, Y = load_svmlight_file("train-subset100.csv", multilabel=True, zero_based=True)
Then I employ OneVsRestClassifier with LinearSVC to train the data.
clf = OneVsRestClassifier(LinearSVC())
clf.fit(X, Y)
Now when I want to test the data, I do the following.
X_, Y_ = load_svmlight_file("train-subset10.csv", multilabel = True, zero_based = False)
predicted = clf.predict(X_)
Here it gives me an error. I dump the traceback here as it is.
Traceback (most recent call last):
  File "test.py", line 36, in
    predicted = clf.predict(X_)
  File "/usr/lib/pymodules/python2.7/sklearn/multiclass.py", line 151, in predict
    return predict_ovr(self.estimators_, self.label_binarizer_, X)
  File "/usr/lib/pymodules/python2.7/sklearn/multiclass.py", line 67, in predict_ovr
    Y = np.array([_predict_binary(e, X) for e in estimators])
  File "/usr/lib/pymodules/python2.7/sklearn/multiclass.py", line 40, in _predict_binary
    return np.ravel(estimator.decision_function(X))
  File "/usr/lib/pymodules/python2.7/sklearn/svm/base.py", line 728, in decision_function
    self._check_n_features(X)
  File "/usr/lib/pymodules/python2.7/sklearn/svm/base.py", line 748, in _check_n_features
    X.shape[1]))
ValueError: X.shape[1] should be 3421, not 690.
I do not understand why it is looking for more features when the input format is a sparse matrix. How can I get it to predict the test labels correctly?
I solved the issue myself. The problem is that loading the datasets one by one in SVMLIGHT/LIBSVM format gives each matrix only as many feature columns as appear in that file, while the classifier expects the training and test matrices to have the same number of features. There are two workarounds. The first is to load all the data at once using the load_svmlight_files function:
X, Y, X_, Y_ = load_svmlight_files(["train-subset100.csv", "train-subset10.csv"],
                                   multilabel=True, zero_based=False)
The second is to specify the number of features explicitly when loading the test file:
X, Y = load_svmlight_file("train-subset100.csv", multilabel=True, zero_based=False)
X_, Y_ = load_svmlight_file("train-subset10.csv", n_features=X.shape[1],
                            multilabel=True, zero_based=False)
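For completeness, a sketch of the full train/predict flow with the second workaround (note that recent scikit-learn versions also require the multi-label targets to be binarized before fitting; older versions accepted sequences of label tuples directly):
from sklearn.datasets import load_svmlight_file
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

X, Y = load_svmlight_file("train-subset100.csv", multilabel=True, zero_based=False)
X_, Y_ = load_svmlight_file("train-subset10.csv", n_features=X.shape[1],
                            multilabel=True, zero_based=False)

# Binarize the tuple-of-labels targets into an indicator matrix for OneVsRest.
Y_bin = MultiLabelBinarizer().fit_transform(Y)

clf = OneVsRestClassifier(LinearSVC())
clf.fit(X, Y_bin)
predicted = clf.predict(X_)   # X_ now has the same number of features as X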