DNN model - strange results: "loss" differs between train/test and the Kaggle submission

I have trained a simple DNN model using Keras with the Theano backend, consisting of:
2 * (3x3 conv layer, 2x2 max-pool, 15% dropout),
a 96-unit FC layer,
softmax activation on the final layer,
optimizer: adam.
The goal was to classify images into 6 classes.
I have a dataset of 116,000 images, equally distributed among the classes.
After 14 epochs the validation results were great:
loss 0.1 and accuracy 92% on the test data, and during training I had similar results on my training and validation data.
The task is for learning purposes only and is defined on the Kaggle website; the goal is to minimize the log-loss. When I upload my predictions there, the calculated loss is very high (> 2).
Do you have any suggestions?
Some code representing the network and how I use it:
1. Inputs:
116000 images of size 160x160
2. Model
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential()
# channels-first input: (1, 160, 160) single-channel images (Theano ordering)
model.add(Convolution2D(64, 3, 3, activation='relu', input_shape=(1, 160, 160)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.15))
model.add(Convolution2D(128, 3, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.15))
model.add(Flatten())
model.add(Dense(96, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(len(classes), activation='softmax'))
3. Compile and fit:
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(np.array(trainData[0]), np.array(trainData[1]),
          batch_size=13, nb_epoch=14, verbose=1, shuffle=True, callbacks=callbacks_list,
          validation_data=(np.array(validationData[0]), np.array(validationData[1])))
4. Test with a new set of samples:
scores = model.evaluate(np.array(testData[0]), np.array(testData[1]), verbose=1)
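5. Sanity-checking the submitted log-loss locally (a sketch, not part of the original setup; it assumes testData[1] holds one-hot labels, scikit-learn is available, and the prediction rows are written to the submission file in the same order as the test images):
import numpy as np
from sklearn.metrics import log_loss

# predicted class probabilities for the held-out set
probs = model.predict(np.array(testData[0]), verbose=0)

# log-loss punishes confident mistakes very heavily, so hard 0/1 outputs are risky;
# clipping and renormalizing is a common safeguard before submitting
probs = np.clip(probs, 1e-7, 1 - 1e-7)
probs /= probs.sum(axis=1, keepdims=True)

print(log_loss(np.array(testData[1]), probs))
If the log-loss computed this way is already close to 0.1, the gap on Kaggle is more likely caused by the submission format or row ordering than by the probabilities themselves.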

Related

Bad performance of the model in Keras

I am working with the Boston housing price dataset. I have my X and Y with a shape of (506, 13). Then I define my model:
from keras.models import Sequential
from keras.layers import Dense, Dropout

def basic_model_1():
    t_model = Sequential()
    t_model.add(Dense(13, activation="tanh", input_dim=13))
    t_model.add(Dense(10, activation="tanh"))
    t_model.add(Dropout(0.2))
    t_model.add(Dense(6, activation="tanh"))
    t_model.add(Dense(3, activation="tanh"))
    t_model.add(Dense(1))
    print(t_model.summary())
    t_model.compile(loss='mean_squared_error',
                    optimizer='adam',
                    metrics=['accuracy'])
    t_model.fit(X, Y, nb_epoch=200, batch_size=10, validation_split=0.20)
    return t_model
When I run this model, I get pretty bad performance: val_acc is 0.0098. I changed the activation function to sigmoid or relu, and performance increases only slightly. What do I need to do to increase model performance?
In my opinion you could:
1) Add more neurons to every layer (use a multiple of 2 for better performance; try 64, 128, 256).
2) Add more dropout layers, one after every Dense layer.
3) Add much more data.
There is nothing wrong with your model architecture. The only thing I would suggest is to use kernel_initializer='normal', activation='relu' in all Dense layers (especially in the output layer), since it's a regression model.
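A minimal sketch combining these suggestions, using Keras 2 argument names (kernel_initializer, epochs); the output layer is kept linear here, and the metric is swapped to mae because 'accuracy' is not meaningful for a regression target:
from keras.models import Sequential
from keras.layers import Dense, Dropout

def basic_model_2():
    t_model = Sequential()
    t_model.add(Dense(64, activation="relu", kernel_initializer="normal", input_dim=13))
    t_model.add(Dropout(0.2))
    t_model.add(Dense(64, activation="relu", kernel_initializer="normal"))
    t_model.add(Dropout(0.2))
    t_model.add(Dense(1, kernel_initializer="normal"))
    # mean absolute error is easier to interpret for house prices than 'accuracy'
    t_model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mae'])
    t_model.fit(X, Y, epochs=200, batch_size=10, validation_split=0.20)
    return t_model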

Keras regression to approximate a function (goal: loss < 1e-7)

I'm working on a neural network which approximates a function f(X) = y, with X a vector [x0, ..., xn] and y in [-inf, +inf]. This approximated function needs to have an accuracy (sum of errors) of around 1e-8. In fact, I need my neural network to overfit.
X is composed of random points in the interval [-500, 500]. Before putting these points into the input layer, I normalize them to [0, 1].
I use Keras as follows:
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from sklearn import preprocessing

dimension = 10  # example

self.model = Sequential()
self.model.add(Dense(128, input_shape=(dimension,), init='uniform', activation='relu'))
self.model.add(Dropout(.2))
self.model.add(Activation("linear"))
self.model.add(Dense(64, init='uniform', activation='relu'))
self.model.add(Activation("linear"))
self.model.add(Dense(64, init='uniform', activation='relu'))
self.model.add(Dense(1))

X_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
y_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
X_scaled = X_scaler.fit_transform(train_dataset)
y_scaled = y_scaler.fit_transform(train_labels)

self.model.compile(loss='mse', optimizer='adam')
self.model.fit(X_scaled, y_scaled, epochs=10000, batch_size=10, verbose=1)
I tried different NNs: first [n] -> [2] -> [1] with the ReLU activation function, then [n] -> [128] -> [64] -> [1].
I tried the SGD optimizer and slowly increased the learning rate from 1e-9 to 0.1.
I also tried without normalizing the data, but in that case the loss is very high.
My best loss (MSE) is 0.037 with the current setup, but I'm far from my goal (1e-8).
First, I would like to know if I did something wrong. Am I on the right track?
If not, how can I reach my goal?
Thank you very much.
Try #2
I tried this new configuration:
model = Sequential()
model.add(Dense(128, input_shape=(10,), init='uniform', activation='relu'))
model.add(Dropout(.2))
model.add(Dense(64, init='uniform', activation='relu'))
model.add(Dense(64, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
On a sample of 50 elements, with batch_size 10, over 100,000 epochs.
I get a loss around 1e-4.
Try #3
model.add(Dense(128, input_shape=(10,), activation='tanh'))
model.add(Dense(64, activation='tanh'))
model.add(Dense(1, activation='sigmoid'))
With batch_size=1000 and epochs=1e5, the result is a loss of around 1e-7.
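For reference, a self-contained sketch of this Try #3 setup (assuming the same mse loss, adam optimizer and MinMaxScaler preprocessing used in the original code; train_dataset and train_labels stand in for your data):
from keras.models import Sequential
from keras.layers import Dense
from sklearn import preprocessing

# scale inputs and targets to [0, 1] as before
X_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
y_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
X_scaled = X_scaler.fit_transform(train_dataset)
y_scaled = y_scaler.fit_transform(train_labels)

model = Sequential()
model.add(Dense(128, input_shape=(10,), activation='tanh'))
model.add(Dense(64, activation='tanh'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mse', optimizer='adam')

model.fit(X_scaled, y_scaled, epochs=100000, batch_size=1000, verbose=1)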

How to have parallel convolutional layers in keras?

I am a little new to neural networks and Keras. I have some images of size 6x7 and the filter size is 15. I want to have several filters and train a convolutional layer separately on each, and then combine them. I have looked at one example here:
model = Sequential()
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
                        border_mode='valid',
                        input_shape=input_shape))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Dropout(0.25))
model.add(Flatten(input_shape=input_shape))
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('tanh'))
This model works with one filter. Can anybody give me some hints on how to modify the model to work with parallel convolutional layers?
Thanks.
Here is an example of designing a network of parallel convolution and sub-sampling layers in Keras version 2. I hope this resolves your problem.
import keras
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Model
from keras.utils import plot_model

rows, cols = 100, 15

def create_convnet(img_path='network_image.png'):
    # num_classes is assumed to be defined elsewhere
    input_shape = Input(shape=(rows, cols, 1))
    tower_1 = Conv2D(20, (100, 5), padding='same', activation='relu')(input_shape)
    tower_1 = MaxPooling2D((1, 11), strides=(1, 1), padding='same')(tower_1)
    tower_2 = Conv2D(20, (100, 7), padding='same', activation='relu')(input_shape)
    tower_2 = MaxPooling2D((1, 9), strides=(1, 1), padding='same')(tower_2)
    tower_3 = Conv2D(20, (100, 10), padding='same', activation='relu')(input_shape)
    tower_3 = MaxPooling2D((1, 6), strides=(1, 1), padding='same')(tower_3)
    merged = keras.layers.concatenate([tower_1, tower_2, tower_3], axis=1)
    merged = Flatten()(merged)
    out = Dense(200, activation='relu')(merged)
    out = Dense(num_classes, activation='softmax')(out)
    model = Model(input_shape, out)
    plot_model(model, to_file=img_path)
    return model
The plot_model call writes an image of this network to network_image.png; it shows the three parallel towers merging into a single dense classification head.
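A hypothetical usage sketch (X_train, y_train and num_classes are assumed to exist; inputs shaped (n, 100, 15, 1) and labels one-hot encoded to match the softmax output):
# note: plot_model inside create_convnet requires pydot and graphviz to be installed
model = create_convnet()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=32, epochs=10, validation_split=0.1)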
My approach is to create another model that defines all the parallel convolution and pooling operations and concatenates all the parallel result tensors into a single output tensor. You can then add this parallel model graph to your sequential model just like a layer. Here is my solution; I hope it solves your problem.
# variable initialization
from keras import Input, Model, Sequential
from keras.layers import Conv2D, MaxPooling2D, Concatenate, Activation, Dropout, Flatten, Dense

nb_filters = 100
kernel_size = {}
kernel_size[0] = [3, 3]
kernel_size[1] = [4, 4]
kernel_size[2] = [5, 5]
input_shape = (32, 32, 3)
pool_size = (2, 2)
nb_classes = 2
no_parallel_filters = 3

# create a separate model graph for parallel processing with different filter sizes
# apply 'same' padding so every branch produces an output tensor of the same size for concatenation
# concatenate all parallel outputs
inp = Input(shape=input_shape)
convs = []
for k_no in range(len(kernel_size)):
    conv = Conv2D(nb_filters, (kernel_size[k_no][0], kernel_size[k_no][1]),
                  padding='same',
                  activation='relu',
                  input_shape=input_shape)(inp)
    pool = MaxPooling2D(pool_size=pool_size)(conv)
    convs.append(pool)

if len(kernel_size) > 1:
    out = Concatenate()(convs)
else:
    out = convs[0]

conv_model = Model(inputs=inp, outputs=out)

# add the created model graph to a sequential model
model = Sequential()
model.add(conv_model)  # add the model just like a layer
model.add(Conv2D(nb_filters, (kernel_size[1][0], kernel_size[1][1])))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Dropout(0.25))
model.add(Flatten(input_shape=input_shape))
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('tanh'))
For more information, refer to this similar question: Combining the outputs of multiple models into one model

Keras model classifying well but predicted probabilities are always 1.0 or 0.0

I am using Keras to build a multi-class (3 classes) image classifier.
I trained the following model with a dataset of approximately 2000 images (1500 training/ 500 validation).
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Activation, Dropout, Flatten, Dense
from keras.optimizers import SGD
from keras.utils import np_utils

batch_size = 128
nb_classes = 3
nb_epoch = 25
img_rows, img_cols = 128, 128
input_shape = (1, img_rows, img_cols)

Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
model = Sequential()
model.add(Convolution2D(32, 5, 5, border_mode='same', input_shape=input_shape))
model.add(Activation('relu'))
model.add(Convolution2D(64, 5, 5, border_mode='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.25))
model.add(Dense(nb_classes, activation='softmax'))
lrate = 0.001
decay = lrate/nb_epoch
sgd = SGD(lr=lrate, momentum=0.9, decay=decay, nesterov=False)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
history = model.fit(X_train, Y_train,
                    batch_size=batch_size,
                    nb_epoch=nb_epoch,
                    validation_data=(X_test, Y_test),
                    shuffle=True,
                    callbacks=early_stopping)
These are the figures for the training/validation accuracy and loss.
I achieve 95% training accuracy, 92% validation accuracy and 94% accuracy on another separate chunk of images (aside from the 2000 in the dataset) that I have.
Therefore, the model seems to classify reasonably well. However, my problem is that the predicted probabilities for an input image (obtained with predict_proba()) are always either 1.0 or 0.0.
Likewise, if I give as input an image that doesn't belong to any of the 3 classes, I would expect some low probabilities (perhaps higher in the most similar class), but I still get 1.0 for one of the classes and 0.0 for the others.
What could be causing that? It seems to me that there is no overfitting. Is there any issue with the model?
Could it be that the images of each class are quite similar to each other, so the model quickly becomes too confident in its decisions?
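One quick check, sketched below, is to print the raw softmax outputs at full precision: for a Sequential model, predict_proba returns the same values as predict, and probabilities such as 0.99999 can simply be displayed as 1.0 when rounded. This assumes X_test has already been scaled to [0, 1] as above:
import numpy as np

probs = model.predict(X_test[:10], verbose=0)  # raw softmax outputs, one row per image

np.set_printoptions(precision=8, suppress=True)
print(probs)             # each row should sum to 1
print(probs.sum(axis=1))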

How to build simple neural network on keras (not image recognition)

I am new to Keras and I am trying to build my own neural network.
A task:
I need to write a system that can make decisions for a character that may meet one or more enemies. The inputs the system knows are:
the character's health percentage;
whether a pistol is present;
the number of enemies.
The answer must be one of the following:
Attack
Run
Hide (for a surprise attack)
To do nothing
To train it, I made a table of "lessons":
https://i.stack.imgur.com/lD0WX.png
So here is my code:
# Create first network with Keras
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
import numpy
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# split into input (X) and output (Y) variables
X = numpy.array([[0.5,1,1], [0.9,1,2], [0.8,0,1], [0.3,1,1], [0.6,1,2], [0.4,0,1], [0.9,1,7], [0.5,1,4], [0.1,0,1], [0.6,1,0], [1,0,0]])
Y = numpy.array([[1],[1],[1],[2],[2],[2],[3],[3],[3],[4],[4]])
# create model
model = Sequential()
model.add(Dense(3, input_dim=3, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
# Compile model
sgd = SGD(lr=0.001)
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])
# Fit the model
model.fit(X, Y, nb_epoch=150)
# calculate predictions
predictions = model.predict(X)
# round predictions
rounded = [round(x) for x in predictions]
print(rounded)
Here are the predictions I get:
[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
The accuracy on each epoch is 0.2727 and the loss decreases.
That's not right.
I tried dividing the learning rate by 10 and changing activations and optimizers. I even entered the data manually.
Can anyone tell me how to solve this simple problem? Thanks.
There are several problems in your code.
The number of data entries is very small compared to the size of the NN model.
Y is represented as a class number and not as a class vector. A regression model could be learnt on this, but it is a poor design choice.
The output of the sigmoid activation in your final layer is always between 0 and 1, so your model only knows how to produce values between 0 and 1.
Below is slightly better, modified code:
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
import numpy
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# split into input (X) and output (Y) variables
X = numpy.array([[0.5,1,1], [0.9,1,2], [0.8,0,1], [0.3,1,1], [0.6,1,2], [0.4,0,1], [0.9,1,7], [0.5,1,4], [0.1,0,1], [0.6,1,0], [1,0,0]])
y = numpy.array([[1],[1],[1],[2],[2],[2],[3],[3],[3],[0],[0]])
from keras.utils import np_utils
Y = np_utils.to_categorical(y, 4)
# print Y
# create model
model = Sequential()
model.add(Dense(3, input_dim=3, activation='relu'))
model.add(Dense(4, activation='softmax'))
# Compile model
# sgd = SGD(lr=0.1)
# model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
# Fit the model
model.fit(X, Y, nb_epoch=700)
# calculate predictions
predictions = model.predict(X)
predictions_class = predictions.argmax(axis=-1)
print(predictions_class)
Note that I have used the softmax activation since the classes are mutually exclusive.
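To turn the predicted class indices back into actions, they can be mapped to the labels, assuming the order listed in the question (1 = Attack, 2 = Run, 3 = Hide) and the relabelling of "to do nothing" from 4 to 0 used above; a small hypothetical sketch:
# hypothetical mapping; adjust if the lesson table uses a different encoding
actions = {0: 'Do nothing', 1: 'Attack', 2: 'Run', 3: 'Hide'}
for state, cls in zip(X, predictions_class):
    print(state, '->', actions[cls])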