Keras Regression to approximate function (goal: loss < 1e-7) - neural-network

I'm working on a neural network that approximates a function f(X) = y, where X is a vector [x0, ..., xn] and y lies in (-inf, +inf). The approximation needs an accuracy (sum of errors) of around 1e-8. In other words, I need my neural network to overfit.
X is composed of random points in the interval [-500, 500]. Before feeding these points into the input layer, I normalize them to [0, 1].
I use Keras as follows:
dimension = 10 #example
self.model = Sequential()
self.model.add(Dense(128, input_shape=(dimension,), init='uniform', activation='relu'))
self.model.add(Dropout(.2))
self.model.add(Activation("linear"))
self.model.add(Dense(64, init='uniform', activation='relu'))
self.model.add(Activation("linear"))
self.model.add(Dense(64, init='uniform', activation='relu'))
self.model.add(Dense(1))
X_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
y_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
X_scaled = (X_scaler.fit_transform(train_dataset))
y_scaled = (y_scaler.fit_transform(train_labels))
self.model.compile(loss='mse', optimizer='adam')
self.model.fit(X_scaled, y_scaled, epochs=10000, batch_size=10, verbose=1)
I tried different networks: first [n] -> [2] -> [1] with the ReLU activation function, then [n] -> [128] -> [64] -> [1].
I tried the SGD optimizer, slowly increasing the learning rate from 1e-9 to 0.1.
I also tried without normalizing the data, but in that case the loss is very high.
My best loss (MSE) is 0.037 with the current setup, but I'm far from my goal (1e-8).
First, I would like to know whether I did something wrong. Am I on the right track?
If not, how can I reach my goal?
Thank you very much.
Try #2
I tried this new configuration:
model = Sequential()
model.add(Dense(128, input_shape=(10,), init='uniform', activation='relu'))
model.add(Dropout(.2))
model.add(Dense(64, init='uniform', activation='relu'))
model.add(Dense(64, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
With a sample of 50 elements, a batch_size of 10 and 100000 epochs,
I get a loss of around 1e-4.
Try #3
model.add(Dense(128, input_shape=(10,), activation='tanh'))
model.add(Dense(64, activation='tanh'))
model.add(Dense(1, activation='sigmoid'))
batch_size=1000
epochs=1e5
Result: loss around 1e-7
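One thing worth checking when comparing these losses against the 1e-8 target: the MSE that Keras reports is measured on the min-max-scaled labels, not on the original y values. A minimal sketch of the conversion, assuming the labels were scaled with a plain min-max transform as in the code above (the helper name and the example label range are made up for illustration):

```python
def mse_in_original_units(mse_scaled, y_min, y_max):
    """Convert an MSE measured on min-max-scaled labels back to original units.

    With y_scaled = (y - y_min) / (y_max - y_min), every error shrinks by the
    factor (y_max - y_min), so the squared error shrinks by its square.
    """
    span = y_max - y_min
    return mse_scaled * span ** 2


# Hypothetical label range, chosen only for illustration.
y_min, y_max = -1000.0, 1000.0

for mse_scaled in (0.037, 1e-4, 1e-7):
    mse_orig = mse_in_original_units(mse_scaled, y_min, y_max)
    print(f"scaled MSE {mse_scaled:g} -> original-unit MSE {mse_orig:g}")
```

The wider the range of y, the smaller the scaled loss has to be before the error in original units reaches a given target.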

Related

How to explain that no negative is predicted?

I have built this model with Keras:
model = Sequential()
model.add(LSTM(50, return_sequences=True,input_shape=(look_back, trainX.shape[2])))
model.add(LSTM(50))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam',metrics=['accuracy'])
model.fit(trainX, trainY,validation_split=0.3, epochs=50, batch_size=1000, verbose=1)
and the results are surprising. When I do this:
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
print(confusion_matrix(trainY, trainPredict.round()))
print(confusion_matrix(testY, testPredict.round()))
I get, respectively:
[[129261 0]
[ 172 129138]]
and
[[10822 0]
[10871 0]]
In other words, my training confusion matrix looks fine, while my testing confusion matrix classifies everybody as "positive". What is surprising is that the classes are almost perfectly balanced in both the training and the testing set.
Why does this happen?
Thanks
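For reading the matrices above, it may help to recall the layout that scikit-learn's confusion_matrix uses: rows are the true classes, columns are the predicted classes, both in sorted label order. A small pure-Python sketch of the same counting (the function name and the toy labels are illustrative, not taken from the question):

```python
def confusion_counts(y_true, y_pred, n_classes=2):
    """Count matrix with rows = true class, columns = predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Toy data: the model predicts class 0 for every sample.
y_true = [0, 0, 0, 1, 1, 1, 1]
y_pred = [0] * len(y_true)

for row in confusion_counts(y_true, y_pred):
    print(row)
```

With this layout, and assuming the labels are 0 and 1, a matrix whose second column is all zeros means that no sample was predicted as class 1.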

DNN model - strange results. "Loss" results different in Train/Testing and in Kaggle

I have trained a simple DNN model using Keras with the Theano backend, consisting of:
2 * (3x3 conv layer, 2x2 max-pool, 15% dropout),
96-FC layer,
softmax activation on the final layer,
optimizer: adam.
The goal was to classify images into 6 classes.
I have a dataset of 116000 images, equally distributed among all classes.
After 14 epochs the validation results were great:
loss: 0.1, accuracy: 92% on the test data, and during training I had similar results on my training and validation data.
The task is for learning purposes only and is defined on the Kaggle website. They want a minimal log-loss result. When I upload my predictions there, the calculated loss is very high (> 2).
Do you have any suggestions?
Some code representing the network and how I use it:
1. Inputs:
116000 images of size 160x160
2. Model
model = Sequential()
model.add(Convolution2D(64, 3, 3, activation='relu', input_shape=(1, 160, 160)))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.15))
model.add(Convolution2D(128, 3, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.15))
model.add(Flatten())
model.add(Dense(96, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(len(classes), activation='softmax'))
3. Compile and fit:
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(np.array(trainData[0]), np.array(trainData[1]),
          batch_size=13, nb_epoch=14, verbose=1, shuffle=True,
          callbacks=callbacks_list,
          validation_data=(np.array(validationData[0]), np.array(validationData[1])))
4. Test with a new set of samples:
scores = model.evaluate(np.array(testData[0]), np.array(testData[1]), verbose=1)
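Since Kaggle scores this task by log-loss, it can be useful to compute the same metric locally on the predicted probabilities rather than relying on accuracy. A minimal sketch, assuming one-hot labels and one probability row per sample (the function name, the clipping constant and the toy values are assumptions for illustration; Kaggle's exact clipping may differ):

```python
import math


def multiclass_log_loss(y_true_onehot, y_prob, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for truth, probs in zip(y_true_onehot, y_prob):
        # Clip so that a predicted probability of exactly 0 does not give log(0).
        p_true = sum(t * min(max(p, eps), 1.0 - eps) for t, p in zip(truth, probs))
        total -= math.log(p_true)
    return total / len(y_true_onehot)


labels = [[1, 0, 0], [0, 1, 0]]

good = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]]
# Same probabilities, but the class columns are in a different order.
shifted = [[0.05, 0.05, 0.9], [0.1, 0.1, 0.8]]

good_loss = multiclass_log_loss(labels, good)
shifted_loss = multiclass_log_loss(labels, shifted)
print(good_loss)
print(shifted_loss)
```

The second call shows how a mismatch between the column order of the submitted predictions and the expected class order can push the loss above 2 even when the model itself is accurate. Whether that is what happened here cannot be told from the question.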

Keras model classifying well but predicted probabilities are always 1.0 or 0.0

I am using Keras to build a multi-class (3 classes) image classifier.
I trained the following model with a dataset of approximately 2000 images (1500 training/ 500 validation).
batch_size = 128
nb_classes = 3
nb_epoch = 25
img_rows, img_cols = 128, 128
input_shape = (1, img_rows, img_cols)
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
model = Sequential()
model.add(Convolution2D(32, 5, 5, border_mode='same', input_shape=input_shape))
model.add(Activation('relu'))
model.add(Convolution2D(64, 5, 5, border_mode='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.25))
model.add(Dense(nb_classes, activation='softmax'))
lrate = 0.001
decay = lrate/nb_epoch
sgd = SGD(lr=lrate, momentum=0.9, decay=decay, nesterov=False)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
history = model.fit(X_train, Y_train,
                    batch_size=batch_size,
                    nb_epoch=nb_epoch,
                    validation_data=(X_test, Y_test),
                    shuffle=True,
                    callbacks=early_stopping)
These are the figures for the training/validation accuracy and loss.
I achieve 95% training accuracy, 92% validation accuracy and 94% on a separate chunk of images (outside the 2000 of the dataset) that I have.
Therefore, the model seems to classify reasonably well. However, my problem is that the predicted probabilities for an input image (obtained with the function predict_proba()) are always either 1.0 or 0.0.
Likewise, if I give as input an image that doesn't belong to any of the 3 classes, I would expect low probabilities (perhaps higher for the most similar class), but I still get 1.0 for one of the classes and 0.0 for the others.
What could be causing that? It seems to me that there is no overfitting. Is there an issue with the model?
Could it be that the images of each class are so similar to one another that the model quickly becomes too confident in its decision?
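One way a softmax layer ends up printing exactly 1.0 and 0.0 is saturation: once one logit exceeds the others by a few dozen, the remaining probabilities fall below floating-point resolution. A minimal sketch with made-up logits (the values are illustrative and not taken from the model above):

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


moderate = [2.0, 1.0, 0.5]
# Much larger logits, as can happen when inputs are fed unscaled
# (0-255 instead of 0-1); the factor 255 is only an illustration.
large = [z * 255 for z in moderate]

print(softmax(moderate))
print(softmax(large))
```

If predict_proba() is called on images that were not divided by 255 the way X_train was, this kind of saturation is one plausible cause; strongly separated classes are another.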

How to build simple neural network on keras (not image recognition)

I am new to Keras and I am trying to build my own neural network.
The task:
I need to write a system that can make decisions for a character who may meet one or more enemies. The system knows:
The character's health percentage;
Whether the character has a pistol;
The number of enemies.
The answer must be one of the following:
Attack
Run
Hide (for a surprise attack)
Do nothing
For training, I made a table of "lessons":
https://i.stack.imgur.com/lD0WX.png
So here is my code:
# Create first network with Keras
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
import numpy
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# split into input (X) and output (Y) variables
X = numpy.array([[0.5,1,1], [0.9,1,2], [0.8,0,1], [0.3,1,1], [0.6,1,2], [0.4,0,1], [0.9,1,7], [0.5,1,4], [0.1,0,1], [0.6,1,0], [1,0,0]])
Y = numpy.array([[1],[1],[1],[2],[2],[2],[3],[3],[3],[4],[4]])
# create model
model = Sequential()
model.add(Dense(3, input_dim=3, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
# Compile model
sgd = SGD(lr=0.001)
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])
# Fit the model
model.fit(X, Y, nb_epoch=150)
# calculate predictions
predictions = model.predict(X)
# round predictions
rounded = [round(x) for x in predictions]
print(rounded)
Here are the predictions I get:
[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
The accuracy at each epoch is 0.2727 and the loss decreases.
That's not right.
I tried dividing the learning rate by 10 and changing activations and optimizers. I even entered the data manually.
Can anyone tell me how to solve my simple problem? Thanks.
There are several problems in your code.
The number of data entries is very small for training a NN model.
Y is represented as class numbers and not as class vectors. A regression model can be learnt on this, but it's a poor design choice.
The output of the sigmoid function is always between 0 and 1. Since it is used on the output layer, your model can only produce values between 0 and 1, while your labels range from 1 to 4.
Here is a somewhat improved version of the code:
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
import numpy
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# split into input (X) and output (Y) variables
X = numpy.array([[0.5,1,1], [0.9,1,2], [0.8,0,1], [0.3,1,1], [0.6,1,2], [0.4,0,1], [0.9,1,7], [0.5,1,4], [0.1,0,1], [0.6,1,0], [1,0,0]])
y = numpy.array([[1],[1],[1],[2],[2],[2],[3],[3],[3],[0],[0]])
from keras.utils import np_utils
Y = np_utils.to_categorical(y, 4)
# print Y
# create model
model = Sequential()
model.add(Dense(3, input_dim=3, activation='relu'))
model.add(Dense(4, activation='softmax'))
# Compile model
# sgd = SGD(lr=0.1)
# model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
# Fit the model
model.fit(X, Y, nb_epoch=700)
# calculate predictions
predictions = model.predict(X)
predictions_class = predictions.argmax(axis=-1)
print(predictions_class)
Note that I have used the softmax activation because the classes are mutually exclusive.
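What to_categorical does to the labels, and how argmax maps predictions back to class numbers, can be sketched with plain NumPy (the helper name one_hot and the example probabilities are illustrative; this is not the Keras implementation):

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    labels = np.asarray(labels).reshape(-1)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


y = np.array([[1], [2], [3], [0]])
Y = one_hot(y, 4)
print(Y)

# Softmax-like outputs for two samples; argmax recovers the class index.
predictions = np.array([[0.1, 0.7, 0.1, 0.1],
                        [0.6, 0.2, 0.1, 0.1]])
print(predictions.argmax(axis=-1))
```

Each row of Y has a single 1 in the column of its class, which is the target format that categorical_crossentropy expects alongside a softmax output layer.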

How to input data into Keras? Specifically what is the x_train and y_train if I have more than 2 columns?

How can I input data into Keras? What structure does it expect? Specifically, what are x_train and y_train if I have more than 2 columns?
This is the data I want to input:
I am trying to define X_train in this example multilayer perceptron code from the Keras documentation (http://keras.io/examples/). Here is the code:
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD
model = Sequential()
model.add(Dense(64, input_dim=20, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(2, init='uniform'))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
model.fit(X_train, y_train, nb_epoch=20, batch_size=16)
score = model.evaluate(X_test, y_test, batch_size=16)
EDIT (additional information):
Looking here: What is data type for Python Keras deep learning package?
Keras uses numpy arrays containing the theano.config.floatX floating point type. This can be configured in your .theanorc file. Typically, it will be float64 for CPU computations and float32 for GPU computations, although you can also set it to float32 when working on the CPU if you prefer. You can create a zero-filled array of the proper type by the command
X = numpy.zeros((4,3), dtype=theano.config.floatX)
Question: Step 1 looks like creating a floating-point numpy array from the data in my Excel file above. What do I do with the winner column?
It all depends on your needs.
It looks like you want to predict the winner based on the parameters shown in columns A to N. In that case, you should set input_dim to 14, and X_train should be an (N, 14) numpy array like this:
[
[9278, 37.9, ...],
[18594, 36.3, ...],
...
]
It seems that your prediction set contains only 2 classes (2 presidential candidates, LOL), so you should encode the answer Y_train as an (N, 2) numpy array like this:
[
[1, 0],
[1, 0],
...
[0, 1],
[0, 1],
...
]
where [1, 0] indicates that Barack Obama is the winner and [0, 1] that the other candidate is.
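A minimal sketch of building the two arrays with NumPy, assuming the spreadsheet has already been read into Python lists (the feature values and winner labels below are placeholders, and only 3 of the 14 feature columns are shown to keep it short):

```python
import numpy as np

# Placeholder rows: one row per sample, one entry per feature column.
features = [
    [9278, 37.9, 0.5],
    [18594, 36.3, 0.7],
    [3196, 40.1, 0.2],
]
# Placeholder winner column: 0 = first candidate, 1 = second candidate.
winners = [0, 0, 1]

X_train = np.array(features, dtype="float32")   # shape (N, number_of_features)
Y_train = np.eye(2, dtype="float32")[winners]   # shape (N, 2), one-hot rows

print(X_train.shape, Y_train.shape)
print(Y_train)
```

With the full table, input_dim would be X_train.shape[1], i.e. 14, and the final Dense layer would have as many units as Y_train has columns.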