How to explain that no negatives are predicted? - neural-network

I have built this model with Keras:
from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
# Two stacked LSTM layers feeding a single sigmoid unit for binary classification
model.add(LSTM(50, return_sequences=True, input_shape=(look_back, trainX.shape[2])))
model.add(LSTM(50))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(trainX, trainY, validation_split=0.3, epochs=50, batch_size=1000, verbose=1)
and the results are surprising... When I do this:
from sklearn.metrics import confusion_matrix

trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
# Round the sigmoid outputs to 0/1 before building the confusion matrices
print(confusion_matrix(trainY, trainPredict.round()))
print(confusion_matrix(testY, testPredict.round()))
I respectively get :
[[129261      0]
 [   172 129138]]
and
[[10822     0]
 [10871     0]]
In other words, my training confusion matrix is quite fine, while my testing confusion matrix classifies everything as "positive". What is surprising is that my classes are almost perfectly balanced, both in the training and the testing set...
Why do I get this?
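One way to dig into this is to look at the raw sigmoid outputs before rounding. A minimal sketch (assuming matplotlib is available), reusing trainPredict and testPredict from above:
import matplotlib.pyplot as plt

# If the model has collapsed on the test set, its histogram piles up on one side of 0.5
plt.hist(trainPredict.ravel(), bins=50, alpha=0.5, label='train')
plt.hist(testPredict.ravel(), bins=50, alpha=0.5, label='test')
plt.axvline(0.5, linestyle='--')
plt.legend()
plt.show()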
Thanks

Related

What is wrong with my Siamese network? Why does it output the same value (approx. 0.5) irrespective of the input pairs?

I'm trying to build a Siamese network for the https://www.kaggle.com/moltean/fruits dataset. I've picked 10 images per class from this dataset; there are 131 classes in total. I'm using the model below to train my network. However, it is failing to converge. I noticed a strange behaviour: after 3000 epochs the output is 0.5000003 irrespective of the input pair I give, and the loss stops at 0.61. The specifications of the network are as given in the paper. I tried changing the following things:
Changing the Dense layer activation to ReLU
Importing 'ImageNet' weights for ResNet50
Increasing and decreasing the learning rate
I also checked the batch inputs to see whether each input pair (x) is paired with the correct y value. However, I think I'm doing something fundamentally wrong. I'd be glad if you could help me. Thank you :)
The notebook is hosted in Kaggle https://www.kaggle.com/krishnaprasad96/siamese-network.
If you have some doubts on how certain parts of the code work, refer to https://medium.com/@krishnaprasad_54871/siamese-networks-line-by-line-explanation-for-beginners-55b8be1d2fc6
# Building a sequential model
import keras
from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import Adam
from keras.regularizers import l2
from keras.utils import plot_model

input_shape = (100, 100, 3)
left_input = Input(input_shape)
right_input = Input(input_shape)
W_init = keras.initializers.RandomNormal(mean=0.0, stddev=1e-2)
b_init = keras.initializers.RandomNormal(mean=0.5, stddev=1e-2)
# Shared convolutional encoder, applied to both inputs
model = keras.models.Sequential([
    keras.layers.Conv2D(64, (10, 10), activation='relu', input_shape=input_shape, kernel_initializer=W_init, kernel_regularizer=l2(2e-4)),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(128, (7, 7), activation='relu', kernel_initializer=W_init, bias_initializer=b_init, kernel_regularizer=l2(2e-4)),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(128, (4, 4), activation='relu', kernel_initializer=W_init, bias_initializer=b_init, kernel_regularizer=l2(2e-4)),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(256, (4, 4), activation='relu', kernel_initializer=W_init, bias_initializer=b_init, kernel_regularizer=l2(2e-4)),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(4096, activation='sigmoid', kernel_initializer=W_init, bias_initializer=b_init, kernel_regularizer=l2(1e-3))
])
encoded_l = model(left_input)
encoded_r = model(right_input)
# Signed difference of the two embeddings, fed to a final sigmoid unit
subtracted = keras.layers.Subtract()([encoded_l, encoded_r])
prediction = Dense(1, activation='sigmoid', bias_initializer=b_init)(subtracted)
siamese_net = Model(inputs=[left_input, right_input], outputs=prediction)
optimizer = Adam(learning_rate=0.0006)
siamese_net.compile(loss='binary_crossentropy', optimizer=optimizer)
plot_model(siamese_net, show_shapes=True, show_layer_names=True)
I have seen the notebook on Kaggle. Thanks for all the information. But it seems that the training/validation split is wrong: this model trains on the initial 91 classes only. What about the remaining 40 classes? The train/validation split should be done within each class. Suppose I have 10 images in a class: I can use 8 images for training and 2 for validation. The split should be over images, not over classes (see the sketch below). Also, I couldn't see the testing script. It would be a great help if you could provide that as well.
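For illustration, a minimal sketch of such a within-class split (the names here are hypothetical; it assumes the images are already grouped in a dict mapping each class to a list of image arrays):
import random

def split_within_class(images_by_class, train_ratio=0.8, seed=42):
    """Split each class's images into train/validation subsets."""
    rng = random.Random(seed)
    train, val = {}, {}
    for cls, images in images_by_class.items():
        shuffled = images[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_ratio)  # e.g. 8 of 10 images per class
        train[cls] = shuffled[:cut]
        val[cls] = shuffled[cut:]
    return train, val
This way every one of the 131 classes is seen during training while images are still held out for validation.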

Keras Regression to approximate function (goal: loss < 1e-7)

I'm working on a neural network which approximates a function f(X) = y, with X a vector [x0, ..., xn] and y in [-inf, +inf]. This approximated function needs to have an accuracy (sum of errors) around 1e-8. In fact, I need my neural network to overfit.
X is composed of random points in the interval [-500, 500]. Before feeding these points into the input layer, I normalize them to [0, 1].
I use Keras as follows:
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from sklearn import preprocessing

dimension = 10  # example
self.model = Sequential()
self.model.add(Dense(128, input_shape=(dimension,), init='uniform', activation='relu'))
self.model.add(Dropout(.2))
self.model.add(Activation("linear"))  # note: a linear Activation after a ReLU Dense layer is a no-op
self.model.add(Dense(64, init='uniform', activation='relu'))
self.model.add(Activation("linear"))
self.model.add(Dense(64, init='uniform', activation='relu'))
self.model.add(Dense(1))
# Scale both inputs and targets to [0, 1]
X_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
y_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
X_scaled = X_scaler.fit_transform(train_dataset)
y_scaled = y_scaler.fit_transform(train_labels)
self.model.compile(loss='mse', optimizer='adam')
self.model.fit(X_scaled, y_scaled, epochs=10000, batch_size=10, verbose=1)
I tried different NNs, first [n] -> [2] -> [1] with the ReLU activation function, then [n] -> [128] -> [64] -> [1].
I tried the SGD optimizer, slowly increasing the learning rate from 1e-9 to 0.1.
I also tried without normalizing the data, but in that case the loss is very high.
My best loss (MSE) is 0.037 with the current setup, but I'm far from my goal (1e-8).
First, I would like to know if I did something wrong. Am I on the right track?
If not, how can I reach my goal?
Thank you very much
Try #2
I tried this new configuration:
model = Sequential()
model.add(Dense(128, input_shape=(10,), init='uniform', activation='relu'))
model.add(Dropout(.2))
model.add(Dense(64, init='uniform', activation='relu'))
model.add(Dense(64, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
On a sample of 50 elements, with a batch_size of 10 and 100000 epochs,
I get a loss around 1e-4.
Try #3
model = Sequential()
model.add(Dense(128, input_shape=(10,), activation='tanh'))
model.add(Dense(64, activation='tanh'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mse', optimizer='adam')
model.fit(X_scaled, y_scaled, epochs=100000, batch_size=1000, verbose=1)
Result: loss around 1e-7.
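Since the network is trained on MinMax-scaled targets, the reported MSE lives in the scaled [0, 1] space; a minimal sketch of mapping predictions back to the original scale with the y_scaler fitted above (assuming train_labels is the unscaled target array):
import numpy as np

# Predictions come out in [0, 1]; undo the MinMax scaling to compare against raw labels
y_pred = y_scaler.inverse_transform(model.predict(X_scaled))
print(np.mean((y_pred - np.asarray(train_labels).reshape(-1, 1)) ** 2))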

NN with Keras predicts classes as dtype=float32 as opposed to true class values of 1, 2, 3, why?

I am implementing a simple NN on the wine data set. The NN works well and produces the prediction score; however, when I try to explore the actual predicted values on the test data set, I receive an array of dtype=float32 values, as opposed to the class values.
The classes are labelled as 1, 2, 3.
I have 13 attributes and 178 observations (a small data set).
Below is the code of the implementation and the outcome I get:
df.head()
import numpy as np
from sklearn.model_selection import train_test_split

X = df.ix[:, 1:13]  # 12 feature columns (note: .ix is deprecated in newer pandas)
y = np.ravel(df.Type)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
scale the data:
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
define the NN
model = Sequential()
model.add(Dense(13, activation='relu', input_shape=(12,)))
model.add(Dense(4, activation='softmax'))
fit the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train1,epochs=20, batch_size=1, verbose=1)
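Presumably y_train1 and y_test1 are one-hot encoded versions of the labels; a minimal sketch of how they could be produced (an assumption, since this step isn't shown in the post):
from keras.utils import to_categorical

# Labels 1, 2, 3 one-hot encode into 4 columns (index 0 stays unused),
# matching the Dense(4, activation='softmax') output layer
y_train1 = to_categorical(y_train)
y_test1 = to_categorical(y_test)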
Now this is where I store my predictions into y_pred and get the final score:
y_pred = model.predict(X_test)
score = model.evaluate(X_test, y_test1, verbose=1)
59/59 [==============================] - 0s 2ms/step
[0.1106848283591917, 0.94915255247536356]
When I explore y_pred I see the following:
y_pred[:5]
array([[3.86571424e-04, 9.97601926e-01, 1.96467945e-03, 4.67598657e-05],
       [2.67244829e-03, 9.87006545e-01, 7.04612210e-03, 3.27492505e-03],
       [9.50196641e-04, 1.42343721e-04, 4.57215495e-02, 9.53185916e-01],
       [9.03929677e-03, 9.63497698e-01, 2.62350030e-02, 1.22799736e-03],
       [1.39460826e-05, 3.24015366e-03, 9.96408522e-01, 3.37353966e-04]], dtype=float32)
Not sure why I do not see the actual predicted classes as 1, 2, 3?
After trying to convert to int I just get an array of zeros, as all the values are so small.
I'd really appreciate your help!!
You are seeing the probabilities for each class. To convert the probabilities to a class, just take the argmax of each row:
import numpy as np
y_pred_class = np.argmax(y_pred,axis=1)
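As a quick usage check (the expected output below is inferred from the five rows printed above; since labels 1, 2, 3 were one-hot encoded with an unused column 0, the argmax index coincides with the original label):
print(y_pred_class[:5])  # array([1, 1, 3, 1, 2]) for the rows shown above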

DNN model - strange results: "loss" differs between train/test and Kaggle

I have trained a simple DNN model using Keras with the Theano backend, consisting of:
2 * (3x3 conv layer, 2x2 max-pool, 15% dropout),
a 96-unit FC layer,
softmax activation on the final layer,
optimizer: adam.
The goal was to perform a classification into 6 classes.
I have a dataset consisting of 116000 images, equally distributed among the classes.
After 14 epochs, the validation results were great:
loss 0.1, accuracy 92% on test data; during training I had similar results on my training and validation data.
The task is for learning purposes only, defined on the Kaggle website. They want a minimal log-loss result. When I upload my predictions there, the calculated loss is very high (> 2).
Do you have any suggestions?
some code representing the network and how I use it:
1. Inputs:
116000 images of size 160x160
2. Model
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential()
# Theano image ordering: (channels, height, width)
model.add(Convolution2D(64, 3, 3, activation='relu', input_shape=(1, 160, 160)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.15))
model.add(Convolution2D(128, 3, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.15))
model.add(Flatten())
model.add(Dense(96, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(len(classes), activation='softmax'))
3. Compile and fit:
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(np.array(trainData[0]), np.array(trainData[1]),
          batch_size=13, nb_epoch=14, verbose=1, shuffle=True,
          callbacks=callbacks_list,
          validation_data=(np.array(validationData[0]), np.array(validationData[1])))
4. Test with a new set of samples:
scores = model.evaluate(np.array(testData[0]), np.array(testData[1]), verbose=1)
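One way to sanity-check such a gap is to compute the same multi-class log-loss locally on the held-out predictions and compare it with the Kaggle score; a minimal sketch using scikit-learn (assuming testData holds one-hot labels, as in the evaluate call above):
import numpy as np
from sklearn.metrics import log_loss

# log_loss accepts a one-hot label matrix as y_true
probs = model.predict(np.array(testData[0]))
print(log_loss(np.array(testData[1]), probs))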

How to input data into Keras? Specifically, what are x_train and y_train if I have more than 2 columns?

How can I input data into Keras? What is the structure? Specifically, what are x_train and y_train if I have more than 2 columns?
This is the data I want to input (a spreadsheet with feature columns A-N and a "winner" column):
I am trying to define X_train in this example Multi-Layer Perceptron code from the Keras documentation (http://keras.io/examples/). Here is the code:
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD
model = Sequential()
model.add(Dense(64, input_dim=20, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(2, init='uniform'))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
model.fit(X_train, y_train, nb_epoch=20, batch_size=16)
score = model.evaluate(X_test, y_test, batch_size=16)
EDIT (additional information):
Looking here: What is data type for Python Keras deep learning package?
Keras uses numpy arrays containing the theano.config.floatX floating point type. This can be configured in your .theanorc file. Typically, it will be float64 for CPU computations and float32 for GPU computations, although you can also set it to float32 when working on the CPU if you prefer. You can create a zero-filled array of the proper type by the command
X = numpy.zeros((4,3), dtype=theano.config.floatX)
Question: Step 1 looks like creating a floating-point numpy array from my above data in the Excel file. What do I do with the winner column?
It all depends on your needs.
It looks like you want to predict the winner based on the parameters shown in columns A-N. Then you should set input_dim to 14, and X_train should be an (N, 14) numpy array like this:
[
[9278, 37.9, ...],
[18594, 36.3, ...],
...
]
It seems that your prediction set only contains 2 items (2 presidential candidates LOL), so you should encode the answer Y_train as an (N, 2) numpy array like this:
[
[1, 0],
[1, 0],
...
[0, 1],
[0, 1],
...
]
where [1,0] indicates that Barack Obama is the winner and vice versa.
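A minimal sketch of building these arrays with numpy (the values are hypothetical and truncated; it assumes each spreadsheet row has already been read into a Python list of its 14 feature values):
import numpy as np

# One row per election record; in the real data each row holds all 14 features (columns A-N)
X_train = np.array([
    [9278, 37.9],   # truncated to 2 of the 14 features for brevity
    [18594, 36.3],
], dtype='float32')

# One-hot "winner" encoding: [1, 0] = Obama wins, [0, 1] = the other candidate
Y_train = np.array([
    [1, 0],
    [0, 1],
], dtype='float32')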