Validation accuracy in Keras not consistent

I am working on a binary classification model (CNN), and I have separated my classes into different folders under /data/train and /data/validation accordingly.
Last layers:
model.add(Flatten())  # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
I get this message which indicates that classes are correctly linked to my data:
Found 638 images belonging to 2 classes.
Found 214 images belonging to 2 classes.
The weird thing is that I get really good val_acc:
loss: 0.2453 - acc: 0.8858 - val_loss: 0.2000 - val_acc: 0.9231
Which does not seem valid: random tests on 15 images drawn from across the dataset all produce the same result, 1, for model.predict_classes(x).
Why is val_acc so high?
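One hedged sanity check (an assumption, not a confirmed diagnosis): if the training generator rescales pixels, e.g. with rescale=1./255 in ImageDataGenerator, then images fed manually to model.predict_classes need the same rescaling, otherwise the sigmoid can saturate to a constant class. The image path and the 150x150 target size below are placeholders:

import numpy as np
from keras.preprocessing import image

# hypothetical path and input size; adjust to your setup
img = image.load_img('data/validation/class_a/img_001.jpg', target_size=(150, 150))
x = image.img_to_array(img) / 255.0   # same rescaling as the training generator
x = np.expand_dims(x, axis=0)
print(model.predict_classes(x))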


Why does the 1st epoch take a lot of time when training a neural network model?

I have a question about training my neural network: the first epoch takes by far the most time. Right now the first epoch takes around 50 minutes while each subsequent epoch takes only 2 minutes. Why is that the case?
Where should I look to resolve this, if it is a problem?
Here is the model code for reference:
model = Sequential()
model.add(Conv3D(2, (3, 3, 3), padding='same',
                 input_shape=[num_of_frame, img_rows, img_cols, img_channels]))
model.add(Activation('relu'))
model.add(Conv3D(64, (3, 3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(32))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
I am using Adam as the optimizer with a batch size of 30, running on Google Colab.
Here is the code for training and validation:
model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch,
                    epochs=num_epochs, verbose=1, callbacks=callbacks_list,
                    validation_data=val_generator, validation_steps=validation_steps,
                    class_weight=None, workers=1, initial_epoch=0)
It turns out that because my training code was merged with the generator, the overall time was longer. Once I called next() manually on my data generator, all epochs started behaving in a similar way.
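A minimal sketch of that warm-up (assuming train_generator is a plain Python generator, as in the post): pull a few batches before calling fit_generator, so any one-time data preparation happens up front rather than inside the first epoch.

# warm up the generator so first-epoch timing reflects training only
for _ in range(5):
    x_batch, y_batch = next(train_generator)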

Mixed-effects linear regression model using multiple independent measurements

I am trying to implement a linear mixed-effects (LME) regression model for an x-ray imaging quality metric, CNR (contrast-to-noise ratio), which I measured for various tube potentials (kV) and filtration materials (Filter). CNR was measured for 3 consecutive slices, so I also have a standard deviation of the CNR from these independent measurements. I am wondering how I can incorporate these multiple independent measurements into my analysis. A representation of the data for a single measurement and my first attempt using fitlme are shown below. I tried looking at online resources but could not find an answer to my specific questions.
kV=[80 90 100 80 90 100 80 90 100]';
Filter={'Al','Al','Al','Cu','Cu','Cu','Ti','Ti','Ti'}';
CNR=[10 9 8 10.1 8.9 7.9 7 6 5]';
T=table(kV,Filter,CNR);
    kV     Filter    CNR
    ___    ______    ____

     80    'Al'      10
     90    'Al'       9
    100    'Al'       8
     80    'Cu'      10.1
     90    'Cu'       8.9
    100    'Cu'       7.9
     80    'Ti'       7
     90    'Ti'       6
    100    'Ti'       5
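The fit itself was along these lines (reconstructed from the Formula line in the output below; the exact call is not shown above):

lme = fitlme(T, 'CNR ~ 1 + kV + Filter')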
OUTPUT
Linear mixed-effects model fit by ML

Model information:
    Number of observations              9
    Fixed effects coefficients          4
    Random effects coefficients         0
    Covariance parameters               1

Formula:
    CNR ~ 1 + kV + Filter

Model fit statistics:
    AIC        BIC        LogLikelihood    Deviance
    -19.442    -18.456    14.721           -29.442

Fixed effects coefficients (95% CIs):
    Name             Estimate     SE           tStat       pValue
    '(Intercept)'     18.3        0.17533      104.37      1.5308e-09
    'kV'             -0.10333     0.0019245    -53.692     4.2372e-08
    'Filter_Cu'      -0.033333    0.03849      -0.86603
    'Filter_Ti'      -3           0.03849      -77.942

(The values -0.86603 and -77.942 printed for the Filter rows are tStat values, i.e. Estimate/SE, not p-values; the corresponding p-values were lost in the paste.)

Random effects covariance parameters (95% CIs):
Group: Error
    Name         Estimate    Lower      Upper
    'Res Std'    0.04714     0.0297     0.074821
Questions/issues with the current implementation:
How should the fixed-effects coefficient for '(Intercept)', with p = 1.5308e-9, be interpreted?
I only included fixed effects. Should the standard deviation of the ROI measurements somehow be incorporated into the random effects as well?
How do I incorporate the three independent measurements of CNR for the three consecutive slices for a given kV/Filter combination? Should I just add more rows to the table T? This would result in a total of 27 observations (see the sketch below).
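One way to set up those 27 observations (a hedged sketch, not a definitive model; CNR_slices is a hypothetical 9x3 matrix holding the three per-slice CNR values for each row of T):

idx   = repelem((1:9)', 3);            % each kV/Filter condition repeated 3 times
Slice = repmat((1:3)', 9, 1);          % slice index within each condition
T3 = table(kV(idx), Filter(idx), reshape(CNR_slices', [], 1), Slice, ...
           'VariableNames', {'kV', 'Filter', 'CNR', 'Slice'});
lme = fitlme(T3, 'CNR ~ 1 + kV + Filter + (1|Slice)');

Whether Slice deserves its own random intercept, or the replicates should simply enter as repeated observations (dropping the (1|Slice) term), depends on whether consecutive slices share systematic offsets.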

Univariate time series multi-step-ahead prediction using a multi-layer perceptron (MLP)

I have univariate time series data and I want to do multi-step prediction.
I came across this question, which explains time series one-step prediction, but I am interested in multi-step-ahead prediction.
E.g., typical univariate time series data looks like:
time    value
----    -----
t1      a1
t2      a2
...
t100    a100
Suppose I want 3-step-ahead prediction. Can I frame my problem like this?
TrainX TrainY
[a1,a2,a3,a4,a5,a6] -> [a7,a8,a9]
[a2,a3,a4,a5,a6,a7] -> [a8,a9,a10]
[a3,a4,a5,a6,a7,a8] -> [a9,a10,a11]
.................. ...........
.................. ...........
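A minimal sketch of this windowing scheme (series is a hypothetical 1-D NumPy array holding a1..a100; the helper name is mine, not a library function):

import numpy as np

def make_windows(series, n_in=6, n_out=3):
    # slide a window over the series: n_in inputs predict the next n_out values
    X, y = [], []
    for i in range(len(series) - n_in - n_out + 1):
        X.append(series[i:i + n_in])
        y.append(series[i + n_in:i + n_in + n_out])
    return np.array(X), np.array(y)

TrainX, TrainY = make_windows(series)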
I am using Keras with TensorFlow as the backend.
The first layer has 50 neurons and expects 6 inputs, the hidden layer has 30 neurons, and the output layer has 3 neurons (i.e., it outputs three time series values).
model = Sequential()
model.add(Dense(50, input_dim=6, activation='relu',kernel_regularizer=regularizers.l2(0.01)))
model.add(Dense(30, activation='relu',kernel_regularizer=regularizers.l2(0.01)))
model.add(Dense(3))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(TrainX, TrainY, epochs=300, batch_size=16)
My model should be able to predict a107, a108, a109 when my input is a101, a102, a103, a104, a105, a106.
Is this a valid model? Am I missing something?
That model might do it, but you would probably benefit from using LSTM layers (recurrent networks for sequences):
from keras.models import Sequential
from keras.layers import LSTM

# TrainX.shape = (number of samples, time steps, features per step)
# TrainX.shape = (number of samples, 6, 1)
model = Sequential()
model.add(LSTM(50, input_shape=(6, 1), return_sequences=True))
model.add(LSTM(30, return_sequences=True))
model.add(LSTM(3, return_sequences=False))

(TrainX built for the Dense model would need reshaping to three dimensions first, e.g. TrainX.reshape((-1, 6, 1)).)
You may also be missing an activation function that limits the output to the possible range of the values you want to predict.
Often we work with values from 0 to 1 (activation='sigmoid') or from -1 to 1 (activation='tanh').
This would also require limiting the inputs to the same range, since the inputs and outputs are the same kind of values.
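A hedged sketch of that scaling step using scikit-learn (series and TestX are hypothetical arrays; the reshapes are needed because MinMaxScaler expects 2-D column data):

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(0, 1))
series_scaled = scaler.fit_transform(series.reshape(-1, 1)).ravel()
# train on windows built from series_scaled, then undo the scaling:
preds_scaled = model.predict(TestX)                      # shape (n, 3)
preds = scaler.inverse_transform(
    preds_scaled.reshape(-1, 1)).reshape(preds_scaled.shape)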

Using Torch's ClassNLLCriterion

I am currently using Torch and just trying to get a simple neural network program running. Each of my inputs has 3 attributes, and the output is supposed to be a classification among the numbers 1 to 7. I've extracted my data from a CSV file and put it into 2 tensors (1 with the inputs and 1 with the outputs). The data is in this format:
**Data**
1914 1993 2386
1909 1990 2300
.....
1912 1989 2200
[torch.DoubleTensor of size 99999x3]
**Class**
1
1
2
.....
7
[torch.DoubleTensor of size 99999]
For the model I'm using to train the network, I simply have
model = nn.Sequential()
model:add(nn.Linear(3, 7))
model:add(nn.LogSoftMax())
criterion = nn.ClassNLLCriterion()
And this is the code I have to train the network
for i = 1, 10 do
    prediction = model:forward(data)
    loss = criterion:forward(prediction, class)
    model:zeroGradParameters()
    grad = criterion:backward(prediction, class)
    model:backward(data, grad)
    model:updateParameters(0.1)
end
I have formatted my test data tensor in the same way as the training data (a tensor of size 99999x3). I want the program to give me a prediction of the classification when I run this line:
print (model:forward(test_data))
However, I am getting negative numbers (which shouldn't happen with ClassNLLCriterion?) and the probabilities do not sum to 1. My suspicion is that I have either not formatted the data correctly or not performed the training process correctly. If anyone could help me figure out what the issue is, I'd be very grateful.
Thank you!
The reason you cannot see the predictions lies in the layer model:add(nn.LogSoftMax()), which applies the log function; that is why you get negative values (they are log-probabilities, not probabilities). To get the probabilities back, take the exponential. As an example:
model = nn.Sequential()
model:add(nn.Linear(3, 7))
model:add(nn.LogSoftMax())
criterion = nn.ClassNLLCriterion()

data = torch.Tensor{1914, 1993, 2386}
print(model:forward(data):exp())
>> 0.0000
0.0000
1.0000
0.0000
0.0000
0.0000
0.0000 [torch.DoubleTensor of size 7]
Sorry for the late answer.
Here is what I currently use, which may be the wrong way of using ClassNLLCriterion, but at least it will get you somewhere toward understanding it.
I first made the targets either

(7,1,1,1,1,1,1) <-- first class representation
.......
(1,1,1,1,1,1,7) <-- last class representation

or

(1,1,1,1,1,1,1) <-- first class representation
.......
(7,7,7,7,7,7,7) <-- last class representation

I figured it was much easier to train on the last representation as targets, but I had a feeling we should use the first one instead.
EDIT: I've just found out that ClassNLLCriterion only accepts scalars as targets, hence using the above is wrong!
You should instead use the class indices 1 to 7 directly as target values, one scalar per sample.
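A minimal sketch of that scalar-target setup (hypothetical toy data; data is assumed to be an N x 3 batch, so max along dimension 2 recovers the predicted class from the log-probabilities):

-- targets are plain class indices, one per sample
class = torch.Tensor{1, 1, 2, 7}
prediction = model:forward(data)       -- N x 7 log-probabilities
_, predicted = prediction:max(2)       -- index of the largest log-probability per row
print(predicted)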

Interpretation of Probability Estimate for Multi-class classification in LibSVM for MATLAB

Problem: 3-class classification with labels 1, 2, 3.
Tool: LibSVM for MATLAB
svmModel = svmtrain(<TrainClassLabels>, <TrainFeatures>, '-b 1 -c <someCValue> -g <someGammaValue>');
[predLabels, classAccuracy, probEstimates] = svmpredict(<TestClassLabels>, <TestFeatures>, svmModel, '-b 1');
(In the LibSVM MATLAB interface the label vector comes first, and svmpredict needs the trained model as its third argument.)
After this step, I get the first ten rows of probEstimates to be:
0.9129 0.0749 0.0122
0.9059 0.0552 0.0389
0.8231 0.0183 0.1586
0.9077 0.0098 0.0825
0.9074 0.0668 0.0257
0.8685 0.0146 0.1169
0.8962 0.0664 0.0374
0.9074 0.0548 0.0377
0.9474 0.0054 0.0472
0.9178 0.0642 0.0180
but the first ten predicted labels to be:
2
2
2
2
2
2
2
2
2
2
Questions:
My understanding was that the probability estimate is the probability that a particular item belongs to a particular class, given its feature vector. If that were true, then these items should belong to class 1 and not class 2. Does LibSVM change the order of the classes, or am I missing something here? If I am wrong, can someone please explain the real interpretation of the probability estimates?
If I have to move the decision boundary to increase the precision of class 1 (predict fewer items as class 1 and hence be more conservative at the decision boundary), which of these class probabilities do I have to work with, and how?
I came across the same problem recently.
The reason is related to the order of the training data: the columns of the probability vector follow the order in which labels first appear during training.
If you want the index of the posterior-probability vector to correspond to the label of the training data, the training data should be sorted according to the label.
For example, if the label of the first training data point is 4, then the first entry of the posterior-probability vector corresponds to data points labeled 4.
The order of the labels stored in the model may differ from what we think it should be. You can check it using svmModel.Label; the probability estimates are output according to this order.
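A hedged MATLAB sketch of both points (variable names follow the post; the 0.95 threshold is a made-up example value):

% reorder the columns of probEstimates so that column k corresponds to label k
[~, order] = sort(svmModel.Label);
probSorted = probEstimates(:, order);

% be more conservative about predicting class 1: require a high threshold,
% otherwise fall back to the most probable of the remaining classes
thr = 0.95;
[~, alt] = max(probSorted(:, 2:end), [], 2);
adjLabels = alt + 1;                   % best of classes 2..3
adjLabels(probSorted(:, 1) > thr) = 1; % predict class 1 only when confident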