Is it normal in PyTorch for accuracy to increase and decrease repeatedly? - neural-network
I am new to PyTorch and am currently working on a simple transfer learning script. While training my model, the accuracy and loss swing widely up and down from one epoch to the next. I trained the network for 50 epochs, and below is the result:
Epoch [1/50], Loss: 0.5477, Train Accuracy: 63%
Epoch [2/50], Loss: 2.1935, Train Accuracy: 75%
Epoch [3/50], Loss: 1.8811, Train Accuracy: 79%
Epoch [4/50], Loss: 0.0671, Train Accuracy: 77%
Epoch [5/50], Loss: 0.2522, Train Accuracy: 80%
Epoch [6/50], Loss: 0.0962, Train Accuracy: 88%
Epoch [7/50], Loss: 1.8883, Train Accuracy: 74%
Epoch [8/50], Loss: 0.3565, Train Accuracy: 83%
Epoch [9/50], Loss: 0.0228, Train Accuracy: 81%
Epoch [10/50], Loss: 0.0124, Train Accuracy: 81%
Epoch [11/50], Loss: 0.0252, Train Accuracy: 84%
Epoch [12/50], Loss: 0.5184, Train Accuracy: 81%
Epoch [13/50], Loss: 0.1233, Train Accuracy: 86%
Epoch [14/50], Loss: 0.1704, Train Accuracy: 82%
Epoch [15/50], Loss: 2.3164, Train Accuracy: 79%
Epoch [16/50], Loss: 0.0294, Train Accuracy: 85%
Epoch [17/50], Loss: 0.2860, Train Accuracy: 85%
Epoch [18/50], Loss: 1.5114, Train Accuracy: 81%
Epoch [19/50], Loss: 0.1136, Train Accuracy: 86%
Epoch [20/50], Loss: 0.0062, Train Accuracy: 80%
Epoch [21/50], Loss: 0.0748, Train Accuracy: 84%
Epoch [22/50], Loss: 0.1848, Train Accuracy: 84%
Epoch [23/50], Loss: 0.1693, Train Accuracy: 81%
Epoch [24/50], Loss: 0.1297, Train Accuracy: 77%
Epoch [25/50], Loss: 0.1358, Train Accuracy: 78%
Epoch [26/50], Loss: 2.3172, Train Accuracy: 75%
Epoch [27/50], Loss: 0.1772, Train Accuracy: 79%
Epoch [28/50], Loss: 0.0201, Train Accuracy: 80%
Epoch [29/50], Loss: 0.3810, Train Accuracy: 84%
Epoch [30/50], Loss: 0.7281, Train Accuracy: 79%
Epoch [31/50], Loss: 0.1918, Train Accuracy: 81%
Epoch [32/50], Loss: 0.3289, Train Accuracy: 88%
Epoch [33/50], Loss: 1.2363, Train Accuracy: 81%
Epoch [34/50], Loss: 0.0362, Train Accuracy: 89%
Epoch [35/50], Loss: 0.0303, Train Accuracy: 90%
Epoch [36/50], Loss: 1.1700, Train Accuracy: 81%
Epoch [37/50], Loss: 0.0031, Train Accuracy: 81%
Epoch [38/50], Loss: 0.1496, Train Accuracy: 81%
Epoch [39/50], Loss: 0.5070, Train Accuracy: 76%
Epoch [40/50], Loss: 0.1984, Train Accuracy: 77%
Epoch [41/50], Loss: 0.1152, Train Accuracy: 79%
Epoch [42/50], Loss: 0.0603, Train Accuracy: 82%
Epoch [43/50], Loss: 0.2293, Train Accuracy: 84%
Epoch [44/50], Loss: 0.1304, Train Accuracy: 80%
Epoch [45/50], Loss: 0.0381, Train Accuracy: 82%
Epoch [46/50], Loss: 0.1833, Train Accuracy: 84%
Epoch [47/50], Loss: 0.0222, Train Accuracy: 84%
Epoch [48/50], Loss: 0.0010, Train Accuracy: 81%
Epoch [49/50], Loss: 1.0852, Train Accuracy: 79%
Epoch [50/50], Loss: 0.0167, Train Accuracy: 83%
Some epochs have much better accuracy and loss than others, but the model loses those gains in later epochs. As far as I know, the accuracy should improve every epoch. Did I write the training code incorrectly? If not, is this normal? Is there any way to fix it? Should I save the previous accuracy and train one more epoch only if the accuracy of the next epoch is greater than the previous one? I have worked with Keras previously, and I haven't experienced this problem. I am fine-tuning a ResNet by freezing the pretrained weights and replacing the final layer with one that outputs only 2 classes. Below is my code:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model_conv.fc.parameters(), lr=0.001, momentum=0.9)
num_epochs = 50
for epoch in range(num_epochs):
    # Reset the correct count to 0 after passing through the whole dataset
    correct = 0
    for images, labels in dataloaders['train']:
        images = Variable(images)
        labels = Variable(labels)
        if torch.cuda.is_available():
            images = images.cuda()
            labels = labels.cuda()
        optimizer.zero_grad()
        outputs = model_conv(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        _, predicted = torch.max(outputs, 1)
        correct += (predicted == labels).sum()
    train_acc = 100 * correct / dataset_sizes['train']
    print('Epoch [{}/{}], Loss: {:.4f}, Train Accuracy: {}%'
          .format(epoch+1, num_epochs, loss.item(), train_acc))
I would say it depends on the dataset and the architecture. Fluctuations are therefore normal, but in general the loss should improve. The fluctuations could be a result of noise in the test dataset, i.e. wrongly labeled examples.
If the test accuracy starts to decrease it might be that your network is overfitting.
You might want to stop the learning just before you reach that point or take other steps to counter the overfitting problem.
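The idea of stopping before the score degrades (and the question's idea of remembering the best accuracy so far) can be sketched as follows. This is only an illustration in plain Python: the function name `best_epoch_with_patience`, the `patience` parameter and the example scores are assumptions made up for this sketch, not part of the question's code.

```python
def best_epoch_with_patience(val_scores, patience=3):
    """Scan per-epoch validation scores and stop after `patience`
    consecutive epochs without a new best score.

    Returns (best_epoch, best_score, stopped_epoch), epochs counted from 1.
    """
    best_score = float("-inf")
    best_epoch = None
    epochs_without_improvement = 0

    for epoch, score in enumerate(val_scores, start=1):
        if score > best_score:
            best_score, best_epoch = score, epoch
            epochs_without_improvement = 0
            # In a real PyTorch loop this is where the weights would be
            # checkpointed, e.g. torch.save(model.state_dict(), path).
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, best_score, epoch
    return best_epoch, best_score, len(val_scores)


# Made-up validation accuracies, for illustration only.
scores = [0.70, 0.75, 0.80, 0.78, 0.79, 0.77, 0.90]
print(best_epoch_with_patience(scores, patience=3))  # (3, 0.8, 6)
```

Note that training continues after a worse epoch instead of being rolled back; only the best weights are kept. With `patience=3` the sketch stops at epoch 6 and never sees the later 0.90, which is the usual trade-off when picking a patience value.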
Is it normal in PyTorch for accuracy to increase and decrease repeatedly
Compared at the level of whole epochs, the loss should generally go down.
At the level of a single batch it may fluctuate, but it should still get smaller over time, since that is the whole point: by minimizing the loss we are improving accuracy.
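Worth noting for the code in the question: the value printed there is `loss.item()` after the inner loop, i.e. the loss of the last mini-batch of the epoch only, which by itself can jump around a lot. A minimal sketch of an epoch-level average, in plain Python with made-up numbers (the helper name `epoch_mean_loss` is an assumption for this illustration):

```python
def epoch_mean_loss(batch_losses, batch_sizes):
    """Average per-batch mean losses over an epoch, weighting each
    batch by its number of samples (the last batch is often smaller)."""
    if len(batch_losses) != len(batch_sizes):
        raise ValueError("need one batch size per batch loss")
    total = sum(loss * n for loss, n in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)


# Made-up per-batch losses for one epoch, for illustration only.
batch_losses = [0.5, 2.0, 0.1]
batch_sizes = [32, 32, 16]

print("last batch only:", batch_losses[-1])
print("epoch mean:     ", epoch_mean_loss(batch_losses, batch_sizes))
```

In the loop from the question this would correspond to accumulating `loss.item() * images.size(0)` inside the batch loop and dividing by `dataset_sizes['train']` after it, then printing that value instead of the last batch's loss.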
Related
Changing the loss function leads to the neural network returning NaNs
I'm using Deep SVDD on CIFAR10 for one-class classification. When I change the L2 norm to Lp for p<1, I get NaNs after some epochs. It works for
loss = torch.mean((outputs - inputs)**2)
but I get NaN for
loss = torch.mean((abs(outputs - inputs))**(0.9))
The loss for each epoch is shown here:
INFO:root: Epoch 1/50 Time: 1.514 Loss: 84.51767029
INFO:root: Epoch 2/50 Time: 1.617 Loss: 82.70055634
INFO:root: Epoch 3/50 Time: 1.528 Loss: 80.92372467
INFO:root: Epoch 4/50 Time: 1.612 Loss: 79.23560699
INFO:root: Epoch 5/50 Time: 1.495 Loss: 77.56893951
INFO:root: Epoch 6/50 Time: 1.596 Loss: 75.95311737
INFO:root: Epoch 7/50 Time: 1.504 Loss: 74.40722260
INFO:root: Epoch 8/50 Time: 1.593 Loss: 72.84329010
INFO:root: Epoch 9/50 Time: 1.639 Loss: 71.34644287
INFO:root: Epoch 10/50 Time: 1.578 Loss: 69.86484253
INFO:root: Epoch 11/50 Time: 1.553 Loss: 68.41005692
INFO:root: Epoch 12/50 Time: 1.670 Loss: 66.96582977
INFO:root: Epoch 13/50 Time: 1.607 Loss: 65.56927887
INFO:root: Epoch 14/50 Time: 1.573 Loss: 64.20584961
INFO:root: Epoch 15/50 Time: 1.605 Loss: 62.85230591
INFO:root: Epoch 16/50 Time: 1.483 Loss: 61.53305466
INFO:root: Epoch 17/50 Time: 1.616 Loss: 60.22836166
INFO:root: Epoch 18/50 Time: 1.499 Loss: 58.94760498
INFO:root: Epoch 19/50 Time: 1.611 Loss: 57.73990845
INFO:root: Epoch 20/50 Time: 1.507 Loss: 56.51732086
INFO:root: Epoch 21/50 Time: 1.624 Loss: 55.30994400
INFO:root: Epoch 22/50 Time: 1.482 Loss: 54.13251587
INFO:root: Epoch 23/50 Time: 1.606 Loss: 52.98952118
INFO:root: Epoch 24/50 Time: 1.508 Loss: 51.86713654
INFO:root: Epoch 25/50 Time: 1.587 Loss: 50.76639069
INFO:root: Epoch 26/50 Time: 1.523 Loss: 49.68750381
INFO:root: Epoch 27/50 Time: 1.574 Loss: 48.62197098
INFO:root: Epoch 28/50 Time: 1.537 Loss: 47.59307220
INFO:root: Epoch 29/50 Time: 1.560 Loss: 46.58890167
INFO:root: Epoch 30/50 Time: 1.607 Loss: 45.59774643
INFO:root: Epoch 31/50 Time: 1.504 Loss: 44.61755203
INFO:root: Epoch 32/50 Time: 1.592 Loss: 43.67579239
INFO:root: Epoch 33/50 Time: 1.480 Loss: 42.76135941
INFO:root: Epoch 34/50 Time: 1.577 Loss: 41.84933487
INFO:root: Epoch 35/50 Time: 1.488 Loss: 40.96647171
INFO:root: Epoch 36/50 Time: 1.596 Loss: 40.10220779
INFO:root: Epoch 37/50 Time: 1.534 Loss: 39.26658310
INFO:root: Epoch 38/50 Time: 1.615 Loss: 38.44916168
INFO:root: Epoch 39/50 Time: 1.518 Loss: nan
INFO:root: Epoch 40/50 Time: 1.574 Loss: nan
INFO:root: Epoch 41/50 Time: 1.511 Loss: nan
INFO:root: Epoch 42/50 Time: 1.556 Loss: nan
INFO:root: Epoch 43/50 Time: 1.565 Loss: nan
INFO:root: Epoch 44/50 Time: 1.561 Loss: nan
INFO:root: Epoch 45/50 Time: 1.600 Loss: nan
INFO:root: Epoch 46/50 Time: 1.518 Loss: nan
INFO:root: Epoch 47/50 Time: 1.618 Loss: nan
INFO:root: Epoch 48/50 Time: 1.540 Loss: nan
INFO:root: Epoch 49/50 Time: 1.591 Loss: nan
INFO:root: Epoch 50/50 Time: 1.504 Loss: nan
For different learning rates and output dimensions, the network still returns NaN after some epochs.
What should I consider in my situation to decrease the val_loss?
I am new to CNNs, and I would like to know how I can improve my model. Augmentation is already done. Thanks in advance.
model = Sequential()
model.add(Conv2D(16, (3,3), activation='relu', strides=(1,1), padding='same', input_shape=input_shape))
model.add(Conv2D(32, (3,3), activation='relu', strides=(1,1), padding='same'))
model.add(Conv2D(64, (3,3), activation='relu', strides=(1,1), padding='same'))
#model.add(Conv2D(128, (3,3), activation='relu', strides=(1,1),
#                 padding='same'))
#model.add(MaxPool2D(2,2))#AveragePooling2D
model.add(AveragePooling2D(2,2))#AveragePooling2D
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.summary()
#opt = keras.optimizers.Adam(learning_rate=0.001)
model.compile(loss='categorical_crossentropy', optimizer="Adam", metrics=['acc'])
history = model.fit(X, y, epochs=150, batch_size=32, shuffle=True, validation_split=0.1, callbacks=[checkpoint])
Epoch 00140: val_acc did not improve from 0.93082
Epoch 141/150
28620/28620 [==============================] - 37s 1ms/step - loss: 0.1654 - acc: 0.9401 - val_loss: 0.2388 - val_acc: 0.9267
Epoch 00141: val_acc did not improve from 0.93082
Epoch 142/150
28620/28620 [==============================] - 38s 1ms/step - loss: 0.1314 - acc: 0.9516 - val_loss: 0.2728 - val_acc: 0.9091
Epoch 00142: val_acc did not improve from 0.93082
Epoch 143/150
28620/28620 [==============================] - 37s 1ms/step - loss: 0.1425 - acc: 0.9476 - val_loss: 0.2439 - val_acc: 0.9242
Epoch 00143: val_acc did not improve from 0.93082
Epoch 144/150
28620/28620 [==============================] - 37s 1ms/step - loss: 0.1434 - acc: 0.9473 - val_loss: 0.3709 - val_acc: 0.8824
Epoch 00144: val_acc did not improve from 0.93082
Epoch 145/150
28620/28620 [==============================] - 37s 1ms/step - loss: 0.1483 - acc: 0.9468 - val_loss: 0.2544 - val_acc: 0.9208
Epoch 00145: val_acc did not improve from 0.93082
Epoch 146/150
28620/28620 [==============================] - 35s 1ms/step - loss: 0.1366 - acc: 0.9501 - val_loss: 0.2872 - val_acc: 0.9110
Epoch 00146: val_acc did not improve from 0.93082
Epoch 147/150
28620/28620 [==============================] - 36s 1ms/step - loss: 0.1476 - acc: 0.9465 - val_loss: 0.3147 - val_acc: 0.9013
Epoch 00147: val_acc did not improve from 0.93082
Epoch 148/150
28620/28620 [==============================] - 36s 1ms/step - loss: 0.1391 - acc: 0.9486 - val_loss: 0.2838 - val_acc: 0.9069
Epoch 00148: val_acc did not improve from 0.93082
Epoch 149/150
28620/28620 [==============================] - 35s 1ms/step - loss: 0.1392 - acc: 0.9486 - val_loss: 0.2541 - val_acc: 0.9211
Epoch 00149: val_acc did not improve from 0.93082
Epoch 150/150
28620/28620 [==============================] - 37s 1ms/step - loss: 0.1401 - acc: 0.9489 - val_loss: 0.2213 - val_acc: 0.9308
Epoch 00150: val_acc did not improve from 0.93082
Why does loss decrease but accuracy decreases too (Pytorch, LSTM)?
I have built a model with LSTM - Linear modules in PyTorch for a classification problem (10 classes). I am training the model, and for each epoch I output the loss and accuracy on the training set. The output is as follows:
epoch: 0 start! Loss: 2.301875352859497 Acc: 0.11388888888888889
epoch: 1 start! Loss: 2.2759320735931396 Acc: 0.29
epoch: 2 start! Loss: 2.2510263919830322 Acc: 0.4872222222222222
epoch: 3 start! Loss: 2.225804567337036 Acc: 0.6066666666666667
epoch: 4 start! Loss: 2.199286699295044 Acc: 0.6511111111111111
epoch: 5 start! Loss: 2.1704766750335693 Acc: 0.6855555555555556
epoch: 6 start! Loss: 2.1381614208221436 Acc: 0.7038888888888889
epoch: 7 start! Loss: 2.1007182598114014 Acc: 0.7194444444444444
epoch: 8 start! Loss: 2.0557992458343506 Acc: 0.7283333333333334
epoch: 9 start! Loss: 1.9998993873596191 Acc: 0.7427777777777778
epoch: 10 start! Loss: 1.9277743101119995 Acc: 0.7527777777777778
epoch: 11 start! Loss: 1.8325848579406738 Acc: 0.7483333333333333
epoch: 12 start! Loss: 1.712520718574524 Acc: 0.7077777777777777
epoch: 13 start! Loss: 1.6056485176086426 Acc: 0.6305555555555555
epoch: 14 start! Loss: 1.5910680294036865 Acc: 0.4938888888888889
epoch: 15 start! Loss: 1.6259561777114868 Acc: 0.41555555555555557
epoch: 16 start! Loss: 1.892195224761963 Acc: 0.3655555555555556
epoch: 17 start! Loss: 1.4949012994766235 Acc: 0.47944444444444445
epoch: 18 start! Loss: 1.4332982301712036 Acc: 0.48833333333333334
For the loss function I have used nn.CrossEntropyLoss, with the Adam optimizer. Although the loss is constantly decreasing, the accuracy increases until epoch 10 and then for some reason begins to decrease. Why is this happening? Even if my model is overfitting, doesn't that mean that the accuracy should be high? (I am always speaking of accuracy and loss measured on the training set, not the validation set.)
A decreasing loss does not always mean improving accuracy. I will try to address this for the cross-entropy loss:
CE-loss = sum(-log p(y=i))
Note that the loss decreases if the probability of the correct class increases, and the loss increases if the probability of the correct class decreases. Now, when you compute the average loss, you are averaging over all the samples. Some of the probabilities may increase and some of them may decrease, so the overall loss can get smaller while the accuracy also drops.
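A small numeric illustration of this effect, as a sketch in plain Python. It assumes a binary problem, so a sample counts as correctly classified when the probability assigned to its true class is above 0.5; the probabilities are made up for the example.

```python
import math


def mean_cross_entropy(correct_class_probs):
    """Average of -log p(true class) over all samples."""
    return sum(-math.log(p) for p in correct_class_probs) / len(correct_class_probs)


def binary_accuracy(correct_class_probs):
    """Binary case: a sample is correct when its true class gets p > 0.5."""
    return sum(p > 0.5 for p in correct_class_probs) / len(correct_class_probs)


# Probability assigned to the true class of four samples (made-up values).
earlier = [0.55, 0.55, 0.55, 0.05]  # three barely right, one badly wrong
later = [0.95, 0.95, 0.45, 0.45]    # two confidently right, two barely wrong

for name, probs in [("earlier", earlier), ("later", later)]:
    print(name, "loss:", round(mean_cross_entropy(probs), 4),
          "accuracy:", binary_accuracy(probs))
```

Between the two snapshots the mean loss goes down, because one very costly mistake was mostly repaired and two predictions became confident, while the accuracy also goes down, because two samples slipped just below the decision threshold.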
Test score vs test accuracy when evaluating model using Keras
I'm using a neural network implemented with the Keras library, and below are the results during training. At the end it prints a test score and a test accuracy. I can't figure out exactly what the score represents, but I assume the accuracy is the fraction of predictions that were correct when running the test.
Epoch 1/15
1200/1200 [==============================] - 4s - loss: 0.6815 - acc: 0.5550 - val_loss: 0.6120 - val_acc: 0.7525
Epoch 2/15
1200/1200 [==============================] - 3s - loss: 0.5481 - acc: 0.7250 - val_loss: 0.4645 - val_acc: 0.8025
Epoch 3/15
1200/1200 [==============================] - 3s - loss: 0.5078 - acc: 0.7558 - val_loss: 0.4354 - val_acc: 0.7975
Epoch 4/15
1200/1200 [==============================] - 3s - loss: 0.4603 - acc: 0.7875 - val_loss: 0.3978 - val_acc: 0.8350
Epoch 5/15
1200/1200 [==============================] - 3s - loss: 0.4367 - acc: 0.7992 - val_loss: 0.3809 - val_acc: 0.8300
Epoch 6/15
1200/1200 [==============================] - 3s - loss: 0.4276 - acc: 0.8017 - val_loss: 0.3884 - val_acc: 0.8350
Epoch 7/15
1200/1200 [==============================] - 3s - loss: 0.3975 - acc: 0.8167 - val_loss: 0.3666 - val_acc: 0.8400
Epoch 8/15
1200/1200 [==============================] - 3s - loss: 0.3916 - acc: 0.8183 - val_loss: 0.3753 - val_acc: 0.8450
Epoch 9/15
1200/1200 [==============================] - 3s - loss: 0.3814 - acc: 0.8233 - val_loss: 0.3505 - val_acc: 0.8475
Epoch 10/15
1200/1200 [==============================] - 3s - loss: 0.3842 - acc: 0.8342 - val_loss: 0.3672 - val_acc: 0.8450
Epoch 11/15
1200/1200 [==============================] - 3s - loss: 0.3674 - acc: 0.8375 - val_loss: 0.3383 - val_acc: 0.8525
Epoch 12/15
1200/1200 [==============================] - 3s - loss: 0.3624 - acc: 0.8367 - val_loss: 0.3423 - val_acc: 0.8650
Epoch 13/15
1200/1200 [==============================] - 3s - loss: 0.3497 - acc: 0.8475 - val_loss: 0.3069 - val_acc: 0.8825
Epoch 14/15
1200/1200 [==============================] - 3s - loss: 0.3406 - acc: 0.8500 - val_loss: 0.2993 - val_acc: 0.8775
Epoch 15/15
1200/1200 [==============================] - 3s - loss: 0.3252 - acc: 0.8600 - val_loss: 0.2960 - val_acc: 0.8775
400/400 [==============================] - 0s
Test score: 0.299598811865
Test accuracy: 0.88
Looking at the Keras documentation, I still don't understand what the score is. For the evaluate function, it says: Returns the loss value & metrics values for the model in test mode. One thing I noticed is that when the test accuracy is lower, the score is higher, and when the accuracy is higher, the score is lower.
For reference, the two relevant parts of the code:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
score, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)
Score is the evaluation of the loss function for a given input. Training a network means finding parameters that minimize a loss function (or cost function). The cost function here is binary_crossentropy. For a target T and a network output O, the binary crossentropy can be defined as
f(T,O) = -(T*log(O) + (1-T)*log(1-O))
So the score you see is the evaluation of that. If you feed it a batch of inputs, it will most likely return the mean loss. So yes, if your model has a lower loss (at test time), it should often have a lower prediction error.
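The formula above can be evaluated by hand to see how the score behaves. The following is a sketch in plain Python, not Keras's own implementation; the function names and the small `eps` clamp (added here only to avoid log(0)) are assumptions of this example.

```python
import math


def binary_crossentropy(target, output, eps=1e-7):
    """f(T, O) = -(T*log(O) + (1-T)*log(1-O)) for one sample.

    `output` is clamped to [eps, 1 - eps] so the logarithm stays finite;
    the clamp is a choice made for this sketch.
    """
    o = min(max(output, eps), 1.0 - eps)
    return -(target * math.log(o) + (1 - target) * math.log(1 - o))


def mean_binary_crossentropy(targets, outputs):
    """Mean loss over a batch, i.e. the kind of number a 'score' would be."""
    losses = [binary_crossentropy(t, o) for t, o in zip(targets, outputs)]
    return sum(losses) / len(losses)


# Made-up targets and predicted probabilities, for illustration only.
targets = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]
unsure = [0.6, 0.4, 0.55, 0.45]

print("confident predictions:", mean_binary_crossentropy(targets, confident))
print("unsure predictions:   ", mean_binary_crossentropy(targets, unsure))
```

Both sets of predictions are on the correct side of 0.5 for every sample, so the accuracy is the same, yet the confident set gets the lower mean loss. This is why the score and the accuracy tend to move in opposite directions without being the same quantity.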
Loss is used in the training process to find the "best" parameter values for your model (e.g. the weights in a neural network). It is what you try to optimize during training by updating the weights. Accuracy is more of an applied perspective. Once you have found the optimized parameters above, you use this metric to evaluate how accurate your model's predictions are compared to the true data. This answer provides more detail: How to interpret "loss" and "accuracy" for a machine learning model
TFLEARN multivariable regression does not converge (attempting to duplicate MATLAB fitnet)
I am trying to write a model in TFLEARN to fit to 16 parameters. I have previously run this same experiment in Matlab using the "fitnet" function with 2 hidden layers of 2000 and 1500 nodes. I am attempting to replicate these results in tensorflow before exploring other architectures/descent algos/hyperparameter tuning. I have done some research and determined that the matlab fitnet function uses tanh nodes for hidden layers and linear nodes for the output. Also, the descent algorithm defaults to Levenberg-Marquardt, but it worked for me with other (sgd) algorithms as well. It appears that the accuracy is maxing out around .2 and then oscillating below this over successive epochs. I did not see this behavior in matlab. My TFLEARN code looks like:
tnorm = tflearn.initializations.uniform_scaling()
adam = tflearn.optimizers.Adam(learning_rate=0.1, beta1=0.9, beta2=0.999, epsilon=1e-08, use_locking=False, name='Adam')
# network building
input_data = tflearn.input_data(shape=[None, np.shape(prepared_x)[1]])
fc1 = tflearn.fully_connected(input_data, 2000, activation='tanh', weights_init=tnorm)
fc2 = tflearn.fully_connected(fc1, 1500, activation='tanh', weights_init=tnorm)
output = tflearn.fully_connected(fc2, 16, activation='linear', weights_init=tnorm)
network = tflearn.regression(output, optimizer=adam, loss='mean_square')
# define model with checkpoints
model = tflearn.DNN(network, tensorboard_dir='output/', tensorboard_verbose=3, checkpoint_path='output')
# Train Model
model.fit(prepared_x, prepared_t, n_epoch=5, batch_size=100, shuffle=True, show_metric=True, snapshot_epoch=False, validation_set=0.1)
# save
model.save('TFLEARN_FC_final.tfl')
The output of the training session looks like:
Run id: UTSD6N
Log directory: output/
---------------------------------
Training samples: 43200
Validation samples: 4800
--
Training Step: 1
| Adam | epoch: 000 | loss: 0.00000 - acc: 0.0000 -- iter: 00100/43200
Training Step: 2 | total loss: 0.67871
| Adam | epoch: 000 | loss: 0.67871 - acc: 0.0455 -- iter: 00200/43200
Training Step: 3 | total loss: 33.14599
| Adam | epoch: 000 | loss: 33.14599 - acc: 0.0082 -- iter: 00300/43200
Training Step: 4 | total loss: 28.01067
| Adam | epoch: 000 | loss: 28.01067 - acc: 0.0021 -- iter: 00400/43200
Training Step: 5 | total loss: 17.35706
| Adam | epoch: 000 | loss: 17.35706 - acc: 0.0006 -- iter: 00500/43200
Training Step: 6 | total loss: 9.73368
| Adam | epoch: 000 | loss: 9.73368 - acc: 0.0002 -- iter: 00600/43200
Training Step: 7 | total loss: 5.19867
| Adam | epoch: 000 | loss: 5.19867 - acc: 0.0001 -- iter: 00700/43200
Training Step: 8 | total loss: 3.54779
| Adam | epoch: 000 | loss: 3.54779 - acc: 0.0113 -- iter: 00800/43200
Training Step: 9 | total loss: 3.80998
| Adam | epoch: 000 | loss: 3.80998 - acc: 0.0106 -- iter: 00900/43200
Training Step: 10 | total loss: 4.33370
| Adam | epoch: 000 | loss: 4.33370 - acc: 0.0053 -- iter: 01000/43200
Training Step: 11 | total loss: 4.24100
...
| Adam | epoch: 004 | loss: 0.02448 - acc: 0.1817 -- iter: 42800/43200
Training Step: 2157 | total loss: 0.02633
| Adam | epoch: 004 | loss: 0.02633 - acc: 0.1875 -- iter: 42900/43200
Training Step: 2158 | total loss: 0.02509
| Adam | epoch: 004 | loss: 0.02509 - acc: 0.1688 -- iter: 43000/43200
Training Step: 2159 | total loss: 0.02525
| Adam | epoch: 004 | loss: 0.02525 - acc: 0.1529 -- iter: 43100/43200
Training Step: 2160 | total loss: 0.02695
| Adam | epoch: 005 | loss: 0.02695 - acc: 0.1456 -- iter: 43200/43200
[image: accuracy/loss from tensorboard]
Any suggestions would be much appreciated.
For any future lurkers: I solved my own problem by fixing the descent algorithm. The learning rate was too high (the code above passes learning_rate=0.1 to the Adam optimizer, whose default is .001); I had to switch to .005 for convergence.
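Why a step size that is too large prevents convergence can be seen on a toy problem. The sketch below uses plain gradient descent on f(x) = x^2 rather than Adam or the network above, so the specific learning rates are only illustrative; the function name `gradient_descent` and its parameters are assumptions of this example.

```python
def gradient_descent(lr, steps=20, x0=1.0):
    """Minimize f(x) = x**2 with fixed-step gradient descent.

    The gradient is 2*x, so each step multiplies x by (1 - 2*lr):
    the iterates shrink when |1 - 2*lr| < 1 and grow otherwise.
    """
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x
    return x


for lr in (0.1, 0.45, 1.1):
    print("lr =", lr, "-> x after 20 steps:", gradient_descent(lr))
```

With the small step sizes x heads toward the minimum at 0, while with lr = 1.1 every update overshoots by more than it corrects and the iterates grow in magnitude. Adaptive optimizers such as Adam behave differently in detail, but an oversized learning rate can still leave the loss oscillating instead of settling.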