
Validation Accuracy Higher than Training Accuracy #85

Closed
bjfranks opened this issue Jan 7, 2019 · 1 comment
Comments


bjfranks commented Jan 7, 2019

Hi,

I was working with this model, extending it to answer some other research questions, and I noticed odd training behaviour with my model.

When I checked whether the same would happen with your model, it did. To be specific:
Epoch 1/50 … capsnet_acc: 0.1035 …
Epoch 00001: val_capsnet_acc improved from -inf to 0.08920, … (Nothing special here; it didn't learn anything in the first epoch)

Epoch 2/50 … capsnet_acc: 0.7618 … (It finally improved after the first-epoch hiccup)
Epoch 00002: val_capsnet_acc improved from 0.08920 to 0.96000, … (Wait, what? Where did this jump come from?)

Epoch 3/50 … capsnet_acc: 0.9307 … (Why is the accuracy on training worse than on validation after an epoch?)
Epoch 00003: val_capsnet_acc improved from 0.96000 to 0.96650, … (And why is there another jump in validation accuracy, again much higher than the training accuracy?)

This behaviour isn't odd in and of itself when using, for example, dropout layers; validation accuracy is likely to be higher in that case. However, I can't find any layer or regularization that is applied during training but not during testing, so what is going on here? Why does the accuracy make these jumps?
[screenshot: odd training behaviour]

XifengGuo (Owner) commented

@bjfranks
First, the validation accuracy is usually close to, or even higher than, the training accuracy in the first few epochs, which indicates the model is underfitted or generalizes well. As an extreme case, when there's only one validation sample, the validation accuracy will be exactly 0 or 1. So vali_acc > train_acc is possible, though I'm not quite sure this holds for most models.
Second, train_acc is the average of the mini-batch accuracies measured at different points during training, whereas vali_acc is computed with the model as it stands at the end of the epoch. Suppose the training data is divided into two mini-batches. The model is first trained on mini-batch 1, producing model-1, which yields accuracy acc_1 on that mini-batch. It is then trained on mini-batch 2, producing model-2 with the corresponding acc_2. The reported training accuracy over the whole epoch is train_acc = (acc_1 + acc_2) / 2, but the validation accuracy is computed by model-2 alone. Since model-2 is better than model-1, vali_acc > train_acc is possible at every epoch. If you use the training data as the validation data as well, you may observe train_acc(epoch i) < vali_acc(epoch i) < train_acc(epoch i+1) < vali_acc(epoch i+1). If your training and validation data are different, vali_acc(epoch i) > train_acc(epoch i+1) may still happen, due to the first reason (i.e., generalization).
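The two-mini-batch arithmetic above can be sketched as follows. The accuracy numbers are hypothetical, chosen only to illustrate the averaging effect, not taken from the logs in this issue:

```python
# Toy illustration of why the epoch's reported training accuracy can trail
# the validation accuracy: the training metric averages per-mini-batch
# accuracies measured while the weights are still improving, whereas
# validation runs once with the end-of-epoch weights.

# Hypothetical accuracy of the model on each mini-batch, measured right
# after the weight update for that batch (the model improves as the
# epoch runs, so acc_2 > acc_1).
acc_1 = 0.60   # model-1: after training on mini-batch 1
acc_2 = 0.92   # model-2: after also training on mini-batch 2

# Reported training accuracy: average over the mini-batches.
train_acc = (acc_1 + acc_2) / 2

# Validation accuracy: computed once, with model-2 (end-of-epoch weights).
# If the validation data resembles the training data, it tracks acc_2,
# so it lands above the mini-batch average.
val_acc = 0.90

print(f"train_acc={train_acc:.2f}  val_acc={val_acc:.2f}")
print("val_acc > train_acc:", val_acc > train_acc)
```

With these numbers train_acc is 0.76 while val_acc is 0.90, reproducing the "jump" seen in the logs without any dropout or other train-only regularization.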
