
Validation Accuracy Higher than Training Accuracy #85

Closed
bjfranks opened this issue Jan 7, 2019 · 1 comment
Comments


bjfranks commented Jan 7, 2019

Hi,

I was working with this model, extending it to answer some other research questions, and I noticed odd training behaviour with my model.

When I checked whether the same would happen with your model, it did. To be specific:
Epoch 1/50 … capsnet_acc: 0.1035 …
Epoch 00001: val_capsnet_acc improved from -inf to 0.08920, … (Nothing special here; it didn't learn anything in the first epoch)

Epoch 2/50 … capsnet_acc: 0.7618 … (It finally improved after the first-epoch hiccup)
Epoch 00002: val_capsnet_acc improved from 0.08920 to 0.96000, … (Wait, what? Where did this jump come from?)

Epoch 3/50 … capsnet_acc: 0.9307 … (Why is the accuracy on training worse than on validation after an epoch?)
Epoch 00003: val_capsnet_acc improved from 0.96000 to 0.96650, … (And why is there another jump in validation accuracy, again much higher than the training accuracy?)

This behaviour isn't odd in and of itself when using, for example, dropout layers; validation accuracy is likely to be higher in that case. However, I can't find any layer or regularization that is applied during training but not during testing, so what is going on here? Why does the accuracy make these jumps?
[screenshot: odd training behaviour]

XifengGuo (Owner) commented

@bjfranks
First, the validation accuracy is usually close to, or even higher than, the training accuracy in the first few epochs, which indicates the model is underfitted or generalizes well. As an extreme case, when there's only one validation sample, the validation accuracy will be exactly 0 or 1. So vali_acc > train_acc is possible, though I'm not quite sure this holds for most models.
Second, train_acc is the average of the mini-batch accuracies measured at different points during training, whereas vali_acc is computed with the model as it stands at the end of the epoch. Suppose the training data is divided into two mini-batches. The model is first trained on mini-batch 1, producing model-1, which yields accuracy acc_1 on that mini-batch. It is then trained on mini-batch 2, producing model-2 with the corresponding acc_2. The reported training accuracy over the whole epoch is train_acc = (acc_1 + acc_2) / 2, but the validation accuracy is computed by model-2 alone. Since model-2 is better than model-1, vali_acc > train_acc is possible at every epoch. If you use the training data as the validation data as well, you may observe train_acc(epoch i) < vali_acc(epoch i) < train_acc(epoch i+1) < vali_acc(epoch i+1). If your training and validation data are different, vali_acc(epoch i) > train_acc(epoch i+1) may still happen, due to the first reason (i.e., generalization).
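The two-mini-batch arithmetic above can be sketched as follows. The accuracy numbers are hypothetical, chosen only to illustrate the averaging effect, not taken from the logs in this issue:

```python
# Toy illustration of why the epoch's reported training accuracy can trail
# the validation accuracy: the training metric averages per-mini-batch
# accuracies measured while the weights are still improving, whereas
# validation runs once with the end-of-epoch weights.

# Hypothetical accuracy of the model on each mini-batch, measured right
# after the weight update for that batch (the model improves as the
# epoch runs, so acc_2 > acc_1).
acc_1 = 0.60   # model-1: after training on mini-batch 1
acc_2 = 0.92   # model-2: after also training on mini-batch 2

# Reported training accuracy: average over the mini-batches.
train_acc = (acc_1 + acc_2) / 2

# Validation accuracy: computed once, with model-2 (end-of-epoch weights).
# If the validation data resembles the training data, it tracks acc_2,
# so it lands above the mini-batch average.
val_acc = 0.90

print(f"train_acc={train_acc:.2f}  val_acc={val_acc:.2f}")
print("val_acc > train_acc:", val_acc > train_acc)
```

With these numbers train_acc is 0.76 while val_acc is 0.90, reproducing the "jump" seen in the logs without any dropout or other train-only regularization.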
