Dear Zhi Li,

If I train a model using the 'run_vmaf_training' process with some dataset, and then run the 'run_testing' process with that trained model and the same dataset, will I get the same results (SRCC, PCC, and RMSE)?
I thought the results should be the same, but actually they are sometimes different, especially for the RMSE number. The maximum difference I have found is 3% for the RMSE (e.g., train=10.0, test=9.7).
Allen
This is due to the slightly different workflows used by run_vmaf_training and run_testing. In run_vmaf_training, the feature scores (elementary metric scores) are first extracted from each frame, and each feature is then temporally pooled (by arithmetic mean) to form one feature score per clip. The per-clip feature scores are then fit to the subjective scores to obtain the trained model. The reported SRCC, PCC and RMSE are the result of that fit. In run_testing, the per-frame feature scores are first extracted, then the prediction model is applied on a per-frame basis, resulting in a "per-frame VMAF score". The final score for the clip is the arithmetic mean of the per-frame scores. As you can see, the 'temporal pooling' and 'prediction' operators are applied in the opposite order. Unless the prediction model is linear in the features, the prediction of the mean feature is generally not equal to the mean of the per-frame predictions, so the two workflows can give slightly different clip scores and therefore slightly different SRCC, PCC and RMSE. If the features from a clip are constant, the re-ordering will not have an impact. In practice, we find the numeric difference to be small.
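The effect of swapping the two operators can be reproduced with a toy example. This is only a minimal sketch, not the actual VMAF code: `toy_model` is a hypothetical nonlinear function standing in for the trained model, and the per-frame feature values are made up for illustration.

```python
import math
from statistics import mean


def toy_model(feature):
    """Hypothetical nonlinear predictor standing in for the trained model."""
    return 100.0 * (1.0 - math.exp(-3.0 * feature))


def pool_then_predict(frame_features):
    # Order described for run_vmaf_training:
    # average the per-frame features, then predict once per clip.
    return toy_model(mean(frame_features))


def predict_then_pool(frame_features):
    # Order described for run_testing:
    # predict a score for every frame, then average the per-frame scores.
    return mean(toy_model(f) for f in frame_features)


# Made-up per-frame feature values for two clips (illustration only).
varying = [0.2, 0.5, 0.9, 0.4]   # feature changes from frame to frame
constant = [0.5, 0.5, 0.5, 0.5]  # feature is identical in every frame

gap_varying = pool_then_predict(varying) - predict_then_pool(varying)
gap_constant = pool_then_predict(constant) - predict_then_pool(constant)

print("gap for varying clip: ", gap_varying)
print("gap for constant clip:", gap_constant)
```

For the clip with varying features the two orders give different scores, while for the constant clip the gap vanishes, which matches the explanation above. How large the gap is for a real model depends on how nonlinear the model is and how much the features vary within a clip.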