Discrepancy of 0.04% in ActivityNet-QA evaluation code #28

Open
israwal opened this issue May 5, 2023 · 0 comments
israwal commented May 5, 2023

Hi, I am consistently finding a difference of -0.04% in the reported performance on the ActivityNet-QA dataset when using the official evaluation code (https://github.com/MILVLG/activitynet-qa).
Replicating the results on ActivityNet-QA:

  1. Accuracy using this repo's evaluation with Singularity-Temporal (n=12 frames, num_temporal_layers=2, ckpt: ft_anet_qa_singularity_temporal_17m.pth): 44.01%
  2. Accuracy using the official ActivityNet-QA evaluation code: 43.97%

Bonus: the ActivityNet-QA evaluation code also reports accuracy for each question sub-type :)
Accuracy (per question type):
Motion: 32.2500%
Spatial Relation: 22.6250%
Temporal Relation: 4.1250%
Free: 75.7523%
All: 43.9750%
Accuracy of the Free type questions (per answer type):
Yes/No: 75.1194%
Color: 51.3630%
Object: 27.6730%
Location: 39.8964%
Number: 54.4554%
Other: 36.2241%
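
For context, the per-type numbers above boil down to exact-match accuracy grouped by question type. Below is a minimal sketch of that computation, assuming records are (prediction, ground_truth, question_type) triples; the function name and the string normalization are hypothetical, not taken from the actual MILVLG script, and normalization differences are exactly the kind of place a small discrepancy could creep in:

```python
from collections import defaultdict

def per_type_accuracy(records):
    """Exact-match accuracy, overall and per question type.

    records: iterable of (prediction, ground_truth, question_type) triples.
    """
    correct, total = defaultdict(int), defaultdict(int)
    for pred, gt, qtype in records:
        total[qtype] += 1
        # Hypothetical normalization; the official script may normalize
        # answers differently (or not at all).
        if pred.strip().lower() == gt.strip().lower():
            correct[qtype] += 1
    overall = sum(correct.values()) / sum(total.values())
    return overall, {t: correct[t] / total[t] for t in total}

# Toy usage
records = [("yes", "yes", "Free"), ("no", "yes", "Free"), ("red", "red", "Free")]
overall, by_type = per_type_accuracy(records)
print(f"All: {overall:.4%}")  # All: 66.6667%
for t, acc in sorted(by_type.items()):
    print(f"{t}: {acc:.4%}")
```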

P.S.: The difference of -0.04% is consistent across all my experiments on ActivityNet-QA.

Thanks in advance!
