
Is the full_scale video data (5TB) needed for the VQ2D task? #3

fcakyon opened this issue Mar 27, 2022 · 2 comments

fcakyon commented Mar 27, 2022

Thanks for this wonderful work!

How can I reduce the download size if I only want to work on the VQ2D task?

The command given here downloads more than 5 TB of data: https://github.com/EGO4D/episodic-memory/blob/main/VQ2D/README.md#running-experiments

miguelmartin75 (Collaborator) commented Mar 28, 2022

You should be able to download just the subset required for EM VQ by providing the --benchmarks em flag to the CLI (see here). This will still download more videos than necessary (since it covers the entire EM benchmark), sitting at 2.75 TB, e.g.

python3 -m ego4d.cli.cli --output_directory=<dir> --dataset full_scale --benchmark vq

If you want to download less, I would recommend passing in the video uids to download via --video_uid_file, with the video uids derived from the annotation JSON files.
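
For example, here is a minimal sketch of building such a file from the VQ annotation JSON (the field names "videos" and "video_uid", the file names, and the one-uid-per-line format are assumptions; adjust them to the actual annotation schema):

import json

# Hypothetical sketch: collect the video uids referenced by the VQ
# annotations and write them out for use with --video_uid_file.
# Assumes the annotation JSON has a top-level "videos" list whose
# entries each carry a "video_uid" field.
with open("vq_train.json") as f:
    annotations = json.load(f)

video_uids = sorted({v["video_uid"] for v in annotations["videos"]})

# Assumes --video_uid_file expects one uid per line.
with open("vq_video_uids.txt", "w") as f:
    f.write("\n".join(video_uids))

print(f"Wrote {len(video_uids)} video uids")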

There are also canonical clips. These are clips specific to each benchmark task and are subsets of the full videos. For VQ they are ~5 FPS clips containing only the frames where there are annotations. They are much smaller, sitting at around 700 GB (for all of EM).
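
If the canonical clips are sufficient for your use case, the download would follow the same CLI pattern as above, e.g. (a sketch assuming the canonical clips are exposed as a dataset named clips; check the CLI documentation for the exact dataset name):

python3 -m ego4d.cli.cli --output_directory=<dir> --dataset clips --benchmark vq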

cc @ebyrne

fcakyon (Author) commented Apr 4, 2022

@miguelmartin75 thanks for the response! What is the purpose of the canonical clips? Should I use them to train my proposed model, or would that result in suboptimal training?
