MusicCaps train-test splitting details #4

jpgard:

Congrats on the great work! This is a really useful model, and the demo is super handy as well.

I am wondering how you performed train-test splitting on the MusicCaps dataset. Specifically, which parts of MusicCaps were used for LP-MusicCaps training, and which were used for evaluation? Was the audio from the eval set used during training (with generated captions)? Is there code somewhere in the repo that could be used to replicate your splitting process?

The paper says "we present the captioning result for MusicCaps [12] evaluation set", but it is not clear whether the audio and tags from that evaluation set were also used during model training. MusicCaps contains a few fields ("is_balanced_subset", "is_audioset_eval") that look like they could be used for test-set partitioning. It would be great to know how you divided the dataset and which audio/tags/captions were used at each stage of the experiments.

Thank you for the clarification!

jpgard (follow-up):

I can see track_split.json with the contents below, but it isn't clear from that file what the validation/test sets are, nor where this split is used in the experiments.

Maintainer:

@jpgard Thank you for reporting this issue! First, I apologize for any confusion caused by the paper.
If you have any further questions or concerns, please feel free to reply!
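For reference, a minimal sketch of the kind of partitioning the question describes, splitting on the `is_audioset_eval` flag from the public MusicCaps metadata. The field names (`ytid`, `is_audioset_eval`, `is_balanced_subset`) come from the MusicCaps CSV, but the rows below are invented and this is not necessarily the procedure the LP-MusicCaps authors used:

```python
# Hypothetical sketch only -- NOT the authors' confirmed split. It shows how
# the MusicCaps metadata fields mentioned above (is_audioset_eval,
# is_balanced_subset) could drive a train/eval partition.
import csv
import io

# Invented rows in the shape of the public MusicCaps CSV; real ytids differ.
CSV_TEXT = """ytid,is_audioset_eval,is_balanced_subset
fake_id_1,True,True
fake_id_2,False,False
fake_id_3,True,False
fake_id_4,False,False
"""

def split_by_flag(rows, flag="is_audioset_eval"):
    """Partition rows: flag == 'True' goes to eval, everything else to train."""
    eval_rows = [r for r in rows if r[flag] == "True"]
    train_rows = [r for r in rows if r[flag] != "True"]
    return train_rows, eval_rows

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
train, evalset = split_by_flag(rows)
print(len(train), len(evalset))  # 2 2
```

Whether the authors split on this flag, on `is_balanced_subset`, or on a custom track list (e.g. track_split.json) is exactly what the question above is asking.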