MusicCaps train-test splitting details #4

Open
jpgard opened this issue Sep 14, 2023 · 2 comments

jpgard commented Sep 14, 2023

Congrats on the great work! This is a really useful model and the demo is super handy as well.

I am wondering how you performed the train-test split on the MusicCaps dataset. Specifically, which parts of MusicCaps were used for LP-MusicCaps training, and which were used for evaluation? Was the audio from the eval set used during training (with generated captions)? Is there code in the repo that could be used to replicate your splitting process?

The paper says "we present the captioning result for MusicCaps [12] evaluation set", but it is not clear whether the audio and tags from that evaluation set were used during model training. MusicCaps also contains a few fields ("is_balanced_subset", "is_audioset_eval") that look like they could be used for test-set partitioning, so it would be great to know how you divided the dataset and which audio/tags/captions were used at the various stages of the experiments.
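
For reference, here is a minimal sketch of inspecting those two flags, assuming the Hugging Face google/MusicCaps mirror of the official metadata (this is not code from this repo):

```python
# Sketch only: inspect the MusicCaps split-related flags.
# Assumes the Hugging Face "google/MusicCaps" mirror of the official metadata CSV;
# the field names "is_balanced_subset" and "is_audioset_eval" come from the dataset card.
from datasets import load_dataset

mc = load_dataset("google/MusicCaps", split="train")  # metadata only, no audio

print(mc.column_names)
print(sum(mc["is_audioset_eval"]), "rows flagged is_audioset_eval")
print(sum(mc["is_balanced_subset"]), "rows flagged is_balanced_subset")
```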

Thank you for the clarification!


jpgard commented Sep 15, 2023

I can see track_split.json with the contents below, but it isn't clear from that file what the validation/test sets are, nor where this split is used in the experiments.

{
    "train_track": [
        "[rOOBAGxxjBk]-[10-20]",
        "[OmjfHQB_lcs]-[30-40]",
        "[KxVbdGPAfjE]-[30-40]",
        "[WyGJdstaxK4]-[30-40]",
        "[qEGNzCWQdqo]-[30-40]",
        "[Zbmm_hXcrA0]-[160-170]",
        "[AHmcuClSTL4]-[100-110]",
        "[OMcoFfaCaGM]-[30-40]",
        "[pIwn0udLJXI]-[120-130]",
        "[60OIHit4Q-M]-[30-40]",
        "[kh6rmFg3U4k]-[480-490]",
        "[24cmo2fEQo8]-[60-70]",
        "[-kpR93atgd8]-[30-40]",
        "[4zZiWBp0b08]-[30-40]",
        "[yreWOyWr6Uk]-[330-340]",
        "[aKhM6zyL--k]-[330-340]",
        "[XEIP1OUXU8E]-[140-150]",
        "[WTVC7ZI9WtY]-[30-40]",
        "[yRWndZvIAHc]-[30-40]",
        "[sOJSjVp6UTc]-[30-40]"
    ],
    "valid_track": [],
    "test_track": []
}
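
In case it is useful, here is a small sketch of parsing those entries, assuming the "[ytid]-[start-end]" convention shown above (the parsing helper is hypothetical, not something from the repo):

```python
# Sketch: parse "[ytid]-[start_sec-end_sec]" entries from track_split.json
# into (YouTube ID, start, end) tuples. The filename and ID format come from
# the excerpt above; the helper itself is hypothetical.
import json
import re

TRACK_RE = re.compile(r"\[(.+)\]-\[(\d+)-(\d+)\]")

def parse_track(track_id):
    ytid, start, end = TRACK_RE.fullmatch(track_id).groups()
    return ytid, int(start), int(end)

with open("track_split.json") as f:
    split = json.load(f)

train_tracks = [parse_track(t) for t in split["train_track"]]
print(len(train_tracks), train_tracks[0])  # ('rOOBAGxxjBk', 10, 20) for the excerpt above
```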


seungheondoh commented Sep 18, 2023

@jpgard Thank you for reporting this issue! First, I apologize for any confusion caused by the paper.

  1. MusicCaps Evaluation Split: We used the 2.86k items flagged by "is_audioset_eval" as the evaluation split (see the sketch after this list). Please use the link below for reference. Additionally, we used only "caption_ground_truth" as the evaluation captions.
  2. Pseudo captions from the MusicCaps dataset are used only in Section 3, "EVALUATION OF PSEUDO CAPTIONS," while pseudo captions based on the MSD dataset are used in Section 5, "AUTOMATIC MUSIC CAPTIONING."
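
A minimal sketch of selecting that evaluation split from the MusicCaps metadata, assuming the Hugging Face google/MusicCaps mirror (the exact filtering code used for the paper may differ):

```python
# Sketch only: reproduce the MusicCaps evaluation split by filtering on the
# "is_audioset_eval" flag, as described above. Assumes the Hugging Face
# "google/MusicCaps" mirror of the official metadata CSV.
from datasets import load_dataset

mc = load_dataset("google/MusicCaps", split="train")

eval_set = mc.filter(lambda row: row["is_audioset_eval"])
train_set = mc.filter(lambda row: not row["is_audioset_eval"])

print(len(eval_set), "evaluation items")   # about 2.86k, per the answer above
print(len(train_set), "remaining items")
```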

If you have any further questions or concerns, please feel free to reply!
