Preposition synthesis script for the ICASSP 2023 paper "Audio-Text Models Do Not Yet Leverage Natural Language"
- Download AudioSet from https://research.google.com/audioset/
- Download AudioCaps metadata file from https://github.com/cdjkim/audiocaps/blob/master/dataset/train.csv
- Edit the variables AUDIOSET_FILE_PATH, AUDIOCAPS_METADATA_PATH, and OUTPUT_PATH in the script
- Run the script:
    python preposition_synthesis.py
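
Before running, it may help to confirm that the two input paths exist. The following is a minimal sketch, not part of preposition_synthesis.py: the path values are placeholders to replace with your own, and the helper find_missing_inputs is a hypothetical name introduced here for illustration. Only the three variable names come from the script.

```python
import os

# Placeholder values (assumptions): replace with your local paths.
AUDIOSET_FILE_PATH = "/path/to/audioset"
AUDIOCAPS_METADATA_PATH = "/path/to/audiocaps/train.csv"
OUTPUT_PATH = "/path/to/output"


def find_missing_inputs(paths):
    """Return the names of the variables whose path does not exist on disk."""
    return [name for name, path in paths.items() if not os.path.exists(path)]


if __name__ == "__main__":
    # OUTPUT_PATH is not checked because the script may create it itself.
    missing = find_missing_inputs({
        "AUDIOSET_FILE_PATH": AUDIOSET_FILE_PATH,
        "AUDIOCAPS_METADATA_PATH": AUDIOCAPS_METADATA_PATH,
    })
    if missing:
        print("Missing inputs: " + ", ".join(missing))
    else:
        print("All inputs found.")
```

Whether OUTPUT_PATH must exist beforehand depends on how the script writes its results, so the sketch leaves it unchecked.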