Allowed Inputs

The guideline below shows, for each subtask, which input fields are allowed (the default, shown as a blank cell) and which are disallowed (marked with '✗') at inference time.
Participants are free to use any of the fields below during training as additional supervision signals. At inference time, a model may then use its own reconstructed or predicted values in place of the disallowed ground-truth fields.

NOTE: In general, at inference time for a given turn, participants are not allowed to use any ground-truth information from future turns. For instance, for a coreference resolution task at turn i, models should not directly use the mentioned object IDs from the Assistant response at turn i, or any information from turn i+1 onward. Doing so would be regarded as "peeking into the future" and is therefore unfair and invalid.
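
The rule in the note above can be sketched as a small history builder. This is only an illustration under stated assumptions: it assumes a dialog is available as a list of turn dicts ordered by `turn_idx`, and the function name `build_inference_history` and the sample utterances are hypothetical, not part of the dataset or any official tooling.

```python
from typing import Any, Dict, List


def build_inference_history(turns: List[Dict[str, Any]], i: int) -> List[str]:
    """Collect the utterances visible when predicting at turn i (0-based).

    Previous turns contribute both the user and the assistant utterance.
    The current turn contributes only the user utterance, and turns after i
    are never read.
    """
    history: List[str] = []
    for turn in turns[:i]:
        history.append(f"User: {turn['transcript']}")
        history.append(f"Assistant: {turn['system_transcript']}")
    # Current turn: the assistant response is excluded.
    history.append(f"User: {turns[i]['transcript']}")
    return history


# Hypothetical three-turn dialog, for illustration only.
turns = [
    {"turn_idx": 0, "transcript": "Show me jackets.",
     "system_transcript": "Here are two."},
    {"turn_idx": 1, "transcript": "How much is the red one?",
     "system_transcript": "It is $50."},
    {"turn_idx": 2, "transcript": "Add it to my cart.",
     "system_transcript": "Done."},
]

history = build_inference_history(turns, 1)
```

With `i = 1`, the history holds both utterances of turn 0 and the user utterance of turn 1; the assistant reply at turn 1 and everything in turn 2 stay out.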

| Key | Subtask #1 (Ambiguous Candidate Identification) | Subtask #2 (Multimodal Coreference Resolution) | Subtask #3 (MM-DST) | Subtask #4 (Response Generation) |
|---|---|---|---|---|
| **Dialog JSON File (Turn Level Input Fields)** | | | | |
| `system_transcript` (previous turns) | | | | |
| `system_transcript` (current turn) | ✗ | ✗ | ✗ | ✗ (prediction target) |
| `system_transcript_annotated` (previous turns) | ✗ (except mentioned object IDs) | ✗ (except mentioned object IDs) | ✗ (except mentioned object IDs) | ✗ |
| `system_transcript_annotated` (current turn) | ✗ | ✗ | ✗ | |
| `transcript` | | | | |
| `transcript_annotated` | ✗ (prediction target) | ✗ (prediction target) | ✗ (prediction target) | ✗ |
| `turn_idx` | | | | |
| `disambiguation_label` | ✗ | ✗ | ✗ | ✗ |
| `disambiguation_transcript` (in older data only) | ✗ | ✗ | ✗ | ✗ |
| `disambiguation_candidates_raw` | ✗ | ✗ | ✗ | ✗ |
| `scene_ids` | | | | |
| **Dialog JSON File (Dialog Level Input Fields)** | | | | |
| `mentioned_object_ids` (* defined at a dialog level) | ✗ | ✗ | ✗ | ✗ |
| `dialogue_idx` | | | | |
| `domain` | | | | |
| **Scene JSON Files** | | | | |
| `objects` (index, bbox, ...) | | | | |
| `relationships` | | | | |
| **Prefab Metadata Files** | | | | |
| `url` (raw image) | | | | |
| Non-visual Metadata (`customerReview`, `brand`, `price`, `size`, `materials`) | | | | |
| Visual Metadata (`assetType`, `color`, `pattern`, `sleeveLength`, `type`) | ✗ | ✗ | ✗ | ✗ |
| `prefab_path` | ✗ | ✗ | ✗ | |
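
One way to guard against accidental leakage is to strip the disallowed keys from the current turn before it reaches the model. The sketch below is a hedged illustration, not official tooling: the mapping is transcribed by hand from the current-turn rows of the table above, blank cells are treated as allowed, and the names `DISALLOWED_CURRENT_TURN` and `strip_current_turn` are hypothetical. It does not cover the separate rules for previous turns or for the dialog-level, scene, and metadata fields.

```python
from typing import Any, Dict, Set

# Turn-level keys marked as disallowed for every subtask on the current turn.
_COMMON: Set[str] = {
    "system_transcript",
    "transcript_annotated",
    "disambiguation_label",
    "disambiguation_transcript",
    "disambiguation_candidates_raw",
}

# Subtask number -> keys to drop from the current turn.
# Assumption: a blank cell in the table means the field is allowed.
DISALLOWED_CURRENT_TURN: Dict[int, Set[str]] = {
    1: _COMMON | {"system_transcript_annotated"},
    2: _COMMON | {"system_transcript_annotated"},
    3: _COMMON | {"system_transcript_annotated"},
    4: set(_COMMON),
}


def strip_current_turn(turn: Dict[str, Any], subtask: int) -> Dict[str, Any]:
    """Return a copy of the current turn without its disallowed keys."""
    blocked = DISALLOWED_CURRENT_TURN[subtask]
    return {key: value for key, value in turn.items() if key not in blocked}


# Hypothetical turn with placeholder values, for illustration only.
turn = {
    "turn_idx": 3,
    "transcript": "Do you have that in blue?",
    "transcript_annotated": {},
    "system_transcript": "Yes, here it is.",
    "system_transcript_annotated": {},
    "disambiguation_label": 0,
}

visible = strip_current_turn(turn, subtask=2)
```

Filtering by a denylist keeps any unlisted key, so an allowlist may be the safer choice if the data contains fields that the table does not mention.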

Notes

`transcript_annotated` provides the detailed structural intents, slots and values for each USER turn. `system_transcript_annotated` provides similar information for ASSISTANT turns.
The object field in `transcript_annotated` includes a list of the object IDs referred to in each turn, each marked with a local index as defined for each scene.
You cannot use `prefab_path` at inference time for any task apart from response generation.
For more details, please refer to the full description in the data README document.
