MovieCORE

MovieCORE is a video question answering (VQA) dataset designed to probe deeper cognitive understanding of movie content.

For more details, please refer to our paper.

Data Preparation

Please download the videos from MovieChat's Hugging Face repositories (Training Data and Test Data). Extract them in whatever layout fits your model, and use them together with our annotations.

Run some baselines

Coming soon

Evaluation Dimensions

The evaluation is performed across the following dimensions:

Accuracy: Measures the semantic similarity between the predicted answer and the ground truth.
Comprehensiveness: Assesses whether the predicted answer covers all key aspects mentioned in the ground truth.
Depth: Evaluates the level of reasoning and insight demonstrated in the predicted answer.
Evidence: Checks the quality and relevance of evidence provided in the predicted answer.
Coherence: Measures the logical flow, organization, and clarity of the predicted answer.

Usage

To evaluate your model's predictions on the MovieCORE dataset, use the evaluate_moviecore.py script. The script processes the predictions file, evaluates each QA pair across the dimensions listed above, and calculates overall and classification-specific scores.
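The aggregation step described above can be sketched roughly as follows. This is a minimal illustration, not the code in evaluate_moviecore.py: the function name `aggregate_scores`, the `scores` field, and the lowercase dimension keys are assumptions made for this example, and the README does not state the scoring scale, so the numbers in the usage note are arbitrary.

```python
from collections import defaultdict
from statistics import mean

# The five evaluation dimensions listed in this README (key names are assumed).
DIMENSIONS = ["accuracy", "comprehensiveness", "depth", "evidence", "coherence"]


def aggregate_scores(scored_pairs):
    """Average per-dimension scores overall and per classification.

    `scored_pairs` is a hypothetical list of dicts, each holding a
    "classification" string and a "scores" dict with one number per dimension.
    """
    overall = defaultdict(list)
    by_class = defaultdict(lambda: defaultdict(list))

    for pair in scored_pairs:
        for dim in DIMENSIONS:
            value = pair["scores"][dim]
            overall[dim].append(value)
            by_class[pair["classification"]][dim].append(value)

    return {
        "overall": {dim: mean(vals) for dim, vals in overall.items()},
        "by_classification": {
            cls: {dim: mean(vals) for dim, vals in dims.items()}
            for cls, dims in by_class.items()
        },
    }
```

For example, calling `aggregate_scores` on two pairs of one classification scored 4 and 2 on every dimension would report a mean of 3 per dimension for that classification.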

Running the Evaluation

export OPENAI_API_KEY='sk******'
python evaluate_moviecore.py --pred_path path/to/your/predictions.json

Input Format

{
    "video_1.mp4": [
        {
            "question": "How does the video depict the unique adaptations of the species in the Sahara Desert, and what roles do these species play in their ecosystem?",
            "answer": "The GT answer.",
            "pred": "Your pred.",
            "classification": "the classification"
        },
        {
            "question": "The second question of video 1?",
            "answer": "The GT answer.",
            "pred": "Your pred.",
            "classification": "the classification"
        }
    ],
    "video_2.mp4": [
        {
            "question": "The only question of video 2",
            "answer": "The GT answer.",
            "pred": "Your pred.",
            "classification": "the classification"
        }
    ]
}
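Before running the evaluation, it may help to check that a predictions file matches the structure shown above. The sketch below is a hypothetical helper (`validate_predictions` is not part of the repository) that only checks the four keys visible in the example; the actual script may expect more or behave differently on malformed input.

```python
import json

# Keys that appear in every QA entry of the example input above.
REQUIRED_KEYS = {"question", "answer", "pred", "classification"}


def validate_predictions(data):
    """Return a list of problems; an empty list means the structure matches."""
    if not isinstance(data, dict):
        return ["top level must be an object mapping video names to lists"]

    problems = []
    for video, entries in data.items():
        if not isinstance(entries, list):
            problems.append(f"{video}: value must be a list of QA entries")
            continue
        for i, entry in enumerate(entries):
            if not isinstance(entry, dict):
                problems.append(f"{video}[{i}]: entry must be an object")
                continue
            missing = REQUIRED_KEYS - entry.keys()
            if missing:
                problems.append(f"{video}[{i}]: missing keys {sorted(missing)}")
    return problems


def validate_file(path):
    """Load a predictions JSON file and validate its structure."""
    with open(path, encoding="utf-8") as f:
        return validate_predictions(json.load(f))
```

Calling `validate_file("path/to/your/predictions.json")` returns an empty list when every entry carries all four keys.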

License

This dataset is provided under the MIT License.
