bug fix in data #6

Open
wants to merge 4 commits into
base: main
from

Conversation


@xingdi-eric-yuan xingdi-eric-yuan commented Apr 12, 2023

There are some bugs in how the options were split. I only looked at ruin_names, but there may be similar problems in other tasks.

They are easy to detect: look for data points where 1) the "target" is not an option label, or 2) the number of options differs from BIG-bench.
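The two checks above could be sketched roughly as follows. This is only an illustration, not the script used for this PR: `parse_options`, `find_problems`, and the `expected_num_options` parameter are hypothetical names, and the reference option count is assumed to come from a separate lookup against the BIG-bench `task.json`.

```python
import re

# Matches option lines such as "(A) rita" in a BBH-style prompt.
OPTION_RE = re.compile(r"^\(([A-Z])\) (.*)$", re.MULTILINE)


def parse_options(prompt):
    """Return {label: text} for every "(X) text" line after "Options:"."""
    _, _, tail = prompt.partition("Options:\n")
    return {f"({label})": text for label, text in OPTION_RE.findall(tail)}


def find_problems(example, expected_num_options=None):
    """List the reasons an example looks mis-split (empty list if none).

    expected_num_options is a hypothetical argument: the number of options
    the same question has in BIG-bench, looked up by the caller.
    """
    options = parse_options(example["input"])
    problems = []
    # Check 1: the target should be a label such as "(B)", not raw text.
    if example["target"] not in options:
        problems.append("target is not an option label")
    # Check 2: the option count should match the reference, when known.
    if expected_num_options is not None and len(options) != expected_num_options:
        problems.append("option count differs from reference")
    return problems
```

Check 1 needs only the BBH file, so it can be run over every task first; check 2 also catches records whose target is still a valid label but whose options were split.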

The buggy data points are:

  1. bbh:

    {"input": "Which of the following is a humorous edit of this artist or movie name: 'rita, sue and bob too'?\nOptions:\n(A) rita\n(B) sue and bob too\n(C) rita\n(D) sue and bob poo\n(E) rita\n(F) sue and box too\n(G) rita,y sue and bob too", "target": "rita, sue and bob poo"}

    bb:
    https://github.com/google/BIG-bench/blob/main/bigbench/benchmark_tasks/ruin_names/task.json#L1437

  2. bbh:

    {"input": "Which of the following is a humorous edit of this artist or movie name: 'earth, wind, & fire'?\nOptions:\n(A) eareth\n(B) wind\n(C) & fire\n(D) earth\n(E) bind\n(F) & fire\n(G) earthm wind\n(H) & fire\n(I) dearth\n(J) wind\n(K) & fire", "target": "dearth, wind, & fire"}

    bb:
    https://github.com/google/BIG-bench/blob/main/bigbench/benchmark_tasks/ruin_names/task.json#L1743

The diff is hard to read on GitHub, so you can use this diff checker instead:
https://www.diffchecker.com/xMl0kTzr/

@xingdi-eric-yuan
Author

xingdi-eric-yuan commented Apr 12, 2023

The same problem shows up in movie_recommendation:

  1. bbh:

    {"input": "Find a movie similar to Terminator 2 Judgment Day, The Fugitive, The Shawshank Redemption, Dead Man Walking:\nOptions:\n(A) Walk\n(B) Don't Run\n(C) Shaun the Sheep Movie\n(D) Rocky IV\n(E) Braveheart", "target": "(E)"}

    bb:
    https://github.com/google/BIG-bench/blob/main/bigbench/benchmark_tasks/movie_recommendation/task.json#L1423

  2. bbh:

    {"input": "Find a movie similar to Minority Report, Shrek, Catch Me If You Can, Aladdin:\nOptions:\n(A) Monsters\n(B) Inc\n(C) Children of the Night\n(D) The Incredible Shrinking Man\n(E) Town & Country", "target": "Monsters, Inc"}

    bb:
    https://github.com/google/BIG-bench/blob/main/bigbench/benchmark_tasks/movie_recommendation/task.json#L2593

and a dozen or more others.

@xingdi-eric-yuan
Author

From these few examples, maybe the bug happens when there's a comma in the option?
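A small sketch of that hypothesis, for illustration only. It assumes the options were at some point joined into one string and split again on `", "`; the actual conversion code is not shown in this thread, so `naive_split` is a hypothetical stand-in, and the option list is reconstructed from the movie_recommendation record above.

```python
def naive_split(joined):
    """Hypothetical buggy step: split a ", "-joined option string."""
    return joined.split(", ")


# Options reconstructed from the BBH record above (assumed original form).
options = [
    "Monsters, Inc",
    "Children of the Night",
    "The Incredible Shrinking Man",
    "Town & Country",
]

pieces = naive_split(", ".join(options))

# The option containing ", " is cut in two, so four options become five.
for index, piece in enumerate(pieces):
    print(f"({chr(ord('A') + index)}) {piece}")
```

Under this assumption the output matches the five options in the broken record, and it would also explain why "rita,y sue and bob too" survived intact: its comma is not followed by a space.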

@xingdi-eric-yuan changed the title from "bug fix in ruin_names data" to "bug fix in data" on Apr 12, 2023
@Brian-Ckwu

I stumbled across these bugs, too. Thanks, @xingdi-eric-yuan, for addressing them first!

@RylanSchaeffer

@suzgunmirac, can this pull request be merged, please?
