Add PubMedQA dataset #740
905889b to 4c61b10
Great work @kurbanrita 🎉 Here are some initial comments.
Some more comments 🤗
Very nice work @kurbanrita! Left a couple of nitpicks; after those, I think this should be ready to go.
tests/eva/language/data/datasets/classification/test_pubmedqa.py
Great start @kurbanrita 🎉 Let's address @nkaenzig's comments and merge :D
Add PubMedQA dataset
PubMedQA is a biomedical question-answering dataset collected from PubMed abstracts. The task is to answer research questions with yes, no, or maybe. The subset used for validation contains 1k expert-annotated QA instances.
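To illustrate the yes/no/maybe task described above, here is a minimal sketch of how one record could be turned into a classification sample. It is not the code from this PR: the field names (`question`, `context`, `final_decision`), the label order, and the helper `to_classification_sample` are all assumptions made for illustration.

```python
from typing import Dict, Tuple

# Assumed label order; the class indices actually used in this PR are not shown here.
LABEL_TO_INDEX: Dict[str, int] = {"no": 0, "yes": 1, "maybe": 2}


def to_classification_sample(record: Dict[str, str]) -> Tuple[str, int]:
    """Convert one PubMedQA-style record into (input_text, class_index).

    The record layout is hypothetical: a plain dict with string fields
    "question", "context" and "final_decision".
    """
    decision = record["final_decision"].strip().lower()
    if decision not in LABEL_TO_INDEX:
        raise ValueError(f"Unexpected answer label: {decision!r}")
    # Concatenate question and abstract text into a single model input.
    text = f"Question: {record['question']}\nContext: {record['context']}"
    return text, LABEL_TO_INDEX[decision]


example = {
    "question": "Does treatment X reduce symptom Y?",
    "context": "Abstract text goes here.",
    "final_decision": "Yes",
}
text, target = to_classification_sample(example)
```

Normalizing the label with `strip().lower()` and rejecting unknown values keeps a malformed annotation from silently mapping to a wrong class.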
datasets as a new language dependency)
Remaining questions:
core?