Create dataset loader for CHEBI (Chapati) #113
Comments
#self-assign |
Hi @napsternxg, can you let us know if you are still working on this so we can update our project board? Please let us know the status by Friday, April 8; no worries if you are not finished but still intend to work on this. You can either ping me here at @hakunanatasha or ping the Discord admins (with @admins). |
Hi @hakunanatasha, yes, I plan to work on this over the weekend. |
I have started working on this dataset. I will send a PR soon. |
Hi @hakunanatasha and @leonweber, I have a few questions on how to parse the data. The code related to my questions is in: https://colab.research.google.com/drive/1Ne8A76yn0vxwKkpU7l_OzGI968B-YieJ?usp=sharing
filepath = "./scrapbook/WO2007000651/source.xml"
reader = biocxml.BioCXMLDocumentReader(str(filepath))
I get the error:
|
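When a format-specific reader rejects a file like the `source.xml` above, one way to narrow things down is to check what the XML actually contains before assuming it is BioC. The following is a minimal sketch using only the Python standard library. The helper names `summarize_xml` and `looks_like_bioc` are hypothetical, the inline sample documents are made up for illustration, and the check is a rough heuristic (BioC XML uses a `<collection>` root containing `<document>` and `<passage>` elements). It does not reproduce the error from this thread, and nothing here confirms what format the Chapati files use.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from io import BytesIO


def summarize_xml(source):
    """Return (root_tag, Counter of element tags) for an XML path or file-like object."""
    root = ET.parse(source).getroot()
    # root.iter() walks the root element and all of its descendants
    tag_counts = Counter(elem.tag for elem in root.iter())
    return root.tag, tag_counts


def looks_like_bioc(root_tag, tag_counts):
    """Heuristic only: a <collection> root with at least one <document> and <passage>."""
    return (
        root_tag == "collection"
        and tag_counts["document"] > 0
        and tag_counts["passage"] > 0
    )


# Made-up sample inputs (assumptions for illustration, not taken from the dataset)
bioc_like = BytesIO(
    b"<collection><source>demo</source>"
    b"<document><id>1</id><passage><offset>0</offset><text>abc</text></passage></document>"
    b"</collection>"
)
other_xml = BytesIO(b"<patent><claims><claim>abc</claim></claims></patent>")

for name, sample in [("bioc_like", bioc_like), ("other_xml", other_xml)]:
    root_tag, counts = summarize_xml(sample)
    print(name, root_tag, dict(counts), looks_like_bioc(root_tag, counts))
```

To try it on a downloaded file, pass the path instead of a `BytesIO` object, for example `summarize_xml("./scrapbook/WO2007000651/source.xml")`. If the root tag is not `collection`, a custom parser is probably needed rather than a BioC reader.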
Hi @napsternxg
|
Hi @jason-fries, thanks for the response. I plan to submit it early next week. |
I downloaded the files from CVS and am uploading them here for use. We can later move them to HF datasets and update the URL in the code. |
Added PR: #525 |
Task: NER
License: Creative Commons
Format: custom
Language: English
Citation: ???
Referenced and used by: Habibi, Maryam, et al. "Deep learning with word embeddings improves biomedical named entity recognition." Bioinformatics.
Source: http://chebi.cvs.sourceforge.net/viewvc/chebi/chapati/