README
pvrijen committed Oct 12, 2023
1 parent 8ae5e8b commit 692e822
Showing 2 changed files with 8 additions and 5 deletions.
10 changes: 6 additions & 4 deletions README.md
@@ -3,7 +3,9 @@ The project aims at using state-of-the-art machine learning methods, and in part

## Data
### Raw
-- cs_45.json; cell line terms, extracted from Cellosaurus; 145673 terms
-- cs_pos_pmid_set.tsv; curated positive samples, extracted from Cellosaurus, 22719 PMIDs
-- gs_neg_pmid.tsv; curated negative samples, extracted from Google Scholar, 475 PMIDs
-- ls_neg_pmid.tsv; uncurated negative samples, extracted from LitSuggest, 645 PMIDs
+- cs_term_45.0; cell line terms, extracted from Cellosaurus, 145673 terms
+#### PMID
+- Cellosaurus; positive samples, extracted from Cellosaurus, 22719 PMIDs
+- CellosaurusAB; positive samples, extracted from Cellosaurus, curated, high portion of seminal papers, 10.000 PMIDs
+- GoogleScholar; negative samples, extracted from rejected Google Scholar results, curated, 509 PMIDs
+- LitSuggest; negative samples, extract from rejected LitSuggest results, 645 PMIDs
3 changes: 2 additions & 1 deletion notebook/data_proc_pmid.ipynb
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "code",
-"execution_count": 130,
+"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
@@ -17,6 +17,7 @@
"import random\n",
"from munch import Munch\n",
"import numpy as np\n",
+"from tab\n",
"\n",
"with initialize(\n",
" version_base=None,\n",