Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test setup for a TRAPI #50

Merged
merged 7 commits into from
Aug 15, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 36 additions & 0 deletions notebooks/clinical_ontologies.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
# This file is a list of ontologies in Bioportal
# with relevance to clinical research use cases.
# They are named based on their Bioportal IDs.
# Those with the test_set=true flag are part of
# a test set for setting up a demo TRAPI.
---

ontologies:
- name: "MEDDRA"
is_umls: true
- name: "SNOMEDCT"
is_umls: true
- name: "RXNORM"
is_umls: true
- name: "NDDF"
is_umls: true
- name: "RCD"
is_umls: true
- name: "LOINC"
is_umls: true
- name: "NCIT"
is_umls: true
- name: "MESH"
is_umls: true
- name: "ICD10CM"
is_umls: true
- name: "VANDF"
is_umls: true
- name: "RADLEX"
is_umls: true
test_set: true
- name: "HCPCS"
is_umls: true
- name: "HL7"
- name: "ATC"
test_set: true
309 changes: 309 additions & 0 deletions notebooks/trapi_setup.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,309 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "bf54bcd3-511a-4771-a763-5177d0ba34fd",
"metadata": {},
"source": [
"# Setting up a test subgraph and TRAPI endpoint for KG-Bioportal"
]
},
{
"cell_type": "markdown",
"id": "1a6b4ea7",
"metadata": {},
"source": [
"## Prepare subgraph for ontologies relevant to clinical data"
]
},
{
"cell_type": "markdown",
"id": "60e563ec",
"metadata": {},
"source": [
"Load the set of ontologies to work with. This is defined in `clinical_ontologies.yaml`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "38879a98",
"metadata": {},
"outputs": [],
"source": [
"import yaml"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "44ab5b49",
"metadata": {},
"outputs": [],
"source": [
"test_ontologies = []\n",
"with open ('clinical_ontologies.yaml', 'r') as infile:\n",
" ontologies_dict = yaml.safe_load(infile)\n",
"\n",
"for ontology in ontologies_dict['ontologies']:\n",
" try:\n",
" if ontology['test_set']:\n",
" test_ontologies.append(ontology['name'])\n",
" except KeyError:\n",
" continue\n",
"\n",
"test_ontologies"
]
},
{
"cell_type": "markdown",
"id": "9f9cd2f4",
"metadata": {},
"source": [
"Now build a graph of these alone. This assumes that the transformed Bioportal graphs are in `../transformed/ontologies/` by default, "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dcde626f",
"metadata": {},
"outputs": [],
"source": [
"test_ontologies_str = \",\".join(test_ontologies)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "916ee4c6",
"metadata": {},
"outputs": [],
"source": [
"%cd ../\n",
"!python run.py catmerge --include_only {test_ontologies_str}"
]
},
{
"cell_type": "markdown",
"id": "f9d4e996",
"metadata": {},
"source": [
"See how the result looks."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "29fb331c",
"metadata": {},
"outputs": [],
"source": [
"!tar -xvzf data/merged/merged-kg.tar.gz"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "79ec57c7",
"metadata": {},
"outputs": [],
"source": [
"!head merged-kg_edges.tsv"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8871d4d4",
"metadata": {},
"outputs": [],
"source": [
"!head merged-kg_nodes.tsv"
]
},
{
"cell_type": "markdown",
"id": "76af035b",
"metadata": {},
"source": [
"These didn't really get CURIE-d properly (an issue with the transform not using the expected prefix set) but let's go ahead anyway."
]
},
{
"cell_type": "markdown",
"id": "c3679824",
"metadata": {},
"source": [
"## Prepare as Neo4j"
]
},
{
"cell_type": "markdown",
"id": "0b920348",
"metadata": {},
"source": [
"Start a local Neo4j instance first."
]
},
{
"cell_type": "markdown",
"id": "e1995a23",
"metadata": {},
"source": [
"Install may look like the directions here:\n",
"https://neo4j.com/docs/operations-manual/current/installation/linux/debian/#debian-installation\n",
"\n",
"On my system, I had to enable systemctl first:\n",
"https://askubuntu.com/questions/1379425/system-has-not-been-booted-with-systemd-as-init-system-pid-1-cant-operate\n",
"\n",
"On the first run, there may be a message like this:\n",
"```bash\n",
"$ systemctl start neo4j.service\n",
"Failed to start neo4j.service: Interactive authentication required.\n",
"See system logs and 'systemctl status neo4j.service' for details.\n",
"```\n",
"\n",
"Just run:\n",
"```bash\n",
"$ sudo systemctl enable neo4j.service\n",
"$ sudo systemctl start neo4j.service\n",
"$ sudo systemctl status neo4j.service\n",
"```\n",
"\n",
"and it should be `active (running)`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "46676351",
"metadata": {},
"outputs": [],
"source": [
"from kgx.transformer import Transformer\n",
"\n",
"input_args = {\n",
" 'filename': [\"../merged-kg_edges.tsv\",\n",
" \"../merged-kg_nodes.tsv\"],\n",
" 'format': 'tsv'\n",
"}\n",
"output_args = {\n",
" 'uri': 'neo4j://localhost:7474',\n",
" 'username': 'neo4j',\n",
" 'password': 'demo',\n",
" 'format': 'neo4j'\n",
"}\n",
"\n",
"t = Transformer()\n",
"t.transform(input_args, output_args)\n"
]
},
{
"cell_type": "markdown",
"id": "f7f789b5",
"metadata": {},
"source": [
"## Set up and run Plater on its own"
]
},
{
"cell_type": "markdown",
"id": "98322a31",
"metadata": {},
"source": [
"See https://github.com/TranslatorSRI/Plater"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2159f6c5",
"metadata": {},
"outputs": [],
"source": [
"!git clone https://github.com/TranslatorSRI/Plater"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d93c1ecf",
"metadata": {},
"outputs": [],
"source": [
"!python -m pip install -r Plater/PLATER/requirements.txt"
]
},
{
"cell_type": "markdown",
"id": "db79d9fa",
"metadata": {},
"source": [
"Need a config file available as env variables.\n",
"Should look something like this:\n",
"```bash\n",
" WEB_HOST=0.0.0.0\n",
" WEB_PORT=8080\n",
" NEO4J_HOST=neo4j\n",
" NEO4J_USERNAME=neo4j\n",
" NEO4J_PASSWORD=demo\n",
" NEO4J_HTTP_PORT=7474\n",
" PLATER_TITLE='Plater'\n",
" PLATER_VERSION='1.2.0-7'\n",
" BL_VERSION='1.6.1'\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dbfdbab9",
"metadata": {},
"outputs": [],
"source": [
"!source Plater/.env"
]
},
{
"cell_type": "markdown",
"id": "1963bafd",
"metadata": {},
"source": [
"Running the FastAPI app can be erratic in a notebook, but run this:\n",
"```bash\n",
"$ uvicorn PLATER.services.server:APP --host 0.0.0.0 --port 8080 --reload\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "9e2a8cdc",
"metadata": {},
"source": [
"## Set up and run Automat"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Loading