diff --git a/notebooks/clinical_ontologies.yaml b/notebooks/clinical_ontologies.yaml new file mode 100644 index 0000000..20c563e --- /dev/null +++ b/notebooks/clinical_ontologies.yaml @@ -0,0 +1,36 @@ +# This file is a list of ontologies in Bioportal +# with relevance to clinical research use cases. +# They are named based on their Bioportal IDs. +# Those with the test_set=true flag are part of +# a test set for setting up a demo TRAPI. +--- + +ontologies: + - name: "MEDDRA" + is_umls: true + - name: "SNOMEDCT" + is_umls: true + - name: "RXNORM" + is_umls: true + - name: "NDDF" + is_umls: true + - name: "RCD" + is_umls: true + - name: "LOINC" + is_umls: true + - name: "NCIT" + is_umls: true + - name: "MESH" + is_umls: true + - name: "ICD10CM" + is_umls: true + - name: "VANDF" + is_umls: true + - name: "RADLEX" + is_umls: true + test_set: true + - name: "HCPCS" + is_umls: true + - name: "HL7" + - name: "ATC" + test_set: true diff --git a/notebooks/trapi_setup.ipynb b/notebooks/trapi_setup.ipynb new file mode 100644 index 0000000..2521c25 --- /dev/null +++ b/notebooks/trapi_setup.ipynb @@ -0,0 +1,309 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "bf54bcd3-511a-4771-a763-5177d0ba34fd", + "metadata": {}, + "source": [ + "# Setting up a test subgraph and TRAPI endpoint for KG-Bioportal" + ] + }, + { + "cell_type": "markdown", + "id": "1a6b4ea7", + "metadata": {}, + "source": [ + "## Prepare subgraph for ontologies relevant to clinical data" + ] + }, + { + "cell_type": "markdown", + "id": "60e563ec", + "metadata": {}, + "source": [ + "Load the set of ontologies to work with. This is defined in `clinical_ontologies.yaml`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "38879a98", + "metadata": {}, + "outputs": [], + "source": [ + "import yaml" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "44ab5b49", + "metadata": {}, + "outputs": [], + "source": [ + "test_ontologies = []\n", + "with open ('clinical_ontologies.yaml', 'r') as infile:\n", + " ontologies_dict = yaml.safe_load(infile)\n", + "\n", + "for ontology in ontologies_dict['ontologies']:\n", + " try:\n", + " if ontology['test_set']:\n", + " test_ontologies.append(ontology['name'])\n", + " except KeyError:\n", + " continue\n", + "\n", + "test_ontologies" + ] + }, + { + "cell_type": "markdown", + "id": "9f9cd2f4", + "metadata": {}, + "source": [ + "Now build a graph of these alone. This assumes that the transformed Bioportal graphs are in `../transformed/ontologies/` by default, " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dcde626f", + "metadata": {}, + "outputs": [], + "source": [ + "test_ontologies_str = \",\".join(test_ontologies)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "916ee4c6", + "metadata": {}, + "outputs": [], + "source": [ + "%cd ../\n", + "!python run.py catmerge --include_only {test_ontologies_str}" + ] + }, + { + "cell_type": "markdown", + "id": "f9d4e996", + "metadata": {}, + "source": [ + "See how the result looks." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "29fb331c", + "metadata": {}, + "outputs": [], + "source": [ + "!tar -xvzf data/merged/merged-kg.tar.gz" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "79ec57c7", + "metadata": {}, + "outputs": [], + "source": [ + "!head merged-kg_edges.tsv" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8871d4d4", + "metadata": {}, + "outputs": [], + "source": [ + "!head merged-kg_nodes.tsv" + ] + }, + { + "cell_type": "markdown", + "id": "76af035b", + "metadata": {}, + "source": [ + "These didn't really get CURIE-d properly (an issue with the transform not using the expected prefix set) but let's go ahead anyway." + ] + }, + { + "cell_type": "markdown", + "id": "c3679824", + "metadata": {}, + "source": [ + "## Prepare as Neo4j" + ] + }, + { + "cell_type": "markdown", + "id": "0b920348", + "metadata": {}, + "source": [ + "Start a local Neo4j instance first." + ] + }, + { + "cell_type": "markdown", + "id": "e1995a23", + "metadata": {}, + "source": [ + "Install may look like the directions here:\n", + "https://neo4j.com/docs/operations-manual/current/installation/linux/debian/#debian-installation\n", + "\n", + "On my system, I had to enable systemctl first:\n", + "https://askubuntu.com/questions/1379425/system-has-not-been-booted-with-systemd-as-init-system-pid-1-cant-operate\n", + "\n", + "On the first run, there may be a message like this:\n", + "```bash\n", + "$ systemctl start neo4j.service\n", + "Failed to start neo4j.service: Interactive authentication required.\n", + "See system logs and 'systemctl status neo4j.service' for details.\n", + "```\n", + "\n", + "Just run:\n", + "```bash\n", + "$ sudo systemctl enable neo4j.service\n", + "$ sudo systemctl start neo4j.service\n", + "$ sudo systemctl status neo4j.service\n", + "```\n", + "\n", + "and it should be `active (running)`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "46676351", + "metadata": {}, + "outputs": [], + "source": [ + "from kgx.transformer import Transformer\n", + "\n", + "input_args = {\n", + " 'filename': [\"../merged-kg_edges.tsv\",\n", + " \"../merged-kg_nodes.tsv\"],\n", + " 'format': 'tsv'\n", + "}\n", + "output_args = {\n", + " 'uri': 'neo4j://localhost:7474',\n", + " 'username': 'neo4j',\n", + " 'password': 'demo',\n", + " 'format': 'neo4j'\n", + "}\n", + "\n", + "t = Transformer()\n", + "t.transform(input_args, output_args)\n" + ] + }, + { + "cell_type": "markdown", + "id": "f7f789b5", + "metadata": {}, + "source": [ + "## Set up and run Plater on its own" + ] + }, + { + "cell_type": "markdown", + "id": "98322a31", + "metadata": {}, + "source": [ + "See https://github.com/TranslatorSRI/Plater" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2159f6c5", + "metadata": {}, + "outputs": [], + "source": [ + "!git clone https://github.com/TranslatorSRI/Plater" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d93c1ecf", + "metadata": {}, + "outputs": [], + "source": [ + "!python -m pip install -r Plater/PLATER/requirements.txt" + ] + }, + { + "cell_type": "markdown", + "id": "db79d9fa", + "metadata": {}, + "source": [ + "Need a config file available as env variables.\n", + "Should look something like this:\n", + "```bash\n", + " WEB_HOST=0.0.0.0\n", + " WEB_PORT=8080\n", + " NEO4J_HOST=neo4j\n", + " NEO4J_USERNAME=neo4j\n", + " NEO4J_PASSWORD=demo\n", + " NEO4J_HTTP_PORT=7474\n", + " PLATER_TITLE='Plater'\n", + " PLATER_VERSION='1.2.0-7'\n", + " BL_VERSION='1.6.1'\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dbfdbab9", + "metadata": {}, + "outputs": [], + "source": [ + "!source Plater/.env" + ] + }, + { + "cell_type": "markdown", + "id": "1963bafd", + "metadata": {}, + "source": [ + "Running the FastAPI app can be erratic in a notebook, but run this:\n", + "```bash\n", + "$ uvicorn PLATER.services.server:APP --host 0.0.0.0 --port 8080 --reload\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "9e2a8cdc", + "metadata": {}, + "source": [ + "## Set up and run Automat" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}