Documentation

  • See the article for notes and references

Steps to generate article

pwd # should print a path equivalent to the one below
# /projects/VaidhyaMegha/vaidhyamegha-knowledge-graphs/docs/open_knowledge_graph_on_clinical_trials

rm -f out.*

docker run --rm -v "$(pwd):/data" -u "$(id -u)" pandocscholar/alpine

xdg-open out.pdf

Open Knowledge Graph on Clinical trials

Specification

A brief specification follows.

  • Inputs
    • MeSH RDF
    • WHO's clinical trials database - ICTRP.
    • US clinical trial registry data from CTTI's AACT database.
  • Outputs
    • Clinical Trials RDF containing the following id types and the relationships between them
      • MeSH, Clinical Trial, PubMed Article, Symptom/Phenotype, Genotype (from Human Genome)
      • Additionally, clinical trial -> clinical trial links across trial registries will be discovered and added.
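
The specification does not say how clinical trial -> clinical trial links across registries are discovered. One possible approach, shown here only as a minimal sketch, is to link two trials whenever they share an identifier (for example, when one registry records the other registry's id as a secondary id). The function name `link_trials`, the input shape, and the placeholder ids are all hypothetical and not taken from the project.

```python
from collections import defaultdict
from itertools import combinations


def link_trials(trials):
    """Return unordered pairs of primary trial ids that share any identifier.

    `trials` maps a primary trial id to the collection of secondary ids
    recorded for that trial (hypothetical input shape, an assumption).
    """
    # Index every identifier (primary or secondary) to the trials that carry it.
    trials_by_identifier = defaultdict(set)
    for primary_id, secondary_ids in trials.items():
        for identifier in set(secondary_ids) | {primary_id}:
            trials_by_identifier[identifier].add(primary_id)

    # Any two trials that appear under the same identifier are linked.
    links = set()
    for primaries in trials_by_identifier.values():
        for first, second in combinations(sorted(primaries), 2):
            links.add((first, second))
    return links


if __name__ == "__main__":
    # Placeholder ids only; these are not real registry entries.
    example = {
        "REG1-001": {"REG2-9"},
        "REG2-9": set(),
        "REG3-5": {"OTHER-7"},
    }
    print(link_trials(example))
```

Each resulting pair could then be written to the RDF output as a trial-to-trial relationship. Matching on shared ids is only one heuristic; other signals such as titles or sponsors are outside this sketch.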

On-demand access : API : retrieve links given one or more of the ids as input

  • Input : one or more of the ids listed above.
  • Output : { input : { id_type : xxx, key : key1, value : value1 }, output : [ { id_type : xxx, key : key1, value : value1 } ] }
  • ACMG
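
The input/output shape above can be illustrated with a minimal sketch. It assumes an in-memory link table in place of the real graph, and it treats `key` as the name of the identifier field, which the specification does not define. The names `make_id`, `retrieve_links`, and `LINKS`, and every id value, are hypothetical placeholders.

```python
import json


def make_id(id_type, key, value):
    """Build one id record in the shape used by the specification."""
    return {"id_type": id_type, "key": key, "value": value}


# Hypothetical link table keyed by (id_type, value); placeholder values only.
# A real implementation would query the knowledge graph instead.
LINKS = {
    ("clinical_trial", "TRIAL-1"): [
        make_id("mesh", "descriptor", "MESH-1"),
        make_id("pubmed_article", "pmid", "PMID-1"),
    ],
}


def retrieve_links(inputs):
    """Return one {input, output} object per input id record."""
    results = []
    for record in inputs:
        linked = LINKS.get((record["id_type"], record["value"]), [])
        results.append({"input": record, "output": list(linked)})
    return results


if __name__ == "__main__":
    query = [make_id("clinical_trial", "trial_id", "TRIAL-1")]
    print(json.dumps(retrieve_links(query), indent=2))
```

An id with no known links yields an empty `output` list rather than an error, so callers can pass several ids in a single request and inspect each result independently.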

Notes

  • Human symptoms–disease network

  • HSDN supplementary data files "Combined-Input.tsv", "Symptom-Occurence-Input.tsv", and "Disease-Occurence-Input.tsv" were taken as input, and new output files that also include the MeSH IDs were created.

  • Analysing bipartite symptoms to diseases network : Instead of the supplementary data files from HSDN, the files retrieved from LeoBman/HSDN (see above) were used. MeSH diseases were then mapped to the Disease Ontology diseases used in Hetionet v1.0. Ultimately, no data from HSDN was used in Hetionet; re-extracted symptom–disease relationships from MRCOC (MEDLINE topic co-occurrence) were used instead.

  • Comparison of hetio/medline to MRCOC

    MEDLINE produces co-occurrence files under the codename MRCOC. More information is available in the 2016 report Building an Updated MEDLINE Co-Occurrences (MRCOC) File. These files might be a viable alternative to the analyses in this repository for certain applications. However, they don't appear to contain topics for supplemental concept records (for example MeSH term C000591739). Feel free to open an issue with additional insights on or comparisons to MRCOC.
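
The HSDN note above describes producing output files that carry MeSH IDs alongside the original term columns. A minimal sketch of that step follows, assuming a tab-separated input with a header row and a term-to-MeSH-ID lookup already built elsewhere. The column names, the lookup contents, and the function name `add_mesh_ids` are hypothetical and do not reflect the actual file layout.

```python
import csv
import io


def add_mesh_ids(tsv_text, term_to_mesh, term_columns):
    """Append a '<column> MeSH ID' column for each listed term column.

    Terms missing from the lookup get an empty MeSH ID.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out_fields = list(reader.fieldnames or [])
    out_fields += [column + " MeSH ID" for column in term_columns]

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=out_fields, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        for column in term_columns:
            row[column + " MeSH ID"] = term_to_mesh.get(row[column], "")
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder rows and ids; not taken from the HSDN files.
    sample = "Symptom\tDisease\tCount\nTerm A\tTerm B\t3\n"
    lookup = {"Term A": "ID-A", "Term B": "ID-B"}
    print(add_mesh_ids(sample, lookup, ["Symptom", "Disease"]))
```

Leaving unmatched terms blank, rather than dropping the row, keeps the output aligned with the input so that missing mappings are easy to audit.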

References

  • See the References section in Article.md. Additional references are below.

Knowledge graph

MeSH

  • Tree view
  • Record view along with 'MeSH Tree Structures'

PheGenI

  • What is PheGenI
  • PheGenI: The Phenotype-Genotype Integrator demo
  • Downstream analysis of PheGenI results demo

Apache Jena

GraphQL

Java

Linux commands

  • Check if string exists in file in bash
  • Prevent grep from exiting when match is not found

PostgreSQL

  • PostgreSQL array columns
    • JDBC insert into array columns
  • PostgreSQL date column with default value
  • PostgreSQL upsert statement
  • PostgreSQL - pg_restore - restore only one selected schema
  • PostgreSQL - Array functions
  • Execute query on PostgreSQL using psql non-interactively
  • Save psql inline query output to a file
  • In PostgreSQL formulate a query to get all items in an array column

Postman

  • Tutorial for using GraphQL with Postman

Neo4j

Possibly older threads

  • GraphQL queries from Postman to Neo4j.
  • Discussion on SPARQL for Neo4j.
    • Suggests it's feasible to execute a SPARQL 'get' query from Jena to Neo4j using the n10s plugin.

Superset

Integration to data sources

  • GraphQL's integration into Apache Superset.
    • Subsequent fork with commits from the above commenter - graphadvantage - to address this need.

Superset's API

  • REST API of Superset with a comment on GraphQL.