Awesome Computational Biology

A knowledge collection of databases, software and papers related to computational biology.

Computational biology involves the development and application of data-analytical and theoretical methods, mathematical modelling and computational simulation techniques to the study of biological, ecological, behavioural, and social systems. - Wikipedia

Databases

scRNA

Compound

Pathway

Mass Spectra

  • MassBank - Open source databases and tools for mass spectrometry reference spectra.
  • MoNA (MassBank of North America) - Meta-database of metabolite mass spectra, metadata, and associated compounds.

Protein

Genome

Disease

  • KEGG DRUG - Comprehensive drug information resource for approved drugs.
  • DrugBank - A database of drugs and drug targets maintained by the University of Alberta.

Interaction

  • Drug Gene Interaction
    • DGIdb - A database of drug-gene interactions and the druggable genome.
    • Comparative Toxicogenomics Database - A database of chemical-gene interactions, chemical-disease associations, gene-disease associations, and chemical-phenotype associations.
    • SNAP - A dataset of drug-gene interactions.
    • Therapeutics Data Commons - A collection of datasets for many tasks, such as drug-target interaction, drug response, and drug-drug interaction.
  • Drug (-Cell line) Response
  • Chemical Protein Interaction
    • STITCH - A database of chemical-protein interactions.
    • BindingDB - A database of compounds and targets.
    • PDBBind - Database of experimentally measured binding affinity data for biomolecular complexes.
    • CrossDocked2020 - Large-scale dataset for machine learning in structure-based virtual screening.
  • Protein-Protein Interaction
    • STRING - Protein-Protein Interaction Networks for several organisms.
    • BioGRID - Database of Protein, Genetic and Chemical Interactions.
    • HIPPIE - Human Protein-Protein Interaction database.
  • Knowledge Graph

Clinical Trial

API

Preprocess

  • Chemistry Development Kit - Software for cheminformatics and machine learning.
  • RDKit - Software for cheminformatics and machine learning.
  • Scanpy - scRNA-seq analysis library in Python.
  • Seurat - scRNA-seq analysis library in R.

Machine Learning Tasks and Models

Drug Response Prediction

  • drGAT - A model for drug response prediction that uses an attention mechanism for gene-level explainability.
  • MOFGCN - GCN + heterogeneous network.
  • DeepDSC - Autoencoder + fully connected NN.
  • DGDRP - Multi-view embedding NN.
  • DeepAEG - GNN embedding + attention.

Drug Repurposing

Drug Target Interaction

  • NeoDTI - A library for drug-target interaction prediction.

Compound Protein Interaction

  • MCPINN - A library for drug discovery using compound-protein interaction and machine learning.
  • TransformerCPI - A library for compound-protein interaction prediction using a Transformer.

Pre-trained embedding

LLM for biology

  • AI4Chem/ChemLLM-7B-Chat - LLM for chemistry and molecular science.
  • BioGPT - LLM for biomedical text generation.
  • GeneGPT - LLM for biomedical information that uses several APIs.
  • GenePT - Foundation LLM for single-cell data.
  • scPRINT - Pretrained on 50M cells to denoise and perform zero imputation of any single-cell RNA-seq profile.