CPINSim - Constrained Protein Interaction Networks Simulator


CPINSim is a package for the simulation of constrained protein interaction networks. Besides simulation of complex formation in a cell, it also provides methods for data preprocessing: annotation of interactions and constraints with domains and a parser to provide the needed protein input format.

Features

  • Annotate interactions and constraints with domains: Infer domains from known ones where possible, set unique artificial domains otherwise.
  • Parse the interaction and constraints files into a protein-wise text representation as input for the simulation.
  • Simulate the complex formation in a cell for the given input proteins with regard to the interaction dependencies which are encoded as constraints. Further, the simulation of perturbation effects like knockout or overexpression of one or multiple proteins is possible.

System requirements

Installation

We recommend the installation using conda:

$ conda create -n cpinsim -c bioconda cpinsim
$ source activate cpinsim

# You now have a 'cpinsim' script; try it:
$ cpinsim --help

# To switch back to your normal environment, use
$ source deactivate

Alternatively, you can download the source code from GitHub and install it using the setup script:

$ git clone http://github.com/BiancaStoecker/cpinsim.git cpinsim
$ cd cpinsim
~/cpinsim$ python setup.py install

In this case you have to manually install the requirements listed above.

Platform support

CPINSim is a pure Python program. This means that it runs on any operating system (OS) for which Python 3 and the other packages are available.

Example usage

The needed input file proteins_extended_adhesome.csv can be downloaded from the git repository via

wget https://raw.githubusercontent.com/BiancaStoecker/cpinsim/master/example_files/proteins_extended_adhesome.csv

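Example 1: Simulate the complex formation for the proteins in proteins_extended_adhesome.csv with 100 copies per protein (-n). Save the simulated graph to simulated_graph.gz and logging information about the simulation steps to simulation.log.

All other parameters keep their default values. The command assumes the input file is located in example_files/, as in a clone of the repository; if you downloaded it with wget into the current directory, adjust the path accordingly.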

$ cpinsim simulate example_files/proteins_extended_adhesome.csv -n 100 -og simulated_graph.gz -ol simulation.log

Note: The simulated graph is a pickled Python object (from the networkx library), saved in gzipped format. To examine it, you have to write Python code to unzip and unpickle it and then use the networkx API to examine its properties (see below for an example).

Example 2: Simulate the complex formation as in example 1, but now knock out the protein FYN and overexpress the protein ABL1 by factor 5.

$ cpinsim simulate example_files/proteins_extended_adhesome.csv -n 100 -og simulated_graph_ko_FYN_oexp_ABL1.gz -ol simulation_ko_FYN_oexp_ABL1.log -p FYN 0 -p ABL1 5
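
When several perturbation scenarios are to be compared, it can help to assemble the command line from a script. The following is a minimal sketch that uses only the options shown in the examples above (-n, -og, -ol, -p); the helper name build_simulate_command and its parameters are hypothetical and not part of CPINSim.

```python
import shlex


def build_simulate_command(proteins_file, copies, graph_out, log_out, perturbations=None):
    """Assemble a `cpinsim simulate` call (hypothetical helper, not part of CPINSim).

    perturbations maps a protein name to its factor, e.g. 0 for a knockout
    or 5 for a five-fold overexpression, as in Example 2.
    """
    cmd = [
        "cpinsim", "simulate", proteins_file,
        "-n", str(copies),
        "-og", graph_out,
        "-ol", log_out,
    ]
    # one "-p NAME FACTOR" triple per perturbed protein
    for protein, factor in (perturbations or {}).items():
        cmd += ["-p", protein, str(factor)]
    return cmd


cmd = build_simulate_command(
    "example_files/proteins_extended_adhesome.csv",
    100,
    "simulated_graph_ko_FYN_oexp_ABL1.gz",
    "simulation_ko_FYN_oexp_ABL1.log",
    perturbations={"FYN": 0, "ABL1": 5},
)
# show the command as it would be typed in a shell
print(shlex.join(cmd))
```

To actually run the simulation, the list can be passed to subprocess.run(cmd, check=True), provided the cpinsim script is on the PATH of the active environment.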

To investigate the simulation results, load the simulated graph in a Python shell and, for example, look at the node lists of the resulting complexes:

import gzip
import pickle

import networkx as nx

with gzip.open("simulated_graph.gz", "rb") as f:
    # load graph, each complex is a connected component
    graph = pickle.load(f)

# get list of complexes sorted descendingly by their number of nodes
# (nx.connected_component_subgraphs() and Graph.node were removed in
# networkx 2.4; the calls below are the networkx 2.x replacements)
complexes = sorted(
    (graph.subgraph(c) for c in nx.connected_components(graph)),
    key=len,
    reverse=True,
)
# print the first 5 complexes
for c in complexes[:5]:
    # nodes have unique integer ids; the protein name is stored in the "name" attribute
    print([c.nodes[node]["name"] for node in c])

With the steps above, complexes contains each protein complex as a full networkx graph data structure for further analysis.
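
As one possible starting point for such an analysis, the sketch below counts how many complexes of each size occur and which protein compositions are most frequent. It assumes an undirected graph whose nodes carry a "name" attribute, as described above; the function complex_summary and the toy graph are illustrative only and not part of CPINSim.

```python
from collections import Counter

import networkx as nx


def complex_summary(graph):
    """Count complexes by size and by protein composition (illustrative helper)."""
    sizes = Counter()
    compositions = Counter()
    for nodes in nx.connected_components(graph):
        sizes[len(nodes)] += 1
        # a composition is the sorted tuple of protein names in one complex
        names = tuple(sorted(graph.nodes[n]["name"] for n in nodes))
        compositions[names] += 1
    return sizes, compositions


# toy stand-in for a simulated graph: two FYN-ABL1 dimers and one free FYN
toy = nx.Graph()
toy.add_nodes_from([
    (0, {"name": "FYN"}),
    (1, {"name": "ABL1"}),
    (2, {"name": "FYN"}),
    (3, {"name": "ABL1"}),
    (4, {"name": "FYN"}),
])
toy.add_edges_from([(0, 1), (2, 3)])

sizes, compositions = complex_summary(toy)
print(dict(sizes))
print(compositions.most_common(1))
```

For a real run, pass the unpickled simulation graph instead of the toy graph.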

Additional example files for the data preprocessing steps and a full workflow including the evaluation of the simulation results will be uploaded in the near future.
