Protein-Protein Interaction (PPI) Representation

Protein-Protein Interaction (PPI) representation refers to the ways in which interactions between proteins can be represented or encoded. PPI representations aim to capture the structural, functional, and relational aspects of protein interactions and are used in a range of computational methods and analyses. The following methods are used for representation:

Node2vec and HOPE are popular algorithms for generating node embeddings in network analysis, including graph-based protein representations. Both are representation learning methods that learn low-dimensional vector representations (embeddings) for the nodes of a graph.

Please refer to https://palash1992.github.io/GEM/ to access the readme as a webpage.

Dependencies

The required dependencies are listed in the ppi_environment.yml file and can be installed by importing that file.

Node2vec parameters

| Parameter | Description | Value |
|---|---|---|
| d | Embedding dimension | 10, 50, 100, 200, 500, 1000 |
| p | Return parameter (controls the likelihood of immediately revisiting a node in the walk) | 0.25, 0.5, 1, 2 |
| q | In-out parameter (allows the search to differentiate between "inward" and "outward" nodes) | 0.25, 0.5, 1, 2 |
| max_iter | Maximum iterations | 1 |
| walk_len | Random walk length | 80 |
| con_size | Context size | 10 |
| num_walks | Number of random walks | 10 |
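
As a rough illustration of how the values in the table above combine, the sketch below enumerates the (d, p, q) grid and builds one command line per combination. The mapping of table names to the SNAP node2vec flags (`-d`, `-p`, `-q`, `-e`, `-l`, `-k`, `-r`) is an assumption based on the SNAP command-line tool, and the file names are hypothetical; the repository's own Node2vec.py may set these parameters differently.

```python
from itertools import product

# Values taken from the Node2vec parameter table.
DIMENSIONS = [10, 50, 100, 200, 500, 1000]
P_VALUES = [0.25, 0.5, 1, 2]
Q_VALUES = [0.25, 0.5, 1, 2]
FIXED = {"max_iter": 1, "walk_len": 80, "con_size": 10, "num_walks": 10}


def parameter_grid():
    """Return every (d, p, q) combination from the table."""
    return list(product(DIMENSIONS, P_VALUES, Q_VALUES))


def node2vec_command(edgelist, output, d, p, q, fixed=FIXED):
    """Build an argument list for the node2vec binary.

    The flag names are an assumption about the SNAP command-line tool,
    not something taken from this repository.
    """
    return [
        "node2vec",
        f"-i:{edgelist}",             # input edgelist (hypothetical path)
        f"-o:{output}",               # output embedding file (hypothetical path)
        f"-d:{d}",                    # embedding dimension
        f"-p:{p}",                    # return parameter
        f"-q:{q}",                    # in-out parameter
        f"-e:{fixed['max_iter']}",    # maximum iterations (epochs)
        f"-l:{fixed['walk_len']}",    # random walk length
        f"-k:{fixed['con_size']}",    # context size
        f"-r:{fixed['num_walks']}",   # number of random walks
    ]


grid = parameter_grid()
print(len(grid))  # 6 * 4 * 4 = 96 combinations
print(" ".join(node2vec_command("ppi.edgelist", "ppi.emb", *grid[0])))
```

Each argument list could be passed to `subprocess.run` once the node2vec binary is on the PATH.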

HOPE parameters

| Parameter | Description | Value |
|---|---|---|
| d | Embedding dimension | 10, 50, 100, 200, 500, 1000 |
| beta | Decay factor | 0.00390625, 0.0078125, 0.015625, 0.03125, 0.0625, 0.125, 0.25, 0.5 |
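
The beta values listed above are the successive powers of two from 2^-8 to 2^-1. The following minimal sketch regenerates them and enumerates the (d, beta) grid; the variable and function names are illustrative and not taken from HOPE.py.

```python
from itertools import product

# Embedding dimensions from the HOPE parameter table.
DIMENSIONS = [10, 50, 100, 200, 500, 1000]

# Decay factors: 2**-8, 2**-7, ..., 2**-1 (exactly representable as floats).
BETAS = [2.0 ** -k for k in range(8, 0, -1)]


def hope_grid():
    """Return every (d, beta) combination from the table."""
    return list(product(DIMENSIONS, BETAS))


print(BETAS[0], BETAS[-1])  # 0.00390625 0.5
print(len(hope_grid()))     # 6 * 8 = 48 combinations
```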

Data Format

Edge List

  • NetworkX graphs can be read and written as edge lists.

  • The edgelist format stores simple edge data.

Example:

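Interaction data:

| Interaction A | Interaction B |
|---|---|
| P05089 | P05362 |
| P05362 | P14902 |
| P14902 | P16410 |
| P15692 | P14902 |
| P16070 | P14902 |
| P16410 | P05362 |

Edgelist data:

| Interaction A | Interaction B |
|---|---|
| 0 | 1 |
| 1 | 2 |
| 2 | 5 |
| 3 | 2 |
| 4 | 2 |
| 5 | 1 |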
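
The conversion in the example can be sketched as follows. The integer indices shown above are consistent with numbering the sorted unique protein identifiers from 0, so this sketch assumes that scheme; edgelist_code.py may assign indices differently, and the function name here is hypothetical.

```python
def build_edgelist(interactions):
    """Map protein identifiers to integer node indices.

    Assumption: indices follow the sorted order of the unique identifiers,
    which reproduces the example but may not match edgelist_code.py.
    """
    proteins = sorted({protein for pair in interactions for protein in pair})
    index = {protein: i for i, protein in enumerate(proteins)}
    edges = [(index[a], index[b]) for a, b in interactions]
    return edges, index


interactions = [
    ("P05089", "P05362"),
    ("P05362", "P14902"),
    ("P14902", "P16410"),
    ("P15692", "P14902"),
    ("P16070", "P14902"),
    ("P16410", "P05362"),
]

edges, index = build_edgelist(interactions)

# One edge per line, node indices separated by a space.
edgelist_text = "\n".join(f"{a} {b}" for a, b in edges)
print(edgelist_text)
```

The `index` dictionary doubles as the node-to-protein lookup that is needed later to label the embeddings.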

How to run methods

  • Users who have already installed HOPER do not need to perform the following steps.

  • If you have not installed HOPER, you must perform the steps below to run the PPI representation methods.

  • Install the dependencies first.

  • Create the edgelist (input data). Please refer to edgelist_code.py.

  • If you are going to use the IntAct database, preprocessing is required. The relevant code is available at https://github.com/serbulent/HOPER/blob/main/Reproduction/ppi_representations/intact_data_preprocess.py

  • To install packages to use for Node2vec and HOPE in your home directory, use:

  • To build the Node2vec executable, clone the repository with git clone https://github.com/snap-stanford/snap and compile SNAP. The commands for compiling are as follows:

    • cd snap/
    • rm -rf examples/Release
    • make all
    • cd examples/node2vec
    • chmod +x node2vec
    • ls -alh node2vec
  • Make node2vec executable and add it to the system PATH, or move it to the directory from which you run the scripts.

  • Identify the protein names corresponding to the nodes (Reproduction/ppi_representations/data/proteins_id.csv).

You can generate the protein names using edgelist_code.py. These names will be needed later by the node2vec.py and HOPE.py files, so keep track of where the file is stored.

  • You can use a small sample to try out the methods. The sample interactions are randomly generated (Reproduction/ppi_representations/data/small_example.xlsx).

  • Set parameters

  • Create representations

The methods can be run with python Node2vec.py and python HOPE.py (input data: the .edgelist file and the protein ID names file in ppi_representations/data).

Node2vec and HOPE outputs are recorded in ppi_representations/data.
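
Two of the steps above, checking that node2vec is reachable on the PATH and relating node indices back to protein names, can be sketched as below. This is only an illustration: it assumes the embedding output is a word2vec-style text file (a header line with the node count and dimension, then one line per node), which may not match what Node2vec.py or HOPE.py actually write, and the sample vectors are made-up values.

```python
import shutil


def find_node2vec(name="node2vec"):
    """Return the full path of the binary if it is on PATH, else None."""
    return shutil.which(name)


def parse_embeddings(text, node_to_protein):
    """Parse embedding text and key each vector by protein name.

    Assumed format (not confirmed by this repository):
        <number of nodes> <dimension>
        <node index> <v1> <v2> ...
    """
    lines = [line for line in text.strip().splitlines() if line.strip()]
    n_nodes, dim = (int(value) for value in lines[0].split())
    embeddings = {}
    for line in lines[1:]:
        parts = line.split()
        node = int(parts[0])
        vector = [float(value) for value in parts[1:]]
        if len(vector) != dim:
            raise ValueError(f"node {node}: expected {dim} values, got {len(vector)}")
        embeddings[node_to_protein[node]] = vector
    if len(embeddings) != n_nodes:
        raise ValueError(f"expected {n_nodes} nodes, got {len(embeddings)}")
    return embeddings


# Made-up sample output for two nodes with 3-dimensional embeddings.
sample = """2 3
0 0.1 0.2 0.3
1 -0.4 0.5 0.6
"""
node_to_protein = {0: "P05089", 1: "P05362"}  # hypothetical index-to-name lookup

vectors = parse_embeddings(sample, node_to_protein)
print(find_node2vec() or "node2vec not found on PATH")
print(vectors["P05089"])
```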