aeu79/PyMI-IPA

PyMI-IPA

This repository contains a Python port of the MATLAB MI-IPA implementation (GitHub), which accompanies the article "Inferring interaction partners from protein sequences using mutual information", by Anne-Florence Bitbol (article).



What we know so far (matlab version)

Input

Standard_HKRR_dataset.fasta

MSA in FASTA format (5109 random pairs). The first sequence is the reference and the last is a dummy sequence. Sequence names look like this:

>Halhy_2679_HK_Classic|Haliscomenobacter_hydrossis_DSM_1100|Pair_Halhy_2678|HisKA_202-269/Halhy_2678_RR_unclassified|Haliscomenobacter_hydrossis_DSM_1100|Pair_Halhy_2679|Response_reg_6-118

>Halhy_3192_HK_Classic|Haliscomenobacter_hydrossis_DSM_1100|Pair_Halhy_3191|HisKA_204-270/Halhy_3191_RR_OmpR|Haliscomenobacter_hydrossis_DSM_1100|Pair_Halhy_3192|Response_reg_5-115
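
The header layout above can be split into its parts with a small parser. This is only a sketch: it assumes that "/" separates the HK half from the RR half, that each half has exactly four "|"-separated fields, and that the trailing numbers are a start-end range. The names `Partner`, `parse_partner` and `parse_pair_header` are hypothetical and not part of the repository.

```python
from typing import NamedTuple, Tuple


class Partner(NamedTuple):
    """One half of a paired header (the HK side or the RR side)."""
    protein_id: str  # e.g. "Halhy_2679_HK_Classic"
    species: str     # e.g. "Haliscomenobacter_hydrossis_DSM_1100"
    pair_tag: str    # e.g. "Pair_Halhy_2678"
    domain: str      # e.g. "HisKA"
    start: int       # first number of the range (ASSUMED to be a start position)
    end: int         # second number of the range (ASSUMED to be an end position)


def parse_partner(text: str) -> Partner:
    # ASSUMPTION: exactly four "|"-separated fields per half.
    protein_id, species, pair_tag, domain_range = text.split("|")
    # The domain name may itself contain "_" (e.g. "Response_reg"),
    # so split only on the last underscore.
    domain, span = domain_range.rsplit("_", 1)
    start, end = span.split("-")
    return Partner(protein_id, species, pair_tag, domain, int(start), int(end))


def parse_pair_header(header: str) -> Tuple[Partner, Partner]:
    # ASSUMPTION: a single "/" separates the HK half from the RR half.
    hk_text, rr_text = header.lstrip(">").strip().split("/")
    return parse_partner(hk_text), parse_partner(rr_text)


header = (
    ">Halhy_2679_HK_Classic|Haliscomenobacter_hydrossis_DSM_1100|Pair_Halhy_2678|HisKA_202-269"
    "/Halhy_2678_RR_unclassified|Haliscomenobacter_hydrossis_DSM_1100|Pair_Halhy_2679|Response_reg_6-118"
)
hk, rr = parse_pair_header(header)
print(hk.protein_id, hk.domain, hk.start, hk.end)
print(rr.protein_id, rr.domain, rr.start, rr.end)
```

A header that does not match these assumptions (for example, a species name containing "/") raises a `ValueError` instead of being parsed silently.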

We need a way to supply the length of each sequence in a pair within the alignment (probably best given as an argument).

SpeciesNumbering_Standard_HKRR_dataset.mat

Sp. Index   Sp. name
1           Acetohalobium_arabaticum_DSM_5501
2           Carboxydothermus_hydrogenoformans_Z-2901
3           Vibrio_cholerae_MJ-1236
4           Rickettsia_rickettsii_str._Hino
5           Serratia_marcescens_WW4
6           Bacillus_cereus_AH820
7           Escherichia_coli_CFT073

502 species in total. We made a .csv version for our own use.
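
A species table like the one above could be loaded into a dictionary as sketched here. The exact layout of the .csv file is not documented in this README, so the sketch assumes two comma-separated columns (index, name) with an optional header row. The function name `read_species_numbering` is hypothetical.

```python
import csv
import io
from typing import Dict, TextIO


def read_species_numbering(handle: TextIO) -> Dict[int, str]:
    """Map species index -> species name.

    ASSUMPTION: two comma-separated columns (index, name); rows whose first
    column is not an integer (e.g. a header row) are skipped.
    """
    species: Dict[int, str] = {}
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # blank or malformed line
        index, name = row[0].strip(), row[1].strip()
        if not index.isdigit():
            continue  # header row
        species[int(index)] = name
    return species


# In-memory stand-in for the real file, using names from the table above.
example = io.StringIO(
    "Sp. Index,Sp. name\n"
    "1,Acetohalobium_arabaticum_DSM_5501\n"
    "2,Carboxydothermus_hydrogenoformans_Z-2901\n"
    "3,Vibrio_cholerae_MJ-1236\n"
)
species = read_species_numbering(example)
print(species[2])
```

For a real file, pass an open file handle instead of the `io.StringIO` object.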

Output

(from MatLab, see results.zip)

IniScrambling_Ninc1600_rep1.txt

Sp. Index   Seq nº   Sp. Index   Seq nº
491         4948     491         4949
491         4949     491         4944
491         4950     491         4947
491         4951     491         4936
492         4952     492         4955
492         4953     492         4956
492         4954     492         4954

Probably the random pairs (5064 of them), not the final result.

Resf_Ninc1600_rep1.txt

Sp. Index   Seq nº   Seq nº   Score x?   Score y?
1           2        2        -822.35    452.64
1           3        3        -705.42    452.64
2           4        7        -769.9     136.96
2           5        5        -301.21    359.58
...         ...      ...      ...        ...
502         5105     5105     -605.18    385.49
502         5106     5108     -409.19    152.64
502         5107     5107     -285.82    457.96
502         5108     5098     -270.35    134.47

Contains 5064 lines with the final pairs (the input FASTA has 5109 pairs). We will have to match these sequence numbers to the sequences in the input FASTA to generate a nicer output (table).
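
The matching step could look roughly like the sketch below. It rests on several unconfirmed assumptions: the file is whitespace-delimited with the five columns shown above, the sequence numbers are 1-based positions in the input FASTA, and the first "Seq nº" column refers to the HK half while the second refers to the RR half. The names `PredictedPair`, `read_resf` and `label_pairs`, and the toy headers, are hypothetical.

```python
import io
from typing import List, NamedTuple, Sequence, TextIO, Tuple


class PredictedPair(NamedTuple):
    species_index: int
    seq_a: int      # first "Seq nº" column (ASSUMED to be the HK sequence)
    seq_b: int      # second "Seq nº" column (ASSUMED to be the RR sequence)
    score_x: float  # meaning not yet known ("Score x?")
    score_y: float  # meaning not yet known ("Score y?")


def read_resf(handle: TextIO) -> List[PredictedPair]:
    """Read whitespace-delimited rows; skip anything that is not 5 numbers."""
    pairs = []
    for line in handle:
        fields = line.split()
        if len(fields) != 5:
            continue
        try:
            sp, a, b, x, y = fields
            pairs.append(PredictedPair(int(sp), int(a), int(b), float(x), float(y)))
        except ValueError:
            continue  # non-numeric row, e.g. a header
    return pairs


def label_pairs(
    pairs: Sequence[PredictedPair], headers: Sequence[str]
) -> List[Tuple[int, str, str, float, float]]:
    """Replace sequence numbers with protein IDs taken from the FASTA headers.

    ASSUMPTION: sequence numbers are 1-based positions in the input FASTA,
    and each header has the form "<HK fields>/<RR fields>" with "|" inside.
    """
    table = []
    for p in pairs:
        hk_id = headers[p.seq_a - 1].split("/")[0].split("|")[0]
        rr_id = headers[p.seq_b - 1].split("/")[1].split("|")[0]
        table.append((p.species_index, hk_id, rr_id, p.score_x, p.score_y))
    return table


# Toy headers (hypothetical IDs); position 1 plays the role of the reference.
headers = [
    "ref_HK|sp0|Pair_ref|HisKA_1-10/ref_RR|sp0|Pair_ref|Response_reg_1-10",
    "hkA|sp1|Pair_rrA|HisKA_1-10/rrA|sp1|Pair_hkA|Response_reg_1-10",
    "hkB|sp1|Pair_rrB|HisKA_1-10/rrB|sp1|Pair_hkB|Response_reg_1-10",
]
resf = io.StringIO("1 2 2 -822.35 452.64\n1 3 3 -705.42 452.64\n")
table = label_pairs(read_resf(resf), headers)
for row in table:
    print(*row, sep="\t")
```

If the numbering turns out to be 0-based or offset by the reference sequence, only the `- 1` in `label_pairs` needs to change.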

TP_data_Ninc1600_rep1.txt

Nº of pairs?   ?        ?      ?      ?      ?
0              4452.3   872    4192   0      0
1600           1511     1455   3609   256    1344
3200           2240.5   1737   3327   1043   2157
4800           3014     1913   3151   1690   3110
5064           3156.3   2040   3024   1913   3151

Output matrix (MI_IPA_main.m: lines 102-104)


Contributing

Workflow

If you're interested in contributing to this project, please follow these guidelines. Our main branch, where all changes are gathered, is called "develop". Every feature (function) gets its own branch, and once the feature is finished, its branch is merged into "develop". The branch "matlab" contains the original MATLAB version. Pending tasks are managed in Issues.

Setting up your local repository

  1. Clone the repository
git clone git@github.com:ElofssonLab/MI_IPA.git

or use the HTTPS version if you are not using SSH keys

git clone https://github.com/ElofssonLab/MI_IPA.git
  2. Check current (local) branch
git branch
  3. Check existing remote branches
git branch -r
  4. Now you can collaborate in an existing branch or create a new one
  • Create and switch to a new branch
git checkout -b <my_new_branch>
  • Or download an existing branch ("MI_IPA_main.py" as an example)
git fetch origin MI_IPA_main.py:MI_IPA_main.py

Switch to the downloaded branch

git checkout MI_IPA_main.py

Start coding

Create the file with your code ("my_code.py") in the repository directory, then go to that directory.

cd <path to directory>

Verify working branch

git branch

List files/directories modified locally

git status

Stage your file

git add my_code.py

Commit the changes

git commit -m "Added my code"

Every commit creates a "snapshot" (GitHub advice about commits)

To finally send your commits to the remote:

git push

If this is your first push to a newly created branch, you will get this error:

fatal: The current branch my_new_branch has no upstream branch.
To push the current branch and set the remote as upstream, use
git push --set-upstream origin my_new_branch

So:

git push --set-upstream origin my_new_branch

Dependencies

  • pandas
  • biopython
  • scipy
  • munkres

Install with:

pip3 install pandas biopython scipy munkres
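
To confirm that the installation worked, a quick check like the following can be run. It is a sketch that is not part of the repository, and the helper name `missing_dependencies` is hypothetical. Note that the biopython package is imported as `Bio`.

```python
import importlib.util
from typing import Dict, List

# pip package name -> top-level module name used in "import ..."
REQUIRED: Dict[str, str] = {
    "pandas": "pandas",
    "biopython": "Bio",
    "scipy": "scipy",
    "munkres": "munkres",
}


def missing_dependencies(required: Dict[str, str] = REQUIRED) -> List[str]:
    """Return the pip names of packages whose module cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing packages: " + " ".join(missing))
        print("Install with: pip3 install " + " ".join(missing))
    else:
        print("All dependencies found.")
```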
