Annotation example with the OpenStructure Computational Structural Biology Framework
OpenStructure is dedicated to loading, analyzing and manipulating macromolecular data. More information on the software framework, including installation instructions and information on running it in Singularity/Docker containers, can be found on the official website: https://openstructure.org/
The EM structure with PDB ID 6m71 does not contain any ligands. The following code snippet loads a set of viral RNA polymerases from a study on poliovirus (PMID: 17223130). Despite only remote homology, the polymerase architecture is largely conserved, which allows for structural superposition and mapping of the residues that interact with nucleotides.
Running the following script requires OpenStructure to be installed on your system, i.e. the OpenStructure executable (ost) to be in your PATH and your PYTHONPATH set accordingly. Usage of the annotation features requires the root directory of the covid-19-Annotations-on-Structures repository to be in PYTHONPATH.
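The two prerequisites above can be checked before running anything. The following is a minimal sketch using only the Python standard library; the function name `check_setup` and the keys of the returned dictionary are made up for this illustration and are not part of OpenStructure or of the annotation repository. It should be run with the same interpreter and environment variables that `ost` will see, otherwise the result may not reflect the real setup.

```python
import importlib.util
import shutil


def check_setup(executable="ost", module="utils.sm_annotations"):
    """Report whether the executable is on PATH and the module can be located."""
    try:
        # find_spec imports the parent package (here: utils) to resolve the name
        spec = importlib.util.find_spec(module)
    except (ImportError, AttributeError, ValueError):
        # raised when the parent package cannot be resolved
        spec = None
    return {
        "executable_found": shutil.which(executable) is not None,
        "module_found": spec is not None,
    }


if __name__ == "__main__":
    for name, ok in check_setup().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

A `MISSING` entry for `module_found` usually means the repository root is not in PYTHONPATH; note that an unrelated package that happens to be called `utils` can also shadow the one from the repository.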
If all is set, you can run:
ost my_awesome_script.py
where my_awesome_script.py contains:
# openstructure related imports
from ost import io, bindings
# assumes that the root directory of
# https://github.com/SWISS-MODEL/covid-19-Annotations-on-Structures
# is in your Python path
from utils.sm_annotations import Annotation
# EM structure of the SARS-CoV-2 RNA polymerase
polymerase = io.LoadPDB("6m71.1", remote=True, remote_repo='smtl')
# PMID 31138817 gives a structural description of viral RNA-Polymerases
# which share an overall architecture with fingers, palm and thumb domain:
# we use the fingers domain to superpose RNA-Polymerases from poliovirus
# onto 6m71
fingers_query = "cname=A and rnum=398:581,628:687"
fingers_domain = polymerase.Select(fingers_query)
# these are the polymerases from poliovirus and their bound nucleotides
polio_ids = ['2im2.1', '2im0.1', '1ra7.1', '2ily.1', '2im3.1', '2im1.1',
             '2ilz.1']
polio_nucleotides = ['UTP', 'CTP', 'GTP', 'ATP', 'UTP', 'CTP', 'GTP']
# Superpose every polio polymerase and transfer nucleotides into separate
# ligand chain
ed = polymerase.EditXCS()
ligand_chain = ed.InsertChain('_')
for polio_id, nuc_name in zip(polio_ids, polio_nucleotides):
    polio_polymerase = io.LoadPDB(polio_id, remote=True, remote_repo='smtl')
    polio_peptide = polio_polymerase.Select("peptide=True")
    tm_res = bindings.WrappedTMAlign(polio_peptide.chains[0],
                                     fingers_domain.chains[0])
    polio_polymerase.EditXCS().ApplyTransform(tm_res.transform)
    nucleotides = polio_polymerase.Select("rname="+nuc_name)
    for n in nucleotides.residues:
        added_n = ed.AppendResidue(ligand_chain, nuc_name)
        for at in n.atoms:
            ed.InsertAtom(added_n, at.handle)
# save to disk for visual inspection
io.SavePDB(polymerase, "polymerase_with_nucleotides.pdb")
# get residues close to any of the nucleotides and generate custom annotations
close_residues = polymerase.Select('4 <> [cname=_] and cname!=_')
rnums = [r.GetNumber().GetNum() for r in close_residues.residues]
# rnums are relative to https://www.ncbi.nlm.nih.gov/protein/YP_009725307.1
# we need an offset to map it to P0DTD1
offset = 4392
annotation = Annotation()
for rnum in rnums:
    annotation.add("P0DTD1", rnum+offset, 'r', 'Nucleotide binding site')
print(annotation)
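The query `'4 <> [cname=_] and cname!=_'` in the script selects protein atoms within 4 Å of any atom in the ligand chain `_`; a residue counts as close as soon as one of its atoms is selected. The idea can be sketched in plain Python, independent of OpenStructure. The function name, the data layout and the toy coordinates below are invented for illustration, and treating the cutoff as inclusive is an assumption rather than a statement about OpenStructure's behavior.

```python
import math


def residues_near_ligand(residue_atoms, ligand_atoms, cutoff=4.0):
    """Return sorted residue numbers with any atom within cutoff of a ligand atom.

    residue_atoms: dict mapping residue number -> list of (x, y, z) tuples
    ligand_atoms:  list of (x, y, z) tuples
    """
    close = []
    for rnum, atoms in residue_atoms.items():
        # one atom within the cutoff is enough to keep the residue
        if any(math.dist(a, b) <= cutoff for a in atoms for b in ligand_atoms):
            close.append(rnum)
    return sorted(close)


# toy coordinates, not taken from any structure
protein = {
    10: [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    11: [(10.0, 0.0, 0.0)],
    12: [(0.0, 6.0, 0.0), (0.0, 3.5, 0.0)],
}
ligand = [(0.0, 0.0, 1.0)]
print(residues_near_ligand(protein, ligand))  # [10, 12]
```

This brute-force loop is only meant to make the selection criterion explicit; the script itself relies on OpenStructure's query language for the actual selection.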
You can copy the output printed by the script directly into the annotation form:
P0DTD1 4943 4943 #ff0000 Nucleotide binding site
P0DTD1 4945 4945 #ff0000 Nucleotide binding site
P0DTD1 4948 4948 #ff0000 Nucleotide binding site
P0DTD1 4949 4949 #ff0000 Nucleotide binding site
P0DTD1 5010 5010 #ff0000 Nucleotide binding site
P0DTD1 5013 5013 #ff0000 Nucleotide binding site
P0DTD1 5014 5014 #ff0000 Nucleotide binding site
P0DTD1 5015 5015 #ff0000 Nucleotide binding site
P0DTD1 5074 5074 #ff0000 Nucleotide binding site
P0DTD1 5152 5152 #ff0000 Nucleotide binding site
P0DTD1 5190 5190 #ff0000 Nucleotide binding site
Alternatively, you can adapt the script to send a POST request directly, see here.
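The residue-number mapping used in the script is plain arithmetic: each residue number (relative to YP_009725307.1) is shifted by the offset 4392 to obtain the position in P0DTD1, and every annotated residue becomes one line with identical start and end positions. The sketch below reproduces that line format without OpenStructure. The function `annotation_lines` and the `COLORS` table are hypothetical stand-ins, not the `Annotation` class from the repository, and the mapping of `'r'` to `#ff0000` is inferred from the output shown above.

```python
OFFSET = 4392  # value used in the script: YP_009725307.1 numbering -> P0DTD1

# assumption: 'r' stands for red, matching the #ff0000 in the output above
COLORS = {"r": "#ff0000"}


def annotation_lines(accession, rnums, color, label, offset=OFFSET):
    """Format one single-residue annotation line per residue number."""
    lines = []
    for rnum in rnums:
        pos = rnum + offset  # shift into the numbering of the target sequence
        lines.append(f"{accession} {pos} {pos} {COLORS[color]} {label}")
    return lines


for line in annotation_lines("P0DTD1", [551, 553], "r", "Nucleotide binding site"):
    print(line)
# P0DTD1 4943 4943 #ff0000 Nucleotide binding site
# P0DTD1 4945 4945 #ff0000 Nucleotide binding site
```

Residues 551 and 553 correspond to the first two lines of the output above (4943 - 4392 = 551, 4945 - 4392 = 553).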