
Annotation example with the OpenStructure Computational Structural Biology Framework

Gabriel Studer edited this page Apr 7, 2020 · 3 revisions

The OpenStructure Computational Structural Biology Framework

OpenStructure is a framework for loading, analyzing and manipulating macromolecular data. More information on the framework, including installation instructions and notes on running it in Singularity/Docker containers, can be found on the official website: https://openstructure.org/

Mapping Nucleotide Binding Residues for RNA-Polymerase

The EM structure with PDB ID 6m71 does not contain any ligands. The following code snippet loads a set of viral RNA-Polymerases from a study on poliovirus (PMID: 17223130). Despite only remote homology, the polymerase architecture is largely conserved. This allows the poliovirus structures to be superposed onto 6m71 and the residues that interact with nucleotides to be mapped.

Running the following script requires OpenStructure to be installed on your system, i.e. the OpenStructure executable (ost) to be in your PATH and your PYTHONPATH set accordingly. Usage of the annotation features requires the root directory of the covid-19-Annotations-on-Structures repository to be in PYTHONPATH.
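Before running the script, the prerequisites above can be checked from Python. The following is only a minimal sketch using the standard library; the helper name check_setup is hypothetical and not part of OpenStructure or the annotation repository. It assumes the OpenStructure Python module is importable as ost and the annotation helpers as utils.sm_annotations, as in the script below.

```python
import importlib.util
import shutil


def check_setup():
    """Report whether the prerequisites described above appear to be met.

    Hypothetical helper for illustration, not part of OpenStructure.
    """
    status = {}
    # Is the 'ost' executable reachable through PATH?
    status["ost_executable"] = shutil.which("ost") is not None
    # Can Python locate the OpenStructure module and the repository's utils?
    for name in ("ost", "utils.sm_annotations"):
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises if the parent package cannot be imported
            status[name] = False
    return status


if __name__ == "__main__":
    for item, ok in check_setup().items():
        print(f"{item}: {'found' if ok else 'MISSING'}")
```

If you run this with a plain Python interpreter rather than through ost, the result reflects that interpreter's module search path, which may differ from the one the ost executable sets up.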

If everything is set up, you can run:

ost my_awesome_script.py

where my_awesome_script.py contains:

# openstructure related imports
from ost import io, bindings
# assumes that the root directory of
# https://github.com/SWISS-MODEL/covid-19-Annotations-on-Structures
# is in your Python path
from utils.sm_annotations import Annotation

# EM structure of SARS-CoV2 RNA-Polymerase
polymerase = io.LoadPDB("6m71.1", remote=True, remote_repo='smtl')

# PMID 31138817 gives a structural description of viral RNA-Polymerases 
# which share an overall architecture with fingers, palm and thumb domains:
# we use the fingers domain to superpose RNA-Polymerases from poliovirus
# onto 6m71
fingers_query = "cname=A and rnum=398:581,628:687"
fingers_domain = polymerase.Select(fingers_query)

# these are the polymerases from poliovirus and their bound nucleotides
polio_ids = ['2im2.1', '2im0.1', '1ra7.1', '2ily.1', '2im3.1', '2im1.1', 
             '2ilz.1']
polio_nucleotides = ['UTP', 'CTP', 'GTP', 'ATP', 'UTP', 'CTP', 'GTP']

# Superpose every polio polymerase onto 6m71 and transfer its nucleotides
# into a separate ligand chain
ed = polymerase.EditXCS()
ligand_chain = ed.InsertChain('_')
for polio_id, nuc_name in zip(polio_ids, polio_nucleotides):
    polio_polymerase = io.LoadPDB(polio_id, remote=True, remote_repo='smtl')
    polio_peptide = polio_polymerase.Select("peptide=True")
    tm_res = bindings.WrappedTMAlign(polio_peptide.chains[0], 
                                     fingers_domain.chains[0])
    polio_polymerase.EditXCS().ApplyTransform(tm_res.transform)
    nucleotides = polio_polymerase.Select("rname="+nuc_name)
    for n in nucleotides.residues:
        added_n = ed.AppendResidue(ligand_chain, nuc_name)
        for at in n.atoms:
            ed.InsertAtom(added_n, at.handle)

# save to disk for visual inspection 
io.SavePDB(polymerase, "polymerase_with_nucleotides.pdb")

# get residues close to any of the nucleotides and generate custom annotations
close_residues = polymerase.Select('4 <> [cname=_] and cname!=_')
rnums = [r.GetNumber().GetNum() for r in close_residues.residues]

# rnums are relative to https://www.ncbi.nlm.nih.gov/protein/YP_009725307.1
# we need an offset to map it to P0DTD1
offset = 4392

annotation = Annotation()
for rnum in rnums:
    annotation.add("P0DTD1", rnum+offset, 'r', 'Nucleotide binding site')
print(annotation)
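
The query '4 <> [cname=_] and cname!=_' in the script selects atoms outside the ligand chain that lie within 4 Å of any atom in chain '_'. The residue numbers are then read from the residues in that selection. As a rough illustration of the idea only, here is a pure-Python sketch that does not use OpenStructure. The function name, the input layout and the coordinates are made up for this example, and treating the cutoff as inclusive is an assumption.

```python
import math


def residues_near_ligand(protein_atoms, ligand_atoms, cutoff=4.0):
    """Return sorted residue numbers with at least one atom near a ligand atom.

    protein_atoms: iterable of (residue_number, (x, y, z)) tuples
    ligand_atoms:  iterable of (x, y, z) tuples
    Hypothetical helper; the inclusive cutoff is an assumption.
    """
    ligand_atoms = list(ligand_atoms)
    close = set()
    for rnum, pos in protein_atoms:
        # one ligand atom within the cutoff is enough to keep the residue
        if any(math.dist(pos, lig) <= cutoff for lig in ligand_atoms):
            close.add(rnum)
    return sorted(close)


# made-up coordinates, for illustration only
protein = [
    (10, (0.0, 0.0, 0.0)),
    (10, (1.5, 0.0, 0.0)),
    (11, (9.0, 0.0, 0.0)),
    (12, (3.0, 3.0, 0.0)),
]
ligand = [(3.0, 0.0, 0.0)]
print(residues_near_ligand(protein, ligand))  # [10, 12]
```

The brute-force loop is fine for a handful of ligand atoms. It is meant to explain the selection, not to replace the query in the script above.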

You can copy the output generated by the script directly into the annotation form:

P0DTD1	4943	4943	#ff0000	Nucleotide binding site
P0DTD1	4945	4945	#ff0000	Nucleotide binding site
P0DTD1	4948	4948	#ff0000	Nucleotide binding site
P0DTD1	4949	4949	#ff0000	Nucleotide binding site
P0DTD1	5010	5010	#ff0000	Nucleotide binding site
P0DTD1	5013	5013	#ff0000	Nucleotide binding site
P0DTD1	5014	5014	#ff0000	Nucleotide binding site
P0DTD1	5015	5015	#ff0000	Nucleotide binding site
P0DTD1	5074	5074	#ff0000	Nucleotide binding site
P0DTD1	5152	5152	#ff0000	Nucleotide binding site
P0DTD1	5190	5190	#ff0000	Nucleotide binding site
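
The numbers in the lines above are the residue numbers from the structure plus the offset of 4392. For example, residue 551 becomes 551 + 4392 = 4943. The following standalone sketch reproduces this arithmetic and the tab-separated layout shown above. The function to_annotation_lines is a hypothetical stand-in written for this example, not the Annotation class from the repository, and the default colour is simply copied from the output above.

```python
OFFSET = 4392  # offset from the script above (YP_009725307.1 -> P0DTD1)


def to_annotation_lines(rnums, accession="P0DTD1", color="#ff0000",
                        label="Nucleotide binding site", offset=OFFSET):
    """Build tab-separated lines: accession, start, end, colour, label.

    Hypothetical helper that mimics the layout of the output above.
    """
    lines = []
    # a residue may be reported more than once, so deduplicate and sort
    for rnum in sorted(set(rnums)):
        pos = rnum + offset
        lines.append("\t".join([accession, str(pos), str(pos), color, label]))
    return lines


# residues 551 and 553 map to positions 4943 and 4945 in P0DTD1
print("\n".join(to_annotation_lines([551, 553])))
```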

Alternatively, you can adapt the script to send the POST request directly; see here.