plddt and Pae plots #15

Merged 84 commits on Feb 6, 2025
70b8b04
Add number_of_models parameter to run_boltz function and update comma…
hllelli2 Jan 15, 2025
aa7c478
Refactor AlphafoldOutput to simplify CIF file extraction and add get_…
hllelli2 Jan 15, 2025
e59c11d
Add pLDDT distribution plotting functionality and corresponding tests…
hllelli2 Jan 15, 2025
70b0d39
Add updated test data and update tests for Chai and Alphafold outputs
hllelli2 Jan 15, 2025
d1c1fd6
Update test names to use consistent identifier "6BJ9" across output t…
hllelli2 Jan 15, 2025
46b43b8
WIP
hllelli2 Jan 16, 2025
cbe9389
Merge branch 'dev' into Output-plots
hllelli2 Jan 16, 2025
ca4efe6
Implement Plotly visualization for pLDDT distribution and update tests
hllelli2 Jan 16, 2025
744b633
Refactor input handling in abcfold to use input_params consistently
hllelli2 Jan 17, 2025
130752e
Refactor test setup to use test_data for Alphafold, Boltz, and Chai o…
hllelli2 Jan 17, 2025
aca9a38
Refactor BoltzOutput and ChaiOutput constructors to accept input_para…
hllelli2 Jan 17, 2025
7be6ff5
Add Plotly-based pLDDT distribution visualization and remove matplotl…
hllelli2 Jan 20, 2025
55d96a4
Refactor CifFile initialization to accept input_params and update rel…
hllelli2 Jan 20, 2025
42f1a64
Update test input parameters and refactor pLDDT plot test for consist…
hllelli2 Jan 20, 2025
afb05bf
Enhance pLDDT distribution plotting function with customizable output…
hllelli2 Jan 21, 2025
65c314e
Add docstrings to AlphafoldOutput methods for improved clarity and do…
hllelli2 Jan 21, 2025
d111b18
Refactor file structure and enhance documentation with detailed docst…
hllelli2 Jan 21, 2025
aa1aa44
Enhance documentation with detailed docstrings for ABCFold, BoltzOutp…
hllelli2 Jan 21, 2025
8b4d070
Add detailed docstrings to run_alphafold3, run_boltz, and run_chai fu…
hllelli2 Jan 21, 2025
1482757
Refactor import statements to use file_handlers module for CifFile, C…
hllelli2 Jan 21, 2025
acdd71d
Refactor BoltzOutput and CifFile methods for improved chain length ca…
hllelli2 Jan 21, 2025
42cbf53
Rename test_process_boltz_output to test_process_af3_output and updat…
hllelli2 Jan 21, 2025
8bdf3be
Refactor test_add_mmseqs2 to rename af3_json parameter to input_param…
hllelli2 Jan 21, 2025
ca8bb20
Added html plot to test data for output page development
hllelli2 Jan 22, 2025
3e002f1
Added altered pae_viewer repo to abcfold, under MIT license
hllelli2 Jan 22, 2025
8417a27
Add .history to .gitignore to exclude history files from version control
hllelli2 Jan 22, 2025
415d182
Fix typo in .gitignore to correctly exclude .history files
hllelli2 Jan 22, 2025
8fc0c22
Enhance pLDDT plotting functionality to support output as div files a…
hllelli2 Jan 23, 2025
d53920d
WIP pae, will remove
hllelli2 Jan 23, 2025
6cf3d3a
Need to figure out chai pae
hllelli2 Jan 23, 2025
9222343
Refactor import statements for readability and update chain_lengths m…
hllelli2 Jan 23, 2025
88d6fc6
Enhance chain_lengths method to support ligand atom counting and add …
hllelli2 Jan 23, 2025
5789347
Add Af3Pae class for processing PAE scores and enhance utility functions
hllelli2 Jan 23, 2025
2caa51b
Add tests for Af3Pae integration and enhance boltz output validation
hllelli2 Jan 23, 2025
d9a5d16
Add tests for Boltz output processing and integrate Af3Pae functionality
hllelli2 Jan 23, 2025
6c18517
Remove debug print statements from run_inference_wrapper function
hllelli2 Jan 23, 2025
d79ef6b
WIP
hllelli2 Jan 23, 2025
629523f
Fix variable name for ligand ccdCodes in add_ligand method
hllelli2 Jan 23, 2025
47759d7
Enhance Af3Pae.from_chai1 method to include ligand atom handling and …
hllelli2 Jan 23, 2025
9836fc1
Added af3_format paes for chai and boltz
hllelli2 Jan 23, 2025
0a7d6a0
Fix installation error message in check_af3_install function
hllelli2 Jan 23, 2025
b5af5de
No code changes made.
hllelli2 Jan 23, 2025
0210d66
Refactor output handling in AlphafoldOutput class to rename scores_fi…
hllelli2 Jan 23, 2025
3717265
Add conversion of PAE data to AlphaFold3 format in BoltzOutput and Ch…
hllelli2 Jan 23, 2025
b8b6f7a
Fix formatting in utils.py by adding a blank line for improved readab…
hllelli2 Jan 23, 2025
f400976
Add PAE viewer HTML templates and remove unused name setter in CifFil…
hllelli2 Jan 27, 2025
e040cd8
Add jinja2 dependency to pyproject.toml for template rendering support
hllelli2 Jan 27, 2025
921a704
WIP
hllelli2 Jan 27, 2025
aedbbc5
Refactor AlphafoldOutput and ChaiOutput to simplify data extraction; …
hllelli2 Jan 27, 2025
6290c13
Remove unused HTML templates for PAE viewer to streamline the project
hllelli2 Jan 27, 2025
38677fa
Refactor test_plots and pae_plots to improve structure and add templa…
hllelli2 Jan 28, 2025
c7e006a
Refactor compileTemplate function to accept relativePath for dynamic …
hllelli2 Jan 28, 2025
28b1c4a
Refactor imports in test files and utils.py for improved organization…
hllelli2 Jan 28, 2025
92f9376
Refactor import statements in boltz.py and chai.py for improved reada…
hllelli2 Jan 28, 2025
ca87f5b
Fixed trailing whitespace issues
hllelli2 Jan 28, 2025
b1752e8
Add MANIFEST.in and update pyproject.toml to include new plot templat…
hllelli2 Jan 28, 2025
1450cbd
Refactor abcfold.py and pae_plot.py for improved organization; add pl…
hllelli2 Jan 28, 2025
dab77ed
Update create_template.py to use src_path for relative path calculati…
hllelli2 Jan 28, 2025
c97346d
Refactor pae_viewer.py to ensure template_file is a Path object; upda…
hllelli2 Jan 28, 2025
46fa2f3
Fix template_file argument position in load_pae_viewer call and add d…
hllelli2 Jan 28, 2025
c692fbd
Refactor test fixtures in conftest.py to use session scope and improv…
hllelli2 Jan 28, 2025
8227e6a
Refactor tests to use output_objs for data handling and improve consi…
hllelli2 Jan 28, 2025
ce3b156
Comment out ligand check in CifFile class to retain residue processin…
hllelli2 Jan 28, 2025
474470b
Merge branch 'Output-plots' into pae_plots
hllelli2 Jan 28, 2025
cfeeb5b
Rename namedtuple in output_objs function for clarity
hllelli2 Jan 28, 2025
bdafe50
Test fixes
hllelli2 Jan 28, 2025
2c9ca62
fixtests2
hllelli2 Jan 28, 2025
3e614e4
Update MANIFEST.in to include PNG files and modify structure-panel te…
hllelli2 Jan 29, 2025
ec4e183
Add clashes CSV generation and update PAE plotting scripts
hllelli2 Jan 29, 2025
321f3c8
Update help message to reflect CSV format for crosslinks
hllelli2 Jan 29, 2025
226a602
Add clash detection method to CifFile for atom proximity checks
hllelli2 Jan 29, 2025
468e123
w
hllelli2 Jan 29, 2025
857d646
Refactor import statements in file_handlers.py for improved readability
hllelli2 Jan 29, 2025
b3652c9
Isort
hllelli2 Jan 29, 2025
d7a6494
Refactor create_pae_plots function parameters and update test cases f…
hllelli2 Jan 30, 2025
24df20d
Add VanderWaals radii constants for various elements
hllelli2 Jan 30, 2025
968f0d3
Enhance clash detection by incorporating VanderWaals radii and adding…
hllelli2 Jan 30, 2025
65dbd3a
Updated from artifact v3 to v4
hllelli2 Jan 30, 2025
3dc7b17
Added more elegant clash checker with variable atom radii distances c…
hllelli2 Jan 30, 2025
83f50a8
Refactor import statements to use abc_script_utils instead of af3_scr…
hllelli2 Jan 30, 2025
416d494
Add num_recycles argument to run functions and update command generation
hllelli2 Jan 30, 2025
249afae
Refactor argument handling and enhance PLDDT plotting with unique cha…
hllelli2 Jan 30, 2025
528f2f8
Set interactive mode to False by default in run_alphafold3 functions
hllelli2 Feb 4, 2025
f945df4
Update test_run_alphafold3.py
hllelli2 Feb 6, 2025
2 changes: 1 addition & 1 deletion .github/workflows/python-package.yml
Original file line number Diff line number Diff line change
@@ -38,7 +38,7 @@ jobs:
 pip install coverage-badge
 coverage-badge -o .blob/coverage.svg -f
 - name: Upload coverage badge
-uses: actions/upload-artifact@v3
+uses: actions/upload-artifact@v4
 with:
 name: coverage-badge
 path: coverage.svg
2 changes: 2 additions & 0 deletions .gitignore
@@ -10,3 +10,5 @@ build/
 .coverage

 .vscode/
+
+.history/
2 changes: 2 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,2 @@
+recursive-include abcfold *.js *.css *.html *.svg *.png
+recursive-include abcfold/plots/pae-viewer-main/src/templates *.tpl
File renamed without changes.
81 changes: 71 additions & 10 deletions abcfold/abcfold.py
@@ -3,28 +3,48 @@
 import sys
 import tempfile
 from pathlib import Path
+from typing import Dict

+from abcfold.abc_script_utils import make_dir, setup_logger
 from abcfold.add_mmseqs_msa import add_msa_to_json
-from abcfold.af3_script_utils import make_dir, setup_logger
 from abcfold.argparse_utils import (alphafold_argparse_util,
 boltz_argparse_util, chai_argparse_util,
 custom_template_argpase_util,
 main_argpase_util, mmseqs2_argparse_util,
 prediction_argparse_util)
+from abcfold.plots.pae_plot import create_pae_plots
+from abcfold.plots.plddt_plot import plot_plddt
 from abcfold.processoutput.alphafold3 import AlphafoldOutput
 from abcfold.processoutput.boltz import BoltzOutput
 from abcfold.processoutput.chai import ChaiOutput
 from abcfold.run_alphafold3 import run_alphafold3

 logger = setup_logger()

+PLOTS_DIR = ".plots"
+

 def run(args, config, defaults, config_file):
-"""Run AlphaFold3"""
+"""Run ABCFold
+
+Args:
+args (argparse.Namespace): Arguments from the command line
+config (configparser.SafeConfigParser): Config parser object
+defaults (dict): Default values from the config file
+config_file (Path): Path to the config file
+
+
+Raises:
+SystemExit: If the database directory or model parameters directory is not found
+
+
+"""
+outputs = []

 args.output_dir = Path(args.output_dir)

 make_dir(args.output_dir, overwrite=args.override)
+make_dir(args.output_dir.joinpath(PLOTS_DIR))

 updated_config = False
 if args.model_params != defaults["model_params"]:
@@ -45,9 +65,9 @@ def run(args, config, defaults, config_file):
 sys.exit(1)

 with open(args.input_json, "r") as f:
-af3_json = json.load(f)
+input_params = json.load(f)

-name = af3_json.get("name")
+name = input_params.get("name")
 if name is None:
 logger.error("Input JSON must contain a 'name' field")
 sys.exit(1)
@@ -72,14 +92,14 @@ def run(args, config, defaults, config_file):
 else:
 run_json = Path(args.output_json)

-af3_json = add_msa_to_json(
+input_params = add_msa_to_json(
 input_json=input_json,
 templates=args.templates,
 num_templates=args.num_templates,
 custom_template=args.custom_template,
 custom_template_chain=args.custom_template_chain,
 target_id=args.target_id,
-af3_json=af3_json,
+input_params=input_params,
 output_json=run_json,
 to_file=True,
 )
@@ -102,11 +122,13 @@ def run(args, config, defaults, config_file):
 model_params=args.model_params,
 database_dir=args.database_dir,
 number_of_models=args.number_of_models,
+num_recycles=args.num_recycles,
 )

 # Need to find the name of the af3_dir
 af3_out_dir = list(args.output_dir.iterdir())[0]
-_ = AlphafoldOutput(af3_out_dir, name)
+ao = AlphafoldOutput(af3_out_dir, input_params, name)
+outputs.append(ao)

 if args.boltz1:
 from abcfold.run_boltz import run_boltz
@@ -115,9 +137,13 @@ def run(args, config, defaults, config_file):
 input_json=run_json,
 output_dir=args.output_dir,
 save_input=args.save_input,
+number_of_models=args.number_of_models,
+num_recycles=args.num_recycles,
 )
 bolt_out_dir = list(args.output_dir.glob("boltz_results*"))[0]
-_ = BoltzOutput(bolt_out_dir, name)
+bo = BoltzOutput(bolt_out_dir, input_params, name)
+bo.add_plddt_to_cif()
+outputs.append(bo)

 if args.chai1:
 from abcfold.run_chai1 import run_chai
@@ -128,17 +154,52 @@ def run(args, config, defaults, config_file):
output_dir=chai_output_dir,
save_input=args.save_input,
number_of_models=args.number_of_models,
num_recycles=args.num_recycles,
)

_ = ChaiOutput(chai_output_dir, name)
co = ChaiOutput(chai_output_dir, input_params, name)
outputs.append(co)

plots(outputs, args.output_dir.joinpath(PLOTS_DIR))


def plots(outputs: list, output_dir: Path):
"""
Generate plots for the output of the different programs

Args:
outputs (list): List of output objects

"""
pathway_plots = create_pae_plots(outputs, output_dir=output_dir)
plddt_plot_input: Dict[str, list] = {}
for output in outputs:
if isinstance(output, AlphafoldOutput):
for seed in output.seeds:
if "Alphafold3" not in plddt_plot_input:
plddt_plot_input["Alphafold3"] = []
plddt_plot_input["Alphafold3"].extend(output.cif_files[seed])
elif isinstance(output, BoltzOutput):

plddt_plot_input["Boltz-1"] = output.cif_files
elif isinstance(output, ChaiOutput):
plddt_plot_input["Chai-1"] = output.cif_files

plot_plddt(plddt_plot_input, output_name=output_dir.joinpath("plddt_plot.html"))

pathway_plots["plddt"] = str(output_dir.joinpath("plddt_plot.html").resolve())

return pathway_plots


def main():
"""
Run AlphaFold3 / Boltz1 / Chai-1
"""
import argparse

parser = argparse.ArgumentParser(description="Run AlphaFold3 / Boltz1 / Chai-1")

# Load defaults from config file
defaults = {}
config_file = Path(__file__).parent.joinpath("data", "config.ini")
config = configparser.SafeConfigParser()
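The new `plots` function groups CIF files by program name before handing them to `plot_plddt`. A minimal sketch of that grouping step, assuming AlphaFold3 outputs keep one list of CIF files per seed while the other programs keep a flat list, as the diff suggests. The `FakeAlphafoldOutput` and `FakeBoltzOutput` classes and the `collect_plddt_input` helper are hypothetical stand-ins, not part of ABCFold.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeAlphafoldOutput:
    """Hypothetical stand-in: CIF files are grouped per seed."""
    seeds: List[str]
    cif_files: Dict[str, List[str]]


@dataclass
class FakeBoltzOutput:
    """Hypothetical stand-in: CIF files are a flat list."""
    cif_files: List[str] = field(default_factory=list)


def collect_plddt_input(outputs: list) -> Dict[str, List[str]]:
    """Map each program name to the flat list of its CIF files."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for output in outputs:
        if isinstance(output, FakeAlphafoldOutput):
            # Flatten the per-seed lists into one list for the program.
            for seed in output.seeds:
                grouped["Alphafold3"].extend(output.cif_files[seed])
        elif isinstance(output, FakeBoltzOutput):
            grouped["Boltz-1"].extend(output.cif_files)
    return dict(grouped)


example = collect_plddt_input(
    [
        FakeAlphafoldOutput(
            seeds=["seed-1", "seed-2"],
            cif_files={"seed-1": ["a.cif"], "seed-2": ["b.cif"]},
        ),
        FakeBoltzOutput(cif_files=["c.cif"]),
    ]
)
print(example)
```

Using `defaultdict(list)` removes the need for the `if "Alphafold3" not in plddt_plot_input` check inside the seed loop; whether that is worth changing in the PR is a style call.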
2 changes: 1 addition & 1 deletion abcfold/add_custom_template.py
@@ -4,7 +4,7 @@
 import logging
 import os

-from abcfold.af3_script_utils import get_custom_template
+from abcfold.abc_script_utils import get_custom_template
 from abcfold.argparse_utils import custom_template_argpase_util

 logger = logging.getLogger("logger")
24 changes: 15 additions & 9 deletions abcfold/add_mmseqs_msa.py
@@ -13,7 +13,7 @@
 import requests # type: ignore
 from tqdm.autonotebook import tqdm

-from abcfold.af3_script_utils import (align_and_map,
+from abcfold.abc_script_utils import (align_and_map,
 extract_sequence_from_mmcif,
 get_custom_template, get_mmcif)
 from abcfold.argparse_utils import (custom_template_argpase_util,
@@ -42,15 +42,15 @@ def add_msa_to_json(
 custom_template,
 custom_template_chain,
 target_id,
-af3_json=None,
+input_params=None,
 output_json=None,
 to_file=True,
 ):
-if af3_json is None:
+if input_params is None:
 with open(input_json, "r") as f:
-af3_json = json.load(f)
+input_params = json.load(f)

-for sequence in af3_json["sequences"]:
+for sequence in input_params["sequences"]:
 if "protein" in sequence:
 input_sequence = sequence["protein"]["sequence"]
 with tempfile.TemporaryDirectory() as tmpdir:
@@ -77,7 +77,13 @@ def add_msa_to_json(
 # Can only add templates to protein sequences, so check if there
 # are multiple protein sequences in the input json
 if (
-len([x for x in af3_json["sequences"] if "protein" in x.keys()])
+len(
+[
+x
+for x in input_params["sequences"]
+if "protein" in x.keys()
+]
+)
 > 1
 and not target_id
 ):
@@ -99,13 +105,13 @@ def add_msa_to_json(
 if to_file:
 if output_json:
 with open(output_json, "w") as f:
-json.dump(af3_json, f)
+json.dump(input_params, f)
 else:
 output_json = input_json.replace(".json", "_mmseqs.json")
 with open(output_json, "w") as f:
-json.dump(af3_json, f)
+json.dump(input_params, f)

-return af3_json
+return input_params


 # Code from https://github.com/sokrypton/ColabFold
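The reformatted check in `add_msa_to_json` asks whether the input holds more than one protein sequence while no `target_id` was given. A minimal sketch of the same condition as a standalone predicate, assuming the input dictionary has the `sequences` list shown in the diff. The helper name `needs_target_id` is hypothetical and does not exist in ABCFold.

```python
from typing import Optional


def needs_target_id(input_params: dict, target_id: Optional[str] = None) -> bool:
    """Return True when a template target is ambiguous.

    That is the case when there is more than one protein entry
    and the caller did not say which one the template belongs to.
    """
    # Count entries carrying a "protein" key without building a list.
    protein_count = sum(1 for seq in input_params["sequences"] if "protein" in seq)
    return protein_count > 1 and not target_id


two_proteins = {
    "sequences": [
        {"protein": {"id": "A", "sequence": "MKT"}},
        {"protein": {"id": "B", "sequence": "GSH"}},
        {"ligand": {"id": "L"}},
    ]
}
print(needs_target_id(two_proteins))
print(needs_target_id(two_proteins, target_id="A"))
```

The generator with `sum` avoids the multi-line list comprehension the formatter produced, and `"protein" in seq` is equivalent to `"protein" in seq.keys()`.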
20 changes: 8 additions & 12 deletions abcfold/argparse_utils.py
@@ -43,6 +43,12 @@ def prediction_argparse_util(parser):
 default=5,
 help="Number of models to generate",
 )
+parser.add_argument(
+"--num_recycles",
+type=int,
+default=10,
+help="Number of recycles to use during the inference",
+)
 return parser


@@ -60,6 +66,7 @@ def boltz_argparse_util(parser):
 help="Save the input json file",
 default=False,
 )
+
 return parser


@@ -70,16 +77,6 @@ def chai_argparse_util(parser):
 action="store_true",
 help="Run Chai-1",
 )
-# check if save input is in the parser
-if "--save_input" not in parser._option_string_actions:
-
-parser.add_argument(
-"--save_input",
-action="store_true",
-help="Save the input json file",
-default=False,
-)
-# add more arguments here
 return parser


@@ -104,14 +101,13 @@ def alphafold_argparse_util(parser):
 "-a",
 "--alphafold3",
 action="store_true",
-help="Run Alphafold",
+help="Run Alphafold3",
 )

 parser.add_argument(
 "--override",
 help="Override the existing output directory, if it exists",
 action="store_true",
 )
-# add more arguments here

 return parser
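The first hunk of this file adds `--num_recycles` to `prediction_argparse_util` with an integer default of 10. A minimal sketch of how that option behaves once parsed. The `build_parser` function is a hypothetical stand-in, and the `--number_of_models` flag name is assumed from its use as `args.number_of_models` in abcfold.py, since the hunk shows only its default and help text.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that mirrors the two prediction options."""
    parser = argparse.ArgumentParser(description="Prediction options sketch")
    parser.add_argument(
        "--number_of_models",  # flag name assumed, see lead-in
        type=int,
        default=5,
        help="Number of models to generate",
    )
    parser.add_argument(
        "--num_recycles",
        type=int,
        default=10,
        help="Number of recycles to use during the inference",
    )
    return parser


default_args = build_parser().parse_args([])
custom_args = build_parser().parse_args(["--num_recycles", "3"])
print(default_args.num_recycles, custom_args.num_recycles)
```

Because the option lives in the shared prediction parser, the same value can be forwarded to all three run functions, which is what the abcfold.py changes do with `num_recycles=args.num_recycles`.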
6 changes: 3 additions & 3 deletions abcfold/chai1/af3_to_chai.py
@@ -215,14 +215,14 @@ def ccd_to_smiles(self, ccd_id: str):
 def add_ligand(self, seq: dict):
 lig_id = seq["ligand"]["id"]
 ligand_str = ""
-if "ccdCode" in seq["ligand"]:
+if "ccdCodes" in seq["ligand"]:
 if isinstance(lig_id, list):
 for i in seq["ligand"]["id"]:
-smile = self.ccd_to_smiles(seq["ligand"]["ccdCode"][0])
+smile = self.ccd_to_smiles(seq["ligand"]["ccdCodes"][0])
 if smile:
 ligand_str += f">ligand|{i}\n{smile}\n"
 else:
-smile = self.ccd_to_smiles(seq["ligand"]["ccdCode"])
+smile = self.ccd_to_smiles(seq["ligand"]["ccdCodes"])
 if smile:
 ligand_str = f">ligand|{lig_id}\n{smile}\n"
 if "smiles" in seq["ligand"]:
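In the renamed code the list branch passes `ccdCodes[0]` to `ccd_to_smiles`, while the single-id branch passes the whole `ccdCodes` value. If `ccdCodes` is always a list of strings, which is an assumption here, the second call receives a list where `ccd_to_smiles` is annotated to take a `str`. A minimal sketch of one way to treat both branches uniformly. The helper `ligand_fasta_entries` and the tiny lookup table are hypothetical and only illustrate the normalisation.

```python
from typing import Callable, List, Optional


def ligand_fasta_entries(
    ligand: dict, ccd_to_smiles: Callable[[str], Optional[str]]
) -> str:
    """Build '>ligand|<id>' FASTA-style entries for one ligand record."""
    # Normalise the id to a list so one code path handles both cases.
    raw_id = ligand["id"]
    ids: List[str] = raw_id if isinstance(raw_id, list) else [raw_id]

    # Normalise the codes too, in case a bare string is supplied.
    codes = ligand.get("ccdCodes", [])
    if isinstance(codes, str):
        codes = [codes]
    if not codes:
        return ""

    # Only the first code is used, matching the list branch in the diff.
    smile = ccd_to_smiles(codes[0])
    if not smile:
        return ""
    return "".join(f">ligand|{lig_id}\n{smile}\n" for lig_id in ids)


# Hypothetical lookup: water (CCD code HOH) has the SMILES string "O".
lookup = {"HOH": "O"}
entries = ligand_fasta_entries({"id": ["L1", "L2"], "ccdCodes": ["HOH"]}, lookup.get)
single = ligand_fasta_entries({"id": "L1", "ccdCodes": ["HOH"]}, lookup.get)
print(entries)
```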
2 changes: 1 addition & 1 deletion abcfold/chai1/chai.py
@@ -54,7 +54,7 @@ def run_inference_wrapper(
 seed: int | None = None,
 device: str | None = None,
 ):
-# print(num_diffn_samples)
+
 result = run_inference(
 fasta_file=fasta_file,
 output_dir=output_dir,
21 changes: 21 additions & 0 deletions abcfold/plots/pae-viewer-main/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2023 Christoph Elfmann
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.