Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Addressing issues 75 & 81 #82

Merged
merged 15 commits into from
Dec 6, 2022
Merged

Addressing issues 75 & 81 #82

merged 15 commits into from
Dec 6, 2022

Conversation

bobym
Copy link
Collaborator

@bobym bobym commented Oct 18, 2022

  1. Updating MCL1 protein.pdb to have caps.

  2. Replacing all ligands.sdf with new versions to fix the ligand distortions seen in issue Abnormal ligand conformations #81

  3. Adding re-curated thrombin ligands (expanded ligand set)

1. Updating MCL1 protein.pdb to have caps.

2. Replacing all ligands.sdf with new versions to fix the ligand distortions seen in issue #81

3. Adding re-curated thrombin ligands (expanded ligand set)
@bobym bobym requested review from dotsdl and IAlibay October 18, 2022 21:13
@bobym bobym self-assigned this Oct 18, 2022
@bobym bobym linked an issue Oct 18, 2022 that may be closed by this pull request
@IAlibay
Copy link
Collaborator

IAlibay commented Oct 24, 2022

I visually inspected all the ligands, can confirm there are no further abnormal conformations.

@IAlibay
Copy link
Collaborator

IAlibay commented Oct 24, 2022

@bobym, the #75 fixes don't seem to be here?

Replace MCL1 protein.pdb for issue #75
@bobym
Copy link
Collaborator Author

bobym commented Oct 24, 2022

@IAlibay sorry this got overwritten when I merged my two PRs. Should be there now!

@ijpulidos
Copy link
Collaborator

I'm not sure this needs to be addressed, but I see some .reflig.maegz files that are not plain text files, and if these keep changing overtime they could end up consuming a lot of space in the repo. I wonder if we want these at all and if so, maybe we would need to add them as artifacts. What would be the best approach for this?

@IAlibay
Copy link
Collaborator

IAlibay commented Nov 22, 2022

Wasn't too sure on the convention for smiles, so I just used a plain rdkit:

sdf_supp = Chem.SDMolSupplier('02_ligands/ligands.sdf', removeHs=False)
sdfs = [m for m in sdf_supp]

for sdf in sdfs:
    smiles = Chem.MolToSmiles(sdf)
    name = sdf.GetProp('_Name')
    print(f'{name}: {smiles}')

type of approach. I'm not sure if we want the smiles to be isomeric (we seem to have different isomers for some entries I think?) or explicit bonds/Hs, etc... the whole thing seems rather undefined.

@IAlibay
Copy link
Collaborator

IAlibay commented Nov 22, 2022

I'm not sure this needs to be addressed, but I see some .reflig.maegz files that are not plain text files, and if these keep changing overtime they could end up consuming a lot of space in the repo. I wonder if we want these at all and if so, maybe we would need to add them as artifacts. What would be the best approach for this?

As per today's call - maegz files need to be removed from the git history and added in as a artifacts.

@ijpulidos
Copy link
Collaborator

While trying to run the structures using perses I faced the following issue for the pfkfb3 target

ValueError: No template found for residue 449 (POP).  This might mean your input topology is missing some atoms or bonds, or possibly that you are using the wrong force field.

Which refers to the POP residue in

HETATM 7261 P1 POP A 602 92.394 47.635 260.369 1.00 25.65 P
HETATM 7262 O1 POP A 602 91.918 47.441 258.922 1.00 24.48 O
HETATM 7263 O2 POP A 602 91.938 48.980 260.962 1.00 23.09 O1-
HETATM 7264 O3 POP A 602 92.086 46.427 261.263 1.00 22.53 O1-
HETATM 7265 O POP A 602 94.120 47.663 260.320 1.00 21.12 O
HETATM 7266 P2 POP A 602 95.359 48.505 259.472 1.00 22.73 P
HETATM 7267 O4 POP A 602 95.942 47.517 258.447 1.00 23.19 O
HETATM 7268 O5 POP A 602 96.375 48.888 260.559 1.00 19.27 O1-
HETATM 7269 O6 POP A 602 94.699 49.715 258.803 1.00 22.72 O1-

@IAlibay IAlibay mentioned this pull request Nov 28, 2022
@IAlibay
Copy link
Collaborator

IAlibay commented Nov 28, 2022

@ijpulidos I got confused between PRs 🤣 - yeah we should just fix this here whilst we're at it.

@IAlibay
Copy link
Collaborator

IAlibay commented Nov 28, 2022

@ijpulidos I got confused between PRs 🤣 - yeah we should just fix this here whilst we're at it.

I manually extracted the POP from the PDB and placed it in the cofactors.sdf in 2784fc7

@dotsdl dotsdl assigned IAlibay and unassigned bobym Nov 29, 2022
@codecov
Copy link

codecov bot commented Nov 29, 2022

Codecov Report

Merging #82 (c416bf2) into main (7da6cae) will decrease coverage by 0.08%.
The diff coverage is n/a.

Additional details and impacted files

@@ -76,7 +76,8 @@ def test_target_class():
df1 = tgt.get_ligand_set_dataframe(columns=columns)
df2 = ligand_set.get_dataframe(columns=columns)
pd.testing.assert_frame_equal(df1, df2)
assert tgt.get_ligand_set_html() == ligand_set.get_html()
# Temporarily disable in #82 - RDKit mol hash to different values
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, the way this used to work is that it somehow seemed to assume that rdkit molecules hashed to the same thing. Since this is all likely going away with #78 I'm going to say we can just temporarily disable it for now (it's not related to this PR)?

Copy link
Collaborator

@ijpulidos ijpulidos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the delay on this review. This looks good to me! The structures are getting run as far as I can test with perses. We need to re-check with the changes in #83 once edges are regenerated.

@IAlibay
Copy link
Collaborator

IAlibay commented Dec 6, 2022

Just trying to fix CI and then I'll squash merge.

@IAlibay
Copy link
Collaborator

IAlibay commented Dec 6, 2022

Excessive RTD memory consumption is a new one to me :/ Will deal with this elsewhere, probably need to switch to mamba.

@IAlibay
Copy link
Collaborator

IAlibay commented Dec 6, 2022

Squash merging to avoid large commit diffs.

@IAlibay IAlibay merged commit e9b153e into main Dec 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

Abnormal ligand conformations
3 participants