Addressing issues 75 & 81 #82
1. Updating MCL1 protein.pdb to have caps.
2. Replacing all ligands.sdf with new versions to fix the ligand distortions seen in issue #81 (Abnormal ligand conformations).
3. Adding re-curated thrombin ligands (expanded ligand set).
I visually inspected all the ligands and can confirm there are no further abnormal conformations.
Replace MCL1 protein.pdb for issue #75
@IAlibay sorry, this got overwritten when I merged my two PRs. Should be there now!
I'm not sure this needs to be addressed, but I see some |
I wasn't too sure about the convention for SMILES, so I just used a plain RDKit:
type of approach. I'm not sure whether we want the SMILES to be isomeric (I think we have different isomers for some entries?) or to carry explicit bonds/Hs, etc. The whole thing seems rather undefined.
As per today's call: the maegz files need to be removed from the git history and added back as artifacts.
While trying to run the structures using perses I faced the following issue for the
Which refers to protein-ligand-benchmark/data/pfkfb3/01_protein/crd/protein.pdb, lines 7765 to 7773 in d6ad2bf.
@ijpulidos I got confused between PRs 🤣 - yeah, we should just fix this here whilst we're at it.
I manually extracted the POP from the PDB and placed it in cofactors.sdf in 2784fc7.
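For anyone repeating that kind of extraction, here is a minimal sketch of pulling one residue's coordinate records out of PDB text by its residue-name field. It is not the procedure used in 2784fc7: the helper names (`extract_residue_lines`, `make_record`) and the sample records are made up for illustration, and the only assumption about the file is the standard fixed-column PDB layout (residue name in columns 18-20). Converting the selected records to SDF would still need a cheminformatics toolkit and is not shown.

```python
def extract_residue_lines(pdb_text: str, resname: str) -> list[str]:
    """Return ATOM/HETATM records whose residue-name field (columns 18-20) matches."""
    wanted = resname.strip().upper()
    selected = []
    for line in pdb_text.splitlines():
        is_coordinate_record = line.startswith(("ATOM  ", "HETATM"))
        if is_coordinate_record and line[17:20].strip() == wanted:
            selected.append(line)
    return selected


def make_record(record: str, serial: int, atom: str, resname: str, resseq: int) -> str:
    """Build a fixed-column coordinate record with placeholder coordinates (illustration only)."""
    return (
        f"{record:<6}{serial:>5} {atom:<4} {resname:>3} A{resseq:>4}    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}"
    )


# Made-up sample: one protein atom, two atoms of a residue named POP, one water.
SAMPLE = "\n".join(
    [
        make_record("ATOM", 1, "N", "MET", 1),
        make_record("HETATM", 2, "P1", "POP", 601),
        make_record("HETATM", 3, "O1", "POP", 601),
        make_record("HETATM", 4, "O", "HOH", 701),
    ]
)

pop_lines = extract_residue_lines(SAMPLE, "POP")
print(len(pop_lines))  # 2
```

Slicing by column rather than splitting on whitespace matters here, because PDB fields can run together without separating spaces.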
@@ -76,7 +76,8 @@ def test_target_class():
    df1 = tgt.get_ligand_set_dataframe(columns=columns)
    df2 = ligand_set.get_dataframe(columns=columns)
    pd.testing.assert_frame_equal(df1, df2)
    assert tgt.get_ligand_set_html() == ligand_set.get_html()
    # Temporarily disable in #82 - RDKit mol hash to different values
As far as I can tell, the way this used to work is that it somehow seemed to assume that rdkit molecules hashed to the same thing. Since this is all likely going away with #78 I'm going to say we can just temporarily disable it for now (it's not related to this PR)?
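To illustrate the general failure mode (not the actual `get_html` implementation, which isn't shown in this thread), here is a small sketch with a hypothetical stand-in class. In Python, a class that defines neither `__eq__` nor `__hash__` compares and hashes by object identity, so two separately built objects with the same content are unequal, and anything derived from their hash differs too. `FakeMol`, `render_with_hash`, and `render_with_smiles` are made-up names; whether RDKit molecules behave exactly this way here is an assumption.

```python
class FakeMol:
    """Hypothetical stand-in for a molecule object; defines no __eq__ or __hash__."""

    def __init__(self, smiles: str) -> None:
        self.smiles = smiles


def render_with_hash(mol: FakeMol) -> str:
    """Naive renderer that leaks the identity-based hash into its output."""
    return f"<td data-id='{hash(mol)}'>{mol.smiles}</td>"


def render_with_smiles(mol: FakeMol) -> str:
    """Renderer that depends only on a stable string representation."""
    return f"<td>{mol.smiles}</td>"


# Two objects built independently from the same input, as two code paths might do.
a = FakeMol("CCO")
b = FakeMol("CCO")

print(a == b)                                         # False: identity comparison
print(render_with_hash(a) == render_with_hash(b))     # False: hashes differ
print(render_with_smiles(a) == render_with_smiles(b)) # True: content comparison
```

If the assertion is ever re-enabled, comparing on a stable representation (for example a SMILES string) rather than on the objects themselves would sidestep the problem.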
Apologies for the delay on this review. This looks good to me! As far as I can test with perses, the structures run. We need to re-check with the changes in #83 once edges are regenerated.
Just trying to fix CI and then I'll squash merge.
Excessive RTD memory consumption is a new one to me :/ Will deal with this elsewhere, probably need to switch to mamba.
Squash merging to avoid large commit diffs.