Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issues #296, #297, and #298 #299

Merged
merged 10 commits into from
Feb 18, 2025
Merged

Fix issues #296, #297, and #298 #299

merged 10 commits into from
Feb 18, 2025

Conversation

lohedges
Copy link
Contributor

This PR closes #296 by setting and SDF tags as properties of the RDKit molecule during conversion. (The SDF metada is stored internally in Sire as a sdf_data property on molecule.) On conversion back to Sire, all non-internal properties of a RDKit molecule are now stored as an rdkit_data property. We can't write back to sdf_data, since the RDKit molecule doesn't tag where the properties come from, e.g. were they set when loading from SDF, or otherwise, so there is no clean separation of what data came from where. A user could allow these tags to be used when writing to SDF by using the property map, e.g. by remapping sdf_data to rdkit_data. In any case, the main use case here is the ability to convert an SDF molecule to RDKit, then later write to SDF using RDKit and retain all of the tags in the original file, which this does.

The PR also closes #297 by setting the MolName of any molecule loaded from SDF to the name in that file. Currently it takes a default molecule name of MOL. There is also a name molecular property that is set, but CI failures made me realise that this can also be used for other information, e..g. the neopentane-methane merged system appeared to have some kind of atom selection as the name property.

In [1]: import sire as sr

In [2]: mols = sr.load_test_files("neo_meth_scratch.bss")


In [3]: mols[0].property("name0")
Out[3]:
SireMol::AtomStringProperty( size=17
0: C1
1: C2
2: C3
3: C4
4: C5
...
12: H13
13: H14
14: H15
15: H16
16: H17
)

The PR also closes #298 by adding special handling for velocity records when the number of atoms in an AMBER RST7 file is 2 or less. If so, then the velocity line could be blank if no data is present. (There is a check to see how many lines should be exist if velocities are present, which this passes, but the line contains no data.) I'm not sure what parser is adding a blank line at the end, but it is part of the Exs pipeline.

  • I confirm that I have merged the latest version of devel into this branch before issuing this pull request (e.g. by running git pull origin devel): [y]
  • I confirm that I have added a changelog entry to the changelog (we will add a link to this PR as part of the review): [y]
  • I confirm that I have permission to release this code under the GPL3 license: [y]

@lohedges lohedges added bug Something isn't working exscientia Related to work with Exscientia labels Feb 17, 2025
@lohedges lohedges requested a review from chryswoods February 17, 2025 16:05
Copy link
Contributor

@chryswoods chryswoods left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look good :-)

@lohedges lohedges merged commit 9d43587 into devel Feb 18, 2025
8 of 10 checks passed
@lohedges lohedges deleted the fix_296-298 branch February 18, 2025 06:13
lohedges added a commit that referenced this pull request Feb 18, 2025
lohedges added a commit that referenced this pull request Feb 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working exscientia Related to work with Exscientia
Projects
None yet
2 participants