atom_site_label in CIF file are not unique #3761
Comments
One of the reasons labels should be unique: they are used to indicate bonding (and angles, torsions, etc.). If they are not unique, this information is useless. (And I am working on a PR to add that capability to pymatgen.)
I can confirm that this is likely a bug (in terms of having agreement with the CIF spec).
The bug may not necessarily be in the CifWriter itself. Pymatgen expands the structure to P1 when reading a CIF file (symmetry information gets lost). This includes duplicating the labels.
Does pymatgen require labels to be unique? I do not know. If so, the bug is in the symmetry expansion. But what is sure is that labels in a CIF file should be unique (whether pymatgen accepts duplicate labels or not).
No, pymatgen does not require unique labels. The duplication of labels happens when the symmetry is expanded in the CifParser: Lines 1003 to 1011 in 0e57abf
One way to solve this would be to ensure labels get a letter suffix ('a', 'b', 'c', ...) when the symmetry is expanded there, although this could also be a check on the CifWriter side.
@stefsmeets I just wanted to mention that this might be problematic for very large structures. There are only 26 letters in the alphabet. Apart from this drawback, such a solution would be nice.
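To make the suffix idea and the 26-letter concern concrete, here is a minimal sketch in plain Python. It does not use pymatgen, and the helper names `alpha_suffix` and `uniquify_labels` are hypothetical, not part of any library. It assumes that continuing past 'z' with 'aa', 'ab', ... is acceptable, which is one possible convention and not something the CIF spec prescribes.

```python
from collections import Counter


def alpha_suffix(index: int) -> str:
    """Return 'a', 'b', ..., 'z', 'aa', 'ab', ... for index 0, 1, 2, ..."""
    letters = []
    index += 1  # switch to 1-based counting (bijective base 26)
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters.append(chr(ord("a") + remainder))
    return "".join(reversed(letters))


def uniquify_labels(labels: list[str]) -> list[str]:
    """Suffix only the labels that occur more than once; keep unique ones as-is."""
    totals = Counter(labels)
    # Labels that are already unique stay unchanged and must not be reused.
    used = {label for label, n in totals.items() if n == 1}
    next_index: dict[str, int] = {}
    result = []
    for label in labels:
        if totals[label] == 1:
            result.append(label)
            continue
        i = next_index.get(label, 0)
        candidate = label + alpha_suffix(i)
        # Skip any suffix that would collide with a label already in use.
        while candidate in used:
            i += 1
            candidate = label + alpha_suffix(i)
        next_index[label] = i + 1
        used.add(candidate)
        result.append(candidate)
    return result


print(uniquify_labels(["Si1", "Si1", "O1", "Si1"]))
```

With this scheme a label that is duplicated 30 times would run from 'a' through 'z' and then continue with 'aa' to 'ad', so the alphabet size is no longer a hard limit. Whether multi-letter suffixes are desirable in practice is a separate question.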
I'd be happy to have a go at this.
@stefsmeets sounds great. (cc @janosh)
Some options:
I want to avoid aggressively/automatically relabelling, because I know that this can lead to unexpected results, which can be frustrating. So I'm leaning towards 1 or 2. To help with 1 or 2, I want to add a method: Let me know what you think.
atom_site_label in CIF file are not unique
Python version
Python 3.11.6
Pymatgen version
2023.10.4
Operating system version
macOS 14.4.1
Current behavior
Read a structure with symmetry, and write it to a CIF file. My example is:
Then read it and write it back to CIF file:
The new CIF file will have atoms with duplicate atom_site_label values.
CIF format specifies (https://www.iucr.org/__data/iucr/cifdic_html/1/cif_core.dic/Iatom_site_label.html):
So this shouldn't happen. I believe this incorrect behavior was introduced by #3183 (and follow-ups: #3423 and #3527).
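As a rough way to check for the problem described above, the sketch below reports which labels occur more than once. It is plain Python and assumes the atom_site_label values have already been collected into a list; the function name `find_duplicate_labels` and the sample labels are hypothetical, not taken from pymatgen or from the structure in this report.

```python
from collections import Counter


def find_duplicate_labels(labels: list[str]) -> dict[str, int]:
    """Map each label that occurs more than once to its number of occurrences."""
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if n > 1}


# Made-up labels, as they might look after symmetry expansion to P1.
labels = ["Si1", "Si1", "O1", "O2", "O2", "O2"]
duplicates = find_duplicate_labels(labels)
if duplicates:
    print(f"Non-unique labels: {duplicates}")
```

An empty result means every label is unique, which is what a CIF file is expected to satisfy.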
Expected Behavior
All sites in the CIF file should be given a unique atom_site_label. This should be enforced by the CifWriter class.
Minimal example
No response
Relevant files to reproduce this bug
No response