Structure features 'assemblies' and 'disorder' depend on representation, not actual structure features #342

Open
merkys opened this issue Nov 27, 2020 · 3 comments
Labels
topic/property-standardization The specification of the precise data representation of properties and entries

Comments

@merkys
Member

merkys commented Nov 27, 2020

From the specification:

  • assemblies: this flag MUST be present if the property assemblies is present.
  • disorder: this flag MUST be present if any one entry in the species list has a chemical_symbols list that is longer than 1 element.

However, neither assemblies nor disorder depends directly on the features of the structure itself; both depend on how the provider represents it. Consider these two alternative descriptions of the same structure (taken from the specification):

    {
      "cartesian_site_positions": [[0,0,0]],
      "species_at_sites": ["SiGe-vac"],
      "species": [
        {
          "name": "SiGe-vac",
          "chemical_symbols": ["Si", "Ge", "vacancy"],
          "concentration": [0.3, 0.5, 0.2]
        }
      ]
      // ...
    }

and

    {
      "cartesian_site_positions": [ [0,0,0], [0,0,0], [0,0,0] ],
      "species_at_sites": ["Si", "Ge", "vac"],
      "species": [
        { "name": "Si", "chemical_symbols": ["Si"], "concentration": [1.0] },
        { "name": "Ge", "chemical_symbols": ["Ge"], "concentration": [1.0] },
        { "name": "vac", "chemical_symbols": ["vacancy"], "concentration": [1.0] }
      ],
      "assemblies": [
        {
          "sites_in_groups": [ [0], [1], [2] ],
          "group_probabilities": [0.3, 0.5, 0.2]
        }
      ]
      // ...
    }

Thus the structure in the first example would have structure features [ "disorder" ], whereas the second would have [ "assemblies" ].
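To make the point concrete, here is a minimal Python sketch that applies only the two rules quoted from the specification to both representations. The function name `representation_features` and the variable names are hypothetical, and treating "present" as "key exists and is not null" is an assumption, not something the specification text above settles.

```python
def representation_features(structure):
    """Derive the 'assemblies' and 'disorder' flags from a structure dict.

    Hypothetical helper: it only encodes the two rules quoted above.
    """
    features = []
    # Rule 1: 'assemblies' is flagged when the property `assemblies` is present
    # (assumed here to mean: the key exists and its value is not None).
    if structure.get("assemblies") is not None:
        features.append("assemblies")
    # Rule 2: 'disorder' is flagged when any species lists more than one symbol.
    if any(len(s["chemical_symbols"]) > 1 for s in structure.get("species", [])):
        features.append("disorder")
    return sorted(features)


# First representation: one site occupied by a mixed species.
single_site = {
    "cartesian_site_positions": [[0, 0, 0]],
    "species_at_sites": ["SiGe-vac"],
    "species": [
        {
            "name": "SiGe-vac",
            "chemical_symbols": ["Si", "Ge", "vacancy"],
            "concentration": [0.3, 0.5, 0.2],
        }
    ],
}

# Second representation: three coincident sites tied together by an assembly.
split_sites = {
    "cartesian_site_positions": [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
    "species_at_sites": ["Si", "Ge", "vac"],
    "species": [
        {"name": "Si", "chemical_symbols": ["Si"], "concentration": [1.0]},
        {"name": "Ge", "chemical_symbols": ["Ge"], "concentration": [1.0]},
        {"name": "vac", "chemical_symbols": ["vacancy"], "concentration": [1.0]},
    ],
    "assemblies": [
        {"sites_in_groups": [[0], [1], [2]], "group_probabilities": [0.3, 0.5, 0.2]}
    ],
}

print(representation_features(single_site))  # ['disorder']
print(representation_features(split_sites))  # ['assemblies']
```

The same physical site yields disjoint flag sets, because the rules inspect only the keys and list lengths of the representation.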

Having structure features that denote representation instead of actual structure features seems somewhat counter-intuitive to me. Could anyone confirm this was intentional, or is this a corner case?

merkys added the topic/property-standardization label on Nov 27, 2020
@rartino
Contributor

rartino commented Nov 27, 2020

IMO structure_features was intended more as a content negotiation feature between the client and server than as something you would typically use to determine the "physics" of the material. Nevertheless, the absence of declared features (assemblies, disorder, etc.) does restrict the domain for the material. But, as you note in the example, it doesn't strictly work the other way: declaring a feature is not a commitment that the structure cannot have a simpler representation.

Setting aside the computational difficulty of strictly knowing whether a simpler representation could exist, a typical scenario in which I foresee these flags being used is this:

  • A client fetches structures to do "normal" static DFT calculations. Hence, the user writing that client wants to exclude structures with disorder and assemblies, because those cannot easily be translated into, e.g., a VASP POSCAR file.

  • The client later adds the capability of transforming the OPTIMADE disorder representation into SQS supercells that can be calculated in VASP. Hence, the processing is extended to accept structures with the disorder feature. However, the greater generality of assemblies is not supported, so those structures are still excluded. (Even though, as you note, sometimes they could be translated into the simpler disorder representation.)
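The two stages of that scenario can be sketched as a client-side check in Python. This is only an illustration under stated assumptions: `select_supported`, the entry ids, and the sample entries are hypothetical, and each entry is assumed to carry its flags under `attributes.structure_features`.

```python
def select_supported(entries, supported_features):
    """Keep entries whose declared structure_features the client can handle.

    Hypothetical helper: an entry passes only if every flag it declares
    is in `supported_features`.
    """
    supported = set(supported_features)
    return [
        entry
        for entry in entries
        if set(entry["attributes"]["structure_features"]) <= supported
    ]


# Made-up entries, for illustration only.
entries = [
    {"id": "a", "attributes": {"structure_features": []}},
    {"id": "b", "attributes": {"structure_features": ["disorder"]}},
    {"id": "c", "attributes": {"structure_features": ["assemblies"]}},
    {"id": "d", "attributes": {"structure_features": ["assemblies", "disorder"]}},
]

# Stage 1: plain static DFT client, no extra features supported.
stage1 = select_supported(entries, set())
print([e["id"] for e in stage1])  # ['a']

# Stage 2: the client has learned to expand `disorder` into SQS supercells.
stage2 = select_supported(entries, {"disorder"})
print([e["id"] for e in stage2])  # ['a', 'b']
```

Entry "c" stays excluded in stage 2 even if its assembly could be rewritten as simple disorder, which is exactly the representation dependence discussed in this issue.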

@merkys
Member Author

merkys commented Dec 2, 2020

Thank you for the explanation. So structures having sites with mixtures of chemical elements or vacancies seem to be corner cases. I would prefer some way to dispel the ambiguity, but cannot think of an elegant solution. Surely we could attempt to standardize the representation, but I am in no position to suggest putting one of the representations in front of the other.

In CIF files (the ultimate source of truth for the COD), vacancies are expressed by the occupancy parameter, which fits more naturally into the first representation. Mixture sites are usually split into several sites with the same coordinates, and we at the COD do little to identify such sites, as the number of such entries is low.

@merkys
Member Author

merkys commented Jun 12, 2024

We have revisited the topic in a workshop discussion with @rartino and @blokhin, and it seems that we arrived at a consensus: we are OK with assemblies, disorder and structure_features describing the representation of data, not the underlying structure.
