Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix JSON serialization of ValueEnums #944

Merged
merged 3 commits into from
Feb 21, 2024

Conversation

Andrew-S-Rosen
Copy link
Member

@Andrew-S-Rosen Andrew-S-Rosen commented Feb 21, 2024

Closes #914.

from monty.serialization import dumpfn
from emmet.core.symmetry import CrystalSystem

sym = CrystalSystem("Triclinic")

dumpfn(sym, "test.json")

This now works.

@Andrew-S-Rosen Andrew-S-Rosen changed the title Remove ValueEnum class Fix (de)serialiation of ValueEnums Feb 21, 2024
@Andrew-S-Rosen Andrew-S-Rosen changed the title Fix (de)serialiation of ValueEnums Fix JSON serialiation of ValueEnums Feb 21, 2024
@Andrew-S-Rosen
Copy link
Member Author

Tagging @munrojm. Fixed it, I think! If you wouldn't mind minting a new version after this is merged, that would be a huge help.

Thank you @jmmshn for your suggestion. It worked like a charm! Not sure what I was doing wrong before.

@munrojm munrojm merged commit 7735bb1 into materialsproject:main Feb 21, 2024
10 checks passed
@jmmshn
Copy link
Contributor

jmmshn commented Feb 21, 2024

I will never end the crusade against custom as_dict / from_dict

@esoteric-ephemera
Copy link
Collaborator

Hey I think this broke atomate2, see this atomate2 issue which I've reproduced with emmet>=0.78.0, and couldn't reproduce with emmet<=0.77.1

@Andrew-S-Rosen
Copy link
Member Author

It's probably better to solve the root cause of the problem in atomate2 (or monty, if relevant) rather than patch back in a custom .as_dict() method that also doesn't serialize to a dictionary but rather a string.

@utf
Copy link
Member

utf commented Mar 27, 2024

Maybe I'm missing something, but this PR changed the behaviour of these objects. That doesn't seem like an atomate2 issue as we were using these objects as intended.

The docstring specifies that ValueEnum should serialise to a string.

@esoteric-ephemera
Copy link
Collaborator

esoteric-ephemera commented Mar 27, 2024

There's a jsanitize call in jobflow with strict=True passed that would require VaspObject to have an as_dict method, which was removed here. This appears as a bug in atomate2, so I agree with @Andrew-S-Rosen and @utf that the placement of a fix should change. I'll try to work out where that is best suited

@utf
Copy link
Member

utf commented Mar 27, 2024

The point is that we want to serialise these objects as strings. That's why we were using ValueEnum.

I'd ask if you want to have enums that serialise as dicts then please just use regular Enums (now supported in monty).

@Andrew-S-Rosen
Copy link
Member Author

Andrew-S-Rosen commented Mar 28, 2024

I guess there are a few things to unpack here.

In emmet, and therefore in atomate2 and other packages, there are a few objects inheriting from ValueEnums. The example I mentioned in #944 is one example. When a dictionary or other object containing a CrystalSystem is dumped with dumpfn, an error is raised. That's because the custom .as_dict() method is returning a str. @jmmshn suggested removing the method, which was what was done in this PR.

The ValueEnum in emmet is, admittedly, advertised as a way to serialize the object to str. So, I can see the argument there for retaining the .as_dict() method to be a str return. However, the method is called .as_dict(). In my opinion, we should not have a method like .as_dict() returning something that is not a dictionary. This has downstream negative effects, as mentioned in the example above, and is a hack.

I suppose the question to ask is there a better way to achieve the desired str serialization without relying on such a hacky/fake .as_dict() method? I would argue the answer is yes --- inherit from Enum and take advantage of monty's enum_values=True flag.

Now that the situation is a bit clearer to me and we also see some of the downstream effects, perhaps the better approach would simply be to re-add the .as_dict() method but deprecate the entire ValueEnum class, although maybe that is considered to be even more drastic.

@utf
Copy link
Member

utf commented Mar 28, 2024

I think this PR and issue #914 are misunderstanding what ValueEnum is for. There many objects in Python that cannot be serialised as json by themselves. For example, str, int, float, list etc. Instead, they have to be serialised as part of a dict. The fact that this does not work

from monty.serialization import dumpfn
from emmet.core.symmetry import CrystalSystem

sym = CrystalSystem("Triclinic")
dumpfn(sym, "test.json")

is not an issue in itself. E.g., the following code does not mean that string serialisation is broken:

dumpfn("Triclinic", "test.json")  # will fail

The goal of ValueEnum is to define a set of string literals in the form of an enum. When we serialise these in a task document, we want to only store the enum value. E.g.

{
    "crystal_system": "Triclinic"
}

rather than

{
    "crystal_system": {
        "@module": "emmet.core.utils",
        "@class": "ValueEnum",
        "value": "Triclinic",
    }
}

The latter makes it convoluted to query the task document easily.

Perhaps there is a better way to achieve this and I agree that the as_dict labelling could be confusing. However, the point is this code has been in production for several years and works absolutely fine. The suggestion of enabling enum_values=True is not a good fix in this case, since this will mean it isn't possible to store ANY enums in their dict form. ValueEnum combined with enum_values=False provides a mechanism to have two types of Enum serialisation behaviour.

To summarise, the issue raised in #914 is not a bug it is the intended behaviour as per the docstring and existing usage. This change has broken atomate2 and other downstream packages. Please can we revert it.

@janosh janosh changed the title Fix JSON serialiation of ValueEnums Fix JSON serialization of ValueEnums May 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug: The .as_dict() method on CrystalSystem is not Monty compatible
5 participants