Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Handling unknown enum variants #759

Open
zeylahellyer opened this issue Apr 14, 2021 · 4 comments · Fixed by #1550 · May be fixed by #2045
Open

[RFC] Handling unknown enum variants #759

zeylahellyer opened this issue Apr 14, 2021 · 4 comments · Fixed by #1550 · May be fixed by #2045
Assignees
Labels
c-model Affects the model crate m-breaking change Breaks the public API. t-refactor Refactors APIs or code.
Milestone

Comments

@zeylahellyer
Copy link
Member

zeylahellyer commented Apr 14, 2021

When a new API feature is released involving a new enum variant the deserialization of any parent struct will fail due to the unknown variant. This can be seen with the release of stage channels which caused the deserialization of any guild with one to fail. This needed a hotfix and quick release as more stage channels were being seen, causing users to experience an increasing number of failures.

There's a few routes we can go to handle this but they all have their drawbacks.

Ignoring failed optional or list element deserializations

When the deserialization of an optional field fails we could simply ignore it from the parent type. Consider this example:

#[derive(serde_repr::Deserialize_repr)]
#[repr(u8)]
enum BType {
    First = 1,
    Second = 2,
    Third = 3,
}

#[derive(serde::Deserialize)]
struct Foo {
    a: u64,
    b: Option<BType>,
}

assert!(serde_json::from_str::<Foo>(r#"{"a": 1, "b": 5}"#).is_err());

Due to a new BType being published in the API the deserialization of Foo fails. This is similar to a new ChannelType enum variant causing the failure of the deserialization of the parent GuildChannel cascading into the failure of the deserialization of Guilds.

We could instead simply ignore the failure of the deserialization of Foo::b and log the error. This could be applied similarly to individual failing elements within a list. The benefit of this is that the user will still get the rest of Foo. The drawback is that this basically silently ignores errors and throws away information, thus not giving the user all of the information from the raw API response.

Unknown variant

Another option is an Unknown variant at the maximum of the u8 spectrum when the deserialization of an enum otherwise fails. This allows users to be able to tell that the variant isn't supported. See this example:

#[derive(Debug, PartialEq)]
#[repr(u8)]
enum Foo {
    Bar = 1,
    Baz = 2,
    Qux = 3,
    Unknown = 255,
}

impl<'de> Deserialize<'de> Foo {
    // pseudocode...
    Ok(match number {
        1 => Foo::Bar,
        2 => Foo::Baz,
        3 => Foo::Qux,
        other => {
            tracing::warn!(%number, "unknown foo type");

            Foo::Unknown
        }
    })
}

assert_eq!(Ok(Foo::Unknown), serde_json::from_str::<Foo>("5"));

As stated above the benefit is that the deserialization of any parents continues while the user can know that the variant is unknown. The downside is that the actual value is lost and this is more of a "hack".

Parent wrapping enum

Lastly, we could wrap the type in an enum as part of the parent type. This enum could have a variant containing the deserialized enum or an "unknown" variant containing the number. This allows the user to still have the type safety of the enum when deserialization was successful while also being able to know and retrieve the raw numeric value when the numeric value is unknown. This is easier to show in code:

#[derive(Debug, PartialEq, serde_repr::Deserialize_repr)]
enum Foo {
    Bar = 1,
    Baz = 2,
    Qux = 3,
}

#[derive(Debug, PartialEq)]
enum TypeWrapper<T> {
    Unknown(u8),
    Value(T),
}

impl<'de> Deserialize<'de> for TypeWrapper {
    // impl...
}

trait ReprAccess {
    fn number(&self) -> u8;
}

impl ReprAccess for Foo {
    fn number(&self) -> u8 {
        *self as u8
    }
}

impl<T> TypeWrapper<T> {
    fn is_unknown(&self) -> bool {
        matches!(self, Self::Unknown(_))
    }

    fn is_value(&self) -> bool {
        matches!(self, Self::Value(_))
    }

    fn value(&self) -> Option<T> {
        match self {
            Self::Unknown(_) => None,
            Self::Value(value) => Some(value),
        }
    }
}

impl<T: ReprAccess> TypeWrapper<T> {
    fn number(&self) -> u8 {
        match self {
            Self::Unknown(number) => number,
            Self::Value(repr) => repr.number(),
        }
    }
}

let success = serde_json::from_str::<TypeWrapper<Foo>>("3")?;
assert_eq!(Some(Foo::Qux), success.value());
assert_eq!(3, success.number());

let failure = serde_json::from_str::<TypeWrapper<Foo>>("5")?;
assert!(failure.value().is_none());
assert_eq!(5, failure.number());

The benefit is that the type safety of Foo remains while the unknown value is concisely represented, preventing deserialization failures. The downside is that this requires an additional method access or pattern match to retrieve either value.

Notes

This is all pseudo-code and untested. Personally I'd go with the TypeWrapper implementation because I see this as having the least drawbacks, is not a hack unlike the second option, and doesn't throw away information unlike the first.

@zeylahellyer zeylahellyer added c-model Affects the model crate m-breaking change Breaks the public API. enhancement labels Apr 14, 2021
@zeylahellyer zeylahellyer modified the milestones: 0.4, 0.5 Apr 14, 2021
@zeylahellyer zeylahellyer moved this to Todo in Go 1.0 Dec 26, 2021
@zeylahellyer zeylahellyer added this to the Model Rework milestone Dec 26, 2021
@zeylahellyer zeylahellyer self-assigned this Dec 27, 2021
@7596ff 7596ff added t-refactor Refactors APIs or code. and removed t-enhancement-DEPRECATED labels Jan 12, 2022
@AEnterprise
Copy link
Member

I'd like to propose a 4th option: separate the incoming and outgoing payloads:

  • in all the incoming payloads replace all enums with their underlying types (u8, string, ...). For easy comparing we could expose some consts or an enum with compare implementations against the raw type.
  • Keep the enums for validated outgoing payloads (with a try_from impl from the raw type for easy converting)

This way we eliminate both the data loss potential on the incoming payloads as well as keep it ergonomic to work with for downstream users. In addition if discord does add something new that the lib doesn't know about the user can compare it to an own const or enum to work with it (so they are not forced to do several major twilight upgrades just to access a new channel type)

going with the only sending validated payloads to the discord api to avoid 400 errors twilight uses we should keep using enums there to enforce this.

@AEnterprise
Copy link
Member

AEnterprise commented Feb 8, 2022

a few implementation options:

  1. no longer make the enums repr(u8). This allows to add an unknown version that contains the unknown value for exposing to the user

    • This allows us the keep the enum with strong typing
    • Requires custom serialize and deserialize implementations to map correctly
  2. no longer make it an enum at all but a type akin to Id. A Type with markers for having strong typing on the types, but without pinning us to specific values

    • Allows us to keep the strong typing
    • Requires constants to be exposed for users to compare against

@laralove143
Copy link
Member

i like the TypeWrapper option most but couldn’t it be made into a Result? so that it can easily be propagated without much of a change in code

@7596ff 7596ff linked a pull request Mar 12, 2022 that will close this issue
3 tasks
@7596ff
Copy link
Contributor

7596ff commented Jul 20, 2022

Looks like this has now been addressed, but we should figure out what is left to do here.

@zeylahellyer zeylahellyer linked a pull request Jan 19, 2023 that will close this issue
16 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c-model Affects the model crate m-breaking change Breaks the public API. t-refactor Refactors APIs or code.
Projects
Status: Todo
4 participants