Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Deprecate support for enums #8401

Closed
robbieaverill opened this issue Sep 24, 2018 · 10 comments
Closed

RFC: Deprecate support for enums #8401

robbieaverill opened this issue Sep 24, 2018 · 10 comments

Comments

@robbieaverill
Copy link
Contributor

robbieaverill commented Sep 24, 2018

Affected Version

Deprecate in 4.x and remove in 5.x

Description

Support in SilverStripe for enum DB fields is one of the main blockers for moving to use doctrine for example.

Some other reasons why we shouldn't use enums:

  • Semantic reasons, e.g. that the data storage is actually column metadata where it should be in a field
  • They aren't extensible - user code can't add new values to enum fields (without overloading them entirely)

We use enums for some quite fundamental parts of the framework, e.g. the ClassName field. We can replace this with application level logic and constant values instead.

Pros

Cons

  • Developers probably like using them because we've made them easy to configure
@sminnee
Copy link
Member

sminnee commented Sep 24, 2018

Modernize the framework a little more

Can we elaborate on what alternative to enum we are advocating and why this would be a step forward? I agree that this would unblock #6632, and it would make our PGSQL implementation a little more consistent, but I'm skeptical of other benefits, and "modernize" as a benefit is a bit of a truism.

Using ClassName as a specific example

ClassName = varchar + Index + A separate constraint
Our constraint is managed separately, but provided this isn't a performance degradation, maybe this is okay. dev/build adjustments of allowed items might be quicker (assuming an enum is slower to modify than a constraint), but that requires testing. I don't see how this is a clear improvement, but it seems fine.

EDIT: The most significant change that this would make is the data usage of each record, which on big tables can get significant. Which may end up pushing us in the direction of the other 2 solutions, although one good thing we could do is avoid including the ClassName field on tables that don't support multiple classes.

And then there are some solutions which I think would be a backward step:

ClassName in a separate table and ClassNameID in the parent table
Now we need extra joins or preparatory getClassNameIDs() queries. Extra joins would be by far the worst of the two, slowing down every part of SilverStripe. Manual interrogation of the database in a MySQL is a bit harder, you can't see the enum values in your root table. Technically this is denormalised data but in my view excessively so.

ClassName is an integer an the lists are maintained in PHP
A lot of the problems of the previous solution. The getClassNameIDs() query is avoided but instead this list needs to be maintained in PHP/server-side somehow.


So, in my view, this is solely justified by #6632.

@robbieaverill
Copy link
Contributor Author

Let's leave that as the sole justification for now then =)

@maxime-rainville
Copy link
Contributor

You can pry my Enums from my cold, dead hands!!! 😱

On a serious note, if it's holding up #6632 it sounds like a worthwhile sacrifice.

Sam's ClassName = varchar + Index + A separate constraint comments sounds sensible.

I'm wondering if we want to keep some sort of virtual Enum construct, that constrains the values you can store in your field at the Framework level without actually using the Enum type in the DB? Or would that be counter intuitive and/or defeat the purpose of the change?

@robbieaverill
Copy link
Contributor Author

Yes I think application level support would be a good thing to let people still use them without affecting the DB, it would also be extensible.

Doctrine have some examples of how we could do this in core: https://www.doctrine-project.org/projects/doctrine-orm/en/2.6/cookbook/mysql-enums.html

@kinglozzer
Copy link
Member

Yeah, my immediate thought when I saw this blocker in #6632 was varchar with ORM-validation. It’d essentially be a convenience wrapper to save you writing your own validator, but I think that’s fine. I’m not familiar with constraints at a database-level, are they well supported/supported in PGSQL/SQLite etc?

I’m trying to think if there are any drawbacks to not having a list of available values when inspecting a column. I can’t really think of a use-case for that, so it’s probably not worth worrying about - perhaps if we implement database-level constraints you could grab them from there...

@dhensby
Copy link
Contributor

dhensby commented Sep 25, 2018

My opinion is that we should remove native DB enum support but have an application layer support for them. That means we still retain the "Enum" DB Field type in the ORM we just don't construct an enum in the database. We can still define a list of allowed values for a field but we just don't store that list in the DB.

In my view, one of the biggest drawbacks of Enums is that it causes data corruption on dev/build when the enum values are changed, which makes rolling back database changes a huge pain.


Just as an aside, there are a lot of things blocking #6632 including:

  1. Close coupling to autoincremending DB IDs (that have to be pulled from the DB)
  2. mIXeD case identifiers (they need to be all lower case)
  3. a great love of writing, manipulating and injecting SQL in our core eco-system and modules (aka - DataQuery constructs SQL instead of a representation of a query in our domain language)

@sminnee
Copy link
Member

sminnee commented Oct 1, 2018

If we do address this we should also look at #1387, which is to say, we want to ensure that constraints get re-written when allowed elements change, or remove it altogether.

I have to say that removing all constraints from the database and just storing a varchar, relying on PHP to maintain data integrity feels like a step backwards:

  • Less integrity at the database level
  • More bytes per row, making any indexes longer – this can get material if we store the FQCN in ClassName as we currently do, although perhaps we can reduce this impact but assuming that names are 7-bit safe, and sizing the field to based on the values allowed.

@sminnee
Copy link
Member

sminnee commented Oct 15, 2018

I would suggest that we start by introducing a config option, eg. DBEnum.use_native_enum = true. If you set it to false then it will add a varchar + constraint to the database instead of an enum.

This will let enthusiasts try this feature out before we take the step of deprecating the current implementation, which personally I would only do if we were pretty far down the track of implementing DBAL

@kinglozzer
Copy link
Member

Just dropping back in to list another disadvantage of enums. When deploying changes to an existing SS4 site, the ChangeSetItem.ObjectClass column has to be updated to reflect any new/removed versioned DataObjects. It’s really slow when you have a lot of changeset records...

@robbieaverill
Copy link
Contributor Author

#6632 is not something we're planning on doing any more, so this has less drive than it had.

I like @sminnee's last idea of allowing config based opt-in.

I don't think there's been enough interest in this RFC since I opened it two years ago to pursue it any further, so I'm going to close it. Thanks to everyone for the input!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants