Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rearchitect custom fields to employ discrete database table fields #14000

Closed
jeremystretch opened this issue Oct 6, 2023 · 9 comments
Closed
Labels
type: feature Introduction of new functionality to the application

Comments

@jeremystretch
Copy link
Member

NetBox version

v3.6.3

Feature type

Change to existing functionality

Proposed functionality

Synopsis

Custom field data is currently stored in a single custom_field_data JSON column on each PotsgreSQL table for models which support custom field assignment. Within this database field, custom field data exists as a mapping of field name to value. For example:

                                        Table "public.dcim_site"
      Column       |           Type           | Collation | Nullable |             Default              
-------------------+--------------------------+-----------+----------+----------------------------------
 created           | timestamp with time zone |           |          | 
 last_updated      | timestamp with time zone |           |          | 
 custom_field_data | jsonb                    |           | not null | 
...
>>> site.custom_field_data
{'locode': 'AE-AMU', 'zip_code': '12345'}

This FR proposes rearchitecting the current implementation so that each custom field exists as a discrete table column within the database. In the example above, locode and zip_code would each exist as individual columns on the dcim_site table.

Adding and Removing Custom Fields

Creating and deleting custom fields on models would trigger the creation/deletion of the corresponding table column for the affected models. The same utilities employed by Django to effect schema migrations in response to model changes could theoretically be employed for this purpose. We could also implement validation that checks for existing data in any particular column prior to its deletion.

Table columns associated with custom fields would not be managed by native Django migrations, and would have no impact on NetBox's native schema migrations. Similarly, this change would not impact the REST or GraphQL APIs, which would continue to represent custom field data in the current format.

Use case

Assuming a robust implementation can be identified, this change would introduce several substantial benefits:

  1. Greatly simplified filtering and ordering of objects by custom field values
  2. Improved support for custom field validation
  3. The ability to optimize the population of related objects (see Improve performance of object type custom field lookups for API responses #12917)

Database changes

This change would replace the existing custom_field_data column on various models with zero or more discrete columns, one for each associated custom field. Below is a (hastily) proposed mapping of custom field types to PostgreSQL data types.

Custom Field Column Type
Text text
Long text text
Integer bigint
Decimal numeric
Boolean boolean
Date date
Date & time timestamp
URL char
JSON json
Select char
Multi-select array[char]
Object bigint*
Multi-obejct array[bigint]*

Object and potentially multi-object fields should be treated as foreign keys; further research is needed in this area.

External dependencies

None have been identified at this time, although it's likely we'll lean on Django's built-in migration utilities for this work.

@jeremystretch jeremystretch added type: feature Introduction of new functionality to the application status: under review Further discussion is needed to determine this issue's scope and/or implementation labels Oct 6, 2023
@DanSheps
Copy link
Member

DanSheps commented Oct 11, 2023

Would it make sense to isolate the table a little and perhaps store the custom field data in a 1:1 related table? I know that adds extra tables to our schema but it does provide a clear deliniation.

@jeremystretch
Copy link
Member Author

@DanSheps what benefit would that provide? We'd be manipulating a table schema either way.

@Haeki
Copy link

Haeki commented Oct 12, 2023

A theoretical problem would be if people use a custom field with the same name as a column that already exists in the model table. I don't know why anyone would choose such a custom field name, but to be on the safe side, the custom field columns could be prefixed (e.g. cf_).

@DanSheps
Copy link
Member

@DanSheps what benefit would that provide? We'd be manipulating a table schema either way.

Separation of the "core" data from the CF's (direct database reporting, python API interaction, etc) and regular data, this also prevents issues down the line where someone introduces a custom field with a name we intend to use down the line.

Also, might make it a little easier to migrate any existing scripts/plugins that use custom fields as you could "keep" custom_field_data and just turn it into a relation.

Again, this was just a random thought I had, rather then a concrete suggestion.

@PieterL75
Copy link
Contributor

I like the idea, as it would also make the custom_fields indexible for the globalsearch cache. not ?

@ITJamie
Copy link
Contributor

ITJamie commented Feb 14, 2024

out of interest why not have an m2m table for this.
eg:

                                        Table "public.object_custom_fields_m2m"
| Column           | Type         | Use                                                                  |
|------------------|--------------|----------------------------------------------------------------------|
| id               | int          |                                                                      |
| object_id        | foreign key  | to related object                                                    |
| field_id         | foreign key  | to custom field id                                                   |
| field_type       | text         | lets you know which of the following columns the value is stored in  |
| field_value_text | text field   | if field_type is text, use this                                      |
| field_value_int  | bigint       | used if field_type is int                                            |
| field_value_dec  | decimal      | used if field_type is Decimal                                        |

etc for each field type

this would move the data out of the existing json objects, would be sortable by the values in each field and would mean migrations wouldn't have to be performed when custom fields were created or deleted.
the existing django ondelete functions would handle cleaning up of data as normal on object deletes and could even work for a custom field itself being deleted to clean up these based on the foreign keys.

it would also make it possible for people to migrate custom fields between field types (eg from decimal to text) when those situations occur.

@jeremystretch
Copy link
Member Author

out of interest why not have an m2m table for this.

@ITJamie the first implementation of custom fields was something similar, and performed very poorly.

@ITJamie
Copy link
Contributor

ITJamie commented Feb 29, 2024

@ITJamie the first implementation of custom fields was something similar, and performed very poorly.

Fair enough. My one hesitation about this change is that schema changes/migrations outside of existing maintenance windows is not a best practice. In some organizations I've worked in the db user netbox is using did not have permission to amend the schema outside of a pre-approved change management window.

If we go ahead with this design change I would hope there would be a pre-check of grants make sure the db connection is allowed to make amendments or that there would be good handling of error response from postgres if it denys the change

@jeremystretch
Copy link
Member Author

I'm going to retire this idea for now as I don't think there's enough potential value to justify the development effort. It's a potential future enhancement.

@jeremystretch jeremystretch closed this as not planned Won't fix, can't repro, duplicate, stale May 14, 2024
@jeremystretch jeremystretch removed the status: under review Further discussion is needed to determine this issue's scope and/or implementation label May 14, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 13, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
type: feature Introduction of new functionality to the application
Projects
None yet
Development

No branches or pull requests

5 participants