Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hardens editable_future_graph against nodegroup changes #11570 #11609

Open
wants to merge 25 commits into
base: dev/8.0.x
Choose a base branch
from

Conversation

chrabyrd
Copy link
Contributor

@chrabyrd chrabyrd commented Nov 8, 2024

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Description of Change

This handles source_graph nodegroup creation for when a node is set to is_collector after already being created. It includes general performance/QoL improvements, and has spawned a followup ticket: #11610

Issues Solved

Closes #11570

Checklist

  • I targeted one of these branches:
    • dev/8.0.x (under development): features, bugfixes not covered below
    • dev/7.6.x (main support): regressions, crashing bugs, security issues, major bugs in new features
    • dev/6.2.x (extended support): major security issues, data loss issues
  • I added a changelog in arches/releases
  • I submitted a PR to arches-docs (if appropriate)
  • Unit tests pass locally with my changes
  • I added tests that prove my fix is effective or that my feature works
  • My test fails on the target branch

@chrabyrd chrabyrd marked this pull request as ready for review November 14, 2024 06:48
@chrabyrd
Copy link
Contributor Author

re-running all tests to ensure flakiness no longer occurs. 1 successful run isn't enough 😂

Copy link
Member

@johnatawnclementawn johnatawnclementawn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dog-fooding this for converting string nodes into non-localized string nodes and found that if your graph has a slug, you get an error like this:

File "/Users/johnathanclementi/Documents/Arches/Projects/rdm/arches/arches/app/models/graph.py", line 2577, in update_from_editable_future_graph
    return self.restore_state_from_serialized_graph(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        serialized_editable_future_graph
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/johnathanclementi/Documents/Arches/Projects/rdm/arches/arches/app/models/graph.py", line 2689, in restore_state_from_serialized_graph
    updated_graph.save()
    ~~~~~~~~~~~~~~~~~~^^
  File "/Users/johnathanclementi/Documents/Arches/Projects/rdm/arches/arches/app/models/graph.py", line 599, in save
    self.validate()
    ~~~~~~~~~~~~~^^
  File "/Users/johnathanclementi/Documents/Arches/Projects/rdm/arches/arches/app/models/graph.py", line 2328, in validate
    raise GraphValidationError(
    ...<4 lines>...
    )
arches.app.models.graph.GraphValidationError: "Another resource model already uses the slug 'string_example_model'"

…nto 11570-cbyrd-harden-editable_future_graphs-against-nodegroup-changes
@chrabyrd
Copy link
Contributor Author

chrabyrd commented Nov 15, 2024

Dog-fooding this for converting string nodes into non-localized string nodes and found that if your graph has a slug, you get an error like this:

File "/Users/johnathanclementi/Documents/Arches/Projects/rdm/arches/arches/app/models/graph.py", line 2577, in update_from_editable_future_graph
    return self.restore_state_from_serialized_graph(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        serialized_editable_future_graph
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/johnathanclementi/Documents/Arches/Projects/rdm/arches/arches/app/models/graph.py", line 2689, in restore_state_from_serialized_graph
    updated_graph.save()
    ~~~~~~~~~~~~~~~~~~^^
  File "/Users/johnathanclementi/Documents/Arches/Projects/rdm/arches/arches/app/models/graph.py", line 599, in save
    self.validate()
    ~~~~~~~~~~~~~^^
  File "/Users/johnathanclementi/Documents/Arches/Projects/rdm/arches/arches/app/models/graph.py", line 2328, in validate
    raise GraphValidationError(
    ...<4 lines>...
    )
arches.app.models.graph.GraphValidationError: "Another resource model already uses the slug 'string_example_model'"

we should chat, I cannot reproduce this error in test or UX. I added a test anyway 👍

Copy link
Member

@jacobtylerwalls jacobtylerwalls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the QoL improvements, and thanks for digging into this many-tentacled issue! A few questions for you.

@@ -0,0 +1,38 @@
# Generated by Django 5.1.2 on 2024-11-13 18:17
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filename: edtible -> editable

arches/app/models/models.py Show resolved Hide resolved
arches/app/models/models.py Show resolved Hide resolved
Comment on lines +1709 to +1713
serialized_user_permission["content_object"] = (
models.NodeGroup.objects.get(
pk=serialized_user_permission["object_pk"]
)
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about prefetching these to avoid one-by-one queries?

Comment on lines +1599 to +1600
source_graph = Graph.new(name="TEST RESOURCE", is_resource=True, author="TEST")
source_graph.save()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a newer developer to this part of the code, tests like these help me understand the intended developer interface. Am I supposed to save() after calling new() because new() lacks some functionality? Would be nice to clarify with a comment here if so, or else open a follow-up ticket if the implementations need some attention.

tests/models/graph_tests.py Show resolved Hide resolved

nodegroup = models.NodeGroup.objects.create()
string_node = models.Node.objects.create(
graph=source_graph,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's ensure there's a collector/root node:

Suggested change
graph=source_graph,
pk=nodegroup.pk,
graph=source_graph,

tests/models/graph_tests.py Show resolved Hide resolved
Comment on lines +2603 to +2605
# ensures any resources that were related to the source graph are not deleted
self.pk = uuid.uuid4()
self.delete()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The chance of legit data loss from a UUID collision is infinitesimal, but I wonder if we should decompose Graph.delete() into deleteRelatedObjects() so we can call it without calling super().delete() and fiddling with the pk. By not hacking the pk, it will be a little clearer and hardened against other ORM internals (e.g., do we need to also fiddle with ._state.adding and set it False, now that it's a nonexistent object? Would rather not massage that...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we also want to delete the graph. I'm unsure decomposing this will do anything for

graph = models.ForeignKey(GraphModel, db_column="graphid", on_delete=models.CASCADE)

on ResourceInstance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well from a code read, we're not deleting the graph, and I just checked the queries to be sure. We set the PK to a random value, so that super().delete() will shoot a bunch of blanks, firing off deletes on a bunch of tables that will never match on anything. Since we're not deleting the graph, I was thinking it would be clearer to decompose that out and avoid shooting the blanks:

BEFORE:  a6cd3760-199b-4071-bd15-307fb04e53d2
AFTER:  de19d77d-4b59-41ae-bdee-aa7669cda961

Notice the deletes are for the random uuid (also why I was mentioning the very very rare risk of collision):

-> super(Graph, self).delete()
(Pdb) print(connection.queries)
[{'sql': 'SELECT "cards"."cardid" FROM "cards" WHERE "cards"."graphid" IN (\'de19d77d-4b59-41ae-bdee-aa7669cda961\'::uuid)', 'time': '0.003'}, {'sql': 'SELECT "edges"."edgeid" FROM "edges" WHERE "edges"."graphid" IN (\'de19d77d-4b59-41ae-bdee-aa7669cda961\'::uuid)', 'time': '0.002'}, {'sql': 'SELECT "graphs"."graphid" FROM "graphs" WHERE "graphs"."source_identifier" IN (\'de19d77d-4b59-41ae-bdee-aa7669cda961\'::uuid)', 'time': '0.001'}, {'sql': 'SELECT "graphs_x_published_graphs"."publicationid" FROM "graphs_x_published_graphs" WHERE "graphs_x_published_graphs"."graphid" IN (\'de19d77d-4b59-41ae-bdee-aa7669cda961\'::uuid)', 'time': '0.002'}, {'sql': 'SELECT "nodes"."nodeid" FROM "nodes" WHERE "nodes"."graphid" IN (\'de19d77d-4b59-41ae-bdee-aa7669cda961\'::uuid)', 'time': '0.001'}, {'sql': 'SELECT "resource_instances"."resourceinstanceid" FROM "resource_instances" WHERE "resource_instances"."graphid" IN (\'de19d77d-4b59-41ae-bdee-aa7669cda961\'::uuid)', 'time': '0.001'}, {'sql': 'DELETE FROM "functions_x_graphs" WHERE "functions_x_graphs"."graphid" IN (\'de19d77d-4b59-41ae-bdee-aa7669cda961\'::uuid)', 'time': '0.001'}, {'sql': 'DELETE FROM "resource_x_resource" WHERE ("resource_x_resource"."resourceinstancefrom_graphid" IN (\'de19d77d-4b59-41ae-bdee-aa7669cda961\'::uuid) OR "resource_x_resource"."resourceinstanceto_graphid" IN (\'de19d77d-4b59-41ae-bdee-aa7669cda961\'::uuid))', 'time': '0.001'}, {'sql': 'DELETE FROM "graphs_x_mapping_file" WHERE "graphs_x_mapping_file"."graphid" IN (\'de19d77d-4b59-41ae-bdee-aa7669cda961\'::uuid)', 'time': '0.001'}, {'sql': 'DELETE FROM "graphs" WHERE "graphs"."graphid" IN (\'de19d77d-4b59-41ae-bdee-aa7669cda961\'::uuid)', 'time': '0.001'}]

Copy link
Contributor Author

@chrabyrd chrabyrd Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm talking about:

{'sql': 'DELETE FROM "graphs" WHERE "graphs"."graphid" IN (\'de19d77d-4b59-41ae-bdee-aa7669cda961\'::uuid)', 'time': '0.001'}

so that the current graph object gets deleted. I know the Graph.delete method manually handles cards, nodes, widgets, nodegroups, and edges in a way that wouldn't be affected by changing the graphid. graph_x_published_graph is interesting, I haven't checked that those are also deleted, but I will. Same with checking for strays with functions_x_graphs

I'm not against the decomposition here, but at some point the graph itself needs to be deleted, so not I'm entirely sure that the decomposition will handle what we need it to.

And you're right about firing blanks, that's intentional for resource_x_resource and resource_instances.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But de19d77d-4b59-41ae-bdee-aa7669cda961 isn't the current object, it's the random one, right?

BEFORE is self.pk, then
AFTER is after setting self.pk = uuid.uuid4()

BEFORE:  a6cd3760-199b-4071-bd15-307fb04e53d2
AFTER:  de19d77d-4b59-41ae-bdee-aa7669cda961

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah. yeah I get it 😞 . Def something to handle for

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cannot republish graph after adding parent/child nodegroup configuration
3 participants