Fixes: #15194 - Check the whole queue for matching queued webhooks #15318
Conversation
@peteeckel thanks for all your work on this, both in capturing the root issue and in devising a solution! Admittedly, I haven't attempted to reproduce this firsthand, but thanks to your very detailed notes I believe I understand the root issue. Given that we cannot rely on merely checking the most recent queued item, I'm wondering whether it would make more sense to convert the queue to a dictionary keyed by object, so that an existing entry for the same object can be looked up directly. If you're open to it, I'd be happy to continue your work in this direction. (Of course, you're also welcome to continue tackling it yourself.) Let me know what you think, and thanks again!
Hi Jeremy, you're very welcome - I know that some of the bugs I find are slightly exotic and that you're probably working much more than usual to get NetBox 4 out, so never mind about the delay. As for converting the queue to a dictionary: that makes a lot of sense to me, provided that nobody actually relies on it being a queue, which probably isn't the case. The only thing that might be of concern is that some users might make assumptions about the order in which webhooks are triggered, so my first thought was to use an `OrderedDict`. In CPython, dicts are in fact always ordered, but the question is whether we can rely on NetBox always being run on CPython. Trying to err on the safe side I'd assume the answer is 'no', so the queue should be replaced by an `OrderedDict`. I'm currently on a business trip until Thursday, so I can start working on it again on Friday - if that's not an issue I'll give it a try. On the other hand, the sooner the problem is fixed the better, so don't hesitate to go for it.
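To make this concrete, here is a minimal sketch of an `OrderedDict`-based, object-keyed queue; the key layout and entry fields are assumptions for illustration only, not NetBox's actual event structure.

```python
from collections import OrderedDict

# Hypothetical sketch: an ordered, object-keyed events queue. Keying by
# (model label, pk, request id) makes "is this object already queued?" a
# single lookup while insertion order (and thus webhook order) is preserved.
events_queue = OrderedDict()

def enqueue_event(queue, instance, request_id, action, data):
    key = (instance._meta.label, instance.pk, request_id)
    if key in queue:
        # An event for this object is already queued within this request:
        # update it in place instead of appending a duplicate entry.
        queue[key]['data'].update(data)
    else:
        queue[key] = {'action': action, 'data': dict(data)}
```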
Hi @jeremystretch, I just had a look at the code and it seems that converting the queue to a hash map is not a trivial change. To me, there seems to be a showstopper in the way the queue is passed to and handled by the functions in `extras.events`. So while changing it is rather simple, it will definitely break documented or at least established interfaces, which would probably be possible with the switch to NetBox 4 but not earlier, correct?
Hi @jeremystretch, as this seems to be stalled on the question of the interface to the methods in `extras.events`, let me summarise where I stand: I think moving from a queue to a hash map would require changing the (documented?) interface, so the question is whether or not this is an option (or whether I am missing something). If it gets changed, it would probably be best done with NetBox 4.0 as a major version. Or were you suggesting converting the queue to a hash map just for the sake of checking whether an item is already in the queue, without changing the queue structure as such? I doubt this would be an efficient use of resources, as repeatedly copying the data into a hash map is more effort than the linear search I implemented. If not, do you require any changes to my PR?
@peteeckel Please add a test that illustrates the behavior you're describing, so that the developer who picks this up can be confident in their understanding of the problem. Additionally, given the sensitivity of the code path that this PR targets, we need to include tests which confirm that the rest of the code path continues to operate correctly. For example, there's some concern about the possibility of race conditions when multiple actors try to modify the same object.
Hi @jeffgdotorg, that's an interesting request :-)
By the way, I can add a test to the NetBox DNS test suite. Not sure if that would help, though.
@peteeckel my biggest concern, given the nature of the bug, is that some future change to the logging logic could reintroduce the error. If you can walk me through exactly what's happening in the plugin, I might be able to rig up a test to replicate it in NetBox.
Hi @jeremystretch, thanks for the quick reaction. I constructed some code to reproduce the behaviour in the underlying issue. It can be triggered in multiple ways, all requiring a specifically crafted `save()` method. The easiest way to trigger it is to call `save()` for the same object twice within one request. When there is an m2m relationship, the issue is also triggered when the relationship is modified after the object has already been saved. The only thing my patch does is to check the whole previous queue for events for the same object, instead of only the last event in the queue. With your earlier suggestion of changing the data structure of the queue to a hash map, that would be much simpler and more elegant, however at the cost of breaking the interface for the queue.
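To make the idea concrete, here is a minimal sketch of the whole-queue check described above; the event field names are assumptions for illustration and do not mirror NetBox's exact queue entries.

```python
def find_queued_event(queue, instance, request_id):
    """
    Scan the entire queue - not just the last entry - for an event that was
    already enqueued for this object within the same request.
    """
    for index, event in enumerate(queue):
        if (
            event['object_type'] == instance._meta.label
            and event['object_id'] == instance.pk
            and event['request_id'] == request_id
        ):
            return index
    return None
```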
Just in case you are asking yourself "Why should anyone want to do that kind of stuff?" ... Take a DNS zone, now update the list of name servers for the zone. The latter is an m2m relationship.
And there you have everything described above in a single request. I documented the result in peteeckel/netbox-plugin-dns#89 (comment), including a verification that the PR actually solves this.
Sorry, I accidentally deleted the last comment instead of editing it, but never mind - I have some more data now, and a better approach to creating a test for the problem. Essentially there is a little change to the `DummyModel` of the dummy plugin:

(netbox) [root@dns netbox]# git diff develop -- netbox/netbox/tests/dummy_plugin/models.py
diff --git a/netbox/netbox/tests/dummy_plugin/models.py b/netbox/netbox/tests/dummy_plugin/models.py
index 9bd39a46b..2677727db 100644
--- a/netbox/netbox/tests/dummy_plugin/models.py
+++ b/netbox/netbox/tests/dummy_plugin/models.py
@@ -11,3 +11,7 @@ class DummyModel(models.Model):
     class Meta:
         ordering = ['name']
+
+    def save(self, *args, **kwargs):
+        super().save()
+        super().save()

This is enough to trigger the erroneous invocations of the event rules for `DummyModel`. Here are the tests I added:

(netbox) [root@dns netbox]# git diff develop -- netbox/extras/tests/test_event_rules.py
diff --git a/netbox/extras/tests/test_event_rules.py b/netbox/extras/tests/test_event_rules.py
index 8cea2078a..80375ba82 100644
--- a/netbox/extras/tests/test_event_rules.py
+++ b/netbox/extras/tests/test_event_rules.py
@@ -16,6 +16,8 @@ from extras.events import enqueue_object, flush_events, serialize_for_event
 from extras.models import EventRule, Tag, Webhook
 from extras.webhooks import generate_signature, send_webhook
 from utilities.testing import APITestCase
+from netbox.tests.dummy_plugin.models import DummyModel
+from netbox.context import events_queue


 class EventRuleTest(APITestCase):
@@ -31,6 +33,7 @@ class EventRuleTest(APITestCase):
     def setUpTestData(cls):
         site_type = ObjectType.objects.get_for_model(Site)
+        dummy_type = ObjectType.objects.get_for_model(DummyModel)

         DUMMY_URL = 'http://localhost:9000/'
         DUMMY_SECRET = 'LOOKATMEIMASECRETSTRING'
@@ -65,7 +68,7 @@ class EventRuleTest(APITestCase):
             ),
         ))
         for event_rule in event_rules:
-            event_rule.object_types.set([site_type])
+            event_rule.object_types.set([site_type, dummy_type])

         Tag.objects.bulk_create((
             Tag(name='Foo', slug='foo'),
@@ -377,3 +380,45 @@ class EventRuleTest(APITestCase):
         # Patch the Session object with our dummy_send() method, then process the webhook for sending
         with patch.object(Session, 'send', dummy_send) as mock_send:
             send_webhook(**job.kwargs)
+
+    def test_webhook_double_save_create(self):
+        data = {
+            'name': 'Dummy',
+        }
+        url = reverse('plugins-api:dummy_plugin-api:dummymodel-list')
+        self.add_permissions('dummy_plugin.add_dummymodel')
+        response = self.client.post(url, data, format='json', **self.header)
+        self.assertHttpStatus(response, status.HTTP_201_CREATED)
+        self.assertEqual(DummyModel.objects.count(), 1)
+
+        dummy = DummyModel.objects.first()
+
+        self.assertEqual(self.queue.count, 1)
+        job = self.queue.jobs[0]
+        self.assertEqual(job.kwargs['event_rule'], EventRule.objects.get(type_create=True))
+        self.assertEqual(job.kwargs['event'], ObjectChangeActionChoices.ACTION_CREATE)
+        self.assertEqual(job.kwargs['model_name'], 'dummymodel')
+        self.assertEqual(job.kwargs['data']['id'], dummy.pk)
+        self.assertEqual(job.kwargs['snapshots']['postchange']['name'], 'Dummy')
+
+    def test_webhook_double_save_update(self):
+        dummy = DummyModel.objects.create(name='Dummy')
+
+        data = {
+            'name': 'New Dummy',
+            'number': 42,
+        }
+        url = reverse('plugins-api:dummy_plugin-api:dummymodel-detail', kwargs={'pk': dummy.pk})
+        self.add_permissions('dummy_plugin.change_dummymodel')
+        response = self.client.patch(url, data, format='json', **self.header)
+        self.assertHttpStatus(response, status.HTTP_200_OK)
+        self.assertEqual(DummyModel.objects.count(), 1)
+
+        self.assertEqual(self.queue.count, 1)
+        job = self.queue.jobs[0]
+        self.assertEqual(job.kwargs['event_rule'], EventRule.objects.get(type_update=True))
+        self.assertEqual(job.kwargs['event'], ObjectChangeActionChoices.ACTION_UPDATE)
+        self.assertEqual(job.kwargs['model_name'], 'dummymodel')
+        self.assertEqual(job.kwargs['data']['id'], dummy.pk)
+        self.assertEqual(job.kwargs['snapshots']['postchange']['name'], 'New Dummy')
+        self.assertEqual(job.kwargs['snapshots']['postchange']['number'], 42)

I appended these tests to the existing `EventRuleTest` case in `extras/tests/test_event_rules.py`. The problem is: this does not work at all.
The first problem is that the updates to `DummyModel` require the `created` and `last_updated` fields, which the dummy plugin's model lacks. The migration needs to be changed as well:

(netbox) [root@dns netbox]# git diff develop -- netbox/netbox/tests/dummy_plugin/migrations/0001_initial.py
diff --git a/netbox/netbox/tests/dummy_plugin/migrations/0001_initial.py b/netbox/netbox/tests/dummy_plugin/migrations/0001_initial.py
index b7241b51d..9ba1547d3 100644
--- a/netbox/netbox/tests/dummy_plugin/migrations/0001_initial.py
+++ b/netbox/netbox/tests/dummy_plugin/migrations/0001_initial.py
@@ -15,6 +15,8 @@ class Migration(migrations.Migration):
                 ('id', models.BigAutoField(auto_created=True, primary_key=True, serialize=False)),
                 ('name', models.CharField(max_length=20)),
                 ('number', models.IntegerField(default=100)),
+                ('created', models.DateTimeField(auto_now_add=True, null=True)),
+                ('last_updated', models.DateTimeField(auto_now=True, null=True)),
             ],
             options={
                 'ordering': ['name'],

This is where the trouble really started.
The change logging functionality relies on serialising objects, and that requires finding the serializer for the model in the first place. This is done via `get_serializer_for_model()` in `netbox/utilities/api.py`.
After a while I found out that the problem is with the way `get_serializer_for_model()` derives the serializer's module path from the model's app label, which does not work for the dummy plugin because its serializers actually live under `netbox.tests.dummy_plugin`. To be honest, I did not find a real solution for this. I have a very dirty hack in place that just works and doesn't break anything else, but it's ugly as hell and definitely can't be used like this.

(netbox) [root@dns netbox]# git diff 0bfb9777bed4fcf22f5b684504b1e62d03cdfb9b 3c38f51250710c6637667ef626f0bc484760ead9
diff --git a/netbox/netbox/tests/dummy_plugin/models.py b/netbox/netbox/tests/dummy_plugin/models.py
index 9bd39a46b..ade46f768 100644
--- a/netbox/netbox/tests/dummy_plugin/models.py
+++ b/netbox/netbox/tests/dummy_plugin/models.py
@@ -8,6 +8,7 @@ class DummyModel(models.Model):
     number = models.IntegerField(
         default=100
     )
+    serializer_label = 'netbox.tests.dummy_plugin'

     class Meta:
         ordering = ['name']
diff --git a/netbox/utilities/api.py b/netbox/utilities/api.py
index 11b914811..3cedfc4a3 100644
--- a/netbox/utilities/api.py
+++ b/netbox/utilities/api.py
@@ -30,7 +30,12 @@ def get_serializer_for_model(model, prefix=''):
     """
     Return the appropriate REST API serializer for the given model.
     """
-    app_label, model_name = model._meta.label.split('.')
+    if hasattr(model, 'serializer_label'):
+        app_label = model.serializer_label
+        model_name = model._meta.label.split('.')[1]
+    else:
+        app_label, model_name = model._meta.label.split('.')
+
     serializer_name = f'{app_label}.api.serializers.{prefix}{model_name}Serializer'
     try:
         return import_string(serializer_name)

For the time being I left that hack in place. One way to do it properly would probably be to add something like the `serializer_label` attribute as an officially supported model option.
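As a standalone illustration of why the lookup fails for the dummy plugin (the model label value below is an assumption based on the hack above, not verified against the test plugin's app config):

```python
# get_serializer_for_model() builds the import path from the model's label,
# e.g. 'dummy_plugin.DummyModel' (assumed label for the test plugin):
app_label, model_name = 'dummy_plugin.DummyModel'.split('.')
serializer_name = f'{app_label}.api.serializers.{model_name}Serializer'
print(serializer_name)  # dummy_plugin.api.serializers.DummyModelSerializer

# But the test plugin's serializers actually live under
# netbox.tests.dummy_plugin.api.serializers, so import_string() fails -
# which is what the serializer_label hack above works around.
```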
The next problem, a trivial one, was that the serializer for `DummyModel` is named `DummySerializer`, while the lookup expects a serializer named `DummyModelSerializer`:

diff --git a/netbox/netbox/tests/dummy_plugin/api/serializers.py b/netbox/netbox/tests/dummy_plugin/api/serializers.py
index 239d7d998..0a6412095 100644
--- a/netbox/netbox/tests/dummy_plugin/api/serializers.py
+++ b/netbox/netbox/tests/dummy_plugin/api/serializers.py
@@ -2,7 +2,7 @@ from rest_framework.serializers import ModelSerializer
 from netbox.tests.dummy_plugin.models import DummyModel

-class DummySerializer(ModelSerializer):
+class DummyModelSerializer(ModelSerializer):

     class Meta:
         model = DummyModel
diff --git a/netbox/netbox/tests/dummy_plugin/api/views.py b/netbox/netbox/tests/dummy_plugin/api/views.py
index 58f221285..3fd51bcf2 100644
--- a/netbox/netbox/tests/dummy_plugin/api/views.py
+++ b/netbox/netbox/tests/dummy_plugin/api/views.py
@@ -1,8 +1,8 @@
 from rest_framework.viewsets import ModelViewSet
 from netbox.tests.dummy_plugin.models import DummyModel
-from .serializers import DummySerializer
+from .serializers import DummyModelSerializer

 class DummyViewSet(ModelViewSet):
     queryset = DummyModel.objects.all()
-    serializer_class = DummySerializer
+    serializer_class = DummyModelSerializer

And that's pretty much all ... with these changes in place there is a test that fails if the proposed patch is not in place and succeeds if it is. The only major issue is the problem with the serializer lookup in `get_serializer_for_model()`. By the way, I experimented very briefly with setting ...
And, as a side note, it also requires changing ...
@peteeckel I think you've been on the right path with this approach, but employing unique IDs to map each object in the queue offers better protection against duplicated events. We can also simplify the code a good deal in the process. Please have a look at the PR I just put in (#16366) and let me know what you think!
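For context, a purely illustrative sketch of the "unique ID per queued object" idea follows; this is not the implementation in #16366, and the attribute name `_event_key` is an assumption for this example.

```python
import uuid

events_queue = {}

def enqueue(instance, action, data):
    # Give each object a unique key the first time it is enqueued and reuse it
    # for later signals (e.g. m2m_changed), so follow-up changes update the
    # existing entry instead of appending a duplicate event.
    key = getattr(instance, '_event_key', None)
    if key is None:
        key = uuid.uuid4().hex
        instance._event_key = key
    entry = events_queue.setdefault(key, {'action': action, 'data': {}})
    entry['data'].update(data)
```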
Fixes: #15194
This PR changes the logic for how changes to M2M relationships are matched to changes in the base object to which the M2M relationship is related. The current code assumes that the `post_save` signal is immediately followed by the `m2m_changed` signal for the same object, but that is not generally true, as any `save()` operation for a different object occurring in between will separate the two entries in the queue. This results in two webhooks being triggered, the first of which has incorrect data in its payload.

The new code finds the position of the webhook data for the object change in the queue and updates it with the M2M change data, ensuring that only one webhook is triggered, and with the correct data.
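A hedged sketch of the matching-and-update step described above (the field names are illustrative and do not reflect the exact structure used in `extras.events`):

```python
def merge_m2m_change(queue, instance, request_id, m2m_data):
    """
    Locate the queued event for the object and fold the M2M change into it,
    so only one webhook fires and it carries the post-change data.
    """
    for event in queue:
        if (
            event['object_type'] == instance._meta.label
            and event['object_id'] == instance.pk
            and event['request_id'] == request_id
        ):
            event['data'].update(m2m_data)
            return
    # No matching event queued yet (e.g. the M2M change arrived on its own):
    # enqueue a new entry instead.
    queue.append({
        'object_type': instance._meta.label,
        'object_id': instance.pk,
        'request_id': request_id,
        'data': dict(m2m_data),
    })
```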