Describe the bug
A customer experienced a massive amount of e-mail being dispatched from their NAV server through the night. On our end, we received some 78,000 Django error report e-mails over the course of about 14 hours.
It appears that a user had the IPAM web tool open, and that it kept hammering (in a tight loop) the back-end API endpoint that retrieves information about a scope-level prefix for display in IPAM. The endpoint failed with an unhandled error (500 response code), because the prefix address in question could not be found in the NAV database (it may have been deleted after the IPAM UI loaded its initial data set). The production server is set up to e-mail Django error reports to the developer team when a Django view hits an unhandled exception, so every one of these requests dispatched a full e-mail debug report.
To Reproduce
Reproducing the failing back-end API call is fairly easy, even from the IPAM UI. However, we've not been able to reproduce the tight loop observed in production.
A failing API call can be produced by visiting /ipam/api/?net_type=all&within=A.B.C.D/XX&show_all=True, where A.B.C.D/XX represents a network prefix that is not registered anywhere in NAV (see traceback below).
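For scripted reproduction, the request URL can be built as in the minimal sketch below. The host name `nav.example.org` and the helper `ipam_tree_url` are hypothetical, and `192.168.42.0/24` is only an example of a prefix assumed to be absent from the installation. Actually fetching the URL presumably also requires an authenticated NAV session, which is not shown here.

```python
from urllib.parse import urlencode, urljoin


def ipam_tree_url(base_url, prefix):
    """Build the IPAM API URL that requests the tree within `prefix`.

    `base_url` is the root of a NAV installation (hypothetical here).
    """
    # safe="/" keeps the CIDR slash readable instead of encoding it as %2F
    query = urlencode(
        {"net_type": "all", "within": prefix, "show_all": "True"},
        safe="/",
    )
    return urljoin(base_url, "/ipam/api/") + "?" + query


# Hypothetical host; the prefix is assumed not to be registered in NAV
print(ipam_tree_url("https://nav.example.org", "192.168.42.0/24"))
```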
The same call is most easily reproduced through the front-end thus:
Steps to reproduce the behavior:
1. Go to SeedDB
2. Click on the Prefix tab
3. Click on the button Add new prefix
4. Add a new scope prefix (suggest using some RFC 1918 address unused in your installation)
5. Open a new browser tab and browse to /ipam to populate the initial UI list of scope prefixes. Verify that the new prefix is in the list.
6. Return to the SeedDB tab and click the Delete button. Confirm the deletion of the prefix you added in step 4.
7. Return to the IPAM tab. Click on the deleted prefix address to expand it.
8. IPAM performs the failing API request.
It is at this point that we are unable to reproduce a tight reloading loop. What we observe is the API endpoint failing with a 500 error, which the IPAM UI seems to be oblivious to: it expands the selected prefix and displays a spinner icon with the text "Getting data for allocation tree". This spinner never stops.
Expected behavior
1. The back-end API should not crash if the requested prefix address is unknown. In fact, the endpoint seems to be coded to produce an alternative response if the prefix isn't found, but this code is faulty.
2. The front-end should not hammer the back-end API. However, since we're unable to reproduce this behavior, it's not clear what we can do about it.
3. The front-end should take appropriate measures to report API errors to the user, rather than showing a spinner indefinitely.
Number 3 could be implemented as a separate issue, since it is entirely in the front-end. The most important issue right now is to fix number 1, as there is no need to mail the Django site admins when the API cannot find the requested address.
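One way to approach number 1 is to treat a missing scope prefix as an expected outcome rather than an unhandled exception. The sketch below is not the actual NAV code: `FakePrefixManager`, `DoesNotExist` and `find_scope` are hypothetical stand-ins so the example runs without Django. In real Django code the `except` clause would name `Prefix.DoesNotExist`, and how the tree builder should respond to a missing scope is left open here.

```python
class DoesNotExist(Exception):
    """Stand-in for Django's Prefix.DoesNotExist (hypothetical)."""


class FakePrefixManager:
    """Stand-in for Prefix.objects, backed by a plain dict (hypothetical)."""

    def __init__(self, prefixes):
        self._prefixes = dict(prefixes)

    def get(self, net_address):
        # Mimic Manager.get(): raise when no row matches
        try:
            return self._prefixes[net_address]
        except KeyError:
            raise DoesNotExist("Prefix matching query does not exist.") from None


def find_scope(manager, root_ip):
    """Return the stored scope prefix for root_ip, or None if it is unknown."""
    try:
        return manager.get(net_address=root_ip)
    except DoesNotExist:
        # An unknown prefix is an expected case, not a server error
        return None


manager = FakePrefixManager({"10.0.0.0/8": "scope 10.0.0.0/8"})
print(find_scope(manager, "10.0.0.0/8"))       # known prefix is returned
print(find_scope(manager, "192.168.42.0/24"))  # unknown prefix yields None
```

With a lookup like this, the caller can decide between an alternative response and an explicit client error, and neither path triggers a Django error e-mail.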
Tracebacks
Recreated in development:
Traceback (most recent call last):
  File "/opt/venvs/nav/lib/python3.9/site-packages/django/core/handlers/exception.py", line 47, in inner
    response = get_response(request)
  File "/opt/venvs/nav/lib/python3.9/site-packages/django/core/handlers/base.py", line 181, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/opt/venvs/nav/lib/python3.9/site-packages/django/views/decorators/csrf.py", line 54, in wrapped_view
    return view_func(*args, **kwargs)
  File "/opt/venvs/nav/lib/python3.9/site-packages/rest_framework/viewsets.py", line 125, in view
    return self.dispatch(request, *args, **kwargs)
  File "/opt/venvs/nav/lib/python3.9/site-packages/rest_framework/views.py", line 509, in dispatch
    response = self.handle_exception(exc)
  File "/opt/venvs/nav/lib/python3.9/site-packages/rest_framework/views.py", line 469, in handle_exception
    self.raise_uncaught_exception(exc)
  File "/opt/venvs/nav/lib/python3.9/site-packages/rest_framework/views.py", line 480, in raise_uncaught_exception
    raise exc
  File "/opt/venvs/nav/lib/python3.9/site-packages/rest_framework/views.py", line 506, in dispatch
    response = handler(request, *args, **kwargs)
  File "/opt/venvs/nav/lib/python3.9/site-packages/nav/web/ipam/api.py", line 157, in list
    result = make_tree(prefixes, root_ip=within, family=family, show_all=show_all)
  File "/opt/venvs/nav/lib/python3.9/site-packages/nav/web/ipam/prefix_tree.py", line 471, in make_tree
    scope = Prefix.objects.get(net_address=root_ip)
  File "/opt/venvs/nav/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/opt/venvs/nav/lib/python3.9/site-packages/django/db/models/query.py", line 435, in get
    raise self.model.DoesNotExist(
nav.models.manage.Prefix.DoesNotExist: Prefix matching query does not exist.
Environment (please complete the following information):
NAV version installed: 5.10.2
NAV test data does not include a prefix 192.168.42.0/24. This should
ensure the IPAM main API endpoint does not crash when asked to build
a tree for a prefix not known to NAV.