1710 dsr bulk reprocess not working with 20 demo #2015
Merged
Closes #1710
Code Changes
Remove the `async` keyword from privacy request endpoints, as it is one contributing reason that reprocessed privacy requests are being executed concurrently when running without a worker, which is not a state we expect/handle
Steps to Confirm
The steps in the issue describe how to reproduce the problem while using `fides deploy up`. But we won't have the fix available in that workflow until we merge this PR, since it requires an updated `ethyca-fides` package on PyPI, so we're in a bit of a catch-22 there. That being said, I was able to reproduce the problem on all of my local testing environments (e.g. `nox -s dev` and `nox -s test_env`) if I just updated my local `fides.toml` to have `analytics_opt_out = false`. So to confirm, you could update your local `fides.toml` to have `analytics_opt_out = false` and run `nox -s dev -- ui`.
In general, the comments I left on the issue provide some context for different testing I did and my findings - they may be useful to read while evaluating this fix.
I'll also attach the latest version of the little test script (test_fides_1710.py.txt) I had that helped to reproduce the problem and verify the fix. The script uses `fides` dependencies, so it needs to run in a Python env with `ethyca-fides` installed. It's also not the most beautiful script, but it gets the job done :)
Pre-Merge Checklist
CHANGELOG.md
Description Of Changes
Removing the `async` keyword from the endpoint function doesn't get super close to the "root cause" of the problems seen in DSR reprocessing. If someone comes along and writes a new `async` endpoint function that queues privacy requests, the underlying infrastructure will still allow us to get into a state where we've got multiple concurrent DSR executions occurring when not using a worker, which is not something that will be handled well. That being said, this was by far the simplest and least intrusive fix I could think of, and it's effective: our current routes do not put us in a bad state, whether using a worker or not, with manual approval or not, when submitted or reprocessing, etc.

[...] `async` keyword. But because of the way that FastAPI works, when the endpoint functions were `async`, they were being executed on the app's main event loop; if the underlying work they were doing was not effectively leveraging `async`/`await` functionality (and most of it is not), then all of that work would clog up the entire web server. This did not impact manually approved privacy requests (since they were using a `sync` endpoint), and it didn't impact execution with a worker (since that work was successfully pushed off onto another process). But when running without a worker, and when bulk reprocessing requests or creating a privacy request without manual approval required (both `async` endpoints), the whole server would lock up until the privacy requests finished execution! With this change, they no longer do that, since the `sync` endpoints do all of their work on a different thread/event loop from the one used by the main web server.

[...] `async` endpoints in Make log send async fidesops#1174 while generally adding `async` functionality to the ops codebase to avoid errors with the `fideslog` client calls. The changes in this PR don't impact this, since what was needed there was for the celery task function to be `async`; we can call the `async` celery task from `sync` endpoints, as we do in this PR, due to the `@sync` wrapper we have.
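
To make the event-loop point above concrete, here is a minimal stdlib-only sketch. It does not use FastAPI or any fides code: `blocking_work`, `heartbeat`, `async_handler`, `sync_handler_in_thread` and `count_ticks` are hypothetical names made up for illustration, and `run_in_executor` is only a stand-in for the threadpool that FastAPI uses for plain `def` endpoints. The heartbeat plays the role of "everything else the web server wants to do" while a request is being executed.

```python
import asyncio
import time


def blocking_work(seconds: float) -> None:
    """Hypothetical stand-in for request execution that never awaits."""
    time.sleep(seconds)


async def heartbeat(ticks: list, interval: float = 0.02) -> None:
    """Hypothetical stand-in for other work scheduled on the event loop."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def async_handler() -> None:
    # Like an `async def` endpoint: the body runs on the event loop itself,
    # so a blocking call here stalls everything else on that loop.
    blocking_work(0.3)


async def sync_handler_in_thread() -> None:
    # Like a plain `def` endpoint: the blocking call is handed to a worker
    # thread, and the event loop stays free while it runs.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, blocking_work, 0.3)


async def count_ticks(handler) -> int:
    """Return how many heartbeat ticks happen while `handler` is running."""
    ticks: list = []
    beat = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick
    already = len(ticks)
    await handler()
    beat.cancel()
    try:
        await beat
    except asyncio.CancelledError:
        pass
    return len(ticks) - already


if __name__ == "__main__":
    print("ticks during async handler:   ", asyncio.run(count_ticks(async_handler)))
    print("ticks during threaded handler:", asyncio.run(count_ticks(sync_handler_in_thread)))
```

In this sketch the heartbeat gets no ticks at all while the blocking `async` handler runs, and keeps ticking while the same work runs on a worker thread, which mirrors the server lock-up described above and why it goes away with `sync` endpoints.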