Better signal handling for proxy, stern #2715
Can't reproduce, what am I doing wrong? When I |
Can you please try |
Doesn't change anything; the handler still doesn't get caught. Anyway, that's beside the point, no? I'm installing a |
ah, i see: brig is calling |
This reverts commit a175577.
Nice catch! We don't actually have logs from last deployment to prod because they're gone. So, I guess we can either spam staging with a lot of requests and do a deployment and find out or, we could just go through all the services and see which ones don't use |
that was easy! there is only proxy who calls |
Not at all confident, but this might resolve it, no? Thanks for finding the issue, @akshaymankar, this was fun!
we should also add an hlint rule that forbids us to use |
services/cannon/src/Cannon/Run.hs
void $ installHandler sigTERM (signalHandler (env e) tid) Nothing
void $ installHandler sigINT (signalHandler (env e) tid) Nothing
runSettings s app `finally` do
runSettingsWithShutdown s app 5 `finally` do
I am not sure about the impact of this 5-second wait, so I will run a couple of experiments to check that the draining happens correctly. Earlier we were also not closing the listen socket, so the impact of that is another mystery to me.
I am being super careful because this functionality is important and we don't have good ways of testing it automatically.
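For anyone following the discussion, here is a minimal sketch of what a shutdown-aware runner could look like on top of warp. It is an illustration only: `runWithGrace` and `demoApp` are hypothetical names, and this is not claimed to be the actual `runSettingsWithShutdown` used in this PR. It shows the two things mentioned above: closing the listen socket when SIGINT/SIGTERM arrives, and then waiting a bounded number of seconds for in-flight requests before cancelling the server.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (Async, async, cancel, poll)
import Control.Concurrent.MVar (MVar, newEmptyMVar, takeMVar, tryPutMVar)
import Control.Exception (finally, throwIO)
import Control.Monad (void)
import Network.HTTP.Types (status200)
import Network.Wai (Application, responseLBS)
import Network.Wai.Handler.Warp
  ( Settings,
    defaultSettings,
    runSettings,
    setInstallShutdownHandler,
    setPort,
  )
import System.Posix.Signals (Handler (CatchOnce), installHandler, sigINT, sigTERM)

-- | Hypothetical helper (an assumption for illustration, not the PR's code):
-- run a warp server; on SIGINT/SIGTERM close the listen socket and give
-- requests that are already running up to 'graceSecs' seconds to finish.
runWithGrace :: Settings -> Application -> Int -> IO ()
runWithGrace settings app graceSecs = do
  latch <- newEmptyMVar
  let settings' = setInstallShutdownHandler (catchSignals latch) settings
  -- Release the latch as well if the server stops on its own.
  server <- async (runSettings settings' app `finally` void (tryPutMVar latch ()))
  takeMVar latch
  await server graceSecs
  where
    -- warp passes in the action that closes the listen socket.
    catchSignals :: MVar () -> IO () -> IO ()
    catchSignals latch closeListenSocket = do
      let shutdown = closeListenSocket >> void (tryPutMVar latch ())
      void (installHandler sigINT (CatchOnce shutdown) Nothing)
      void (installHandler sigTERM (CatchOnce shutdown) Nothing)

    -- Poll once per second until the server has drained or time runs out.
    await :: Async () -> Int -> IO ()
    await server secsLeft = do
      status <- poll server
      case status of
        Just (Left err) -> throwIO err
        Just (Right ()) -> pure ()
        Nothing
          | secsLeft > 0 -> threadDelay 1000000 >> await server (secsLeft - 1)
          | otherwise -> cancel server

-- | Trivial application, used only to make the sketch runnable.
demoApp :: Application
demoApp _req respond = respond (responseLBS status200 [] "ok")

main :: IO ()
main = runWithGrace (setPort 8080 defaultSettings) demoApp 5
```

With a runner shaped like this, the last argument is the grace period in seconds, so experimenting with a longer drain window only means changing that one number at the call site.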
Looks like this is closing all the websockets at the same time and then calling the drain script. I suspect this might be due to some cleanup in wai. Given that this place is for sure not our problem, I suggest that we revert the change in cannon; the rest of the changes can be applied.
Also applies hlint again to the whole codebase (excluding tests), as we had some drift between finalising hlint and new PRs being merged without being linted / having CI catch those cases. I also disabled pipefail in the script, as that would short-circuit the linter on the first issue found. Hopefully that doesn't mess with CI. PS: This will fail the CI linters phase until #2715 has been merged.
After running hlint with the changes from #2718 it found a possible spot where this is also not used, in Federator.Response.
I also suspect that the reason for the 502s is that we have a 5-second grace period for requests which are already running; maybe we can bump it to 30s. Especially in cargohold, where uploads could take some time to finish.
unrelated.
I'll do that 👍
Looks good. Please update the changelog though!
Co-authored-by: Akshay Mankar <[email protected]>
* Add new custom hlint rule for runSetting. Also applies hlint again to the whole codebase (excluding tests), as we had some drift between finalising hlint and new PRs being merged without being linted / having CI catch those cases. I also disabled pipefail in the script, as that would short-circuit the linter on the first issue found. Hopefully that doesn't mess with CI. PS: This will fail the CI linters phase until #2715 has been merged.
* Removed Federator.Response from runSettings rule.
https://wearezeta.atlassian.net/browse/SQPIT-1431
Checklist
changelog.d