-
-
Notifications
You must be signed in to change notification settings - Fork 14.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nixos.gitlab: loosen the coupling of gitlab services to postgresql and redis #286532
nixos.gitlab: loosen the coupling of gitlab services to postgresql and redis #286532
Conversation
This currently makes the gitlab tests fail, I could use some help figuring out how to resolve this. gitaly fails to restart due to a missing
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable. Sorry about the amount of separate reviews. I have apparently forgotten how to do this properly on GitHub.. Lastly, though:
The tests fail because gitlab-config.service
isn't properly stopped before it's started again. The issue seems to be that stopping a target, in this case gitlab.target
, is not a blocking operation. It probably had time to stop when more had to be done before postgresql.service
could be stopped, and therefore this wasn't apparent. Adding gitlab-config.service
to the list of units to stop before removing the database in the gitlab test solves this issue. Here's a patch:
--- a/nixos/tests/gitlab.nix
+++ b/nixos/tests/gitlab.nix
@@ -419,7 +419,7 @@ in {
gitlab.systemctl("start gitlab-backup.service")
gitlab.wait_for_unit("gitlab-backup.service")
gitlab.wait_for_file("${nodes.gitlab.services.gitlab.statePath}/backup/dump_gitlab_backup.tar")
- gitlab.systemctl("stop postgresql.service gitlab.target")
+ gitlab.systemctl("stop postgresql.service gitlab-config.service gitlab.target")
gitlab.succeed(
"find ${nodes.gitlab.services.gitlab.statePath} -mindepth 1 -maxdepth 1 -not -name backup -execdir rm -r {} +"
)
45af7df
to
85423d5
Compare
@talyz Thanks for catching the bug in the test. The binding of gitlab-config to gitlab-db-config apparently was aimed in the right direction, as I did that to ensure gitlab-config is stopped when gitlab is stopped after backup and thus ready to be started again and initialise files at the next start. |
Regarding release notes, I consider this to just be a change in the under-the-hood mechanics and not worth a mention there. |
…s/ redis to avoid restarts and races Gitlab stays running at redis and postgresql restarts as if these components were on a different host anyways. Handling reconnetctions is part of the application logic. Co-authored-by: Kim Lindberger <[email protected]> for formatting fixes and test failure debugging.
85423d5
to
13ba002
Compare
Tests are now passing for me as well. |
Successfully created backport PR for |
Description of changes
This reduces the strong degree of coupling between Gitlab and its supporting 3rd party services redis and postgresql.
It's a potential solution for #286530.
Background:
When Gitlab and its database postgresql are running on the same system, its systemd services are strongly coupled via
binds-to
relations to ensure start/stop/restart of the individual services propagates to the other services.We already had some slight issues with this in the past and fixed the reliability of coupled restarts in #245240. But since then, we still have seen some additional hard-to-reproduce race conditions that show the approach of strongly-coupled service restarts is brittle here. In one case when postgresql needed to restart, Gitlab was shut down while trying to start and it stayed inactive.
As a consequence, I propose to drop the approach of strong coupling between these services altogether and let the services postgresql or redis restart without forcing a restart of gitlab.service. This just causes the running service to temporarily loose its database connections, but regain them after a successful restart of the db services. Given the long startup times of gitlab.service, this actually even reduces the downtime caused by a database restart in many cases.
Reasoning/ why this decoupling is fine:
By design, both redis and postgresql can be hosted on another host than the gitlab service itself. The NixOS module actually supports this for postgresql, but not for redis (as a dependency of sidekiq). Still, Gitlab and Sidekiq are designed to cope with temporary unavailabilities of redis and postgresql and do recover from such situations automatically.
Things done
nix.conf
? (See Nix manual)sandbox = relaxed
sandbox = true
nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"
. Note: all changes have to be committed, also see nixpkgs-review usage./result/bin/
)Add a 👍 reaction to pull requests you find important.