shutdown and startup timeouts too small (systemd) #7264

Closed
RubenKelevra opened this issue May 1, 2020 · 1 comment
Labels
kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization

Comments

@RubenKelevra
Contributor

Version information:

go-ipfs version: 0.6.0-dev
Repo version: 9
System version: amd64/linux
Golang version: go1.14.2

Revision 826826c

Description:

Shutting down via systemd did not complete in time. After the default stop timeout (90 seconds, per the log below), systemd killed the service:

Apr 26 11:56:42 vidar.pacman.store ipfs[6306]: Daemon is ready
Mai 01 15:15:11 vidar.pacman.store systemd[1]: Stopping InterPlanetary File System (IPFS) daemon...
Mai 01 15:15:11 vidar.pacman.store ipfs[6306]: Received interrupt signal, shutting down...
Mai 01 15:15:11 vidar.pacman.store ipfs[6306]: (Hit ctrl-c again to force-shutdown the daemon.)
Mai 01 15:15:12 vidar.pacman.store ipfs[6306]: 2020-05-01T15:15:11.339+0300        ERROR        reprovider.simple        simple/reprovide.go:80  ...canceled
Mai 01 15:16:41 vidar.pacman.store systemd[1]: [email protected] stop-sigterm timed out. Killing.
Mai 01 15:16:41 vidar.pacman.store systemd[1]: [email protected]: main process exited, code=killed, status=9/KILL
Mai 01 15:16:41 vidar.pacman.store systemd[1]: Stopped InterPlanetary File System (IPFS) daemon.
Mai 01 15:16:41 vidar.pacman.store systemd[1]: Unit [email protected] entered failed state.
Mai 01 15:16:41 vidar.pacman.store systemd[1]: [email protected] failed.

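I suspect that something in the badgerdb is broken after the kill, so ipfs locks up for a long while on the next start. Here, about 5 minutes pass between "Initializing daemon..." and "Daemon is ready":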

Mai 01 15:41:55 vidar.pacman.store ipfs[317]: Initializing daemon...
Mai 01 15:41:55 vidar.pacman.store ipfs[317]: go-ipfs version: 0.6.0-dev
Mai 01 15:41:55 vidar.pacman.store ipfs[317]: Repo version: 9
Mai 01 15:41:55 vidar.pacman.store ipfs[317]: System version: amd64/linux
Mai 01 15:41:55 vidar.pacman.store ipfs[317]: Golang version: go1.14.2
Mai 01 15:46:54 vidar.pacman.store ipfs[317]: Swarm listening on /ip4/10.176.233.122/tcp/443
Mai 01 15:46:54 vidar.pacman.store ipfs[317]: Swarm listening on /ip4/127.0.0.1/tcp/443
Mai 01 15:46:54 vidar.pacman.store ipfs[317]: Swarm listening on /ip4/127.0.0.1/tcp/443
Mai 01 15:46:54 vidar.pacman.store ipfs[317]: Swarm listening on /ip4/94.176.233.122/tcp/443
Mai 01 15:46:54 vidar.pacman.store ipfs[317]: Swarm listening on /ip6/2a02:7b40:5eb0:e97a::1/tcp/443
Mai 01 15:46:54 vidar.pacman.store ipfs[317]: Swarm listening on /ip6/::1/tcp/443
Mai 01 15:46:54 vidar.pacman.store ipfs[317]: Swarm announcing /ip4/94.176.233.122/tcp/443
Mai 01 15:46:54 vidar.pacman.store ipfs[317]: Swarm announcing /ip6/2a02:7b40:5eb0:e97a::1/tcp/443
Mai 01 15:46:54 vidar.pacman.store ipfs[317]: API server listening on /ip4/127.0.0.1/tcp/5001
Mai 01 15:46:54 vidar.pacman.store ipfs[317]: WebUI: http://127.0.0.1:5001/webui
Mai 01 15:46:54 vidar.pacman.store ipfs[317]: Gateway (readonly) server listening on /ip4/127.0.0.1/tcp/8080
Mai 01 15:46:54 vidar.pacman.store ipfs[317]: Daemon is ready

This is a problem with the new "notify"-type systemd.service file: systemd gives the daemon only 1 minute and 30 seconds to start up, not the 5 minutes it needed here.

We need to either increase the timeout in the service file or, if possible, make this recovery finish faster after a crash.

Also, the shutdown timeout is probably too short. Not sure what a sensible timeout would be there.
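
As a rough illustration of the first option, a systemd drop-in could raise both timeouts. This is only a sketch: the unit name `ipfs.service` in the path is a placeholder (the real unit name is not legible in the logs above), and the 15-minute values are assumptions chosen to leave headroom over the roughly 5-minute start seen here, not measured requirements. `TimeoutStartSec=` and `TimeoutStopSec=` are standard `[Service]` settings.

```ini
# /etc/systemd/system/ipfs.service.d/override.conf
# NOTE: "ipfs.service" is a placeholder; substitute the actual unit name.
[Service]
# The start above took about 5 minutes; 15min is an assumed safety margin.
TimeoutStartSec=15min
# The shutdown exceeded 90 seconds; 15min is a guess, not a known-good value.
TimeoutStopSec=15min
```

After adding such a drop-in, `systemctl daemon-reload` makes systemd pick it up.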

I aborted ipfs once after 2 minutes to see where it got stuck. The stack trace is attached.

ipfs_stack_trace_2020-05-01.txt

@RubenKelevra RubenKelevra added kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization labels May 1, 2020
@RubenKelevra
Contributor Author

Closed in favour of the more specific bug report #7273
