Remove stale pidfile if it exists
The postmaster refuses to start if the PID recorded in the pidfile is currently
in use by the same OS user. This protection is not strict enough in a container
environment, however, because we only see the PIDs in our own PID namespace.
The volume containing the data directory could accidentally be mounted
inside multiple containers, so relying on visibility of the PID is not enough.
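
To illustrate why PID visibility is a weak signal, here is a minimal sketch of such a check. The helper name `pidfile_pid_visible` is hypothetical and not part of this commit; it assumes only that the first line of `postmaster.pid` holds the postmaster's PID. A postmaster running in another container lives in a different PID namespace, so a check like this can report "not running" while that postmaster is still alive.

```shell
#!/bin/sh
# Hypothetical helper (not part of this commit): report whether the PID
# recorded in a pidfile is visible from the current PID namespace.
# Returns 0 if visible, 1 if not visible, 2 if the pidfile is missing or malformed.
pidfile_pid_visible() {
    pidfile="$1"
    # The first line of postmaster.pid holds the postmaster's PID.
    pid="$(head -n 1 "$pidfile" 2>/dev/null)" || pid=""
    case "$pid" in
        ''|*[!0-9]*) return 2 ;;   # missing or malformed pidfile
    esac
    # kill -0 sends no signal; it only tests whether the process exists and
    # may be signalled by us. It cannot see processes in other PID namespaces.
    if kill -0 "$pid" 2>/dev/null; then
        return 0
    fi
    return 1
}
```

A result of 1 here means only that no such process is visible locally, which is exactly the gap the commit closes by removing the pidfile unconditionally.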

There is only one way for us to communicate with another postmaster (possibly in another container)
on the same $PGDATA: by removing the pidfile.

The other postmaster will shut down immediately once it notices that its
pidfile has been removed. This is a Very Good Thing: it prevents multiple postmasters
on the same directory, even in a container environment.
The postmaster checks for its pidfile about once a minute, hence the 65-second sleep.
See also https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7e2a18

The downside of this change is that it delays the startup of a crashed container by 65 seconds;
as we're dealing with data, we choose correctness over uptime in this instance.
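
The remove-then-wait step described above can be sketched as a small function. This is an illustration under stated assumptions, not the entrypoint code itself: the name `remove_stale_pidfile` and its second argument are hypothetical, and `echo` stands in for the script's own `log` helper. The default wait of 65 seconds matches the `sleep 65` in this commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of this commit): remove a leftover pidfile,
# then wait long enough for any other postmaster on the same data directory
# to notice the removal and shut down.
remove_stale_pidfile() {
    pgdata="$1"
    wait_seconds="${2:-65}"          # assumption: 65 mirrors the commit's sleep
    pidfile="${pgdata}/postmaster.pid"

    # Nothing to do when no pidfile was left behind.
    [ -f "$pidfile" ] || return 0

    echo "Removing stale pidfile ..."
    rm -f "$pidfile"
    echo "Sleeping ${wait_seconds}s to ensure no other postmaster is running anymore"
    sleep "$wait_seconds"
}
```

Calling `remove_stale_pidfile "${PGDATA}"` before starting Patroni would behave like the block added in the diff; making the wait a parameter only makes the sketch easier to exercise with a short delay.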
feikesteenbergen committed Mar 20, 2020
1 parent 063b0cf commit 26d75a1
Showing 2 changed files with 24 additions and 0 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
These are changes that will probably be included in the next release.

### Added
* Remove stale pidfile if it exists
### Changed
### Removed
### Fixed
23 changes: 23 additions & 0 deletions timescaledb_entrypoint.sh
@@ -24,4 +24,27 @@ install -m 0700 -d "${PGDATA}"
python3 /scripts/augment_patroni_configuration.py /home/postgres/postgres.yml
}

if [ -f "${PGDATA}/postmaster.pid" ]; then
    # The postmaster refuses to start if the PID recorded in the pidfile is currently
    # in use by the same OS user. This protection is not strict enough in a container
    # environment, however, because we only see the PIDs in our own PID namespace.
    # The volume containing the data directory could accidentally be mounted
    # inside multiple containers, so relying on visibility of the PID is not enough.
    #
    # There is only one way for us to communicate with another postmaster (possibly in
    # another container) on the same $PGDATA: by removing the pidfile.
    #
    # The other postmaster will shut down immediately once it notices that its
    # pidfile has been removed. This is a Very Good Thing: it prevents multiple postmasters
    # on the same directory, even in a container environment.
    # See also https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7e2a18
    #
    # The downside of this change is that it delays the startup of a crashed container;
    # as we're dealing with data, we choose correctness over uptime in this instance.
    log "Removing stale pidfile ..."
    rm "${PGDATA}/postmaster.pid"
    log "Sleeping a little to ensure no other postmaster is running anymore"
    sleep 65
fi

exec patroni /home/postgres/postgres.yml
