Binary upgrade
Replace the pg_doorman binary on a running server. Idle clients are handed to
the new process over a Unix socket together with their cancel keys and
prepared-statement cache, so they keep using the same TCP connection without
reconnecting. Clients inside a transaction finish on the old process and
migrate the moment they become idle. Operators get a kill -USR2 and an exit
status; applications get neither a reconnect storm nor a wave of
auth/SCRAM handshakes against PostgreSQL.
PgBouncer's online restart (-R, deprecated since 1.20; or so_reuseport
rolling restart) and Odyssey's online restart (SIGUSR2 +
bindwith_reuseport) share one pattern: the new process
picks up new connections, and the old one drains until its existing
clients disconnect. Sessions, prepared statements, and TLS state never
cross processes. pg_doorman instead migrates the live socket via
SCM_RIGHTS, plus the cipher state with the tls-migration build (Linux,
opt-in).
Quick start
On hosts where pg_doorman comes from apt install pg-doorman /
dnf install pg-doorman, use the package manager for the binary
replacement. apt-get install --only-upgrade pg-doorman or
dnf upgrade pg-doorman is the idiomatic devops path. The manual
install below is for direct-binary deployments where no package
manager is in scope.
# 1. Install the new binary at the path used by the running service.
install -m 0755 pg_doorman_new /usr/bin/pg_doorman
# 2. Validate the new binary against the live config before triggering
# the upgrade. SIGUSR2 also runs `-t` and aborts on failure, but
# catching it here gives you a chance to fix the config without
# touching the running server.
/usr/bin/pg_doorman -t /etc/pg_doorman/pg_doorman.toml
# 3. Trigger the upgrade. With `ExecReload=/bin/kill -SIGUSR2 $MAINPID`
# in the unit, `systemctl reload` sends SIGUSR2 to start binary
# upgrade. pg_doorman then validates config, starts the child,
# migrates state where possible, and drains the old process.
# systemd delivers the signal to the tracked MainPID, so this
# targets the single correct process even when other pg_doorman
# instances are running on the host. Direct
# `kill -USR2 $(pgrep -f /usr/bin/pg_doorman)` works but matches by
# command line and can hit every instance, which is why packaged
# installs go through systemctl.
sudo systemctl reload pg_doorman.service
# A successful reload only means systemd delivered SIGUSR2. Validation,
# child startup, MAINPID handoff, and client migration happen inside
# pg_doorman. Verify them in the next step and in the logs.
# 4. Verify: systemd tracks the new MainPID (Type=notify receives
# `MAINPID=<new_pid>` from the child during the handoff). Active
# state and the admin console confirm clients are still attached.
systemctl show -p MainPID --value pg_doorman.service
psql -h pgdoorman -p 6432 -c 'SHOW POOLS;' # served by the new process
If the unit is not running under systemd, read the PID file the daemon
writes (daemon_pid_file, default /tmp/pg_doorman.pid) instead of
parsing pgrep output. For example, with daemon_pid_file set to
/var/run/pg_doorman/pg_doorman.pid:
kill -USR2 "$(cat /var/run/pg_doorman/pg_doorman.pid)".
Foreground deployments not managed by systemd should keep the PID of
the supervising process and signal that one directly.
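The PID-file path can be wrapped in a small guard so that a stale or malformed file never results in a signal to the wrong target. This is a minimal sketch, not part of pg_doorman: the function name `signal_from_pidfile` is hypothetical, and it assumes the PID file contains a single numeric PID.

```bash
#!/usr/bin/env bash
# signal_from_pidfile PID_FILE [SIGNAL]
# Hypothetical helper: reads a PID from PID_FILE, checks that it is
# numeric and that the process exists, then sends SIGNAL (default USR2).
signal_from_pidfile() {
    local pid_file="$1"
    local sig="${2:-USR2}"
    local pid
    [ -r "$pid_file" ] || { echo "cannot read $pid_file" >&2; return 1; }
    # Strip whitespace and newlines that some writers leave in the file.
    pid="$(tr -d '[:space:]' < "$pid_file")"
    case "$pid" in
        ''|*[!0-9]*) echo "not a PID: '$pid'" >&2; return 1 ;;
    esac
    # kill -0 only probes for existence; it also fails without permission
    # to signal the process, so run this as the service user or root.
    kill -0 "$pid" 2>/dev/null || { echo "no such process: $pid" >&2; return 1; }
    kill -s "$sig" "$pid"
}

# Example (path is whatever daemon_pid_file is set to):
#   signal_from_pidfile /var/run/pg_doorman/pg_doorman.pid USR2
```

A PID file can outlive its process, so the existence check reduces, but does not eliminate, the chance of signalling a recycled PID.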
The same upgrade can be triggered from the admin console:
UPGRADE;
UPGRADE sends SIGUSR2 to the running process, which is the same code
path as kill -USR2. A successful command response means the signal was
sent, not that validation and migration have finished.
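Because neither `systemctl reload` nor `UPGRADE;` reports whether the handoff completed, a deployment script may want to wait until the tracked PID actually changes. The sketch below is one way to do that; `wait_for_pid_change` is a hypothetical helper, and the command that prints the current PID is passed in by the caller.

```bash
#!/usr/bin/env bash
# wait_for_pid_change OLD_PID TIMEOUT_SECONDS GETTER [ARGS...]
# Hypothetical helper: runs GETTER once per second until it prints a
# non-empty, non-zero PID different from OLD_PID. Prints the new PID and
# returns 0 on success; returns 1 once TIMEOUT_SECONDS have elapsed.
wait_for_pid_change() {
    local old_pid="$1"
    local timeout="$2"
    shift 2
    local elapsed=0
    local current
    while :; do
        current="$("$@" 2>/dev/null)" || current=""
        if [ -n "$current" ] && [ "$current" != "0" ] && [ "$current" != "$old_pid" ]; then
            printf '%s\n' "$current"
            return 0
        fi
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}

# Example with the unit from the quick start:
#   old="$(systemctl show -p MainPID --value pg_doorman.service)"
#   sudo systemctl reload pg_doorman.service
#   wait_for_pid_change "$old" 15 \
#       systemctl show -p MainPID --value pg_doorman.service
```

A timeout here does not by itself mean the upgrade failed; a failed config validation also leaves the PID unchanged, so the logs remain the authoritative source.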
How the upgrade works
                          SIGUSR2
                             |
                             v
                 +-----------------------+
                 | 1. Validate config    |
                 |    (pg_doorman -t)    | -- fail --> abort, keep serving
                 +-----------+-----------+
                             |
                             v
                 +-----------------------+
                 | 2. Spawn new process  |
                 |    socketpair()       |
                 |    inherit-fd         |
                 |    readiness pipe     | -- wait up to 10s
                 +-----------+-----------+
                             |
               +-------------+-------------+
               |                           |
               v                           v
    +---------------------+    +---------------------+
    | OLD process         |    | NEW process         |
    |                     |    |                     |
    | 3. Idle clients     |    | migration_receiver  |
    |    serialize state  +--->+   reconstruct       |
    |    dup() + SCM_RIGHTS    |   spawn client      |
    |                     |    |   handle()          |
    | 4. In-tx clients    |    |                     |
    |    finish tx        |    | Accepts new conns   |
    |    migrate on idle  +--->+                     |
    |                     |    |                     |
    | 5. Shutdown timer   |    +---------------------+
    |    poll 250ms       |
    |    exit when empty  |
    +---------------------+
Phase 1: Config validation
The running process executes the same binary path it was started with,
using -t and the current config file. After the install in the quick
start, that path points to the new binary, so the check validates the
binary that will take over. If validation fails, the upgrade is aborted
and the old process keeps serving traffic. An error banner appears in
the logs:
!!! BINARY UPGRADE ABORTED - SHUTDOWN CANCELLED !!!
!!! FIX THE CONFIGURATION BEFORE ATTEMPTING BINARY UPGRADE AGAIN !!!
!!! THE SERVER WILL CONTINUE RUNNING WITH THE CURRENT BINARY !!!
Phase 2: Spawn new process
Foreground mode:
- A Unix socketpair() is created for client migration.
- The listener fd passes to the child via --inherit-fd.
- A pipe signals readiness: the parent waits up to 10 seconds for a single byte. Once the child starts and begins accepting, it writes to the pipe.
- The parent closes its listener -- new connections go to the child.
Daemon mode:
A new daemon process starts. The old daemon closes its listener.
Client migration via socketpair is not used — existing clients
stay on the old process. When shutdown_timeout expires, the old
process exits and any remaining client sockets close. Use foreground
mode if clients must migrate to the new process.
Phase 3: Idle client migration (foreground)
When MIGRATION_IN_PROGRESS is set, each idle client (not in a
transaction, no pending deferred BEGIN, no buffered reads)
migrates:
- Serialize: connection_id, secret_key, pool name, username, server parameters, full prepared statement cache.
- dup() + SCM_RIGHTS: the TCP socket fd is duplicated and sent to the new process over the Unix socketpair.
- Reconstruct: the new process rebuilds the Client struct, assigns it to the correct pool, and calls handle().
The client sees no interruption. No reconnect, no error, no re-authentication. The TCP connection is the same physical socket.
Phase 4: In-transaction client drain
A client inside BEGIN ... COMMIT continues running on the old
process. Its server connection stays alive. After the transaction
ends (COMMIT or ROLLBACK), the client becomes idle and migrates
on the next loop iteration.
A deferred BEGIN (no server checked out yet) also blocks migration.
The client must send a query (flushing the deferred BEGIN) and then
COMMIT before it can migrate.
Phase 5: Shutdown timer
The shutdown timer polls CURRENT_CLIENT_COUNT every 250 ms. When
all clients have migrated or disconnected, the old process calls
process::exit(0).
If shutdown_timeout elapses before all clients finish, the old
process exits regardless -- force-closing remaining connections.
During migration, drain_all_pools() is deferred. In-transaction
clients still need their server connections. Pool draining starts
only after migration completes or when MIGRATION_IN_PROGRESS
is cleared.
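From the outside, the end of the drain is simply the old PID disappearing. The following sketch waits for that with the same 250 ms cadence as the shutdown timer; `wait_for_exit` is a hypothetical helper, and it assumes a `sleep` that accepts fractional seconds (GNU coreutils and BusyBox do).

```bash
#!/usr/bin/env bash
# wait_for_exit PID TIMEOUT_SECONDS
# Hypothetical helper: polls every 250 ms until PID no longer exists.
# Returns 0 when the process is gone, 1 if it is still alive after
# TIMEOUT_SECONDS.
wait_for_exit() {
    local pid="$1"
    local timeout_s="$2"
    local remaining_ticks=$((timeout_s * 4))   # four 250 ms polls per second
    # kill -0 sends no signal; it only checks that the PID can be signalled.
    # It also fails on a permission error, so run as the service user or root.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$remaining_ticks" -le 0 ]; then
            return 1
        fi
        sleep 0.25
        remaining_ticks=$((remaining_ticks - 1))
    done
    return 0
}

# Example: give the old process shutdown_timeout plus a small margin.
#   wait_for_exit "$old_pid" 65 || echo "old process still running" >&2
```

Choosing a timeout slightly above shutdown_timeout means a non-zero result points at a process that did not honor its own deadline rather than at a normal drain.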
Prepared statements
Each client's prepared statement cache is serialized during migration:
- Statement key (named or anonymous hash)
- Query hash
- Full query text
- Parameter type OIDs
In the new process:
- Each entry is registered in the pool-level shared cache (DashMap).
- Server backends are fresh -- they have no prepared statements.
- On the first Bind to a migrated statement, pg_doorman transparently sends Parse to the new backend. The client does not see this extra round-trip.
Limits:
- If the new config has a smaller client_anonymous_prepared_cache_size, excess Anonymous entries are evicted (LRU). Named entries are unbounded and survive in full. The remaining entries work normally.
- Anonymous prepared statements (empty-name Parse) survive migration but require a re-Parse before Bind in the new process.
- DEALLOCATE ALL after migration clears the transferred cache. Re-Parse with the same name uses the new query text.
TLS migration
By default, TLS clients cannot be migrated -- the encrypted session
requires key material that lives inside the OpenSSL state machine.
These clients drain during upgrade: their connection is closed when
shutdown_timeout expires, and the client reconnects to the new
process.
The opt-in tls-migration feature solves this. A patched OpenSSL
exports the symmetric cipher state, the old process passes it alongside
the fd over the Unix socket, and the new process imports it to resume
encryption mid-stream. The client does not re-handshake.
What gets exported
The patch adds SSL_export_migration_state() and
SSL_import_migration_state() to OpenSSL 3.5.5. Exported data:
- TLS protocol version
- Cipher suite ID and tag length
- Read/write symmetric keys (AES key schedule input, not expanded)
- Read/write IVs (nonce)
- Read/write sequence numbers (8 bytes each)
- For TLS 1.3: server and client application traffic secrets
This is enough to reconstruct the record layer in the new process and continue encrypting/decrypting on the same TCP connection.
Building with TLS migration
cargo build --release --features tls-migration
Requires perl and patch in the build environment. Vendored
OpenSSL 3.5.5 compiles from source with the migration patch applied.
Offline builds
# Download the tarball in advance
curl -fLO https://github.com/openssl/openssl/releases/download/openssl-3.5.5/openssl-3.5.5.tar.gz
# Build with the local tarball
OPENSSL_SOURCE_TARBALL=./openssl-3.5.5.tar.gz \
cargo build --release --features tls-migration
SHA-256 is verified automatically.
Restrictions
- Linux only. macOS and Windows use platform-native TLS (Security.framework / SChannel), not OpenSSL. TLS migration is not possible with native-tls backends.
- Same certificates. Both processes must use the same tls_private_key and tls_certificate. The cipher state is bound to the SSL_CTX created from the certificate. Changed certificates cause import failure and client disconnection.
- FIPS incompatible. Vendored OpenSSL is not FIPS-validated. For FIPS compliance, build without tls-migration (TLS clients drain instead of migrating).
- No HSM/PKCS#11. Vendored OpenSSL is built with no-engine.
Known limitations
- TLS 1.3 KeyUpdate changes cipher keys. If either side sends a KeyUpdate message after the cipher state was exported, the imported keys become invalid and the connection will fail with AEAD authentication errors.
  Driver-specific behavior (verified April 2026):
  | Driver | Auto KeyUpdate? | Risk |
  |---|---|---|
  | libpq (psql, pgbench) | No (OpenSSL does not auto-send) | None |
  | asyncpg (Python) | No (Python ssl wraps OpenSSL) | None |
  | node-postgres | No (Node.js tls wraps OpenSSL) | None |
  | Npgsql (.NET) | No (SslStream has no KeyUpdate API) | None |
  | pgjdbc (Java) | Yes (JSSE sends after ~128 GB, jdk.tls.keyLimits) | High |
  | tokio-postgres (rustls) | Yes (rustls rotates at AEAD limit) | Medium |
  | PostgreSQL server | No (renegotiation disabled, no KeyUpdate calls) | None |
  Java clients: JSSE automatically sends KeyUpdate after ~128 GB of encrypted data per connection. JDK bug JDK-8329548 can cause a storm of KeyUpdate messages. For Java clients with long-lived, high-throughput connections, TLS migration may lose connections after the threshold. Workaround: increase the threshold via jdk.tls.keyLimits in java.security, or disable TLS between client and pg_doorman for Java workloads.
  Rust clients with rustls: rustls tracks AEAD usage and rotates keys at cipher suite limits (very high threshold, ~2^36 records for AES-GCM). Unlikely to hit in practice for PostgreSQL workloads. Using the native-tls (OpenSSL) backend instead of rustls eliminates the risk.
  All OpenSSL-based drivers are safe. OpenSSL explicitly does not perform automatic key updates (openssl#23566).
- SSL_pending data not checked. The migration happens at the idle point, where no application data is buffered. The idle-point invariant guarantees this, but there is no explicit SSL_pending() assertion.
- Tied to OpenSSL 3.5.5. The patch modifies internal OpenSSL structures (ssl_local.h, rec_layer_s3.c, ssl_lib.c). Upgrading OpenSSL requires reviewing and re-applying the patch against the new version.
Signal reference
| Signal | Behavior |
|---|---|
| SIGUSR2 | Binary upgrade + old-process drain. Recommended for all modes. |
| SIGINT | Foreground + TTY (Ctrl+C): shutdown only, no upgrade. Daemon / non-TTY: binary upgrade (legacy compatibility). |
| SIGTERM | Immediate exit. Active transactions are killed. All clients disconnected. |
| SIGHUP | Reload configuration without restart. No downtime. |
| UPGRADE (admin) | Sends SIGUSR2 to the current process internally. Same effect. |
SIGINT triggers binary upgrade in daemon mode or without a TTY (e.g. when spawned by systemd). In an interactive terminal, Ctrl+C stops the process cleanly without spawning a new one. Use kill -USR2 or the UPGRADE admin command for binary upgrade in foreground mode.
Daemon vs foreground
| | Foreground | Daemon |
|---|---|---|
| Client migration via fd passing | Yes (socketpair) | No |
| Idle clients preserved | Yes | No (closed when old process exits) |
| In-tx clients | Finish tx, then migrate | Finish tx until timeout, then close |
| New process startup | Inherits listener fd | Starts independently |
| Recommended for | systemd, containers, k8s | Legacy deployments |
For zero-downtime upgrades with client migration, run in foreground
mode. systemd manages the process lifecycle. Use Type=notify so the
unit reaches active only after pg_doorman signals readiness, and the
child process can update MainPID to itself during SIGUSR2 upgrades:
[Service]
Type=notify
# The child process that takes over on SIGUSR2 must be allowed to send
# READY=1 and MAINPID=<new_pid> during handoff.
NotifyAccess=exec
ExecStart=/usr/bin/pg_doorman /etc/pg_doorman/pg_doorman.toml
# `systemctl reload` triggers binary upgrade: validate config, spawn
# the new process, migrate clients where possible, then drain the old
# process according to pg_doorman's shutdown_timeout.
ExecReload=/bin/kill -SIGUSR2 $MAINPID
# `systemctl stop` is immediate shutdown. It is not a binary upgrade
# path and it does not wait for active transactions to migrate.
ExecStop=/bin/kill -SIGTERM $MAINPID
# During binary upgrade the new child becomes MainPID via sd_notify.
# With KillMode=mixed, systemd sends SIGTERM only to MainPID on stop
# and SIGKILLs remaining cgroup processes only after TimeoutStopSec.
KillMode=mixed
TimeoutStopSec=60
# Do not restart after a clean manual stop or after the old process exits
# successfully during binary upgrade.
Restart=on-failure
Nice=-15
# pg_doorman is connection-heavy: each client + each backend uses an
# fd, plus internal pipes. 65536 covers most OLTP pools; size it from
# `general.pool_size * num_pools` plus a few thousand for clients.
LimitNOFILE=65536
# Run as a non-privileged service account that owns the PID file. On
# many deployments postgres already exists; reusing it keeps file
# ownership consistent with PostgreSQL itself.
User=postgres
Group=postgres
SyslogIdentifier=pg_doorman
systemctl reload pg_doorman sends SIGUSR2; a zero exit status only
means the signal was delivered. pg_doorman then runs -t on the new
binary, cancels the upgrade if the config is bad, otherwise spawns the
new process and drains the old one. UPGRADE; from the admin console
reaches the same code path. The drain window is controlled by
shutdown_timeout in pg_doorman.toml; TimeoutStopSec controls normal
systemctl stop, not how long systemctl reload waits for migrated
sessions.
Production deployments often layer more resource controls on top of the
above, such as MemoryMax= and CPUAffinity=2,3,4,5,6,7,8,9. These are
workload-specific and orthogonal to the upgrade contract.
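The unit settings that the upgrade depends on can be checked mechanically before the first production reload. The sketch below reads `KEY=VALUE` lines, in the form `systemctl show` prints them, from standard input; `check_unit_props` is a hypothetical helper, and the expected values are the ones from the unit shown above.

```bash
#!/usr/bin/env bash
# check_unit_props
# Hypothetical helper: reads KEY=VALUE lines on stdin and verifies the
# four settings the binary upgrade relies on. Prints each mismatch to
# stderr and returns 1 if any expected line is absent.
check_unit_props() {
    local props
    props="$(cat)"
    local want
    local missing=0
    for want in "Type=notify" "NotifyAccess=exec" "KillMode=mixed" "Restart=on-failure"; do
        # -x matches whole lines, -F treats the pattern as a fixed string.
        if ! printf '%s\n' "$props" | grep -qxF "$want"; then
            printf 'missing or different: %s\n' "$want" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example:
#   systemctl show -p Type -p NotifyAccess -p KillMode -p Restart \
#       pg_doorman.service | check_unit_props
```

ExecReload is left out on purpose: `systemctl show` renders it as a structured value rather than the literal command line, so it is easier to confirm with `systemctl cat pg_doorman.service`.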
Configuration
shutdown_timeout
Maximum time to wait for in-transaction clients before force-closing connections. The old process exits after this timeout regardless of remaining clients.
Default: 10 seconds.
For production with long-running analytics queries: 30-60 seconds.
[general]
shutdown_timeout = 60000 # milliseconds
Setting it too low risks killing active transactions. Setting it too high delays the old process exit when a client is stuck (e.g., idle-in-transaction). Choose a value that covers your longest expected transaction, plus margin.
tls_private_key / tls_certificate
For the tls-migration feature to succeed, both the old and the new
process must load the same client-facing certificate and private key.
The cipher state is bound to the SSL_CTX created from those files,
and import fails on mismatch — affected clients drop and reconnect.
Client-facing TLS material is not reloaded on SIGHUP (only
server-facing certificates are; see Hot reload of server TLS).
Do not combine client-facing certificate rotation with an upgrade where
you expect TLS sessions to migrate. If the files change between old and
new process, TLS import fails and affected clients reconnect even with
tls-migration enabled. Rotate the client-facing certificate in a
maintenance window where reconnects are acceptable, or keep the same
certificate files for the binary upgrade and rotate later with a restart.
prepared_statements_cache_size
Pool-level prepared statement cache. Does not directly affect migration, but the pool cache in the new process must be large enough to hold entries registered by migrated clients.
client_anonymous_prepared_cache_size
Per-client Anonymous prepared statement LRU. The client's full cache (both Named and Anonymous) is serialized during migration. If the new config has a smaller value, only Anonymous entries are subject to LRU eviction; Named entries are unbounded and migrate intact.
Rollback
Binary upgrade has no separate undo path. Roll back by staging the
previous binary at the same path and running another SIGUSR2 upgrade.
If validation fails, the current process keeps serving traffic. If the
new process already took over, treat the rollback as a normal binary
upgrade in the opposite direction.
Avoid systemctl restart or SIGTERM for rollback unless reconnects
are acceptable: both close client sessions instead of migrating them.
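Since rollback is just another upgrade, it helps to keep the previous binary next to the live one whenever a new build is staged. The following is a sketch under assumptions, not a documented pg_doorman workflow: `stage_binary` and the `.prev` suffix are hypothetical, and the only pg_doorman behavior relied on is the `-t <config>` validation used in the quick start.

```bash
#!/usr/bin/env bash
# stage_binary NEW_BINARY TARGET_PATH CONFIG
# Hypothetical helper: keeps a copy of the current binary as
# TARGET_PATH.prev, installs NEW_BINARY at TARGET_PATH, and validates it
# with `-t CONFIG`. If validation fails, the previous binary is put back
# so a later SIGUSR2 still finds a working executable.
stage_binary() {
    local new_bin="$1"
    local target="$2"
    local config="$3"
    local backup="${target}.prev"
    if [ -e "$target" ]; then
        cp -p "$target" "$backup" || return 1
    fi
    install -m 0755 "$new_bin" "$target" || return 1
    if "$target" -t "$config"; then
        return 0
    fi
    echo "validation failed; restoring $backup" >&2
    if [ -e "$backup" ]; then
        install -m 0755 "$backup" "$target"
    fi
    return 1
}

# Upgrade:
#   stage_binary ./pg_doorman_new /usr/bin/pg_doorman /etc/pg_doorman/pg_doorman.toml
# Roll back after the new process took over:
#   install -m 0755 /usr/bin/pg_doorman.prev /usr/bin/pg_doorman
#   sudo systemctl reload pg_doorman.service
```

Staging only swaps files on disk; the running process is untouched until SIGUSR2 is delivered.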
Monitoring
Logs
Key log lines during migration:
INFO Got SIGUSR2, starting binary upgrade and graceful shutdown
INFO Validating configuration with: /usr/bin/pg_doorman -t pg_doorman.toml
INFO Configuration validation successful
INFO Starting new process with inherited listener fd=5
INFO New process signaled readiness
INFO Client migration enabled
INFO [user@pool #c42] client 10.0.0.1:51234 migrated to new process
INFO waiting for 3 clients in transactions
INFO All clients disconnected, shutting down
INFO Migration sender finished
In the new process:
INFO migration receiver: listening for migrated clients
INFO [user@pool #c42] migrated client accepted from 10.0.0.1:51234
INFO migration receiver done: migration socket closed
INFO migration receiver: stopped
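These log lines are stable enough to classify an upgrade attempt from a script. The sketch below looks only for two strings quoted in this document, the abort banner and the readiness line; `upgrade_outcome` is a hypothetical helper, and it assumes the input covers just the window since the most recent SIGUSR2.

```bash
#!/usr/bin/env bash
# upgrade_outcome
# Hypothetical helper: reads log text on stdin and prints one of
#   aborted  - the config validation banner is present
#   ready    - the new process signaled readiness
#   unknown  - neither marker was found
upgrade_outcome() {
    local log
    log="$(cat)"
    if printf '%s\n' "$log" | grep -qF 'BINARY UPGRADE ABORTED'; then
        echo aborted
    elif printf '%s\n' "$log" | grep -qF 'New process signaled readiness'; then
        echo ready
    else
        echo unknown
    fi
}

# Example:
#   journalctl -u pg_doorman.service --since "2 minutes ago" | upgrade_outcome
```

The abort banner is checked first, so a window that happens to contain both an older success and a newer failure is reported as aborted.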
Prometheus metrics
| Metric | Relevance during upgrade |
|---|---|
| pg_doorman_pools_clients{status="active"} | Should drop to 0 on old process |
| pg_doorman_pools_clients{status="idle"} | Drops as clients migrate |
| pg_doorman_connections_total{type="total"} | New process accepts fresh connections; use rate() / increase() |
| pg_doorman_clients_prepared_cache_entries | Confirms cache transferred |
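For a quick check without a Prometheus server, the client gauges can be summed straight from a scrape. This sketch parses text exposition format from standard input; `sum_pool_clients` is a hypothetical helper, and it assumes each sample line ends with the value (no trailing timestamp). Any labels other than `status` are ignored.

```bash
#!/usr/bin/env bash
# sum_pool_clients STATUS
# Hypothetical helper: sums pg_doorman_pools_clients samples whose label
# set contains status="STATUS". Reads exposition text on stdin and prints
# an integer total (0 when nothing matches).
sum_pool_clients() {
    local status="$1"
    awk -v want="status=\"$status\"" '
        /^#/ { next }                                  # skip HELP/TYPE lines
        index($0, "pg_doorman_pools_clients{") == 1 && index($0, want) > 0 {
            total += $NF                               # value is the last field
        }
        END { printf "%d\n", total }
    '
}

# Example (host, port, and path depend on your web listener settings):
#   curl -s http://<metrics-host>:<port>/metrics | sum_pool_clients active
```

While both processes are alive, a single scrape reaches only one of them, so repeat the check a few times before concluding that the old process is empty.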
Admin console
-- On the new process (old rejects non-admin connections)
SHOW POOLS;
SHOW CLIENTS;
Troubleshooting
Client receives 58006 or disconnects instead of migrating
Ctrl+C in foreground mode. SIGINT in TTY = shutdown without
upgrade. Use kill -USR2 or the UPGRADE admin command.
Daemon mode. Daemon mode does not use fd-based migration. Existing clients stay on the old process and are closed when it exits. Switch to foreground mode for migration.
PG_DOORMAN_CI_SHUTDOWN_ONLY=1 is set. This env var forces
shutdown-only mode (used in CI tests). Unset it.
Old process does not exit
Long transaction. A client is stuck in BEGIN without COMMIT.
Wait for shutdown_timeout or end the transaction manually.
Admin connections. Admin connections do not migrate. Close the admin session on the old process.
Force exit: kill -TERM <old_pid> sends SIGTERM for immediate
exit.
TLS connection dropped after upgrade
Binary built without --features tls-migration. TLS clients
drain instead of migrating. Rebuild with --features tls-migration.
Not running on Linux. TLS migration is Linux-only.
Certificate or key changed. The old process exported cipher state bound to the old certificate. Use the same files for both processes if you need TLS migration. Client-facing certificate rotation requires a restart or a planned reconnect window.
"TLS migration not available" in logs
The new process received a migration payload with TLS data but was
built without --features tls-migration or is not running on Linux.
The client is disconnected. Rebuild the new binary with
--features tls-migration.
"migration channel not ready" in logs
The MIGRATION_TX channel has not been initialized yet. This can
happen if the new process has not finished starting when a client
tries to migrate. The client retries on the next idle iteration
(within milliseconds).
"migration channel send failed" in logs
The migration channel is full (capacity: 4096). Possible when thousands of clients migrate simultaneously. The client retries on the next idle iteration.
"prepare_migration failed" in logs
The client's raw fd is unavailable or dup() failed. Possible
causes: fd exhaustion, or the client connected through a code
path that does not store the raw fd. Check ulimit -n.
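`ulimit -n` reports the limit of the current shell, not of the running service, so it can help to read both the open-fd count and the soft limit of the pg_doorman process itself. The sketch below does that through `/proc` and is therefore Linux-only; `fd_usage` is a hypothetical helper and needs permission to read the target's `/proc/<pid>/fd` (same user or root).

```bash
#!/usr/bin/env bash
# fd_usage PID
# Hypothetical helper (Linux only): prints "<open_fds> <soft_limit>" for
# PID, using /proc/PID/fd and the "Max open files" row of /proc/PID/limits.
fd_usage() {
    local pid="$1"
    local open_fds
    local soft_limit
    [ -d "/proc/$pid/fd" ] || { echo "no /proc entry for PID $pid" >&2; return 1; }
    open_fds="$(ls "/proc/$pid/fd" | wc -l | tr -d '[:space:]')"
    # Row layout: "Max open files  <soft>  <hard>  files"; soft is field 4.
    soft_limit="$(awk '/^Max open files/ { print $4 }' "/proc/$pid/limits")"
    printf '%s %s\n' "$open_fds" "$soft_limit"
}

# Example:
#   fd_usage "$(systemctl show -p MainPID --value pg_doorman.service)"
```

Each migrating client needs one extra descriptor for the dup() on the old process, so the first number approaching the second during an upgrade is a plausible cause of this error.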
Libraries like github.com/lib/pq or Go's database/sql may need
configuration to handle the reconnection path for clients that cannot
migrate and receive 58006 or a connection close. See
this issue.
Operational checklist
Before rolling out binary upgrade to production:
- Run in foreground mode (not daemon) for fd-based migration
- Set shutdown_timeout to cover your longest expected transaction (recommendation: 30-60 seconds for OLTP, longer for analytics)
- If using TLS: build with --features tls-migration, verify both processes use the same certificate and key files
- Test the upgrade in staging: open a session, trigger SIGUSR2, verify the session continues working
- Verify the systemd unit is Type=notify with NotifyAccess=exec, ExecReload=/bin/kill -SIGUSR2 $MAINPID (so systemctl reload runs binary upgrade with config validation), KillMode=mixed, and Restart=on-failure
- Monitor logs for migration errors after the first production upgrade
- Confirm old process exits (check PID file or pgrep)
- Verify Prometheus metrics show clients on the new process
The web listener (which serves /metrics) binds with SO_REUSEPORT. While the old process drains and the new one accepts new clients, both share the same port; the kernel balances scrape requests between them. Counter values may appear to jump backwards on a single scrape until the old process exits. The race window lasts at most shutdown_timeout.