PgDoorman
A multi-threaded PostgreSQL connection pooler written in Rust. Drop-in replacement for PgBouncer and Odyssey, and an alternative to PgCat. Three years in production at Ozon under Go (pgx), .NET (Npgsql), Python (asyncpg, SQLAlchemy), and Node.js workloads.
Get PgDoorman 3.10.6 · Comparison · Benchmarks
Headline features
A diagnostic console embedded in the pg_doorman binary, served on the same port as /metrics. It gives operators the incident view that /metrics and the psql admin console expose only in pieces: pool saturation, latency percentiles, SQLSTATE breakdowns, long-running queries, prepared-cache and query-interner state, process memory, cgroup limits, per-worker CPU, and a live log tail.
The console is for live diagnosis, not a replacement for long-term Prometheus/Grafana monitoring.
Pause, Resume, Reconnect, and Reload can be triggered from the same page, scoped per pool or globally; the console is otherwise read-only. It activates only when [web].ui = true and general.admin_password is non-default; a fresh install with the placeholder password keeps the listener at /metrics only and logs a WARN.
PgDoorman caps total backend connections per database. When max_db_connections is reached, the coordinator evicts an idle connection from the user with the most spare capacity, ranking candidates by p95 transaction time so the slowest pools yield first. A reserve pool absorbs short bursts; per-user min_guaranteed_pool_size keeps critical workloads off the eviction list.
PgBouncer's max_db_connections has no eviction or fairness — when the cap is reached, clients queue until existing connections close on their own idle timeout. Odyssey has no equivalent setting.
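The eviction rule above can be sketched roughly as follows. This is a minimal illustration, not PgDoorman's actual coordinator: the `UserPool` type, its field names, and the way spare capacity and p95 are combined (spare capacity first, p95 as the tie-breaker) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserPool:
    # Hypothetical per-user view; field names are illustrative only.
    name: str
    open_connections: int   # backend connections currently held
    idle_connections: int   # of those, how many are idle right now
    min_guaranteed: int     # stands in for min_guaranteed_pool_size
    p95_txn_ms: float       # observed p95 transaction time


def pick_eviction_victim(pools: list[UserPool]) -> Optional[UserPool]:
    """Return the pool that should give up one idle connection, or None."""
    candidates = [
        p for p in pools
        if p.idle_connections > 0 and p.open_connections > p.min_guaranteed
    ]
    if not candidates:
        return None  # nothing evictable; the caller has to wait
    # Most spare capacity first; among equals, the slowest p95 yields first.
    return max(candidates, key=lambda p: (p.idle_connections, p.p95_txn_ms))


pools = [
    UserPool("billing", open_connections=10, idle_connections=4,
             min_guaranteed=10, p95_txn_ms=3.0),
    UserPool("reports", open_connections=12, idle_connections=5,
             min_guaranteed=2, p95_txn_ms=800.0),
    UserPool("api", open_connections=20, idle_connections=5,
             min_guaranteed=4, p95_txn_ms=12.0),
]
victim = pick_eviction_victim(pools)
print(victim.name if victim else "no victim")
```

In this sketch `billing` is never a candidate because it sits at its guaranteed size, which mirrors how min_guaranteed_pool_size keeps critical workloads off the eviction list.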
When PgDoorman runs next to PostgreSQL on the same machine and a Patroni switchover kills the local backend, PgDoorman polls the Patroni REST API (GET /cluster), picks a live cluster member (priority sync_standby → replica), and routes new connections there. The local backend enters cooldown; fallback connections inherit a short lifetime so the pool returns to local as soon as it recovers.
Set patroni_api_urls and fallback_cooldown in [general] and it applies to every pool. No HAProxy or consul-template in front of the pooler.
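The member selection described above (sync_standby before replica, live members only) might look like the sketch below. The payload shape is modeled on Patroni's GET /cluster response, but the sample data, the `pick_fallback` helper, and the set of states treated as live are assumptions for illustration; PgDoorman's real logic may differ.

```python
from typing import Optional

# Lower number wins; roles not listed here are never chosen as a fallback.
ROLE_PRIORITY = {"sync_standby": 0, "replica": 1}
# Assumption: these member states count as "live" for routing purposes.
LIVE_STATES = {"running", "streaming"}


def pick_fallback(cluster: dict) -> Optional[tuple[str, int]]:
    """Return (host, port) of the preferred live member, or None."""
    live = [
        m for m in cluster.get("members", [])
        if m.get("role") in ROLE_PRIORITY and m.get("state") in LIVE_STATES
    ]
    if not live:
        return None
    best = min(live, key=lambda m: ROLE_PRIORITY[m["role"]])
    return best["host"], best["port"]


# Hypothetical /cluster payload, trimmed to the fields the sketch reads.
sample = {
    "members": [
        {"name": "pg1", "role": "leader", "state": "running",
         "host": "10.0.0.1", "port": 5432},
        {"name": "pg2", "role": "replica", "state": "streaming",
         "host": "10.0.0.2", "port": 5432},
        {"name": "pg3", "role": "sync_standby", "state": "streaming",
         "host": "10.0.0.3", "port": 5432},
        {"name": "pg4", "role": "sync_standby", "state": "stopped",
         "host": "10.0.0.4", "port": 5432},
    ]
}
print(pick_fallback(sample))
```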
Update PgDoorman without stopping the listener. Idle client sessions can move to the new process, which avoids a reconnect wave and reduces repeated auth/SCRAM handshakes against PostgreSQL. Clients inside a transaction stay on the old process until they go idle.
On SIGUSR2 the old process hands each idle client's TCP socket to the new one through SCM_RIGHTS — same socket, no reconnect — together with cancel keys and the prepared-statement cache. Clients inside a transaction finish on the old process and migrate as soon as they go idle. With the tls-migration build (Linux, opt-in) the OpenSSL cipher state moves too, so TLS sessions survive without a re-handshake.
PgBouncer's online restart (-R, deprecated since 1.20; or so_reuseport rolling restart) and Odyssey's online restart (SIGUSR2 + bindwith_reuseport) work the same way as each other: the new process picks up new connections, the old one drains until its existing clients disconnect on their own. Sessions, prepared statements, and TLS state never move between processes.
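To make the SCM_RIGHTS hand-off concrete, here is a small single-process sketch using Python's standard `socket.send_fds` and `socket.recv_fds` (Unix only, Python 3.9+). It is not PgDoorman's code: the helper names and the placeholder metadata are hypothetical, and a socket pair stands in for both the control channel between the old and new process and the client connection.

```python
import socket


def hand_over(channel: socket.socket, client: socket.socket, metadata: bytes) -> None:
    """Old process: send the client's descriptor plus a small payload."""
    socket.send_fds(channel, [metadata], [client.fileno()])


def take_over(channel: socket.socket):
    """New process: receive the descriptor and rebuild a socket object."""
    metadata, fds, _flags, _addr = socket.recv_fds(channel, 1024, 1)
    return socket.socket(fileno=fds[0]), metadata


# Control channel between the "old" and "new" process (here: one process).
old_end, new_end = socket.socketpair()
# Stand-in for an application connection: app_side <-> pooler_side.
app_side, pooler_side = socket.socketpair()

hand_over(old_end, pooler_side, b"cancel-key-placeholder")
migrated, metadata = take_over(new_end)
pooler_side.close()  # the old process drops its copy of the descriptor

# The application keeps using the same connection; no reconnect happened.
app_side.sendall(b"ping")
print(migrated.recv(4), metadata)
```

The application side never sees a close or a new handshake, which is the property the paragraph above relies on. Moving TLS cipher state is a separate problem that this sketch does not cover.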
In extended protocol, many drivers send short parameterised queries as Parse with an empty statement name. Without a remap, that hot path keeps paying PostgreSQL planner CPU and repeated backend Parse work on every reuse.
PgDoorman rewrites the empty name to an internal DOORMAN_<N> on the backend and keeps the mapping in the pool. PostgreSQL sees a named prepared statement, so later Binds for the same query shape can reuse prepared backend state across one client and across clients sharing the pool. The primary value is performance: less planner work and fewer backend Parses on repeated OLTP queries.
PgBouncer (1.21+) and Odyssey support prepared statements in transaction mode, but only for named statements; an anonymous Parse is forwarded as-is and re-planned on every call. Of the three poolers, only PgDoorman rewrites it.
To keep that optimization operationally safe, the cache is bounded and observable. Anonymous entries time out on idle, named entries reclaim once nothing references them, SHOW INTERNER exposes interner size, and Prometheus metrics expose hits, misses, and evictions.
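The remap of anonymous statements can be pictured with the following sketch. It is a deliberately simplified model, not the real cache: the class and method names are hypothetical, all backends are collapsed into one, and idle timeouts and eviction are left out. Only the DOORMAN_<N> naming comes from the text above.

```python
class AnonymousParseCache:
    """Maps the text of an unnamed Parse to a stable backend statement name."""

    def __init__(self) -> None:
        self._names: dict[str, str] = {}
        self._next_id = 0
        self.hits = 0
        self.misses = 0

    def remap(self, query: str) -> tuple[str, bool]:
        """Return (backend statement name, whether a backend Parse is needed)."""
        name = self._names.get(query)
        if name is not None:
            self.hits += 1
            return name, False  # already prepared: Bind can reuse it
        self.misses += 1
        name = f"DOORMAN_{self._next_id}"
        self._next_id += 1
        self._names[query] = name
        return name, True  # first sighting: forward Parse under the new name


cache = AnonymousParseCache()
first = cache.remap("SELECT * FROM users WHERE id = $1")
second = cache.remap("SELECT * FROM users WHERE id = $1")   # another client, same shape
other = cache.remap("SELECT * FROM orders WHERE id = $1")
print(first, second, other, cache.hits, cache.misses)
```

Two clients sending the same query shape resolve to the same name, so only the first one pays for a backend Parse in this model.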
Why PgDoorman
- Caches Parse on hot query paths. Prepared backend state is reused between clients sharing a pool, including the anonymous Parse most drivers send for short parameterised queries. That cuts PostgreSQL planner CPU on repeated OLTP queries; SHOW INTERNER shows query-text memory, while Prometheus metrics show cache hits, misses, and evictions.
- Multi-threaded, single shared pool. All worker threads share one pool. PgBouncer is single-threaded; the recommended scale-out (several instances behind so_reuseport) gives each instance its own pool, and idle counts can drift between processes for the same database.
- Thundering herd suppression. When 200 clients race for 4 idle connections, PgDoorman caps concurrent backend creates (scaling_max_parallel_creates) and routes returning servers straight to the longest-waiting client through an in-process oneshot channel, with no requeue through the idle pool.
- Bounded tail latency. Waiters are served strict FIFO so the worst-case wait can't be overtaken by latecomers. Pre-replacement of expiring backends (at 95% of server_lifetime, up to 3 in parallel) keeps the pool warm, so there is no checkout spike when a generation of connections rotates out.
- Dead backend detection inside transactions. If the backend dies mid-transaction (failover, OOM, network partition), PgDoorman returns SQLSTATE 08006 immediately by racing the client read against backend readability with a 100 ms tick. Without this, the client would block until TCP keepalive fires; on Linux defaults that is about two hours plus 9×75 s probes.
- Built for operations. YAML or TOML config with human-readable durations (30s, 5m). pg_doorman generate --host … introspects an existing PostgreSQL and emits a starter config. pg_doorman -t validates the config without starting the server. A Prometheus /metrics endpoint is built-in.
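The direct hand-off and strict FIFO ordering from the list above can be illustrated with a small synchronous model. This is only a sketch under simplifying assumptions: the `HandoffPool` class is hypothetical, a dictionary stands in for the in-process oneshot channel, and connection creation (and its scaling_max_parallel_creates cap) is not modeled.

```python
from collections import deque


class HandoffPool:
    def __init__(self, idle_connections):
        self.idle = deque(idle_connections)
        self.waiters = deque()   # oldest waiter sits on the left
        self.delivered = {}      # client -> connection it currently holds

    def checkout(self, client) -> bool:
        """Take an idle connection, or join the back of the queue."""
        if self.idle and not self.waiters:
            self.delivered[client] = self.idle.popleft()
            return True
        self.waiters.append(client)
        return False

    def release(self, client) -> None:
        """Return a connection; the longest waiter gets it directly."""
        conn = self.delivered.pop(client)
        if self.waiters:
            next_client = self.waiters.popleft()
            self.delivered[next_client] = conn  # bypasses the idle queue
        else:
            self.idle.append(conn)


pool = HandoffPool(["conn-1"])
pool.checkout("a")   # gets conn-1 right away
pool.checkout("b")   # queues first
pool.checkout("c")   # queues second
pool.release("a")    # conn-1 goes straight to "b", never touching idle
print(pool.delivered, list(pool.waiters), list(pool.idle))
```

Because a released connection never re-enters the idle queue while someone is waiting, a client that arrives later cannot grab it ahead of the longest waiter.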
Comparison
| Feature | PgDoorman | PgBouncer | Odyssey |
|---|---|---|---|
| Multi-threaded with shared pool | Yes | No (single-threaded) | Workers, separate pools |
| Prepared statements in transaction mode | Yes | Yes (since 1.21) | Yes (pool_reserve_prepared_statement) |
| Anonymous Parse cache for hot parameterised queries | Yes, reused across clients in a pool | No, named statements only | No, named statements only |
| Pool Coordinator (per-database cap, priority eviction) | Yes | No | No |
| Patroni-assisted fallback (built-in) | Yes | No | No |
| Pre-replacement on server_lifetime expiry | Yes | No | No |
| Stale backend detection inside a transaction | Yes (immediate 08006) | No (waits for TCP keepalive) | No (waits for TCP keepalive) |
| Binary upgrade with session migration | Yes (SCM_RIGHTS, TLS state opt-in) | No (sessions stay on old process) | No (sessions stay on old process) |
| Backend TLS to PostgreSQL | Yes (5 modes, hot reload via SIGHUP) | Yes (server_tls_*, hot reload via RELOAD) | No |
| Auth: SCRAM passthrough (no plaintext password) | Yes (ClientKey extracted from proof) | Yes (encrypted SCRAM secret via auth_query/userlist.txt, since 1.14) | Yes |
| Auth: JWT (RSA-SHA256) | Yes | No | No |
| Auth: PAM / pg_hba.conf / auth_query | Yes | Yes | Yes |
| Auth: LDAP | No | Yes (since 1.25) | Yes |
| Config format | YAML / TOML | INI | Own format |
| JSON structured logging | Yes | No | Yes (log_format "json") |
| Latency percentiles (p50/p90/p95/p99) | Yes (built-in /metrics) | No (averages only) | Yes (via separate Go exporter) |
| Config test mode (-t) | Yes | No | No |
| Auto-config from PostgreSQL (generate --host) | Yes | No | No |
| Prometheus endpoint | Built-in /metrics | External exporter | External exporter (Go sidecar) |
Benchmarks
AWS Fargate (16 vCPU), pool size 40, pgbench 30 s per test:
| Scenario | vs PgBouncer | vs Odyssey |
|---|---|---|
| Extended protocol, 500 clients + SSL | ×3.5 | +61% |
| Prepared statements, 500 clients + SSL | ×4.0 | +5% |
| Simple protocol, 10 000 clients | ×2.8 | +20% |
| Extended + SSL + reconnect, 500 clients | +96% | ~0% |
Quick start
Install via your distro package manager:
```shell
# Ubuntu / Debian
sudo add-apt-repository ppa:vadv/pg-doorman
sudo apt update
sudo apt install pg-doorman

# Fedora / RHEL family
sudo dnf copr enable @pg-doorman/pg-doorman
sudo dnf install pg_doorman
```
Distro packages and the Docker image are built without the tls-migration and pam features. See Installation for the TLS feature matrix and how to build with them.
Or run via Docker:
```shell
# Mount the local config into the container and pass its path to pg_doorman.
docker run -p 6432:6432 \
  -v "$(pwd)/pg_doorman.yaml:/etc/pg_doorman/pg_doorman.yaml" \
  ghcr.io/ozontech/pg_doorman \
  pg_doorman /etc/pg_doorman/pg_doorman.yaml
```
Minimal config (pg_doorman.yaml):
```yaml
general:
  host: "0.0.0.0"
  port: 6432
  admin_username: "admin"
  admin_password: "change_me"

pools:
  mydb:
    server_host: "127.0.0.1"
    server_port: 5432
    pool_mode: "transaction"
    users:
      - username: "app"
        password: "md5..."  # hash from pg_shadow / pg_authid
        pool_size: 40
```
server_username and server_password are omitted on purpose: PgDoorman re-uses the client's MD5 hash or SCRAM ClientKey to authenticate against PostgreSQL. No plaintext passwords in the config.
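For reference, the MD5 entry stored in pg_shadow / pg_authid is the literal prefix `md5` followed by the hex MD5 digest of the password concatenated with the user name. The sketch below reproduces that format; the helper name and the sample credentials are placeholders, and in practice copying the value straight from PostgreSQL avoids handling the plaintext at all.

```python
import hashlib


def pg_md5_hash(password: str, username: str) -> str:
    """Build a PostgreSQL-style MD5 password entry: 'md5' + md5(password + username)."""
    digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return "md5" + digest


# Placeholder credentials for illustration only.
print(pg_md5_hash("s3cret", "app"))
```

SCRAM secrets have a different, salted format and are not covered by this sketch.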
Installation guide → · Configuration reference →
Where to next
- New to PgDoorman? Start with Overview, then Installation and Basic usage.
- Migrating from PgBouncer or Odyssey? Read Comparison and Authentication.
- Running Patroni? See Patroni-assisted fallback and patroni_proxy.
- Production sizing? Read Pool pressure and Pool Coordinator.
- Operating PgDoorman? See Binary upgrade, Signals, Troubleshooting.