Changelog
3.10.7
pgjdbc LargeObject fastpath calls work in transaction pooling
pg_doorman now forwards PostgreSQL Fastpath FunctionCall (F) messages and
passes FunctionCallResponse (V) back to the client. pgjdbc
LargeObjectManager uses this protocol path for functions such as lo_open,
lo_read, and lo_write. Transaction-mode clients could previously hang
because pg_doorman did not forward the frontend F message.
Large FunctionCallResponse messages now use the same large-message streaming
path as oversized DataRow and CopyData messages. This avoids buffering a
large fastpath lo_read response in pg_doorman memory before forwarding it to
the client.
Large object calls now work through pg_doorman and can hold transaction-pool backends while reads or writes are in flight. Size pools for concurrent large object traffic and keep application-side reads chunked. See Fastpath and Large Objects for pool sizing, timeout, and read-size guidance.
3.10.6
Binary upgrade no longer carries migrated client fds into the next generation
Client fds received over the SIGUSR2 migration socket are now marked
close-on-exec in the new process. A chained binary upgrade used to inherit
stale copies of already-migrated client sockets, so every generation could
start with extra fds and eventually fail with Too many open files under
load.
The foreground upgrade path also marks inherited service fds close-on-exec after startup and cleans up unexpected inherited descriptors before config load when the process starts as a binary-upgrade child. This lets an upgraded binary recover from a parent that was already polluted by older non-CLOEXEC fds instead of preserving that fd garbage forever.
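As a rough illustration of what marking a received descriptor close-on-exec means, here is a minimal Python sketch. It is not pg_doorman code, and the helper name `mark_close_on_exec` is hypothetical; it only shows the POSIX flag involved.

```python
import os

def mark_close_on_exec(fds):
    """Set FD_CLOEXEC on each descriptor so a later exec() does not inherit it."""
    for fd in fds:
        # On POSIX, "not inheritable" in Python is the close-on-exec flag.
        os.set_inheritable(fd, False)

# Simulate a descriptor that arrived without CLOEXEC, then fix it.
r, w = os.pipe()
os.set_inheritable(r, True)
mark_close_on_exec([r])
```

After the call, a child started through exec() no longer receives a copy of `r`, which is the property that stops fds from accumulating across chained upgrades.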
Local fd exhaustion no longer enters Patroni-assisted fallback
Backend connection failures caused by pg_doorman's own EMFILE/ENFILE
state are now classified as local resource exhaustion, not as PostgreSQL
unreachability. Those errors no longer blacklist the local backend or enter
the Patroni-assisted fallback discovery path, so fd pressure does not amplify
itself with fallback connection attempts and noisy discovery failures.
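The classification above can be sketched as a check on the errno of the failed connect. This is an illustrative Python model only; the function name and the returned labels are assumptions, not pg_doorman identifiers.

```python
import errno

# errno values that mean "this process (or host) ran out of descriptors".
LOCAL_FD_EXHAUSTION = {errno.EMFILE, errno.ENFILE}

def classify_connect_error(exc: OSError) -> str:
    """Separate local fd pressure from a backend that is really unreachable."""
    if exc.errno in LOCAL_FD_EXHAUSTION:
        # Local problem: do not blacklist the backend, do not start fallback.
        return "local_resource_exhaustion"
    return "backend_unreachable"
```

Only the second label would be a reason to blacklist the backend or start fallback discovery.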
Web admin sockets use the safe TCP policy
Accepted Web UI and /metrics TCP sockets now receive the same low-risk TCP
keepalive, buffer-size and user-timeout configuration as other TCP sockets,
but do not inherit the pooler client SO_LINGER policy. This avoids abortive
HTTP closes when general.tcp_so_linger = 0 while still bounding web socket
resource usage.
3.10.5
Binary upgrade survives a tight RLIMIT_NOFILE
SIGUSR2 binary upgrade now handles EMFILE/ENFILE from the old
process without spinning in the accept loop or overfilling the migration
queue.
- The TCP and Unix accept loops treat EMFILE/ENFILE as local resource pressure: they sleep for 10 ms and log at most once every 5 seconds. Other accept errors still log normally.
- The migration channel is no longer fixed at 4096 entries. At upgrade time pg_doorman reads the current RLIMIT_NOFILE, counts open fds via /proc/self/fd, reserves headroom for the handoff pipe/socketpair and per-client fd work, and caps the queue by the remaining budget. If no safe headroom remains, pg_doorman starts the new process without client migration and logs the budget decision.
- Client migration reserves a channel slot before calling dup() on the client fd. A full channel now applies backpressure before creating an extra fd.
If the pre-flight pg_doorman -t spawn fails with local EMFILE/ENFILE,
pg_doorman skips that validation step and continues with the binary
upgrade. Other validation failures still abort the upgrade before shutdown.
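The fd-budget calculation for the migration queue can be pictured roughly as follows. This is a Python sketch under stated assumptions: the reserve sizes (`handoff_reserve`, `per_client_fds`) are invented for illustration and are not pg_doorman's real constants.

```python
import os
import resource

def migration_queue_capacity(soft_limit, open_fds, handoff_reserve=16, per_client_fds=2):
    """Return how many clients may be queued for migration, or 0 if no safe headroom.

    handoff_reserve and per_client_fds are hypothetical values for this sketch.
    """
    budget = soft_limit - open_fds - handoff_reserve
    if budget < per_client_fds:
        return 0  # start the new process without client migration
    return budget // per_client_fds

def current_fd_usage():
    """Read the soft RLIMIT_NOFILE and count open descriptors (Linux only)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, len(os.listdir("/proc/self/fd"))
```

With a soft limit of 1024 and 1000 descriptors already open, this model leaves room for only four queued clients; with 1020 open it returns 0, which corresponds to skipping migration.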
/metrics scrape uses cached socket-state counts
/metrics no longer walks /proc/PID/net/tcp and /proc/PID/net/unix
on the request path. On hosts with thousands of sockets, that synchronous
walk could hold worker threads long enough for regular Prometheus scrapes
to increase client p99.
Socket-state counts now live in a cached ArcSwap snapshot refreshed by a
background spawn_blocking task. The /metrics handler, periodic
print_all_stats output, and admin SHOW SOCKETS command read the cached
snapshot. The Web UI sockets endpoint still refreshes socket details on
demand for operator use.
The cache keeps scrape cost independent of the number of live sockets in the common Prometheus path.
3.10.1
Configurable kernel TCP socket buffer size
New general.tcp_socket_buffer_size (ByteSize, default 0). When set
to a non-zero value, pg_doorman calls setsockopt(SO_RCVBUF/SO_SNDBUF)
on every accepted client TCP socket and outbound backend TCP socket,
sets fixed send/receive buffer limits, and disables Linux TCP autotuning
for that socket. Linux applies/reports doubled values and may clamp them
by net.core.rmem_max / net.core.wmem_max.
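To see what the kernel actually applies for a requested size, the socket options can be read back after they are set. The Python sketch below is illustrative only; the 128 KiB request is an arbitrary example value, and the numbers reported depend on the host's kernel and sysctl limits.

```python
import socket

def set_fixed_buffers(sock, size):
    """Request fixed send/receive buffers and return what the kernel reports."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    return (
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
    )

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
rcv, snd = set_fixed_buffers(sock, 128 * 1024)
sock.close()
```

On Linux the reported values are typically twice the request unless net.core.rmem_max or net.core.wmem_max clamps them first, so compare `rcv` and `snd` against those sysctls when tuning.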
The default 0 keeps the current behaviour (autotuning on). Operators
who observe MemFree jumping back up after a pg_doorman restart with
many long-lived idle clients may be seeing kernel TCP buffer
accumulation. This memory is not process RSS; depending on kernel and
cgroup mode it may show up as socket memory, for example sock in
cgroup v2 memory.stat. Those deployments can bound per-socket kernel
buffer limits by setting this knob to a value in the 64 KiB – 256 KiB
range suitable for OLTP traffic in one datacenter. See the
tcp_socket_buffer_size
reference for details and trade-offs.
Config reloads do not resize already-open sockets. During SIGUSR2
binary upgrade, migrated client sockets are reconfigured in the new
process; backend sockets pick up the value only when opened or
reconnected.
Equivalent of PgBouncer's tcp_socket_buffer parameter. Odyssey and
PgCat have no analogue.
3.10.0
Prepared statements and startup-time planner parameters
sync_server_parameters now replays safe parameters sent by the client
in StartupMessage, not only the small set of PostgreSQL-reported
ParameterStatus values. This lets transaction-mode pools preserve
startup-time session state such as search_path,
default_transaction_isolation, and role when a client transaction
lands on a different backend connection. Configured
startup_parameters still win over client-supplied values.
The prepared-statement cache key now includes a digest of the
startup-time planner parameters that pg_doorman can safely replay:
search_path, default_transaction_isolation,
default_transaction_read_only, default_text_search_config, and
role. Two clients that prepare the same query under different
search_path values now get separate server-side prepared statements
instead of sharing one PostgreSQL plan.
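One way to picture a cache key that includes a planner-parameter digest is shown below. This is a Python sketch, not the pg_doorman implementation: the hash function, the digest length, and the key layout are all assumptions made for illustration. Only the list of parameter names comes from the text above.

```python
import hashlib

# Startup-time planner parameters named in the changelog entry.
PLANNER_PARAMS = (
    "search_path",
    "default_transaction_isolation",
    "default_transaction_read_only",
    "default_text_search_config",
    "role",
)

def planner_digest(startup_params):
    """Hash only the planner-relevant parameters, in a fixed order."""
    h = hashlib.sha256()
    for name in PLANNER_PARAMS:
        value = startup_params.get(name, "")
        h.update(name.encode() + b"\0" + value.encode() + b"\0")
    return h.hexdigest()[:16]

def cache_key(query, startup_params):
    """Hypothetical key: the query text plus the planner digest."""
    return (query, planner_digest(startup_params))
```

Two clients that differ only in application_name get the same key in this model, while a different search_path yields a different key and therefore a separate server-side statement.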
Runtime SET for planner parameters that PostgreSQL does not report is
still not tracked. Clients that need to change those values after
connection startup should set them in StartupMessage, reconnect or
run DISCARD ALL after changing them, or disable prepared_statements
for that pool.
PgDoorman also rolls back optimistic per-backend prepared-statement LRU
entries when PostgreSQL rejects Parse. Reusing the same client
statement name after a failed Parse now forces a fresh Parse instead of
hitting a stale DOORMAN_<N> entry and surfacing SQLSTATE 26000.
Per-pool response cache for general.pooler_check_query. The first
matching SimpleQuery in each pool's lifetime is forwarded to PostgreSQL;
every subsequent matching probe is answered from the cache without
touching the backend.
Behavior change for cold pools
Before this release pg_doorman answered any pooler_check_query match
locally with a hardcoded empty result. The default ; came back instantly
without ever talking to PostgreSQL, and a non-empty value such as select 1
returned an empty response that did not match what a real PostgreSQL would
have produced.
The first probe per pool now does one PostgreSQL round-trip and captures
the real response. If PostgreSQL is unreachable at that moment, the
probing client sees a probe failure instead of an unconditional OK; the
earlier hardcode reported the pooler as healthy even when PostgreSQL was
down. Typical JDBC keepalive queries such as select 1 (WildFly, HikariCP)
and select 'pg_doorman' now return the expected row.
Cache lifecycle
The cache is per pool and keyed by the query string. A RELOAD that
changes pooler_check_query invalidates the cache on the next ping; the
new value triggers one fresh backend probe and is then served from cache
until the value changes again. A reload that keeps the same value keeps
the cached response. ErrorResponse from the backend is forwarded to
the client unchanged and is never cached, so the next probe retries
against PostgreSQL.
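The lifecycle described above can be modelled in a few lines. The Python sketch below is a simplified model with hypothetical names (`ProbeCache`, `forward`); it keeps one cached response per pool, keyed by the configured query string, and never stores an error.

```python
class ProbeCache:
    """Per-pool cache of the pooler_check_query response (illustrative model)."""

    def __init__(self):
        self._query = None
        self._response = None

    def answer(self, configured_query, forward):
        """forward(query) must return (ok, response_bytes) from the backend."""
        if self._query == configured_query and self._response is not None:
            return self._response, "cache"
        ok, response = forward(configured_query)
        if ok:
            # Cache successful responses only, keyed by the query string.
            self._query, self._response = configured_query, response
        else:
            # ErrorResponse is passed through and never cached.
            self._query, self._response = None, None
        return response, "backend"
```

Changing the configured query causes one backend probe on the next ping, and a failing backend is retried on every probe because nothing is stored for it.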
Operator contract
pooler_check_query must be stable: the same input must produce the
same bytes, with no side effects. Safe values: ;, select 1,
select 'pg_doorman', select version().
Unsafe values that the cache will silently freeze:
- select now(), select clock_timestamp(): the cached timestamp never advances.
- select pg_is_in_recovery(): a failover flips the role on PostgreSQL but the cached response still reports the old role.
- select count(*) from <table>: the cached count is whatever the first probe observed.
- UPDATE, INSERT, DELETE, CALL, DO: the side effect runs once and the success response is cached forever.
New metrics
- pg_doorman_pooler_check_query_backend_total: counter, increments on each probe forwarded to PostgreSQL (cache miss or RELOAD-induced re-probe).
- pg_doorman_pooler_check_query_cache_total: counter, increments on each probe served from the cache.
The ratio cache_total / (cache_total + backend_total) is the cache
hit rate.
Eviction visibility for prepared-statement caches
Per-eviction events from the named and anonymous query interner and
from the per-client anonymous LRU are now emitted as TRACE log
lines. The default INFO level is unchanged; turn them on at
runtime with
SET log_level = 'info,pg_doorman::server::prepared_statement_cache=trace,pg_doorman::client::protocol=trace';
The GC sweep task additionally emits one DEBUG aggregate line per
cycle that actually evicted something. Operators that previously had
only the aggregate pg_doorman_query_interner_evictions_total and
pg_doorman_clients_prepared_anonymous_evictions_total Prometheus
counters can now follow individual evictions during an incident.
The 80-char-with-ellipsis and 120-char preview helpers used in those
log lines live in a new utils::strings module and replace three
inline copies that had drifted apart.
Web UI lifecycle events
The sidebar used to toast "pg_doorman restarted — rate baseline reset"
on every routine RELOAD. Totals are summed across the live pool set,
and RELOAD plus dynamic-pool GC drop pools from that set, so the sum
legitimately falls without the process going anywhere. The heuristic
is gone. A real restart is detected by a change in pid,
started_at_ms, or uptime_seconds.
/api/events grows two new event targets:
- PROCESS_START: emitted once when setup finishes; carries the binary version and pid.
- CONFIG_VALIDATION_ERROR: emitted when SIGHUP, admin RELOAD, or /api/admin/reload rejects the new config. Rate-limited to one per second per target so a SIGHUP loop with a bad config cannot fill the 1024-entry ring with duplicates.
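A bounded event ring with per-target rate limiting could look roughly like the Python sketch below. The class and method names are hypothetical; only the 1024-entry capacity and the one-per-second limit come from the text.

```python
import time
from collections import deque

class EventRing:
    """Fixed-size event buffer with optional per-target rate limiting (sketch)."""

    def __init__(self, capacity=1024, min_interval=1.0, clock=time.monotonic):
        self.events = deque(maxlen=capacity)  # oldest entries fall off automatically
        self.min_interval = min_interval
        self.clock = clock
        self._last_emit = {}

    def emit(self, target, message, rate_limited=False):
        """Append an event; return False if it was suppressed by the rate limit."""
        now = self.clock()
        if rate_limited:
            last = self._last_emit.get(target)
            if last is not None and now - last < self.min_interval:
                return False
            self._last_emit[target] = now
        self.events.append((now, target, message))
        return True
```

A SIGHUP loop that fails validation many times per second would add at most one CONFIG_VALIDATION_ERROR entry per second in this model, so older events stay in the ring for longer.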
A persistent banner across the top of the UI replaces the transient toast for conditions an operator must not miss:
- shutdown_in_progress: pg_doorman is draining.
- migration_in_progress: binary upgrade in flight.
- Last unresolved CONFIG_VALIDATION_ERROR: stays up until a successful RELOAD clears it.
- /api/overview silent for >15 s: banner switches to "pg_doorman unreachable — last contact 23s ago", so the operator knows the rest of the page is no longer trustworthy.
A no-op SIGHUP (config file re-parsed identically) now emits a
RELOAD entry with message config unchanged instead of going
silent — one event per signal keeps the audit timeline complete.
/api/events and /api/overview send Cache-Control: no-store so
intermediate proxies cannot collapse two consecutive polls into the
same response.
3.9.1
Web admin console refresh and a follow-up pass on startup_parameters.
Upgrade notes for operators monitoring 3.9.0:
- The pg_doorman-side budget rejection now returns SQLSTATE 53400 (configuration_limit_exceeded) instead of 54000. Alert rules and log filters keyed on 54000 need to switch.
- PgDoormanStartupParameterPgRejection is now severity: warning (was critical in 3.9.0). Cascade-overflow stays critical. Review the Alertmanager / on-call routing if you key on severity to page.
Web admin console
- Light theme by default. Three-position theme toggle (Light / System / Dark) in the sidebar footer; choice persists in localStorage.
- New /servers page reads SHOW SERVERS. Filters (database, user, state, application_name) and pagination live in the URL.
- New "Top SQLSTATE codes" card on Overview aggregating errors_by_sqlstate across pools.
- Patroni-assisted fallback banner on Overview when any pool reports fallback_active=true.
- Global RELOAD button on Config with typed confirmation.
- Logs and Clients filters move to URL parameters; deep links are shareable.
- Cmd+K / Ctrl-K command palette for navigation and pool lookup.
- ? opens a keyboard-shortcut sheet. Esc dismisses popovers and leaves the war room.
- /wall requests a screen wake lock so a TV stays on past the OS screensaver timeout.
- Structured (i) popovers everywhere: definition, admin SHOW source, formula, thresholds, related metrics, link to docs.
- Sonner toast notifications for admin actions.
- Persistent transport indicator (http/https) in the sidebar footer.
- Counter-reset detection: a pg_doorman restart no longer renders as silent "0 qps" in the sidebar.
- Storage keys gained a host suffix, so two tabs against different poolers keep separate rolling buffers.
- Clients table memoises rows; poll cadence relaxed to 3 s. Resolves a memory growth reported on long sessions.
- Sidebar collapses below md (mobile navigation via Cmd+K and URL).
- Trimmed embedded font bundle: 5 woff2 (~146 KB) down from 9.
Backend: web/access_log.rs demotes authenticated 2xx reads to debug.
info covers admin actions, personal-data paths, /api/auth/,
/api/sso/, and any non-2xx.
Docs: guides/web-ui.md rewritten for the new pages and shortcuts.
startup_parameters follow-up
- If the resolved startup_parameters set exceeds the startup packet budget, backend startup now fails with SQLSTATE 53400. A deterministic general + pool overflow is rejected at config load.
- The final ParameterStatus messages sent to the client no longer overwrite operator-managed GUC names, so the client-visible values match the backend checkout state.
- auth_query now rebuilds a dynamic pool after a successful MD5 refetch, rejects the stale-overlay race in create_dynamic_pool, and accepts native json/jsonb startup_parameter columns without a ::text cast.
- /api/config and /api/pools show literal startup_parameter values only to Admin; SSO readers get the masked view.
- /api/config also marks general.host, general.port, web.host, and web.port as restart-required.
- Prometheus rules now cover PostgreSQL-side rejection, budget overflow, malformed auth_query columns, dedicated-mode drops, and rejected SSO credentials sent over insecure transport.
- Each pool now precomputes the merged startup map, budget decision, and canonical operator-key set. Backend checkout reuses those cached values instead of cloning and recalculating the map each time.
3.9.0
Per-pool PostgreSQL startup parameters. pg_doorman can now add
configured GUCs to each backend StartupMessage. Values apply in
three layers: general.startup_parameters, pools.<name>.startup_parameters,
and the optional startup_parameters column returned by passthrough
auth_query.
PostgreSQL stores these values as the session reset defaults, so
client-side RESET ALL and DISCARD ALL return to the configured value.
This gives one pool a different plan_cache_mode, statement_timeout,
work_mem, or idle_in_transaction_session_timeout without changing
postgresql.conf, ALTER ROLE, or ALTER DATABASE.
Cascade resolution
- general.startup_parameters, pools.<name>.startup_parameters, and the optional startup_parameters text column on an auth_query row are applied in order. Later layers override earlier ones per key.
- Dedicated auth_query mode uses a shared server_user, so pg_doorman ignores the per-user column there and logs one warning per pool and username.
- A reload that changes startup parameters recycles the affected pools. Idle backends with the old reset defaults are not reused.
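The per-key override order of the three layers can be sketched as a simple merge. The Python below is an illustration with a hypothetical function name; it also records which layer supplied each value, loosely mirroring the "source of each value" idea, but it is not pg_doorman's data model.

```python
def resolve_startup_parameters(general, pool, auth_query=None):
    """Merge the three layers; a later layer overrides an earlier one per key."""
    resolved = {}
    layers = (("general", general), ("pool", pool), ("auth_query", auth_query or {}))
    for source, layer in layers:
        for key, value in layer.items():
            resolved[key] = (value, source)
    return resolved

example = resolve_startup_parameters(
    {"work_mem": "4MB", "statement_timeout": "30s"},
    {"work_mem": "64MB"},
    {"statement_timeout": "5s"},
)
```

In `example`, work_mem comes from the pool layer and statement_timeout from the auth_query layer, while any key set only in general would keep its general value.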
Validation and protocol safety
- Reserved protocol keys (user, database, replication, options, the _pq_.* extension prefix) are refused at config load.
- Keys must match the PG GUC naming shape [A-Za-z_][A-Za-z0-9_.]*, values must not contain null bytes, and each level must fit the startup-parameter budget of MAX_STARTUP_PACKET_LENGTH - 512 bytes.
- The resolved parameter set is checked before each backend startup against PG's 10 000-byte MAX_STARTUP_PACKET_LENGTH. If only the auth_query layer overflows the packet, pg_doorman drops that layer and keeps the general/pool baseline. If the baseline itself does not fit, pg_doorman skips all configured keys for that spawn and logs the byte counts.
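The validation rules above can be approximated with the Python sketch below. The byte accounting (key and value each followed by a NUL terminator, as in the startup packet's key/value section) and the case-sensitive reserved-key check are assumptions of this sketch; the function names are hypothetical.

```python
import re

MAX_STARTUP_PACKET_LENGTH = 10_000
LEVEL_BUDGET = MAX_STARTUP_PACKET_LENGTH - 512
KEY_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*")
RESERVED = {"user", "database", "replication", "options"}

def encoded_size(params):
    """Assumed wire cost: each key and each value is followed by one NUL byte."""
    return sum(len(k.encode()) + 1 + len(v.encode()) + 1 for k, v in params.items())

def validate_level(params):
    """Return a list of human-readable problems for one configuration level."""
    errors = []
    for key, value in params.items():
        if key in RESERVED or key.startswith("_pq_."):
            errors.append(f"{key}: reserved protocol key")
        elif not KEY_RE.fullmatch(key):
            errors.append(f"{key}: not a valid GUC name")
        if "\0" in value:
            errors.append(f"{key}: value contains a null byte")
    if encoded_size(params) > LEVEL_BUDGET:
        errors.append(f"level exceeds the {LEVEL_BUDGET}-byte budget")
    return errors
```

An empty list means the level passes these checks; anything else would be a reason to reject the config at load time.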
Behaviour on PG-side rejection
- If PostgreSQL rejects a configured startup parameter at backend startup, pg_doorman returns PostgreSQL's ErrorResponse to the client unchanged. pg_doorman does not retry without the key and does not disable the key automatically for the pool. Fix the parameter in the config; until then, backend startup for that pool fails with PostgreSQL's own SQLSTATE and message.
- SQLSTATEs with the 57P prefix (server unavailable) keep mapping to ServerUnavailableError first so the Patroni-assisted fallback path can route around the failed node before the startup-parameter log line fires.
- The configured parameter wins over the client sync path: even if the client connect string carries an application_name (or another tracked GUC like TimeZone), the per-checkout sync_parameters call no longer overrides the configured value on the backend. That default stands until an explicit SET statement on the client session changes it.
RELOAD coherence
- A SIGHUP that changes general.startup_parameters drains pools that inherit that baseline. The per-pool config hash includes the general startup map, and carried-over dynamic auth_query pools are recycled when the baseline changes.
Observability
- pg_doorman_backend_startup_parameter_errors_total{pool, sqlstate} counts backend startups PostgreSQL rejected because of a configured startup parameter. The failing parameter name and username are written to the warning log line, not to metric labels.
- SHOW STARTUP_PARAMETERS (admin SQL console) lists the per-pool resolved parameters with the source of each value. psql tab completion on SHOW <TAB> now includes the command.
- The Web UI pool detail page shows the same rows in a "Startup parameters (configured)" section, driven by the new startup_parameters[] field on /api/pools.
See PostgreSQL startup parameters for the configuration walkthrough, plus General Settings and Pool Settings for the full parameter list.
3.8.5
The web console now accepts JWTs issued by an external SSO proxy
alongside the existing Basic auth. The listener resolves every
request to one of three roles — Anonymous, Sso (read-only,
including logs and SQL text), and Admin (full access, including
POST /api/admin/*) — and a JWT can reach the Admin role through a
configurable group claim, so SSO operators run mutating admin actions
without sharing the Basic password. A per-request access log on a
dedicated logger target makes role transitions and 401/403 spikes
visible from journalctl. Full reference and an oauth2-proxy example
live in guides/web-ui.md.
SSO authentication
- New [web] fields wire the SSO branch: sso_enabled, sso_proxy_url, sso_public_key_file, sso_audience, sso_allowed_users, sso_groups_claim, sso_admin_groups. JWTs are validated as RS256 against the PEM-encoded public key; the parsed key reloads on RELOAD.
- A JWT whose sso_groups_claim value intersects sso_admin_groups resolves to Admin with auth_source = sso. Empty sso_admin_groups (the default) keeps every SSO login on the read-only Sso role.
- Tokens are accepted from Authorization: Bearer, the sso_access_token cookie, or the ?token= query parameter, in that priority order. Basic still wins over SSO when both are presented; a wrong Basic password no longer blocks a valid SSO token.
- GET /api/auth/config reports sso_enabled, sso_proxy_url, sso_admin_groups_configured, sso_config_error, and the resolved current_user, so the SPA renders the role-aware sign-in modal and sidebar without a second probe.
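The token lookup order can be illustrated with a small Python helper. It is a sketch only: the function name is hypothetical, header names are assumed to be already normalized to the spellings used here, and no JWT validation is shown.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, urlsplit

def extract_sso_token(headers, url):
    """Return (token, source) using the order: Bearer header, cookie, ?token=."""
    auth = headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        return auth[len("Bearer "):].strip(), "bearer"
    cookies = SimpleCookie(headers.get("Cookie", ""))
    if "sso_access_token" in cookies:
        return cookies["sso_access_token"].value, "cookie"
    tokens = parse_qs(urlsplit(url).query).get("token")
    if tokens:
        return tokens[0], "query"
    return None, None
```

When a request carries both a Bearer header and a ?token= parameter, this helper returns the header value, matching the stated priority.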
Role-aware gating
- [web].ui_anonymous = false now requires the Sso role for the public /api/* endpoints; previously every authenticated request needed Admin. Read-only privileged endpoints (/api/logs, /api/prepared/text/*, /api/interner/top, /api/top/queries) are reachable by Sso users. POST /api/admin/* remains Admin-only.
- Insufficient-role rejections return 403 Forbidden with body {"error":"forbidden","message":"admin role required"}. Missing or invalid credentials still produce 401. The SPA re-opens the sign-in modal on 401 and renders a non-blocking "admin role required" banner on 403.
Browser sign-in flow
- The sign-in modal shows a Sign in via SSO button next to the Basic form when the backend reports sso_proxy_url. The proxy bounces the browser back with ?token=<jwt>, which the SPA stores in localStorage and rewrites out of the URL.
- A silent-refresh poller (runs every 60 s, fires when the token's exp is less than 90 s away) opens a hidden iframe at ${origin}/?sso_silent=1. The iframe renders a minimal SilentCallback and posts the new token to the parent. If silent refresh fails and a Basic credential is available, the SPA falls back to Basic without redirecting; otherwise it performs a full redirect through the SSO proxy.
- The SPA never sends cookies (credentials: "omit"); cookie auth remains available for curl, sidecars, and oauth2-proxy variants that paste the token into a cookie on the shared domain.
Access log
- Every response (200/401/403/404/5xx, /metrics scrapes included) emits one logfmt line on the dedicated pg_doorman::web::access target with method, path, query (presence flag only; raw query strings are never logged), status, bytes, latency_ms, peer, auth_role, auth_source, and auth_user. Bodies are not logged.
- Levels are picked per request. Admin actions, personal-data reads, every non-2xx response, and any authenticated request log at info. Anonymous successful reads of public APIs and /metrics scrapes log at debug, so RUST_LOG=info no longer drowns in scrape noise.
Real client IP behind a reverse proxy
- New [web].trusted_proxies CIDR list. When the TCP peer falls in this list, the access log parses X-Forwarded-For (or RFC 7239 Forwarded), walks the chain right-to-left skipping further trusted hops, and uses the first untrusted address as peer. An X-Forwarded-For header sent by a peer outside this list is ignored, so the field cannot be spoofed.
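The right-to-left walk over X-Forwarded-For can be sketched with Python's ipaddress module. The function name is hypothetical, RFC 7239 Forwarded parsing is left out, and the fallback to the TCP peer when every hop is trusted or a hop is malformed is an assumption of this sketch.

```python
import ipaddress

def resolve_peer(tcp_peer, x_forwarded_for, trusted_proxies):
    """Pick the client address to log, trusting XFF only from trusted proxies."""
    networks = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]

    def is_trusted(addr):
        ip = ipaddress.ip_address(addr)
        return any(ip in net for net in networks)

    # A header from an untrusted TCP peer is ignored entirely.
    if not x_forwarded_for or not is_trusted(tcp_peer):
        return tcp_peer
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    for hop in reversed(hops):  # the rightmost hop is closest to us
        try:
            if not is_trusted(hop):
                return hop  # first untrusted address wins
        except ValueError:
            return tcp_peer  # malformed entry: assumed fallback
    return tcp_peer  # every hop was trusted: assumed fallback
```

Because the walk stops at the first untrusted hop from the right, anything a client prepends further left in the header has no effect on the logged peer.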
Observability
- New gauges pg_doorman_web_sso_enabled and pg_doorman_web_sso_config_error. The latter stays at 1 while sso_enabled = true but the runtime failed to load (missing PEM file, empty audience, unparsable PEM). The exact reason is exported through /api/auth/config.sso_config_error and rendered as a banner in the SPA.
- New counters pg_doorman_web_auth_attempts_total{role,source}, pg_doorman_web_requests_total{status_class,role}, and pg_doorman_web_sso_validation_errors_total{reason} (reasons: signature, expired, audience, no_username, allowlist). Operators alert on SSO degradation without grepping logs.
3.8.0
Added
Built-in operator dashboard. pg_doorman exposes a single-page
diagnostic console on the same port as /metrics, served from
inside the binary and gated on [web].ui = true plus a non-default
admin_password. Getting comparable detail from the existing psql
admin console means running SHOW POOLS, SHOW CLIENTS,
SHOW STATS and friends in a loop, computing rates by hand between
two snapshots, and joining the rows mentally. The dashboard does
that on a 1.5 s tick.
What it shows that the psql admin console does not:
- Live time-series, not snapshots. Latency p95/p99, qps, errors/s and connection saturation render as sparklines, so "spiking now" is visually distinct from "always been like this".
- Errors broken down by SQLSTATE per pool. Plus top-N stuck queries by current_query_age_ms, top-N noisy clients by errors, top-N hottest prepared statements by hit rate.
- Process memory by category. RSS split into jemalloc live allocations, jemalloc fragmentation, internal pg_doorman caches, code + libs, stacks + page tables, swap and anonymous remainder, with cgroup current / max alongside. Every category carries a one-line explanation on hover.
- Per-thread tokio-worker CPU. Drill-down from the threads count to per-thread utilisation, so a stuck worker is visible without perf top on the host.
- Live log tail. An in-process LogTap activates on the first /api/logs request and self-disables two minutes after the last viewer. Level and target filters apply client-side over the rolling buffer.
- Sortable, filterable tables. Pools, Clients, Apps and Caches sort by any column and filter by substring; Prepared statements adds a kind dropdown on top.
The dashboard is read-only by default. Pause / Resume / Reconnect /
Reload are the four writes, scoped to one pool via
?pool=user@db, to every pool of a database via ?db=, or
globally — the same semantics as the admin protocol.
Notes
- [web].ui_anonymous (default false) controls whether the read-only /api/* endpoints answer without basic auth. Admin-only endpoints (/api/logs, /api/admin/*, /api/prepared/text/{hash}, /api/interner/top, /api/top/queries) always require it regardless of that flag.
- The dashboard polls every 1.5 s, but a 250 ms shared snapshot feeds /api/overview, /api/pools, /api/clients, /api/servers, /api/apps, /api/stats and /metrics, so a multi-tab dashboard does not multiply pool-stats work by the number of open tabs.
3.7.0
ACTION REQUIRED before upgrading to 3.7.0
- SQLSTATE for missing prepared statements changed from 58000 to 26000. Any Bind or Describe referencing a prepared statement that pg_doorman cannot resolve now returns SQLSTATE 26000 (invalid_sql_statement_name), matching native PostgreSQL. Audit dashboards, log searches, alert rules, and retry middleware that filter on 58000 for this condition (Splunk saved searches, Grafana log alerts, custom retry policies). Drivers that auto-retry on 26000 (pgjdbc, pgx with cache_describe) now do so; drivers that closed the connection on 58000 will no longer do so.
- Migration format v1 is no longer accepted. Upgrades from a pg_doorman that emitted v1 (3.5.0–3.5.x) must hop through 3.6.x first; from 3.4 and earlier no migration support existed, so the upgrade is unaffected.
- client_prepared_statements_cache_size is deprecated. It remains a serde alias of client_anonymous_prepared_cache_size, with a WARN at startup. Planned for removal in 3.9; rename in configs now.
- Anonymous prepared statements have a TTL by default. The query interner evicts an anonymous entry after query_interner_anon_idle_ttl_seconds (default 60) of idle time. Drivers like pgjdbc and pgx with cache_describe re-issue Parse transparently when the next Bind returns SQLSTATE 26000. If your driver relies on cross-batch unnamed prepared statements without a re-Parse, set query_interner_anon_idle_ttl_seconds: 0 to keep the pre-3.7 unbounded behaviour.
Added
- The query interner is split into NAMED (passive Arc::strong_count GC) and ANON (idle TTL). Two general knobs control the GC: query_interner_gc_interval_seconds (default 60, restart-only) and query_interner_anon_idle_ttl_seconds (default 60; 0 disables TTL and restores pre-3.7 unbounded behaviour; live-reloadable). A two-cycle mark-and-sweep grace prevents eviction of entries touched between cycles.
- SHOW INTERNER reports entries and bytes per kind; SHOW INTERNER N lists the top N by interned text length with hash, kind, idle_ms, and a 120-character preview; RESET INTERNER clears both halves (diagnostics-only).
- Prometheus interner metrics: pg_doorman_query_interner_entries{kind}, _bytes{kind}, _evictions_total{kind, reason}, _synthetic_misses_total, _gc_duration_seconds.
- server_prepared_statements_cache_size (general + per-pool) sizes the per-backend server-level prepared-statement LRU. When unset, inherits prepared_statements_cache_size.
- client_anonymous_prepared_cache_size bounds the Anonymous part of the per-client cache; named statements remain unbounded. The knob is now optional and inherits prepared_statements_cache_size when unset (0 still means unlimited).
- kind column appended to SHOW PREPARED_STATEMENTS (named / anonymous / mixed).
- SHOW POOLS_MEMORY gains client_named_count, client_anonymous_count, and client_anonymous_evictions_alive (a gauge of evictions across currently connected clients; the authoritative cumulative counter lives in Prometheus as pg_doorman_clients_prepared_anonymous_evictions_total). The matching gauges pg_doorman_clients_prepared_named_entries / ..._anonymous_entries round out the surface.
Changed
- The per-client prepared-statement cache is split into Named (unbounded) and Anonymous (LRU). Fixes a bug where the previous combined LRU could evict a Named entry and cause the next Bind to fail with prepared statement does not exist.
- Bind against an anonymous prepared statement that is no longer cached anywhere (interner, pool, client) now returns SQLSTATE 26000 (invalid_sql_statement_name) instead of 58000, matching native PostgreSQL. Standard drivers re-issue Parse transparently.
Deprecated
- client_prepared_statements_cache_size is renamed to client_anonymous_prepared_cache_size. The old name remains a serde alias and logs a WARN at startup; rename it in your config.
Removed
- Migration format v1 is no longer accepted. Upgrades from versions that emitted v1 (3.4 and earlier) must hop through a 3.5–3.6 binary first; deserialize_state returns unsupported version 1 otherwise.
3.6.5 May 4, 2026
Fix: stuck cl_active/sv_active after large DataRow client disconnect under pressure
When a large DataRow was deferred via pending_large_message, recv() cleared the deferred header before streaming. If the client disconnected during streaming write, the next drain/read path lost frame boundaries and could block in wait_available(). Under full pressure, this left cl_active/sv_active pinned at pool size and prevented normal server_lifetime recycling.
recv() now keeps pending_large_message until large-message handling succeeds and clears it only on Ok. On error, the next recv() still has correct frame context, allowing cleanup to complete and active counters to drop as expected.
Observability: oldest_active_age_ms per pool
SHOW POOLS exposes a new oldest_active_age_ms column and Prometheus exports pg_doorman_pools_oldest_active_age_ms{user, database}. The gauge reports the maximum age in milliseconds among ACTIVE servers in each pool, taken at snapshot time, and falls back to 0 when no server is currently ACTIVE. Sustained non-zero values flag stuck checkouts before pool exhaustion.
3.6.4 Apr 29, 2026
Fallback resilience
Patroni-assisted fallback now races Server::startup against every alive cluster member in parallel, with a strict sync_standby priority that protects write traffic during a local-backend outage. See Patroni-assisted fallback for operator-level details.
- Startup deadline per candidate. Server::startup runs under tokio::time::timeout. Main path: connect_timeout (default 3s), now also covers the StartupMessage round-trip. Fallback path: fallback_connect_timeout (default 5s) per candidate. Raise connect_timeout if local startup legitimately exceeds 3s (large WAL replay after restart).
- Two-wave parallel race. Wave 1 races startup against every sync_standby in parallel and takes the first success; wave 2 (replica + leader) runs only if every sync_standby failed or none existed. While any sync_standby is still in-flight, a replica that already finished startup is intentionally not used: the user-facing requirement is "sync wins if it's alive at all", because the sync_standby is the lowest-data-loss promotion target. On full exhaustion the doorman log records all fallback candidates rejected (3 startup_error, 1 timeout) with a deterministic per-reason breakdown; the client always sees the sanitized Unable to retrieve server parameters … may be unavailable or misconfigured FATAL, so read the doorman log for the wave/winner trace.
- Per-host cooldown with exponential backoff. A failed candidate is marked unhealthy for fallback_connect_timeout, doubling on consecutive failures up to 60s; the cooldown resets to base after the window elapses. The cooldown map is pruned of expired entries at the start of each discovery cycle, so its size stays linear in actively-failing candidates rather than accumulating dead pod IPs.
- Soft outer deadline. The full fallback path runs under query_wait_timeout (default 5s). If it fires, pg_doorman aborts cleanly with fallback: outer deadline {ms}ms exceeded in the log and returns the sanitized FATAL to the client. Per-candidate timeouts are the hard guarantee against hangs; the outer deadline is a soft cap on how long the client itself is willing to wait.
- Whitelist post-failure rediscovery. A failure on a stale cached host clears the cache and runs one extra discovery round.
- Log rate-limit. Per-candidate WARN is rate-limited to 1 per 10s per (pool, host:port); suppressed lines log at DEBUG.
- pg_doorman_fallback_host cleanup on switchover. The old (host, port) label is removed when the whitelist changes.
- New metric pg_doorman_fallback_candidate_failures_total{pool, reason}. Reasons: connect_error, startup_error, server_unavailable, timeout, other.
Use IP addresses (not hostnames) in member.host: a 5s DNS hang consumes the full per-candidate budget.
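The per-host cooldown schedule can be sketched as follows. This is a minimal model that assumes the cooldown starts at fallback_connect_timeout and doubles with each consecutive failure up to the 60s cap; the function name and the `consecutive_failures` counter are hypothetical, and the reset after the window elapses is left to the caller (it would set the counter back to zero).

```rust
use std::time::Duration;

/// Upper bound on the cooldown, per the 60s cap described in the changelog.
const MAX_COOLDOWN: Duration = Duration::from_secs(60);

/// Cooldown after `consecutive_failures` failures in a row (1 = first failure).
/// `base` stands in for fallback_connect_timeout.
pub fn cooldown_for(base: Duration, consecutive_failures: u32) -> Duration {
    if consecutive_failures == 0 {
        return Duration::ZERO;
    }
    // Bound the shift so the multiplier cannot overflow a u32.
    let doublings = (consecutive_failures - 1).min(16);
    base.saturating_mul(1u32 << doublings).min(MAX_COOLDOWN)
}
```

With the default 5s base, consecutive failures yield 5s, 10s, 20s, 40s, and then stay at 60s.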
3.6.3 Apr 28, 2026
Fix: per-connection read buffer leak under multi-MiB simple-query INSERTs
Per-connection reusable read buffers (Client.read_buf, Server.read_buf) retained the largest allocation each connection had served. After one multi-MiB simple-query INSERT, every subsequent small message split out of that allocation, and the reusable buffer reclaimed the multi-MiB region as soon as the previous BytesMut was dropped. Across thousands of clients in transaction mode, occasional megabyte-sized payloads compounded into a 100 MB → 4 GB pooler RSS regression.
read_message_reuse and read_message_body_reuse now drop the backing allocation before each read when the buffer's capacity exceeds 256 KiB and fall back to a fresh 16 KiB buffer. The steady-state path (capacity within threshold) is unchanged.
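The shrink rule can be illustrated with a plain `Vec<u8>` standing in for the real buffer type (the changelog mentions BytesMut; `Vec<u8>` is used here only to keep the sketch dependency-free). The function name and constants are assumptions made for this example.

```rust
/// Capacity above which the backing allocation is dropped (256 KiB).
const SHRINK_THRESHOLD: usize = 256 * 1024;
/// Capacity of the fresh buffer used after a drop (16 KiB).
const FRESH_CAPACITY: usize = 16 * 1024;

/// Prepare a reusable read buffer for the next message.
pub fn prepare_read_buf(buf: &mut Vec<u8>) {
    if buf.capacity() > SHRINK_THRESHOLD {
        // Oversized: release the large allocation instead of retaining it.
        *buf = Vec::with_capacity(FRESH_CAPACITY);
    } else {
        // Steady state: keep the allocation, discard only the contents.
        buf.clear();
    }
}
```

The point of the design is that one rare multi-MiB message no longer pins a multi-MiB allocation to the connection for the rest of its life.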
3.6.2 Apr 27, 2026
New features:
- Unix socket listener. unix_socket_dir creates .s.PGSQL.<port> socket file. Connect with psql -h <dir> or pgbench -h <dir>. No TCP overhead on local connections.
- HBA local rule matching. local rules in pg_hba now apply to Unix socket connections. host/hostssl/hostnossl rules apply only to TCP. Previously local rules were parsed but ignored.
- unix_socket_mode controls socket file permissions. New [general] setting fixes the permission bits on .s.PGSQL.<port> after bind, so the access surface no longer depends on the process umask. Octal string, default "0600" (owner only). Set to "0660" to grant a Unix group, or "0666" to allow any local user. Validated at config load: invalid octal values, setuid/setgid/sticky bits, and overflow into bits above 0o777 are rejected upfront.
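The validation rules for unix_socket_mode can be sketched like this. It is an illustrative parser under the rules stated above, not pg_doorman's actual config code; the function name and error strings are hypothetical.

```rust
/// Parse an octal permission string such as "0600".
/// Rejects non-octal input, overflow, and any bit above 0o777
/// (which includes setuid, setgid, and sticky).
pub fn parse_socket_mode(s: &str) -> Result<u32, String> {
    if s.is_empty() || !s.bytes().all(|b| (b'0'..=b'7').contains(&b)) {
        return Err(format!("invalid octal mode: {:?}", s));
    }
    let mode = u32::from_str_radix(s, 8)
        .map_err(|e| format!("mode {:?} out of range: {}", s, e))?;
    if mode > 0o777 {
        return Err(format!("mode {:?} sets bits above 0o777", s));
    }
    Ok(mode)
}
```

For example, "0660" parses to 0o660, while "4755" (setuid) and "0888" (not octal) are both rejected.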
Known limitations (Unix socket):
- Unix listener not handed off during SIGUSR2 binary upgrade. New process re-creates the socket; connections refused for ~100ms.
- only_ssl_connections does not reject Unix socket connections. Unix sockets do not need TLS for transport security.
3.6.1 Apr 27, 2026
openssl 0.10.78 (CVE-2026-41678, CVE-2026-41681)
openssl 0.10.72 is affected by CVE-2026-41678 and CVE-2026-41681; some registry mirrors refuse downloads on that basis. pg_doorman now depends on openssl 0.10.78 and openssl-sys 0.9.114. API-compatible — no source changes.
3.6.0 Apr 24, 2026
Patroni-assisted fallback
When pg_doorman runs next to PostgreSQL on the same machine and connects via unix socket, a Patroni switchover or PostgreSQL crash leaves the pooler without a backend. With patroni_api_urls configured, pg_doorman queries the Patroni REST API /cluster endpoint, picks a live cluster member, and routes new connections there.
Candidate selection: sync_standby first (most likely next leader), then replica, then any other member. Members with noloadbalance, nofailover, or archive tags are excluded. All candidates are TCP-probed in parallel; the first responding sync_standby wins immediately.
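The ordering and exclusion rules can be sketched as a filter followed by a stable sort. The `Member` struct is a simplified, hypothetical stand-in for a Patroni /cluster entry; in particular, tags are modelled here as a plain list of the tag names that are set, which is an assumption of this example. The parallel TCP probe is not modelled.

```rust
/// Simplified, hypothetical view of one cluster member.
#[derive(Debug, Clone, PartialEq)]
pub struct Member {
    pub host: String,
    pub role: String,
    pub tags: Vec<String>,
}

/// Lower rank is tried first: sync_standby, then replica, then anything else.
fn role_rank(role: &str) -> u8 {
    match role {
        "sync_standby" => 0,
        "replica" => 1,
        _ => 2,
    }
}

/// Drop members carrying an excluded tag, then order the rest by role.
pub fn order_candidates(members: &[Member]) -> Vec<Member> {
    let mut out: Vec<Member> = members
        .iter()
        .filter(|m| {
            !m.tags
                .iter()
                .any(|t| matches!(t.as_str(), "noloadbalance" | "nofailover" | "archive"))
        })
        .cloned()
        .collect();
    // sort_by_key is stable, so members with the same role keep their input order.
    out.sort_by_key(|m| role_rank(&m.role));
    out
}
```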
The local backend stays in cooldown for fallback_cooldown (default 30s). During the cooldown, subsequent connection requests reuse the cached fallback host without re-querying Patroni. Fallback connections use a short fallback_lifetime (defaults to fallback_cooldown) so the pool returns to the local backend once it recovers.
Configuration:
pools:
  mydb:
    patroni_api_urls:
      - "http://10.0.0.1:8008"
      - "http://10.0.0.2:8008"
    fallback_cooldown: "30s"
    patroni_api_timeout: "5s"
    fallback_connect_timeout: "5s"
Prometheus metrics: pg_doorman_patroni_api_requests_total, pg_doorman_fallback_connections_total, pg_doorman_patroni_api_errors_total, pg_doorman_fallback_active, pg_doorman_patroni_api_duration_seconds, pg_doorman_fallback_host, pg_doorman_fallback_cache_hits_total.
If you tracked this feature under its working name in 3.5.x dev builds, the config keys and metric names changed before the public release: patroni_discovery_urls → patroni_api_urls, failover_blacklist_duration → fallback_cooldown, failover_discovery_timeout → patroni_api_timeout, failover_connect_timeout → fallback_connect_timeout, failover_server_lifetime → fallback_lifetime. Old pg_doorman_failover_* metrics are renamed to pg_doorman_patroni_api_* / pg_doorman_fallback_*.
Server-side TLS (pg_doorman → PostgreSQL)
Six SSL modes matching libpq semantics: disable, allow (default), prefer, require, verify-ca, verify-full. Mutual TLS supported via server_tls_certificate / server_tls_private_key.
Configuration is per-pool with global defaults in [general]. Cancel requests use TLS when the main connection used TLS.
Breaking change: server_tls (bool) and verify_server_certificate (bool) are removed. They were parsed but non-functional. Replace with:
| Old config | New config |
|---|---|
| server_tls: false | server_tls_mode: "disable" |
| server_tls: true | server_tls_mode: "require" |
| server_tls: true + verify_server_certificate: true | server_tls_mode: "verify-full" |
| (not set) | server_tls_mode: "allow" (new default) |
The new default allow tries plain TCP first. If the server rejects the connection (e.g. pg_hba.conf requires TLS), pg_doorman retries with TLS on a new TCP socket. This matches libpq sslmode=allow.
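For anyone scripting a config migration, the replacement table can be expressed as a small mapping. This is a hypothetical helper, not part of pg_doorman. The table does not say what to do when verify_server_certificate is set while server_tls is false or absent; this sketch assumes the verify flag is ignored in those cases.

```rust
/// Map the removed boolean settings to a server_tls_mode value.
/// `server_tls` is None when the old key was not set at all.
pub fn migrate_server_tls(server_tls: Option<bool>, verify_server_certificate: bool) -> &'static str {
    match (server_tls, verify_server_certificate) {
        // Not set: the new default.
        (None, _) => "allow",
        // Assumption: the verify flag has no effect when TLS was disabled.
        (Some(false), _) => "disable",
        (Some(true), true) => "verify-full",
        (Some(true), false) => "require",
    }
}
```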
SHOW SERVERS now includes a tls column showing whether each backend connection uses TLS.
3.5.3 Apr 22, 2026
Prepared statement cache overflow under concurrent load
The pool-level prepared statement cache could grow well above its configured prepared_statements_cache_size under concurrent client traffic. Production showed 480 entries with a limit of 300. The check-then-insert sequence in the cache had a race: multiple clients passed the size check simultaneously, each inserted without evicting. Now insertion happens first, followed by eviction in a loop until the cache is within bounds.
3.5.2 Apr 21, 2026
Semaphore permit leak on direct handoff
Each return_object handoff (delivering a connection to a waiting client via oneshot channel) permanently consumed one semaphore permit. After max_size handoffs the pool semaphore was fully drained, blocking all new timeout_get callers. The pool could not create connections and stabilized at whatever size it reached during cold start (typically 4-8 out of 40).
Root cause: wrap_checkout calls permit.forget(), and the handoff path in return_object skipped add_permits(1). Now return_object restores the permit on both the handoff and idle-queue paths. Compensating add_permits(1) in pre_replace_one removed (no longer needed).
Burst gate select race
The tokio::select! in the burst gate loop randomly picked among ready branches. When sleep(5ms) or create_done won over an already-delivered oneshot, the connection was silently dropped, inflating slots.size without a live server. Fixed with biased; (oneshot checked first) and a try_recv drain that pushes orphaned connections to idle without double-counting the permit.
Migration fixes
- Client ID collision after migration. The new process started its connection counter at 0, colliding with migrated client IDs. Now the counter advances past the highest migrated ID.
- SCRAM passthrough state preserved. The ClientKey from the first client's SCRAM handshake is serialized in the migration payload (v2 format, backward compatible). The new process skips the ScramPending fallback to server_password.
Session mode statistics fix
xact_time percentiles in session mode showed the entire session duration instead of individual transaction time. Now recorded per-transaction at each ReadyForQuery(Idle), matching transaction mode semantics.
query_time had the same accumulation bug: the timer was set once before the inner loop and never reset, so each subsequent query reported the cumulative session duration. Now reset per-query in session mode.
Adaptive anticipation budget
Anticipation wait (formerly fixed 300-500ms) scales with real transaction latency: xact_p99 * 2 +/- 20% jitter, clamped to [5ms, 500ms]. Cold start default: 100ms.
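The budget formula can be sketched with integer arithmetic. The function name is hypothetical, the jitter is passed in as a per-mille value so the example stays deterministic, and two details are assumptions of this sketch: jitter is applied before clamping, and the cold-start default is returned without jitter.

```rust
use std::time::Duration;

/// Anticipation budget: xact_p99 * 2, jittered by up to +/- 20%, clamped to [5ms, 500ms].
/// `jitter_permille` is expected in [-200, 200]; out-of-range values are clamped.
/// `xact_p99` is None on cold start, when no latency has been observed yet.
pub fn anticipation_budget(xact_p99: Option<Duration>, jitter_permille: i64) -> Duration {
    let p99 = match xact_p99 {
        Some(d) => d,
        None => return Duration::from_millis(100), // cold-start default
    };
    let jitter = jitter_permille.clamp(-200, 200);
    // Cap the input at 10s so the integer math below cannot overflow.
    let base_us = (p99.as_micros().min(10_000_000) as i64) * 2;
    let jittered_us = base_us * (1000 + jitter) / 1000;
    Duration::from_micros(jittered_us as u64)
        .clamp(Duration::from_millis(5), Duration::from_millis(500))
}
```

A 100ms p99 therefore yields a budget between 160ms and 240ms, while a 1ms p99 is raised to the 5ms floor.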
Diagnostic logging
Slow checkout warnings (>500ms) now include pool state: size, avail, waiting, inflight, creates, gate_waits, antic_ok, antic_to, fallback. Phase-specific warnings added for semaphore timeout, burst gate timeout, coordinator exhaustion, and create failure.
3.5.1 Apr 20, 2026
systemd Type=notify support
pg_doorman now sends sd_notify(READY=1) on startup and sd_notify(MAINPID=<child_pid>) during binary upgrade. With Type=notify in the systemd unit, systemctl reload performs a zero-downtime binary upgrade without PID tracking issues — systemd follows the new process correctly and does not restart the service.
The shipped pg_doorman.service changes from Type=forking + --daemon to Type=notify (foreground). Existing installations using --daemon continue to work but do not benefit from client migration.
Docker STOPSIGNAL changed from SIGINT to SIGTERM to prevent binary upgrade in containers (where PID 1 exit kills the container).
3.5.0 Apr 15, 2026
Client migration during binary upgrade
Idle clients now transfer to the new process via Unix socket (SCM_RIGHTS) without reconnecting. Active-transaction clients finish their transaction on the old process, then migrate. Prepared statement caches are serialized and transparently re-parsed on the new backend. The old process exits once all clients have migrated or shutdown_timeout expires.
TLS connection migration (opt-in)
Build with --features tls-migration to migrate TLS sessions without re-handshake. A patched vendored OpenSSL 3.5.5 exports/imports symmetric cipher state (keys, IVs, sequence numbers). Linux-only. Offline builds supported via OPENSSL_SOURCE_TARBALL env var with SHA-256 verification.
3.4.0 Apr 11, 2026
Pool Coordinator — database-level connection limits
When multiple user pools share one PostgreSQL database, the sum of their pool_size values can exceed max_connections. A spike in one pool starves the others, or PostgreSQL rejects connections outright.
max_db_connections caps total backend connections per database across all user pools. When the cap is reached, the coordinator frees capacity through three mechanisms, tried in order:
- Reserve pool. If reserve_pool_size > 0 and the reserve has headroom, a permit is granted immediately, with no eviction and no wait. The reserve is a burst buffer: idle reserve connections are upgraded to main permits by the retain cycle once pressure drops, and closed if they stay idle longer than min_connection_lifetime.
- Eviction. The coordinator closes one idle connection from a peer pool with the largest surplus above its min_guaranteed_pool_size floor. Candidates are ranked by p95 transaction time: slow pools donate first, because a 1 ms reconnect cost is negligible against a 15 ms p95 but doubles a 0.96 ms one. Only connections older than min_connection_lifetime (default 30 s) are eligible, which suppresses cyclic reconnect between pools that take turns stealing slots.
- Wait. If nothing is evictable, the caller parks for up to reserve_pool_timeout (default 3 s), waking on any peer connection return or permit drop. After the wait, the reserve is retried once more before the client receives an error.
Disabled by default (max_db_connections = 0) — zero overhead when not configured. The hot path (idle connection reuse) never touches the coordinator; only new connection creation does, at the cost of one atomic operation.
New pool-level config fields:
| Parameter | Default | Purpose |
|---|---|---|
| max_db_connections | 0 (disabled) | Hard cap on backend connections per database |
| min_connection_lifetime | 30000 ms | Eviction age floor: connections younger than this are immune |
| reserve_pool_size | 0 (disabled) | Extra permits above the cap, granted on burst |
| reserve_pool_timeout | 3000 ms | Coordinator wait budget before error |
| min_guaranteed_pool_size | 0 | Per-user eviction protection floor |
New admin commands: SHOW POOL_COORDINATOR (per-database coordinator state), SHOW POOL_SCALING (per-pool checkout counters). Both are also exported as Prometheus metrics under pg_doorman_pool_coordinator{type, database} and pg_doorman_pool_scaling{type, user, database}.
See the pool pressure tutorial for acquisition phases, tuning recipes, and alert examples.
Connection checkout under pressure
Replaces scaling_cooldown_sleep (a fixed 10 ms delay before creating a backend connection) with a multi-phase checkout that reuses connections about to be returned before resorting to connect().
When the idle pool is empty and the pool is above its warm threshold (scaling_warm_pool_ratio, default 20%), a caller first spins briefly (scaling_fast_retries, default 10 yield iterations), then registers a direct-handoff waiter. Connections returned by other clients are delivered through the waiter channel — no idle-queue round-trip, no race with other checkout attempts. The waiter deadline is bounded by query_wait_timeout minus a 500 ms reserve for the create path. If no connection arrives, the caller proceeds to create.
Backend connect() calls are capped at scaling_max_parallel_creates (default 2) per pool. Callers above the cap wait for a peer create to finish or a connection to be returned. Background replenish (min_pool_size) respects the same cap and defers to the next retain cycle when the gate is full, so it does not compete with client-driven creates during spikes.
Connections nearing server_lifetime expiry (95% of age) trigger a pre-replacement: a background task creates a successor before the old connection fails recycle, so the next checkout hits the hot path.
The direct-handoff queue is FIFO. On a 500-client / 40-connection AWS Fargate benchmark, p99/p50 ratio is 1.08 (pg_doorman) vs 25.5 (Odyssey). Every client pays roughly the same queue cost.
Migration: remove scaling_cooldown_sleep from your config if present. Replace with scaling_max_parallel_creates (default 2) if you need to tune the concurrency cap.
Improvements:
- Runtime log level control. SET log_level = 'debug' changes the log filter without restart; SET log_level = 'warn,pg_doorman::pool::pool_coordinator=debug' targets specific modules. SHOW LOG_LEVEL displays the current filter. Changes are ephemeral (lost on restart).
- Log readability overhaul. Consistent [user@pool #cN] prefix. Durations as 4m30s instead of raw milliseconds. Stats line in logfmt. PG error newlines escaped. Expensive debug computations guarded by log_enabled!() to avoid allocations at production log levels.
- Auth failure logs include client IP. SCRAM, MD5, JWT, and PAM failures show the source address.
- Replenish failure noise suppression. Repeated min_pool_size failures log once at warn, then a periodic reminder every ~10 minutes with the failure count.
- avg_xact_time column in SHOW POOLS. Average transaction time per pool, visible alongside existing connection counts.
- Smart session cleanup in transaction mode. pg_doorman tracks which session state a client dirtied (SET, DECLARE CURSOR, prepared statements) and sends the matching reset on checkin. If the client cleaned up after itself (RESET ALL, CLOSE ALL, DEALLOCATE ALL, or DISCARD ALL), pg_doorman sees the confirmation and skips its own reset. Drivers like jackc/pgx that send a cleanup batch on disconnect no longer cause a redundant round-trip to PostgreSQL. A SET without a follow-up reset still triggers cleanup as before.
3.3.5 Mar 31, 2026
Bug Fixes:
- Prepared statement eviction during batch breaks buffered Bind. When a client sent a batch like Parse(A), Bind(A), Parse(C), Sync and Parse(C) triggered server-side LRU eviction of statement A, the Close(A) was sent to PostgreSQL immediately (out-of-band), deleting A before the client buffer was flushed. Bind(A) then failed with prepared statement "DOORMAN_X" does not exist (error 26000). Two fixes: (1) has_prepared_statement() now promotes entries in the LRU on access (get() instead of contains()), so actively-used statements resist eviction. (2) Eviction Close is deferred until after the batch completes: the statement stays alive on PostgreSQL while Binds in the buffer are processed, then Close is sent as post-batch cleanup. If the client disconnects before Sync, checkin_cleanup detects the pending deferred closes and triggers DEALLOCATE ALL.
3.3.4 Mar 30, 2026
Bug Fixes:
- Prepared statement cache desync after client disconnect. When a client sent Parse but disconnected before Sync/Flush, pg_doorman registered the statement in the server-side LRU cache but never sent the actual Parse to PostgreSQL (it was still in the client buffer, which was dropped on disconnect). The next client that got the same server connection and used the same query saw the stale cache entry, skipped sending Parse, and received prepared statement "DOORMAN_X" does not exist (error 26000) from PostgreSQL. Fixed by tracking a has_pending_cache_entries flag on the server connection: set when a statement is added to the cache without immediate Parse confirmation, cleared after successful buffer flush. If the client disconnects before flushing, checkin_cleanup detects the flag and triggers DEALLOCATE ALL to re-synchronize the cache. Zero overhead on the normal path (one boolean check per checkin).
3.3.3 Mar 26, 2026
Bug Fixes:
- Log spam from missing /proc/net/tcp6 when IPv6 disabled. get_socket_states_count failed entirely if any of the three /proc files was absent, logging errors every 15 seconds and losing tcp/unix metrics that were available. Missing files are now skipped and their counters stay at zero. Other I/O errors (permission denied) still propagate.
- Protocol violation when streaming large DataRow with cached prepared statements. handle_large_data_row wrote accumulated protocol messages (BindComplete, RowDescription) directly to the client socket, bypassing reorder_parse_complete_responses. When Parse was skipped (prepared statement cache hit), the client received BindComplete without the synthetic ParseComplete, causing Received backend message BindComplete while expecting ParseCompleteMessage in Npgsql and similar drivers. Triggered when message_size_to_be_stream ≤ 64KB. Fixed by returning accumulated messages from recv() before entering the streaming path, so response reordering runs first. Same fix applied to handle_large_copy_data.
3.3.2 Mar 1, 2026
Breaking Changes:
- auth_query config field renames: Two fields in the auth_query section have been renamed for clarity. auth_query.pool_size (number of connections for running auth queries) is now auth_query.workers. auth_query.default_pool_size (data pool size for dynamic users) is now auth_query.pool_size, matching the same parameter name used in static pools. Migration: rename pool_size to workers and default_pool_size to pool_size in your auth_query config. If you don't update, the old pool_size value (typically 1-2) will be interpreted as the data pool size, drastically reducing connection capacity. The old default_pool_size key is silently ignored and defaults to 40.
Bug Fixes:
- Session mode: keep server connections alive after SQL errors. A query like SELECT 1/0 returns an ErrorResponse from PostgreSQL but leaves the connection fully usable. Previously, handle_error_response called mark_bad() unconditionally in async mode, so the connection was destroyed at session end. Now mark_bad is skipped when the pool runs in session mode. Transaction mode still calls mark_bad because the connection returns to a shared pool where protocol desync is dangerous.
- Pool-level server_lifetime and idle_timeout overrides ignored: Pool-level overrides for server_lifetime and idle_timeout were silently ignored; the general (global) values were always used instead. Fixed in 6 places across 3 pool creation contexts (static pools, auth_query shared pools, dynamic pools). Now pool.server_lifetime and pool.idle_timeout correctly override the general settings when specified.
- idle_timeout default was 83 hours instead of 10 minutes: The default idle_timeout was set to 300,000,000ms (83 hours), effectively disabling idle connection cleanup. Idle server connections could accumulate indefinitely. Changed default to 600,000ms (10 minutes).
- retain_connections_max quota exhaustion causing unlimited closure: When retain_connections_max > 0 and the global counter reached the limit, the remaining quota became 0 via saturating_sub. Since 0 means "unlimited" in retain_oldest_first(), pools processed after quota exhaustion lost ALL idle connections in a single retain cycle instead of none. With non-deterministic HashMap iteration order, this bug manifested as random pools losing all connections. Fixed by adding an early return when the quota is exhausted.
- retain_connections_max doc comment incorrectly stated default as 0 (unlimited): The actual default is 3.
- server_lifetime default changed from 5 minutes to 20 minutes: The previous default of 5 minutes was shorter than idle_timeout (10 minutes), which meant idle_timeout could never trigger; connections were always killed by server_lifetime first. Changed to 20 minutes so that idle_timeout (10 min) handles idle cleanup while server_lifetime (20 min) rotates long-lived connections. Note: idle_timeout only applies to connections that have been used at least once; prewarmed/replenished connections that were never checked out by a client are not subject to idle_timeout and will only be closed when server_lifetime expires.
- idle_timeout = 0 did not disable idle timeout: idle_timeout = 0 should disable idle connection cleanup, matching PgBouncer's server_idle_timeout = 0 and pg_doorman's server_lifetime = 0. Instead, pg_doorman closed connections after ~1 ms of idle time. Fixed by adding an idle_timeout_ms > 0 guard before the elapsed-time check.
- idle_timeout had no jitter, causing synchronized mass closures: Unlike server_lifetime, which applies ±20% per-connection jitter to prevent thundering herd, idle_timeout used a single pool-wide value. When many connections became idle simultaneously (e.g., after a traffic burst), they all expired at the exact same moment, causing mass closures in one retain cycle. Now idle_timeout applies the same ±20% per-connection jitter as server_lifetime.
- retain_connections_max unfair quota distribution across pools: The retain cycle iterated pools via HashMap, whose order is deterministic within a process (fixed RandomState seed). The same pool always got iterated first and consumed the entire retain_connections_max quota, starving other pools. Expired connections in starved pools were never cleaned up by retain; clients had to discover them via failed recycle() checks, adding latency. Fixed by shuffling pool iteration order each cycle.
- Retain and replenish used separate pool snapshots: The retain and replenish phases each called get_all_pools() separately. If POOLS was atomically updated between them (config reload, dynamic pool GC), retain operated on one set of pools and replenish on another, potentially missing pools that need replenishment. Fixed by using a single snapshot for both phases.
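The retain_connections_max quota bug in the list above comes down to an overloaded zero. The sketch below models the fixed budget calculation; the function name and the Option-based return value are hypothetical and only mirror the "0 means unlimited" convention described for retain_oldest_first().

```rust
/// How many idle connections the next pool may close in this retain cycle.
/// Returns None to skip the pool, Some(0) for "unlimited" (limit disabled),
/// and Some(n) for "at most n".
pub fn pool_close_budget(retain_connections_max: usize, closed_so_far: usize) -> Option<usize> {
    if retain_connections_max == 0 {
        return Some(0); // limit disabled: 0 keeps its "unlimited" meaning
    }
    if closed_so_far >= retain_connections_max {
        // Early return on exhaustion. Without it, saturating_sub would yield 0
        // here and the callee would read that as "unlimited".
        return None;
    }
    Some(retain_connections_max - closed_so_far)
}
```

With a limit of 3, a cycle that has already closed 3 connections skips every remaining pool instead of draining them.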
Testing:
- PHP PDO_PGSQL driver added to test infrastructure. PHP 8.4 with pdo_pgsql extension is now included in the Nix-based Docker test image. Two BDD scenarios verify basic connectivity (SELECT 1) and session mode behavior (SQL error does not change backend PID). Run with make test-php or --tags @php.
New Features:
- pool_size observability: New pg_doorman_pool_size Prometheus gauge exposes the configured maximum pool size per user/database. The pool_size column is also added to SHOW POOLS and SHOW POOLS_EXTENDED admin commands (after sv_login), allowing operators to compare current server connections against configured capacity directly from the admin console. Works for both static and dynamic (auth_query) pools.
- PAUSE, RESUME, RECONNECT admin commands: New admin console commands for managing connection pools. PAUSE [db] blocks new backend connection acquisition (active transactions continue). RESUME [db] lifts the pause and unblocks waiting clients. RECONNECT [db] forces connection rotation by incrementing the pool epoch: idle connections are immediately closed and active connections are discarded when returned to the pool. Without arguments, all pools are affected; with a database name, only matching pools. Specifying a nonexistent database returns an error. Use SHOW POOLS to see the paused status column.
- min_pool_size for dynamic auth_query passthrough pools: New auth_query.min_pool_size setting controls the minimum number of backend connections maintained per dynamic user pool in passthrough mode. Connections are prewarmed in the background when the pool is first created and replenished by the retain cycle after server_lifetime expiry. Pools with min_pool_size > 0 are never garbage-collected. Default is 0 (no prewarm, backward compatible). Note: total backend connections scale as active_users × min_pool_size.
3.3.1 Feb 26, 2026
Bug Fixes:
- Fix Ctrl+C in foreground mode: Pressing Ctrl+C in foreground mode (with TTY attached) now performs a clean graceful shutdown instead of triggering a binary upgrade. Previously, each Ctrl+C would spawn a new pg_doorman process via --inherit-fd, leaving orphan processes accumulating. SIGINT in daemon mode (no TTY) retains its legacy binary upgrade behavior for backward compatibility with existing systemd units.
- Minimum pool size enforcement (min_pool_size): The min_pool_size user setting is now enforced at runtime. After each connection retain cycle, pg_doorman checks pool sizes and creates new connections to maintain the configured minimum. Previously, min_pool_size was accepted in config but never applied: pools started empty and could drop to 0 connections even with min_pool_size set. Replenishment stops on the first connection failure to avoid hammering an unavailable server.
New Features:
- SIGUSR2 for binary upgrade: New dedicated signal SIGUSR2 triggers binary upgrade + graceful shutdown in all modes (daemon and foreground). This is now the recommended signal for binary upgrades. The systemd service file has been updated to use SIGUSR2 for ExecReload.
- UPGRADE admin command: New admin console command that triggers binary upgrade via SIGUSR2. Use it from psql connected to the admin database: UPGRADE;.
Improvements:
- Pool prewarm at startup: When min_pool_size is configured, pg_doorman now creates the minimum number of connections immediately at startup, before the first retain cycle. Previously, pools started empty and connections were only created lazily on first client request or after the first retain interval (default 60s). This eliminates cold-start latency for the first clients connecting after pg_doorman restart.
- Configurable connection scaling parameters: New general settings scaling_warm_pool_ratio, scaling_fast_retries, and scaling_cooldown_sleep allow tuning connection pool scaling behavior. All three can be overridden at the pool level. scaling_cooldown_sleep uses the human-readable Duration type (e.g. "10ms", "1s") consistent with other timeout fields.
- max_concurrent_creates setting: Controls the maximum number of server connections that can be created concurrently per pool. Uses a semaphore instead of a mutex for parallel connection creation.
3.3.0 Feb 23, 2026
New Features:
- Dynamic user authentication (auth_query): PgDoorman can now authenticate users dynamically by querying PostgreSQL at connection time, with no need to list every user in the config. Supports pg_shadow, custom tables, and SECURITY DEFINER functions. The query must return a column named passwd or password (or any single column) containing an MD5 or SCRAM-SHA-256 hash.
- Passthrough authentication: Default mode for both static and dynamic users. PgDoorman reuses the client's cryptographic proof (MD5 hash or SCRAM ClientKey) to authenticate to the backend automatically. No plaintext server_password in config needed when the pool user matches the backend PostgreSQL user.
- Two auth_query modes:
  - Passthrough mode (default): each dynamic user gets their own backend connection pool and authenticates as themselves, preserving per-user identity on the backend.
  - Dedicated mode (server_user set): all dynamic users share a single backend pool under one PostgreSQL role.
- Auth query caching: DashMap-based cache with configurable TTL, double-checked locking, rate-limited refetch, and request coalescing. Supports separate TTLs for successful and failed lookups.
- SHOW AUTH_QUERY admin command: Displays per-pool metrics: cache entries/hits/misses, auth success/failure counters, executor stats, and dynamic pool count.
- Prometheus metrics for auth_query: New metric families pg_doorman_auth_query_cache, pg_doorman_auth_query_auth, pg_doorman_auth_query_executor, pg_doorman_auth_query_dynamic_pools.
- Idle dynamic pool garbage collection: Background task cleans up expired dynamic pools when all connections have been idle beyond server_lifetime. Zero overhead for static-only configs.
- Smart password column lookup: Password column resolved by name (passwd → password → single-column fallback), works with pg_shadow, custom tables, and arbitrary single-column queries.
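The password column lookup order in the list above can be sketched as a short function over the result's column names. The function name is hypothetical, and exact lowercase matching is an assumption of this example.

```rust
/// Index of the password column: "passwd" first, then "password",
/// then the only column if the result has exactly one.
pub fn password_column_index(columns: &[&str]) -> Option<usize> {
    for name in ["passwd", "password"].iter() {
        if let Some(i) = columns.iter().position(|c| *c == *name) {
            return Some(i);
        }
    }
    if columns.len() == 1 {
        Some(0)
    } else {
        None
    }
}
```

A pg_shadow-style result with columns (usename, passwd) resolves to index 1, and a single-column result resolves to index 0 regardless of its name.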
Improvements:
- server_username/server_password now optional: Previously documented as required for MD5/SCRAM hash configs. Now only needed when the backend user differs from the pool user (username mapping, JWT auth).
- Data-driven config & docs generation: fields.yaml is the single source of truth for all config field descriptions (EN/RU). Reference docs, annotated configs, and inline comments are all generated from it.
Testing:
- 39 new BDD scenarios (260+ steps) covering auth_query executor, end-to-end auth, HBA integration, passthrough mode, SCRAM-only auth, RELOAD/GC lifecycle, observability, and static user passthrough.
3.2.4 Feb 20, 2026
New Features:
- Annotated config generation: The generate command now produces well-documented configuration files with inline comments for every parameter by default. Previously it only did plain serde serialization without any documentation.
- --reference flag: Generates a complete reference config with example values without requiring a PostgreSQL connection. The root pg_doorman.toml and pg_doorman.yaml are now auto-generated from this flag, ensuring they always stay in sync with the codebase.
- --format (-f) flag: Explicitly choose output format (yaml or toml). Default output format changed from TOML to YAML. When --output is specified, format is auto-detected from file extension; --format overrides auto-detection.
- --russian-comments (--ru) flag: Generates comments in Russian for quick start guide. All ~100+ comment strings are translated to clear, simple Russian.
- --no-comments flag: Disables inline comments for minimal config output (plain serde serialization, the old default behavior).
- Passthrough authentication documentation: Documents passthrough auth as the default mode: server_username/server_password are no longer needed when the pool user matches the backend PostgreSQL user. PgDoorman reuses the client's MD5 hash or SCRAM ClientKey to authenticate to the backend automatically.
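The precedence between --format, --output, and the YAML default described in the list above can be sketched as follows. The `Format` enum and `resolve_format` are hypothetical names; treating any extension other than toml as YAML is an assumption of this example.

```rust
use std::path::Path;

#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Format {
    Yaml,
    Toml,
}

/// Explicit --format wins; otherwise the --output extension decides;
/// otherwise the default is YAML.
pub fn resolve_format(explicit: Option<Format>, output: Option<&Path>) -> Format {
    if let Some(f) = explicit {
        return f;
    }
    match output.and_then(|p| p.extension()).and_then(|e| e.to_str()) {
        Some("toml") => Format::Toml,
        _ => Format::Yaml,
    }
}
```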
Testing:
- Config field coverage guarantee: New test parses config struct source files (general.rs, pool.rs, user.rs, etc.) at compile time and verifies every pub field appears in annotated output. If someone adds a new config parameter but forgets to add it to annotated.rs, CI will fail with a clear message listing the missing fields.
- BDD tests for generate command: End-to-end tests that generate TOML and YAML configs, start pg_doorman with them, and verify client connectivity.
Bug Fixes:
- Fixed protocol desynchronization on prepared statement cache eviction in async mode: When asyncpg/SQLAlchemy uses Flush (instead of Sync) for pipelined Parse+Describe batches and the prepared statement LRU cache is full, eviction sends Close+Sync to the server. In async mode, recv() was exiting immediately when expected_responses==0, leaving CloseComplete and ReadyForQuery unread in the TCP buffer. The next recv() call would then read these stale messages instead of the expected response, causing protocol desynchronization. Fixed by temporarily disabling async mode during eviction so that recv() waits for ReadyForQuery as the natural loop terminator.
- Fixed generated config startup failure: syslog_prog_name and daemon_pid_file are now commented out by default in generated configs. Previously they were uncommented, causing pg_doorman to fail when started in foreground mode or when syslog was unavailable.
- Fixed Go test goroutine leak: TestLibPQPrepared now uses sync.WaitGroup to wait for all goroutines before test exit, fixing sporadic panics caused by logging after test completion.
- Fixed protocol violation on flush timeout (client now receives ErrorResponse): When the 5-second flush timeout fires (server TCP write blocks because the backend is overloaded or unreachable), the FlushTimeout error was propagating via ? through handle_sync_flush → transaction loop → handle() without sending any PostgreSQL protocol message to the client. The TCP connection was simply dropped, causing drivers like Npgsql to report "protocol violation" due to unexpected EOF. Now pg_doorman sends a proper ErrorResponse with SQLSTATE 58006 and a message containing "pooler is shut down now" before closing the connection, allowing client drivers to detect the error and reconnect gracefully.
3.2.3 Feb 10, 2026
Improvements:
- Jitter for server_lifetime (±20%): Connection lifetimes now have a random ±20% jitter applied to prevent mass disconnections from PostgreSQL. When pg_doorman is under heavy load, it creates many connections simultaneously, which previously caused them all to expire at the same time, creating spikes of connection closures. Now each connection gets an individual lifetime calculated as base_lifetime ± random(20%). For example, with server_lifetime: 300000 (5 minutes), actual lifetimes range from 240s to 360s, spreading connection closures evenly over time.
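The jitter calculation can be sketched in a few lines. This is an illustration only, not pg_doorman's code: the function name is hypothetical, and a uniform distribution is assumed because the entry does not say how the random offset is drawn.

```python
import random


def jittered_lifetime_ms(base_lifetime_ms, jitter_fraction=0.20, rng=random):
    """Return base_lifetime_ms shifted by a random offset within ±jitter_fraction.

    Assumption: the offset is drawn uniformly; the changelog only states ±20%.
    """
    delta = base_lifetime_ms * jitter_fraction
    return base_lifetime_ms + rng.uniform(-delta, delta)


if __name__ == "__main__":
    # With server_lifetime: 300000, every value falls between 240000 and 360000 ms.
    rng = random.Random(42)
    samples = [jittered_lifetime_ms(300_000, rng=rng) for _ in range(5)]
    print([round(s) for s in samples])
```

Because each connection draws its own offset, connections opened in the same burst expire at different moments instead of all at once.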
3.2.2 Feb 9, 2026
New Features:
- Configuration test mode (-t / --test-config): Added nginx-style configuration validation flag. Running pg_doorman -t or pg_doorman --test-config parses and validates the configuration file, reports success or errors, and exits without starting the server. Useful for CI/CD pipelines and pre-deployment configuration checks.
- Configuration validation before binary upgrade: When receiving SIGINT for graceful shutdown/binary upgrade, the server now validates the new binary's configuration using the -t flag before proceeding. If the configuration test fails, the shutdown is cancelled and critical error messages are logged to alert the operator. This prevents accidental downtime from deploying a binary with invalid configuration.
- New retain_connections_max configuration parameter: Controls the maximum number of idle connections to close per retain cycle. When set to 0, all idle connections that exceed idle_timeout or server_lifetime are closed immediately. Default is 3, providing controlled cleanup while preventing connection buildup. Previously, only 1 connection was closed per cycle, which could lead to slow connection cleanup when many connections became idle simultaneously. Connection closures are now logged for better observability.
- Oldest-first connection closure: When retain_connections_max > 0, connections are now closed in order of age (oldest first) rather than in queue order. This ensures that the oldest connections are always prioritized for closure, providing more predictable connection rotation behavior.
- New server_idle_check_timeout configuration parameter: Time after which an idle server connection should be checked before being given to a client (default: 30s). This helps detect dead connections caused by PostgreSQL restart, network issues, or server-side idle timeouts. When a connection has been idle longer than this timeout, pg_doorman sends a minimal query (;) to verify the connection is alive before returning it to the client. Set to 0 to disable.
- New tcp_user_timeout configuration parameter: Sets the TCP_USER_TIMEOUT socket option for client connections (in seconds). This helps detect dead client connections faster than keepalive probes when the connection is actively sending data but the remote end has become unreachable. Prevents 15-16 minute delays caused by TCP retransmission timeout. Only supported on Linux. Default is 60 seconds. Set to 0 to disable.
- Removed wait_rollback mechanism: The pooler no longer attempts to automatically wait for ROLLBACK from clients when a transaction enters an aborted state. This complex mechanism was causing protocol desynchronization issues with async clients and extended query protocol. Server connections in aborted transactions are now simply returned to the pool and cleaned up normally via ROLLBACK during checkin.
- Removed savepoint tracking: Removed the use_savepoint flag and related logic that was tracking SAVEPOINT usage. The pooler now treats savepoints as regular PostgreSQL commands without special handling.
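The interaction between retain_connections_max and oldest-first closure can be illustrated with a small selection function. This is a sketch under assumptions, not the pooler's implementation: IdleConn and select_for_closure are hypothetical names, and the real retain cycle may track ages and idle times differently.

```python
from dataclasses import dataclass


@dataclass
class IdleConn:
    conn_id: int
    age_ms: int   # time since the server connection was opened
    idle_ms: int  # time since the connection was last used


def select_for_closure(conns, idle_timeout_ms, server_lifetime_ms,
                       retain_connections_max):
    """Pick the connections one retain cycle would close.

    Candidates are connections past idle_timeout or server_lifetime.
    They are ordered oldest first; a limit of 0 means "close all of them".
    """
    expired = [
        c for c in conns
        if c.idle_ms > idle_timeout_ms or c.age_ms > server_lifetime_ms
    ]
    expired.sort(key=lambda c: c.age_ms, reverse=True)  # oldest first
    if retain_connections_max == 0:
        return expired
    return expired[:retain_connections_max]
```

With the default limit of 3, a burst of expired connections is drained over several cycles, and the oldest ones always go first.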
Bug Fixes:
- Fixed protocol desynchronization in async mode with simple prepared statements: When prepared_statements was disabled but clients used extended query protocol (Parse, Bind, Describe, Execute, Flush), the pooler wasn't tracking batch operations, causing expected_responses to be calculated as 0. This led to the pooler exiting the response loop immediately without waiting for server responses (ParseComplete, BindComplete, etc.). Now batch operations are tracked regardless of the prepared_statements setting.
Performance:
- Removed timeout-based waiting in async protocol: The pooler now tracks expected responses based on batch operations (Parse, Bind, Execute, etc.) and exits immediately when all responses are received. This eliminates unnecessary latency in pipeline/async workloads.
3.1.8 Jan 31, 2026
Bug Fixes:
- Fixed ParseComplete desynchronization in pipeline on errors: Fixed a protocol desynchronization issue (especially noticeable in the .NET Npgsql driver) where synthetic ParseComplete messages were not being inserted if an error occurred during a pipelined batch. When the pooler caches a prepared statement and skips sending Parse to the server, it must still provide a ParseComplete to the client. If an error occurs before subsequent commands are processed, the server skips them, and the pooler now ensures all missing synthetic ParseComplete messages are inserted into the response stream upon receiving an ErrorResponse or ReadyForQuery.
- Fixed incorrect use_savepoint state persistence: Fixed a bug where the use_savepoint flag (which disables automatic rollback on connection return if a savepoint was used) was not reset after a transaction ended.
3.1.7 Jan 28, 2026
Memory Optimization:
- DEALLOCATE now clears client prepared statements cache: When a client sends DEALLOCATE <name> or DEALLOCATE ALL via simple query protocol, the pooler now properly clears the corresponding entries from the client's internal prepared statements cache. Previously, synthetic OK responses were sent but the client cache was not cleared, causing memory to grow indefinitely for long-running connections using many unique prepared statements. This fix allows memory to be reclaimed when clients properly deallocate their statements.
- New client_prepared_statements_cache_size configuration parameter: Added protection against malicious or misbehaving clients that don't call DEALLOCATE and could exhaust server memory by creating unlimited prepared statements. When the per-client cache limit is reached, the oldest entry is evicted automatically. Set to 0 for unlimited (default, relies on client calling DEALLOCATE). Example: client_prepared_statements_cache_size: 1024 limits each client to 1024 cached prepared statements.
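The two changes above, DEALLOCATE clearing entries and a size cap with oldest-entry eviction, can be sketched together. The class below is a hypothetical model, not pg_doorman's data structure; in particular, it assumes "oldest" means oldest by insertion order, which the entry does not spell out.

```python
from collections import OrderedDict


class ClientStatementCache:
    """Per-client prepared statement cache with an optional size limit."""

    def __init__(self, max_size=0):
        # max_size == 0 means unlimited, mirroring the documented default.
        self.max_size = max_size
        self._entries = OrderedDict()

    def insert(self, name, query):
        """Store a statement; return the evicted statement name, or None."""
        evicted = None
        if name in self._entries:
            # Re-preparing an existing name replaces it and makes it newest.
            del self._entries[name]
        elif self.max_size and len(self._entries) >= self.max_size:
            # Assumption: evict the oldest entry by insertion order.
            evicted, _ = self._entries.popitem(last=False)
        self._entries[name] = query
        return evicted

    def deallocate(self, name):
        """Handle DEALLOCATE <name>: drop one entry if present."""
        self._entries.pop(name, None)

    def deallocate_all(self):
        """Handle DEALLOCATE ALL: drop every entry."""
        self._entries.clear()

    def __len__(self):
        return len(self._entries)
```

A client that never deallocates stays bounded by max_size, while a well-behaved client frees entries as soon as it sends DEALLOCATE.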
3.1.6 Jan 27, 2026
Bug Fixes:
- Fixed incorrect timing statistics (xact_time, wait_time, percentiles): The statistics module was using recent() (cached clock) without proper clock cache updates, causing transaction time, wait time, and their percentiles to show extremely large incorrect values (e.g., 100+ seconds instead of actual milliseconds). The root cause was that the quanta::Upkeep handle was not being stored, causing the upkeep thread to stop immediately after starting. Now the handle is properly retained for the lifetime of the server, ensuring Clock::recent() returns accurate cached time values.
- Fixed query time accumulation bug in transaction loop: Query times were incorrectly accumulated when multiple queries were executed within a single transaction. The query_start_at timestamp was only set once at the beginning of the transaction, causing each subsequent query's elapsed time to include all previous queries' durations (e.g., 10 queries of 100ms each would report the last query as ~1 second instead of 100ms). Now query_start_at is updated for each new message in the transaction loop, ensuring accurate per-query timing.
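The accumulation bug is easy to reproduce with a simulated clock. The sketch below is a simplified model, not the pooler's transaction loop: the function name and the reset_each_query switch are hypothetical, and only the query_start_at name comes from the entry above.

```python
def per_query_times_ms(durations_ms, reset_each_query):
    """Simulate per-query timing inside one transaction.

    durations_ms: how long each query actually takes.
    reset_each_query=False models the old behavior (query_start_at set once);
    reset_each_query=True models the fix (query_start_at set per message).
    """
    now = 0
    query_start_at = now
    reported = []
    for duration in durations_ms:
        if reset_each_query:
            query_start_at = now
        now += duration  # the query runs
        reported.append(now - query_start_at)
    return reported


if __name__ == "__main__":
    queries = [100] * 10
    print(per_query_times_ms(queries, reset_each_query=False))
    print(per_query_times_ms(queries, reset_each_query=True))
```

In the old model the tenth query is reported as 1000 ms because its measurement still starts at the beginning of the transaction; with the per-message reset every query reports 100 ms.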
New Features:
- New clock_resolution_statistics configuration parameter: Added general.clock_resolution_statistics parameter (default: 0.1ms = 100 microseconds) that controls how often the internal clock cache is updated. Lower values provide more accurate timing measurements for query/transaction percentiles, while higher values reduce CPU overhead. This parameter affects the accuracy of all timing statistics reported in the admin console and Prometheus metrics.
- Sub-millisecond precision for Duration values: Duration configuration parameters now support sub-millisecond precision:
  - New us suffix for microseconds (e.g., "100us" = 100 microseconds)
  - Decimal milliseconds support (e.g., "0.1ms" = 100 microseconds)
  - Internal representation changed from milliseconds to microseconds for higher precision
  - Full backward compatibility maintained: plain numbers are still interpreted as milliseconds
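The Duration rules listed above can be illustrated with a small parser that normalizes every accepted form to microseconds. This is a sketch of the documented behavior, not the project's parser: the function name and regular expression are hypothetical, and the s, m, h, and d suffixes are taken from the 3.1.0 entry on human-readable values.

```python
import re
from decimal import Decimal

# Multipliers to microseconds for each supported suffix.
_UNIT_TO_MICROS = {
    "us": 1,
    "ms": 1_000,
    "s": 1_000_000,
    "m": 60_000_000,
    "h": 3_600_000_000,
    "d": 86_400_000_000,
}

_DURATION_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(us|ms|s|m|h|d)?\s*$")


def parse_duration_micros(value):
    """Convert a duration setting to whole microseconds."""
    match = _DURATION_RE.match(str(value))
    if match is None:
        raise ValueError(f"invalid duration: {value!r}")
    number, unit = match.groups()
    # A bare number keeps its legacy meaning: milliseconds.
    factor = _UNIT_TO_MICROS[unit or "ms"]
    # Decimal avoids float rounding for inputs such as "0.1ms".
    return int(Decimal(number) * factor)
```

Under these rules "100us" and "0.1ms" both normalize to 100 microseconds, and a legacy value such as 3000 still means three seconds.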
3.1.5 Jan 25, 2026
Bug Fixes:
- Fixed PROTOCOL VIOLATION with batch PrepareAsync
- Rewritten ParseComplete insertion algorithm
Performance:
- Deferred connection acquisition for standalone BEGIN: When a client sends a standalone BEGIN; or begin; query (simple query protocol), the pooler now defers acquiring a server connection until the next message arrives. Since BEGIN itself doesn't perform any actual database operations, this optimization reduces connection pool contention when clients are slow to send their next query after starting a transaction.
  - Micro-optimized detection: first checks message size (12 bytes), then content using case-insensitive comparison
  - If client sends Terminate (X) after BEGIN, no server connection is acquired at all
  - The deferred BEGIN is automatically sent to the server before the actual query
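The size-then-content check can be sketched against the simple query wire format. This is an interpretation, not the pooler's code: it assumes the 12 bytes are the full Query message (1 type byte, a 4-byte length, the 6-byte text BEGIN;, and a trailing NUL), and both helper names are hypothetical.

```python
import struct

# 'Q' (1 byte) + int32 length (4) + b"BEGIN;" (6) + NUL terminator (1)
BEGIN_MESSAGE_LEN = 12


def simple_query(sql: bytes) -> bytes:
    """Build a PostgreSQL simple Query ('Q') message for the given SQL text."""
    payload = sql + b"\x00"
    # The length field counts itself plus the payload, but not the type byte.
    return b"Q" + struct.pack("!i", 4 + len(payload)) + payload


def is_standalone_begin(message: bytes) -> bool:
    """Cheap check: compare the size first, then the content case-insensitively."""
    if len(message) != BEGIN_MESSAGE_LEN:
        return False
    if message[0:1] != b"Q":
        return False
    (declared_len,) = struct.unpack("!i", message[1:5])
    if declared_len != len(message) - 1:
        return False
    return message[5:11].lower() == b"begin;" and message[11] == 0
```

The length comparison rejects almost every query before any bytes are compared; a same-sized statement such as COMMIT passes the size check and is then rejected by the content check.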
3.1.0 Jan 18, 2026
New Features:
- YAML configuration support: Added support for YAML configuration files (.yaml, .yml) as the primary and recommended format. The format is automatically detected based on file extension. TOML format remains fully supported for backward compatibility.
  - The generate command now outputs YAML or TOML based on the output file extension.
  - Include files can mix YAML and TOML formats.
  - New array syntax for users in YAML: users: [{ username: "user1", ... }]
- TOML backward compatibility: Full backward compatibility with legacy TOML format [pools.*.users.0] is maintained. Both the legacy map format and the new array format [[pools.*.users]] are supported.
- Username uniqueness validation: Added validation to reject duplicate usernames within a pool, ensuring configuration correctness.
- Human-readable configuration values: Duration and byte size parameters now support human-readable formats while maintaining backward compatibility with numeric values:
  - Duration: "3s", "5m", "1h", "1d" (or milliseconds: 3000)
  - Byte size: "1MB", "256M", "1GB" (or bytes: 1048576)
  - Example: connect_timeout: "3s" instead of connect_timeout: 3000
- Foreground mode binary upgrade: Added support for binary upgrade in foreground mode by passing the listener socket to the new process via the --inherit-fd argument. This enables zero-downtime upgrades without requiring daemon mode.
- Optional tokio runtime parameters: The following tokio runtime parameters are now optional and default to None (using tokio's built-in defaults): tokio_global_queue_interval, tokio_event_interval, worker_stack_size, and the new max_blocking_threads. Modern tokio versions handle these parameters well by default, so explicit configuration is no longer required in most cases.
- Improved graceful shutdown behavior:
  - During graceful shutdown, only clients with active transactions are now counted (instead of all connected clients), allowing faster shutdown when clients are idle.
  - After a client completes their transaction during shutdown, they receive a proper PostgreSQL protocol error (58006 - pooler is shut down now) instead of a connection reset.
  - Server connections are immediately released (marked as bad) after transaction completion during shutdown to conserve PostgreSQL connections.
  - All idle connections are immediately drained from pools when graceful shutdown starts, releasing PostgreSQL connections faster.
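The byte size forms listed under human-readable configuration values can be illustrated with a small parser. This is a sketch, not the project's implementation: the function name is hypothetical, binary multiples are assumed because the entry pairs "1MB" with 1048576, and the K, KB, and G suffixes are assumptions added by analogy with the forms the entry names.

```python
import re

# Assumption: binary multiples (1MB = 1024 * 1024 bytes).
_UNIT_TO_BYTES = {
    "": 1,
    "K": 1024, "KB": 1024,                # assumed by analogy
    "M": 1024 ** 2, "MB": 1024 ** 2,      # named in the changelog
    "G": 1024 ** 3, "GB": 1024 ** 3,      # "GB" named, "G" assumed
}

_SIZE_RE = re.compile(r"^\s*(\d+)\s*([A-Za-z]*)\s*$")


def parse_byte_size(value):
    """Convert a byte size setting to a number of bytes."""
    match = _SIZE_RE.match(str(value))
    if match is None:
        raise ValueError(f"invalid byte size: {value!r}")
    number, unit = match.groups()
    try:
        factor = _UNIT_TO_BYTES[unit.upper()]
    except KeyError:
        raise ValueError(f"unknown byte size unit: {unit!r}") from None
    # A bare number keeps its legacy meaning: bytes.
    return int(number) * factor
```

A plain integer passes through unchanged, which is what keeps existing numeric configs valid.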
Performance:
- Statistics module optimization: Major refactoring of the src/stats module for improved performance:
  - Replaced VecDeque with HDR histograms (hdrhistogram crate) for percentile calculations: O(1) percentile queries instead of O(n log n) sorting, ~95% memory reduction for latency tracking.
  - Histograms are now reset after each stats period (15 seconds) to provide accurate rolling window percentiles.
3.0.5 Jan 16, 2026
Bug Fixes:
- Fixed panic (capacity overflow) in startup message handling when receiving malformed messages with invalid length (less than 8 bytes or exceeding 10MB). Now gracefully rejects such connections with ClientBadStartup error.
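The length check can be sketched as a guard that runs before any buffer is allocated. This is an illustration, not the pooler's code: the exception class only borrows the ClientBadStartup name from the entry above, the function name is hypothetical, and 10MB is assumed to mean 10 * 1024 * 1024 bytes.

```python
import struct

MIN_STARTUP_LEN = 8                  # 4-byte length + 4-byte protocol/request code
MAX_STARTUP_LEN = 10 * 1024 * 1024   # assumption: 10MB as a binary multiple


class ClientBadStartup(Exception):
    """Raised when a client sends a malformed startup message."""


def read_startup_length(header: bytes) -> int:
    """Validate the declared length of a startup message.

    header must contain at least the first 4 bytes sent by the client.
    Returning the length only after validation means a bogus value can
    never be used to size a buffer.
    """
    if len(header) < 4:
        raise ClientBadStartup("startup header is truncated")
    (length,) = struct.unpack("!i", header[:4])
    if length < MIN_STARTUP_LEN or length > MAX_STARTUP_LEN:
        raise ClientBadStartup(f"invalid startup message length: {length}")
    return length
```

Negative lengths are rejected by the same lower bound, since the field is decoded as a signed 32-bit integer.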
Testing:
- Integration fuzz tests: Added BDD fuzz tests (@fuzz tag) for malformed PostgreSQL protocol messages.
- All fuzz tests connect and authenticate first, then send malformed data to test post-authentication resilience.
CI/CD:
- Added dedicated fuzz test job in GitHub Actions workflow (without retries, as fuzz tests should not be flaky).
3.0.4 Jan 16, 2026
New Features:
- Enhanced DEBUG logging for PostgreSQL protocol messages: Added grouped debug logging that displays message types in a compact format (e.g., [P(stmt1),B,D,E,S] or [3xD,C,Z]). Messages are buffered and flushed every 100ms or 100 messages to reduce log noise.
- Protocol violation detection: Added real-time protocol state tracking that detects and warns about protocol violations (e.g., receiving ParseComplete when no Parse was pending). Helps diagnose client-server synchronization issues.
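The compact format in the examples above looks like run-length grouping of consecutive message tags. The sketch below reproduces the two sample strings under that assumption; the function name is hypothetical and the pooler's actual grouping rules may differ.

```python
from itertools import groupby


def format_message_group(tags):
    """Collapse consecutive identical tags into 'NxTAG' and join with commas."""
    parts = []
    for tag, run in groupby(tags):
        count = sum(1 for _ in run)
        parts.append(tag if count == 1 else f"{count}x{tag}")
    return "[" + ",".join(parts) + "]"


if __name__ == "__main__":
    print(format_message_group(["P(stmt1)", "B", "D", "E", "S"]))
    print(format_message_group(["D", "D", "D", "C", "Z"]))
```

Only adjacent repeats are merged, so the order of messages on the wire is still readable from the log line.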
Bug Fixes:
- Fixed potential protocol violation when client disconnects during batch operations with cached prepared statements: disabled fast_release optimization when there are pending prepared statement operations.
- Fixed ParseComplete insertion for Describe flow: now correctly inserts one ParseComplete before each ParameterDescription ('t') or NoData ('n') message instead of inserting all at once.
3.0.3 Jan 15, 2026
Bug Fixes:
- Improved handling of Describe flow for cached prepared statements: added a separate counter (pending_parse_complete_for_describe) to correctly insert ParseComplete messages before ParameterDescription or NoData responses when Parse was skipped due to caching.
Testing:
- Added .NET client tests for Describe flow with cached prepared statements (describe_flow_cached.cs).
- Added mixed tests combining batch operations, prepared statements, and extended protocol (aggressive_mixed.cs).
3.0.2 Jan 14, 2026
Bug Fixes:
- Fixed protocol mismatch for .NET clients (Npgsql) using named prepared statements with Prepare(): ParseComplete messages are now correctly inserted before ParameterDescription and NoData messages in the Describe flow, not just before BindComplete.
3.0.1 Jan 14, 2026
Bug Fixes:
- Fixed protocol mismatch for .NET clients (Npgsql): prevented insertion of ParseComplete messages between DataRow messages when server has more data available.
Testing:
- Extended Node.js client test coverage with additional scenarios for prepared statements, error handling, transactions, and edge cases.
3.0.0 Jan 12, 2026
Architecture refactor
PgDoorman 3.0.0 reorganizes the client, config, admin, auth, and
prometheus modules, and adds the patroni_proxy binary.
New Features:
- patroni_proxy — a TCP proxy for Patroni-managed PostgreSQL clusters:
- Zero-downtime connection management — existing connections are preserved during cluster topology changes
- Hot upstream updates — automatic discovery of cluster members via Patroni REST API without connection drops
- Role-based routing — route connections to leader, sync replicas, or async replicas based on configuration
- Replication lag awareness with configurable max_lag_in_bytes per port
- Least connections load balancing strategy
Improvements:
- Module split:
- Client handling split into dedicated modules (core, entrypoint, protocol, startup, transaction)
- Configuration system reorganized into focused modules (general, pool, user, tls, prometheus, talos)
- Admin, auth, and prometheus subsystems extracted into separate modules
- Async protocol support — improved handling of asynchronous PostgreSQL protocol messages.
- Extended protocol — improved client buffering and message handling.
- xxhash3 for prepared statement hashing — faster hash computation for prepared statement cache
- BDD test framework — multi-language integration tests (Go, Rust, Python, Node.js, .NET) in a Docker-based environment.
2.5.0 Nov 18, 2025
Improvements:
- Reworked the statistics collection system, yielding up to 20% performance gain on fast queries.
- Improved detection of SAVEPOINT usage, allowing the auto-rollback feature to be applied in more situations.
Bug Fixes / Behavior:
- Less aggressive behavior on write errors when sending a response to the client: the server connection is no longer immediately marked as "bad" and evicted from the pool. We now read the remaining server response and clean up its state, returning the connection to the pool in a clean state. This improves performance during client reconnections.
2.4.3 Nov 15, 2025
Bug Fixes:
- Fixed handling of nested transactions via SAVEPOINT: auto-rollback now correctly rolls back to the savepoint instead of breaking the outer transaction. This prevents clients from getting stuck in an inconsistent transactional state.
2.4.2 Nov 13, 2025
Improvements:
- pg_hba rules now apply to the admin console as well; the trust method can be used for admin connections when a matching rule is present (use with caution; restrict by address/TLS).
Bug Fixes:
- Fixed pg_hba evaluation: local records were mistakenly considered; PgDoorman only handles TCP connections, so local entries are now correctly ignored.
2.4.1 Nov 12, 2025
Improvements:
- Performance optimizations in request handling and message processing paths to reduce latency and CPU usage.
- pg_hba rules now apply to the admin console as well; the trust method can be used for admin connections when a matching rule is present (use with caution; restrict by address/TLS).
Bug Fixes:
- Corrected logic where COMMIT could be mishandled similarly to ROLLBACK in certain error states; transactional state handling is now aligned with PostgreSQL semantics.
2.4.0 Nov 10, 2025
Features:
- Added pg_hba support to control client access in PostgreSQL format. New general.pg_hba setting supports inline content or file path.
- Clients that enter the aborted in transaction state are detached from their server backend; the proxy waits for the client to send ROLLBACK.
Improvements:
- Refined admin and metrics counters: separated cancel connections and corrected calculation of error connections in admin output and Prometheus metrics descriptions.
- Added configuration validation to prevent simultaneous use of the legacy general.hba CIDR list with the new general.pg_hba rules.
- Improved validation and error messages for Talos token authentication.
2.2.2 Aug 17, 2025
Features:
- Added new generate feature functionality
Bug Fixes:
- Fixed deallocate issues with PGX5 compatibility
2.2.1 Aug 6, 2025
Features:
- Improve Prometheus exporter functionality
2.2.0 Aug 5, 2025
Features:
- Added Prometheus exporter functionality that provides metrics about connections, memory usage, pools, queries, and transactions
2.1.2 Aug 4, 2025
Features:
- Added docker image ghcr.io/ozontech/pg_doorman
2.1.0 Aug 1, 2025
Features:
- The new command generate connects to your PostgreSQL server, automatically detects all databases and users, and creates a complete configuration file with appropriate settings. This is especially useful for quickly setting up PgDoorman in new environments or when you have many databases and users to configure.
2.0.1 July 24, 2025
Bug Fixes:
- Fixed max_memory_usage counter leak when clients disconnect improperly.
2.0.0 July 22, 2025
Features:
- Added tls_mode configuration option to enhance security with flexible TLS connection management and client certificate validation capabilities.
1.9.0 July 20, 2025
Features:
- Added PAM authentication support.
- Added talos JWT authentication support.
Improvements:
- Implemented streaming for COPY protocol with large columns to prevent memory exhaustion.
- Updated Rust and Tokio dependencies.
1.8.3 Jun 11, 2025
Bug Fixes:
- Fixed critical bug where Client's buffer wasn't cleared when no free connections were available in the Server pool (query_wait_timeout), leading to incorrect response errors. #38
- Fixed Npgsql-related issue. Npgsql#6115
1.8.2 May 24, 2025
Features:
- Added application_name parameter in pool. #30
- Added support for DISCARD ALL and DEALLOCATE ALL client queries.
Improvements:
- Implemented link-time optimization. #29
Bug Fixes:
- Fixed panics in admin console.
- Fixed connection leakage on improperly handled errors in client's copy mode.
1.8.1 April 12, 2025
Bug Fixes:
- Fixed config value of prepared_statements. #21
- Fixed handling of declared cursors closure. #23
- Fixed proxy server parameters. #25
1.8.0 Mar 20, 2025
Bug Fixes:
- Fixed dependencies issue. #15
Improvements:
- Added release vendor-licenses.txt file. Related thread
1.7.9 Mar 16, 2025
Improvements:
- Added release vendor.tar.gz for offline build. Related thread
Bug Fixes:
- Fixed issues with pqCancel messages over TLS protocol. Drivers should send pqCancel messages exclusively via TLS if the primary connection was established using TLS. Npgsql follows this rule, while PGX currently does not. Both behaviors are now supported.
1.7.8 Mar 8, 2025
Bug Fixes:
- Fixed message ordering issue when using batch processing with the extended protocol.
- Improved error message detail in logs for server-side login attempt failures.
1.7.7 Mar 8, 2025
Features:
- Enhanced show clients command with new fields: state (waiting/idle/active) and wait (read/write/idle).
- Enhanced show servers command with new fields: state (login/idle/active), wait (read/write/idle), and server_process_pid.
- Added 15-second proxy timeout for streaming large message_size_to_be_stream responses.
Bug Fixes:
- Fixed max_memory_usage counter leak when clients disconnect improperly.