Obsfly

PostgreSQL Monitoring

Postgres monitoring that actually finds the slow query.

Obsfly reads pg_stat_statements, pg_stat_activity, auto_explain, and pg_locks at 1 Hz, normalizes everything to a query signature, and surfaces the slow paths before your users do.

Why monitor Postgres

Postgres ships with deep instrumentation (dozens of per-statement counters in pg_stat_statements, full plan capture via auto_explain, lock visibility via pg_locks), but using it well requires plumbing it into a system that retains, correlates, and alerts. Obsfly is that system, with no agent on the database itself, only a read-only monitoring user.

What we scrape

Obsfly reads Postgres through the surfaces operators already know. No driver changes, no extensions installed by us, no agent on the database itself.

pg_stat_statements

Per-signature execution counts, total/mean/stddev time, rows touched, buffer hit ratios, WAL volume.

pg_stat_activity

Live session state, current query, wait_event_type / wait_event, and the backend PIDs used for lock attribution.

auto_explain

EXPLAIN plans logged automatically when a query exceeds auto_explain.log_min_duration, with ANALYZE timings when auto_explain.log_analyze is enabled.

pg_locks + pg_blocking_pids()

Lock chains and AccessExclusiveLock detection sampled at 1 Hz.

pg_stat_user_tables / pg_stat_user_indexes

Bloat, dead tuples, last vacuum, last analyze, index usage and unused indexes.

pg_stat_replication / pg_stat_wal_receiver

Per-replica lag in bytes and seconds, sync state, write/flush/replay LSNs.

pg_stat_io (PG 16+)

Per-backend-type, per-object, per-context I/O breakdown, which replaces guesswork about who's doing what to disk.
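
As an illustration of what a lock-chain sample can be reduced to, here is a minimal Python sketch. It assumes rows shaped like the output of the query in `BLOCKING_SQL`; the query text and the helper name `root_blockers` are hypothetical and are not Obsfly's implementation. `pg_blocking_pids()` itself is a built-in Postgres function.

```python
# Hypothetical sampling query: one row per blocked backend, with the PIDs blocking it.
BLOCKING_SQL = """
SELECT pid, pg_blocking_pids(pid) AS blocked_by
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0
"""


def root_blockers(rows):
    """Return the sorted PIDs that block other sessions but are not blocked themselves.

    rows: iterable of (pid, blocked_by) pairs, where blocked_by is a list of PIDs.
    """
    rows = list(rows)  # the input is walked twice
    blocked = {pid for pid, blockers in rows if blockers}
    blocking = {b for _, blockers in rows for b in blockers}
    return sorted(blocking - blocked)
```

For the sample `[(101, [100]), (102, [101]), (103, [100])]`, PID 100 sits at the head of both chains, so it is the session to look at first.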

Key metrics tracked

Query latency p50 / p95 / p99 / p99.9
Per signature, merged across hosts via t-digest.
Active connections / max connections
Pool saturation alerts before connections refuse.
Cache hit ratio
shared_blks_hit / (shared_blks_hit + shared_blks_read). Anything under 95% wants attention.
WAL bytes per minute
Write amplification detector; pairs with bgwriter checkpoints.
Replica lag (bytes + seconds)
Per replica, with predicted-overrun forecast.
Dead tuple ratio
Per-table autovacuum tuning signal.
Lock waits per minute
Lock-chain pre-warning before something stalls.
Plan changes per signature
Plan-flip detection — surfaces the moment a deploy moves a query off an index.
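
The cache hit ratio listed above is simple enough to compute by hand. The sketch below assumes the two counters come from pg_stat_statements; the function names and the `None` return for an idle signature are illustrative choices, not a description of Obsfly's internals.

```python
def cache_hit_ratio(shared_blks_hit, shared_blks_read):
    """shared_blks_hit / (shared_blks_hit + shared_blks_read), or None if no blocks were touched."""
    total = shared_blks_hit + shared_blks_read
    if total == 0:
        return None
    return shared_blks_hit / total


def wants_attention(shared_blks_hit, shared_blks_read, threshold=0.95):
    """True when the ratio is defined and falls below the threshold (95% by default)."""
    ratio = cache_hit_ratio(shared_blks_hit, shared_blks_read)
    return ratio is not None and ratio < threshold
```

Returning `None` for signatures with no buffer traffic keeps them from showing up as a false 0% ratio.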

Common Postgres pains, and how Obsfly surfaces each

Slow query that's only sometimes slow

Sign

stddev_exec_time is 5–10× mean_exec_time in pg_stat_statements.

Fix

Capture EXPLAIN on the slow path with auto_explain. Check for plan flips and parameter sensitivity.
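
A minimal sketch of the sign above, assuming rows fetched from pg_stat_statements as dictionaries. The column names `mean_exec_time` and `stddev_exec_time` are the PostgreSQL 13+ names (older versions call them `mean_time` and `stddev_time`). The `min_calls` floor is an added assumption to keep rarely run statements from dominating the list.

```python
def erratic_signatures(rows, min_ratio=5.0, min_calls=100):
    """Return (queryid, stddev/mean) pairs whose ratio is at least min_ratio, worst first."""
    flagged = []
    for row in rows:
        mean = row["mean_exec_time"]
        if row["calls"] < min_calls or mean <= 0:
            continue  # too few samples, or nothing to divide by
        ratio = row["stddev_exec_time"] / mean
        if ratio >= min_ratio:
            flagged.append((row["queryid"], round(ratio, 2)))
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

The signatures this returns are the ones worth pointing auto_explain at.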

Random latency spikes correlated with autovacuum

Sign

VACUUM events in logs align with the spikes; pg_stat_user_tables.last_autovacuum coincides.

Fix

Per-table autovacuum tuning. Lower autovacuum_vacuum_scale_factor on hot tables.
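
To see why the scale factor matters on large tables, recall that autovacuum vacuums a table once dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × reltuples (Postgres defaults: 50 and 0.2). The helpers below are a sketch with hypothetical names; the target of 100,050 dead tuples in the usage note is only an example.

```python
def autovacuum_trigger(reltuples, scale_factor=0.2, base_threshold=50):
    """Dead-tuple count at which autovacuum vacuums a table of reltuples rows."""
    return base_threshold + scale_factor * reltuples


def scale_factor_for(reltuples, target_dead_tuples, base_threshold=50):
    """Scale factor that makes autovacuum fire at roughly target_dead_tuples."""
    if reltuples <= 0:
        raise ValueError("reltuples must be positive")
    return max(0.0, (target_dead_tuples - base_threshold) / reltuples)
```

With the defaults, a 10-million-row table is not vacuumed until it holds about 2,000,050 dead tuples. `scale_factor_for(10_000_000, 100_050)` gives roughly 0.01, which can be applied per table with `ALTER TABLE ... SET (autovacuum_vacuum_scale_factor = 0.01)`.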

Replica lag growing under write load

Sign

pg_stat_replication.replay_lag climbing and pulling well ahead of write_lag.

Fix

WAL replay runs in a single process and is usually the bottleneck. Bump shared_buffers on replicas, or split read load to a less-laggy replica.
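
Lag in bytes falls out of the LSNs directly. An LSN prints as two hexadecimal numbers separated by a slash, the high and low 32 bits of a 64-bit WAL position. This sketch does the subtraction client-side; the function names are illustrative, and on the server `pg_wal_lsn_diff()` gives the same result.

```python
def lsn_to_int(lsn):
    """Convert an LSN string such as '16/B374D848' to a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def lag_bytes(primary_lsn, replica_lsn):
    """Bytes of WAL the replica still has to process to reach primary_lsn."""
    return lsn_to_int(primary_lsn) - lsn_to_int(replica_lsn)
```

Comparing the primary's current LSN against a replica's write, flush, and replay LSNs in turn shows which stage is falling behind.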

Connection refusals despite low CPU

Sign

Clients get "FATAL: sorry, too many clients already" while most sessions in pg_stat_activity sit idle.

Fix

PgBouncer in transaction mode. The default max_connections is 100; transaction pooling lets roughly 10× as many clients share those backends.
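
A rough way to quantify the sign above is to compare client backends against the slots ordinary users can actually take. The sketch assumes per-state counts from the query in `STATE_SQL`; the function and key names are hypothetical, and `reserved=3` mirrors the Postgres default for superuser_reserved_connections.

```python
# Hypothetical query: count client backends by state (backend_type exists in PG 10+).
STATE_SQL = """
SELECT state, count(*)
FROM pg_stat_activity
WHERE backend_type = 'client backend'
GROUP BY state
"""


def connection_saturation(state_counts, max_connections, reserved=3):
    """Summarize how full the connection slots are and how much of that is idle."""
    used = sum(state_counts.values())
    usable = max_connections - reserved  # slots available to non-superusers
    return {
        "used": used,
        "saturation": used / usable,
        "idle_share": state_counts.get("idle", 0) / used if used else 0.0,
    }
```

A saturation near 1.0 with a high idle share is the case where a transaction-mode pooler helps most, since the backends are held but not working.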

Bloated tables, week-over-week query slowdowns

Sign

pg_stat_user_tables.n_dead_tup > 20% of n_live_tup; cache hit ratio drops slowly.

Fix

Aggressive autovacuum on the affected tables. VACUUM (FULL) only during planned maintenance.
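
The dead-tuple sign above can be checked with a few lines over pg_stat_user_tables rows. This is a sketch with a hypothetical function name. Treating a table that has dead tuples but no live ones as an infinite ratio is an assumption made so that it sorts first.

```python
def bloated_tables(rows, max_ratio=0.20):
    """Return (relname, dead/live ratio) for tables above max_ratio, worst first."""
    flagged = []
    for row in rows:
        live, dead = row["n_live_tup"], row["n_dead_tup"]
        if live > 0:
            ratio = dead / live
        else:
            ratio = float("inf") if dead > 0 else 0.0
        if ratio > max_ratio:
            flagged.append((row["relname"], ratio))
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

Both counters are estimates maintained by the statistics system, so the output is best read as a shortlist for closer inspection, not as a measurement of bloat.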

vs Datadog DBM for Postgres

Datadog DBM Postgres ships pg_stat_statements + pg_stat_activity scraping. Obsfly adds plan-change history per signature, t-digest-based percentile merging across hosts, automatic EXPLAIN diff on plan flips, and forecast bands on every metric — at roughly 1/3 the per-DB cost.
Full Datadog DBM comparison →

FAQ

Does Obsfly install anything on my Postgres host?

No. The Obsfly agent runs on a separate host (or as a sidecar) and connects via the standard Postgres protocol with a read-only monitoring user. No extensions installed by us — pg_stat_statements is the only required extension, and most installations have it.

Does it work with RDS, Aurora, Cloud SQL, Crunchy Bridge?

Yes — every managed Postgres offering. The agent uses standard libpq; managed providers expose pg_stat_statements, pg_stat_activity, and pg_locks identically.

What about PgBouncer in front of Postgres?

Obsfly scrapes both: PgBouncer's stats (SHOW POOLS) for connection metrics, and Postgres directly for query metrics. Connection attribution is preserved through transaction pooling.

Does the monitoring user need superuser?

No. The pg_read_all_stats role (PG 10+) is enough. We provide a setup script that creates the user with minimal privileges.

How is this different from pgBadger or pganalyze?

pgBadger is log-based and offline. pganalyze is closest to Datadog DBM in scope. Obsfly ships the same query-and-plan analysis plus AI-native anomaly detection, BYOC and Sovereign deployments, and per-DB pricing — pganalyze charges per server.

· · ·

See Obsfly on your Postgres.

20-min demo. We connect to a sample Postgres on the call and reproduce your slowest query in the tool.
