PostgreSQL Monitoring
Postgres monitoring that actually finds the slow query.
Obsfly reads pg_stat_statements, pg_stat_activity, auto_explain, and pg_locks at 1 Hz, normalizes everything to a query signature, and shows you the slow paths before your users do.
Why monitor Postgres
Postgres ships with deep instrumentation (dozens of per-query columns in pg_stat_statements, full plan capture via auto_explain, lock visibility via pg_locks), but using it well requires plumbing it into a system that retains, correlates, and alerts. Obsfly is that system, with no agent on the database itself — only a read-only monitoring user.
What we scrape
Obsfly reads Postgres through the surfaces operators already know. No driver changes, no extensions installed by us, no agent on the database itself.
pg_stat_statements
Per-signature execution counts, total/mean/stddev time, rows touched, buffer hit ratios, WAL volume.
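For reference, the kind of top-N query this view supports, using the PG 13+ column names (total_exec_time and friends; PG 12 and older call them total_time, mean_time, stddev_time):

-- Top 10 query signatures by total execution time.
SELECT queryid,
       calls,
       round(total_exec_time::numeric, 1)  AS total_ms,
       round(mean_exec_time::numeric, 2)   AS mean_ms,
       round(stddev_exec_time::numeric, 2) AS stddev_ms,
       rows,
       shared_blks_hit,
       shared_blks_read
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;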
pg_stat_activity
Live session state, current query, wait_event_type / wait_event, blocking PIDs.
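You can pull a one-shot version of this yourself — every non-idle session that is currently waiting, with the PIDs blocking it:

-- Waiting sessions, what they wait on, and who blocks them (PG 9.6+).
SELECT pid,
       state,
       wait_event_type,
       wait_event,
       pg_blocking_pids(pid) AS blocked_by,
       left(query, 60)       AS query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND wait_event IS NOT NULL;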
auto_explain
EXPLAIN ANALYZE plans captured automatically when queries cross a latency threshold.
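A minimal sketch of the knobs involved, shown here as session-level SETs; in production auto_explain is normally loaded cluster-wide via shared_preload_libraries in postgresql.conf:

LOAD 'auto_explain';                          -- session-level load; usually needs superuser
SET auto_explain.log_min_duration = '500ms';  -- capture plans for anything slower than 500 ms
SET auto_explain.log_analyze = on;            -- include actual row counts and timings
SET auto_explain.log_buffers = on;            -- include buffer usage in the captured plan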
pg_locks + pg_blocking_pids()
Lock chains and AccessExclusiveLock detection sampled at 1 Hz.
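One level of that chain is a single query away (the full recursive CTE is covered in the lock-chains deep dive below):

-- Who is waiting on whom, one level deep (PG 9.6+).
SELECT waiting.pid              AS waiting_pid,
       blocking.pid             AS blocking_pid,
       left(waiting.query, 50)  AS waiting_query,
       left(blocking.query, 50) AS blocking_query
FROM pg_stat_activity AS waiting
JOIN LATERAL unnest(pg_blocking_pids(waiting.pid)) AS b(pid) ON true
JOIN pg_stat_activity AS blocking ON blocking.pid = b.pid;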
pg_stat_user_tables / pg_stat_user_indexes
Bloat, dead tuples, last vacuum, last analyze, index usage and unused indexes.
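The raw signals are directly queryable — dead tuples and maintenance recency, worst tables first:

-- Dead-tuple counts and vacuum/analyze recency per table.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum,
       last_analyze,
       last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;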
pg_stat_replication / pg_stat_wal_receiver
Per-replica lag in bytes and seconds, sync state, write/flush/replay LSNs.
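The same numbers, self-served from the primary:

-- Per-replica lag in bytes and as intervals (run on the primary, PG 10+).
SELECT application_name,
       client_addr,
       state,
       sync_state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes,
       write_lag,
       flush_lag,
       replay_lag
FROM pg_stat_replication;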
pg_stat_io (PG 16+)
Per-backend, per-context I/O breakdown — replaces guesswork about who's doing what to disk.
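A quick look at what that surfaces:

-- Physical I/O by backend type, object, and context (PG 16+).
SELECT backend_type,
       object,
       context,
       reads,
       writes,
       extends,
       evictions
FROM pg_stat_io
WHERE coalesce(reads, 0) > 0 OR coalesce(writes, 0) > 0
ORDER BY coalesce(reads, 0) + coalesce(writes, 0) DESC;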
Common Postgres pains, and how Obsfly surfaces each
Slow query that's only sometimes slow
Sign
stddev_exec_time is 5–10× mean_exec_time in pg_stat_statements.
Fix
Capture EXPLAIN on the slow path with auto_explain. Check for plan flips and parameter sensitivity.
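Finding these yourself takes one query against pg_stat_statements (PG 13+ column names):

-- Signatures whose latency variance dwarfs their mean: plan-flip suspects.
SELECT queryid,
       calls,
       round(mean_exec_time::numeric, 2)   AS mean_ms,
       round(stddev_exec_time::numeric, 2) AS stddev_ms
FROM pg_stat_statements
WHERE calls > 100
  AND stddev_exec_time > 5 * mean_exec_time
ORDER BY stddev_exec_time DESC
LIMIT 20;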
Random latency spikes correlated with autovacuum
Sign
VACUUM events in logs align with the spikes; pg_stat_user_tables.last_autovacuum coincides.
Fix
Per-table autovacuum tuning. Lower autovacuum_vacuum_scale_factor on hot tables.
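A sketch of that per-table tuning — the table name is a placeholder; 0.02 makes autovacuum fire at roughly 2% dead tuples instead of the 20% default:

-- Hypothetical hot table: vacuum at ~2% dead tuples rather than 20%.
ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.02);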
Replica lag growing under write load
Sign
pg_stat_replication.write_lag climbing; replay_lag larger than write_lag.
Fix
Single-threaded replay is the bottleneck. Bump shared_buffers on replicas, or split read load to a less-laggy replica.
Connection refusals despite low CPU
Sign
New connections fail with "FATAL: sorry, too many clients already" while pg_stat_activity shows mostly idle sessions.
Fix
PgBouncer in transaction mode. Stock Postgres handles ~100 backends well; transaction pooling serves 10× the clients on the same backends.
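To check the sign yourself, count backends by state and compare the total against max_connections:

-- Backend counts by state; background workers report a NULL state.
SELECT coalesce(state, 'background') AS state,
       count(*)                      AS sessions
FROM pg_stat_activity
GROUP BY state
ORDER BY sessions DESC;

SHOW max_connections;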
Bloated tables, week-over-week query slowdowns
Sign
pg_stat_user_tables.n_dead_tup > 20% of n_live_tup; cache hit ratio drops slowly.
Fix
Aggressive autovacuum on the affected tables. VACUUM (FULL) only during planned maintenance.
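The sign as a query, plus the online fix — the table name is a placeholder, and VACUUM FULL stays in the maintenance window because it takes an AccessExclusiveLock:

-- Tables past the 20% dead-tuple threshold.
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE n_dead_tup > 0.2 * greatest(n_live_tup, 1)
ORDER BY n_dead_tup DESC;

-- Online fix: reclaims space for reuse without an exclusive lock.
VACUUM (VERBOSE, ANALYZE) orders;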
Obsfly features for Postgres
Feature
Query Summary
Top-N normalized queries with p50 / p95 / p99 latency, QPS, total time, rows touched, and plan-change history.
Feature
Explain Plan
Auto-captured EXPLAIN (ANALYZE, BUFFERS) plans on slow queries, plan diff over time, regression detection.
Feature
Deadlock Detection
Catch every deadlock with full lock-chain context, victim and aggressor stacks, and remediation suggestions.
Feature
Anomaly Detection
ML-driven anomaly detection on every metric. Forecast bands, change-point detection, no thresholds to tune.
FAQ
Does Obsfly install anything on my Postgres host?
No. The Obsfly agent runs on a separate host (or as a sidecar) and connects via the standard Postgres protocol with a read-only monitoring user. No extensions installed by us — pg_stat_statements is the only required extension, and most installations have it.
Does it work with RDS, Aurora, Cloud SQL, Crunchy Bridge?
Yes — every managed Postgres offering. The agent uses standard libpq; managed providers expose pg_stat_statements, pg_stat_activity, and pg_locks identically.
What about pgBouncer in front of Postgres?
Obsfly scrapes both — pgBouncer's stats (SHOW POOLS) for connection metrics, and Postgres directly for query metrics. Connection attribution is preserved through transaction pooling.
Does the monitoring user need superuser?
No. The pg_read_all_stats role (PG 10+) is enough. We provide a setup script that creates the user with minimal privileges.
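A minimal sketch of what such a script does — the role name and password are placeholders, not the actual script:

-- Read-only monitoring role (PG 10+).
CREATE ROLE obsfly_monitor LOGIN PASSWORD 'change-me';
GRANT pg_read_all_stats TO obsfly_monitor;
-- pg_monitor is a broader alternative (adds pg_read_all_settings and pg_stat_scan_tables):
-- GRANT pg_monitor TO obsfly_monitor;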
How is this different from pgBadger or pganalyze?
pgBadger is log-based and offline. pganalyze is closest to Datadog DBM in scope. Obsfly ships the same query-and-plan analysis plus AI-native anomaly detection, BYOC and Sovereign deployments, and per-DB pricing — pganalyze charges per server.
Deep dives on Postgres
Postgres
pg_stat_statements: the complete 2026 guide
Every column, every gotcha, the queries you should run today, and why pg_stat_statements is still the most useful 80 lines of telemetry in Postgres — even with five new alternatives in 2026.
Postgres
Postgres slow queries: 12 causes and how to find each one
A field-tested playbook for diagnosing a slow Postgres query in production — from missing indexes to plan flips to bloated tables — with the SQL to find each cause and the fix.
Postgres
Postgres lock chains: how to find the session blocking yours
A practical walkthrough of pg_locks, pg_blocking_pids, and the recursive CTE that gives you the full chain — including the AccessExclusiveLocks that quietly take your DB down.
Postgres
Why your Postgres p99 latency lies — and what to track instead
p99 over 1m windows is the most-displayed and most-misleading number on every DBM dashboard. Here's the histogram math, the seasonality math, and a saner default.
· · ·
See Obsfly on your Postgres.
20-min demo. We connect to a sample Postgres on the call and reproduce your slowest query in the tool.