
RDS Performance Insights: where it stops and what you actually need next

PI is free up to 7 days, ships with every RDS, and surfaces top SQL by wait class. It also stops short on plan history, multi-host correlation, multi-engine fleets, alerting, and AI suggestions. Here's where the line is and what to bolt on.

Published 2026-05-26 · 11 min read

If you’re on RDS, Performance Insights is right there. Free for 7 days of retention, ships with every RDS instance, one toggle to enable. Every senior DBA you ask will tell you “PI is a starter, you’ll need more.” The advice is correct. This post is about exactly where the line is.

On this page

Seven things PI does well
Seven gaps that bite around month two
Three bolt-on patterns
What RDS Extended Support buys you (and doesn't)
When you've outgrown PI
FAQ

Seven things PI does well

Wait-event analysis. The chart-by-wait-class view is genuinely good: AWS borrowed it from the Oracle ASH playbook and ported it across engines.
Top SQL by elapsed time / IO / CPU. The SQL digest table answers “what was slow in the last hour” without instrumentation.
SQL digest normalization. Same shape as pg_stat_statements / MySQL Performance Schema. You can correlate PI digests with your own queries.
7 days free. The default retention is enough to investigate yesterday’s incident. Free is the right price.
CloudWatch integration. DBLoad metrics export to CloudWatch so you can set alarms via the AWS-native stack.
Zero-install. No agent, no exporter, no extension activation (for Postgres ≥ 14). One console toggle.
Aurora-native. Works identically across Aurora Postgres, Aurora MySQL, and standard RDS engines. One UI.
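
In practice, the CloudWatch integration point comes down to one alarm on the DBLoad metric per instance. Here's a minimal sketch, assuming boto3 is installed and credentials are configured. The instance name, SNS topic, and the "alarm when load exceeds vCPU count" threshold are illustrative assumptions, not something PI configures for you.

```python
def dbload_alarm_params(instance_id: str, vcpus: int, sns_topic_arn: str) -> dict:
    """Build keyword arguments for CloudWatch put_metric_alarm on DBLoad.

    The threshold (average active sessions above the vCPU count for five
    consecutive minutes) is a rule-of-thumb assumption; tune it per workload.
    """
    return {
        "AlarmName": f"{instance_id}-dbload-above-vcpus",
        "Namespace": "AWS/RDS",
        "MetricName": "DBLoad",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 60,                 # seconds per datapoint
        "EvaluationPeriods": 5,       # five datapoints in a row must breach
        "Threshold": float(vcpus),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }


def create_alarm(params: dict) -> None:
    """Send the alarm definition to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    boto3.client("cloudwatch").put_metric_alarm(**params)
```

Keeping the parameter builder separate from the API call makes the alarm definition easy to review and unit-test. One alarm on one metric is also exactly the limitation gap 4 below describes.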

Seven gaps that bite around month two

Gap	What it means in practice	When it hurts
1. No plan history	PI shows the latest plan only. You can’t see when the optimizer chose a different one or correlate plan flips with regressions.	Plan-flip incident → 2 hours of bisection instead of 5 minutes.
2. No multi-host correlation	Each PI dashboard is per-instance. To see your whole RDS fleet you flip between tabs.	Fleet of 20+ DBs becomes unmanageable.
3. Single-engine view	If your stack is Postgres + MongoDB + Redis you have three different consoles.	Most real fleets are polyglot.
4. Threshold-only alerting via CloudWatch	DBLoad is one metric. You build the rest in CloudWatch Logs Insights and SNS yourself.	Multi-variate anomalies (qps + p99 + lock wait together) don’t fit a single-metric threshold alarm.
5. No anomaly / forecast	PI is reactive. There’s no “hey, this metric is trending toward breach in 9 days” surface.	Capacity planning is still spreadsheet work.
6. No AI suggestions	PI shows you the slow query. It doesn’t propose a rewrite or an index.	Junior engineers in the on-call rotation can’t self-serve.
7. 7 days retention free, longer is paid	Long-term retention (2 yr) is $0.01 / vCPU-hour. For a 20-vCPU fleet that’s ~$1,750 / yr (20 × $0.01 × 8,760 hours) just for retention.	Compliance + post-mortem use cases need 12+ mo retention.
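
Gap 2 is the one teams most often try to script around. Here's a rough sketch of that workaround: ask the PI API for top SQL on each instance, then merge the answers by tokenized statement. The request fields follow the Performance Insights `DescribeDimensionKeys` operation. The response shape used in the merge (a `Keys` list with `Dimensions` and `Total`) is an assumption to verify against real output, and the function names are illustrative.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def top_sql_request(dbi_resource_id, hours=1, limit=10, now=None):
    """Keyword arguments for a PI DescribeDimensionKeys call on one instance."""
    end = now or datetime.now(timezone.utc)
    return {
        "ServiceType": "RDS",
        "Identifier": dbi_resource_id,  # the DbiResourceId ("db-..."), not the instance name
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Metric": "db.load.avg",
        "PeriodInSeconds": 60,
        "GroupBy": {"Group": "db.sql_tokenized", "Limit": limit},
    }


def merge_fleet_load(responses):
    """Sum per-instance load by tokenized statement, heaviest first.

    `responses` maps an instance id to its API response dict.
    """
    totals = defaultdict(float)
    for response in responses.values():
        for key in response.get("Keys", []):
            statement = key["Dimensions"].get("db.sql_tokenized.statement", "<unknown>")
            totals[statement] += key.get("Total", 0.0)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

With boto3, each request would go out as `boto3.client("pi").describe_dimension_keys(**top_sql_request(resource_id))`. The merge is the easy part. Keeping it running across 20+ instances, regions, and accounts is where the maintenance cost shows up.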

Three bolt-on patterns we see work

PI + CloudWatch Logs Insights + Lambda

Use PI for the live console, ship slow-query logs to CloudWatch, run scheduled Lambdas that compute derived metrics and post to SNS. Works. Total build is ~3 engineer-weeks. The downside is everything is on AWS — multi-cloud teams hit a wall.
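
The "derived metrics" step is mostly log parsing plus a percentile. Here's a minimal sketch of the core such a Lambda might run, assuming Postgres slow-query lines in the `log_min_duration_statement` format (`duration: 1523.112 ms  statement: ...`). The function names and the p99 budget are illustrative, and fetching the log lines and publishing to SNS are left out.

```python
import math
import re

# Matches the duration field Postgres writes for slow statements.
DURATION_RE = re.compile(r"duration: (\d+(?:\.\d+)?) ms")


def durations_ms(log_lines):
    """Extract statement durations (in ms) from raw log lines, skipping others."""
    found = []
    for line in log_lines:
        match = DURATION_RE.search(line)
        if match:
            found.append(float(match.group(1)))
    return found


def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    if not values:
        raise ValueError("percentile of empty list")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def should_alert(log_lines, p99_budget_ms):
    """True when the p99 of the observed durations exceeds the budget."""
    values = durations_ms(log_lines)
    return bool(values) and percentile(values, 99) > p99_budget_ms
```

Nearest-rank is chosen here because it's simple and always returns a duration that actually occurred. Swap in an interpolating percentile if you need smoother values on small windows.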

PI + Grafana + custom exporter

Wire PI’s CloudWatch metrics into Grafana, build per-query dashboards with a custom pg_stat_statements exporter on the side. Common pattern. Now you maintain two systems — Grafana for cross-cutting, PI for the AWS-native deep dive.
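
The custom exporter is less work than it sounds. Its core is turning pg_stat_statements rows into Prometheus text format. Here's a sketch of that rendering step, assuming rows have already been fetched as dicts with the `queryid`, `calls`, and `total_exec_time` columns (Postgres 13+ naming). The metric names `pgss_calls_total` and `pgss_exec_seconds_total` are made up for illustration.

```python
def render_metrics(rows):
    """Render pg_stat_statements rows as Prometheus exposition text.

    Each row is a dict with queryid, calls, and total_exec_time (milliseconds).
    Lines for each metric are kept together, as the text format expects.
    """
    calls_lines = ["# TYPE pgss_calls_total counter"]
    time_lines = ["# TYPE pgss_exec_seconds_total counter"]
    for row in rows:
        label = f'queryid="{row["queryid"]}"'
        calls_lines.append(f"pgss_calls_total{{{label}}} {int(row['calls'])}")
        seconds = row["total_exec_time"] / 1000.0  # ms -> s, the Prometheus base unit
        time_lines.append(f"pgss_exec_seconds_total{{{label}}} {seconds}")
    return "\n".join(calls_lines + time_lines) + "\n"
```

Labeling by `queryid` instead of the query text keeps label values short and cardinality predictable. The trade-off is that the dashboard needs a lookup from `queryid` to SQL text.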

PI + commercial DBM

Keep PI for the “what’s burning right now” AWS-native view. Add a DBM tool for plan history, multi-host correlation, multi-engine, anomaly, and AI. That’s the pattern we see on every RDS fleet over ~30 instances.

What RDS Extended Support buys you (and doesn’t)

AWS’s “Performance Insights Premium” tier extends retention to 24 months at higher per-vCPU cost. It does not add plan history, anomaly detection, AI suggestions, or multi-DB fan-out. The premium tier solves problem #7 above and leaves the other six in place.

When you’ve outgrown PI

15+ RDS instances. Per-instance tabs are no longer workable.
Multi-engine fleet. Postgres + MySQL + Mongo or similar polyglot.
You had a plan-flip incident. You promised the post-mortem this wouldn’t happen again, and PI alone can’t deliver that.
Compliance retention. 12+ months on slow-query samples is mandated.
Forecast / capacity planning is on the roadmap. PI doesn’t do this; it’s a separate tool either way.
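
To make the plan-flip item concrete, here's a sketch of what "plan history" means at its simplest: fingerprint the shape of an `EXPLAIN (FORMAT JSON)` plan and record when it changes. This illustrates the idea and isn't how any particular tool implements it. The set of keys treated as "shape" and the in-memory `history` dict are assumptions.

```python
import hashlib
import json

# Keys that describe what the plan does, ignoring cost and row estimates.
SHAPE_KEYS = ("Node Type", "Relation Name", "Index Name", "Join Type")


def plan_shape(node):
    """Reduce a plan node to its structural fields, recursively."""
    shape = {key: node[key] for key in SHAPE_KEYS if key in node}
    shape["Plans"] = [plan_shape(child) for child in node.get("Plans", [])]
    return shape


def plan_fingerprint(explain_json):
    """Stable short hash of the plan shape from EXPLAIN (FORMAT JSON) output."""
    root = explain_json[0]["Plan"]
    canonical = json.dumps(plan_shape(root), sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def record_plan(history, queryid, explain_json):
    """Append the fingerprint if it changed; return True when a flip occurred."""
    fingerprint = plan_fingerprint(explain_json)
    seen = history.setdefault(queryid, [])
    if seen and seen[-1] == fingerprint:
        return False
    flipped = bool(seen)  # the first plan ever seen is not a flip
    seen.append(fingerprint)
    return flipped
```

Two plans that differ only in estimated cost hash to the same fingerprint, while an index scan turning into a sequential scan does not. The hard parts a real tool adds are capturing plans continuously and storing them with timestamps.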

FAQ

Is Performance Insights enough to ditch a DBM tool entirely?

For a single-engine shop, say a Postgres-only fleet under ~10 instances, yes. The moment you add a second engine or a 15-month retention requirement, the gaps become full-time engineering work to backfill.

Does Performance Insights work on Aurora Serverless v2?

Yes, identically. The PI agent runs in the AWS-managed compute layer.

Can I use Performance Insights with Obsfly?

Yes — and most of our RDS customers do. They keep PI enabled for the live console and AWS-native view, and use Obsfly for plan history, multi-host, multi-engine, anomaly, and AI rewrite.

What's the real cost difference?

PI free tier is $0 with 7-day retention. PI Premium is roughly $0.01–$0.03/vCPU-hour for 24-month retention — call it ~$200–$1,500/mo for a typical fleet. Obsfly Team is $39/DB/mo flat; replacing PI Premium on a 30-DB fleet runs $1,170/mo, with all seven gaps closed.
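
To run the same comparison on your own fleet, the arithmetic fits in a few lines. Here's a sketch using the rates quoted above and assuming 730 billable hours per month. The vCPU counts you feed in are yours to supply, since "typical fleet" hides a lot of variation.

```python
HOURS_PER_MONTH = 730  # common billing approximation (8,760 hours / 12)


def pi_retention_monthly(total_vcpus, rate_per_vcpu_hour):
    """Monthly cost of paid PI retention, billed per vCPU-hour."""
    return total_vcpus * rate_per_vcpu_hour * HOURS_PER_MONTH


def flat_per_db_monthly(db_count, price_per_db=39):
    """Monthly cost of a flat per-database plan."""
    return db_count * price_per_db


def breakeven_vcpus_per_db(rate_per_vcpu_hour, price_per_db=39):
    """Average vCPUs per DB at which per-vCPU billing equals the flat price."""
    return price_per_db / (rate_per_vcpu_hour * HOURS_PER_MONTH)
```

For example, `pi_retention_monthly(30, 0.01)` comes to about $219/mo and `flat_per_db_monthly(30)` to $1,170/mo. The per-vCPU model is cheaper for small instances and gets more expensive as instances grow, which is why the quoted range is so wide.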
