Obsfly

MongoDB Monitoring

MongoDB monitoring built around how MongoDB actually works.

Four surfaces (serverStatus, db.stats, currentOp, profiler), one pane. Obsfly scrapes them all, normalizes operation shapes, and gives you replica lag, oplog window, aggregation cost, and live activity — without driver changes.

Why monitor MongoDB

Monitoring MongoDB well means scraping four different command surfaces and reasoning about replica sets, oplog windows, sharded chunks, and aggregation pipelines. Obsfly does all of that out of the box.

What we scrape

Obsfly reads MongoDB through the surfaces operators already know. No driver changes, no extensions installed by us, no agent on the database itself.

serverStatus()

Host-level rollups: opcounters, connections, WiredTiger cache, network, locks.

db.stats()

Per-database storage, index size, collection count, document count.

db.currentOp()

Live in-flight operations sampled at 1 Hz with lock waits and plan summaries.

system.profile (per-DB profiler)

Persisted slow-op log with execution stats, examined/returned ratios, write conflicts.

rs.status() / db.getReplicationInfo()

Replica set health, per-member lag, oplog window, election history.

$indexStats aggregation

Per-index access counts to find unused indexes.
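
As a rough illustration of the unused-index check, the sketch below filters documents shaped like $indexStats output. It is not Obsfly's implementation: the function name, the 7-day minimum age, and the sample index names are assumptions made for the example. Access counters reset when a node restarts and are tracked per node, so a real check would look at every member before calling an index unused.

```python
from datetime import datetime, timedelta, timezone


def unused_indexes(index_stats, now, min_age=timedelta(days=7)):
    """Names of indexes with zero recorded accesses for at least min_age.

    index_stats: documents shaped like $indexStats output, i.e.
    {"name": ..., "accesses": {"ops": <count>, "since": <datetime>}}.
    min_age is an illustrative threshold, not an Obsfly setting.
    `now` and every `since` must be all naive or all timezone-aware.
    """
    unused = []
    for doc in index_stats:
        if doc["name"] == "_id_":
            continue  # the mandatory _id index is never a drop candidate
        accesses = doc["accesses"]
        if accesses["ops"] == 0 and now - accesses["since"] >= min_age:
            unused.append(doc["name"])
    return sorted(unused)


# Hand-written sample documents (hypothetical index names).
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
sample = [
    {"name": "_id_", "accesses": {"ops": 0, "since": now - timedelta(days=30)}},
    {"name": "email_1", "accesses": {"ops": 1520, "since": now - timedelta(days=30)}},
    {"name": "legacy_flag_1", "accesses": {"ops": 0, "since": now - timedelta(days=30)}},
    # Zero accesses, but the counter is only two hours old: too early to judge.
    {"name": "created_at_1", "accesses": {"ops": 0, "since": now - timedelta(hours=2)}},
]

print(unused_indexes(sample, now))  # ['legacy_flag_1']
```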

Key metrics tracked

Op latency percentiles per shape
p50/p95/p99 derived from profiler millis.
WiredTiger cache pressure
A growing 'pages evicted by application threads' counter means application threads are evicting pages themselves instead of serving queries.
Replica set max lag (seconds)
Across all secondaries; alert with forecast band.
Oplog window (hours)
Time between tFirst and tLast in db.getReplicationInfo(). A secondary that lags by more than the window needs a full resync.
docsExamined / nReturned ratio
Per shape. A ratio above 100 almost always means a missing index.
Connection pool usage
connections.current / (connections.current + connections.available), with forecast.
writeConflicts per minute
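The closest thing MongoDB has to a deadlock count: concurrent writes to the same document that the storage engine had to retry, typically on hot collections.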
Election count per replica set
Repeated elections signal instability.
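
Two of the metrics above, latency percentiles per shape and the examined/returned ratio, can be sketched from profiler documents alone. The example below is a minimal sketch, not Obsfly's code. It assumes each document already carries a precomputed `shape` key (a hypothetical field; shape normalization is out of scope here) and uses the profiler's own field names `millis`, `docsExamined`, and `nreturned`. Percentiles use the nearest-rank method, which is one of several reasonable definitions.

```python
from collections import defaultdict


def percentile(sorted_values, pct):
    """Nearest-rank percentile of a non-empty ascending list (pct is an int)."""
    rank = max(1, (pct * len(sorted_values) + 99) // 100)  # integer ceil
    return sorted_values[rank - 1]


def per_shape_metrics(profile_docs):
    """Group profiler documents by shape and summarize latency and scan cost."""
    by_shape = defaultdict(list)
    for doc in profile_docs:
        by_shape[doc["shape"]].append(doc)

    metrics = {}
    for shape, docs in by_shape.items():
        millis = sorted(d["millis"] for d in docs)
        examined = sum(d.get("docsExamined", 0) for d in docs)
        returned = sum(d.get("nreturned", 0) for d in docs)
        metrics[shape] = {
            "p50": percentile(millis, 50),
            "p95": percentile(millis, 95),
            "p99": percentile(millis, 99),
            # Guard against shapes that returned nothing at all.
            "examined_per_returned": examined / max(returned, 1),
        }
    return metrics


def suspect_shapes(metrics, ratio_threshold=100):
    """Shapes whose examined/returned ratio exceeds the threshold."""
    return sorted(s for s, m in metrics.items()
                  if m["examined_per_returned"] > ratio_threshold)


# Hand-written sample profiler documents with hypothetical shape labels.
docs = [
    {"shape": "orders.find(status)", "millis": 120, "docsExamined": 50000, "nreturned": 20},
    {"shape": "orders.find(status)", "millis": 340, "docsExamined": 52000, "nreturned": 10},
    {"shape": "users.find(_id)", "millis": 2, "docsExamined": 1, "nreturned": 1},
]
metrics = per_shape_metrics(docs)
print(suspect_shapes(metrics))  # ['orders.find(status)']
```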

Common MongoDB pains, and how Obsfly surfaces each

Slow aggregation pipeline

Sign

executionStats.totalDocsExamined is far larger than executionStats.nReturned in explain output.

Fix

Move $match ahead of $group/$lookup wherever it does not depend on their output. Add an index on the $match fields. Find the heavy stage with stages[i].executionTimeMillisEstimate in the explain output.
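
Hot-stage detection can be sketched as a walk over the `stages` array of an aggregation explain document. This is an illustrative sketch under two stated assumptions: that the explain output contains a `stages` array at all (pipelines executed entirely in the query layer may not), and that each stage's executionTimeMillisEstimate is cumulative, so a stage's own cost is the difference from the previous stage. Pass `cumulative=False` if the estimates on your server version turn out to be per-stage. The function names are hypothetical.

```python
def stage_costs(explain, cumulative=True):
    """Return [(stage_name, own_millis)] in pipeline order."""
    costs = []
    previous = 0
    for stage in explain.get("stages", []):
        # Each stage document has one "$"-prefixed key naming the stage.
        name = next((k for k in stage if k.startswith("$")), "unknown")
        estimate = int(stage.get("executionTimeMillisEstimate", 0))
        own = max(estimate - previous, 0) if cumulative else estimate
        costs.append((name, own))
        previous = estimate
    return costs


def hottest_stage(explain, cumulative=True):
    """The (stage_name, own_millis) pair with the largest cost, or None."""
    costs = stage_costs(explain, cumulative)
    return max(costs, key=lambda c: c[1]) if costs else None


# Hand-written explain fragment; real output has many more fields.
explain = {
    "stages": [
        {"$cursor": {}, "nReturned": 80000, "executionTimeMillisEstimate": 40},
        {"$lookup": {}, "nReturned": 80000, "executionTimeMillisEstimate": 910},
        {"$group": {}, "nReturned": 12, "executionTimeMillisEstimate": 950},
    ]
}
print(hottest_stage(explain))  # ('$lookup', 870)
```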

Replica lag spikes under write load

Sign

rs.status() shows secondaries' optimeDate falling behind the primary's.

Fix

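Secondaries apply oplog entries in batches with a fixed pool of writer threads (replWriterThreadCount), so a sustained write burst on the primary can outrun them. Bump WiredTiger cache on replicas; consider write concern w:majority for important writes, so those writes wait for a majority of members to acknowledge them.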
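
The lag sign above, combined with the oplog window, can be sketched as follows. This is a minimal sketch rather than Obsfly's logic: it takes a document shaped like rs.status() output (members with `name`, `stateStr`, `optimeDate`) and the oplog window in seconds (the `timeDiff` value from db.getReplicationInfo()). The function names and host names are hypothetical.

```python
from datetime import datetime, timezone


def member_lag_seconds(rs_status):
    """Lag of each SECONDARY behind the PRIMARY's optimeDate, in seconds."""
    primary = next(m for m in rs_status["members"] if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: max((primary["optimeDate"] - m["optimeDate"]).total_seconds(), 0.0)
        for m in rs_status["members"]
        if m["stateStr"] == "SECONDARY"
    }


def lag_report(rs_status, oplog_window_seconds):
    """Max lag plus the secondaries whose lag has reached the oplog window."""
    lags = member_lag_seconds(rs_status)
    return {
        "max_lag_seconds": max(lags.values(), default=0.0),
        # These members can no longer catch up from the oplog alone.
        "outside_oplog_window": sorted(
            name for name, lag in lags.items() if lag >= oplog_window_seconds
        ),
    }


# Hand-written sample with hypothetical host names.
utc = timezone.utc
status = {
    "members": [
        {"name": "db-a:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2024, 6, 1, 12, 0, 0, tzinfo=utc)},
        {"name": "db-b:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2024, 6, 1, 11, 59, 58, tzinfo=utc)},
        {"name": "db-c:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2024, 6, 1, 9, 0, 0, tzinfo=utc)},
    ]
}
print(lag_report(status, oplog_window_seconds=7200))
```

With a two-hour window, db-c (three hours behind) is reported as outside the window while db-b (two seconds behind) is not.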

WiredTiger cache thrashing

Sign

'pages evicted by application threads' growing; query latency rising despite stable workload.

Fix

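Working set exceeds cache. Check that the WiredTiger cache has not been set below its default of roughly 50% of host RAM (50% of RAM minus 1 GB, with a 256 MB floor), then add RAM. Or shard.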
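
To make the cache-pressure sign concrete, the sketch below compares two samples of the `wiredTiger.cache` section of serverStatus and computes the documented default cache size for a given amount of RAM. It is a hedged illustration: the function names are hypothetical, the sample numbers are invented, and treating "1 GB" and "256 MB" as binary units is an assumption.

```python
GIB = 2**30
MIB = 2**20


def default_cache_bytes(ram_bytes):
    """Documented default: the larger of 50% of (RAM - 1 GB) and 256 MB."""
    return max((ram_bytes - GIB) // 2, 256 * MIB)


def cache_pressure(prev, curr, interval_seconds):
    """Summarize two serverStatus()["wiredTiger"]["cache"] samples."""
    max_bytes = curr["maximum bytes configured"]
    key = "pages evicted by application threads"
    evicted = curr[key] - prev[key]
    return {
        "fill_ratio": curr["bytes currently in the cache"] / max_bytes,
        "dirty_ratio": curr["tracked dirty bytes in the cache"] / max_bytes,
        # A counter reset (restart) would make the delta negative; clamp it.
        "app_evictions_per_sec": max(evicted, 0) / interval_seconds,
    }


# Invented sample values, scaled down for readability.
prev = {"pages evicted by application threads": 100}
curr = {
    "maximum bytes configured": 1000,
    "bytes currently in the cache": 960,
    "tracked dirty bytes in the cache": 150,
    "pages evicted by application threads": 400,
}
print(cache_pressure(prev, curr, interval_seconds=60))
print(default_cache_bytes(16 * GIB))  # 8053063680
```

A fill ratio that stays high while app_evictions_per_sec is above zero matches the thrashing sign described above.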

Hot shard in a sharded cluster

Sign

Per-shard opcounters show one shard handling most writes; balancer not moving chunks.

Fix

Revisit the shard key. A hot shard usually means the key is too low-cardinality or monotonically increasing. Range-sharding by ObjectId is the classic anti-pattern: every new document lands in the last chunk.
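
The hot-shard sign can be approximated by comparing each shard's share of writes against an even split. The sketch below is illustrative only: it assumes the inputs are per-shard opcounters deltas over the same interval (not lifetime totals), and the function names, shard names, and the "twice the fair share" factor are hypothetical choices.

```python
def write_share_by_shard(opcounters_by_shard):
    """Fraction of all inserts, updates, and deletes handled by each shard."""
    writes = {
        shard: c["insert"] + c["update"] + c["delete"]
        for shard, c in opcounters_by_shard.items()
    }
    total = sum(writes.values())
    return {shard: (w / total if total else 0.0) for shard, w in writes.items()}


def hot_shards(opcounters_by_shard, factor=2.0):
    """Shards taking more than `factor` times an even share of writes."""
    shares = write_share_by_shard(opcounters_by_shard)
    if not shares:
        return []
    fair_share = 1 / len(shares)
    return sorted(s for s, share in shares.items() if share > factor * fair_share)


# Invented opcounters deltas for a three-shard cluster.
deltas = {
    "shard0": {"insert": 8000, "update": 900, "delete": 100},
    "shard1": {"insert": 400, "update": 90, "delete": 10},
    "shard2": {"insert": 450, "update": 40, "delete": 10},
}
print(hot_shards(deltas))  # ['shard0']
```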

vs Datadog DBM for MongoDB

Datadog DBM MongoDB covers the basics. Obsfly adds aggregation-pipeline cost analysis ($lookup/$group/$unwind hot-stage detection), oplog-window forecasting, and structured currentOp at 1 Hz — instead of Datadog's 10s default.

FAQ

What MongoDB versions and topologies?

4.4, 5.0, 6.0, 7.0, 8.0. Standalone, replica sets, and sharded clusters. MongoDB Atlas and Amazon DocumentDB.

Does the profiler add load?

Level 1 (slow ops only) at slowms=100 typically adds under 2% overhead on OLTP workloads. Level 2 (all ops) is for diagnosis only.
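
For reference, the profiler settings mentioned above map onto MongoDB's `profile` database command. The sketch below only builds the command document; the helper name is hypothetical, and managed services may restrict or ignore this command.

```python
def profiler_command(level=1, slowms=100, sample_rate=1.0):
    """Build a `profile` command document.

    level: 0 = off, 1 = slow operations only, 2 = all operations.
    slowms: threshold in milliseconds for level 1.
    sample_rate: fraction of slow operations to record (0.0 to 1.0).
    """
    if level not in (0, 1, 2):
        raise ValueError("profiler level must be 0, 1, or 2")
    if slowms < 0:
        raise ValueError("slowms must be non-negative")
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    # The command name must be the first key; dicts keep insertion order.
    return {"profile": level, "slowms": slowms, "sampleRate": sample_rate}


print(profiler_command())  # {'profile': 1, 'slowms': 100, 'sampleRate': 1.0}
```

With PyMongo, a document like this would typically be passed to `Database.command()` on each database to be profiled.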

What permissions does the monitoring user need?

clusterMonitor on the cluster + read on each database whose system.profile you want to scrape. We supply the createRole script.

Atlas Performance Advisor: do I still need this?

Atlas covers slow-query advice on Atlas-hosted clusters only. Obsfly gives the same coverage plus self-hosted, plus replica/oplog/sharded metrics not in Performance Advisor.

· · ·

See Obsfly on your MongoDB.

20-min demo. We connect to a sample MongoDB on the call and reproduce your slowest query in the tool.
