Obsfly

Apache Cassandra Monitoring

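Cassandra monitoring that catches compaction backlog in time.

Obsfly scrapes JMX MBeans, compaction statistics, hints queue size, and repair status, and translates them into actionable per-node forecast bands.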

Why monitor Cassandra

Cassandra's pathologies are distinctive: hot partitions, compaction storms, repair backlogs, and DC-to-DC tail latency. Generic DBM tools miss the JMX surface that exposes them.

What we scrape

Obsfly reads Cassandra through the surfaces operators already know. No driver changes, no extensions installed by us, no agent on the database itself.

JMX MBeans (org.apache.cassandra.metrics)

Read/Write latency histograms per keyspace and table.

JMX (org.apache.cassandra.db)

Compaction state, hint queue, dropped messages.

JMX (org.apache.cassandra.net)

Cross-DC and cross-node messaging metrics.

Slow query log (5.0+)

Per-CQL slow execution log with attribution.

system.* tables

system.peers, system.local for topology, hints state.
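To make the JMX surface concrete, here is a minimal sketch of how metric MBean object names are composed. It follows Cassandra's documented naming pattern for the `org.apache.cassandra.metrics` domain. The helper function names and the `shop.orders` table are hypothetical illustrations, not part of Obsfly's agent.

```python
# Hypothetical helpers (not Obsfly's API) that build JMX ObjectName strings
# following the naming pattern of Cassandra's org.apache.cassandra.metrics domain.

METRICS_DOMAIN = "org.apache.cassandra.metrics"


def table_metric(keyspace: str, table: str, metric: str) -> str:
    """ObjectName for a per-table metric such as ReadLatency or WriteLatency."""
    return f"{METRICS_DOMAIN}:type=Table,keyspace={keyspace},scope={table},name={metric}"


def compaction_metric(metric: str = "PendingTasks") -> str:
    """ObjectName for a node-wide compaction metric."""
    return f"{METRICS_DOMAIN}:type=Compaction,name={metric}"


def dropped_message_metric(message_type: str) -> str:
    """ObjectName for the dropped-message meter of one message type, e.g. MUTATION."""
    return f"{METRICS_DOMAIN}:type=DroppedMessage,scope={message_type},name=Dropped"
```

For example, `table_metric("shop", "orders", "ReadLatency")` names the read latency histogram of one table, the kind of per-table granularity described above.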

Key metrics tracked

Read/write latency p99 per table
Per-keyspace, per-table histograms.
Pending compactions
Total and per-table; alert when the count keeps growing for more than 30 minutes.
Hinted handoff queue depth
A depth that stays above 0 for a sustained period means replicas are missing writes.
Dropped messages per minute
By type (READ, MUTATION, RANGE_SLICE, etc.).
Read repair rate
Background repairs triggered per query; a high rate signals inconsistent replicas.
Tombstone scan ratio
Tombstones examined per query; high values indicate a delete-heavy workload.
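A rule such as "alert when growing for > 30 min" can be sketched as a check over timestamped samples. This is a minimal illustration, assuming "growing" means the value never decreased over the trailing window and ended higher than it started. The function name and sample layout are hypothetical, not Obsfly's actual alerting logic.

```python
from datetime import datetime, timedelta


def growing_for(samples, window=timedelta(minutes=30)):
    """Return True if the series has been growing for at least `window`.

    `samples` is a list of (timestamp, value) pairs sorted by time.
    Assumption: "growing" = the trailing run of samples never decreases,
    spans at least `window`, and ends above where it started.
    """
    if len(samples) < 2:
        return False
    start = len(samples) - 1
    # Walk backwards while each sample is >= the one before it.
    while start > 0 and samples[start - 1][1] <= samples[start][1]:
        start -= 1
    first_ts, first_val = samples[start]
    last_ts, last_val = samples[-1]
    return (last_ts - first_ts) >= window and last_val > first_val
```

The same shape of check would apply to hint queue depth, with the condition changed to "stayed above zero for the whole window".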

Common Cassandra pains, and how Obsfly surfaces each

Compaction backlog growing under write load

Sign

Pending compactions climb; SSTable count per table grows.

Fix

Increase concurrent_compactors and tune compaction_throughput_mb_per_sec (renamed compaction_throughput in Cassandra 4.1). Consider switching from STCS to LCS for read-heavy tables.
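The STCS-to-LCS switch is a per-table CQL change. The sketch below builds that statement as a string, using the standard `LeveledCompactionStrategy` class and its `sstable_size_in_mb` option. The helper name, the identifier check, and the `shop.orders` example are assumptions for illustration, and the statement still has to be run through cqlsh or a driver.

```python
import re

# Conservative check for unquoted CQL identifiers (an assumption of this sketch).
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def switch_to_lcs(keyspace: str, table: str, sstable_size_in_mb: int = 160) -> str:
    """Build an ALTER TABLE statement that moves a table to LeveledCompactionStrategy."""
    for name in (keyspace, table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a plain CQL identifier: {name!r}")
    options = (
        "{'class': 'LeveledCompactionStrategy', "
        f"'sstable_size_in_mb': {int(sstable_size_in_mb)}}}"
    )
    return f"ALTER TABLE {keyspace}.{table} WITH compaction = {options};"
```

Changing the strategy triggers recompaction of existing SSTables, so it is usually applied one table at a time while watching pending compactions.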

Hot partition detected

Sign

Latency tail dominated by one or two partitions; read repair rate spikes.

Fix

This is a schema problem. Redesign the partition key so the load spreads across more partitions.
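One common way to spread a hot key is to add a synthetic bucket to the partition key, so rows for one logical key land in several partitions. The sketch below is illustrative only: `tenant_id`, `event_id`, and the bucket count of 16 are hypothetical, and reads for a tenant would then need to query every bucket.

```python
import hashlib


def bucketed_partition_key(tenant_id: str, event_id: str, buckets: int = 16):
    """Return (tenant_id, bucket) to use as a composite partition key.

    The bucket comes from a stable hash of the row's own id, so writes for a
    single hot tenant are spread over `buckets` partitions instead of one.
    """
    if buckets < 1:
        raise ValueError("buckets must be >= 1")
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % buckets
    return (tenant_id, bucket)
```

The trade-off is read fan-out: fetching all rows for `tenant_id` means one query per bucket, so the bucket count should stay as small as the write load allows.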

Cross-DC tail latency

Sign

Reads at QUORUM or EACH_QUORUM consistency show long latency tails when they cross DC boundaries.

Fix

Use LOCAL_QUORUM or LOCAL_ONE where consistency allows. Check DC link health via dropped messages.
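The reason QUORUM and EACH_QUORUM cross DC boundaries falls out of the quorum arithmetic: a quorum is floor(RF / 2) + 1, computed over all replicas for QUORUM and per datacenter for the LOCAL and EACH variants. A small sketch, with hypothetical helper names and a two-DC layout assumed for the example:

```python
def quorum(rf: int) -> int:
    """Quorum size for a replication factor: floor(rf / 2) + 1."""
    return rf // 2 + 1


def replicas_required(level: str, rf_by_dc: dict, local_dc: str) -> dict:
    """Replica responses needed per datacenter ('*' means counted across all DCs)."""
    if level == "LOCAL_ONE":
        return {local_dc: 1}
    if level == "LOCAL_QUORUM":
        return {local_dc: quorum(rf_by_dc[local_dc])}
    if level == "EACH_QUORUM":
        return {dc: quorum(rf) for dc, rf in rf_by_dc.items()}
    if level == "QUORUM":
        return {"*": quorum(sum(rf_by_dc.values()))}
    raise ValueError(f"unsupported consistency level: {level}")


def must_cross_dc(level: str, rf_by_dc: dict, local_dc: str) -> bool:
    """True if the request cannot be satisfied by local-DC replicas alone."""
    need = replicas_required(level, rf_by_dc, local_dc)
    if "*" in need:
        return need["*"] > rf_by_dc[local_dc]
    return any(dc != local_dc for dc in need)
```

With a replication factor of 3 in each of two datacenters, QUORUM needs 4 of 6 replicas, more than the 3 local ones, so every such request waits on at least one remote replica. LOCAL_QUORUM needs only 2 local replicas.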

vs Datadog DBM for Cassandra

Datadog's Cassandra integration scrapes JMX with limited per-table granularity. Obsfly extracts every Cassandra-specific MBean and surfaces hot partitions, repair lag, and tombstone ratios as first-class metrics with forecast bands.
Full Datadog DBM comparison →

FAQ

Cassandra vs ScyllaDB: are both supported?

Yes. ScyllaDB exposes Cassandra-compatible JMX (or the modern HTTP REST API). The agent picks the best-available surface.

Which versions are supported?

Apache Cassandra 3.11, 4.x, and 5.0; DataStax Enterprise; ScyllaDB 5.x and 6.x.

· · ·

See Obsfly on your Cassandra.

20-min demo. We connect to a sample Cassandra on the call and reproduce your slowest query in the tool.
