Redis Monitoring
Redis monitoring that finds the 5 fields that actually matter.
Obsfly extracts INFO, slowlog, latency monitor, and keyspace stats, turning 200+ fields into the 8 alerts that predict every incident.
Why monitor Redis
Redis is famously fast and famously easy to mis-tune. Big keys, blocking commands, and unbounded sorted sets are the production killers — and they're invisible without a tool that reads slowlog and tracks keyspace size over time.
What we scrape
Obsfly reads Redis through the surfaces operators already know. No driver changes, no extensions installed by us, no agent on the database itself.
INFO sections
server / clients / memory / stats / replication / commandstats / keyspace.
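To make the section layout concrete, here is a minimal sketch of splitting raw INFO text into per-section dictionaries. The function name `parse_info` and the sample text are illustrative assumptions, not Obsfly's actual collector.

```python
def parse_info(text: str) -> dict[str, dict[str, str]]:
    """Split raw INFO output into {section: {field: value}}."""
    sections: dict[str, dict[str, str]] = {}
    current = "default"  # used only if fields appear before any header
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Section headers look like "# Memory".
            current = line.lstrip("#").strip().lower()
            sections.setdefault(current, {})
            continue
        key, sep, value = line.partition(":")
        if sep:
            sections.setdefault(current, {})[key] = value
    return sections


# Hypothetical, heavily trimmed INFO output for illustration.
SAMPLE_INFO = """\
# Server
redis_mode:standalone

# Memory
used_memory:1048576
maxmemory:0
maxmemory_policy:noeviction

# Keyspace
db0:keys=42,expires=3,avg_ttl=0
"""

info = parse_info(SAMPLE_INFO)
print(info["memory"]["used_memory"])
```

Values are kept as strings on purpose; callers convert only the fields they chart or alert on.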
SLOWLOG GET
Recent slow commands with arguments and execution time.
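One way to turn slowlog entries into something rankable is to aggregate them by command name. The sketch below assumes each entry has already been reduced to the first four fields of a SLOWLOG GET reply (id, Unix timestamp, duration in microseconds, argument list); `summarize_slowlog` and the sample data are hypothetical.

```python
from collections import defaultdict


def summarize_slowlog(entries):
    """Aggregate slowlog entries by command name.

    Each entry is (id, unix_ts, duration_us, args).
    Returns (command, stats) pairs, worst total time first.
    """
    stats = defaultdict(lambda: {"count": 0, "total_us": 0, "max_us": 0})
    for _id, _ts, duration_us, args in entries:
        cmd = str(args[0]).upper()  # command names are case-insensitive
        s = stats[cmd]
        s["count"] += 1
        s["total_us"] += duration_us
        s["max_us"] = max(s["max_us"], duration_us)
    return sorted(stats.items(), key=lambda kv: kv[1]["total_us"], reverse=True)


# Hypothetical entries for illustration.
SAMPLE_SLOWLOG = [
    (14, 1700000000, 15000, ["KEYS", "*"]),
    (13, 1699999990, 12000, ["keys", "user:*"]),
    (12, 1699999980, 11000, ["SMEMBERS", "big:set"]),
]

ranked = summarize_slowlog(SAMPLE_SLOWLOG)
for cmd, s in ranked:
    print(cmd, s["count"], s["total_us"])
```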
LATENCY HISTORY / LATENCY GRAPH
Latency monitor events with millisecond resolution.
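As a rough illustration of what can be derived from one event's history, the sketch below reduces a list of samples to count, maximum, and average. It assumes each sample is a (Unix timestamp, latency in milliseconds) pair as LATENCY HISTORY reports them; `summarize_latency` and the sample values are made up.

```python
def summarize_latency(samples):
    """samples: list of (unix_ts, latency_ms) pairs for a single event."""
    if not samples:
        return None
    latencies = [ms for _ts, ms in samples]
    return {
        "count": len(latencies),
        "max_ms": max(latencies),
        "avg_ms": sum(latencies) / len(latencies),
        "last_ts": max(ts for ts, _ms in samples),
    }


# Hypothetical samples for the "fork" event.
FORK_SAMPLES = [(1700000000, 120), (1700000060, 80), (1700000120, 100)]
summary = summarize_latency(FORK_SAMPLES)
print(summary)
```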
CLUSTER NODES / CLUSTER INFO
Cluster topology, slot ownership, node failures.
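The CLUSTER NODES reply is a line-per-node text format, so topology and failure flags can be read with a small parser. This is a minimal sketch assuming the standard field order (id, address, flags, master id, ping-sent, pong-recv, config epoch, link state, then slots); the node IDs and addresses in the sample are invented.

```python
def parse_cluster_nodes(text):
    """Parse CLUSTER NODES output into a list of dicts."""
    nodes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 8:
            continue  # skip blank or malformed lines
        nodes.append({
            "id": parts[0],
            # Address is ip:port@cport[,hostname]; keep only ip:port.
            "addr": parts[1].split("@")[0],
            "flags": set(parts[2].split(",")),
            "master_id": None if parts[3] == "-" else parts[3],
            "link_state": parts[7],
            "slots": parts[8:],
        })
    return nodes


def failing_nodes(nodes):
    """Addresses of nodes flagged fail or fail? (suspected failure)."""
    return [n["addr"] for n in nodes if n["flags"] & {"fail", "fail?"}]


# Hypothetical three-node sample with shortened IDs.
SAMPLE_NODES = """\
aaa1 10.0.0.1:6379@16379 myself,master - 0 1700000000000 1 connected 0-8191
bbb2 10.0.0.2:6379@16379 master - 0 1700000000500 2 connected 8192-16383
ccc3 10.0.0.3:6379@16379 slave,fail aaa1 1700000000100 1699999990000 1 disconnected
"""

nodes = parse_cluster_nodes(SAMPLE_NODES)
print(failing_nodes(nodes))
```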
DEBUG OBJECT / MEMORY USAGE
Per-key inspection (sampled, not continuous).
CLIENT LIST
Connected clients with idle time, addr, name, sub state.
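CLIENT LIST returns one line per connection made of space-separated `key=value` tokens. A minimal sketch of parsing it and picking out long-idle connections follows; the helper names and the sample lines (trimmed to a few fields) are assumptions for illustration.

```python
def parse_client_list(text):
    """Parse CLIENT LIST output into a list of {field: value} dicts."""
    clients = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # partition("=") gives (key, "=", value); [::2] keeps key and value.
        clients.append(dict(tok.partition("=")[::2] for tok in line.split()))
    return clients


def idle_clients(clients, min_idle_s):
    """Addresses of clients idle for at least min_idle_s seconds."""
    return [c["addr"] for c in clients if int(c.get("idle", 0)) >= min_idle_s]


# Hypothetical output, trimmed to a handful of fields.
SAMPLE_CLIENTS = """\
id=3 addr=10.0.0.5:52555 name=api age=855 idle=0 sub=0 cmd=get
id=4 addr=10.0.0.6:40100 name= age=9000 idle=7200 sub=1 cmd=subscribe
"""

clients = parse_client_list(SAMPLE_CLIENTS)
print(idle_clients(clients, min_idle_s=3600))
```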
Key metrics tracked
Common Redis pains, and how Obsfly surfaces each
Latency spikes under no obvious load change
Sign
LATENCY HISTORY shows fork or AOF write/fsync events (for example aof-write-pending-fsync), usually correlated with BGSAVE.
Fix
Tune the save schedule, or consider AOF-only persistence with appendfsync everysec. If you stay on RDB, schedule saves for low-traffic windows.
Memory growth despite consistent traffic
Sign
INFO memory shows used_memory growing while evicted_keys in INFO stats stays flat, so eviction is not kicking in.
Fix
The usual cause is that maxmemory is not set, or maxmemory-policy is noeviction. Set maxmemory and an evicting policy, then re-deploy.
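The check behind this diagnosis is small enough to sketch. The function below assumes the memory section of INFO has already been parsed into a dict of strings containing `maxmemory` and `maxmemory_policy`; `diagnose_memory` is a hypothetical name.

```python
def diagnose_memory(memory_info):
    """Return findings that explain unbounded memory growth."""
    findings = []
    maxmemory = int(memory_info.get("maxmemory", 0))
    policy = memory_info.get("maxmemory_policy", "noeviction")
    if maxmemory == 0:
        # 0 means no limit is configured.
        findings.append("maxmemory is 0 (unlimited)")
    if policy == "noeviction":
        findings.append("maxmemory-policy is noeviction")
    return findings


print(diagnose_memory({"maxmemory": "0", "maxmemory_policy": "noeviction"}))
print(diagnose_memory({"maxmemory": "4294967296", "maxmemory_policy": "allkeys-lru"}))
```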
Slow commands with KEYS *
Sign
Slowlog dominated by KEYS / FLUSHDB / SMEMBERS on huge sets.
Fix
Replace KEYS with SCAN (and SMEMBERS on large sets with SSCAN). Educate the team. Add a slowlog-based alert.
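For teams making the switch, the core of a SCAN replacement is a cursor loop that stops when the server hands back cursor 0. The sketch below takes the SCAN call as a parameter so it runs without a server; `scan_all` and `fake_scan` are illustrative names, and in practice `scan` would be a thin wrapper around your client library's SCAN call.

```python
def scan_all(scan, match="*", count=100):
    """Yield every key by following SCAN cursors.

    scan(cursor, match, count) must return (next_cursor, keys).
    SCAN may return a key more than once, so dedupe if that matters.
    """
    cursor = 0
    while True:
        cursor, keys = scan(cursor, match, count)
        yield from keys
        if cursor == 0:  # a returned cursor of 0 ends the iteration
            break


def fake_scan(cursor, match, count):
    """Stand-in for a real SCAN call; pages are keyed by cursor."""
    pages = {
        0: (7, ["user:1", "user:2"]),
        7: (3, []),  # an empty page with a non-zero cursor is normal
        3: (0, ["user:3"]),
    }
    return pages[cursor]


found = list(scan_all(fake_scan, match="user:*"))
print(found)
```

Unlike KEYS, each call does a bounded amount of work, so other commands get served between pages.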
Replication breaks under burst writes
Sign
Replicas disconnect; master_repl_offset jumps; replicas full-resync.
Fix
Increase repl-backlog-size to absorb the burst. Verify that client-output-buffer-limit replica (slave on older versions) isn't disconnecting replicas.
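A common rule of thumb, used here as an assumption rather than an Obsfly recommendation, is to size the backlog as peak write throughput times the longest replica disconnect you want to survive, with some headroom. The function name and the example numbers are hypothetical.

```python
import math


def suggested_backlog_bytes(write_bytes_per_s, max_disconnect_s, headroom=2.0):
    """Backlog size that covers a disconnect at peak write rate.

    write_bytes_per_s: peak replication stream rate during bursts.
    max_disconnect_s:  longest replica outage to absorb without full resync.
    headroom:          safety multiplier (assumed; tune to taste).
    """
    return math.ceil(write_bytes_per_s * max_disconnect_s * headroom)


# Example: 5 MiB/s bursts and a 60 s disconnect window.
size = suggested_backlog_bytes(5 * 1024 * 1024, 60)
print(size, "bytes =", size // (1024 * 1024), "MiB")
```

With these inputs the result is 600 MiB, far above the 1 MB default, which is why burst-heavy workloads tend to hit full resyncs on stock settings.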
vs Datadog DBM for Redis
Obsfly features for Redis
Feature
Query Summary
Top-N normalized queries with p50 / p95 / p99 latency, QPS, total time, rows touched, and plan-change history.
Feature
Query Activity
Live query stream with wait events, lock chains, slow-query alerts, and sample-once-per-second activity snapshots.
Feature
Anomaly Detection
ML-driven anomaly detection on every metric. Forecast bands, change-point detection, no thresholds to tune.
Feature
Configuration Tracking
Database parameter inventory, drift from baseline, recommended values, change history with attribution.
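Drift from baseline can be pictured as a dictionary diff over configuration snapshots, such as two CONFIG GET results taken at different times. This is a minimal sketch of that idea, not Obsfly's implementation; `config_drift` and the sample values are hypothetical.

```python
def config_drift(baseline, current):
    """Return {param: (baseline_value, current_value)} for every difference.

    A parameter missing on one side is reported with None for that side.
    """
    drift = {}
    for key in sorted(baseline.keys() | current.keys()):
        before, after = baseline.get(key), current.get(key)
        if before != after:
            drift[key] = (before, after)
    return drift


BASELINE = {"maxmemory": "4gb", "maxmemory-policy": "allkeys-lru", "appendonly": "yes"}
CURRENT = {"maxmemory": "4gb", "maxmemory-policy": "noeviction", "appendonly": "yes"}

print(config_drift(BASELINE, CURRENT))
```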
FAQ
Standalone, Sentinel, Cluster: all supported?
Yes. The agent auto-detects topology and scrapes accordingly. Cluster mode collects from every node and reconstructs cluster-wide views.
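One plausible way to detect topology, shown as an assumption since the agent's actual logic is not described here, is the `redis_mode` field in the server section of INFO, which reports standalone, sentinel, or cluster. The command lists below are illustrative labels for a scrape plan, not a documented Obsfly setting.

```python
# Hypothetical per-mode scrape plan; the command names are real Redis commands.
BASE_COMMANDS = ["INFO", "SLOWLOG GET", "LATENCY LATEST", "CLIENT LIST"]


def scrape_plan(server_info):
    """Pick commands to scrape based on INFO server's redis_mode field."""
    mode = server_info.get("redis_mode", "standalone")
    if mode == "cluster":
        # Cluster nodes additionally expose topology and slot state.
        return mode, BASE_COMMANDS + ["CLUSTER NODES", "CLUSTER INFO"]
    if mode == "sentinel":
        # Sentinel processes hold no data; ask them about monitored masters.
        return mode, ["INFO", "SENTINEL MASTERS"]
    return mode, list(BASE_COMMANDS)


mode, commands = scrape_plan({"redis_mode": "cluster"})
print(mode, commands)
```

Note that a data node watched by Sentinel still reports standalone; only the Sentinel processes themselves report sentinel.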
Does it work with managed Redis (ElastiCache, MemoryDB, Cloud Memorystore, Upstash)?
Yes. Standard Redis protocol; all commands we use are supported by managed providers.
Big-key detection: how does it not DoS my Redis?
We sample at most 1 key per 200ms via SCAN + DEBUG OBJECT, with a configurable budget. No KEYS *, no MEMORY DOCTOR loops.
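The "at most 1 key per 200ms, with a budget" rule can be sketched as a small rate limiter that a sampler consults before inspecting each key. The class name, the `max_samples` parameter, and the injectable clock are assumptions made so the example runs without a server or real sleeps.

```python
import time


class SampleBudget:
    """Allow at most one sample per min_interval_s, up to max_samples total."""

    def __init__(self, min_interval_s=0.2, max_samples=1000, clock=time.monotonic):
        self.min_interval_s = min_interval_s
        self.remaining = max_samples
        self._clock = clock
        self._next_allowed = clock()

    def try_acquire(self):
        """Return True if a key may be inspected right now."""
        now = self._clock()
        if self.remaining <= 0 or now < self._next_allowed:
            return False
        self.remaining -= 1
        self._next_allowed = now + self.min_interval_s
        return True


# Drive the limiter with a fake clock instead of sleeping.
fake_now = [0.0]
budget = SampleBudget(min_interval_s=0.2, max_samples=2, clock=lambda: fake_now[0])
results = []
for t in (0.0, 0.1, 0.2, 0.4):
    fake_now[0] = t
    results.append(budget.try_acquire())
print(results)
```

The second call is refused because it arrives too early, and the fourth because the budget of two samples is spent.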