Elasticsearch Monitoring
Elasticsearch monitoring that predicts the GC pause storm.
Obsfly reads cluster health, JVM heap pressure, the slow log, and shard allocation, and translates them into forecast bands that catch GC storms and hot nodes 30 days ahead.
Why monitor Elasticsearch
Elasticsearch in production is mostly a JVM-tuning game with shard-allocation politics on top. The metrics that matter — heap pressure, GC time, indexing back-pressure, queue overflow — are buried in node stats. Obsfly surfaces them.
What we scrape
Obsfly reads Elasticsearch through the surfaces operators already know. No driver changes, no extensions installed by us, no agent on the database itself.
_cluster/health
Cluster status (green/yellow/red), unassigned shards, pending tasks.
_cluster/stats
Total shards, indices, fielddata size, query/fetch latency.
_nodes/stats
Per-node JVM heap, GC, thread pools, HTTP, transport.
Slow log (settings index.search.slowlog.*)
Slow search and indexing operations, logged per request.
_cat/shards / _cat/recovery
Shard placement and recovery state.
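As a rough illustration of what reading these surfaces involves (a sketch, not Obsfly's collector), the snippet below fetches one read-only endpoint over HTTP and keeps the three health fields named above. The base URL, the `fetch_json` and `summarize_health` helper names, and the absence of authentication are assumptions made for the example.

```python
import json
from urllib.request import urlopen

# Endpoint paths come from the list above; the base URL passed in is a placeholder.
ENDPOINTS = ["_cluster/health", "_cluster/stats", "_nodes/stats"]


def fetch_json(base_url: str, path: str, timeout: float = 5.0) -> dict:
    """GET one read-only endpoint and decode the JSON body (no auth in this sketch)."""
    with urlopen(f"{base_url.rstrip('/')}/{path}", timeout=timeout) as resp:
        return json.load(resp)


def summarize_health(health: dict) -> dict:
    """Keep status, unassigned shards, and pending tasks from a _cluster/health response."""
    return {
        "status": health.get("status"),
        "unassigned_shards": health.get("unassigned_shards", 0),
        "pending_tasks": health.get("number_of_pending_tasks", 0),
    }
```

Against a local, unsecured test node this could be called as `summarize_health(fetch_json("http://localhost:9200", "_cluster/health"))`; a secured cluster would also need credentials and TLS settings.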
Key metrics tracked
Common Elasticsearch pains, and how Obsfly surfaces each
Old-gen GC pauses spiking
Sign
Old-gen GC count growing; pause time > 1s; heap usage stays high after collection.
Fix
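The heap is too small, or fielddata is bloating it. Increase the heap, keeping it below the compressed-oops threshold (about 30.5 GB at most; the exact cutoff varies by JVM) and at no more than half of the node's RAM, or move fielddata-heavy fields to doc_values.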
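The sign above can be checked by hand from two samples of one node's `jvm` section in `_nodes/stats`. The sketch below is only an illustration: the function name and the 75% heap limit are assumptions, and the 1-second pause limit mirrors the sign described above.

```python
def old_gen_gc_signal(prev: dict, curr: dict,
                      pause_ms_limit: float = 1000.0,
                      heap_pct_limit: int = 75) -> dict:
    """Compare two `jvm` sections from _nodes/stats taken from the same node."""
    old_prev = prev["gc"]["collectors"]["old"]
    old_curr = curr["gc"]["collectors"]["old"]
    # Both counters are cumulative, so the difference covers the sampling interval.
    collections = old_curr["collection_count"] - old_prev["collection_count"]
    pause_ms = old_curr["collection_time_in_millis"] - old_prev["collection_time_in_millis"]
    avg_pause_ms = pause_ms / collections if collections > 0 else 0.0
    heap_pct = curr["mem"]["heap_used_percent"]
    return {
        "old_collections": collections,
        "avg_pause_ms": avg_pause_ms,
        "heap_used_percent": heap_pct,
        # Long old-gen pauses while the heap stays full: the pattern described above.
        "suspect": collections > 0 and avg_pause_ms > pause_ms_limit and heap_pct >= heap_pct_limit,
    }
```

Averaging over the interval hides a single long pause among many short ones, so a real check would likely also look at the GC log.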
Indexing rate drops under load
Sign
Rejections on the write thread pool (named bulk before Elasticsearch 6.3) climb; the queue is saturated.
Fix
Increase queue size (cautiously). Better: shard your indices more, or switch to time-series data streams (TSDS).
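To see this sign directly, compare the `thread_pool` section of `_nodes/stats` between two samples. The sketch below assumes both samples come from the same node; the function name is hypothetical, and the fallback to `bulk` covers older 6.x nodes.

```python
def write_pool_pressure(prev: dict, curr: dict) -> dict:
    """Compare two `thread_pool` sections from _nodes/stats for one node."""
    # The pool is called "write" on current versions and "bulk" on older 6.x nodes.
    name = "write" if "write" in curr else "bulk"
    delta = curr[name]["rejected"] - prev[name]["rejected"]
    if delta < 0:
        # The counter restarted (for example after a node restart); count from zero.
        delta = curr[name]["rejected"]
    return {
        "pool": name,
        "queue": curr[name]["queue"],
        "new_rejections": delta,
        "back_pressure": delta > 0,
    }
```

Any new rejections between samples mean that clients were refused writes in that interval, which is usually a better alert condition than the queue depth alone.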
Cluster stays yellow/red with unassigned shards
Sign
_cluster/health reports unassigned_shards > 0; _cat/shards shows the reason in its unassigned.reason column.
Fix
Ask the allocation explain API why: GET /_cluster/allocation/explain. Common causes: a disk watermark exceeded, or an allocation filter that no node matches.
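The allocation explain response lists, for each node, which allocation deciders said no. A small sketch for tallying them is shown below; the helper name is hypothetical, and the field names follow the response format of `GET /_cluster/allocation/explain`.

```python
from collections import Counter


def blocking_deciders(explain: dict) -> dict:
    """Tally the deciders that returned NO across all nodes in an explain response."""
    counts = Counter(
        decider["decider"]
        for node in explain.get("node_allocation_decisions", [])
        for decider in node.get("deciders", [])
        if decider.get("decision") == "NO"
    )
    return {
        "index": explain.get("index"),
        "shard": explain.get("shard"),
        "reason": explain.get("unassigned_info", {}).get("reason"),
        "blocked_by": dict(counts),
    }
```

A tally dominated by `disk_threshold` points to the disk watermark cause, and one dominated by `filter` points to an allocation filter mismatch.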
vs Datadog DBM for Elasticsearch
Obsfly features for Elasticsearch
Feature
Query Summary
Top-N normalized queries with p50 / p95 / p99 latency, QPS, total time, rows touched, and plan-change history.
Feature
Anomaly Detection
ML-driven anomaly detection on every metric. Forecast bands, change-point detection, no thresholds to tune.
Feature
Forecast
Capacity forecasts for QPS, IOPS, storage, connections — predict outages weeks ahead.
FAQ
Is OpenSearch supported?
Yes. OpenSearch exposes the same APIs and runs on the same JVM, so Obsfly supports both Elasticsearch (the OSS and Elastic's commercial distributions) and OpenSearch.
Which versions?
Elasticsearch 7.x, 8.x, 9.x. OpenSearch 1.x, 2.x, 3.x. Older 6.x works with reduced detail.
See Obsfly on your Elasticsearch.
20-min demo. We connect to a sample Elasticsearch on the call and reproduce your slowest query in the tool.