Monitoring Pipeline
The monitoring pipeline consists of the following components.
Infrastructure Metric
- Resource Metric: Memory, CPU
- Cluster Metric: Nodes count
- LB (load balancer) Metric: 5xx error rate, etc.
Application Metric
- Metrics generated by the application: response time, status code, request count (aka throughput)
- Application -> Client Agent -> Metric DB -> Metric Dashboard
- Client agent (e.g. statsd) installed at each node
- Metric DB (e.g. Graphite, New Relic): storage for metrics
- Metric Dashboard (e.g. New Relic, Grafana): visualization graphs
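As a rough illustration of the Application -> Client Agent hop, the sketch below emits the three application metrics as StatsD-style lines (`<name>:<value>|<type>`) over UDP to an agent on the same node. The metric names (`app.<route>...`), the function names, and the agent address `127.0.0.1:8125` are assumptions made for this example, not part of the pipeline described above.

```python
import socket


def format_statsd(name: str, value: int, metric_type: str) -> str:
    """Build one StatsD line: <name>:<value>|<type> (c = counter, ms = timer)."""
    return f"{name}:{value}|{metric_type}"


def request_metrics(route: str, status_code: int, elapsed_ms: int) -> list[str]:
    """Return the metric lines for one handled request (hypothetical naming scheme)."""
    return [
        format_statsd(f"app.{route}.response_time", elapsed_ms, "ms"),
        format_statsd(f"app.{route}.status.{status_code}", 1, "c"),
        format_statsd(f"app.{route}.requests", 1, "c"),  # request count / throughput
    ]


def send_metrics(lines: list[str], host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send each line as a fire-and-forget UDP datagram to the node-local agent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for line in lines:
            sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    send_metrics(request_metrics("checkout", 200, 42))
```

UDP keeps the application decoupled from the agent: if the agent is down, the metric is dropped rather than blocking the request.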
Logs - Splunk
- Logs from Application to logging pipeline
- Application (log producer) -> Client Agent (log receiver) -> Log Storage/Search/Visualization
- Client Agent (e.g. Splunk agent) installed at each node
- Log Storage/Search/Visualization (e.g. Splunk)
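On the producer side of the logging pipeline, one common approach is to write one JSON object per log line so the node-local agent can forward it and the backend can index its fields. The sketch below assumes the agent collects the application's standard output; the `JsonFormatter` class, the field names, and the `app` logger name are hypothetical choices for this example.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Include the stack trace so exceptions are searchable downstream.
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str = "app") -> logging.Logger:
    """Return a logger that writes JSON lines to stdout (assumed agent input)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeat calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger()
    try:
        1 / 0
    except ZeroDivisionError:
        log.exception("request failed")
```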
The monitoring graphs consist of the following:
Average Response Time
p99 (99th percentile) Response Time
5xx Error Count
4xx Error Count
Response code percentage - 1xx, 2xx, 3xx, 4xx, 5xx
Throughput
Apdex
Container Count
CPU Usage
RAM Usage
Threadpool Usage
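Several of the graphs above are derived from raw request samples. The following is a minimal sketch of how average response time, p99, Apdex, and the response code percentages could be computed from one window of samples. It assumes the nearest-rank percentile method and the standard Apdex definition (satisfied at or below a threshold T, tolerating up to 4T); a real metric backend may use different estimators, and all function names here are hypothetical.

```python
import math
from collections import Counter


def average(values: list[float]) -> float:
    """Arithmetic mean of the response times."""
    return sum(values) / len(values)


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def apdex(values: list[float], threshold: float) -> float:
    """Apdex = (satisfied + tolerating / 2) / total, with tolerating in (T, 4T]."""
    satisfied = sum(1 for v in values if v <= threshold)
    tolerating = sum(1 for v in values if threshold < v <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(values)


def status_class_percentages(codes: list[int]) -> dict[str, float]:
    """Share of responses per status class (1xx to 5xx), in percent."""
    counts = Counter(f"{code // 100}xx" for code in codes)
    total = len(codes)
    return {
        cls: 100 * counts.get(cls, 0) / total
        for cls in ("1xx", "2xx", "3xx", "4xx", "5xx")
    }


if __name__ == "__main__":
    times_ms = [100, 200, 600, 1200, 2500]
    print(average(times_ms), percentile(times_ms, 99), apdex(times_ms, 500))
    print(status_class_percentages([200, 200, 200, 404, 500]))
```

Plotting p99 next to the average is useful because the average can look healthy while a small share of requests is very slow.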
API Dashboard
Average Response Time
p99 (99th percentile) Response Time
5xx Error Count
Splunk Dashboard
Exception from Logs