Monitoring Pipeline

The monitoring pipeline consists of the following components.

Infrastructure Metric

- Resource Metric: Memory, CPU

- Cluster Metric: Node count

- LB (load balancer) Metric: 5xx error rate, etc.

Application Metric

- Metrics generated by the application: response time, status code, request count (aka throughput)

- Application -> Client Agent -> Metric DB -> Metric Dashboard

- Client Agent (e.g. statsd): installed on each node

- Metric DB (e.g. Graphite, New Relic): storage for metrics

- Metric Dashboard (e.g. New Relic, Grafana): visualization graphs
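
The Application -> Client Agent hop above can be sketched as follows, assuming a statsd-compatible agent listening on UDP port 8125 on the same node. The metric names (api.request.count and so on) and the helper functions are hypothetical and only illustrate the plain-text statsd datagram format, name:value|type.

```python
import socket
from typing import List


def format_statsd(name: str, value: float, metric_type: str) -> bytes:
    """Build one statsd datagram of the form <name>:<value>|<type>."""
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_metric(payload: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one datagram to the local agent (host and port are assumptions)."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(payload, (host, port))
    except OSError:
        # Best effort: a metrics failure should never break the request path.
        pass


def record_request(status_code: int, elapsed_ms: int) -> List[bytes]:
    """Emit the three application metrics for one handled request."""
    payloads = [
        format_statsd("api.request.count", 1, "c"),                   # throughput counter
        format_statsd(f"api.response.status.{status_code}", 1, "c"),  # status code counter
        format_statsd("api.response.time", elapsed_ms, "ms"),         # response time timer
    ]
    for payload in payloads:
        send_metric(payload)
    return payloads


sent = record_request(status_code=200, elapsed_ms=42)
```

UDP is used here because it is fire-and-forget: the application does not block or fail when the agent is down, at the cost of occasionally losing a data point.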

Logs - Splunk

- Logs flow from the application into the logging pipeline

- Application (log producer) -> Client Agent (log receiver) -> Log Storage/Search/Visualization

- Client Agent (e.g. Splunk agent): installed on each node

- Log Storage/Search/Visualization (e.g. Splunk)
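
On the producer side, the application only needs to write logs where the node's agent can pick them up. A minimal sketch, assuming the agent forwards a stream or file of one-JSON-object-per-line records; the logger name "orders-api" and the field names are hypothetical, and the agent configuration itself is not shown.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # json.dumps escapes newlines, so the traceback stays on one line.
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def build_logger(stream):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("orders-api")  # hypothetical service name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger


# An in-memory buffer stands in for the file or stdout that the agent would read.
buffer = io.StringIO()
log = build_logger(buffer)
log.info("order created")
try:
    1 / 0
except ZeroDivisionError:
    log.exception("order failed")

lines = buffer.getvalue().splitlines()
```

Keeping each record on a single line makes it easier for a log receiver to treat one line as one event, including records that carry a stack trace.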

The monitoring graphs consist of the following:

    • Average Response Time

    • p99 (99th percentile) Response Time

    • 5xx Error Count

    • 4xx Error Count

    • Response code percentage - 1xx, 2xx, 3xx, 4xx, 5xx

    • Throughput

    • Apdex

    • Container Count

    • CPU Usage

    • RAM Usage

    • Threadpool Usage
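
Several of the request-level graphs above can be derived from the same raw data. The sketch below assumes each request is recorded as a (status_code, response_time_ms) pair over a fixed time window; the function name, the 500 ms Apdex threshold, and the sample data are illustrative assumptions. The p99 uses the nearest-rank method, which is one of several common percentile definitions, and Apdex uses the standard formula (satisfied + tolerating / 2) / total with the tolerating band ending at 4T.

```python
import math
from collections import Counter


def summarize(requests, window_seconds, apdex_t_ms=500):
    """Compute dashboard figures from (status_code, response_time_ms) pairs."""
    if not requests:
        raise ValueError("at least one request is required")

    times = sorted(ms for _, ms in requests)
    total = len(times)

    # Nearest-rank percentile: smallest value with at least 99% of samples at or below it.
    p99 = times[math.ceil(0.99 * total) - 1]

    # Group status codes into classes such as "2xx" or "5xx".
    classes = Counter(f"{status // 100}xx" for status, _ in requests)

    # Apdex buckets: satisfied <= T, tolerating <= 4T, everything else frustrated.
    satisfied = sum(1 for ms in times if ms <= apdex_t_ms)
    tolerating = sum(1 for ms in times if apdex_t_ms < ms <= 4 * apdex_t_ms)

    return {
        "avg_ms": sum(times) / total,
        "p99_ms": p99,
        "count_5xx": classes["5xx"],
        "count_4xx": classes["4xx"],
        "pct_by_class": {k: 100 * v / total for k, v in classes.items()},
        "throughput_rps": total / window_seconds,
        "apdex": (satisfied + tolerating / 2) / total,
    }


# Hypothetical sample: 10 requests observed in a 10-second window.
sample = [(200, 100)] * 6 + [(200, 900)] * 2 + [(404, 100), (500, 3000)]
stats = summarize(sample, window_seconds=10)
```

In this sample the average (550 ms) looks moderate while the p99 (3000 ms) exposes the one very slow request, which is why the two are plotted side by side.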

API Dashboard

    • Average Response Time

    • p99 (99th percentile) Response Time

    • 5xx Error Count

Splunk Dashboard

    • Exception from Logs
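
What an "exceptions from logs" panel aggregates can be illustrated outside of Splunk. This is a minimal sketch, not Splunk's search language: it assumes exception class names end in "Exception" or "Error" and appear verbatim in the log line, and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches identifiers such as java.lang.NullPointerException or TimeoutError.
# The level keyword "ERROR" and the bare word "Error" are not matched.
EXCEPTION_RE = re.compile(r"\b([A-Za-z_][\w.]*(?:Exception|Error))\b")


def count_exceptions(lines):
    """Count the first exception name found on each log line."""
    counts = Counter()
    for line in lines:
        match = EXCEPTION_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical log lines.
sample_lines = [
    "ERROR java.lang.NullPointerException: id was null",
    "INFO request served in 12 ms",
    "ERROR TimeoutError while calling payments",
    "ERROR java.lang.NullPointerException: name was null",
]
exception_counts = count_exceptions(sample_lines)
```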