Short answer: there isn’t a single built-in “average response time” value for Kong. You measure it from Kong’s latency telemetry and then compute the average. Kong exposes three latencies:
- Request latency (a.k.a. total): time from first byte in to last byte out.
- Upstream latency: time the upstream took to start responding.
- Kong latency: time spent inside Kong (routing + plugins). (Kong Docs)
Below are the quickest ways to get the average.
Option 1 — Prometheus plugin (fastest for averages/percentiles)
Enable the Prometheus plugin with `config.latency_metrics=true`. It exposes histograms:
- `kong_request_latency_ms_*` (total)
- `kong_upstream_latency_ms_*`
- `kong_kong_latency_ms_*` (Kong Docs)
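As a sketch, enabling it globally via declarative config could look like this (assumes Kong 3.x, where `latency_metrics` gates the latency histograms):

```yaml
# kong.yml (declarative config) — enable the Prometheus plugin globally
plugins:
  - name: prometheus
    config:
      latency_metrics: true
```

Metrics are then scraped from the status API's `/metrics` endpoint.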
PromQL examples (last 5 minutes):
```promql
# Average total response time
sum(rate(kong_request_latency_ms_sum[5m]))
/
sum(rate(kong_request_latency_ms_count[5m]))

# Average per service
sum by (service) (rate(kong_request_latency_ms_sum[5m]))
/
sum by (service) (rate(kong_request_latency_ms_count[5m]))

# p95 latency (total)
histogram_quantile(
  0.95,
  sum by (le) (rate(kong_request_latency_ms_bucket[5m]))
)
```
Swap `request` for `upstream` or `kong` in the metric names to break the average down by where time is spent. (Kong Docs)
Option 2 — ELK/Kibana (since you’re using ELK)
Use a log plugin (e.g., File Log or HTTP Log) and ship the JSON to Elasticsearch. Each log line contains:
"latencies": {
"request": <total_ms>,
"proxy": <upstream_ms>,
"kong": <kong_ms>
}
In Kibana → Discover/Lens, set Average over `latencies.request` (or `latencies.proxy` / `latencies.kong`) to see the mean response time. (Kong Docs)
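If you want to sanity-check Kibana's numbers, the same averages can be computed directly from raw JSON log lines. A minimal sketch (the sample entries below are hypothetical, in the log shape shown above):

```python
import json
from statistics import mean

def average_latencies(log_lines):
    """Compute the mean request/proxy/kong latency (ms) from Kong JSON log lines."""
    entries = [json.loads(line) for line in log_lines]
    return {
        key: mean(entry["latencies"][key] for entry in entries)
        for key in ("request", "proxy", "kong")
    }

# Two hypothetical log lines as emitted by the File Log / HTTP Log plugin
sample = [
    '{"latencies": {"request": 120, "proxy": 115, "kong": 5}}',
    '{"latencies": {"request": 80, "proxy": 78, "kong": 2}}',
]
print(average_latencies(sample))  # {'request': 100, 'proxy': 96.5, 'kong': 3.5}
```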
Option 3 — Quick curl check (headers)
Kong can add latency headers:
- `X-Kong-Proxy-Latency` (≈ Kong time)
- `X-Kong-Upstream-Latency` (≈ upstream time)
- `X-Kong-Total-Latency` (total; with advanced tokens)

Enable with `KONG_HEADERS=latency_tokens` (or `advanced_latency_tokens`) and inspect with `curl -i`. (Kong Docs)
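A quick sketch of the check (assumes Kong reads `KONG_HEADERS` from its environment and proxies on `localhost:8000`; the route and header values below are illustrative):

```shell
# In Kong's environment (or headers = latency_tokens in kong.conf):
export KONG_HEADERS=latency_tokens

# Then inspect any proxied response:
curl -i http://localhost:8000/my-route
# ...
# X-Kong-Proxy-Latency: 2
# X-Kong-Upstream-Latency: 118
```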
What’s a “good” number?
It’s workload-specific. Treat Kong latency as your gateway overhead (aim for low single-digit ms) and focus tuning on upstream latency (usually the dominant share). Use p95/p99 alongside the average to catch tail latency. (Kong Docs)
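To quantify that split, the Prometheus metrics above can give the gateway's share of total time (a sketch using the same metric names):

```promql
# Fraction of total response time spent inside Kong itself
sum(rate(kong_kong_latency_ms_sum[5m]))
/
sum(rate(kong_request_latency_ms_sum[5m]))
```

If this ratio stays small, tuning effort is better spent on the upstream.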
If you want, I can drop a Grafana panel (PromQL prewired) and a Kibana Lens example targeting latencies.request so you can paste them into your stack.