
Detecting Server Latency Degradation

A server’s average response time barely changes — 44.7ms on Day 1, 45.1ms on Day 14. On a dashboard, the line looks flat. But beneath that flat line, the system is destabilizing. The noise envelope is expanding, tail latencies are growing, and the distribution is shifting from predictable to erratic.

This is real data from the Numenta Anomaly Benchmark — a server in Amazon’s East Coast datacenter whose trace ends in complete system failure caused by a documented AWS API outage. A winkComposer flow extracts 4-hour statistical fingerprints from the raw latency and fuses them into a health assessment that flags structural degradation 4 days before the first visible anomaly.

Drag the slider and watch it detect what the raw chart hides.


What You’re Seeing

The top chart shows raw request latency (cyan) tracked by an exponentially smoothed mean (lavender dashed). The filled envelope is the adaptive floor-to-ceiling range — tight when the server is stable, widening as latency behaviour becomes erratic.

Below it, the stddev sparkline shows the standard deviation computed over non-overlapping 4-hour windows. This is where the hidden structure emerges: the mean barely moved (0.3ms over 14 days), but the window-to-window standard deviation climbed 21% from the first quarter to the third — invisible in the raw signal, clear in the sparkline.

The vertical marker on the sparkline shows where a Page-Hinkley change-point detector fires — the noise floor has shifted to a new regime. With moderate sensitivity, this fires at Day 4.2, roughly 4 days before the first labeled anomaly at Day 8.2.

Three labeled anomalies appear in the dataset: a transient glitch (Day 8.2), a burst overload (Day 12.8), and system failure (Day 15.0). The assessment card tracks the escalation from Healthy through Monitor to Degraded as the statistical fingerprints deteriorate.

The key drivers table shows each evidence source’s intensity and persistence. The leading signal is volatility (window standard deviation), not the mean — exactly the kind of structural change that threshold-based alerting on the mean would miss entirely.


How It Works

One flow, five building blocks — each contributes a different perspective on server health:

  • twStats — extracts 4-hour statistical fingerprints from raw latency → exact stddev, range, kurtosis, cv per window
  • windowRange — computes the spread within each window → max − min: how wide the latency swings
  • esStats — smooths the raw signal and tracks the adaptive floor–ceiling envelope
  • volatilityShift — detects regime shifts in the window-to-window noise floor
  • serverHealth — fuses 4 evidence sources → health state + conviction × persistence

Tumbling window statistics compute exact mean, standard deviation, range, kurtosis, and coefficient of variation over non-overlapping 4-hour windows of raw latency. No exponential decay, no tuning — pure batch statistics at streaming speed. This is where the hidden structure emerges: the standard deviation climbing from 1.63 to 2.02 while the mean barely moves.
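As a concrete sketch of what the tumbling-window stage computes — plain Python, not the winkComposer API; the function name and dict keys are illustrative:

```python
import statistics

def tumbling_window_stats(latencies, window_size=48):
    """Exact batch statistics per non-overlapping window.

    48 samples at 5-minute intervals = one 4-hour window.
    Names and output shape are illustrative, not the flow's schema.
    """
    fingerprints = []
    for start in range(0, len(latencies) - window_size + 1, window_size):
        window = latencies[start:start + window_size]
        mean = statistics.fmean(window)
        stdev = statistics.pstdev(window)      # exact population stddev, no decay
        rng = max(window) - min(window)        # tail spread within the window
        cv = stdev / mean if mean else 0.0     # coefficient of variation
        # excess kurtosis: fourth standardized moment minus 3
        m4 = statistics.fmean([(x - mean) ** 4 for x in window])
        kurt = m4 / stdev ** 4 - 3 if stdev else 0.0
        fingerprints.append(
            {"mean": mean, "stdev": stdev, "range": rng, "cv": cv, "kurtosis": kurt}
        )
    return fingerprints
```

Because the windows never overlap and carry no weighting, a rise in `stdev` from one fingerprint to the next reflects a real change in that 4-hour slice, not smoothing memory.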

Window range computes the spread within each window — the difference between the highest and lowest latency reading. A widening range means tail latencies are growing, even if the average holds steady.

Exponential statistics provide the visual layer — smoothed signal and adaptive envelope for the chart. The floor-to-ceiling envelope tracks recent extremes and makes the raw signal’s behaviour legible.
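One way to sketch that visual layer — an EWMA of the raw signal plus a floor/ceiling that latches onto new extremes and relaxes back toward the mean. The `alpha` and `decay` values are arbitrary illustrations, not the flow's actual parameters:

```python
def smooth_with_envelope(samples, alpha=0.1, decay=0.05):
    """Exponentially smoothed mean plus an adaptive floor/ceiling envelope.

    The floor drops (and the ceiling jumps) instantly on a new extreme,
    then each edge relaxes back toward the smoothed mean at rate `decay`.
    """
    mean = floor = ceiling = samples[0]
    out = []
    for x in samples[1:]:
        mean += alpha * (x - mean)                            # EWMA of raw signal
        floor = min(x, floor + decay * (mean - floor))        # tightens slowly
        ceiling = max(x, ceiling - decay * (ceiling - mean))  # tightens slowly
        out.append((mean, floor, ceiling))
    return out
```

Stable latency pulls the two edges together; erratic latency keeps knocking them apart — which is exactly the widening envelope visible in the chart.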

Change-point detection runs a Page-Hinkley  test on the window standard deviation. When the noise floor shifts to a new level, it fires — the vertical marker on the sparkline. With moderate sensitivity (delta 0.008, lambda 0.8), it detects the first structural shift at Day 4.2 with zero false alarms in the first three days.
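The Page-Hinkley statistic itself is compact. Here is a minimal sketch of the upward-shift variant — the `delta`/`lam` defaults mirror the values quoted above, but the implementation details are an assumption, not the flow's code:

```python
def page_hinkley(values, delta=0.008, lam=0.8):
    """Return the 1-based index where an upward mean shift is flagged, or None.

    delta: drift tolerated per sample; lam: alarm threshold.
    Alarms when the cumulative deviation rises lam above its running minimum.
    """
    mean = 0.0   # incremental mean of all samples so far
    cum = 0.0    # cumulative deviation m_t
    low = 0.0    # running minimum M_t
    for t, x in enumerate(values, start=1):
        mean += (x - mean) / t
        cum += x - mean - delta
        low = min(low, cum)
        if cum - low > lam:
            return t
    return None
```

Fed the per-window standard deviations, it stays silent while the noise floor holds its level and fires shortly after the floor steps up to a new regime.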

The health assessment fuses four evidence sources into a single confidence score:

  • Volatility — window standard deviation exceeding the 1.63ms baseline (the dominant signal, weighted 1.0)
  • Tail latency — window range exceeding the 7.6ms baseline (how wide the extremes swing)
  • Distribution shape — excess kurtosis rising above zero (outlier frequency increasing; weighted 0.5 because it is impulsive, not progressive)
  • Consistency — coefficient of variation exceeding the 3.6% baseline (reliability dropping)

The system learns the server’s baseline automatically during the first few windows. After that, each source accumulates evidence independently. The combined conviction determines the health state (Healthy, Monitor, Degraded, Critical) and the recommended action, and the drivers table reports both the intensity and persistence of each source.
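A toy sketch of the fusion step — each fingerprint is a dict of per-window stats (stdev, range, cv, excess kurtosis). The weights follow the text (volatility 1.0, shape 0.5); the baseline-learning window, the conviction cut-offs, and the omission of persistence tracking are all simplifying assumptions:

```python
def assess_health(fingerprints, learn=3):
    """Fuse four evidence sources into a per-window health state.

    Baselines are the mean of the first `learn` windows; each source's
    evidence is its fractional excursion above baseline. Thresholds mapping
    conviction to a state are illustrative, not the flow's values.
    """
    base_sd = sum(f["stdev"] for f in fingerprints[:learn]) / learn
    base_rng = sum(f["range"] for f in fingerprints[:learn]) / learn
    base_cv = sum(f["cv"] for f in fingerprints[:learn]) / learn
    weights = {"volatility": 1.0, "tail": 1.0, "shape": 0.5, "consistency": 1.0}
    states = []
    for f in fingerprints[learn:]:
        evidence = {
            "volatility": max(0.0, f["stdev"] / base_sd - 1.0),
            "tail": max(0.0, f["range"] / base_rng - 1.0),
            "shape": max(0.0, f["kurtosis"]),   # excess kurtosis above zero
            "consistency": max(0.0, f["cv"] / base_cv - 1.0),
        }
        conviction = sum(weights[k] * v for k, v in evidence.items())
        if conviction < 0.2:
            state = "Healthy"
        elif conviction < 0.6:
            state = "Monitor"
        elif conviction < 1.2:
            state = "Degraded"
        else:
            state = "Critical"
        states.append((state, conviction))
    return states
```

The key property is that no single source has to cross an alarm threshold on its own: several mild excursions, weighted and summed, are enough to escalate the state.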


Why Tumbling Windows?

Exponential smoothing adapts — it tracks the signal and decays old information. That is ideal for real-time alerting on individual samples. But for detecting structural change, you need statistics that do not adapt.

A 4-hour tumbling window holds 48 five-minute readings, and the statistic it reports is the exact standard deviation of those 48 readings — no weighting, no decay. When that number changes from window to window, the change is real. The standard deviation of Window 20 is the true standard deviation of those 4 hours of latency. When it rises from 1.63 to 1.94, a genuine structural shift has occurred — not an artefact of exponential forgetting.
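The contrast is easy to demonstrate. In this toy illustration (arbitrary `alpha`, not the flow's parameters), an exponentially weighted variance still carries the memory of an earlier noisy window into a perfectly steady one, while the exact tumbling-window statistic reads zero:

```python
import statistics

def ew_variance(values, alpha=0.05):
    """Exponentially weighted variance: old windows keep leaking in."""
    mean, var = values[0], 0.0
    for x in values[1:]:
        d = x - mean
        mean += alpha * d
        var = (1 - alpha) * (var + alpha * d * d)
    return var

burst = [0.0, 10.0] * 24   # one noisy 4-hour window (48 samples)
calm = [5.0] * 48          # one perfectly steady window

exact = statistics.pstdev(calm)     # exact stddev of the calm window: 0.0
ew = ew_variance(burst + calm)      # still elevated by the burst's memory
```

The exponentially weighted estimate is the right tool for reacting to individual samples; the exact window statistic is the right tool for asking whether this 4-hour slice truly differs from the last one.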


References

  • Ahmad, S., Lavin, A., Purdy, S. & Agha, Z. (2017). Unsupervised real-time anomaly detection for streaming data. Neurocomputing, 262, 134–147. doi:10.1016/j.neucom.2017.04.070 
  • Page, E.S. (1954). Continuous inspection schemes. Biometrika, 41(1/2), 100–115. doi:10.2307/2333009 
  • Dataset: Numenta Anomaly Benchmark (NAB) realKnownCause/ec2_request_latency_system_failure.csv. 4,032 samples at 5-minute intervals, 14 days. Server in Amazon’s East Coast datacenter; documented system failure from AWS API outage.
