AutoCon5 · Hands-on workshop

Modern Network Observability

One workday on a new on-call rotation. Four hours, one laptop.

You arrive on a new on-call rotation. Over four hours you learn the lab's telemetry, build the dashboard your team needed yesterday, and watch the automation handle a real alert — with a senior engineer over your shoulder until lunch. By the time you close the laptop you've queried real-shape telemetry, made a Grafana panel answer an operational question, walked through how Alertmanager → Prefect routes a live alert, and decided which paths you'd trust an LLM-assisted RCA on at 02:14.

Open the workshop · Quickstart · Install Docker and uv

~4 hours total (with breaks) · ~80% hands-on · Whole stack runs locally: no shared backend, no live gear

What you'll do

Part 1 — Telemetry and queries

Morning of your first deep day on the rotation. Walk the lab's metrics and logs with your senior buddy. Find the broken peer. Bridge a metric anomaly to the log line that explains it.

Open Part 1
Part 2 — Dashboards and Alerts

Mid-morning. A post-mortem email lands — last night's page lost ten minutes because a flap-rate panel didn't exist. Build it now, with thresholds matching the alert rule, while the team is still in the room.

Open Part 2
Part 3 — Alert response, Automation and AI

Late morning. A real alert lands and your senior narrates how the workflow handles it. Walk the four canonical paths, toggle the AI RCA step, and decide which of those paths you'd trust the LLM narrative on.

Open Part 3
Advanced — The 02:14 page

The optional capstone. Hours after the senior signs off, your phone rings. Triage with PromQL and LogQL, contain with maintenance, fix the root cause, write the runbook. End-to-end, alone on the rotation.

Open the capstone
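Part 3 follows an alert from Alertmanager into the Prefect workflow. As a rough sketch of the first hop, the snippet below reads a payload in Alertmanager's webhook format (version 4) and groups alert names by status. The alert names, the device labels, the receiver name, and the split_by_status helper are hypothetical and not taken from the lab; the workshop's own receiver and its four paths may look quite different.

```python
import json

# Trimmed example in Alertmanager's webhook format (version "4").
# Alert names, labels, and the receiver name are hypothetical.
PAYLOAD = json.loads("""
{
  "version": "4",
  "status": "firing",
  "receiver": "workflow-webhook",
  "alerts": [
    {
      "status": "firing",
      "labels": {"alertname": "PeerDown", "device": "router-1"},
      "annotations": {"summary": "BGP peer is down"}
    },
    {
      "status": "resolved",
      "labels": {"alertname": "InterfaceFlapping", "device": "router-2"},
      "annotations": {"summary": "Interface stopped flapping"}
    }
  ]
}
""")


def split_by_status(payload: dict) -> dict:
    """Group alert names by their per-alert status ('firing' or 'resolved')."""
    groups = {"firing": [], "resolved": []}
    for alert in payload.get("alerts", []):
        name = alert.get("labels", {}).get("alertname", "unknown")
        # Treat a missing status as firing so nothing is silently dropped.
        groups.setdefault(alert.get("status", "firing"), []).append(name)
    return groups


print(split_by_status(PAYLOAD))
```

A workflow would typically act on the firing group and use the resolved group to close out earlier work; how the lab actually branches is covered in Part 3.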

Why this workshop is different

One continuous investigation

Not four disconnected exercises. The same broken peer, the same dashboard, the same alert — followed across the morning. The story carries the technique.
Real-shape telemetry

Synthetic telemetry that mirrors what real gNMI streams and syslog UPDOWN events look like, with Telegraf normalizing the inputs into the canonical schema you'd query in production.
Senior-engineer voice

The guides talk to you the way a good buddy on rotation would. Story beats up front, click-and-run when it's time to type, expected output for every command.
Your laptop is enough

No shared backend. No real or containerized network devices. Docker and ~8 GB of RAM are all you need; nobs autocon5 up brings up the entire stack.
Concrete expected values

Every exercise tells you what to see when you run it. If your numbers disagree, that's a real signal, not "did I type it right?"
The 02:14 capstone

An optional final exercise: the page that lands when the senior is gone. End-to-end on-call investigation, designed for the attendee who wants to stress-test what they learned.

Get the lab on your laptop

Three commands, after Docker and uv are installed:

git clone https://github.com/network-observability/workshops.git
cd workshops
uv run nobs autocon5 up

The first up pulls 3–5 GB of images. After that, restarts take seconds.

New to Docker or uv?

The Install Docker and uv page has per-platform pointers and a nobs preflight command that catches the common gotchas before the workshop starts. Run it the night before — image pulls are the slow step.

What runs locally

Prometheus, Loki, Grafana, Alertmanager, Telegraf, Vector, Infrahub, Prefect, a FastAPI webhook receiver, and sonda generating the synthetic telemetry. About 21 containers, ~5.5 GB of RAM while the stack is running, fully torn down by nobs autocon5 destroy.

What if I've never written PromQL or LogQL?

A sketch-level idea of "metrics database" and "log database" is enough. Part 1 builds both query languages from first principles against live data, with your senior walking you through it.
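For a feel of what the two query languages look like before Part 1, here is a minimal sketch that builds (but does not send) query URLs for the Prometheus and Loki HTTP APIs. The ports are the upstream defaults (9090 and 3100) and are an assumption about how the lab exposes them; the job="syslog" label and the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed upstream default ports; the lab may map different ones.
PROMETHEUS = "http://localhost:9090"
LOKI = "http://localhost:3100"


def prom_instant_url(promql: str, base: str = PROMETHEUS) -> str:
    """Build a URL for Prometheus's instant-query endpoint."""
    return f"{base}/api/v1/query?{urlencode({'query': promql})}"


def loki_range_url(logql: str, limit: int = 20, base: str = LOKI) -> str:
    """Build a URL for Loki's range-query endpoint."""
    params = urlencode({"query": logql, "limit": limit})
    return f"{base}/loki/api/v1/query_range?{params}"


# PromQL: `up` is a built-in series; 0 means a scrape target is unreachable.
print(prom_instant_url("up == 0"))

# LogQL: a stream selector in braces, then a line filter.
# The job label value is a placeholder, not the lab's real label.
print(loki_range_url('{job="syslog"} |= "UPDOWN"'))
```

Pasting either printed URL into a browser or curl while the stack is up should return JSON, provided the ports match your setup.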

Want the deeper lab?

The companion repo network-observability-lab is the book's full chapter-by-chapter playground — every collector, every variant, every scenario, with real cEOS / SR Linux containers in the loop. Larger surface area, more RAM, more network-engineering depth.

This workshops repo is the opposite trade-off: one CLI, one docker compose, one observability stack, one investigation arc. Pick it up if you want a tight four-hour on-ramp; pick the lab up if you want the full book experience.