
AutoCon5 · Hands-on workshop

Modern Network Observability

One workday on a new on-call rotation. Four hours, one laptop.

You arrive on a new on-call rotation. Over four hours you learn the lab's telemetry, build the dashboard your team needed yesterday, and watch the automation handle a real alert — with a senior engineer over your shoulder until lunch. By the time you close the laptop you've queried real-shape telemetry, made a Grafana panel answer an operational question, walked through how Alertmanager → Prefect routes a live alert, and decided which paths you'd trust an LLM-assisted RCA on at 02:14.

Open the workshop · Quickstart · Install Docker and uv

~4 hours total (with breaks) · ~80% hands-on · Whole stack runs locally: no shared backend, no live gear


What you'll do

  • Part 1 — Telemetry and queries


    Morning of your first deep day on the rotation. Walk the lab's metrics and logs with your senior buddy. Find the broken peer. Bridge a metric anomaly to the log line that explains it.

    Open Part 1

  • Part 2 — Dashboards


    Mid-morning. A post-mortem email lands — last night's page lost ten minutes because a flap-rate panel didn't exist. Build it now, with thresholds matching the alert rule, while the team is still in the room.

    Open Part 2

  • Part 3 — Alerts, automation, AI


    Late morning. A real alert lands and your senior narrates how the workflow handles it. Walk the four canonical paths, toggle the AI RCA step, and decide what you'd trust the LLM narrative on.

    Open Part 3

  • Advanced — The 02:14 page


    The optional capstone. Hours after the senior signs off, your phone rings. Triage with PromQL and LogQL, contain with maintenance, fix the root cause, write the runbook. End-to-end, alone on the rotation.

    Open the capstone


Why this workshop is different

  • One continuous investigation


    Not four disconnected exercises. The same broken peer, the same dashboard, the same alert — followed across the morning. The story carries the technique.

  • Real-shape telemetry


    Synthetic telemetry that mirrors what real gNMI streams and syslog UPDOWN events look like, with Telegraf normalizing the inputs into the canonical schema you'd query in production.

  • Senior-engineer voice


    The guides talk to you the way a good buddy on rotation would. Story beats up front, click-and-run when it's time to type, expected output for every command.

  • Your laptop is enough


    No shared backend. No real or containerized network devices. ~8 GB of RAM and Docker are all you need: nobs autocon5 up brings up the entire stack.

  • Concrete expected values


    Every exercise tells you what to see when you run it. If your numbers disagree, that's a real signal, not "did I type it right?"

  • The 02:14 capstone


    An optional final exercise: the page that lands when the senior is gone. End-to-end on-call investigation, designed for the attendee who wants to stress-test what they learned.


Get the lab on your laptop

Three commands, after Docker and uv are installed:

git clone https://github.com/network-observability/workshops.git
cd workshops
uv run nobs autocon5 up

The first up run pulls 3–5 GB of images. After that, restarts take seconds.

New to Docker or uv?

The Install Docker and uv page has per-platform pointers and a nobs preflight command that catches the common gotchas before the workshop starts. Run it the night before — image pulls are the slow step.

What runs locally

Prometheus, Loki, Grafana, Alertmanager, Telegraf, Vector, Infrahub, Prefect, a FastAPI webhook receiver, and sonda generating the synthetic telemetry. About 21 containers, ~5.5 GB of RAM while the stack is running, fully torn down by nobs autocon5 destroy.

What if I've never written PromQL or LogQL?

A sketch-level idea of "metrics database" and "log database" is enough. Part 1 builds both query languages from first principles against live data, with your senior walking you through it.
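To give a feel for what the two query languages look like, here is a minimal sketch that builds request URLs for the Prometheus instant-query endpoint and the Loki range-query endpoint. The ports are the upstream defaults and may not match what the lab publishes, and the metric name, label name, and device name are hypothetical placeholders rather than the lab's actual schema.

```python
from urllib.parse import urlencode

# Assumption: upstream default ports; the lab may expose different ones.
PROMETHEUS = "http://localhost:9090"
LOKI = "http://localhost:3100"


def prometheus_instant_query_url(promql: str, base: str = PROMETHEUS) -> str:
    """Build a URL for Prometheus's instant-query HTTP endpoint."""
    return f"{base}/api/v1/query?{urlencode({'query': promql})}"


def loki_range_query_url(logql: str, limit: int = 20, base: str = LOKI) -> str:
    """Build a URL for Loki's range-query HTTP endpoint."""
    params = {"query": logql, "limit": limit}
    return f"{base}/loki/api/v1/query_range?{urlencode(params)}"


# Hypothetical metric, label, and device names, for illustration only.
promql = 'bgp_oper_state{device="srl1"}'   # PromQL: select a metric by label
logql = '{device="srl1"} |= "UPDOWN"'      # LogQL: select a stream, filter lines

print(prometheus_instant_query_url(promql))
print(loki_range_query_url(logql))
```

Both languages start from a label selector in curly braces; PromQL applies it to a named metric, while LogQL applies it to a log stream and then filters the lines. Part 1 works through the lab's real names in Grafana, so treat this only as a preview of the syntax.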


Want the deeper lab?

The companion repo network-observability-lab is the book's full chapter-by-chapter playground — every collector, every variant, every scenario, with real cEOS / SR Linux containers in the loop. Larger surface area, more RAM, more network-engineering depth.

This workshops repo is the opposite trade-off: one CLI, one docker compose, one observability stack, one investigation arc. Pick it up if you want a tight four-hour on-ramp; pick the lab up if you want the full book experience.