# kattila.status
A lightweight virtual network topology monitor for multi-layer, multi-network environments — WireGuard meshes, VPN overlays, and hybrid physical/virtual networks.
Follows a push-based Agent → Manager architecture. Agents run on each node, gather system and network telemetry, and push it to a central Manager. If the Manager is unreachable, agents relay reports through other agents on the same WireGuard subnet.
## Architecture

```
┌──────────────────────────────────────────────────────────┐
│                    Agents (Go, Linux)                    │
│                                                          │
│  Agent A ──── HTTP/JSON ─────────────────────┐           │
│  Agent B ──── relay → Agent A → Manager ─────┤           │
│  Agent C ──── relay → Agent B → Agent A ─────┘           │
└──────────────────────────────────────┬───────────────────┘
                                       │
                              ┌────────▼────────┐
                              │     Manager     │
                              │ (Python/Flask)  │
                              │  SQLite WAL DB  │
                              └─────────────────┘
```
Each agent reports every 30 seconds. Reports are authenticated with HMAC-SHA256 using a fleet-wide Pre-Shared Key (PSK) fetched via a DNS TXT record. The relay mechanism supports up to 3 hops with loop detection.
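As a sketch of the signing side of this scheme (Python for illustration; the agent itself is Go, and the canonical JSON serialization and field names here are assumptions, the real envelope format lives in DESIGN.md):

```python
import hashlib
import hmac
import json

def sign_report(psk: bytes, payload: dict) -> str:
    """Sign a report body with the fleet PSK using HMAC-SHA256.

    Sorted-key JSON as the canonical form is an assumption for this
    sketch; DESIGN.md specifies the actual wire format.
    """
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(psk, body, hashlib.sha256).hexdigest()
```

The important property is that both sides serialize the payload identically before computing the MAC; otherwise a byte-level mismatch makes every signature fail verification.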
## Repository Structure

```
kattila.status/
├── agent/               # Go agent
│   ├── main.go          # Entry point + CLI flags
│   ├── config/          # .env / env var loading, AgentID persistence
│   ├── network/         # System data collection (interfaces, routes, WG peers)
│   ├── reporter/        # Report building, push to manager, relay logic
│   ├── security/        # PSK via DNS, HMAC signing, nonce generation
│   ├── api/             # Agent HTTP server (peer/relay/healthcheck endpoints)
│   ├── models/          # Shared data types (Report, SystemData, WGPeer, …)
│   └── bin/             # Compiled binaries (gitignored)
├── manager/             # Python manager
│   ├── app.py           # Flask app and API endpoints
│   ├── db.py            # SQLite schema, queries
│   ├── processor.py     # Report ingestion + topology inference
│   ├── security.py      # PSK history, HMAC verification, nonce/timestamp checks
│   └── requirements.txt
├── Makefile
├── .env                 # Local config (gitignored)
└── DESIGN.md            # Full architecture and protocol specification
```
## Getting Started

### Prerequisites
| Component | Requirement |
|---|---|
| Agent | Go 1.21+, Linux |
| Manager | Python 3.11+, pip |
| Both | A DNS TXT record for PSK distribution |
### 1. Configuration
Copy or create a .env file in the repo root (it is gitignored):
```
DNS=kattila.example.com            # DNS TXT record holding the fleet PSK
MANAGER_URL=http://10.0.0.1:5086   # Agent: where to push reports
```
Both the agent and manager load this file automatically on startup. Environment variables override .env values.
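The precedence rule can be sketched as a tiny resolver (illustrative helper, not the actual loaders in `agent/config/` or the manager):

```python
import os

def setting(key: str, dotenv: dict, default: str = "") -> str:
    """Resolve one config value.

    Precedence, per the note above: process environment first,
    then the parsed .env file, then a default.
    """
    return os.environ.get(key) or dotenv.get(key, "") or default
```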
### 2. PSK Setup
The fleet PSK is discovered via a DNS TXT record. Set a TXT record on your domain:
```
kattila.example.com. 300 IN TXT "your-secret-psk-value"
```
Both the agent and manager must be able to resolve this record. The manager verifies each report against the current and the two previous PSKs to handle propagation delays during key rotation.
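The rotation window on the manager side amounts to trying each key in the history, newest first. A minimal sketch (function and parameter names are illustrative; the real logic lives in `manager/security.py`):

```python
import hashlib
import hmac

def verify_with_history(psk_history: list, body: bytes, signature: str) -> bool:
    """Accept a signature made with the current PSK or either of the
    two previous ones. psk_history[0] is the current key."""
    for psk in psk_history[:3]:  # current + 2 previous, per the spec
        expected = hmac.new(psk, body, hashlib.sha256).hexdigest()
        if hmac.compare_digest(expected, signature):
            return True
    return False
```

Using `hmac.compare_digest` rather than `==` avoids leaking timing information during comparison.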
### 3. Build the Agent

```sh
make build-agent
```

This cross-compiles for both amd64 and arm64:

```
agent/bin/agent-amd64
agent/bin/agent-arm64
```

> **Note:** Requires Go in your `$PATH`. If Go is installed in a non-standard location (e.g. `~/.local/go/bin/go`), run:
>
> ```sh
> PATH="$HOME/.local/go/bin:$PATH" make build-agent
> ```
### 4. Run the Manager

```sh
make setup-manager   # Create venv and install dependencies (once)
make run-manager     # Start the Flask server on port 5086
```
### 5. Deploy the Agent

Copy the binary and .env to each node, then run:

```sh
./agent-amd64
```

The agent generates an agent ID on first run and persists it to `agent_id.txt`.
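The load-or-generate behavior can be sketched like this (Python for illustration; the 32-hex-character ID format is an assumption, only the persist-on-first-run behavior comes from the source):

```python
import os
import secrets

def load_or_create_agent_id(path: str = "agent_id.txt") -> str:
    """Return the persisted agent ID, generating one on first run."""
    if os.path.exists(path):
        with open(path) as f:
            return f.read().strip()
    agent_id = secrets.token_hex(16)  # illustrative ID format
    with open(path, "w") as f:
        f.write(agent_id + "\n")
    return agent_id
```

Persisting the ID is what lets a node keep its identity across restarts instead of re-registering with the manager each time.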
## Debug Tooling
The agent binary supports several CLI flags for diagnosing issues without running the full daemon:
### `-sysinfo`

Collect and print all system telemetry as formatted JSON. Useful for verifying what the agent sees: interfaces, WireGuard peers, routes, load average.

```sh
./agent -sysinfo
```
### `-dump <file>`

Run a single full data collection cycle, build a complete signed report payload (including HMAC, Nonce, AgentID), and write it to a file. This is the exact JSON that would be sent to the manager:

```sh
./agent -dump /tmp/report.json
cat /tmp/report.json
```
### `-discover`

Actively probe all IPs from WireGuard AllowedIPs on port 5087 to find other live Kattila agents on the same mesh, using the same discovery logic as the relay mechanism:

```sh
./agent -discover
```
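The core of such a probe is a connect-with-timeout per candidate IP. A minimal sketch (Python for illustration; this only checks TCP reachability, while the real `-discover` additionally confirms the peer is a Kattila agent):

```python
import socket

def probe_peer(ip: str, port: int = 5087, timeout: float = 0.5) -> bool:
    """Return True if something accepts a TCP connection on the
    agent port within the timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A short timeout matters here: AllowedIPs can cover many addresses, and most of them will not answer.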
## Agent API
The agent exposes a small HTTP server on port 5087 for peer communication:
| Endpoint | Method | Description |
|---|---|---|
| `/status/healthcheck` | GET | Agent liveness probe |
| `/status/peer` | GET | Returns local interface/route info (used by relay discovery) |
| `/status/relay` | POST | Accepts an enveloped report to forward toward the manager |
| `/status/reset` | POST | Wipes local state and generates a new agent_id |
## Manager API
The manager listens on port 5086:
| Endpoint | Method | Description |
|---|---|---|
| `/status/updates` | POST | Receive periodic reports from agents |
| `/status/register` | POST | First-contact endpoint; issues an agent_id |
| `/status/healthcheck` | GET | Manager liveness probe |
| `/status/agents` | GET | List all known agents and their status |
| `/status/alarms` | GET | Fetch active network anomalies |
| `/status/admin/reset` | POST | Reset a specific agent or fleet state |
## Security Model

- Authentication: HMAC-SHA256 over the `data` payload, signed with the fleet PSK.
- Key distribution: PSK fetched from a DNS TXT record, refreshed hourly.
- Key rotation: Manager accepts the current + 2 previous PSKs to allow propagation time.
- Replay protection: Monotonic tick counter + 120-entry nonce sliding window.
- Clock skew: Maximum 10-minute allowance between agent and manager timestamps.
- Relay loop detection: Agents check `relay_path` for their own `agent_id` and drop looping messages.
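Two of these checks are simple enough to sketch directly (Python for illustration; field and class names are assumptions, the real logic lives in the agent's `reporter/` and the manager's `security.py`):

```python
from collections import deque

MAX_HOPS = 3        # relay hop limit from the spec above
NONCE_WINDOW = 120  # sliding-window size from the spec above

def should_forward(relay_path: list, self_id: str) -> bool:
    """Relay loop detection: drop a report if we already appear in its
    path, or if its hop budget is spent."""
    return self_id not in relay_path and len(relay_path) < MAX_HOPS

class ReplayGuard:
    """Manager-side sliding window of recently seen nonces."""

    def __init__(self):
        self._seen = deque(maxlen=NONCE_WINDOW)

    def accept(self, nonce: str) -> bool:
        if nonce in self._seen:
            return False  # replayed
        self._seen.append(nonce)
        return True
```

A bounded deque keeps memory constant: once 120 nonces are stored, the oldest entry is evicted automatically, which is exactly the sliding-window behavior described above.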
## Makefile Reference

```sh
make build-agent     # Cross-compile agent for amd64 + arm64
make setup-manager   # Create Python venv and install dependencies
make run-manager     # Start the manager Flask server
make clean           # Remove built binaries, venv, and manager DB
```
## Database

The manager uses a SQLite database (`kattila_manager.db`) with WAL mode. Key tables:
| Table | Purpose |
|---|---|
| `agents` | Fleet registry — presence, hostname, last seen |
| `reports` | Full report audit log |
| `agent_interfaces` | Network interface snapshots per agent |
| `topology_edges` | Inferred links between agents (WireGuard, relay, physical) |
| `alarms` | Event log for topology changes and anomalies |
See DESIGN.md for the full schema.