kattila.status

A lightweight virtual network topology monitor for multi-layer, multi-network environments — WireGuard meshes, VPN overlays, and hybrid physical/virtual networks.

Follows a push-based Agent → Manager architecture. Agents run on each node, gather system and network telemetry, and push it to a central Manager. If the Manager is unreachable, agents relay reports through other agents on the same WireGuard subnet.


Architecture

┌─────────────────────────────────────────────────────────┐
│  Agents (Go, Linux)                                     │
│                                                          │
│   Agent A ──── HTTP/JSON ──────────────────────┐        │
│   Agent B ──── relay → Agent A → Manager  ─────┤        │
│   Agent C ──── relay → Agent B → Agent A ──────┘        │
└──────────────────────────────────────┬──────────────────┘
                                       │
                              ┌────────▼────────┐
                              │  Manager        │
                              │  (Python/Flask) │
                              │  SQLite WAL DB  │
                              └─────────────────┘

Each agent reports every 30 seconds. Reports are authenticated with HMAC-SHA256 using a fleet-wide Pre-Shared Key (PSK) fetched via a DNS TXT record. The relay mechanism supports up to 3 hops with loop detection.
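The signing scheme can be sketched as follows. This is a minimal illustration, not the actual wire format: the envelope field names (`nonce`, `data`, `hmac`) are assumptions; see DESIGN.md for the real protocol.

```python
import hashlib
import hmac
import json
import os

def sign_report(psk: bytes, payload: dict) -> dict:
    """Wrap a report payload in an HMAC-SHA256 envelope (illustrative;
    field names are assumptions, not the documented wire format)."""
    nonce = os.urandom(16).hex()                       # per-report replay nonce
    body = json.dumps(payload, sort_keys=True).encode()
    mac = hmac.new(psk, nonce.encode() + body, hashlib.sha256).hexdigest()
    return {"nonce": nonce, "data": payload, "hmac": mac}

def verify_report(psk: bytes, envelope: dict) -> bool:
    """Recompute the MAC over nonce + canonical payload and compare."""
    body = json.dumps(envelope["data"], sort_keys=True).encode()
    expected = hmac.new(psk, envelope["nonce"].encode() + body,
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["hmac"])
```

`hmac.compare_digest` is used for the comparison to avoid timing side channels.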


Repository Structure

kattila.status/
├── agent/               # Go agent
│   ├── main.go          # Entry point + CLI flags
│   ├── config/          # .env / env var loading, AgentID persistence
│   ├── network/         # System data collection (interfaces, routes, WG peers)
│   ├── reporter/        # Report building, push to manager, relay logic
│   ├── security/        # PSK via DNS, HMAC signing, nonce generation
│   ├── api/             # Agent HTTP server (peer/relay/healthcheck endpoints)
│   ├── models/          # Shared data types (Report, SystemData, WGPeer, …)
│   └── bin/             # Compiled binaries (gitignored)
├── manager/             # Python manager
│   ├── app.py           # Flask app and API endpoints
│   ├── db.py            # SQLite schema, queries
│   ├── processor.py     # Report ingestion + topology inference
│   ├── security.py      # PSK history, HMAC verification, nonce/timestamp checks
│   └── requirements.txt
├── Makefile
├── .env                 # Local config (gitignored)
└── DESIGN.md            # Full architecture and protocol specification

Getting Started

Prerequisites

Component  Requirement
Agent      Go 1.21+, Linux
Manager    Python 3.11+, pip
Both       A DNS TXT record for PSK distribution

1. Configuration

Copy or create a .env file in the repo root (it is gitignored):

DNS=kattila.example.com      # DNS TXT record holding the fleet PSK
MANAGER_URL=http://10.0.0.1:5086  # Agent: where to push reports

Both the agent and manager load this file automatically on startup. Environment variables override .env values.
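The loading order described above (file values as defaults, environment variables winning) can be sketched like this. The parsing details are an assumption; only the override behavior is documented.

```python
import os

def load_env(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines from a .env file; values already set
    in the real environment override the file (sketch of the documented
    behavior, not the actual loader)."""
    cfg = {}
    try:
        with open(path) as f:
            for raw in f:
                line = raw.split("#", 1)[0].strip()   # drop comments
                if "=" in line:
                    key, _, value = line.partition("=")
                    cfg[key.strip()] = value.strip()
    except FileNotFoundError:
        pass                                          # .env is optional
    for key in cfg:
        if key in os.environ:
            cfg[key] = os.environ[key]                # env vars win
    return cfg
```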

2. PSK Setup

The fleet PSK is discovered via a DNS TXT record. Set a TXT record on your domain:

kattila.example.com. 300 IN TXT "your-secret-psk-value"

Both the agent and manager must be able to resolve this record. The manager verifies signatures against the current and the two previous PSKs to handle propagation delays during key rotation.
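The manager-side rotation check amounts to trying each key in the history, newest first. A minimal sketch (the function name and list ordering are assumptions; see manager/security.py for the real implementation):

```python
import hashlib
import hmac

def verify_with_history(psk_history: list, message: bytes, mac_hex: str) -> bool:
    """Accept a signature made with the current PSK or either of the two
    previous ones, covering DNS propagation delay during rotation.
    psk_history is ordered newest-first."""
    for psk in psk_history[:3]:                 # current + 2 previous keys
        expected = hmac.new(psk, message, hashlib.sha256).hexdigest()
        if hmac.compare_digest(expected, mac_hex):
            return True
    return False
```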

3. Build the Agent

make build-agent

This cross-compiles for both amd64 and arm64:

agent/bin/agent-amd64
agent/bin/agent-arm64

Note: Requires Go in your $PATH. If Go is installed to a non-standard location (e.g. ~/.local/go/bin/go), run:

PATH="$HOME/.local/go/bin:$PATH" make build-agent

4. Run the Manager

make setup-manager   # Create venv and install dependencies (once)
make run-manager     # Start the Flask server on port 5086

5. Deploy the Agent

Copy the binary and .env to each node, then run:

./agent-amd64

The agent generates an agent ID on first run and persists it in agent_id.txt.


Debug Tooling

The agent binary supports several CLI flags for diagnosing issues without running the full daemon:

-sysinfo

Collect and print all system telemetry as formatted JSON. Useful for verifying what the agent sees — interfaces, WireGuard peers, routes, load average:

./agent -sysinfo

-dump <file>

Run a single full data collection cycle, build a complete signed report payload (including HMAC, Nonce, AgentID), and write it to a file. This is the exact JSON that would be sent to the manager:

./agent -dump /tmp/report.json
cat /tmp/report.json

-discover

Actively probe all IPs from WireGuard AllowedIPs on port 5087 to find other live Kattila agents on the same mesh — the same discovery logic used by the relay mechanism:

./agent -discover
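The discovery step boils down to expanding each AllowedIPs CIDR into candidate hosts and hitting the healthcheck endpoint on each. A minimal sketch in Python (the real agent does this in Go; function names here are illustrative):

```python
import ipaddress
import urllib.request

def probe_targets(allowed_ips, port: int = 5087):
    """Expand WireGuard AllowedIPs CIDRs into healthcheck URLs to probe."""
    urls = []
    for cidr in allowed_ips:
        # hosts() yields the single address for a /32
        for host in ipaddress.ip_network(cidr, strict=False).hosts():
            urls.append(f"http://{host}:{port}/status/healthcheck")
    return urls

def is_live_agent(url: str, timeout: float = 1.0) -> bool:
    """True if something answering the agent healthcheck is listening."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False
```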

Agent API

The agent exposes a small HTTP server on port 5087 for peer communication:

Endpoint             Method  Description
/status/healthcheck  GET     Agent liveness probe
/status/peer         GET     Returns local interface/route info (used by relay discovery)
/status/relay        POST    Accepts an enveloped report to forward toward the manager
/status/reset        POST    Wipes local state and generates a new agent_id
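The loop and hop-budget check that /status/relay performs can be sketched as follows, assuming the envelope carries a relay_path list of agent IDs (the field is named in the Security Model section; the function itself is illustrative):

```python
def should_relay(envelope: dict, my_agent_id: str, max_hops: int = 3) -> bool:
    """Drop a relayed report if this agent already appears in its path
    (a loop) or the hop budget is exhausted; otherwise record this hop."""
    path = envelope.setdefault("relay_path", [])
    if my_agent_id in path or len(path) >= max_hops:
        return False
    path.append(my_agent_id)
    return True
```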

Manager API

The manager listens on port 5086:

Endpoint             Method  Description
/status/updates      POST    Receive periodic reports from agents
/status/register     POST    First-contact endpoint; issues an agent_id
/status/healthcheck  GET     Manager liveness probe
/status/agents       GET     List all known agents and their status
/status/alarms       GET     Fetch active network anomalies
/status/admin/reset  POST    Reset a specific agent or fleet state

Security Model

  • Authentication: HMAC-SHA256 over the data payload, signed with the fleet PSK.
  • Key distribution: PSK fetched from a DNS TXT record, refreshed hourly.
  • Key rotation: Manager accepts current + 2 previous PSKs to allow propagation time.
  • Replay protection: Monotonic tick counter + 120-entry nonce sliding window.
  • Clock skew: Maximum 10-minute allowance between agent and manager timestamps.
  • Relay loop detection: Agents check relay_path for their own agent_id and drop looping messages.
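The replay and clock-skew rules above can be sketched as a single guard (window size and skew limit match the documented values; the class itself is illustrative, not the manager's actual code):

```python
import time
from collections import deque

class ReplayGuard:
    """Sliding-window nonce cache plus clock-skew check."""

    def __init__(self, window: int = 120, max_skew: float = 600.0):
        self.max_skew = max_skew           # 10 minutes, in seconds
        self.nonces = deque(maxlen=window)  # oldest nonce falls out at 120

    def accept(self, nonce, timestamp, now=None) -> bool:
        now = time.time() if now is None else now
        if abs(now - timestamp) > self.max_skew:
            return False                   # clock skew too large
        if nonce in self.nonces:
            return False                   # replayed report
        self.nonces.append(nonce)
        return True
```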

Makefile Reference

make build-agent     # Cross-compile agent for amd64 + arm64
make setup-manager   # Create Python venv and install dependencies
make run-manager     # Start the manager Flask server
make clean           # Remove built binaries, venv, and manager DB

Database

The manager uses a SQLite database (kattila_manager.db) with WAL mode. Key tables:

Table             Purpose
agents            Fleet registry: presence, hostname, last seen
reports           Full report audit log
agent_interfaces  Network interface snapshots per agent
topology_edges    Inferred links between agents (WireGuard, relay, physical)
alarms            Event log for topology changes and anomalies

See DESIGN.md for the full schema.
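Enabling WAL is a one-line pragma at connection time. A minimal sketch, assuming column names for the agents table (the real schema lives in manager/db.py and DESIGN.md):

```python
import sqlite3

def open_db(path: str = "kattila_manager.db") -> sqlite3.Connection:
    """Open the manager database with WAL journaling enabled."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")   # readers don't block the writer
    # Column names below are illustrative, not the documented schema.
    conn.execute("""CREATE TABLE IF NOT EXISTS agents (
        agent_id   TEXT PRIMARY KEY,
        hostname   TEXT,
        last_seen  TEXT
    )""")
    return conn
```

WAL mode lets agents' report writes proceed while the API endpoints read, which suits the push-heavy workload.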
