Makefile and agent start.

Kalzu Rekku
2026-04-17 19:23:04 +03:00
parent 51e0355ec8
commit 99e0e0208c
14 changed files with 1060 additions and 1 deletions

@@ -0,0 +1,55 @@
# Kattila Agent Implementation Plan
This document outlines the detailed architecture and implementation steps for the Go-based Kattila Agent.
## Overview
The Kattila Agent continuously gathers network topology information from the host OS (using `ip` and `wg` commands), cryptographically signs the data, and pushes it to the Kattila Manager. If direct communication fails, it uses peer scanning to find a relay path.
## User Review Required
> [!IMPORTANT]
> - Can we assume the `wg` and `ip` commands are always available in the agent's `PATH`?
> - The TXT record is returned with wrapping quotes (e.g., `"955f333e5b9cc..."`). The agent will strip these quotes. Is the PSK used exactly as-is for the HMAC key?
> - For WireGuard peer scanning during relay fallback: should the agent scan the *entire subnet* listed in `allowed ips` (e.g., `172.16.100.8/29`) for other agents on port `5087`, or only guess from peer endpoints? Scanning a small subnet is usually reliable.
> - Parsing `wg show all dump` is preferable to parsing the human-readable `wg` output, since tab-separated output is much easier and safer to parse programmatically. Is it okay to use `wg show all dump`?
## Proposed Architecture / Packages
### 1. `config` Package
- Responsibilities: Load `.env` file containing `DNS`, `MANAGER_URL`, etc. Provide access to environment configurations.
- Store the agent's unique ID. The ID is generated on first run and saved locally, so it persists across restarts until `/status/reset` is called.
### 2. `security` Package
- **Key Discovery**: Periodically resolve the TXT record of the configured `DNS` name to get the Bootstrap PSK. Strip any surrounding quotes. Keep a history of the current and two previous keys.
- **HMAC Generation**: Provide a function to calculate `HMAC-SHA256` of JSON payloads using the current PSK.
- **Nonce Generation**: Generate cryptographically secure base64 strings for the `nonce` field.
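The three responsibilities above could be sketched roughly as follows. This is a minimal sketch and not the agreed design: the function names (`cleanTXT`, `rotate`, `sign`, `verify`, `newNonce`) are hypothetical, the PSK is assumed to be used as-is as the HMAC key (still an open question above), the hex encoding of the signature is an assumption, and checking signatures against the key history is one possible use of that history.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"fmt"
	"strings"
)

// cleanTXT trims whitespace and one pair of surrounding double quotes
// from a TXT record value.
func cleanTXT(raw string) string {
	s := strings.TrimSpace(raw)
	if len(s) >= 2 && strings.HasPrefix(s, `"`) && strings.HasSuffix(s, `"`) {
		s = s[1 : len(s)-1]
	}
	return s
}

// rotate puts a newly discovered key first and keeps at most three keys
// (the current one plus two previous ones).
func rotate(keys []string, latest string) []string {
	if len(keys) > 0 && keys[0] == latest {
		return keys // unchanged TXT record, nothing to rotate
	}
	keys = append([]string{latest}, keys...)
	if len(keys) > 3 {
		keys = keys[:3]
	}
	return keys
}

// sign returns the hex-encoded HMAC-SHA256 of payload.
// Assumption: the PSK string bytes are the key, and hex is the wire encoding.
func sign(psk string, payload []byte) string {
	mac := hmac.New(sha256.New, []byte(psk))
	mac.Write(payload)
	return hex.EncodeToString(mac.Sum(nil))
}

// verify accepts a signature made with any key in the history,
// using a constant-time comparison.
func verify(keys []string, payload []byte, sig string) bool {
	for _, k := range keys {
		if hmac.Equal([]byte(sign(k, payload)), []byte(sig)) {
			return true
		}
	}
	return false
}

// newNonce returns n cryptographically random bytes, base64-encoded.
func newNonce(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf), nil
}

func main() {
	// "example-psk" is a made-up value, not a real key.
	keys := rotate(nil, cleanTXT(`"example-psk"`))
	nonce, err := newNonce(16)
	if err != nil {
		panic(err)
	}
	payload := []byte(`{"nonce":"` + nonce + `"}`)
	sig := sign(keys[0], payload)
	fmt.Println(sig, verify(keys, payload, sig))
}
```

Keeping the history newest-first means a payload signed just before a rotation can still be verified against one of the two previous keys.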
### 3. `network` Package
- Execute OS commands and parse their outputs:
- `ip -j a`: Parse the JSON output into `Interface` structs.
- `ip -j -4 r`: Parse the JSON output into `Route` structs.
- `wg show all dump`: Parse the tab-separated WireGuard interface and peer lines. If parsing the human-readable `wg` output is strictly required, we will build a custom state-machine parser for the provided format.
- Maintain a gathering function `GatherStatus()` that bundles all these details into the expected `data` payload.
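As an illustration of why the dump format is easier to handle, here is a minimal sketch of a parser for `wg show all dump` output. It assumes the layout described in the `wg(8)` manual page: with `all`, each line starts with the interface name, interface lines have 5 tab-separated fields, and peer lines have 9. The `WGPeer` struct and `parseWGDump` function are hypothetical names, and only a subset of the fields is kept.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// WGPeer holds the subset of peer fields this sketch keeps.
type WGPeer struct {
	Interface       string
	PublicKey       string
	Endpoint        string   // "(none)" when the peer has no known endpoint
	AllowedIPs      []string // CIDR strings
	LatestHandshake int64    // Unix seconds, 0 if no handshake yet
}

// parseWGDump parses the text printed by `wg show all dump`.
// Interface lines (5 fields) are skipped; peer lines (9 fields) are returned.
func parseWGDump(out string) ([]WGPeer, error) {
	var peers []WGPeer
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		line := sc.Text()
		if strings.TrimSpace(line) == "" {
			continue
		}
		f := strings.Split(line, "\t")
		switch len(f) {
		case 5:
			// interface, private-key, public-key, listen-port, fwmark
			continue
		case 9:
			// interface, public-key, preshared-key, endpoint, allowed-ips,
			// latest-handshake, transfer-rx, transfer-tx, persistent-keepalive
			hs, err := strconv.ParseInt(f[5], 10, 64)
			if err != nil {
				return nil, fmt.Errorf("bad handshake time %q: %w", f[5], err)
			}
			p := WGPeer{Interface: f[0], PublicKey: f[1], Endpoint: f[3], LatestHandshake: hs}
			if f[4] != "(none)" {
				p.AllowedIPs = strings.Split(f[4], ",")
			}
			peers = append(peers, p)
		default:
			return nil, fmt.Errorf("unexpected field count %d", len(f))
		}
	}
	return peers, sc.Err()
}

func main() {
	// Made-up sample data; keys are placeholders, not real WireGuard keys.
	sample := "wg0\tPRIVKEY\tPUBKEY\t51820\toff\n" +
		"wg0\tPEERKEY\t(none)\t203.0.113.5:51820\t172.16.100.8/29\t1713370000\t1024\t2048\t25\n"
	peers, err := parseWGDump(sample)
	if err != nil {
		panic(err)
	}
	for _, p := range peers {
		fmt.Printf("%s %s %v\n", p.Interface, p.Endpoint, p.AllowedIPs)
	}
}
```

In the agent, the input string would presumably come from something like `exec.Command("wg", "show", "all", "dump").Output()`, which usually requires elevated privileges.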
### 4. `api` Package (Agent HTTP Server)
- Runs an HTTP server on `0.0.0.0:5087` using standard `net/http`.
- Endpoints:
- `GET /status/healthcheck`: Return `200 OK {"status": "ok"}`
- `POST /status/reset`: Delete local `agent_id` state, delete internal cache, and trigger a fresh registration loop.
- `GET /status/peer`: Return local network info so peers can decide routing paths.
- `POST /status/relay`: Accept relay payloads, ensure own `agent_id` is not in `relay_path` (loop detection), and forward to manager.
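The loop detection in `POST /status/relay` could look roughly like the sketch below. Several details are assumptions rather than part of the plan: the `payload` field name, the choice of status codes (`409` for a detected loop, `502` when forwarding fails), and appending the agent's own ID to `relay_path` before forwarding. The `forward` callback stands in for the real call to the manager.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// relayEnvelope is a hypothetical wire format; only relay_path comes from the plan.
type relayEnvelope struct {
	RelayPath []string        `json:"relay_path"`
	Payload   json.RawMessage `json:"payload"`
}

// inPath reports whether id already appears in the relay path.
func inPath(id string, path []string) bool {
	for _, p := range path {
		if p == id {
			return true
		}
	}
	return false
}

// relayHandler rejects payloads that already passed through this agent,
// otherwise records this hop and hands the envelope to forward.
func relayHandler(selfID string, forward func(relayEnvelope) error) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		var env relayEnvelope
		if err := json.NewDecoder(r.Body).Decode(&env); err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		if inPath(selfID, env.RelayPath) {
			http.Error(w, "relay loop detected", http.StatusConflict)
			return
		}
		env.RelayPath = append(env.RelayPath, selfID) // assumption: each hop appends itself
		if err := forward(env); err != nil {
			http.Error(w, "forward failed", http.StatusBadGateway)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}

// post sends body to h in-process and returns the status code (demo helper).
func post(h http.Handler, body string) int {
	req := httptest.NewRequest(http.MethodPost, "/status/relay", bytes.NewBufferString(body))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := relayHandler("agent-b", func(e relayEnvelope) error {
		fmt.Println("forwarding via", e.RelayPath)
		return nil
	})
	fmt.Println(post(h, `{"relay_path":["agent-a"],"payload":{}}`))
	fmt.Println(post(h, `{"relay_path":["agent-a","agent-b"],"payload":{}}`))
}
```

Passing `forward` in as a function keeps the loop check testable without a running manager.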
### 5. `reporter` Package (Main Loop)
- Triggers every 30 seconds.
- Gathers data from `network` package.
- Wraps it in the report envelope: `version`, `tick`, `type`, `nonce`, `timestamp`, `agent_id`, `hmac`.
- Sends POST request to Manager.
- **Relay Fallback**: On failure, queries the local WireGuard interfaces, probes port `5087` on the known subnets, and attempts to find a working peer to relay through.
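A rough sketch of the relay fallback scan, assuming the "scan the entire subnet" option from the review questions is chosen. The names `candidates`, `firstReachable`, and `httpProbe` are hypothetical, as are the cap on addresses per subnet and the 2-second timeout. Every address in the prefix is tried, including the first and last, on the assumption that WireGuard subnets do not reserve network or broadcast addresses.

```go
package main

import (
	"fmt"
	"net/http"
	"net/netip"
	"time"
)

// candidates lists "ip:port" targets for every address in cidr,
// stopping after max entries so a large prefix cannot stall the agent.
func candidates(cidr string, port uint16, max int) ([]string, error) {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return nil, err
	}
	p = p.Masked() // start from the first address of the prefix
	var out []string
	for a := p.Addr(); p.Contains(a) && len(out) < max; a = a.Next() {
		out = append(out, netip.AddrPortFrom(a, port).String())
	}
	return out, nil
}

// firstReachable returns the first address accepted by probe.
func firstReachable(addrs []string, probe func(string) bool) (string, bool) {
	for _, a := range addrs {
		if probe(a) {
			return a, true
		}
	}
	return "", false
}

// httpProbe checks the peer's healthcheck endpoint with a short timeout.
func httpProbe(addr string) bool {
	c := http.Client{Timeout: 2 * time.Second}
	resp, err := c.Get("http://" + addr + "/status/healthcheck")
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

func main() {
	addrs, err := candidates("172.16.100.8/29", 5087, 64)
	if err != nil {
		panic(err)
	}
	fmt.Println(addrs)
	// In the agent the probe would be httpProbe; a stub keeps this demo offline.
	peer, ok := firstReachable(addrs, func(a string) bool { return false })
	fmt.Println(peer, ok)
	_ = httpProbe
}
```

A sequential scan of a `/29` takes at most eight probes; larger subnets would likely need concurrent probing or a lower cap.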
## Verification Plan
### Automated Testing
- Write unit tests for parsing the provided `ip` and `wg` example files.
- Write unit tests for the PSK rotation logic.
### Manual Verification
- Run the agent locally and verify the logs show successful gathering of interfaces and routes.
- Force a bad manager URL and observe logs indicating relay peer scanning behavior.