Claude Code session 1.

This commit is contained in:
Kalzu Rekku
2026-01-08 12:11:26 +02:00
parent c59523060d
commit 6db2e58dcd
20 changed files with 5497 additions and 83 deletions

@@ -1,7 +1,344 @@
# output service
# Output Service
Service to receive output from ping_service instances.
Builds database of mappable nodes.
Updates input services address lists with all working endpoints and working hops from the traces.
HTTP service that receives ping and traceroute results from distributed `ping_service` nodes, stores them in SQLite databases with automatic rotation, extracts intermediate hops from traceroute data, and feeds them back to `input_service`.
Have reporting api endpoints for the manager to monitor the progress.
## Purpose
- **Data Collection**: Store ping results and traceroute paths from multiple ping_service instances
- **Hop Discovery**: Extract intermediate hop IPs from traceroute data
- **Feedback Loop**: Send discovered hops to input_service to grow the target pool organically
- **Data Management**: Automatic database rotation and retention policy
- **Observability**: Expose metrics and statistics for monitoring
## Features
- **Multi-Instance Ready**: Each instance maintains its own SQLite database
- **Automatic Rotation**: Databases rotate weekly or when they reach 100 MB, whichever comes first
- **Retention Policy**: Keeps 5 most recent database files, auto-deletes older ones
- **Hop Deduplication**: Tracks sent hops to minimize duplicate network traffic to input_service
- **Manual Operations**: API endpoints for manual rotation and database dumps
- **Health Monitoring**: Prometheus metrics, stats, and health checks
## Requirements
- Go 1.25+
- SQLite3 (via go-sqlite3 driver)
## Building
```bash
cd output_service
go build -o output_service main.go
```
## Usage
### Basic
```bash
./output_service
```
Starts on port 8081 for results, port 8091 for health checks.
### With Custom Configuration
```bash
./output_service \
--port=8082 \
--health-port=8092 \
--input-url=http://input-service:8080/hops \
--db-dir=/var/lib/output_service \
--max-size-mb=200 \
--rotation-days=14 \
--keep-files=10 \
--verbose
```
### Command Line Flags
| Flag | Default | Description |
|------|---------|-------------|
| `--port` | 8081 | Port for receiving results |
| `--health-port` | 8091 | Port for health/metrics endpoints |
| `--input-url` | `http://localhost:8080/hops` | Input service URL for hop submission |
| `--db-dir` | `./output_data` | Directory for database files |
| `--max-size-mb` | 100 | Max database size (MB) before rotation |
| `--rotation-days` | 7 | Rotate database after N days |
| `--keep-files` | 5 | Number of database files to retain |
| `-v, --verbose` | false | Enable verbose logging |
| `--version` | - | Show version |
| `--help` | - | Show help |
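The flag table above could be wired up roughly as follows. This is a minimal sketch using Go's standard `flag` package, which accepts both `-name` and `--name`; the `config` struct, its field names, and `parseFlags` are hypothetical and not taken from the service's source.

```go
package main

import (
	"flag"
	"fmt"
)

// config mirrors the documented flags. The struct and field names are
// assumptions made for this sketch.
type config struct {
	Port         int
	HealthPort   int
	InputURL     string
	DBDir        string
	MaxSizeMB    int64
	RotationDays int
	KeepFiles    int
	Verbose      bool
}

// parseFlags parses args (without the program name) and applies the
// defaults listed in the table above.
func parseFlags(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("output_service", flag.ContinueOnError)
	fs.IntVar(&c.Port, "port", 8081, "port for receiving results")
	fs.IntVar(&c.HealthPort, "health-port", 8091, "port for health/metrics endpoints")
	fs.StringVar(&c.InputURL, "input-url", "http://localhost:8080/hops", "input service URL")
	fs.StringVar(&c.DBDir, "db-dir", "./output_data", "directory for database files")
	fs.Int64Var(&c.MaxSizeMB, "max-size-mb", 100, "max database size (MB) before rotation")
	fs.IntVar(&c.RotationDays, "rotation-days", 7, "rotate database after N days")
	fs.IntVar(&c.KeepFiles, "keep-files", 5, "number of database files to retain")
	// Both spellings write to the same field, giving the -v shorthand.
	fs.BoolVar(&c.Verbose, "verbose", false, "enable verbose logging")
	fs.BoolVar(&c.Verbose, "v", false, "shorthand for --verbose")
	err := fs.Parse(args)
	return c, err
}

func main() {
	c, err := parseFlags([]string{"--port=8082", "--keep-files=10", "-v"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", c)
}
```

Flags that are not passed keep their documented defaults, so a bare `./output_service` behaves as described under Basic usage.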
## API Endpoints
### Main Service (Port 8081)
#### `POST /results`
Receive ping results from ping_service nodes.
**Request Body**: JSON array of ping results
```json
[
{
"ip": "8.8.8.8",
"sent": 4,
"received": 4,
"packet_loss": 0,
"avg_rtt": 15000000,
"timestamp": "2026-01-07T22:30:00Z",
"traceroute": {
"method": "icmp",
"completed": true,
"hops": [
{"ttl": 1, "ip": "192.168.1.1", "rtt": 2000000},
{"ttl": 2, "ip": "10.0.0.1", "rtt": 5000000},
{"ttl": 3, "ip": "8.8.8.8", "rtt": 15000000}
]
}
}
]
```
**Response**:
```json
{
"status": "ok",
"received": 1
}
```
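For orientation, the request body above can be decoded with types along these lines. This is a sketch, not the service's actual code: the type names are hypothetical, and the `error` and `timeout` JSON keys are assumptions inferred from the database columns rather than from the example payload.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// hop is one traceroute hop. The "timeout" key is an assumption.
type hop struct {
	TTL     int    `json:"ttl"`
	IP      string `json:"ip"`
	RTT     int64  `json:"rtt"` // nanoseconds, per the schema section
	Timeout bool   `json:"timeout"`
}

// traceroute holds the optional trace attached to a ping result.
type traceroute struct {
	Method    string `json:"method"`
	Completed bool   `json:"completed"`
	Hops      []hop  `json:"hops"`
	Error     string `json:"error,omitempty"` // assumed key
}

// pingResult is one element of the POST /results array.
type pingResult struct {
	IP         string      `json:"ip"`
	Sent       int         `json:"sent"`
	Received   int         `json:"received"`
	PacketLoss float64     `json:"packet_loss"`
	AvgRTT     int64       `json:"avg_rtt"` // nanoseconds
	Timestamp  time.Time   `json:"timestamp"`
	Error      string      `json:"error,omitempty"` // assumed key
	Traceroute *traceroute `json:"traceroute,omitempty"`
}

// decodeResults parses a JSON array of ping results.
func decodeResults(body []byte) ([]pingResult, error) {
	var results []pingResult
	if err := json.Unmarshal(body, &results); err != nil {
		return nil, err
	}
	return results, nil
}

func main() {
	body := []byte(`[{"ip":"8.8.8.8","sent":4,"received":4,"packet_loss":0,
		"avg_rtt":15000000,"timestamp":"2026-01-07T22:30:00Z"}]`)
	results, err := decodeResults(body)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	// Build the acknowledgement shown in the Response example.
	ack, _ := json.Marshal(struct {
		Status   string `json:"status"`
		Received int    `json:"received"`
	}{"ok", len(results)})
	fmt.Println(string(ack))
}
```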
#### `POST /rotate`
Manually trigger database rotation.
**Response**:
```json
{
"status": "rotated",
"file": "results_2026-01-07_22-30-45.db"
}
```
#### `GET /dump`
Download current SQLite database file.
**Response**: Binary SQLite database file
### Health Service (Port 8091)
#### `GET /health`
Overall health status and statistics.
**Response**:
```json
{
"status": "healthy",
"version": "0.0.1",
"uptime": "2h15m30s",
"stats": {
"total_results": 15420,
"successful_pings": 14890,
"failed_pings": 530,
"hops_discovered": 2341,
"hops_sent": 2341,
"last_result_time": "2026-01-07T22:30:15Z",
"current_db_file": "results_2026-01-07_00-00-00.db",
"current_db_size": 52428800,
"last_rotation": "2026-01-07T00:00:00Z"
}
}
```
#### `GET /ready`
Readiness check (verifies database connectivity).
**Response**: `200 OK` if ready, `503 Service Unavailable` if not
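A readiness handler of this kind might look like the sketch below. It assumes the check is a database ping with a short timeout; the `pinger` interface (which `*sql.DB` satisfies through `PingContext`), the 2-second timeout, and the `fakeDB` stand-in are all illustrative choices, not details from the service.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// pinger is satisfied by *sql.DB. Using an interface keeps this sketch
// free of a database driver.
type pinger interface {
	PingContext(ctx context.Context) error
}

// readyHandler answers 200 when the database responds and 503 otherwise.
func readyHandler(db pinger) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
		defer cancel()
		if err := db.PingContext(ctx); err != nil {
			http.Error(w, "database unavailable", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}

// errDown and fakeDB stand in for a real database in this demonstration.
var errDown = errors.New("database closed")

type fakeDB struct{ err error }

func (f fakeDB) PingContext(context.Context) error { return f.err }

// probe runs the handler once and returns the HTTP status code.
func probe(db pinger) int {
	rec := httptest.NewRecorder()
	readyHandler(db)(rec, httptest.NewRequest(http.MethodGet, "/ready", nil))
	return rec.Code
}

func main() {
	fmt.Println("healthy database:", probe(fakeDB{}))
	fmt.Println("failing database:", probe(fakeDB{err: errDown}))
}
```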
#### `GET /metrics`
Prometheus-compatible metrics.
**Response** (text/plain):
```
# HELP output_service_total_results Total number of results processed
# TYPE output_service_total_results counter
output_service_total_results 15420
# HELP output_service_successful_pings Total successful pings
# TYPE output_service_successful_pings counter
output_service_successful_pings 14890
...
```
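The text format above is simple enough to produce by hand. The following sketch renders one counter in that layout; `renderCounter` is a hypothetical helper, and the real service may well use a Prometheus client library instead.

```go
package main

import "fmt"

// renderCounter formats a single counter with its HELP and TYPE lines,
// matching the layout of the example response above.
func renderCounter(name, help string, value uint64) string {
	return fmt.Sprintf("# HELP %s %s\n# TYPE %s counter\n%s %d\n",
		name, help, name, name, value)
}

func main() {
	fmt.Print(renderCounter("output_service_total_results",
		"Total number of results processed", 15420))
	fmt.Print(renderCounter("output_service_successful_pings",
		"Total successful pings", 14890))
}
```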
#### `GET /stats`
Detailed statistics in JSON format.
**Response**: Same as `stats` object in `/health`
#### `GET /recent?limit=100&ip=8.8.8.8`
Query recent ping results.
**Query Parameters**:
- `limit` (optional): Max results to return (default 100, max 1000)
- `ip` (optional): Filter by specific IP address
**Response**:
```json
[
{
"id": 12345,
"ip": "8.8.8.8",
"sent": 4,
"received": 4,
"packet_loss": 0,
"avg_rtt": 15000000,
"timestamp": "2026-01-07T22:30:00Z"
}
]
```
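The `limit` rules (default 100, max 1000) can be expressed as a small helper. In this sketch, falling back to the default for missing, malformed, or non-positive values is an assumption; the service might instead reject such requests. `parseLimit` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strconv"
)

const (
	defaultLimit = 100  // used when limit is absent
	maxLimit     = 1000 // documented upper bound
)

// parseLimit converts the raw query value into a usable row limit.
func parseLimit(raw string) int {
	if raw == "" {
		return defaultLimit
	}
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return defaultLimit // assumption: bad input falls back to the default
	}
	if n > maxLimit {
		return maxLimit
	}
	return n
}

func main() {
	for _, raw := range []string{"", "50", "5000", "abc"} {
		fmt.Printf("limit=%q -> %d\n", raw, parseLimit(raw))
	}
}
```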
## Database Schema
### `ping_results`
| Column | Type | Description |
|--------|------|-------------|
| id | INTEGER | Primary key |
| ip | TEXT | Target IP address |
| sent | INTEGER | Packets sent |
| received | INTEGER | Packets received |
| packet_loss | REAL | Packet loss percentage |
| avg_rtt | INTEGER | Average RTT (nanoseconds) |
| timestamp | DATETIME | Ping timestamp |
| error | TEXT | Error message if failed |
| created_at | DATETIME | Record creation time |
**Indexes**: `ip`, `timestamp`
### `traceroute_results`
| Column | Type | Description |
|--------|------|-------------|
| id | INTEGER | Primary key |
| ping_result_id | INTEGER | Foreign key to ping_results |
| method | TEXT | Traceroute method (icmp/tcp) |
| completed | BOOLEAN | Whether trace completed |
| error | TEXT | Error message if failed |
### `traceroute_hops`
| Column | Type | Description |
|--------|------|-------------|
| id | INTEGER | Primary key |
| traceroute_id | INTEGER | Foreign key to traceroute_results |
| ttl | INTEGER | Time-to-live / hop number |
| ip | TEXT | Hop IP address |
| rtt | INTEGER | Round-trip time (nanoseconds) |
| timeout | BOOLEAN | Whether hop timed out |
**Indexes**: `ip` (for hop discovery)
## Database Rotation
Rotation triggers automatically when **either** condition is met:
- **Time**: Database age exceeds `--rotation-days` (default 7 days)
- **Size**: Database size exceeds `--max-size-mb` (default 100 MB)
Rotation process:
1. Close current database connection
2. Create new database with timestamp filename (`results_2026-01-07_22-30-45.db`)
3. Initialize schema in new database
4. Delete the oldest database files if the file count exceeds `--keep-files`
Manual rotation: `curl -X POST http://localhost:8081/rotate`
## Hop Discovery and Feedback
1. **Extraction**: For each traceroute, extract non-timeout hop IPs
2. **Deduplication**: Track sent hops in memory to avoid re-sending
3. **Submission**: HTTP POST to input_service `/hops` endpoint:
```json
{
"hops": ["10.0.0.1", "172.16.5.3", "8.8.8.8"]
}
```
4. **Statistics**: Track `hops_discovered` and `hops_sent` metrics
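The extraction and deduplication steps above might be implemented roughly like this. It is a sketch under assumptions: `hopTracker` and its methods are hypothetical names, and hops are marked as sent before the POST is attempted, which a real implementation may want to defer until input_service confirms receipt.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// hop carries only the fields needed for discovery.
type hop struct {
	IP      string
	Timeout bool
}

// hopTracker remembers which hop IPs were already submitted.
type hopTracker struct {
	mu   sync.Mutex
	sent map[string]struct{}
}

func newHopTracker() *hopTracker {
	return &hopTracker{sent: make(map[string]struct{})}
}

// newHops returns the non-timeout hop IPs not seen before and records them.
func (t *hopTracker) newHops(hops []hop) []string {
	t.mu.Lock()
	defer t.mu.Unlock()
	var fresh []string
	for _, h := range hops {
		if h.Timeout || h.IP == "" {
			continue // timed-out hops carry no usable address
		}
		if _, seen := t.sent[h.IP]; seen {
			continue
		}
		t.sent[h.IP] = struct{}{}
		fresh = append(fresh, h.IP)
	}
	return fresh
}

// hopPayload builds the JSON body for POST /hops.
func hopPayload(ips []string) ([]byte, error) {
	return json.Marshal(struct {
		Hops []string `json:"hops"`
	}{Hops: ips})
}

func main() {
	tracker := newHopTracker()
	trace := []hop{{IP: "10.0.0.1"}, {Timeout: true}, {IP: "172.16.5.3"}}
	body, _ := hopPayload(tracker.newHops(trace))
	fmt.Println(string(body))
	fmt.Println("second pass:", tracker.newHops(trace))
}
```

Since the set lives in memory, it is lost on restart, so the first traces after a restart will re-submit hops that input_service has already seen.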
## Multi-Instance Deployment
Each output_service instance:
- Maintains its **own SQLite database** in the directory set by `--db-dir`
- Manages its **own rotation schedule** independently
- Tracks its **own hop deduplication** (some duplicate hop submissions across instances are acceptable)
- Can receive results from **multiple ping_service nodes**
For central data aggregation:
- Use `/dump` endpoint to collect database files from all instances
- Merge databases offline for analysis/visualization
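- Or point `--db-dir` at shared network storage, keeping in mind that SQLite file locking is unreliable on many network filesystems, so each instance should still write only to its own database file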
## Integration with ping_service
Configure ping_service to send results to output_service:
**`config.yaml`** (ping_service):
```yaml
output_file: "http://output-service:8081/results"
```
## Integration with input_service
Output service expects input_service to have a `/hops` endpoint:
**Expected endpoint**: `POST /hops`
**Payload**:
```json
{
"hops": ["10.0.0.1", "172.16.5.3"]
}
```
## Monitoring
**Check health**:
```bash
curl http://localhost:8091/health
```
**View metrics**:
```bash
curl http://localhost:8091/metrics
```
**Query recent failures**:
```bash
curl 'http://localhost:8091/recent?limit=50' | jq '.[] | select(.error != null)'
```
**Download database backup**:
```bash
curl http://localhost:8081/dump -o backup.db
```
## Development Testing
Use the Python demo output server to see example data format:
```bash
cd output_service
python3 http_ouput_demo.py # Note: file has typo in name
```
## Graceful Shutdown
Press `Ctrl+C` (SIGINT) to trigger a graceful shutdown; in-flight work has up to 10 seconds to finish.
The service will:
1. Stop accepting new requests
2. Finish processing in-flight requests
3. Close database connections cleanly
4. Exit
## Version
Current version: **0.0.1**
## Dependencies
- `github.com/mattn/go-sqlite3` - SQLite driver (requires CGO)