Claude Code session 1.
# Output Service

Service that receives output from `ping_service` instances. It builds a database of mappable nodes and updates the input service's address lists with all working endpoints and working hops from the traces.

HTTP service that receives ping and traceroute results from distributed `ping_service` nodes, stores them in SQLite databases with automatic rotation, extracts intermediate hops from traceroute data, and feeds them back to `input_service`.

Provides reporting API endpoints so the manager can monitor progress.

## Purpose

- **Data Collection**: Store ping results and traceroute paths from multiple ping_service instances
- **Hop Discovery**: Extract intermediate hop IPs from traceroute data
- **Feedback Loop**: Send discovered hops to input_service to grow the target pool organically
- **Data Management**: Automatic database rotation and retention policy
- **Observability**: Expose metrics and statistics for monitoring

## Features

- **Multi-Instance Ready**: Each instance maintains its own SQLite database
- **Automatic Rotation**: Databases rotate weekly or on reaching 100 MB, whichever comes first
- **Retention Policy**: Keeps the 5 most recent database files and deletes older ones automatically
- **Hop Deduplication**: Tracks sent hops to minimize duplicate network traffic to input_service
- **Manual Operations**: API endpoints for manual rotation and database dumps
- **Health Monitoring**: Prometheus metrics, stats, and health checks

## Requirements

- Go 1.25+
- SQLite3 (via go-sqlite3 driver)

## Building

```bash
cd output_service
go build -o output_service main.go
```

## Usage

### Basic

```bash
./output_service
```

By default, the service listens on port 8081 for results and on port 8091 for health checks.

### With Custom Configuration

```bash
./output_service \
  --port=8082 \
  --health-port=8092 \
  --input-url=http://input-service:8080/hops \
  --db-dir=/var/lib/output_service \
  --max-size-mb=200 \
  --rotation-days=14 \
  --keep-files=10 \
  --verbose
```

### Command Line Flags

| Flag | Default | Description |
|------|---------|-------------|
| `--port` | 8081 | Port for receiving results |
| `--health-port` | 8091 | Port for health/metrics endpoints |
| `--input-url` | `http://localhost:8080/hops` | Input service URL for hop submission |
| `--db-dir` | `./output_data` | Directory for database files |
| `--max-size-mb` | 100 | Max database size (MB) before rotation |
| `--rotation-days` | 7 | Rotate database after N days |
| `--keep-files` | 5 | Number of database files to retain |
| `-v, --verbose` | false | Enable verbose logging |
| `--version` | - | Show version |
| `--help` | - | Show help |

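
The flag table above maps naturally onto Go's standard `flag` package. The following is a minimal sketch of that mapping, not the service's actual code: the `Config` struct, its field names, and `parseFlags` are hypothetical, and only the flag names and defaults come from the table. `--version` and `--help` are left out for brevity.

```go
package main

import (
	"flag"
	"fmt"
)

// Config mirrors the documented flags. The struct and field names are
// illustrative assumptions, not taken from the service source.
type Config struct {
	Port         int
	HealthPort   int
	InputURL     string
	DBDir        string
	MaxSizeMB    int64
	RotationDays int
	KeepFiles    int
	Verbose      bool
}

// parseFlags parses args (without the program name) using the documented defaults.
func parseFlags(args []string) (Config, error) {
	var cfg Config
	fs := flag.NewFlagSet("output_service", flag.ContinueOnError)
	fs.IntVar(&cfg.Port, "port", 8081, "Port for receiving results")
	fs.IntVar(&cfg.HealthPort, "health-port", 8091, "Port for health/metrics endpoints")
	fs.StringVar(&cfg.InputURL, "input-url", "http://localhost:8080/hops", "Input service URL for hop submission")
	fs.StringVar(&cfg.DBDir, "db-dir", "./output_data", "Directory for database files")
	fs.Int64Var(&cfg.MaxSizeMB, "max-size-mb", 100, "Max database size (MB) before rotation")
	fs.IntVar(&cfg.RotationDays, "rotation-days", 7, "Rotate database after N days")
	fs.IntVar(&cfg.KeepFiles, "keep-files", 5, "Number of database files to retain")
	// Register -v and --verbose against the same variable.
	fs.BoolVar(&cfg.Verbose, "v", false, "Enable verbose logging")
	fs.BoolVar(&cfg.Verbose, "verbose", false, "Enable verbose logging")
	err := fs.Parse(args)
	return cfg, err
}

func main() {
	cfg, err := parseFlags([]string{"--port=8082", "--keep-files=10", "-v"})
	if err != nil {
		fmt.Println("flag error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

The `flag` package accepts both `-name` and `--name`, so the double-dash style shown in the usage examples works without extra handling.
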
## API Endpoints

### Main Service (Port 8081)

#### `POST /results`
Receive ping results from ping_service nodes.

**Request Body**: JSON array of ping results
```json
[
  {
    "ip": "8.8.8.8",
    "sent": 4,
    "received": 4,
    "packet_loss": 0,
    "avg_rtt": 15000000,
    "timestamp": "2026-01-07T22:30:00Z",
    "traceroute": {
      "method": "icmp",
      "completed": true,
      "hops": [
        {"ttl": 1, "ip": "192.168.1.1", "rtt": 2000000},
        {"ttl": 2, "ip": "10.0.0.1", "rtt": 5000000},
        {"ttl": 3, "ip": "8.8.8.8", "rtt": 15000000}
      ]
    }
  }
]
```

`avg_rtt` and hop `rtt` values are in nanoseconds (15000000 is 15 ms).

**Response**:
```json
{
  "status": "ok",
  "received": 1
}
```

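
For clients or tooling that need to parse this payload, the request body can be modeled with Go structs. This is a sketch derived only from the JSON example above: the type names and `decodeResults` are hypothetical, and mapping the nanosecond RTT fields to `time.Duration` is a convenience choice rather than something the service is known to do.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Hop is one traceroute hop. RTT is nanoseconds, which matches
// time.Duration's underlying unit, so the JSON number decodes directly.
type Hop struct {
	TTL int           `json:"ttl"`
	IP  string        `json:"ip"`
	RTT time.Duration `json:"rtt"`
}

// Traceroute is the optional trace attached to a ping result.
type Traceroute struct {
	Method    string `json:"method"`
	Completed bool   `json:"completed"`
	Hops      []Hop  `json:"hops"`
}

// PingResult is one element of the POST /results array.
type PingResult struct {
	IP         string        `json:"ip"`
	Sent       int           `json:"sent"`
	Received   int           `json:"received"`
	PacketLoss float64       `json:"packet_loss"`
	AvgRTT     time.Duration `json:"avg_rtt"`
	Timestamp  time.Time     `json:"timestamp"`
	Traceroute *Traceroute   `json:"traceroute,omitempty"`
}

// decodeResults parses a request body into a slice of results.
func decodeResults(body []byte) ([]PingResult, error) {
	var results []PingResult
	err := json.Unmarshal(body, &results)
	return results, err
}

func main() {
	body := []byte(`[{"ip":"8.8.8.8","sent":4,"received":4,"packet_loss":0,
		"avg_rtt":15000000,"timestamp":"2026-01-07T22:30:00Z",
		"traceroute":{"method":"icmp","completed":true,
		"hops":[{"ttl":1,"ip":"192.168.1.1","rtt":2000000}]}}]`)
	results, err := decodeResults(body)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, r := range results {
		fmt.Println(r.IP, r.AvgRTT, len(r.Traceroute.Hops), "hop(s)")
	}
}
```
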
#### `POST /rotate`
Manually trigger database rotation.

**Response**:
```json
{
  "status": "rotated",
  "file": "results_2026-01-07_22-30-45.db"
}
```

#### `GET /dump`
Download current SQLite database file.

**Response**: Binary SQLite database file

### Health Service (Port 8091)

#### `GET /health`
Overall health status and statistics.

**Response**:
```json
{
  "status": "healthy",
  "version": "0.0.1",
  "uptime": "2h15m30s",
  "stats": {
    "total_results": 15420,
    "successful_pings": 14890,
    "failed_pings": 530,
    "hops_discovered": 2341,
    "hops_sent": 2341,
    "last_result_time": "2026-01-07T22:30:15Z",
    "current_db_file": "results_2026-01-07.db",
    "current_db_size": 52428800,
    "last_rotation": "2026-01-07T00:00:00Z"
  }
}
```

#### `GET /ready`
Readiness check (verifies database connectivity).

**Response**: `200 OK` if ready, `503 Service Unavailable` if not

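
The readiness contract above (200 when the database is reachable, 503 otherwise) can be sketched as a small handler. This is an illustration, not the service's implementation: `readyHandler`, `probe`, and `errDBDown` are hypothetical names, and the `check` function stands in for a real database ping such as `(*sql.DB).PingContext`.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// errDBDown is a placeholder error used to simulate a failed database check.
var errDBDown = errors.New("database unavailable")

// readyHandler returns 200 when check succeeds and 503 otherwise.
// check is a stand-in for a database connectivity test.
func readyHandler(check func() error) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if err := check(); err != nil {
			http.Error(w, "not ready", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}

// probe calls the handler in-process and returns the resulting status code.
func probe(check func() error) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/ready", nil)
	readyHandler(check)(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("healthy db:", probe(func() error { return nil }))
	fmt.Println("failing db:", probe(func() error { return errDBDown }))
}
```
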
#### `GET /metrics`
Prometheus-compatible metrics.

**Response** (text/plain):
```
# HELP output_service_total_results Total number of results processed
# TYPE output_service_total_results counter
output_service_total_results 15420

# HELP output_service_successful_pings Total successful pings
# TYPE output_service_successful_pings counter
output_service_successful_pings 14890
...
```

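
The text format shown above is simple enough to produce with the standard library alone. The sketch below renders the two counters from the example; `writeCounter` and `renderMetrics` are hypothetical helpers, and whether the service builds this output by hand or through a Prometheus client library is not stated here.

```go
package main

import (
	"fmt"
	"strings"
)

// writeCounter appends one counter in the Prometheus text exposition format.
func writeCounter(sb *strings.Builder, name, help string, value uint64) {
	fmt.Fprintf(sb, "# HELP %s %s\n", name, help)
	fmt.Fprintf(sb, "# TYPE %s counter\n", name)
	fmt.Fprintf(sb, "%s %d\n", name, value)
}

// renderMetrics renders the two counters used in the example response.
func renderMetrics(totalResults, successfulPings uint64) string {
	var sb strings.Builder
	writeCounter(&sb, "output_service_total_results", "Total number of results processed", totalResults)
	sb.WriteString("\n")
	writeCounter(&sb, "output_service_successful_pings", "Total successful pings", successfulPings)
	return sb.String()
}

func main() {
	fmt.Print(renderMetrics(15420, 14890))
}
```
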
#### `GET /stats`
Detailed statistics in JSON format.

**Response**: Same as `stats` object in `/health`

#### `GET /recent?limit=100&ip=8.8.8.8`
Query recent ping results.

**Query Parameters**:
- `limit` (optional): Max results to return (default 100, max 1000)
- `ip` (optional): Filter by specific IP address

**Response**:
```json
[
  {
    "id": 12345,
    "ip": "8.8.8.8",
    "sent": 4,
    "received": 4,
    "packet_loss": 0,
    "avg_rtt": 15000000,
    "timestamp": "2026-01-07T22:30:00Z"
  }
]
```

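
The `limit` rules for `/recent` (default 100, capped at 1000) amount to a small clamping step. The following sketch shows one way to apply them; `parseRecentQuery` is a hypothetical helper, and falling back to the default for a missing, non-numeric, or non-positive `limit` is an assumption, since the behavior for invalid values is not documented.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

const (
	defaultLimit = 100  // documented default
	maxLimit     = 1000 // documented maximum
)

// parseRecentQuery extracts limit and ip from a raw query string.
// Invalid or non-positive limits fall back to the default (an assumption).
func parseRecentQuery(rawQuery string) (int, string, error) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return 0, "", err
	}
	limit := defaultLimit
	if s := q.Get("limit"); s != "" {
		if n, convErr := strconv.Atoi(s); convErr == nil && n > 0 {
			limit = n
		}
	}
	if limit > maxLimit {
		limit = maxLimit
	}
	return limit, q.Get("ip"), nil
}

func main() {
	for _, raw := range []string{"", "limit=50", "limit=5000&ip=8.8.8.8", "limit=abc"} {
		limit, ip, err := parseRecentQuery(raw)
		fmt.Printf("query=%q limit=%d ip=%q err=%v\n", raw, limit, ip, err)
	}
}
```
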
## Database Schema

### `ping_results`
| Column | Type | Description |
|--------|------|-------------|
| id | INTEGER | Primary key |
| ip | TEXT | Target IP address |
| sent | INTEGER | Packets sent |
| received | INTEGER | Packets received |
| packet_loss | REAL | Packet loss percentage |
| avg_rtt | INTEGER | Average RTT (nanoseconds) |
| timestamp | DATETIME | Ping timestamp |
| error | TEXT | Error message if failed |
| created_at | DATETIME | Record creation time |

**Indexes**: `ip`, `timestamp`

### `traceroute_results`
| Column | Type | Description |
|--------|------|-------------|
| id | INTEGER | Primary key |
| ping_result_id | INTEGER | Foreign key to ping_results |
| method | TEXT | Traceroute method (icmp/tcp) |
| completed | BOOLEAN | Whether trace completed |
| error | TEXT | Error message if failed |

### `traceroute_hops`
| Column | Type | Description |
|--------|------|-------------|
| id | INTEGER | Primary key |
| traceroute_id | INTEGER | Foreign key to traceroute_results |
| ttl | INTEGER | Time-to-live / hop number |
| ip | TEXT | Hop IP address |
| rtt | INTEGER | Round-trip time (nanoseconds) |
| timeout | BOOLEAN | Whether hop timed out |

**Indexes**: `ip` (for hop discovery)

## Database Rotation

Rotation triggers automatically when **either** condition is met:
- **Time**: Database age exceeds `--rotation-days` (default 7 days)
- **Size**: Database size exceeds `--max-size-mb` (default 100 MB)

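
The two trigger conditions combine with a logical OR. A minimal sketch of that check follows; `RotationPolicy`, `defaultPolicy`, and `ShouldRotate` are hypothetical names, and treating 1 MB as 1024 × 1024 bytes is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// RotationPolicy holds the two documented thresholds.
type RotationPolicy struct {
	MaxAge  time.Duration // database age limit
	MaxSize int64         // database size limit in bytes
}

// defaultPolicy uses the documented defaults: 7 days or 100 MB.
// 1 MB is taken as 1024*1024 bytes here, which is an assumption.
var defaultPolicy = RotationPolicy{
	MaxAge:  7 * 24 * time.Hour,
	MaxSize: 100 * 1024 * 1024,
}

// ShouldRotate reports whether either threshold has been exceeded.
func (p RotationPolicy) ShouldRotate(age time.Duration, sizeBytes int64) bool {
	return age > p.MaxAge || sizeBytes > p.MaxSize
}

func main() {
	twoDays := 48 * time.Hour
	fmt.Println("2 days, 50 MB: ", defaultPolicy.ShouldRotate(twoDays, 50*1024*1024))
	fmt.Println("2 days, 150 MB:", defaultPolicy.ShouldRotate(twoDays, 150*1024*1024))
	fmt.Println("8 days, 1 MB:  ", defaultPolicy.ShouldRotate(8*24*time.Hour, 1024*1024))
}
```
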
Rotation process:
1. Close current database connection
2. Create new database with timestamp filename (`results_2026-01-07_22-30-45.db`)
3. Initialize schema in new database
4. Delete oldest database files if count exceeds `--keep-files`

Manual rotation: `curl -X POST http://localhost:8081/rotate`

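
Step 4 of the rotation process, the retention check, can be sketched as a pure function over file names. `filesToDelete` is a hypothetical helper, and the sketch assumes every file follows the `results_YYYY-MM-DD_HH-MM-SS.db` pattern, so that sorting names as strings also sorts them chronologically.

```go
package main

import (
	"fmt"
	"sort"
)

// filesToDelete returns the names that fall outside the keep most recent files.
// It relies on timestamped names sorting chronologically as plain strings.
func filesToDelete(names []string, keep int) []string {
	if keep < 0 {
		keep = 0
	}
	sorted := append([]string(nil), names...) // copy so the caller's slice is untouched
	sort.Strings(sorted)
	if len(sorted) <= keep {
		return nil
	}
	return sorted[:len(sorted)-keep]
}

func main() {
	files := []string{
		"results_2026-01-03_00-00-00.db",
		"results_2026-01-01_00-00-00.db",
		"results_2026-01-02_00-00-00.db",
	}
	fmt.Println(filesToDelete(files, 2))
}
```
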
## Hop Discovery and Feedback

1. **Extraction**: For each traceroute, extract non-timeout hop IPs
2. **Deduplication**: Track sent hops in memory to avoid re-sending
3. **Submission**: HTTP POST to input_service `/hops` endpoint:
   ```json
   {
     "hops": ["10.0.0.1", "172.16.5.3", "8.8.8.8"]
   }
   ```
4. **Statistics**: Track `hops_discovered` and `hops_sent` metrics

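
Steps 1 and 2 above (extraction and in-memory deduplication) can be sketched with a mutex-guarded set. `Hop`, `HopTracker`, and `NewHops` are hypothetical names, and this version marks an IP as sent as soon as it is returned, before any HTTP submission succeeds, which is a simplification.

```go
package main

import (
	"fmt"
	"sync"
)

// Hop carries the two fields relevant to hop discovery.
type Hop struct {
	IP      string
	Timeout bool
}

// HopTracker remembers which hop IPs have already been handed out.
type HopTracker struct {
	mu   sync.Mutex
	sent map[string]struct{}
}

// NewHopTracker returns an empty tracker.
func NewHopTracker() *HopTracker {
	return &HopTracker{sent: make(map[string]struct{})}
}

// NewHops returns responsive hop IPs not seen before, in input order,
// and records them so later calls skip them.
func (t *HopTracker) NewHops(hops []Hop) []string {
	t.mu.Lock()
	defer t.mu.Unlock()
	var fresh []string
	for _, h := range hops {
		if h.Timeout || h.IP == "" {
			continue // skip hops that timed out or reported no address
		}
		if _, seen := t.sent[h.IP]; seen {
			continue
		}
		t.sent[h.IP] = struct{}{}
		fresh = append(fresh, h.IP)
	}
	return fresh
}

func main() {
	tracker := NewHopTracker()
	trace := []Hop{{IP: "192.168.1.1"}, {Timeout: true}, {IP: "10.0.0.1"}, {IP: "8.8.8.8"}}
	fmt.Println(tracker.NewHops(trace))
	fmt.Println(tracker.NewHops(trace)) // nothing new the second time
}
```

Because the set lives in memory, it is lost on restart, so some hops will be re-sent after the process restarts.
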
## Multi-Instance Deployment

Each output_service instance:
- Maintains its **own SQLite database** in `--db-dir`
- Manages its **own rotation schedule** independently
- Tracks its **own hop deduplication** (some duplicate hop submissions across instances are acceptable)
- Can receive results from **multiple ping_service nodes**

For central data aggregation:
- Use `/dump` endpoint to collect database files from all instances
- Merge databases offline for analysis/visualization
- Or use shared network storage for `--db-dir` (with file locking considerations)

## Integration with ping_service

Configure ping_service to send results to output_service:

**`config.yaml`** (ping_service):
```yaml
output_file: "http://output-service:8081/results"
```

## Integration with input_service

Output service expects input_service to have a `/hops` endpoint:

**Expected endpoint**: `POST /hops`
**Payload**:
```json
{
  "hops": ["10.0.0.1", "172.16.5.3"]
}
```

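
A client for this endpoint only needs to marshal the `hops` array and POST it. The sketch below is a hedged illustration: `hopsPayload`, `encodeHops`, and `submitHops` are hypothetical names, treating any non-2xx status as a failure is an assumption, and `main` uses an in-process test server in place of a real input_service.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// hopsPayload matches the documented JSON body.
type hopsPayload struct {
	Hops []string `json:"hops"`
}

// encodeHops builds the JSON body; a nil slice is sent as an empty array.
func encodeHops(hops []string) ([]byte, error) {
	if hops == nil {
		hops = []string{}
	}
	return json.Marshal(hopsPayload{Hops: hops})
}

// submitHops POSTs the payload to inputURL.
// Treating any non-2xx response as an error is an assumption.
func submitHops(client *http.Client, inputURL string, hops []string) error {
	body, err := encodeHops(hops)
	if err != nil {
		return err
	}
	resp, err := client.Post(inputURL, "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("input_service returned %s", resp.Status)
	}
	return nil
}

func main() {
	// Stand-in for input_service so the example runs without a network dependency.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	client := &http.Client{Timeout: 5 * time.Second}
	err := submitHops(client, srv.URL+"/hops", []string{"10.0.0.1", "172.16.5.3"})
	fmt.Println("submit error:", err)
}
```
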
## Monitoring

**Check health**:
```bash
curl http://localhost:8091/health
```

**View metrics**:
```bash
curl http://localhost:8091/metrics
```

**Query recent failures**:
```bash
curl 'http://localhost:8091/recent?limit=50' | jq '.[] | select(.error != null)'
```

**Download database backup**:
```bash
curl http://localhost:8081/dump -o backup.db
```

## Development Testing

Use the Python demo output server to see the example data format:

```bash
cd output_service
python3 http_ouput_demo.py  # Note: file has typo in name
```

## Graceful Shutdown

Press `Ctrl+C` to trigger a graceful shutdown with a 10-second timeout.

The service will:
1. Stop accepting new requests
2. Finish processing in-flight requests
3. Close database connections cleanly
4. Exit

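
The shutdown sequence above corresponds closely to what `http.Server.Shutdown` provides in Go's standard library. The following sketch shows that pattern with the documented 10-second timeout. `serveUntil` and `demo` are hypothetical names, and the goroutine that injects an interrupt after 100 ms exists only so the example terminates on its own; a real service would simply wait for the signal.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// serveUntil serves on ln until a signal arrives on stop, then stops accepting
// connections and waits up to timeout for in-flight requests to finish.
func serveUntil(srv *http.Server, ln net.Listener, stop <-chan os.Signal, timeout time.Duration) error {
	errCh := make(chan error, 1)
	go func() { errCh <- srv.Serve(ln) }()

	select {
	case err := <-errCh:
		return err // server failed before shutdown was requested
	case <-stop:
	}

	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		return err
	}
	if err := <-errCh; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	// Database connections would be closed here, once requests have drained.
	return nil
}

// demo wires up signal handling and simulates Ctrl+C after a short delay.
func demo() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	stop := make(chan os.Signal, 1)
	signal.Notify(stop, os.Interrupt, syscall.SIGTERM)
	defer signal.Stop(stop)

	go func() {
		time.Sleep(100 * time.Millisecond)
		stop <- os.Interrupt // simulated Ctrl+C, for the example only
	}()

	srv := &http.Server{Handler: http.NotFoundHandler()}
	return serveUntil(srv, ln, stop, 10*time.Second)
}

func main() {
	fmt.Println("shutdown error:", demo())
}
```
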
## Version

Current version: **0.0.1**

## Dependencies

- `github.com/mattn/go-sqlite3` - SQLite driver (requires CGO)