# Output Service

HTTP service that receives ping and traceroute results from distributed `ping_service` nodes, stores them in SQLite databases with automatic rotation, extracts intermediate hops from traceroute data, and feeds them back to `input_service`.

## Purpose

- **Data Collection**: Store ping results and traceroute paths from multiple ping_service instances
- **Hop Discovery**: Extract intermediate hop IPs from traceroute data
- **Feedback Loop**: Send discovered hops to input_service to grow the target pool organically
- **Data Management**: Automatic database rotation and retention policy
- **Observability**: Expose metrics and statistics for monitoring

## Features

- **Multi-Instance Ready**: Each instance maintains its own SQLite database
- **Automatic Rotation**: Databases rotate weekly or on reaching 100 MB, whichever comes first
- **Retention Policy**: Keeps the 5 most recent database files and auto-deletes older ones
- **Hop Deduplication**: Tracks sent hops to minimize duplicate network traffic to input_service
- **Manual Operations**: API endpoints for manual rotation and database dumps
- **Health Monitoring**: Prometheus metrics, stats, and health checks

## Requirements

- Go 1.25+
- SQLite3 (via go-sqlite3 driver)

## Building

```bash
cd output_service
go build -o output_service main.go
```

## Usage

### Basic

```bash
./output_service
```

By default, the service listens on port 8081 for results and on port 8091 for health checks.

### With Custom Configuration

```bash
./output_service \
  --port=8082 \
  --health-port=8092 \
  --input-url=http://input-service:8080/hops \
  --db-dir=/var/lib/output_service \
  --max-size-mb=200 \
  --rotation-days=14 \
  --keep-files=10 \
  --verbose
```

### Command Line Flags

| Flag | Default | Description |
|------|---------|-------------|
| `--port` | 8081 | Port for receiving results |
| `--health-port` | 8091 | Port for health/metrics endpoints |
| `--input-url` | `http://localhost:8080/hops` | Input service URL for hop submission |
| `--db-dir` | `./output_data` | Directory for database files |
| `--max-size-mb` | 100 | Max database size (MB) before rotation |
| `--rotation-days` | 7 | Rotate database after N days |
| `--keep-files` | 5 | Number of database files to retain |
| `-v, --verbose` | false | Enable verbose logging |
| `--version` | - | Show version |
| `--help` | - | Show help |

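The flag table above could be declared with Go's standard `flag` package roughly as follows. This is only a sketch: the `config` struct and `parseFlags` function are hypothetical names, not taken from the service's source, and `--version` is left out for brevity.

```go
package main

import (
	"flag"
	"fmt"
)

// config mirrors the documented flags; the struct itself is illustrative.
type config struct {
	Port         int
	HealthPort   int
	InputURL     string
	DBDir        string
	MaxSizeMB    int64
	RotationDays int
	KeepFiles    int
	Verbose      bool
}

// parseFlags parses args (without the program name) using the documented defaults.
func parseFlags(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("output_service", flag.ContinueOnError)
	fs.IntVar(&c.Port, "port", 8081, "port for receiving results")
	fs.IntVar(&c.HealthPort, "health-port", 8091, "port for health/metrics endpoints")
	fs.StringVar(&c.InputURL, "input-url", "http://localhost:8080/hops", "input service URL for hop submission")
	fs.StringVar(&c.DBDir, "db-dir", "./output_data", "directory for database files")
	fs.Int64Var(&c.MaxSizeMB, "max-size-mb", 100, "max database size (MB) before rotation")
	fs.IntVar(&c.RotationDays, "rotation-days", 7, "rotate database after N days")
	fs.IntVar(&c.KeepFiles, "keep-files", 5, "number of database files to retain")
	// Both spellings write to the same field.
	fs.BoolVar(&c.Verbose, "verbose", false, "enable verbose logging")
	fs.BoolVar(&c.Verbose, "v", false, "shorthand for --verbose")
	err := fs.Parse(args)
	return c, err
}

func main() {
	c, err := parseFlags([]string{"--port=8082", "--keep-files=10", "-v"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", c)
}
```

The standard `flag` package accepts both `-port` and `--port`, so the double-dash forms shown in the table work without extra libraries.
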
## API Endpoints

### Main Service (Port 8081)

#### `POST /results`
Receive ping results from ping_service nodes.

**Request Body**: JSON array of ping results. RTT values (`avg_rtt`, `rtt`) are integers in nanoseconds, so `15000000` is 15 ms.
```json
[
  {
    "ip": "8.8.8.8",
    "sent": 4,
    "received": 4,
    "packet_loss": 0,
    "avg_rtt": 15000000,
    "timestamp": "2026-01-07T22:30:00Z",
    "traceroute": {
      "method": "icmp",
      "completed": true,
      "hops": [
        {"ttl": 1, "ip": "192.168.1.1", "rtt": 2000000},
        {"ttl": 2, "ip": "10.0.0.1", "rtt": 5000000},
        {"ttl": 3, "ip": "8.8.8.8", "rtt": 15000000}
      ]
    }
  }
]
```

**Response**:
```json
{
  "status": "ok",
  "received": 1
}
```

#### `POST /rotate`
Manually trigger database rotation.

**Response**:
```json
{
  "status": "rotated",
  "file": "results_2026-01-07_22-30-45.db"
}
```

#### `GET /dump`
Download current SQLite database file.

**Response**: Binary SQLite database file

### Health Service (Port 8091)

#### `GET /health`
Overall health status and statistics.

**Response**:
```json
{
  "status": "healthy",
  "version": "0.0.1",
  "uptime": "2h15m30s",
  "stats": {
    "total_results": 15420,
    "successful_pings": 14890,
    "failed_pings": 530,
    "hops_discovered": 2341,
    "hops_sent": 2341,
    "last_result_time": "2026-01-07T22:30:15Z",
    "current_db_file": "results_2026-01-07.db",
    "current_db_size": 52428800,
    "last_rotation": "2026-01-07T00:00:00Z"
  }
}
```

#### `GET /ready`
Readiness check (verifies database connectivity).

**Response**: `200 OK` if ready, `503 Service Unavailable` if not

#### `GET /metrics`
Prometheus-compatible metrics.

**Response** (text/plain):
```
# HELP output_service_total_results Total number of results processed
# TYPE output_service_total_results counter
output_service_total_results 15420

# HELP output_service_successful_pings Total successful pings
# TYPE output_service_successful_pings counter
output_service_successful_pings 14890
...
```

#### `GET /stats`
Detailed statistics in JSON format.

**Response**: Same as `stats` object in `/health`

#### `GET /recent?limit=100&ip=8.8.8.8`
Query recent ping results.

**Query Parameters**:
- `limit` (optional): Max results to return (default 100, max 1000)
- `ip` (optional): Filter by specific IP address

**Response**:
```json
[
  {
    "id": 12345,
    "ip": "8.8.8.8",
    "sent": 4,
    "received": 4,
    "packet_loss": 0,
    "avg_rtt": 15000000,
    "timestamp": "2026-01-07T22:30:00Z"
  }
]
```

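A client could assemble the `/recent` query string with `net/url` so that both parameters are escaped correctly. The helper below is a hypothetical sketch: the name `recentURL` is invented here, and clamping `limit` to 1000 on the client side is a precaution, since the README does not say how the server treats larger values.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// recentURL builds a /recent URL on the health port.
// limit <= 0 omits the parameter so the server default (100) applies;
// values above the documented maximum of 1000 are clamped.
func recentURL(base string, limit int, ip string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/recent"
	q := url.Values{}
	if limit > 1000 {
		limit = 1000
	}
	if limit > 0 {
		q.Set("limit", strconv.Itoa(limit))
	}
	if ip != "" {
		q.Set("ip", ip)
	}
	u.RawQuery = q.Encode() // keys are emitted in sorted order
	return u.String(), nil
}

func main() {
	u, err := recentURL("http://localhost:8091", 50, "8.8.8.8")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u)
}
```
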
## Database Schema

### `ping_results`
| Column | Type | Description |
|--------|------|-------------|
| id | INTEGER | Primary key |
| ip | TEXT | Target IP address |
| sent | INTEGER | Packets sent |
| received | INTEGER | Packets received |
| packet_loss | REAL | Packet loss percentage |
| avg_rtt | INTEGER | Average RTT (nanoseconds) |
| timestamp | DATETIME | Ping timestamp |
| error | TEXT | Error message if failed |
| created_at | DATETIME | Record creation time |

**Indexes**: `ip`, `timestamp`

### `traceroute_results`
| Column | Type | Description |
|--------|------|-------------|
| id | INTEGER | Primary key |
| ping_result_id | INTEGER | Foreign key to ping_results |
| method | TEXT | Traceroute method (icmp/tcp) |
| completed | BOOLEAN | Whether trace completed |
| error | TEXT | Error message if failed |

### `traceroute_hops`
| Column | Type | Description |
|--------|------|-------------|
| id | INTEGER | Primary key |
| traceroute_id | INTEGER | Foreign key to traceroute_results |
| ttl | INTEGER | Time-to-live / hop number |
| ip | TEXT | Hop IP address |
| rtt | INTEGER | Round-trip time (nanoseconds) |
| timeout | BOOLEAN | Whether hop timed out |

**Indexes**: `ip` (for hop discovery)

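The tables above could be created with DDL along the following lines. This is a reconstruction from the column lists, not the service's actual schema: the index names, the absence of `NOT NULL` or default clauses, and the `initSchema` helper are all assumptions. The `recorder` type stands in for `*sql.DB` so the sketch runs without a SQLite driver.

```go
package main

import (
	"database/sql"
	"fmt"
)

// schemaStatements is a hypothetical DDL matching the documented columns.
var schemaStatements = []string{
	`CREATE TABLE IF NOT EXISTS ping_results (
		id INTEGER PRIMARY KEY,
		ip TEXT,
		sent INTEGER,
		received INTEGER,
		packet_loss REAL,
		avg_rtt INTEGER,
		timestamp DATETIME,
		error TEXT,
		created_at DATETIME
	)`,
	`CREATE TABLE IF NOT EXISTS traceroute_results (
		id INTEGER PRIMARY KEY,
		ping_result_id INTEGER REFERENCES ping_results(id),
		method TEXT,
		completed BOOLEAN,
		error TEXT
	)`,
	`CREATE TABLE IF NOT EXISTS traceroute_hops (
		id INTEGER PRIMARY KEY,
		traceroute_id INTEGER REFERENCES traceroute_results(id),
		ttl INTEGER,
		ip TEXT,
		rtt INTEGER,
		timeout BOOLEAN
	)`,
	`CREATE INDEX IF NOT EXISTS idx_ping_results_ip ON ping_results(ip)`,
	`CREATE INDEX IF NOT EXISTS idx_ping_results_timestamp ON ping_results(timestamp)`,
	`CREATE INDEX IF NOT EXISTS idx_traceroute_hops_ip ON traceroute_hops(ip)`,
}

// execer is the subset of *sql.DB that initSchema needs.
type execer interface {
	Exec(query string, args ...any) (sql.Result, error)
}

// initSchema runs every statement in order and stops at the first error.
func initSchema(db execer) error {
	for _, stmt := range schemaStatements {
		if _, err := db.Exec(stmt); err != nil {
			return fmt.Errorf("init schema: %w", err)
		}
	}
	return nil
}

// recorder collects statements instead of executing them.
type recorder struct{ statements []string }

func (r *recorder) Exec(query string, args ...any) (sql.Result, error) {
	r.statements = append(r.statements, query)
	return nil, nil
}

func main() {
	rec := &recorder{}
	if err := initSchema(rec); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("would execute %d statements\n", len(rec.statements))
}
```

In a real build, a `*sql.DB` opened through the `go-sqlite3` driver could be passed to `initSchema` in place of the recorder.
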
## Database Rotation

Rotation triggers automatically when **either** condition is met:
- **Time**: Database age exceeds `--rotation-days` (default 7 days)
- **Size**: Database size exceeds `--max-size-mb` (default 100 MB)

Rotation process:
1. Close current database connection
2. Create new database with timestamp filename (`results_2026-01-07_22-30-45.db`)
3. Initialize schema in new database
4. Delete oldest database files if count exceeds `--keep-files`

Manual rotation: `curl -X POST http://localhost:8081/rotate`

## Hop Discovery and Feedback

1. **Extraction**: For each traceroute, extract non-timeout hop IPs
2. **Deduplication**: Track sent hops in memory to avoid re-sending
3. **Submission**: HTTP POST to input_service `/hops` endpoint:
   ```json
   {
     "hops": ["10.0.0.1", "172.16.5.3", "8.8.8.8"]
   }
   ```
4. **Statistics**: Track `hops_discovered` and `hops_sent` metrics

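The extraction, deduplication, and payload steps above might look like the sketch below. The `hopTracker` type and its methods are hypothetical. Note one simplification: this version marks a hop as sent as soon as it is returned, whereas a real implementation may wait until the POST to input_service succeeds.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// Hop is a minimal view of one traceroute hop.
type Hop struct {
	TTL     int
	IP      string
	Timeout bool
}

// hopTracker remembers which hop IPs were already handed out for sending.
type hopTracker struct {
	mu   sync.Mutex
	sent map[string]struct{}
}

func newHopTracker() *hopTracker {
	return &hopTracker{sent: make(map[string]struct{})}
}

// unsent returns the non-timeout hop IPs not seen before and marks them as sent.
func (t *hopTracker) unsent(hops []Hop) []string {
	t.mu.Lock()
	defer t.mu.Unlock()
	var fresh []string
	for _, h := range hops {
		if h.Timeout || h.IP == "" {
			continue // skip hops that did not answer
		}
		if _, seen := t.sent[h.IP]; seen {
			continue
		}
		t.sent[h.IP] = struct{}{}
		fresh = append(fresh, h.IP)
	}
	return fresh
}

// hopsPayload builds the JSON body expected by the /hops endpoint.
func hopsPayload(ips []string) ([]byte, error) {
	return json.Marshal(map[string][]string{"hops": ips})
}

func main() {
	tracker := newHopTracker()
	trace := []Hop{
		{TTL: 1, IP: "10.0.0.1"},
		{TTL: 2, Timeout: true},
		{TTL: 3, IP: "172.16.5.3"},
	}
	for round := 1; round <= 2; round++ {
		ips := tracker.unsent(trace)
		if len(ips) == 0 {
			fmt.Println("round", round, "nothing new to send")
			continue
		}
		body, err := hopsPayload(ips)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println("round", round, "would POST", string(body))
	}
}
```

Because the set lives in memory, it is lost on restart, which is consistent with the note that some duplicate submissions are acceptable.
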
## Multi-Instance Deployment

Each output_service instance:
- Maintains its **own SQLite database** in `--db-dir`
- Manages its **own rotation schedule** independently
- Tracks its **own hop deduplication** (some duplicate hop submissions across instances are acceptable)
- Can receive results from **multiple ping_service nodes**

For central data aggregation:
- Use the `/dump` endpoint to collect database files from all instances
- Merge databases offline for analysis/visualization
- Or use shared network storage for `--db-dir` (with file locking considerations)

## Integration with ping_service

Configure ping_service to send results to output_service:

**`config.yaml`** (ping_service):
```yaml
output_file: "http://output-service:8081/results"
```

## Integration with input_service

Output service expects input_service to have a `/hops` endpoint:

**Expected endpoint**: `POST /hops`
**Payload**:
```json
{
  "hops": ["10.0.0.1", "172.16.5.3"]
}
```

## Monitoring

**Check health**:
```bash
curl http://localhost:8091/health
```

**View metrics**:
```bash
curl http://localhost:8091/metrics
```

**Query recent failures**:
```bash
curl 'http://localhost:8091/recent?limit=50' | jq '.[] | select(.error != null)'
```

**Download database backup**:
```bash
curl http://localhost:8081/dump -o backup.db
```

## Development Testing

Use the Python demo output server to see the example data format:

```bash
cd output_service
python3 http_ouput_demo.py # Note: file has typo in name
```

## Graceful Shutdown

Press `Ctrl+C` to trigger a graceful shutdown with a 10-second timeout.

The service will:
1. Stop accepting new requests
2. Finish processing in-flight requests
3. Close database connections cleanly
4. Exit

## Version

Current version: **0.0.1**

## Dependencies

- `github.com/mattn/go-sqlite3` - SQLite driver (requires CGO)