ping_service/manager/README.md
2026-01-08 12:11:26 +02:00

# Ping Service Manager - Control Panel
A secure, self-hosted web application for managing and monitoring distributed ping service infrastructure. Access is protected by TOTP (Time-based One-Time Password) authentication, and stored user data is encrypted in two layers.
## Features
* **🎯 Worker Management:** Register and monitor input, ping, and output service instances
* **📊 Real-time Dashboard:** Live status monitoring with auto-refresh and health checks
* **🔐 Two-Step Verification:** Mandatory TOTP (Google Authenticator, Authy, etc.)
* **🔒 Encrypted Storage:** User data is double-encrypted (AES-GCM) using both a Server Key and User-derived keys
* **🌐 Automatic HTTPS:** Built-in Let's Encrypt (ACME) support
* **🔄 Dynamic DNS (dy.fi):** Integrated updater with multi-instance failover
* **🚨 Security Logging:** `fail2ban`-ready logs to block brute-force attempts
* **🔧 REST Client:** Clean UI to test GET/POST/PUT/DELETE requests with custom headers
* **🛡️ Internet-Ready Hardening:** Rate limiting, security headers, timeout protection, input validation
* **🌉 Gateway Mode:** Proxy for external ping workers - API key auth, load balancing, health-aware routing
## Security Hardening (Internet-Exposed Deployment)
This application is designed to run directly on the internet without a reverse proxy. The following hardening measures are implemented:
### Rate Limiting
- **Authentication endpoints** (`/verify-user`, `/verify-totp`): 10 requests/minute per IP
- **API endpoints**: 100 requests/minute per IP
- Automatic cleanup of rate limiter memory
- Logs `RATE_LIMIT_EXCEEDED` events with source IP
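
The per-IP limits above can be pictured with a small fixed-window limiter. This is only a sketch: the `RateLimiter` type, its methods, the `at` helper, and the exact log wording are hypothetical, and the manager may use a different algorithm (for example a token bucket).

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// window holds the request count for one IP in the current time window.
type window struct {
	start time.Time
	count int
}

// RateLimiter is a hypothetical fixed-window, per-IP limiter.
type RateLimiter struct {
	mu      sync.Mutex
	limit   int
	period  time.Duration
	windows map[string]*window
}

// NewAuthLimiter returns a limiter with the auth-endpoint budget: 10 requests/minute.
func NewAuthLimiter() *RateLimiter {
	return &RateLimiter{limit: 10, period: time.Minute, windows: make(map[string]*window)}
}

// Allow reports whether a request from ip at time now fits within the limit.
func (rl *RateLimiter) Allow(ip string, now time.Time) bool {
	rl.mu.Lock()
	defer rl.mu.Unlock()
	w, ok := rl.windows[ip]
	if !ok || now.Sub(w.start) >= rl.period {
		rl.windows[ip] = &window{start: now, count: 1}
		return true
	}
	if w.count >= rl.limit {
		return false
	}
	w.count++
	return true
}

// Cleanup deletes expired windows so the map does not grow without bound.
func (rl *RateLimiter) Cleanup(now time.Time) {
	rl.mu.Lock()
	defer rl.mu.Unlock()
	for ip, w := range rl.windows {
		if now.Sub(w.start) >= rl.period {
			delete(rl.windows, ip)
		}
	}
}

// at builds a timestamp n seconds after the Unix epoch, for deterministic demos.
func at(n int64) time.Time { return time.Unix(n, 0) }

func main() {
	rl := NewAuthLimiter()
	for i := 1; i <= 11; i++ {
		if !rl.Allow("203.0.113.7", at(int64(i))) {
			// Illustrative log line; the manager's exact wording may differ.
			fmt.Printf("RATE_LIMIT_EXCEEDED: request %d from IP 203.0.113.7\n", i)
		}
	}
}
```

Only the eleventh request inside the same minute is rejected; a periodic call to `Cleanup` corresponds to the automatic memory cleanup mentioned above.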
### HTTP Security Headers
All responses include:
- `Strict-Transport-Security` (HSTS): Force HTTPS for 1 year
- `X-Frame-Options`: Prevent clickjacking (DENY)
- `X-Content-Type-Options`: Prevent MIME sniffing
- `X-XSS-Protection`: Legacy XSS filter for older browsers
- `Content-Security-Policy`: Restrictive CSP to prevent XSS
- `Referrer-Policy`: Control referrer information leakage
- `Permissions-Policy`: Disable unnecessary browser features
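
A middleware along the following lines would attach these headers to every response. It is a sketch, not the manager's code: `SecurityHeaders`, `responseHeaders`, and `okHandler` are hypothetical names, and the `Content-Security-Policy`, `Referrer-Policy`, and `Permissions-Policy` values are assumed examples, since the exact policies are not listed here.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// SecurityHeaders wraps next and sets the hardening headers on every response.
func SecurityHeaders(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := w.Header()
		h.Set("Strict-Transport-Security", "max-age=31536000") // 1 year
		h.Set("X-Frame-Options", "DENY")
		h.Set("X-Content-Type-Options", "nosniff")
		h.Set("X-XSS-Protection", "1; mode=block")
		// The three policy values below are assumed examples.
		h.Set("Content-Security-Policy", "default-src 'self'")
		h.Set("Referrer-Policy", "no-referrer")
		h.Set("Permissions-Policy", "geolocation=(), microphone=(), camera=()")
		next.ServeHTTP(w, r)
	})
}

// okHandler is a trivial stand-in for the real application handler.
func okHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
}

// responseHeaders sends one in-memory GET / to h and returns the response headers.
func responseHeaders(h http.Handler) http.Header {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Result().Header
}

func main() {
	for name, values := range responseHeaders(SecurityHeaders(okHandler())) {
		fmt.Println(name, "=", values)
	}
}
```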
### DoS Protection
- **Request Body Limit**: 10MB maximum
- **Read Timeout**: 15 seconds (headers + body)
- **Write Timeout**: 30 seconds (response)
- **Idle Timeout**: 120 seconds (keep-alive)
- **Read Header Timeout**: 5 seconds (slowloris protection)
- **Max Header Size**: 1MB
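
In Go these limits map directly onto `http.Server` fields and `http.MaxBytesReader`. The following is a minimal sketch using the values listed above; the `newServer` and `limitBody` helpers are hypothetical, and how the real manager wires its handlers is not shown.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

const maxBodyBytes = 10 << 20 // 10 MB request body limit

// limitBody caps how much of each request body a handler can read.
func limitBody(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, maxBodyBytes)
		next.ServeHTTP(w, r)
	})
}

// newServer builds an http.Server with the timeouts and size limits listed above.
func newServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           limitBody(h),
		ReadTimeout:       15 * time.Second,  // headers + body
		ReadHeaderTimeout: 5 * time.Second,   // slowloris protection
		WriteTimeout:      30 * time.Second,  // response
		IdleTimeout:       120 * time.Second, // keep-alive
		MaxHeaderBytes:    1 << 20,           // 1 MB
	}
}

func main() {
	srv := newServer(":8080", http.NotFoundHandler())
	fmt.Println(srv.ReadTimeout, srv.ReadHeaderTimeout, srv.WriteTimeout, srv.IdleTimeout)
}
```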
### TLS Configuration
- Minimum TLS 1.2 enforced
- Strong cipher suites only (ECDHE with AES-GCM and ChaCha20-Poly1305)
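- Server cipher suite preference enabled (Go 1.17 and later ignore `PreferServerCipherSuites` and order the suites automatically)
- Perfect Forward Secrecy (PFS) through ECDHE-only key exchange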
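
A `tls.Config` matching this description could look like the sketch below. The exact suite list is an assumption based on the "ECDHE with AES-GCM and ChaCha20-Poly1305" wording, and `newTLSConfig` and `allForwardSecret` are hypothetical helpers. The list only affects TLS 1.2; Go does not allow TLS 1.3 suites to be configured, and all of them provide forward secrecy.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"strings"
)

// newTLSConfig returns a config that enforces TLS 1.2+ with ECDHE AEAD suites.
func newTLSConfig() *tls.Config {
	return &tls.Config{
		MinVersion: tls.VersionTLS12,
		// Applies to TLS 1.2 only; assumed list, see the note above.
		CipherSuites: []uint16{
			tls.TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
			tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
			tls.TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
			tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
			tls.TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256,
			tls.TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256,
		},
	}
}

// allForwardSecret reports whether every configured suite uses ECDHE key exchange.
func allForwardSecret(cfg *tls.Config) bool {
	for _, id := range cfg.CipherSuites {
		if !strings.HasPrefix(tls.CipherSuiteName(id), "TLS_ECDHE_") {
			return false
		}
	}
	return true
}

func main() {
	cfg := newTLSConfig()
	for _, id := range cfg.CipherSuites {
		fmt.Println(tls.CipherSuiteName(id))
	}
	fmt.Println("forward secrecy for all suites:", allForwardSecret(cfg))
}
```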
### Input Validation
- All user inputs validated for length and content
- Null byte injection protection
- Maximum field lengths enforced
- Sanitization of user IDs and TOTP codes
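
As an illustration of these checks, the sketch below validates a user ID and a TOTP code. The 64-character limit, the allowed character set, and the function names are assumptions; the 6-digit code length matches the default of common authenticator apps but is not stated here.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const maxUserIDLen = 64 // assumed maximum length

// ValidateUserID checks length, null bytes, and an assumed safe character set.
func ValidateUserID(id string) error {
	switch {
	case id == "":
		return errors.New("user ID is empty")
	case len(id) > maxUserIDLen:
		return errors.New("user ID is too long")
	case strings.ContainsRune(id, 0):
		return errors.New("user ID contains a null byte")
	}
	for _, r := range id {
		ok := r >= 'a' && r <= 'z' || r >= 'A' && r <= 'Z' ||
			r >= '0' && r <= '9' || r == '_' || r == '-' || r == '.'
		if !ok {
			return fmt.Errorf("user ID contains invalid character %q", r)
		}
	}
	return nil
}

// ValidateTOTP accepts exactly six ASCII digits.
func ValidateTOTP(code string) error {
	if len(code) != 6 {
		return errors.New("TOTP code must be 6 digits")
	}
	for _, r := range code {
		if r < '0' || r > '9' {
			return errors.New("TOTP code must contain only digits")
		}
	}
	return nil
}

func main() {
	fmt.Println(ValidateUserID("myusername"))  // <nil>
	fmt.Println(ValidateUserID("bad\x00user")) // user ID contains a null byte
	fmt.Println(ValidateTOTP("12345a"))        // TOTP code must contain only digits
}
```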
### Monitoring Endpoint
- Public `/health` endpoint for monitoring systems and dy.fi failover
- Returns JSON: `{"status":"healthy"}`
- Does not require authentication
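
A handler producing this response can be very small. The sketch below is an assumed shape rather than the manager's actual handler; `healthHandler` and `probeHealth` are hypothetical names, and the probe runs in memory so no port is opened.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// healthHandler answers every request with {"status":"healthy"} and needs no auth.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(map[string]string{"status": "healthy"})
}

// probeHealth calls the handler in memory and returns the status code and body.
func probeHealth() (int, string) {
	rec := httptest.NewRecorder()
	healthHandler(rec, httptest.NewRequest(http.MethodGet, "/health", nil))
	return rec.Code, strings.TrimSpace(rec.Body.String())
}

func main() {
	code, body := probeHealth()
	fmt.Println(code, body) // 200 {"status":"healthy"}
}
```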
## Control Panel Features
### Worker Registration & Monitoring
The manager provides a central control panel to register and monitor all your service instances:
- **Input Services** - Track consumer count and IP serving status
- **Ping Services** - Monitor total pings, success/failure rates, uptime
- **Output Services** - View results processed, hops discovered, database size
**🔍 Auto-Discovery**: Workers are detected automatically. Provide only the URL; the manager queries the worker's `/service-info` endpoint to determine the service type and generates a matching name. Both values can be overridden manually.
### Auto Health Checks
- Background health polling every **60 seconds**
- Automatic status detection (Online/Offline)
- Response time tracking
- Service-specific statistics aggregation
- Dashboard auto-refresh every **30 seconds**
### Multi-Instance dy.fi Failover
When running multiple manager instances with dy.fi DNS:
1. **Leader Detection**: Checks where DNS currently points
2. **Health Verification**: Validates if active instance is responding
3. **Automatic Failover**: Takes over DNS if primary instance is down
4. **Standby Mode**: Skips updates when another healthy instance is active
See the dy.fi failover logs for real-time status.
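
The four steps reduce to a small decision made on each update cycle. The sketch below assumes that the instance knows its own public IP, the IP the dy.fi name currently resolves to, and whether the instance behind that IP answered `/health`; the `Action` type and `decide` function are hypothetical.

```go
package main

import "fmt"

// Action is what this instance does in one dy.fi update cycle.
type Action string

const (
	Refresh  Action = "refresh"  // DNS already points here: keep the record fresh
	Standby  Action = "standby"  // another healthy instance is active: skip the update
	TakeOver Action = "takeover" // the active instance is down: claim the record
)

// decide picks the action from the current DNS target and the holder's health.
func decide(dnsIP, ownIP string, holderHealthy bool) Action {
	switch {
	case dnsIP == ownIP:
		return Refresh
	case holderHealthy:
		return Standby
	default:
		return TakeOver
	}
}

func main() {
	fmt.Println(decide("192.0.2.10", "192.0.2.10", true))  // refresh
	fmt.Println(decide("192.0.2.10", "192.0.2.20", true))  // standby
	fmt.Println(decide("192.0.2.10", "192.0.2.20", false)) // takeover
}
```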
### Gateway Mode (Optional)
The manager can act as a gateway/proxy for external ping workers that cannot directly access internal services:
- **External Workers**: Ping services running outside your network (AWS, DigitalOcean, etc.)
- **API Key Authentication**: 256-bit keys with encrypted storage
- **Load Balancing**: Automatic round-robin across healthy input/output services
- **Simple Deployment**: Workers only need manager URL + API key
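
Round-robin over healthy services can be sketched as follows. The `Backend` and `Balancer` types are hypothetical, and the health flags are assumed to come from the manager's health poller.

```go
package main

import (
	"fmt"
	"sync"
)

// Backend is one input or output service known to the gateway.
type Backend struct {
	URL     string
	Healthy bool
}

// Balancer hands out healthy backends in round-robin order.
type Balancer struct {
	mu       sync.Mutex
	backends []Backend
	next     int
}

// Pick returns the next healthy backend URL, or false if none is healthy.
func (b *Balancer) Pick() (string, bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	for i := 0; i < len(b.backends); i++ {
		idx := (b.next + i) % len(b.backends)
		if b.backends[idx].Healthy {
			b.next = (idx + 1) % len(b.backends)
			return b.backends[idx].URL, true
		}
	}
	return "", false
}

func main() {
	lb := &Balancer{backends: []Backend{
		{URL: "http://10.0.0.5:8080", Healthy: true},
		{URL: "http://10.0.0.6:8080", Healthy: false}, // skipped while offline
		{URL: "http://10.0.0.7:8080", Healthy: true},
	}}
	for i := 0; i < 4; i++ {
		url, ok := lb.Pick()
		fmt.Println(url, ok)
	}
}
```
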
**Enable gateway mode:**
```bash
sudo ./manager --port=443 --domain=example.dy.fi --enable-gateway
```
**Gateway endpoints** (for external workers):
- `GET /api/gateway/target` - Get next IP to ping
- `POST /api/gateway/result` - Submit ping/traceroute results
**Management endpoints** (admin only):
- `POST /api/apikeys/generate` - Generate new API key
- `GET /api/apikeys/list` - List all API keys
- `DELETE /api/apikeys/revoke` - Revoke API key
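
For orientation, a 256-bit key is 32 random bytes. The sketch below generates one and compares keys in constant time; the hex encoding and the function names are assumptions, and the encrypted storage step is not shown.

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// GenerateAPIKey returns 32 random bytes (256 bits) as a 64-character hex string.
func GenerateAPIKey() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// keysEqual compares two keys without leaking where they first differ.
func keysEqual(a, b string) bool {
	return subtle.ConstantTimeCompare([]byte(a), []byte(b)) == 1
}

func main() {
	key, err := GenerateAPIKey()
	if err != nil {
		panic(err)
	}
	fmt.Println(len(key), "hex characters")
	fmt.Println("matches itself:", keysEqual(key, key))
}
```
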
See [GATEWAY.md](GATEWAY.md) for detailed documentation.
## Quick Start
### 1. Installation
```bash
go mod tidy
go build -o manager
```
### 2. Configuration
The application uses environment variables for sensitive data. Create a `.env` file or export them:
```bash
export SERVER_KEY="your-32-byte-base64-key" # Generated on first run if missing
export DYFI_DOMAIN="example.dy.fi"
export DYFI_USER="your-email@example.com"
export DYFI_PASS="dyfi-password"
export ACME_EMAIL="admin@example.com"
export LOG_FILE="/var/log/twostepauth.log"
```
### 3. Add a User
Run the application in CLI mode to generate a new user and their TOTP QR code:
```bash
go run . --add-user=myusername
```
*Scan the QR code printed in the terminal with your authenticator app.*
### 4. Run the Server
**Production (Port 443 with Let's Encrypt):**
```bash
sudo go run . --port=443 --domain=example.dy.fi
```
**Development (Localhost with Self-Signed Certs):**
```bash
go run . --port=8080
```
### 5. Access the Control Panel
1. Navigate to `https://localhost:8080` (or your domain)
2. Log in with your user ID and TOTP code
3. You'll be redirected to the **Dashboard**
4. Click **"Add Worker"** to register your service instances
### 6. Register Workers
From the dashboard, click **"Add Worker"** and provide:
- **Worker Name**: e.g., "Input Service EU-1"
- **Worker Type**: `input`, `ping`, or `output`
- **Base URL**: e.g., `http://10.0.0.5:8080`
- **Location** (optional): e.g., "Helsinki, Finland"
- **Description** (optional): e.g., "Raspberry Pi 4"
The health poller will automatically start checking the worker's status every 60 seconds.
## Fail2Ban Integration
The app logs `AUTH_FAILURE` and `RATE_LIMIT_EXCEEDED` events with the source IP. To enable automatic blocking:
**Filter (`/etc/fail2ban/filter.d/twostepauth.conf`):**
```ini
[Definition]
failregex = AUTH_FAILURE: .* from IP <HOST>
            RATE_LIMIT_EXCEEDED: .* from IP <HOST>
ignoreregex =
```
**Jail (`/etc/fail2ban/jail.d/twostepauth.local`):**
```ini
[twostepauth]
enabled = true
port = 80,443
filter = twostepauth
logpath = /var/log/twostepauth.log
maxretry = 5
# Ban for 1 hour
bantime = 3600
# Count failures in the last 10 minutes
findtime = 600
```
**Note**: The application already implements rate limiting (10 auth requests/minute), but fail2ban provides an additional layer by blocking persistent attackers at the firewall level.
## API Endpoints
### Dashboard & UI
- `GET /` - Login page
- `GET /dashboard` - Worker monitoring control panel (requires auth)
- `GET /rest-client` - REST API testing tool (requires auth)
### Worker Management API
All API endpoints require authentication.
- `POST /api/workers/register` - Register a new worker instance
- `GET /api/workers/list` - List all registered workers
- `GET /api/workers/get?id={id}` - Get specific worker details
- `DELETE /api/workers/remove?id={id}` - Remove a worker
**Example: Register a worker**
```bash
# -k accepts the development server's self-signed certificate; omit it for a Let's Encrypt domain
curl -k -X POST https://localhost:8080/api/workers/register \
  -H "Cookie: auth_session=..." \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Ping Service 1",
    "type": "ping",
    "url": "http://10.0.0.10:8090",
    "location": "Helsinki",
    "description": "Primary ping worker"
  }'
```
### REST Client API
- `POST /api/request` - Make authenticated HTTP requests (requires auth)
## Dashboard Statistics
The control panel displays:
- **Total Workers**: Count of all registered instances
- **Healthy/Unhealthy**: Status breakdown
- **Total Pings**: Aggregated across all ping services
- **Total Results**: Aggregated across all output services
Per-worker details include:
- Online/Offline status with visual indicators
- Response time in milliseconds
- Last health check timestamp
- Service-specific metrics (consumers, pings, hops discovered, etc.)
- Error messages for failed health checks
## Data Persistence
- **User Data**: `users_data` (encrypted)
- **Worker Registry**: `workers_data.json`
- **TLS Certificates**: `cert.pem` / `key.pem` (self-signed) or `certs_cache/` (Let's Encrypt)
- **Logs**: Path configured via the `--log` flag (the Quick Start sets it through the `LOG_FILE` environment variable)
## Security Architecture
1. **Server Key:** Encrypts the entire user database file
2. **User Key:** Derived from the User ID and Server Key via PBKDF2; encrypts individual user TOTP secrets
3. **Session Security:** Session IDs are encrypted with the Server Key before being stored in a `Secure`, `HttpOnly`, `SameSite=Strict` cookie
4. **TLS:** Minimum version TLS 1.2 enforced
5. **Worker Health Checks:** Accept self-signed certificates (InsecureSkipVerify) for internal service communication
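
The two encryption layers can be sketched with Go's standard AES-GCM implementation. Two caveats: `deriveUserKey` uses HMAC-SHA256 purely as a stand-in so the example stays within the standard library, whereas the manager derives the user key with PBKDF2 and its parameters are not documented here; and the `seal`/`open` helpers and the nonce-prefix layout are assumptions.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
)

// deriveUserKey is a stand-in (HMAC-SHA256); the manager uses PBKDF2 instead.
func deriveUserKey(serverKey []byte, userID string) []byte {
	mac := hmac.New(sha256.New, serverKey)
	mac.Write([]byte(userID))
	return mac.Sum(nil) // 32 bytes, usable as an AES-256 key
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext and prepends the random nonce to the result.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open reverses seal and fails if the data was modified or the key is wrong.
func open(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	serverKey := make([]byte, 32)
	if _, err := rand.Read(serverKey); err != nil {
		panic(err)
	}
	userKey := deriveUserKey(serverKey, "myusername")

	inner, err := seal(userKey, []byte("EXAMPLE-TOTP-SECRET")) // layer 1: user key
	if err != nil {
		panic(err)
	}
	outer, err := seal(serverKey, inner) // layer 2: server key
	if err != nil {
		panic(err)
	}

	unwrapped, err := open(serverKey, outer)
	if err != nil {
		panic(err)
	}
	secret, err := open(userKey, unwrapped)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(secret)) // EXAMPLE-TOTP-SECRET
}
```

Because GCM is authenticated, any change to the stored bytes makes `open` return an error instead of corrupted plaintext.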
## Requirements
* Go 1.21+
* Port 80/443 open (if using Let's Encrypt)
* Root privileges (if binding to ports < 1024 on Linux)