Distributed Internet Network Mapping System
A distributed system for continuously mapping internet routes through coordinated ping operations, traceroute analysis, and organic target discovery across geographically diverse nodes. The system builds an evolving graph of internet paths by bootstrapping from cloud provider IPs and recursively discovering intermediate network hops.
Architecture Overview
The system consists of four interconnected services that work together to discover, probe, and map internet routing paths:
┌─────────────────┐
│ Input Service │ ──── Serves IPs with subnet interleaving
└────────┬────────┘ Accepts discovered hops
│
▼
┌─────────────────┐
│ Ping Service │ ──── Distributed workers ping targets
│ (Workers) │ Runs traceroute on successes
└────────┬────────┘
│
▼
┌─────────────────┐
│ Output Service │ ──── Stores results in SQLite
└────────┬────────┘ Extracts intermediate hops
│ Feeds back to input service
│
▼
┌─────────────────┐
│ Manager │ ──── Web UI and control plane
└─────────────────┘ Worker monitoring and coordination
Design Philosophy
- Fault Tolerant: Nodes can join/leave freely; partial failures are expected
- Network Realistic: Designed for imperfect infrastructure (NAT, 4G, consumer hardware)
- Organic Growth: System learns by discovering hops and feeding them back as targets
- Multi-Instance Ready: All services designed to run with multiple instances in production
- No Time Guarantees: Latency variations are normal; workers are not assumed to be always online
Services
1. Input Service (input_service/)
HTTP service that intelligently feeds IP addresses to ping workers.
Key Features:
- Subnet interleaving (10-CIDR rotation) to avoid consecutive IPs from same subnet
- Per-consumer state tracking to prevent duplicate work
- Lazy CIDR expansion for memory efficiency
- Hop discovery feedback loop from output service
- Persistent state (export/import capability)
- IPv4 filtering with global deduplication
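The subnet interleaving above can be illustrated with a minimal sketch. The real service rotates across 10 CIDR queues with lazy expansion; this simplified version (the `interleave` helper is hypothetical, not the service's actual API) just shows the round-robin idea of never serving two consecutive IPs from the same subnet while any other subnet still has targets:

```go
package main

import "fmt"

// interleave performs round-robin passes over per-subnet queues,
// emitting at most one IP from each queue per rotation.
func interleave(queues [][]string) []string {
	var out []string
	for remaining := true; remaining; {
		remaining = false
		for i := range queues {
			if len(queues[i]) > 0 {
				out = append(out, queues[i][0])
				queues[i] = queues[i][1:]
				remaining = true
			}
		}
	}
	return out
}

func main() {
	// Three toy subnet queues; the real service rotates across 10 CIDRs.
	queues := [][]string{
		{"10.0.0.1", "10.0.0.2"},
		{"192.168.1.1", "192.168.1.2"},
		{"172.16.0.1"},
	}
	fmt.Println(interleave(queues))
	// Consecutive entries alternate subnets until a queue runs dry.
}
```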
Endpoints:
- GET / - Serve next IP address to worker
- POST /hops - Accept discovered hops from output service
- GET /status - Service health and statistics
- GET /export - Export current state
- POST /import - Import saved state
- GET /service-info - Service discovery metadata
Multi-Instance: Each instance maintains per-consumer state; use session affinity for clients.
[More details in input_service/README.md]
2. Ping Service (ping_service.go)
Distributed worker agents that execute ping and traceroute operations.
Key Features:
- ICMP and TCP ping support
- Per-IP cooldown enforcement to prevent excessive pinging
- Optional traceroute (ICMP/TCP) on successful pings
- Structured JSON output format
- Health/metrics/readiness endpoints
- Designed for unattended operation under systemd
Configuration: config.yaml - supports file/HTTP/Unix socket for input/output
Multi-Instance: Fully distributed; multiple workers can ping the same targets (cooldown prevents excessive frequency).
[More details in ping_service_README.md]
3. Output Service (output_service/)
HTTP service that receives, stores, and processes ping/traceroute results.
Key Features:
- SQLite storage with automatic rotation (weekly OR 100MB limit)
- Extracts intermediate hops from traceroute data
- Hop deduplication before forwarding to input service
- Remote database dumps for aggregation
- Prometheus metrics and health checks
- Keeps 5 most recent database files
Endpoints:
- POST /results - Receive ping results from workers
- GET /health - Service health and statistics
- GET /metrics - Prometheus metrics
- GET /stats - Detailed processing statistics
- GET /recent?limit=100&ip=8.8.8.8 - Query recent results
- GET /dump - Download current database
- POST /rotate - Manually trigger database rotation
- GET /service-info - Service discovery metadata
Multi-Instance: Each instance maintains its own SQLite database; use /dump for central aggregation.
[More details in output_service/README.md]
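The hop deduplication step above can be sketched as a simple seen-set filter. This is an assumed model of the behavior (the `hopDeduper` type is hypothetical): only hop IPs that have not been forwarded before are passed back to the input service.

```go
package main

import "fmt"

// hopDeduper filters out hop IPs that were already forwarded.
type hopDeduper struct {
	seen map[string]bool
}

// Filter returns only previously unseen hops and marks them as seen.
func (d *hopDeduper) Filter(hops []string) []string {
	var fresh []string
	for _, h := range hops {
		if !d.seen[h] {
			d.seen[h] = true
			fresh = append(fresh, h)
		}
	}
	return fresh
}

func main() {
	d := &hopDeduper{seen: make(map[string]bool)}
	// First traceroute: two new hops, one repeated within the batch.
	fmt.Println(d.Filter([]string{"1.1.1.1", "2.2.2.2", "1.1.1.1"}))
	// Second traceroute: only the hop not seen before survives.
	fmt.Println(d.Filter([]string{"2.2.2.2", "3.3.3.3"}))
}
```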
4. Manager (manager/)
Centralized web UI and control plane with TOTP authentication.
Key Features:
- Web dashboard for system observation and control
- TOTP two-factor authentication
- Worker registration and health monitoring (60s polling)
- Let's Encrypt ACME support for production SSL
- Dynamic DNS (dy.fi) integration with multi-instance failover
- Double-encrypted user store (AES-GCM)
- Fail2ban-ready security logging
- Optional gateway/proxy mode for external workers
- API key management for gateway authentication
- Service auto-discovery via /service-info endpoints
Security: Rate limiting, encrypted storage, audit logging, API keys for gateway mode.
[More details in manager/README.md and manager/GATEWAY.md]
Quick Start
Building All Services
# Build everything with one command
make
# Or build individually
make ping-service
make input-service
make output-service
make manager
# Clean built binaries
make clean
Running the System
# 1. Start input service (serves on :8080)
cd input_service
./http_input_service
# 2. Start output service (results on :8081, health on :8091)
cd output_service
./output_service --verbose
# 3. Start ping workers (as many as you want)
./ping_service -config config.yaml -verbose
# 4. Start manager (development mode)
cd manager
go run . --port=8080
# Or production mode with Let's Encrypt
sudo go run . --port=443 --domain=example.dy.fi --email=admin@example.com
Installing Ping Service as Systemd Service
chmod +x install.sh
sudo ./install.sh
sudo systemctl start ping-service
sudo systemctl status ping-service
Configuration
Ping Service (config.yaml)
input_file: "http://localhost:8080" # IP source
output_file: "http://localhost:8081/results" # Results destination
interval_seconds: 30 # Poll interval
cooldown_minutes: 10 # Per-IP cooldown
enable_traceroute: true # Enable traceroute
traceroute_max_hops: 30 # Max TTL
health_check_port: 8090 # Health server port
Output Service (CLI Flags)
--port=8081 # Results receiving port
--health-port=8091 # Health/metrics port
--input-url=http://localhost:8080/hops # Hop feedback URL
--db-dir=./output_data # Database directory
--max-size-mb=100 # DB size rotation trigger
--rotation-days=7 # Time-based rotation
--keep-files=5 # Number of DBs to keep
--verbose # Enable verbose logging
Manager (Environment Variables)
SERVER_KEY=<base64-key> # 32-byte encryption key (auto-generated)
DYFI_DOMAIN=example.dy.fi # Dynamic DNS domain
DYFI_USER=username # dy.fi username
DYFI_PASS=password # dy.fi password
ACME_EMAIL=admin@example.com # Let's Encrypt email
LOG_FILE=/var/log/manager-auth.log # fail2ban log path
MANAGER_PORT=8080 # HTTP/HTTPS port
Data Flow
- Bootstrap: Input service loads ~19,000 cloud provider IPs from CIDR ranges
- Distribution: Ping workers poll input service for targets (subnet-interleaved)
- Execution: Workers ping targets with cooldown enforcement
- Discovery: Successful pings trigger traceroute to discover intermediate hops
- Storage: Results sent to output service, stored in SQLite
- Extraction: Output service extracts new hops from traceroute data
- Feedback: Discovered hops fed back to input service as new targets
- Growth: System organically expands target pool over time
- Monitoring: Manager provides visibility and control
Service Discovery
All services expose a /service-info endpoint that returns service type, version, capabilities, and instance ID. This enables:
- Automatic worker type detection in manager
- Zero-config worker registration (just provide URL)
- Service identification for monitoring and debugging
Health Monitoring
Each service exposes health endpoints for monitoring:
- GET /health - Status, uptime, statistics
- GET /ready - Readiness check
- GET /metrics - Prometheus-compatible metrics
- GET /service-info - Service metadata
Dependencies
Ping Service
- github.com/go-ping/ping - ICMP ping library
- gopkg.in/yaml.v3 - YAML configuration
- Go 1.25.0+
Output Service
- github.com/mattn/go-sqlite3 - SQLite driver (requires CGO)
- Go 1.25.0+
Manager
- github.com/pquerna/otp - TOTP authentication
- golang.org/x/crypto/acme/autocert - Let's Encrypt integration
- Go 1.25.0+
Project Status
Current State:
- Functional distributed ping + traceroute workers
- Input service with persistent state and lazy CIDR expansion
- Output service with SQLite storage, rotation, and hop extraction
- Complete feedback loop (discovered hops become new targets)
- Manager with TOTP auth, encryption, SSL, and worker monitoring
Future Work:
- Data visualization and mapping interface
- Analytics and pattern detection
- BGP AS number integration
- Geographic correlation
Security Features
- TOTP two-factor authentication on manager
- Double-encrypted user storage (AES-GCM)
- Let's Encrypt automatic SSL certificate management
- fail2ban integration for brute force protection
- Rate limiting and session management
- API key authentication for gateway mode
Deployment Considerations
Multi-Instance Production
- All services designed to run with multiple instances
- Input service: Use session affinity, or call /hops on all instances
- Output service: Each instance maintains a separate database; aggregate via /dump
- Ping service: Fully distributed; cooldown prevents excessive overlap
- Manager: Requires external session store for multi-instance (currently in-memory)
Network Requirements
- Ping workers need ICMP (raw socket) permissions
- Input/output services should be reachable by ping workers
- Manager can run behind NAT with gateway mode for external workers
- Let's Encrypt requires port 80/443 accessible from internet
Documentation
- CLAUDE.md - Comprehensive project documentation and guidance
- MULTI_INSTANCE.md - Multi-instance deployment guide with production strategies
- ping_service_README.md - Ping service details
- input_service/README.md - Input service details
- output_service/README.md - Output service details
- manager/README.md - Manager details
- manager/GATEWAY.md - Gateway mode documentation
License
[Specify your license here]
Contributing
[Specify contribution guidelines here]