Distributed Internet Network Mapping System

A distributed system for continuously mapping internet routes through coordinated ping operations, traceroute analysis, and organic target discovery across geographically diverse nodes. The system builds an evolving graph of internet paths by bootstrapping from cloud provider IPs and recursively discovering intermediate network hops.

Architecture Overview

The system consists of four interconnected services that work together to discover, probe, and map internet routing paths:

┌─────────────────┐
│  Input Service  │ ──── Serves IPs with subnet interleaving
└────────┬────────┘      Accepts discovered hops
         │
         ▼
┌─────────────────┐
│  Ping Service   │ ──── Distributed workers ping targets
│   (Workers)     │      Runs traceroute on successes
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Output Service  │ ──── Stores results in SQLite
└────────┬────────┘      Extracts intermediate hops
         │                Feeds back to input service
         │
         ▼
┌─────────────────┐
│    Manager      │ ──── Web UI and control plane
└─────────────────┘      Worker monitoring and coordination

Design Philosophy

  • Fault Tolerant: Nodes can join/leave freely; partial failures are expected
  • Network Realistic: Designed for imperfect infrastructure (NAT, 4G, consumer hardware)
  • Organic Growth: System learns by discovering hops and feeding them back as targets
  • Multi-Instance Ready: All services designed to run with multiple instances in production
  • No Time Guarantees: Latency variations are normal, and workers are not assumed to be always online

Services

1. Input Service (input_service/)

HTTP service that hands out IP addresses to ping workers, interleaving subnets and tracking what each consumer has already received.

Key Features:

  • Subnet interleaving (10-CIDR rotation) to avoid serving consecutive IPs from the same subnet
  • Per-consumer state tracking to prevent duplicate work
  • Lazy CIDR expansion for memory efficiency
  • Hop discovery feedback loop from output service
  • Persistent state (export/import capability)
  • IPv4 filtering with global deduplication
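
To make the interleaving and lazy expansion ideas concrete, here is a minimal Go sketch. It is not the input service's actual code: the Interleaver and cidrCursor types are hypothetical, and it rotates across every CIDR it is given rather than a fixed window of 10. Each prefix keeps only its next address, so no range is ever expanded into memory.

```go
package main

import (
	"fmt"
	"net/netip"
)

// cidrCursor walks one prefix lazily: it stores only the next address,
// never the fully expanded list. (Hypothetical type for illustration.)
type cidrCursor struct {
	prefix netip.Prefix
	next   netip.Addr
}

// Interleaver hands out addresses round-robin across its cursors so that
// consecutive results come from different subnets.
type Interleaver struct {
	cursors []*cidrCursor
	pos     int
}

// NewInterleaver parses the CIDR strings and positions each cursor at the
// first address of its prefix.
func NewInterleaver(cidrs []string) (*Interleaver, error) {
	il := &Interleaver{}
	for _, c := range cidrs {
		p, err := netip.ParsePrefix(c)
		if err != nil {
			return nil, fmt.Errorf("parse %q: %w", c, err)
		}
		p = p.Masked()
		il.cursors = append(il.cursors, &cidrCursor{prefix: p, next: p.Addr()})
	}
	return il, nil
}

// Next returns the next address, or false once every prefix is exhausted.
func (il *Interleaver) Next() (netip.Addr, bool) {
	for len(il.cursors) > 0 {
		if il.pos >= len(il.cursors) {
			il.pos = 0
		}
		c := il.cursors[il.pos]
		if c.next.IsValid() && c.prefix.Contains(c.next) {
			addr := c.next
			c.next = addr.Next()
			il.pos++
			return addr, true
		}
		// Prefix exhausted: drop it and retry at the same position.
		il.cursors = append(il.cursors[:il.pos], il.cursors[il.pos+1:]...)
	}
	return netip.Addr{}, false
}

func main() {
	il, err := NewInterleaver([]string{"192.0.2.0/30", "198.51.100.0/31"})
	if err != nil {
		panic(err)
	}
	for addr, ok := il.Next(); ok; addr, ok = il.Next() {
		fmt.Println(addr)
	}
}
```

With the two documentation prefixes above, the addresses alternate between subnets until the smaller one runs out. The sketch also emits network and broadcast addresses; a real service may well skip them.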

Endpoints:

  • GET / - Serve next IP address to worker
  • POST /hops - Accept discovered hops from output service
  • GET /status - Service health and statistics
  • GET /export - Export current state
  • POST /import - Import saved state
  • GET /service-info - Service discovery metadata

Multi-Instance: Each instance maintains per-consumer state; use session affinity for clients.

[More details in input_service/README.md]

2. Ping Service (ping_service.go)

Distributed worker agents that execute ping and traceroute operations.

Key Features:

  • ICMP and TCP ping support
  • Per-IP cooldown enforcement to prevent excessive pinging
  • Optional traceroute (ICMP/TCP) on successful pings
  • Structured JSON output format
  • Health/metrics/readiness endpoints
  • Designed for unattended operation under systemd

Configuration: config.yaml. Input and output can each be a file, an HTTP URL, or a Unix socket.

Multi-Instance: Fully distributed; multiple workers can ping the same targets (cooldown prevents excessive frequency).

[More details in ping_service_README.md]

3. Output Service (output_service/)

HTTP service that receives, stores, and processes ping/traceroute results.

Key Features:

  • SQLite storage with automatic rotation (weekly or at 100 MB, whichever comes first)
  • Extracts intermediate hops from traceroute data
  • Hop deduplication before forwarding to input service
  • Remote database dumps for aggregation
  • Prometheus metrics and health checks
  • Keeps 5 most recent database files
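
Hop extraction and deduplication could look roughly like the Go sketch below. The Hop struct and ExtractNewHops function are hypothetical, and the filter that drops private and loopback addresses is an assumption added for illustration; the source only states that hops are extracted and deduplicated before forwarding.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Hop is a hypothetical shape for one traceroute hop.
type Hop struct {
	TTL int
	IP  string // empty when the hop did not answer
}

// ExtractNewHops returns the intermediate IPv4 hops that are not yet in
// seen, in path order, and marks them as seen. The target itself is
// excluded because it is already a known address.
func ExtractNewHops(target string, hops []Hop, seen map[string]struct{}) []string {
	var out []string
	for _, h := range hops {
		addr, err := netip.ParseAddr(h.IP)
		if err != nil || !addr.Is4() {
			continue // timed-out hop, garbage, or IPv6
		}
		// Assumption: private and loopback hops are not useful targets.
		if addr.IsPrivate() || addr.IsLoopback() {
			continue
		}
		ip := addr.String()
		if ip == target {
			continue
		}
		if _, dup := seen[ip]; dup {
			continue
		}
		seen[ip] = struct{}{}
		out = append(out, ip)
	}
	return out
}

func main() {
	seen := make(map[string]struct{})
	trace := []Hop{
		{TTL: 1, IP: "192.168.1.1"},
		{TTL: 2, IP: "198.51.100.7"},
		{TTL: 3, IP: ""},
		{TTL: 4, IP: "198.51.100.7"},
		{TTL: 5, IP: "203.0.113.9"},
	}
	fmt.Println(ExtractNewHops("203.0.113.9", trace, seen))
}
```

The seen map here lives in one process, which matches the note that each output service instance works independently; deduplication across instances would have to happen downstream, for example in the input service.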

Endpoints:

  • POST /results - Receive ping results from workers
  • GET /health - Service health and statistics
  • GET /metrics - Prometheus metrics
  • GET /stats - Detailed processing statistics
  • GET /recent?limit=100&ip=8.8.8.8 - Query recent results
  • GET /dump - Download current database
  • POST /rotate - Manually trigger database rotation
  • GET /service-info - Service discovery metadata

Multi-Instance: Each instance maintains its own SQLite database; use /dump for central aggregation.

[More details in output_service/README.md]

4. Manager (manager/)

Centralized web UI and control plane with TOTP authentication.

Key Features:

  • Web dashboard for system observation and control
  • TOTP two-factor authentication
  • Worker registration and health monitoring (60s polling)
  • Let's Encrypt ACME support for production SSL
  • Dynamic DNS (dy.fi) integration with multi-instance failover
  • Double-encrypted user store (AES-GCM)
  • Fail2ban-ready security logging
  • Optional gateway/proxy mode for external workers
  • API key management for gateway authentication
  • Service auto-discovery via /service-info endpoints

Security: Rate limiting, encrypted storage, audit logging, API keys for gateway mode.

[More details in manager/README.md and manager/GATEWAY.md]

Quick Start

Building All Services

# Build everything with one command
make

# Or build individually
make ping-service
make input-service
make output-service
make manager

# Clean built binaries
make clean

Running the System

# 1. Start input service (serves on :8080)
cd input_service
./http_input_service

# 2. Start output service (results on :8081, health on :8091)
cd output_service
./output_service --verbose

# 3. Start ping workers (as many as you want)
./ping_service -config config.yaml -verbose

# 4. Start manager (development mode). Port 8080 is also the input service port above; pick a different port if both run on one host
cd manager
go run . --port=8080

# Or production mode with Let's Encrypt
sudo go run . --port=443 --domain=example.dy.fi --email=admin@example.com

Installing Ping Service as Systemd Service

chmod +x install.sh
sudo ./install.sh
sudo systemctl start ping-service
sudo systemctl status ping-service

Configuration

Ping Service (config.yaml)

input_file: "http://localhost:8080"   # IP source
output_file: "http://localhost:8081/results"  # Results destination
interval_seconds: 30                  # Poll interval
cooldown_minutes: 10                  # Per-IP cooldown
enable_traceroute: true               # Enable traceroute
traceroute_max_hops: 30               # Max TTL
health_check_port: 8090               # Health server port

Output Service (CLI Flags)

--port=8081            # Results receiving port
--health-port=8091     # Health/metrics port
--input-url=http://localhost:8080/hops  # Hop feedback URL
--db-dir=./output_data # Database directory
--max-size-mb=100      # DB size rotation trigger
--rotation-days=7      # Time-based rotation
--keep-files=5         # Number of DBs to keep
--verbose              # Enable verbose logging

Manager (Environment Variables)

SERVER_KEY=<base64-key>      # 32-byte encryption key (auto-generated)
DYFI_DOMAIN=example.dy.fi    # Dynamic DNS domain
DYFI_USER=username           # dy.fi username
DYFI_PASS=password           # dy.fi password
ACME_EMAIL=admin@example.com # Let's Encrypt email
LOG_FILE=/var/log/manager-auth.log  # fail2ban log path
MANAGER_PORT=8080            # HTTP/HTTPS port

Data Flow

  1. Bootstrap: Input service loads ~19,000 cloud provider IPs from CIDR ranges
  2. Distribution: Ping workers poll input service for targets (subnet-interleaved)
  3. Execution: Workers ping targets with cooldown enforcement
  4. Discovery: Successful pings trigger traceroute to discover intermediate hops
  5. Storage: Results sent to output service, stored in SQLite
  6. Extraction: Output service extracts new hops from traceroute data
  7. Feedback: Discovered hops fed back to input service as new targets
  8. Growth: System organically expands target pool over time
  9. Monitoring: Manager provides visibility and control

Service Discovery

All services expose a /service-info endpoint that returns service type, version, capabilities, and instance ID. This enables:

  • Automatic worker type detection in manager
  • Zero-config worker registration (just provide URL)
  • Service identification for monitoring and debugging

Health Monitoring

Each service exposes health endpoints for monitoring:

  • GET /health - Status, uptime, statistics
  • GET /ready - Readiness check
  • GET /metrics - Prometheus-compatible metrics
  • GET /service-info - Service metadata

Dependencies

Ping Service

  • github.com/go-ping/ping - ICMP ping library
  • gopkg.in/yaml.v3 - YAML configuration
  • Go 1.25.0+

Output Service

  • github.com/mattn/go-sqlite3 - SQLite driver (requires CGO)
  • Go 1.25.0+

Manager

  • github.com/pquerna/otp - TOTP authentication
  • golang.org/x/crypto/acme/autocert - Let's Encrypt integration
  • Go 1.25.0+

Project Status

Current State:

  • Functional distributed ping + traceroute workers
  • Input service with persistent state and lazy CIDR expansion
  • Output service with SQLite storage, rotation, and hop extraction
  • Complete feedback loop (discovered hops become new targets)
  • Manager with TOTP auth, encryption, SSL, and worker monitoring

Future Work:

  • Data visualization and mapping interface
  • Analytics and pattern detection
  • BGP AS number integration
  • Geographic correlation

Security Features

  • TOTP two-factor authentication on manager
  • Double-encrypted user storage (AES-GCM)
  • Let's Encrypt automatic SSL certificate management
  • fail2ban integration for brute force protection
  • Rate limiting and session management
  • API key authentication for gateway mode

Deployment Considerations

Multi-Instance Production

  • All services designed to run with multiple instances
  • Input service: Use session affinity or call /hops on all instances
  • Output service: Each instance maintains separate database; aggregate via /dump
  • Ping service: Fully distributed; cooldown prevents excessive overlap
  • Manager: Sessions are currently held in memory, so running multiple instances requires an external session store

Network Requirements

  • Ping workers need ICMP (raw socket) permissions
  • Input/output services should be reachable by ping workers
  • Manager can run behind NAT with gateway mode for external workers
  • Let's Encrypt validation requires port 80 or 443 to be reachable from the internet

Documentation

  • CLAUDE.md - Comprehensive project documentation and guidance
  • ping_service_README.md - Ping service details
  • input_service/README.md - Input service details
  • output_service/README.md - Output service details
  • manager/README.md - Manager details
  • manager/GATEWAY.md - Gateway mode documentation

License

[Specify your license here]

Contributing

[Specify contribution guidelines here]