# KVS - Distributed Key-Value Store A minimalistic, clustered key-value database system written in Go that prioritizes **availability** and **partition tolerance** over strong consistency. KVS implements a gossip-style membership protocol with sophisticated conflict resolution for eventually consistent distributed storage. ## ๐Ÿš€ Key Features - **Hierarchical Keys**: Support for structured paths (e.g., `/home/room/closet/socks`) - **Eventual Consistency**: Local operations are fast, replication happens in background - **Merkle Tree Sync**: Efficient data synchronization with cryptographic integrity - **Sophisticated Conflict Resolution**: Oldest-node rule with UUID tie-breaking - **JWT Authentication**: Full authentication system with POSIX-inspired permissions - **Local-First Truth**: All operations work locally first, sync globally later - **Read-Only Mode**: Configurable mode for reducing write load - **Modular Architecture**: Clean separation of concerns with feature toggles - **Comprehensive Features**: TTL support, rate limiting, tamper logging, automated backups - **Zero External Dependencies**: Single binary with embedded BadgerDB storage ## ๐Ÿ—๏ธ Architecture ``` โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ Node A โ”‚ โ”‚ Node B โ”‚ โ”‚ Node C โ”‚ โ”‚ (Go Service) โ”‚ โ”‚ (Go Service) โ”‚ โ”‚ (Go Service) โ”‚ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚ โ”‚ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ โ”‚ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ โ”‚ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ โ”‚ โ”‚HTTP API+Authโ”‚ โ”‚โ—„โ”€โ”€โ–บโ”‚ โ”‚HTTP API+Authโ”‚ โ”‚โ—„โ”€โ”€โ–บโ”‚ โ”‚HTTP API+Authโ”‚ โ”‚ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ”‚ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ โ”‚ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ โ”‚ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ โ”‚ โ”‚ Gossip โ”‚ โ”‚โ—„โ”€โ”€โ–บโ”‚ โ”‚ Gossip โ”‚ โ”‚โ—„โ”€โ”€โ–บโ”‚ โ”‚ Gossip โ”‚ โ”‚ โ”‚ โ”‚ Protocol โ”‚ โ”‚ โ”‚ โ”‚ Protocol โ”‚ โ”‚ โ”‚ โ”‚ Protocol โ”‚ โ”‚ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ”‚ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ โ”‚ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ โ”‚ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ โ”‚ โ”‚Merkle Sync โ”‚ โ”‚โ—„โ”€โ”€โ–บโ”‚ โ”‚Merkle Sync โ”‚ โ”‚โ—„โ”€โ”€โ–บโ”‚ โ”‚Merkle Sync โ”‚ โ”‚ โ”‚ โ”‚& Conflict โ”‚ โ”‚ โ”‚ โ”‚& Conflict โ”‚ โ”‚ โ”‚ โ”‚& Conflict โ”‚ โ”‚ โ”‚ โ”‚ Resolution โ”‚ โ”‚ โ”‚ โ”‚ Resolution โ”‚ โ”‚ โ”‚ โ”‚ Resolution โ”‚ โ”‚ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ”‚ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ โ”‚ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ โ”‚ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ โ”‚ โ”‚Storage+ โ”‚ โ”‚ โ”‚ โ”‚Storage+ โ”‚ โ”‚ โ”‚ โ”‚Storage+ โ”‚ โ”‚ โ”‚ โ”‚Features โ”‚ โ”‚ โ”‚ โ”‚Features โ”‚ โ”‚ โ”‚ โ”‚Features โ”‚ โ”‚ โ”‚ โ”‚(BadgerDB) โ”‚ โ”‚ โ”‚ โ”‚(BadgerDB) โ”‚ โ”‚ โ”‚ โ”‚(BadgerDB) โ”‚ โ”‚ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ–ฒ โ”‚ External Clients (JWT Auth) ``` ### Modular Design KVS features a clean modular architecture with dedicated packages: - **`auth/`** - JWT authentication and POSIX-inspired permissions - **`cluster/`** - Gossip protocol, Merkle tree sync, and conflict resolution - **`storage/`** - BadgerDB abstraction with compression and revisions - **`server/`** - HTTP API, routing, and lifecycle management - **`features/`** - TTL, rate limiting, tamper logging, backup utilities - **`config/`** - Configuration management with auto-generation ## ๐Ÿ“ฆ Installation ### Prerequisites - Go 1.21 or higher ### Build from Source ```bash git clone cd kvs go mod tidy go build -o kvs . ``` ### Quick Test ```bash # Start standalone node (uses config.yaml if it exists, or creates it) ./kvs start config.yaml # Test the API curl http://localhost:8080/health # Check status ./kvs status # Stop when done ./kvs stop config ``` ## ๐ŸŽฎ Process Management KVS includes systemd-style daemon commands for easy process management: ```bash # Start as background daemon ./kvs start config.yaml # or just: ./kvs start config ./kvs start node1.yaml # Start with custom config # Check status ./kvs status # Show all running instances ./kvs status node1 # Show specific instance # Stop daemon ./kvs stop node1 # Graceful shutdown # Restart daemon ./kvs restart node1 # Stop and start # Run in foreground (traditional) ./kvs node1.yaml # Logs to stdout ``` ### Daemon Features - **Global PID tracking**: PID files stored in `~/.kvs/pids/` (works from any directory) - **Automatic logging**: Logs written to `~/.kvs/logs/{config-name}.log` - **Flexible naming**: Config extension optional (`node1` or `node1.yaml` both work) - **Graceful shutdown**: SIGTERM sent for clean shutdown - **Stale PID cleanup**: Automatically detects and cleans dead processes - **Multi-instance**: Run multiple KVS instances on same machine ### Example Workflow ```bash # Start 3-node cluster as daemons ./kvs start node1.yaml ./kvs start node2.yaml ./kvs start node3.yaml # Check cluster status ./kvs status # View logs tail -f ~/.kvs/logs/kvs_node1.yaml.log # Stop entire cluster ./kvs stop node1 ./kvs stop node2 ./kvs stop node3 ``` ## โš™๏ธ Configuration KVS uses YAML configuration files. On first run, a default `config.yaml` is automatically generated: ```yaml node_id: "hostname" # Unique node identifier bind_address: "127.0.0.1" # IP address to bind to port: 8080 # HTTP port data_dir: "./data" # Directory for BadgerDB storage seed_nodes: [] # List of seed nodes for cluster joining read_only: false # Enable read-only mode log_level: "info" # Logging level (debug, info, warn, error) # Cluster timing configuration gossip_interval_min: 60 # Min gossip interval (seconds) gossip_interval_max: 120 # Max gossip interval (seconds) sync_interval: 300 # Regular sync interval (seconds) catchup_interval: 120 # Catch-up sync interval (seconds) bootstrap_max_age_hours: 720 # Max age for bootstrap sync (hours) throttle_delay_ms: 100 # Delay between sync requests (ms) fetch_delay_ms: 50 # Delay between data fetches (ms) # Feature configuration compression_enabled: true # Enable ZSTD compression compression_level: 3 # Compression level (1-19) default_ttl: "0" # Default TTL ("0" = no expiry) max_json_size: 1048576 # Max JSON payload size (1MB) rate_limit_requests: 100 # Requests per window rate_limit_window: "1m" # Rate limit window # Feature toggles auth_enabled: true # JWT authentication system tamper_logging_enabled: true # Cryptographic audit trail clustering_enabled: true # Gossip protocol and sync rate_limiting_enabled: true # Rate limiting revision_history_enabled: true # Automatic versioning # Anonymous access control (when auth_enabled: true) allow_anonymous_read: false # Allow unauthenticated read access to KV endpoints allow_anonymous_write: false # Allow unauthenticated write access to KV endpoints # Backup configuration backup_enabled: true # Automated backups backup_schedule: "0 0 * * *" # Daily at midnight (cron format) backup_path: "./backups" # Backup directory backup_retention: 7 # Days to keep backups ``` ### Custom Configuration ```bash # Use custom config file ./kvs /path/to/custom-config.yaml ``` ## ๐Ÿ”Œ REST API ### Data Operations (`/kv/`) #### Store Data ```bash PUT /kv/{path} Content-Type: application/json Authorization: Bearer # Required if auth_enabled && !allow_anonymous_write # Basic storage curl -X PUT http://localhost:8080/kv/users/john/profile \ -H "Content-Type: application/json" \ -H "Authorization: Bearer eyJ..." \ -d '{"name":"John Doe","age":30,"email":"john@example.com"}' # Storage with TTL curl -X PUT http://localhost:8080/kv/cache/session/abc123 \ -H "Content-Type: application/json" \ -H "Authorization: Bearer eyJ..." \ -d '{"data":{"user_id":"john"}, "ttl":"1h"}' # Response { "uuid": "a1b2c3d4-e5f6-7890-1234-567890abcdef", "timestamp": 1672531200000 } ``` #### Retrieve Data ```bash GET /kv/{path} Authorization: Bearer # Required if auth_enabled && !allow_anonymous_read curl -H "Authorization: Bearer eyJ..." http://localhost:8080/kv/users/john/profile # Response (full StoredValue format) { "uuid": "a1b2c3d4-e5f6-7890-1234-567890abcdef", "timestamp": 1672531200000, "data": { "name": "John Doe", "age": 30, "email": "john@example.com" } } ``` #### Delete Data ```bash DELETE /kv/{path} Authorization: Bearer # Always required when auth_enabled (no anonymous delete) curl -X DELETE -H "Authorization: Bearer eyJ..." http://localhost:8080/kv/users/john/profile # Returns: 204 No Content ``` ### Authentication Operations (`/auth/`) #### Create User ```bash POST /auth/users Content-Type: application/json curl -X POST http://localhost:8080/auth/users \ -H "Content-Type: application/json" \ -d '{"nickname":"john"}' # Response {"uuid": "user-abc123"} ``` #### Create API Token ```bash POST /auth/tokens Content-Type: application/json curl -X POST http://localhost:8080/auth/tokens \ -H "Content-Type: application/json" \ -d '{"user_uuid":"user-abc123", "scopes":["read","write"]}' # Response { "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...", "expires_at": 1672617600000 } ``` ### Cluster Operations (`/members/`) #### View Cluster Members ```bash GET /members/ curl http://localhost:8080/members/ # Response [ { "id": "node-alpha", "address": "192.168.1.10:8080", "last_seen": 1672531200000, "joined_timestamp": 1672530000000 } ] ``` #### Health Check ```bash GET /health curl http://localhost:8080/health # Response { "status": "ok", "mode": "normal", "member_count": 2, "node_id": "node-alpha" } ``` ### Merkle Tree Operations (`/sync/`) #### Get Merkle Root ```bash GET /sync/merkle/root # Used internally for data synchronization ``` #### Range Queries ```bash GET /kv/_range?start_key=users/&end_key=users/z&limit=100 # Fetch key ranges for synchronization ``` ## ๐Ÿ˜๏ธ Cluster Setup ### Single Node (Standalone) ```bash # config.yaml node_id: "standalone" port: 8080 seed_nodes: [] # Empty = standalone mode ``` ### Multi-Node Cluster #### Node 1 (Bootstrap Node) ```bash # node1.yaml node_id: "node1" port: 8081 seed_nodes: [] # First node, no seeds needed auth_enabled: true clustering_enabled: true ``` #### Node 2 (Joins via Node 1) ```bash # node2.yaml node_id: "node2" port: 8082 seed_nodes: ["127.0.0.1:8081"] # Points to node1 auth_enabled: true clustering_enabled: true ``` #### Node 3 (Joins via Node 1 & 2) ```bash # node3.yaml node_id: "node3" port: 8083 seed_nodes: ["127.0.0.1:8081", "127.0.0.1:8082"] # Multiple seeds for reliability auth_enabled: true clustering_enabled: true ``` #### Start the Cluster ```bash # Start as daemons ./kvs start node1.yaml sleep 2 ./kvs start node2.yaml sleep 2 ./kvs start node3.yaml # Verify cluster formation curl http://localhost:8081/members/ # Should show all 3 nodes # Check daemon status ./kvs status # Stop cluster when done ./kvs stop node1 ./kvs stop node2 ./kvs stop node3 ``` ## ๐Ÿ”„ How It Works ### Gossip Protocol - Nodes randomly select 1-3 peers every 1-2 minutes for membership exchange - Failed nodes are detected via timeout (5 minutes) and removed (10 minutes) - New members are automatically discovered and added to local member lists ### Merkle Tree Synchronization - **Merkle Trees**: Each node builds cryptographic trees of their data for efficient comparison - **Regular Sync**: Every 5 minutes, nodes compare Merkle roots and sync divergent branches - **Catch-up Sync**: Every 2 minutes when nodes detect they're significantly behind - **Bootstrap Sync**: New nodes gradually fetch historical data up to 30 days old - **Efficient Detection**: Only synchronizes actual differences, not entire datasets ### Sophisticated Conflict Resolution When two nodes have different data for the same key with identical timestamps: 1. **Detection**: Merkle tree comparison identifies conflicting keys 2. **Oldest-Node Rule**: The version from the node with earliest `joined_timestamp` wins 3. **UUID Tie-Breaker**: If join times are identical, lexicographically smaller UUID wins 4. **Automatic Resolution**: Losing nodes automatically fetch and store the winning version 5. **Consistency**: All nodes converge to the same data within seconds ### Authentication & Authorization - **JWT Tokens**: Secure API access with scoped permissions - **POSIX-Inspired ACLs**: 12-bit permission system (owner/group/others with create/delete/write/read) - **Resource Metadata**: Each stored item has ownership and permission information - **Feature Toggle**: Can be completely disabled for simpler deployments ### Operational Modes - **Normal**: Full read/write capabilities with all features - **Read-Only**: Rejects external writes but accepts internal replication - **Syncing**: Temporary mode during bootstrap, rejects external writes ## ๐Ÿ› ๏ธ Development ### Running Tests ```bash # Build and run comprehensive integration tests go build -o kvs . ./integration_test.sh # Manual basic functionality test ./kvs start config.yaml sleep 2 curl http://localhost:8080/health ./kvs stop config # Manual cluster test (requires creating configs) echo 'node_id: "test1" port: 8081 seed_nodes: [] auth_enabled: false' > test1.yaml echo 'node_id: "test2" port: 8082 seed_nodes: ["127.0.0.1:8081"] auth_enabled: false' > test2.yaml ./kvs start test1.yaml sleep 2 ./kvs start test2.yaml # Test data replication (wait for cluster formation) sleep 10 curl -X PUT http://localhost:8081/kv/test/data \ -H "Content-Type: application/json" \ -d '{"message":"hello world"}' # Wait for Merkle sync, then check replication sleep 30 curl http://localhost:8082/kv/test/data # Cleanup ./kvs stop test1 ./kvs stop test2 rm test1.yaml test2.yaml ``` ### Conflict Resolution Testing ```bash # Create conflicting data scenario using utility go run test_conflict.go /tmp/conflict1 /tmp/conflict2 # Create configs for conflict test echo 'node_id: "conflict1" port: 9111 data_dir: "/tmp/conflict1" seed_nodes: [] auth_enabled: false log_level: "debug"' > conflict1.yaml echo 'node_id: "conflict2" port: 9112 data_dir: "/tmp/conflict2" seed_nodes: ["127.0.0.1:9111"] auth_enabled: false log_level: "debug"' > conflict2.yaml # Start nodes with conflicting data ./kvs start conflict1.yaml sleep 2 ./kvs start conflict2.yaml # Watch logs for conflict resolution tail -f ~/.kvs/logs/kvs_conflict1.yaml.log ~/.kvs/logs/kvs_conflict2.yaml.log & # Both nodes will converge within ~10-30 seconds # Check final state sleep 30 curl http://localhost:9111/kv/test/conflict/data curl http://localhost:9112/kv/test/conflict/data # Cleanup ./kvs stop conflict1 ./kvs stop conflict2 rm conflict1.yaml conflict2.yaml ``` ### Code Quality ```bash # Format and lint go fmt ./... go vet ./... # Dependency management go mod tidy go mod verify # Build verification go build . ``` ### Project Structure ``` kvs/ โ”œโ”€โ”€ main.go # Main application entry point โ”œโ”€โ”€ config.yaml # Default configuration (auto-generated) โ”œโ”€โ”€ integration_test.sh # Comprehensive test suite โ”œโ”€โ”€ test_conflict.go # Conflict resolution testing utility โ”œโ”€โ”€ CLAUDE.md # Development guidance for Claude Code โ”œโ”€โ”€ go.mod # Go module dependencies โ”œโ”€โ”€ go.sum # Go module checksums โ”œโ”€โ”€ README.md # This documentation โ”‚ โ”œโ”€โ”€ auth/ # Authentication & authorization โ”‚ โ”œโ”€โ”€ auth.go # Main auth logic โ”‚ โ”œโ”€โ”€ jwt.go # JWT token management โ”‚ โ”œโ”€โ”€ middleware.go # HTTP middleware โ”‚ โ”œโ”€โ”€ permissions.go # POSIX-inspired ACL system โ”‚ โ””โ”€โ”€ storage.go # Auth data storage โ”‚ โ”œโ”€โ”€ cluster/ # Distributed systems components โ”‚ โ”œโ”€โ”€ bootstrap.go # New node integration โ”‚ โ”œโ”€โ”€ gossip.go # Membership protocol โ”‚ โ”œโ”€โ”€ merkle.go # Merkle tree implementation โ”‚ โ””โ”€โ”€ sync.go # Data synchronization & conflict resolution โ”‚ โ”œโ”€โ”€ config/ # Configuration management โ”‚ โ””โ”€โ”€ config.go # Config loading & defaults โ”‚ โ”œโ”€โ”€ daemon/ # Process management โ”‚ โ”œโ”€โ”€ daemonize.go # Background process spawning โ”‚ โ””โ”€โ”€ pid.go # PID file management โ”‚ โ”œโ”€โ”€ features/ # Utility features โ”‚ โ”œโ”€โ”€ auth.go # Auth utilities โ”‚ โ”œโ”€โ”€ backup.go # Backup system โ”‚ โ”œโ”€โ”€ features.go # Feature toggles โ”‚ โ”œโ”€โ”€ ratelimit.go # Rate limiting โ”‚ โ”œโ”€โ”€ revision.go # Revision history โ”‚ โ”œโ”€โ”€ tamperlog.go # Tamper-evident logging โ”‚ โ””โ”€โ”€ validation.go # TTL parsing โ”‚ โ”œโ”€โ”€ server/ # HTTP server & API โ”‚ โ”œโ”€โ”€ handlers.go # Request handlers โ”‚ โ”œโ”€โ”€ lifecycle.go # Server lifecycle โ”‚ โ”œโ”€โ”€ routes.go # Route definitions โ”‚ โ””โ”€โ”€ server.go # Server setup โ”‚ โ”œโ”€โ”€ storage/ # Data storage abstraction โ”‚ โ”œโ”€โ”€ compression.go # ZSTD compression โ”‚ โ”œโ”€โ”€ revision.go # Revision history โ”‚ โ””โ”€โ”€ storage.go # BadgerDB interface โ”‚ โ”œโ”€โ”€ types/ # Shared type definitions โ”‚ โ””โ”€โ”€ types.go # All data structures โ”‚ โ””โ”€โ”€ utils/ # Utilities โ””โ”€โ”€ hash.go # Cryptographic hashing ``` ### Key Data Structures #### Stored Value Format ```go type StoredValue struct { UUID string `json:"uuid"` // Unique version identifier Timestamp int64 `json:"timestamp"` // Unix timestamp (milliseconds) Data json.RawMessage `json:"data"` // Actual user JSON payload } ``` #### BadgerDB Storage - **Main Key**: Direct path mapping (e.g., `users/john/profile`) - **Index Key**: `_ts:{timestamp}:{path}` for efficient time-based queries - **Values**: JSON-marshaled `StoredValue` structures ## ๐Ÿ”ง Configuration Options Explained | Setting | Description | Default | Notes | |---------|-------------|---------|-------| | **Core Settings** | | `node_id` | Unique identifier for this node | hostname | Must be unique across cluster | | `bind_address` | IP address to bind HTTP server | "127.0.0.1" | Use 0.0.0.0 for external access | | `port` | HTTP port for API and cluster communication | 8080 | Must be accessible to peers | | `data_dir` | Directory for BadgerDB storage | "./data" | Will be created if doesn't exist | | `seed_nodes` | List of initial cluster nodes | [] | Empty = standalone mode | | `read_only` | Enable read-only mode | false | Accepts replication, rejects client writes | | `log_level` | Logging verbosity | "info" | debug/info/warn/error | | **Cluster Timing** | | `gossip_interval_min/max` | Gossip frequency range | 60-120 sec | Randomized interval | | `sync_interval` | Regular Merkle sync frequency | 300 sec | How often to sync with peers | | `catchup_interval` | Catch-up sync frequency | 120 sec | Faster sync when behind | | `bootstrap_max_age_hours` | Max historical data to sync | 720 hours | 30 days default | | **Feature Toggles** | | `auth_enabled` | JWT authentication system | true | Complete auth/authz system | | `allow_anonymous_read` | Allow unauthenticated read access | false | When auth_enabled, controls KV GET endpoints | | `allow_anonymous_write` | Allow unauthenticated write access | false | When auth_enabled, controls KV PUT endpoints | | `clustering_enabled` | Gossip protocol and sync | true | Distributed mode | | `compression_enabled` | ZSTD compression | true | Reduces storage size | | `rate_limiting_enabled` | Rate limiting | true | Per-client limits | | `tamper_logging_enabled` | Cryptographic audit trail | true | Security logging | | `revision_history_enabled` | Automatic versioning | true | Data history tracking | | `catchup_interval` | Catch-up sync frequency | 120 sec | Faster sync when behind | | `bootstrap_max_age_hours` | Max historical data to sync | 720 hours | 30 days default | | `throttle_delay_ms` | Delay between sync requests | 100 ms | Prevents overwhelming peers | | `fetch_delay_ms` | Delay between individual fetches | 50 ms | Rate limiting | ## ๐Ÿšจ Important Notes ### Consistency Model - **Eventual Consistency**: Data will eventually be consistent across all nodes - **Local-First**: All operations succeed locally first, then replicate - **No Transactions**: Each key operation is independent - **Conflict Resolution**: Automatic resolution of timestamp collisions ### Network Requirements - All nodes must be able to reach each other via HTTP - Firewalls must allow traffic on configured ports - IPv4 private networks supported (IPv6 not tested) ### Limitations - No encryption in transit (use reverse proxy for TLS) - No cross-key transactions or ACID guarantees - No complex queries (key-based lookups only) - No automatic data sharding (single keyspace per cluster) - No multi-datacenter replication ### Performance Characteristics - **Read Latency**: ~1ms (local BadgerDB lookup) - **Write Latency**: ~5ms (local write + indexing + optional compression) - **Replication Lag**: 10-30 seconds with Merkle tree sync - **Memory Usage**: Minimal (BadgerDB + Merkle tree caching) - **Disk Usage**: Raw JSON + metadata + optional compression (10-50% savings) - **Conflict Resolution**: Sub-second convergence time - **Cluster Formation**: ~10-20 seconds for gossip stabilization ## ๐Ÿ›ก๏ธ Production Considerations ### Deployment - Built-in daemon commands (`start`/`stop`/`restart`/`status`) for process management - Alternatively, use systemd or similar for advanced orchestration - Logs automatically written to `~/.kvs/logs/` (configure log rotation) - Set up monitoring for `/health` endpoint - Use reverse proxy (nginx/traefik) for TLS and load balancing ### Monitoring - Monitor `/health` endpoint for node status - Watch logs for conflict resolution events - Track member count for cluster health - Monitor disk usage in data directories ### Backup Strategy - BadgerDB supports snapshots - Data directories can be backed up while running - Consider backing up multiple nodes for redundancy ### Scaling - Add new nodes by configuring existing cluster members as seeds - Remove nodes gracefully using `/members/leave` endpoint - Cluster can operate with any number of nodes (tested with 2-10) ## ๐Ÿ“„ License This project is licensed under the MIT License - see the LICENSE file for details. ## ๐Ÿค Contributing 1. Fork the repository 2. Create a feature branch (`git checkout -b feature/amazing-feature`) 3. Commit your changes (`git commit -m 'Add amazing feature'`) 4. Push to the branch (`git push origin feature/amazing-feature`) 5. Open a Pull Request ## ๐Ÿ“š Additional Resources - [BadgerDB Documentation](https://dgraph.io/docs/badger/) - [Gossip Protocol Paper](https://www.cs.cornell.edu/home/rvr/papers/flowgossip.pdf) - [Eventually Consistent Systems](https://www.allthingsdistributed.com/2008/12/eventually_consistent.html) --- **Built with โค๏ธ in Go** | **Powered by BadgerDB** | **Inspired by distributed systems theory**