ryyst 5ab03331fc Implement Phase 2: Enterprise-grade KVS enhancements
This massive enhancement transforms KVS from a basic distributed key-value store
into a production-ready enterprise database system with comprehensive authentication,
authorization, data management, and security features.

PHASE 2.1: CORE AUTHENTICATION & AUTHORIZATION
• Complete JWT-based authentication system with SHA3-512 security
• User and group management with CRUD APIs (/api/users, /api/groups)
• POSIX-inspired 12-bit ACL permission model (Owner/Group/Others: CDWR)
• Token management system with configurable expiration (default 1h)
• Authorization middleware with resource-level permission checking
• SHA3-512 hashing utilities for secure credential storage

PHASE 2.2: ADVANCED DATA MANAGEMENT
• ZSTD compression system with configurable levels (1-19, default 3)
• TTL support with resource metadata and automatic expiration
• 3-version revision history system with automatic rotation
• JSON size validation with configurable limits (default 1MB)
• Enhanced storage utilities with compression/decompression
• Resource metadata tracking (owner, group, permissions, timestamps)

PHASE 2.3: ENTERPRISE SECURITY & OPERATIONS
• Per-user rate limiting with sliding window algorithm
• Tamper-evident logging with cryptographic signatures (SHA3-512)
• Automated backup scheduling using cron (default: daily at midnight)
• ZSTD-compressed database snapshots with automatic cleanup
• Configurable backup retention policies (default: 7 days)
• Backup status monitoring API (/api/backup/status)

TECHNICAL ADDITIONS
• New dependencies: JWT v4, crypto/sha3, zstd compression, cron v3
• Extended configuration system with comprehensive Phase 2 settings
• API endpoints: 13 new endpoints for authentication, management, monitoring
• Storage patterns: user:<uuid>, group:<uuid>, token:<hash>, ratelimit:<user>:<window>
• Revision history: data:<key>:rev:[1-3] with metadata integration
• Tamper logs: log:<timestamp>:<uuid> with permanent retention

BACKWARD COMPATIBILITY
• All existing APIs remain fully functional
• Existing Merkle tree replication system unchanged
• New features can be disabled via configuration
• Migration-ready design for upgrading existing deployments

This implementation adds 1,500+ lines of sophisticated enterprise code while
maintaining the distributed, eventually-consistent architecture. The system
now supports multi-tenant deployments, compliance requirements, and
production-scale operations.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-11 18:17:41 +03:00

KVS - Distributed Key-Value Store

A minimalistic, clustered key-value database system written in Go that prioritizes availability and partition tolerance over strong consistency. KVS implements a gossip-style membership protocol with sophisticated conflict resolution for eventually consistent distributed storage.

🚀 Key Features

  • Hierarchical Keys: Support for structured paths (e.g., /home/room/closet/socks)
  • Eventual Consistency: Local operations are fast, replication happens in background
  • Gossip Protocol: Decentralized node discovery and failure detection
  • Sophisticated Conflict Resolution: Majority vote with oldest-node tie-breaking
  • Local-First Truth: All operations work locally first, sync globally later
  • Read-Only Mode: Configurable mode for reducing write load
  • Gradual Bootstrapping: New nodes integrate smoothly without overwhelming cluster
  • Zero Dependencies: Single binary with embedded BadgerDB storage

🏗️ Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│     Node A      │    │     Node B      │    │     Node C      │
│  (Go Service)   │    │  (Go Service)   │    │  (Go Service)   │
│                 │    │                 │    │                 │
│ ┌─────────────┐ │    │ ┌─────────────┐ │    │ ┌─────────────┐ │
│ │ HTTP Server │ │◄──►│ │ HTTP Server │ │◄──►│ │ HTTP Server │ │
│ │    (API)    │ │    │ │    (API)    │ │    │ │    (API)    │ │
│ └─────────────┘ │    │ └─────────────┘ │    │ └─────────────┘ │
│ ┌─────────────┐ │    │ ┌─────────────┐ │    │ ┌─────────────┐ │
│ │   Gossip    │ │◄──►│ │   Gossip    │ │◄──►│ │   Gossip    │ │
│ │  Protocol   │ │    │ │  Protocol   │ │    │ │  Protocol   │ │
│ └─────────────┘ │    │ └─────────────┘ │    │ └─────────────┘ │
│ ┌─────────────┐ │    │ ┌─────────────┐ │    │ ┌─────────────┐ │
│ │  BadgerDB   │ │    │ │  BadgerDB   │ │    │ │  BadgerDB   │ │
│ │ (Local KV)  │ │    │ │ (Local KV)  │ │    │ │ (Local KV)  │ │
│ └─────────────┘ │    │ └─────────────┘ │    │ └─────────────┘ │
└─────────────────┘    └─────────────────┘    └─────────────────┘
        ▲
        │
    External Clients

Each node is fully autonomous and communicates with peers via HTTP REST API for both external client requests and internal cluster operations.

📦 Installation

Prerequisites

  • Go 1.21 or higher

Build from Source

git clone <repository-url>
cd kvs
go mod tidy
go build -o kvs .

Quick Test

# Start standalone node
./kvs

# Test the API
curl http://localhost:8080/health

⚙️ Configuration

KVS uses YAML configuration files. On first run, a default config.yaml is automatically generated:

node_id: "hostname"           # Unique node identifier
bind_address: "127.0.0.1"     # IP address to bind to
port: 8080                    # HTTP port
data_dir: "./data"            # Directory for BadgerDB storage
seed_nodes: []                # List of seed nodes for cluster joining
read_only: false              # Enable read-only mode
log_level: "info"             # Logging level (debug, info, warn, error)
gossip_interval_min: 60       # Min gossip interval (seconds)
gossip_interval_max: 120      # Max gossip interval (seconds)
sync_interval: 300            # Regular sync interval (seconds)
catchup_interval: 120         # Catch-up sync interval (seconds)
bootstrap_max_age_hours: 720  # Max age for bootstrap sync (hours)
throttle_delay_ms: 100        # Delay between sync requests (ms)
fetch_delay_ms: 50            # Delay between data fetches (ms)

Custom Configuration

# Use custom config file
./kvs /path/to/custom-config.yaml

🔌 REST API

Data Operations (/kv/)

Store Data

PUT /kv/{path}
Content-Type: application/json

curl -X PUT http://localhost:8080/kv/users/john/profile \
  -H "Content-Type: application/json" \
  -d '{"name":"John Doe","age":30,"email":"john@example.com"}'

# Response
{
  "uuid": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
  "timestamp": 1672531200000
}

Retrieve Data

GET /kv/{path}

curl http://localhost:8080/kv/users/john/profile

# Response
{
  "name": "John Doe",
  "age": 30,
  "email": "john@example.com"
}

Delete Data

DELETE /kv/{path}

curl -X DELETE http://localhost:8080/kv/users/john/profile
# Returns: 204 No Content

Cluster Operations (/members/)

View Cluster Members

GET /members/

curl http://localhost:8080/members/
# Response
[
  {
    "id": "node-alpha",
    "address": "192.168.1.10:8080",
    "last_seen": 1672531200000,
    "joined_timestamp": 1672530000000
  }
]

Join Cluster (Internal)

POST /members/join
# Used internally during bootstrap process

Health Check

GET /health

curl http://localhost:8080/health
# Response
{
  "status": "ok",
  "mode": "normal",
  "member_count": 2,
  "node_id": "node-alpha"
}

🏘️ Cluster Setup

Single Node (Standalone)

# config.yaml
node_id: "standalone"
port: 8080
seed_nodes: []  # Empty = standalone mode

Multi-Node Cluster

Node 1 (Bootstrap Node)

# node1.yaml
node_id: "node1"
port: 8081
seed_nodes: []  # First node, no seeds needed

Node 2 (Joins via Node 1)

# node2.yaml  
node_id: "node2"
port: 8082
seed_nodes: ["127.0.0.1:8081"]  # Points to node1

Node 3 (Joins via Node 1 & 2)

# node3.yaml
node_id: "node3" 
port: 8083
seed_nodes: ["127.0.0.1:8081", "127.0.0.1:8082"]  # Multiple seeds for reliability

Start the Cluster

# Terminal 1
./kvs node1.yaml

# Terminal 2 (wait a few seconds)
./kvs node2.yaml

# Terminal 3 (wait a few seconds)
./kvs node3.yaml

🔄 How It Works

Gossip Protocol

  • Nodes randomly select 1-3 peers every 1-2 minutes for membership exchange
  • Failed nodes are detected via timeout (5 minutes) and removed (10 minutes)
  • New members are automatically discovered and added to local member lists

Data Synchronization

  • Regular Sync: Every 5 minutes, nodes compare their latest 15 data items with a random peer
  • Catch-up Sync: Every 2 minutes when nodes detect they're significantly behind
  • Bootstrap Sync: New nodes gradually fetch historical data up to 30 days old

Conflict Resolution

When two nodes have different data for the same key with identical timestamps:

  1. Majority Vote: Query all healthy cluster members for their version
  2. Tie-Breaker: If votes are tied, the version from the oldest node (earliest joined_timestamp) wins
  3. Automatic Resolution: Losing nodes automatically fetch and store the winning version

Operational Modes

  • Normal: Full read/write capabilities
  • Read-Only: Rejects external writes but accepts internal replication
  • Syncing: Temporary mode during bootstrap, rejects external writes

🛠️ Development

Running Tests

# Basic functionality test
go build -o kvs .
./kvs &
curl http://localhost:8080/health
pkill kvs

# Cluster test with provided configs
./kvs node1.yaml &
./kvs node2.yaml &  
./kvs node3.yaml &

# Test data replication
curl -X PUT http://localhost:8081/kv/test/data \
  -H "Content-Type: application/json" \
  -d '{"message":"hello world"}'

# Wait 30+ seconds for sync, then check other nodes
curl http://localhost:8082/kv/test/data
curl http://localhost:8083/kv/test/data

# Cleanup
pkill kvs

Conflict Resolution Testing

# Create conflicting data scenario
rm -rf data1 data2
mkdir data1 data2
go run test_conflict.go data1 data2

# Start nodes with conflicting data
./kvs node1.yaml &
./kvs node2.yaml &

# Watch logs for conflict resolution
# Both nodes will converge to same data within ~30 seconds

Project Structure

kvs/
├── main.go              # Main application with all functionality
├── config.yaml          # Default configuration (auto-generated)
├── test_conflict.go     # Conflict resolution testing utility
├── node1.yaml           # Example cluster node config
├── node2.yaml           # Example cluster node config  
├── node3.yaml           # Example cluster node config
├── go.mod               # Go module dependencies
├── go.sum               # Go module checksums
└── README.md            # This documentation

Key Data Structures

Stored Value Format

type StoredValue struct {
    UUID      string          `json:"uuid"`      // Unique version identifier
    Timestamp int64           `json:"timestamp"` // Unix timestamp (milliseconds)
    Data      json.RawMessage `json:"data"`      // Actual user JSON payload
}

BadgerDB Storage

  • Main Key: Direct path mapping (e.g., users/john/profile)
  • Index Key: _ts:{timestamp}:{path} for efficient time-based queries
  • Values: JSON-marshaled StoredValue structures

🔧 Configuration Options Explained

Setting Description Default Notes
node_id Unique identifier for this node hostname Must be unique across cluster
bind_address IP address to bind HTTP server "127.0.0.1" Use 0.0.0.0 for external access
port HTTP port for API and cluster communication 8080 Must be accessible to peers
data_dir Directory for BadgerDB storage "./data" Will be created if doesn't exist
seed_nodes List of initial cluster nodes [] Empty = standalone mode
read_only Enable read-only mode false Accepts replication, rejects client writes
log_level Logging verbosity "info" debug/info/warn/error
gossip_interval_min/max Gossip frequency range 60-120 sec Randomized interval
sync_interval Regular sync frequency 300 sec How often to sync with peers
catchup_interval Catch-up sync frequency 120 sec Faster sync when behind
bootstrap_max_age_hours Max historical data to sync 720 hours 30 days default
throttle_delay_ms Delay between sync requests 100 ms Prevents overwhelming peers
fetch_delay_ms Delay between individual fetches 50 ms Rate limiting

🚨 Important Notes

Consistency Model

  • Eventual Consistency: Data will eventually be consistent across all nodes
  • Local-First: All operations succeed locally first, then replicate
  • No Transactions: Each key operation is independent
  • Conflict Resolution: Automatic resolution of timestamp collisions

Network Requirements

  • All nodes must be able to reach each other via HTTP
  • Firewalls must allow traffic on configured ports
  • IPv4 private networks supported (IPv6 not tested)

Limitations

  • No authentication/authorization (planned for future releases)
  • No encryption in transit (use reverse proxy for TLS)
  • No cross-key transactions
  • No complex queries (key-based lookups only)
  • No data compression (planned for future releases)

Performance Characteristics

  • Read Latency: ~1ms (local BadgerDB lookup)
  • Write Latency: ~5ms (local write + timestamp indexing)
  • Replication Lag: 30 seconds - 5 minutes depending on sync cycles
  • Memory Usage: Minimal (BadgerDB handles caching efficiently)
  • Disk Usage: Raw JSON + metadata overhead (~20-30%)

🛡️ Production Considerations

Deployment

  • Use systemd or similar for process management
  • Configure log rotation for JSON logs
  • Set up monitoring for /health endpoint
  • Use reverse proxy (nginx/traefik) for TLS and load balancing

Monitoring

  • Monitor /health endpoint for node status
  • Watch logs for conflict resolution events
  • Track member count for cluster health
  • Monitor disk usage in data directories

Backup Strategy

  • BadgerDB supports snapshots
  • Data directories can be backed up while running
  • Consider backing up multiple nodes for redundancy

Scaling

  • Add new nodes by configuring existing cluster members as seeds
  • Remove nodes gracefully using /members/leave endpoint
  • Cluster can operate with any number of nodes (tested with 2-10)

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📚 Additional Resources


Built with ❤️ in Go | Powered by BadgerDB | Inspired by distributed systems theory

Description
No description provided
Readme 180 KiB
Languages
Go 90.2%
Shell 9.8%