KVS - Distributed Key-Value Store

A minimalistic, clustered key-value database system written in Go that prioritizes availability and partition tolerance over strong consistency. KVS implements a gossip-style membership protocol with sophisticated conflict resolution for eventually consistent distributed storage.

🚀 Key Features

  • Hierarchical Keys: Support for structured paths (e.g., /home/room/closet/socks)
  • Eventual Consistency: Local operations are fast; replication happens in the background
  • Merkle Tree Sync: Efficient data synchronization with cryptographic integrity
  • Sophisticated Conflict Resolution: Oldest-node rule with UUID tie-breaking
  • JWT Authentication: Full authentication system with POSIX-inspired permissions
  • Local-First Truth: All operations work locally first, sync globally later
  • Read-Only Mode: Configurable mode for reducing write load
  • Modular Architecture: Clean separation of concerns with feature toggles
  • Comprehensive Features: TTL support, rate limiting, tamper logging, automated backups
  • Zero External Dependencies: Single binary with embedded BadgerDB storage

🏗️ Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│     Node A      │    │     Node B      │    │     Node C      │
│  (Go Service)   │    │  (Go Service)   │    │  (Go Service)   │
│                 │    │                 │    │                 │
│ ┌─────────────┐ │    │ ┌─────────────┐ │    │ ┌─────────────┐ │
│ │HTTP API+Auth│ │◄──►│ │HTTP API+Auth│ │◄──►│ │HTTP API+Auth│ │
│ └─────────────┘ │    │ └─────────────┘ │    │ └─────────────┘ │
│ ┌─────────────┐ │    │ ┌─────────────┐ │    │ ┌─────────────┐ │
│ │   Gossip    │ │◄──►│ │   Gossip    │ │◄──►│ │   Gossip    │ │
│ │  Protocol   │ │    │ │  Protocol   │ │    │ │  Protocol   │ │
│ └─────────────┘ │    │ └─────────────┘ │    │ └─────────────┘ │
│ ┌─────────────┐ │    │ ┌─────────────┐ │    │ ┌─────────────┐ │
│ │Merkle Sync  │ │◄──►│ │Merkle Sync  │ │◄──►│ │Merkle Sync  │ │
│ │& Conflict   │ │    │ │& Conflict   │ │    │ │& Conflict   │ │
│ │ Resolution  │ │    │ │ Resolution  │ │    │ │ Resolution  │ │
│ └─────────────┘ │    │ └─────────────┘ │    │ └─────────────┘ │
│ ┌─────────────┐ │    │ ┌─────────────┐ │    │ ┌─────────────┐ │
│ │Storage+     │ │    │ │Storage+     │ │    │ │Storage+     │ │
│ │Features     │ │    │ │Features     │ │    │ │Features     │ │
│ │(BadgerDB)   │ │    │ │(BadgerDB)   │ │    │ │(BadgerDB)   │ │
│ └─────────────┘ │    │ └─────────────┘ │    │ └─────────────┘ │
└─────────────────┘    └─────────────────┘    └─────────────────┘
        ▲
        │
External Clients (JWT Auth)

Modular Design

KVS features a clean modular architecture with dedicated packages:

  • auth/ - JWT authentication and POSIX-inspired permissions
  • cluster/ - Gossip protocol, Merkle tree sync, and conflict resolution
  • storage/ - BadgerDB abstraction with compression and revisions
  • server/ - HTTP API, routing, and lifecycle management
  • features/ - TTL, rate limiting, tamper logging, backup utilities
  • config/ - Configuration management with auto-generation

📦 Installation

Prerequisites

  • Go 1.21 or higher

Build from Source

git clone <repository-url>
cd kvs
go mod tidy
go build -o kvs .

Quick Test

# Start standalone node
./kvs

# Test the API
curl http://localhost:8080/health

⚙️ Configuration

KVS uses YAML configuration files. On first run, a default config.yaml is automatically generated:

node_id: "hostname"                    # Unique node identifier
bind_address: "127.0.0.1"              # IP address to bind to
port: 8080                             # HTTP port
data_dir: "./data"                     # Directory for BadgerDB storage
seed_nodes: []                         # List of seed nodes for cluster joining
read_only: false                       # Enable read-only mode
log_level: "info"                      # Logging level (debug, info, warn, error)

# Cluster timing configuration
gossip_interval_min: 60                # Min gossip interval (seconds)
gossip_interval_max: 120               # Max gossip interval (seconds)
sync_interval: 300                     # Regular sync interval (seconds)
catchup_interval: 120                  # Catch-up sync interval (seconds)
bootstrap_max_age_hours: 720           # Max age for bootstrap sync (hours)
throttle_delay_ms: 100                 # Delay between sync requests (ms)
fetch_delay_ms: 50                     # Delay between data fetches (ms)

# Feature configuration
compression_enabled: true              # Enable ZSTD compression
compression_level: 3                   # Compression level (1-19)
default_ttl: "0"                       # Default TTL ("0" = no expiry)
max_json_size: 1048576                 # Max JSON payload size (1MB)
rate_limit_requests: 100               # Requests per window
rate_limit_window: "1m"                # Rate limit window

# Feature toggles
auth_enabled: true                     # JWT authentication system
tamper_logging_enabled: true           # Cryptographic audit trail
clustering_enabled: true               # Gossip protocol and sync
rate_limiting_enabled: true            # Rate limiting
revision_history_enabled: true         # Automatic versioning

# Anonymous access control (when auth_enabled: true)
allow_anonymous_read: false            # Allow unauthenticated read access to KV endpoints
allow_anonymous_write: false           # Allow unauthenticated write access to KV endpoints

# Backup configuration
backup_enabled: true                   # Automated backups
backup_schedule: "0 0 * * *"           # Daily at midnight (cron format)
backup_path: "./backups"               # Backup directory
backup_retention: 7                    # Days to keep backups
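
The default_ttl and rate_limit_window settings above (and the "ttl" field accepted by PUT requests) use duration strings such as "1h" and "1m". As a minimal sketch, assuming these strings follow Go's standard duration syntax (the README does not state which parser KVS uses), a TTL could be interpreted as follows. The function name parseTTL is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// parseTTL converts a TTL string such as "1h" or "30m" into a duration.
// "0" (the documented default) and "" are treated as "no expiry".
// Assumption: TTL strings use Go's time.ParseDuration syntax.
func parseTTL(s string) (ttl time.Duration, expires bool, err error) {
	if s == "" || s == "0" {
		return 0, false, nil
	}
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, false, fmt.Errorf("invalid ttl %q: %w", s, err)
	}
	if d < 0 {
		return 0, false, fmt.Errorf("ttl must not be negative, got %q", s)
	}
	if d == 0 {
		return 0, false, nil // e.g. "0s" also means no expiry
	}
	return d, true, nil
}

func main() {
	for _, s := range []string{"0", "1h", "1m", "soon"} {
		d, expires, err := parseTTL(s)
		fmt.Println(s, d, expires, err)
	}
}
```

Returning a separate expires flag keeps "no expiry" distinct from a very short TTL, so callers do not have to compare against a magic zero value.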

Custom Configuration

# Use custom config file
./kvs /path/to/custom-config.yaml

🔌 REST API

Data Operations (/kv/)

Store Data

PUT /kv/{path}
Content-Type: application/json
Authorization: Bearer <jwt-token>  # Required if auth_enabled && !allow_anonymous_write

# Basic storage
curl -X PUT http://localhost:8080/kv/users/john/profile \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer eyJ..." \
  -d '{"name":"John Doe","age":30,"email":"john@example.com"}'

# Storage with TTL
curl -X PUT http://localhost:8080/kv/cache/session/abc123 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer eyJ..." \
  -d '{"data":{"user_id":"john"}, "ttl":"1h"}'

# Response
{
  "uuid": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
  "timestamp": 1672531200000
}
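
The same request can be issued from Go. The following is a rough sketch rather than an official client: putValue and PutResponse are hypothetical names, and fakeKVS is a stand-in server that only echoes the sample response above so the example runs without a real node.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// PutResponse mirrors the response body documented above.
type PutResponse struct {
	UUID      string `json:"uuid"`
	Timestamp int64  `json:"timestamp"`
}

// putValue sends payload as JSON to PUT {baseURL}/kv/{path}.
// token may be empty when anonymous writes are allowed.
func putValue(baseURL, path, token string, payload any) (PutResponse, error) {
	var out PutResponse
	body, err := json.Marshal(payload)
	if err != nil {
		return out, err
	}
	req, err := http.NewRequest(http.MethodPut, baseURL+"/kv/"+path, bytes.NewReader(body))
	if err != nil {
		return out, err
	}
	req.Header.Set("Content-Type", "application/json")
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return out, err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return out, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	err = json.NewDecoder(resp.Body).Decode(&out)
	return out, err
}

// fakeKVS is a hypothetical stand-in for a KVS node, used only for this demo.
func fakeKVS() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPut {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"uuid":"a1b2c3d4-e5f6-7890-1234-567890abcdef","timestamp":1672531200000}`)
	}))
}

func main() {
	srv := fakeKVS()
	defer srv.Close()
	res, err := putValue(srv.URL, "users/john/profile", "", map[string]any{"name": "John Doe"})
	if err != nil {
		panic(err)
	}
	fmt.Println(res.UUID, res.Timestamp)
}
```

Against a real node, replace srv.URL with the node address (for example http://localhost:8080) and pass a valid token.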

Retrieve Data

GET /kv/{path}
Authorization: Bearer <jwt-token>  # Required if auth_enabled && !allow_anonymous_read

curl -H "Authorization: Bearer eyJ..." http://localhost:8080/kv/users/john/profile

# Response (full StoredValue format)
{
  "uuid": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
  "timestamp": 1672531200000,
  "data": {
    "name": "John Doe",
    "age": 30,
    "email": "john@example.com"
  }
}

Delete Data

DELETE /kv/{path}
Authorization: Bearer <jwt-token>  # Always required when auth_enabled (no anonymous delete)

curl -X DELETE -H "Authorization: Bearer eyJ..." http://localhost:8080/kv/users/john/profile
# Returns: 204 No Content

Authentication Operations (/auth/)

Create User

POST /auth/users
Content-Type: application/json

curl -X POST http://localhost:8080/auth/users \
  -H "Content-Type: application/json" \
  -d '{"nickname":"john"}'

# Response
{"uuid": "user-abc123"}

Create API Token

POST /auth/tokens
Content-Type: application/json

curl -X POST http://localhost:8080/auth/tokens \
  -H "Content-Type: application/json" \
  -d '{"user_uuid":"user-abc123", "scopes":["read","write"]}'

# Response
{
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "expires_at": 1672617600000
}

Cluster Operations (/members/)

View Cluster Members

GET /members/

curl http://localhost:8080/members/
# Response
[
  {
    "id": "node-alpha",
    "address": "192.168.1.10:8080",
    "last_seen": 1672531200000,
    "joined_timestamp": 1672530000000
  }
]

Health Check

GET /health

curl http://localhost:8080/health
# Response
{
  "status": "ok",
  "mode": "normal",
  "member_count": 2,
  "node_id": "node-alpha"
}
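
For monitoring scripts, the health response can be decoded into a small struct. This is a sketch under stated assumptions: the field names come from the sample response above, while isHealthy and its minimum-member rule are hypothetical and not part of KVS.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Health mirrors the /health response shown above.
type Health struct {
	Status      string `json:"status"`
	Mode        string `json:"mode"`
	MemberCount int    `json:"member_count"`
	NodeID      string `json:"node_id"`
}

// parseHealth decodes a /health response body.
func parseHealth(body []byte) (Health, error) {
	var h Health
	err := json.Unmarshal(body, &h)
	return h, err
}

// isHealthy is a hypothetical monitoring rule: the node reports "ok"
// and knows about at least minMembers cluster members.
func isHealthy(h Health, minMembers int) bool {
	return h.Status == "ok" && h.MemberCount >= minMembers
}

func main() {
	body := []byte(`{"status":"ok","mode":"normal","member_count":2,"node_id":"node-alpha"}`)
	h, err := parseHealth(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(h.NodeID, h.Mode, isHealthy(h, 2))
}
```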

Merkle Tree Operations (/sync/)

Get Merkle Root

GET /sync/merkle/root
# Used internally for data synchronization

Range Queries

GET /kv/_range?start_key=users/&end_key=users/z&limit=100
# Fetch key ranges for synchronization
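
When building range queries programmatically, the key parameters should be URL-encoded. A minimal sketch follows; rangeURL is a hypothetical helper, and the parameter names are taken from the example above.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// rangeURL builds a /kv/_range query URL with properly escaped parameters.
func rangeURL(baseURL, startKey, endKey string, limit int) string {
	q := url.Values{}
	q.Set("start_key", startKey)
	q.Set("end_key", endKey)
	q.Set("limit", strconv.Itoa(limit))
	// Encode sorts parameters by name and escapes "/" as %2F.
	return baseURL + "/kv/_range?" + q.Encode()
}

func main() {
	fmt.Println(rangeURL("http://localhost:8080", "users/", "users/z", 100))
}
```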

🏘️ Cluster Setup

Single Node (Standalone)

# config.yaml
node_id: "standalone"
port: 8080
seed_nodes: []  # Empty = standalone mode

Multi-Node Cluster

Node 1 (Bootstrap Node)

# node1.yaml
node_id: "node1"
port: 8081
seed_nodes: []  # First node, no seeds needed
auth_enabled: true
clustering_enabled: true

Node 2 (Joins via Node 1)

# node2.yaml  
node_id: "node2"
port: 8082
seed_nodes: ["127.0.0.1:8081"]  # Points to node1
auth_enabled: true
clustering_enabled: true

Node 3 (Joins via Node 1 & 2)

# node3.yaml
node_id: "node3" 
port: 8083
seed_nodes: ["127.0.0.1:8081", "127.0.0.1:8082"]  # Multiple seeds for reliability
auth_enabled: true
clustering_enabled: true

Start the Cluster

# Terminal 1
./kvs node1.yaml

# Terminal 2 (wait a few seconds)
./kvs node2.yaml

# Terminal 3 (wait a few seconds)
./kvs node3.yaml

# Verify cluster formation
curl http://localhost:8081/members/  # Should show all 3 nodes

🔄 How It Works

Gossip Protocol

  • Nodes randomly select 1-3 peers every 1-2 minutes for membership exchange
  • Failed nodes are detected via timeout (5 minutes) and removed (10 minutes)
  • New members are automatically discovered and added to local member lists
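
The peer selection and randomized interval described above can be sketched as follows. This is an illustration only, not the code in cluster/gossip.go; pickPeers, gossipDelaySeconds, and newRNG are hypothetical names, and the 60 to 120 second range matches the default gossip_interval_min and gossip_interval_max.

```go
package main

import (
	"fmt"
	"math/rand"
)

// newRNG returns a seeded random source so runs are reproducible.
func newRNG(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// pickPeers returns 1 to 3 distinct peers chosen at random,
// or fewer when the member list is shorter than the chosen count.
func pickPeers(rng *rand.Rand, members []string) []string {
	if len(members) == 0 {
		return nil
	}
	n := 1 + rng.Intn(3) // 1, 2, or 3
	if n > len(members) {
		n = len(members)
	}
	// Shuffle a copy so the caller's member list is left untouched.
	shuffled := append([]string(nil), members...)
	rng.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})
	return shuffled[:n]
}

// gossipDelaySeconds picks a random delay in [minSec, maxSec].
func gossipDelaySeconds(rng *rand.Rand, minSec, maxSec int) int {
	if maxSec <= minSec {
		return minSec
	}
	return minSec + rng.Intn(maxSec-minSec+1)
}

func main() {
	rng := newRNG(42)
	members := []string{"node1", "node2", "node3", "node4"}
	fmt.Println("gossip with:", pickPeers(rng, members))
	fmt.Println("next round in seconds:", gossipDelaySeconds(rng, 60, 120))
}
```

Randomizing both the peer set and the interval helps avoid all nodes gossiping at the same moment.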

Merkle Tree Synchronization

  • Merkle Trees: Each node builds a cryptographic hash tree of its data for efficient comparison
  • Regular Sync: Every 5 minutes, nodes compare Merkle roots and sync divergent branches
  • Catch-up Sync: Every 2 minutes when nodes detect they're significantly behind
  • Bootstrap Sync: New nodes gradually fetch historical data up to 30 days old
  • Efficient Detection: Only synchronizes actual differences, not entire datasets
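
To illustrate why comparing roots is enough to detect divergence, here is a minimal Merkle root sketch. It is not the implementation in cluster/merkle.go: the leaf encoding (key, a zero byte, then the value), the use of SHA-256, and the promotion of an unpaired node are all assumptions made for this example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// leafHash hashes one key/value pair (assumed encoding: key, 0x00, value).
func leafHash(key string, value []byte) [32]byte {
	h := sha256.New()
	h.Write([]byte(key))
	h.Write([]byte{0})
	h.Write(value)
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

// merkleRoot builds a binary hash tree over the entries, sorted by key,
// and returns the hex-encoded root hash.
func merkleRoot(data map[string][]byte) string {
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic leaf order on every node

	level := make([][32]byte, 0, len(keys))
	for _, k := range keys {
		level = append(level, leafHash(k, data[k]))
	}
	if len(level) == 0 {
		empty := sha256.Sum256(nil)
		return hex.EncodeToString(empty[:])
	}
	for len(level) > 1 {
		next := make([][32]byte, 0, (len(level)+1)/2)
		for i := 0; i < len(level); i += 2 {
			if i+1 == len(level) {
				next = append(next, level[i]) // unpaired node is promoted unchanged
				continue
			}
			pair := make([]byte, 0, 64)
			pair = append(pair, level[i][:]...)
			pair = append(pair, level[i+1][:]...)
			next = append(next, sha256.Sum256(pair))
		}
		level = next
	}
	return hex.EncodeToString(level[0][:])
}

func main() {
	a := map[string][]byte{"users/john": []byte("v1"), "users/jane": []byte("v1")}
	b := map[string][]byte{"users/john": []byte("v1"), "users/jane": []byte("v2")}
	fmt.Println("roots equal:", merkleRoot(a) == merkleRoot(b))
}
```

Two nodes with identical data produce identical roots, so a single hash comparison rules out any difference; only when roots differ do the subtrees need to be walked.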

Sophisticated Conflict Resolution

When two nodes have different data for the same key with identical timestamps:

  1. Detection: Merkle tree comparison identifies conflicting keys
  2. Oldest-Node Rule: The version from the node with the earliest joined_timestamp wins
  3. UUID Tie-Breaker: If join times are identical, the lexicographically smaller UUID wins
  4. Automatic Resolution: Losing nodes automatically fetch and store the winning version
  5. Consistency: All nodes converge to the same data within seconds
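
The decision in steps 2 and 3 can be sketched as a pure function. This is an illustration of the rule as described, not the code in cluster/sync.go; the Version type and resolveTie are hypothetical, and the function only covers the case this section describes (both versions carry the same timestamp).

```go
package main

import "fmt"

// Version is a hypothetical view of one node's copy of a key.
type Version struct {
	UUID         string // version UUID of the stored value
	Timestamp    int64  // write timestamp in milliseconds
	NodeJoinedAt int64  // joined_timestamp of the node holding this copy
}

// resolveTie picks the winner between two versions with identical timestamps:
// the copy from the older node wins, then the lexicographically smaller UUID.
func resolveTie(a, b Version) Version {
	if a.NodeJoinedAt != b.NodeJoinedAt {
		if a.NodeJoinedAt < b.NodeJoinedAt {
			return a
		}
		return b
	}
	if a.UUID <= b.UUID {
		return a
	}
	return b
}

func main() {
	a := Version{UUID: "bbbb", Timestamp: 1672531200000, NodeJoinedAt: 1672530000000}
	b := Version{UUID: "aaaa", Timestamp: 1672531200000, NodeJoinedAt: 1672530500000}
	fmt.Println("winner:", resolveTie(a, b).UUID)
}
```

Because the rule depends only on data both nodes can see, every node reaches the same verdict independently, which is what lets the cluster converge without coordination.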

Authentication & Authorization

  • JWT Tokens: Secure API access with scoped permissions
  • POSIX-Inspired ACLs: 12-bit permission system (owner/group/others with create/delete/write/read)
  • Resource Metadata: Each stored item has ownership and permission information
  • Feature Toggle: Can be completely disabled for simpler deployments
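
A 12-bit owner/group/others scheme like the one above can be checked with simple bit masks. The sketch below is an assumption-laden illustration, not the layout used in auth/permissions.go: the bit order (owner in the high 4 bits, then group, then others) and the per-class order (create, delete, write, read) are guesses based on the order in which the README lists them, and ResourceMeta and can are hypothetical names.

```go
package main

import "fmt"

// Operation bits within one 4-bit class (assumed order).
const (
	PermRead   = 1 << iota // 0b0001
	PermWrite              // 0b0010
	PermDelete             // 0b0100
	PermCreate             // 0b1000
)

// Bit offsets of each class within the 12-bit value (assumed layout).
const (
	shiftOthers uint = 0
	shiftGroup  uint = 4
	shiftOwner  uint = 8
)

// ResourceMeta is a hypothetical view of a resource's ownership metadata.
type ResourceMeta struct {
	OwnerUUID   string
	GroupUUID   string
	Permissions int // 12 bits: owner | group | others
}

// can reports whether the user may perform op on the resource.
// The most specific matching class (owner, then group, then others) decides.
func can(meta ResourceMeta, userUUID string, userGroups []string, op int) bool {
	shift := shiftOthers
	if userUUID == meta.OwnerUUID {
		shift = shiftOwner
	} else {
		for _, g := range userGroups {
			if g == meta.GroupUUID {
				shift = shiftGroup
				break
			}
		}
	}
	return (meta.Permissions>>shift)&op != 0
}

func main() {
	// Owner: all four bits; group: write+read; others: read only.
	meta := ResourceMeta{OwnerUUID: "user-abc123", GroupUUID: "group-dev", Permissions: 0b1111_0011_0001}
	fmt.Println(can(meta, "user-abc123", nil, PermDelete))
	fmt.Println(can(meta, "user-other", []string{"group-dev"}, PermWrite))
	fmt.Println(can(meta, "user-guest", nil, PermWrite))
}
```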

Operational Modes

  • Normal: Full read/write capabilities with all features
  • Read-Only: Rejects external writes but accepts internal replication
  • Syncing: Temporary mode during bootstrap, rejects external writes
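
The write-acceptance rule implied by these modes can be expressed as a small decision function. This is a sketch only: the mode strings other than "normal" (which appears in the /health response) are assumed, and the README does not say whether replication writes are accepted while syncing; this example assumes they are, since bootstrap itself consists of such writes.

```go
package main

import "fmt"

// Mode is the node's operational mode.
type Mode string

const (
	ModeNormal   Mode = "normal"
	ModeReadOnly Mode = "read-only" // assumed string value
	ModeSyncing  Mode = "syncing"   // assumed string value
)

// acceptsWrite reports whether a write should be accepted.
// internalReplication is true for writes arriving from cluster peers.
func acceptsWrite(mode Mode, internalReplication bool) bool {
	switch mode {
	case ModeNormal:
		return true
	case ModeReadOnly, ModeSyncing:
		// External client writes are rejected; replication still flows
		// (an assumption for ModeSyncing).
		return internalReplication
	default:
		return false // unknown modes fail closed
	}
}

func main() {
	fmt.Println(acceptsWrite(ModeNormal, false))
	fmt.Println(acceptsWrite(ModeReadOnly, false))
	fmt.Println(acceptsWrite(ModeReadOnly, true))
}
```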

🛠️ Development

Running Tests

# Build and run comprehensive integration tests
go build -o kvs .
./integration_test.sh

# Manual basic functionality test
./kvs &
sleep 2  # give the server a moment to start listening
curl http://localhost:8080/health
pkill kvs

# Manual cluster test (requires creating configs)
echo 'node_id: "test1"
port: 8081
seed_nodes: []
auth_enabled: false' > test1.yaml

echo 'node_id: "test2"
port: 8082
seed_nodes: ["127.0.0.1:8081"]
auth_enabled: false' > test2.yaml

./kvs test1.yaml &
./kvs test2.yaml &

# Test data replication (wait for cluster formation)
sleep 10
curl -X PUT http://localhost:8081/kv/test/data \
  -H "Content-Type: application/json" \
  -d '{"message":"hello world"}'

# Wait for Merkle sync, then check replication
sleep 30
curl http://localhost:8082/kv/test/data

# Cleanup
pkill kvs
rm test1.yaml test2.yaml

Conflict Resolution Testing

# Create conflicting data scenario using utility
go run test_conflict.go /tmp/conflict1 /tmp/conflict2

# Create configs for conflict test
echo 'node_id: "conflict1"
port: 9111
data_dir: "/tmp/conflict1"
seed_nodes: []
auth_enabled: false
log_level: "debug"' > conflict1.yaml

echo 'node_id: "conflict2"
port: 9112
data_dir: "/tmp/conflict2"
seed_nodes: ["127.0.0.1:9111"]
auth_enabled: false
log_level: "debug"' > conflict2.yaml

# Start nodes with conflicting data
./kvs conflict1.yaml &
./kvs conflict2.yaml &

# Watch logs for conflict resolution
# Both nodes will converge within ~10-30 seconds
# Check final state
sleep 30
curl http://localhost:9111/kv/test/conflict/data
curl http://localhost:9112/kv/test/conflict/data

pkill kvs
rm conflict1.yaml conflict2.yaml

Code Quality

# Format and lint
go fmt ./...
go vet ./...

# Dependency management
go mod tidy
go mod verify

# Build verification
go build .

Project Structure

kvs/
├── main.go                    # Main application entry point
├── config.yaml                # Default configuration (auto-generated)
├── integration_test.sh        # Comprehensive test suite
├── test_conflict.go           # Conflict resolution testing utility
├── CLAUDE.md                  # Development guidance for Claude Code
├── go.mod                     # Go module dependencies
├── go.sum                     # Go module checksums
├── README.md                  # This documentation
│
├── auth/                      # Authentication & authorization
│   ├── auth.go               # Main auth logic
│   ├── jwt.go                # JWT token management
│   ├── middleware.go         # HTTP middleware
│   ├── permissions.go        # POSIX-inspired ACL system
│   └── storage.go            # Auth data storage
│
├── cluster/                   # Distributed systems components
│   ├── bootstrap.go          # New node integration
│   ├── gossip.go             # Membership protocol
│   ├── merkle.go             # Merkle tree implementation
│   └── sync.go               # Data synchronization & conflict resolution
│
├── config/                    # Configuration management
│   └── config.go             # Config loading & defaults
│
├── features/                  # Utility features
│   ├── auth.go               # Auth utilities
│   ├── backup.go             # Backup system
│   ├── features.go           # Feature toggles
│   ├── ratelimit.go          # Rate limiting
│   ├── revision.go           # Revision history
│   ├── tamperlog.go          # Tamper-evident logging
│   └── validation.go         # TTL parsing
│
├── server/                    # HTTP server & API
│   ├── handlers.go           # Request handlers
│   ├── lifecycle.go          # Server lifecycle
│   ├── routes.go             # Route definitions
│   └── server.go             # Server setup
│
├── storage/                   # Data storage abstraction
│   ├── compression.go        # ZSTD compression
│   ├── revision.go           # Revision history
│   └── storage.go            # BadgerDB interface
│
├── types/                     # Shared type definitions
│   └── types.go              # All data structures
│
└── utils/                     # Utilities
    └── hash.go               # Cryptographic hashing

Key Data Structures

Stored Value Format

type StoredValue struct {
    UUID      string          `json:"uuid"`      // Unique version identifier
    Timestamp int64           `json:"timestamp"` // Unix timestamp (milliseconds)
    Data      json.RawMessage `json:"data"`      // Actual user JSON payload
}
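
As a brief illustration of how this struct serializes, the sketch below round-trips a value through JSON. The struct is copied from above; encode and decode are hypothetical helper names. Using json.RawMessage means the user payload is carried through as-is rather than being re-interpreted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StoredValue is the format documented above.
type StoredValue struct {
	UUID      string          `json:"uuid"`      // Unique version identifier
	Timestamp int64           `json:"timestamp"` // Unix timestamp (milliseconds)
	Data      json.RawMessage `json:"data"`      // Actual user JSON payload
}

// encode marshals a StoredValue to JSON.
func encode(v StoredValue) ([]byte, error) {
	return json.Marshal(v)
}

// decode parses JSON bytes back into a StoredValue.
func decode(b []byte) (StoredValue, error) {
	var v StoredValue
	err := json.Unmarshal(b, &v)
	return v, err
}

func main() {
	v := StoredValue{
		UUID:      "a1b2c3d4-e5f6-7890-1234-567890abcdef",
		Timestamp: 1672531200000,
		Data:      json.RawMessage(`{"name":"John Doe"}`),
	}
	b, err := encode(v)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))

	back, err := decode(b)
	if err != nil {
		panic(err)
	}
	fmt.Println(back.UUID, string(back.Data))
}
```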

BadgerDB Storage

  • Main Key: Direct path mapping (e.g., users/john/profile)
  • Index Key: _ts:{timestamp}:{path} for efficient time-based queries
  • Values: JSON-marshaled StoredValue structures
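
The index key layout can be sketched with two small helpers. These are hypothetical functions written for illustration; in particular, the README does not say whether the timestamp is zero-padded, so this example writes it as a plain decimal number.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const indexPrefix = "_ts"

// indexKey builds the time index key for a path: _ts:{timestamp}:{path}.
func indexKey(timestamp int64, path string) string {
	return fmt.Sprintf("%s:%d:%s", indexPrefix, timestamp, path)
}

// parseIndexKey splits an index key back into timestamp and path.
// SplitN with a limit of 3 keeps any ":" inside the path intact.
func parseIndexKey(key string) (int64, string, error) {
	parts := strings.SplitN(key, ":", 3)
	if len(parts) != 3 || parts[0] != indexPrefix {
		return 0, "", fmt.Errorf("not an index key: %q", key)
	}
	ts, err := strconv.ParseInt(parts[1], 10, 64)
	if err != nil {
		return 0, "", fmt.Errorf("bad timestamp in %q: %w", key, err)
	}
	return ts, parts[2], nil
}

func main() {
	k := indexKey(1672531200000, "users/john/profile")
	fmt.Println(k)
	ts, path, err := parseIndexKey(k)
	fmt.Println(ts, path, err)
}
```

Unpadded decimal timestamps sort correctly as strings only while they have the same number of digits; millisecond Unix timestamps stay at 13 digits until the year 2286, so prefix scans over _ts: keys return entries in time order in practice.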

🔧 Configuration Options Explained

Setting | Description | Default | Notes

Core Settings
node_id | Unique identifier for this node | hostname | Must be unique across cluster
bind_address | IP address to bind HTTP server | "127.0.0.1" | Use 0.0.0.0 for external access
port | HTTP port for API and cluster communication | 8080 | Must be accessible to peers
data_dir | Directory for BadgerDB storage | "./data" | Created if it doesn't exist
seed_nodes | List of initial cluster nodes | [] | Empty = standalone mode
read_only | Enable read-only mode | false | Accepts replication, rejects client writes
log_level | Logging verbosity | "info" | debug/info/warn/error

Cluster Timing
gossip_interval_min/max | Gossip frequency range | 60-120 sec | Randomized interval
sync_interval | Regular Merkle sync frequency | 300 sec | How often to sync with peers
catchup_interval | Catch-up sync frequency | 120 sec | Faster sync when behind
bootstrap_max_age_hours | Max historical data to sync | 720 hours | 30 days default
throttle_delay_ms | Delay between sync requests | 100 ms | Prevents overwhelming peers
fetch_delay_ms | Delay between individual fetches | 50 ms | Rate limiting

Feature Toggles
auth_enabled | JWT authentication system | true | Complete auth/authz system
allow_anonymous_read | Allow unauthenticated read access | false | When auth_enabled, controls KV GET endpoints
allow_anonymous_write | Allow unauthenticated write access | false | When auth_enabled, controls KV PUT endpoints
clustering_enabled | Gossip protocol and sync | true | Distributed mode
compression_enabled | ZSTD compression | true | Reduces storage size
rate_limiting_enabled | Rate limiting | true | Per-client limits
tamper_logging_enabled | Cryptographic audit trail | true | Security logging
revision_history_enabled | Automatic versioning | true | Data history tracking

🚨 Important Notes

Consistency Model

  • Eventual Consistency: Data will eventually be consistent across all nodes
  • Local-First: All operations succeed locally first, then replicate
  • No Transactions: Each key operation is independent
  • Conflict Resolution: Automatic resolution of timestamp collisions

Network Requirements

  • All nodes must be able to reach each other via HTTP
  • Firewalls must allow traffic on configured ports
  • IPv4 private networks supported (IPv6 not tested)

Limitations

  • No encryption in transit (use reverse proxy for TLS)
  • No cross-key transactions or ACID guarantees
  • No complex queries (key-based lookups only)
  • No automatic data sharding (single keyspace per cluster)
  • No multi-datacenter replication

Performance Characteristics

  • Read Latency: ~1ms (local BadgerDB lookup)
  • Write Latency: ~5ms (local write + indexing + optional compression)
  • Replication Lag: 10-30 seconds with Merkle tree sync
  • Memory Usage: Minimal (BadgerDB + Merkle tree caching)
  • Disk Usage: Raw JSON + metadata + optional compression (10-50% savings)
  • Conflict Resolution: Sub-second convergence time
  • Cluster Formation: ~10-20 seconds for gossip stabilization

🛡️ Production Considerations

Deployment

  • Use systemd or similar for process management
  • Configure log rotation for JSON logs
  • Set up monitoring for /health endpoint
  • Use reverse proxy (nginx/traefik) for TLS and load balancing

Monitoring

  • Monitor /health endpoint for node status
  • Watch logs for conflict resolution events
  • Track member count for cluster health
  • Monitor disk usage in data directories

Backup Strategy

  • BadgerDB supports snapshots
  • Data directories can be backed up while running
  • Consider backing up multiple nodes for redundancy

Scaling

  • Add new nodes by configuring existing cluster members as seeds
  • Remove nodes gracefully using /members/leave endpoint
  • Cluster can operate with any number of nodes (tested with 2-10)

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Built with ❤️ in Go | Powered by BadgerDB | Inspired by distributed systems theory
