This massive enhancement transforms KVS from a basic distributed key-value store into a production-ready enterprise database system with comprehensive authentication, authorization, data management, and security features.

PHASE 2.1: CORE AUTHENTICATION & AUTHORIZATION
• Complete JWT-based authentication system with SHA3-512 security
• User and group management with CRUD APIs (/api/users, /api/groups)
• POSIX-inspired 12-bit ACL permission model (Owner/Group/Others: CDWR)
• Token management system with configurable expiration (default 1h)
• Authorization middleware with resource-level permission checking
• SHA3-512 hashing utilities for secure credential storage

PHASE 2.2: ADVANCED DATA MANAGEMENT
• ZSTD compression system with configurable levels (1-19, default 3)
• TTL support with resource metadata and automatic expiration
• 3-version revision history system with automatic rotation
• JSON size validation with configurable limits (default 1MB)
• Enhanced storage utilities with compression/decompression
• Resource metadata tracking (owner, group, permissions, timestamps)

PHASE 2.3: ENTERPRISE SECURITY & OPERATIONS
• Per-user rate limiting with sliding window algorithm
• Tamper-evident logging with cryptographic signatures (SHA3-512)
• Automated backup scheduling using cron (default: daily at midnight)
• ZSTD-compressed database snapshots with automatic cleanup
• Configurable backup retention policies (default: 7 days)
• Backup status monitoring API (/api/backup/status)

TECHNICAL ADDITIONS
• New dependencies: JWT v4, crypto/sha3, zstd compression, cron v3
• Extended configuration system with comprehensive Phase 2 settings
• API endpoints: 13 new endpoints for authentication, management, monitoring
• Storage patterns: user:<uuid>, group:<uuid>, token:<hash>, ratelimit:<user>:<window>
• Revision history: data:<key>:rev:[1-3] with metadata integration
• Tamper logs: log:<timestamp>:<uuid> with permanent retention

BACKWARD COMPATIBILITY
• All existing APIs remain fully functional
• Existing Merkle tree replication system unchanged
• New features can be disabled via configuration
• Migration-ready design for upgrading existing deployments

This implementation adds 1,500+ lines of sophisticated enterprise code while maintaining the distributed, eventually-consistent architecture. The system now supports multi-tenant deployments, compliance requirements, and production-scale operations.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
KVS Development Phase 2: Implementation Specification
Executive Summary
This document specifies the next development phase for the KVS (Key-Value Store) distributed database. Phase 2 adds authentication, authorization, data management improvements, and basic security features while maintaining backward compatibility with the existing Merkle tree-based replication system.
1. Authentication & Authorization System
1.1 Core Components
Users
- Identified by UUID (generated server-side)
- Nickname stored as SHA3-512 hash
- Can belong to multiple groups
- Storage key: user:<uuid>
Groups
- Identified by UUID (generated server-side)
- Group name stored as SHA3-512 hash
- Contains list of member user UUIDs
- Storage key: group:<uuid>
API Tokens
- JWT tokens with SHA3-512 hashed storage
- 1-hour default expiration (configurable)
- Storage key: token:<sha3-512-hash>
1.2 Permission Model
POSIX-inspired ACL framework with 12-bit permissions:
- 4 bits each for Owner/Group/Others
- Operations: Create(C), Delete(D), Write(W), Read(R)
- Default permissions: Owner(1111), Group(0110), Others(0010)
- Stored as integer bitmask in resource metadata
Resource Metadata Schema:
{
  "owner_uuid": "string",
  "group_uuid": "string",
  "permissions": 3826,  // 12-bit integer
  "ttl": "24h"
}
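The 12-bit layout can be illustrated with a minimal Go sketch. The spec fixes only "4 bits each for Owner/Group/Others"; placing Owner in the high nibble and mapping C, D, W, R to bits 3 to 0 are assumptions made here, and the names PackPermissions and Allows are hypothetical.

```go
package main

import "fmt"

// Operation bits within one 4-bit class. Assumption: the C-D-W-R order
// in the spec maps to bits 3..0; the spec does not confirm this.
const (
	OpRead   = 1 << 0
	OpWrite  = 1 << 1
	OpDelete = 1 << 2
	OpCreate = 1 << 3
)

// Shift amounts per class. Assumption: Owner occupies the high nibble.
const (
	ShiftOthers = 0
	ShiftGroup  = 4
	ShiftOwner  = 8
)

// PackPermissions combines three 4-bit class masks into one 12-bit integer.
func PackPermissions(owner, group, others int) int {
	return (owner&0xF)<<ShiftOwner | (group&0xF)<<ShiftGroup | (others&0xF)<<ShiftOthers
}

// Allows reports whether the class selected by shift grants op.
func Allows(perms, shift, op int) bool {
	return (perms>>shift)&op != 0
}

func main() {
	// Defaults from the spec: Owner(1111), Group(0110), Others(0010).
	def := PackPermissions(0b1111, 0b0110, 0b0010)
	fmt.Printf("%d %012b\n", def, def) // 3938 111101100010
	fmt.Println("group may write:", Allows(def, ShiftGroup, OpWrite))
	fmt.Println("others may read:", Allows(def, ShiftOthers, OpRead))
}
```

Under this assumed bit order the default Others mask 0010 grants Write but not Read, so the intended order of the operation bits may differ and should be confirmed before relying on the sketch.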
1.3 API Endpoints
User Management
POST /api/users
Body: {"nickname": "string"}
Returns: {"uuid": "string"}
GET /api/users/{uuid}
PUT /api/users/{uuid}
Body: {"nickname": "string", "groups": ["uuid1", "uuid2"]}
DELETE /api/users/{uuid}
Group Management
POST /api/groups
Body: {"groupname": "string", "members": ["uuid1", "uuid2"]}
Returns: {"uuid": "string"}
GET /api/groups/{uuid}
PUT /api/groups/{uuid}
Body: {"members": ["uuid1", "uuid2"]}
DELETE /api/groups/{uuid}
Token Management
POST /api/tokens
Body: {"user_uuid": "string", "scopes": ["read", "write"]}
Returns: {"token": "jwt-string", "expires_at": "timestamp"}
All endpoints require an Authorization: Bearer <token> header.
1.4 Implementation Requirements
- Use golang.org/x/crypto/sha3 for all hashing
- Store token SHA3-512 hash in BadgerDB with TTL
- Implement CheckPermission(userUUID, resourceKey, operation) bool function
- Include user/group data in existing Merkle tree replication
- Create migration script for existing data (add default metadata)
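One possible shape for CheckPermission is sketched below. It is not the project's implementation: the Store type is a hypothetical in-memory stand-in for the BadgerDB lookups, the bit layout (Owner in the high nibble, C-D-W-R as bits 3 to 0) is an assumption, and only the first matching class is consulted, as in POSIX.

```go
package main

import "fmt"

// Metadata mirrors the resource metadata schema from section 1.2.
type Metadata struct {
	OwnerUUID   string
	GroupUUID   string
	Permissions int // 12-bit mask
}

// Store is a hypothetical in-memory stand-in for the BadgerDB lookups.
type Store struct {
	Resources map[string]Metadata        // resourceKey -> metadata
	Members   map[string]map[string]bool // groupUUID -> set of user UUIDs
}

// Operation bits, assuming C-D-W-R maps to bits 3..0 of each class.
const (
	OpRead = 1 << iota
	OpWrite
	OpDelete
	OpCreate
)

// CheckPermission selects the owner, group, or others class for the user
// and tests the requested operation bit. Unknown resources are denied.
func (s *Store) CheckPermission(userUUID, resourceKey string, operation int) bool {
	md, ok := s.Resources[resourceKey]
	if !ok {
		return false
	}
	shift := 0 // others
	switch {
	case md.OwnerUUID == userUUID:
		shift = 8
	case s.Members[md.GroupUUID][userUUID]:
		shift = 4
	}
	return (md.Permissions>>shift)&operation != 0
}

// demoStore builds one resource with the spec's default permissions (3938
// under the assumed layout), owned by "alice" with group "staff".
func demoStore() *Store {
	return &Store{
		Resources: map[string]Metadata{
			"data:doc": {OwnerUUID: "alice", GroupUUID: "staff", Permissions: 3938},
		},
		Members: map[string]map[string]bool{"staff": {"bob": true}},
	}
}

func main() {
	s := demoStore()
	fmt.Println("alice read:", s.CheckPermission("alice", "data:doc", OpRead))
	fmt.Println("bob write:", s.CheckPermission("bob", "data:doc", OpWrite))
	fmt.Println("carol delete:", s.CheckPermission("carol", "data:doc", OpDelete))
}
```

Denying unknown resources by default is a design choice of the sketch; the spec does not say how missing metadata should be treated before the migration script has run.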
2. Database Enhancements
2.1 ZSTD Compression
Configuration:
database:
  compression_enabled: true
  compression_level: 3  # 1-19, balance performance/ratio
Implementation:
- Use github.com/klauspost/compress/zstd
- Compress all JSON values before BadgerDB storage
- Decompress on read operations
- Optional: Batch recompression of existing data on startup
2.2 TTL (Time-To-Live)
Features:
- Per-key TTL support via resource metadata
- Global default TTL configuration (optional)
- Automatic expiration via BadgerDB's native TTL
- TTL applied to main data and revision keys
API Integration:
// In PUT/POST requests
{
  "data": {...},
  "ttl": "24h"  // Go duration format
}
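Because the "ttl" field uses Go duration syntax, request validation can lean on time.ParseDuration. The sketch below assumes that an empty string or "0" means "no expiry" (matching default_ttl: "0" in the configuration example) and that negative values are rejected; ParseTTL is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"time"
)

// ParseTTL converts a request's "ttl" string into a duration.
// An empty string or "0" yields 0, interpreted here as "no expiry".
func ParseTTL(s string) (time.Duration, error) {
	if s == "" {
		return 0, nil
	}
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, fmt.Errorf("invalid ttl %q: %w", s, err)
	}
	if d < 0 {
		return 0, fmt.Errorf("ttl must not be negative: %q", s)
	}
	return d, nil
}

func main() {
	// "1d" is rejected: Go durations have no day unit, so 24h must be used.
	for _, in := range []string{"24h", "90m", "0", "1d"} {
		d, err := ParseTTL(in)
		fmt.Printf("%q -> %v (err: %v)\n", in, d, err)
	}
}
```

A non-zero result could then be handed to BadgerDB when the entry is written, for example through the Entry.WithTTL method, for both the main key and its revision keys.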
2.3 Revision History
Storage Pattern:
- Main data: data:<key>
- Revisions: data:<key>:rev:1, data:<key>:rev:2, data:<key>:rev:3
- Metadata: data:<key>:metadata includes "revisions": [1,2,3]
Rotation Logic:
- On write: delete the oldest revision (rev:3), then shift rev:2→rev:3 and rev:1→rev:2, and store the new value as rev:1
- Store up to 3 revisions per key
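The rotation can be sketched with a plain map standing in for BadgerDB. The order of the moves matters: shifting the oldest slot first avoids overwriting a revision before it has been copied. Rotate and revKey are hypothetical names, and a real implementation would presumably perform all of the moves inside a single BadgerDB transaction.

```go
package main

import "fmt"

const maxRevisions = 3

// revKey builds the revision key for a data key, e.g. "data:foo:rev:2".
func revKey(key string, n int) string {
	return fmt.Sprintf("data:%s:rev:%d", key, n)
}

// Rotate drops the oldest revision, shifts the remaining ones up by one
// slot (highest slot first), then stores value as rev:1 and as main data.
func Rotate(db map[string][]byte, key string, value []byte) {
	delete(db, revKey(key, maxRevisions))
	for n := maxRevisions - 1; n >= 1; n-- {
		if v, ok := db[revKey(key, n)]; ok {
			db[revKey(key, n+1)] = v
			delete(db, revKey(key, n))
		}
	}
	db[revKey(key, 1)] = value
	db["data:"+key] = value
}

func main() {
	db := map[string][]byte{}
	for _, v := range []string{"v1", "v2", "v3", "v4"} {
		Rotate(db, "foo", []byte(v))
	}
	// After four writes only the three most recent values remain.
	for n := 1; n <= maxRevisions; n++ {
		fmt.Println(revKey("foo", n), "=", string(db[revKey("foo", n)]))
	}
}
```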
API Endpoints:
GET /api/data/{key}/history
Returns: {"revisions": [{"number": 1, "timestamp": "..."}]}
GET /api/data/{key}/history/{revision}
Returns: StoredValue for specific revision
2.4 Backup System
Configuration:
backups:
  enabled: true
  schedule: "0 0 * * *"  # Daily midnight
  path: "/backups"
  retention: 7  # days
Implementation:
- Use github.com/robfig/cron/v3 for scheduling
- Create ZSTD-compressed BadgerDB snapshots
- Filename format: kvs-backup-YYYY-MM-DD.zstd
- Automatic cleanup of old backups
- Status API: GET /api/backup/status
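The filename format and the retention cleanup can be sketched with the standard library alone. The helper names (BackupName, Expired, dateUTC) are hypothetical, and treating a backup as expired only when its date is strictly older than the retention cutoff is an assumption; the spec does not define the boundary.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

const (
	prefix = "kvs-backup-"
	suffix = ".zstd"
	layout = "2006-01-02" // Go reference layout for YYYY-MM-DD
)

// BackupName returns the filename for a backup taken at t.
func BackupName(t time.Time) string {
	return prefix + t.Format(layout) + suffix
}

// Expired returns the names whose embedded date lies more than
// retentionDays before now. Names not matching the pattern are ignored.
func Expired(names []string, now time.Time, retentionDays int) []string {
	cutoff := now.AddDate(0, 0, -retentionDays)
	var old []string
	for _, name := range names {
		if !strings.HasPrefix(name, prefix) || !strings.HasSuffix(name, suffix) {
			continue
		}
		day := strings.TrimSuffix(strings.TrimPrefix(name, prefix), suffix)
		t, err := time.Parse(layout, day)
		if err != nil {
			continue
		}
		if t.Before(cutoff) {
			old = append(old, name)
		}
	}
	return old
}

// dateUTC is a small helper for building midnight-UTC dates.
func dateUTC(y int, m time.Month, d int) time.Time {
	return time.Date(y, m, d, 0, 0, 0, 0, time.UTC)
}

func main() {
	now := dateUTC(2025, 9, 11)
	fmt.Println(BackupName(now)) // kvs-backup-2025-09-11.zstd
	names := []string{
		"kvs-backup-2025-09-11.zstd",
		"kvs-backup-2025-09-04.zstd",
		"kvs-backup-2025-09-03.zstd",
		"notes.txt",
	}
	fmt.Println(Expired(names, now, 7))
}
```

Parsing the date from the filename rather than reading file modification times keeps the cleanup deterministic if backups are copied or restored; that is a design choice of the sketch, not something the spec prescribes.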
2.5 JSON Size Limits
Configuration:
database:
  max_json_size: 1048576  # 1MB default
Implementation:
- Check size before compression/storage
- Return HTTP 413 if exceeded
- Apply to main data and revisions
- Log oversized attempts
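A size check that runs before compression and storage might look like the following net/http middleware sketch. The names limitBody and statusFor and the request path /kv/example are hypothetical; the sketch reads at most one byte beyond the limit so that an oversized body is detected without buffering all of it.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"strings"
)

const maxJSONSize = 1 << 20 // 1048576 bytes, the spec's default

// limitBody rejects request bodies larger than limit with HTTP 413
// before the wrapped handler sees them.
func limitBody(limit int64, next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(io.LimitReader(r.Body, limit+1))
		if err != nil {
			http.Error(w, "read error", http.StatusBadRequest)
			return
		}
		if int64(len(body)) > limit {
			log.Printf("oversized payload rejected: path=%s limit=%d", r.URL.Path, limit)
			http.Error(w, "payload too large", http.StatusRequestEntityTooLarge)
			return
		}
		// Hand the already-read body on to the next handler.
		r.Body = io.NopCloser(bytes.NewReader(body))
		next(w, r)
	}
}

// statusFor sends a body of n bytes through the limited handler and
// returns the response status code.
func statusFor(n int) int {
	h := limitBody(maxJSONSize, func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	body := strings.NewReader(strings.Repeat("x", n))
	req := httptest.NewRequest(http.MethodPut, "/kv/example", body) // hypothetical path
	rec := httptest.NewRecorder()
	h(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(maxJSONSize))     // exactly at the limit
	fmt.Println(statusFor(maxJSONSize + 1)) // one byte over the limit
}
```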
3. Security Features
3.1 Rate Limiting
Configuration:
rate_limit:
  requests: 100
  window: "1m"
Implementation:
- Per-user rate limiting using BadgerDB counters
- Key pattern: ratelimit:<user_uuid>:<window_start>
- Return HTTP 429 when limit exceeded
- Counters have TTL equal to window duration
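The key pattern above suggests one counter per user per window. A minimal in-memory sketch of that scheme follows; the Limiter type stands in for the BadgerDB counters, encoding <window_start> as a Unix timestamp is an assumption, and the sketch is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"time"
)

// Limiter is a hypothetical in-memory stand-in for the BadgerDB counters.
type Limiter struct {
	requests int
	window   time.Duration
	counts   map[string]int
}

// NewLimiter takes the window in the config's string form, e.g. "1m".
func NewLimiter(requests int, window string) (*Limiter, error) {
	d, err := time.ParseDuration(window)
	if err != nil {
		return nil, err
	}
	if d <= 0 {
		return nil, fmt.Errorf("window must be positive, got %q", window)
	}
	return &Limiter{requests: requests, window: d, counts: map[string]int{}}, nil
}

// Key builds ratelimit:<user_uuid>:<window_start>, using the Unix time of
// the start of the window that contains now.
func (l *Limiter) Key(userUUID string, now time.Time) string {
	return fmt.Sprintf("ratelimit:%s:%d", userUUID, now.Truncate(l.window).Unix())
}

// Allow increments the user's counter for the current window and reports
// whether the request is within the limit; false corresponds to HTTP 429.
func (l *Limiter) Allow(userUUID string, now time.Time) bool {
	k := l.Key(userUUID, now)
	l.counts[k]++
	return l.counts[k] <= l.requests
}

// at converts Unix seconds to a time.Time for the demo below.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

func main() {
	l, err := NewLimiter(2, "1m")
	if err != nil {
		panic(err)
	}
	for _, sec := range []int64{0, 10, 59, 60} {
		fmt.Println(l.Key("user-1", at(sec)), l.Allow("user-1", at(sec)))
	}
}
```

In the stored version each counter would carry a TTL equal to the window, so old windows disappear on their own. Counting per window start gives a fixed-window limiter; a sliding-window variant would need to combine the current and previous window counts.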
3.2 Tamper-Evident Logs
Log Entry Schema:
{
  "timestamp": "2025-09-11T17:29:00Z",
  "action": "data_write",  // Configurable actions
  "user_uuid": "string",
  "resource": "string",
  "signature": "sha3-512 hash"  // SHA3-512 hash of all other fields
}
Storage:
- Key: log:<timestamp>:<uuid>
- Compressed with ZSTD
- Hourly Merkle tree roots: log:merkle:<timestamp>
- Include in cluster replication
Configurable Actions:
tamper_logs:
  actions: ["data_write", "user_create", "auth_failure"]
4. Implementation Phases
Phase 2.1: Core Authentication
- Implement user/group storage schema
- Add SHA3-512 hashing utilities
- Create basic CRUD APIs for users/groups
- Implement JWT token generation/validation
- Add authorization middleware
Phase 2.2: Data Features
- Add ZSTD compression to BadgerDB operations
- Implement TTL support in resource metadata
- Build revision history system
- Add JSON size validation
Phase 2.3: Security & Operations
- Implement rate limiting middleware
- Add tamper-evident logging system
- Build backup scheduling system
- Create migration scripts for existing data
Phase 2.4: Integration & Testing
- Integrate auth with existing replication
- End-to-end testing of all features
- Performance benchmarking
- Documentation updates
5. Configuration Example
node_id: "node1"
bind_address: "127.0.0.1"
port: 8080
data_dir: "./data"

database:
  compression_enabled: true
  compression_level: 3
  max_json_size: 1048576
  default_ttl: "0"  # No default TTL

backups:
  enabled: true
  schedule: "0 0 * * *"
  path: "/backups"
  retention: 7

rate_limit:
  requests: 100
  window: "1m"

tamper_logs:
  actions: ["data_write", "user_create", "auth_failure"]
6. Migration Strategy
- Backward Compatibility: All existing APIs remain functional
- Optional Features: New features can be disabled via configuration
7. Dependencies
New Libraries:
- golang.org/x/crypto/sha3 - SHA3-512 hashing
- github.com/klauspost/compress/zstd - Compression
- github.com/robfig/cron/v3 - Backup scheduling
- github.com/golang-jwt/jwt/v4 - JWT tokens (recommended)
Existing Libraries (no changes):
- github.com/dgraph-io/badger/v4
- github.com/google/uuid
- github.com/gorilla/mux
- github.com/sirupsen/logrus