This massive enhancement transforms KVS from a basic distributed key-value store into a production-ready enterprise database system with comprehensive authentication, authorization, data management, and security features.

PHASE 2.1: CORE AUTHENTICATION & AUTHORIZATION
• Complete JWT-based authentication system with SHA3-512 security
• User and group management with CRUD APIs (/api/users, /api/groups)
• POSIX-inspired 12-bit ACL permission model (Owner/Group/Others: CDWR)
• Token management system with configurable expiration (default 1h)
• Authorization middleware with resource-level permission checking
• SHA3-512 hashing utilities for secure credential storage

PHASE 2.2: ADVANCED DATA MANAGEMENT
• ZSTD compression system with configurable levels (1-19, default 3)
• TTL support with resource metadata and automatic expiration
• 3-version revision history system with automatic rotation
• JSON size validation with configurable limits (default 1MB)
• Enhanced storage utilities with compression/decompression
• Resource metadata tracking (owner, group, permissions, timestamps)

PHASE 2.3: ENTERPRISE SECURITY & OPERATIONS
• Per-user rate limiting with sliding window algorithm
• Tamper-evident logging with cryptographic signatures (SHA3-512)
• Automated backup scheduling using cron (default: daily at midnight)
• ZSTD-compressed database snapshots with automatic cleanup
• Configurable backup retention policies (default: 7 days)
• Backup status monitoring API (/api/backup/status)

TECHNICAL ADDITIONS
• New dependencies: JWT v4, crypto/sha3, zstd compression, cron v3
• Extended configuration system with comprehensive Phase 2 settings
• API endpoints: 13 new endpoints for authentication, management, monitoring
• Storage patterns: user:<uuid>, group:<uuid>, token:<hash>, ratelimit:<user>:<window>
• Revision history: data:<key>:rev:[1-3] with metadata integration
• Tamper logs: log:<timestamp>:<uuid> with permanent retention

BACKWARD COMPATIBILITY
• All existing APIs remain fully functional
• Existing Merkle tree replication system unchanged
• New features can be disabled via configuration
• Migration-ready design for upgrading existing deployments

This implementation adds 1,500+ lines of sophisticated enterprise code while maintaining the distributed, eventually-consistent architecture. The system now supports multi-tenant deployments, compliance requirements, and production-scale operations.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

# KVS Development Phase 2: Implementation Specification

## Executive Summary

This document specifies the next development phase for the KVS (Key-Value Store) distributed database. Phase 2 adds authentication, authorization, data management improvements, and basic security features while maintaining backward compatibility with the existing Merkle tree-based replication system.

## 1. Authentication & Authorization System

### 1.1 Core Components

**Users**
- Identified by UUID (generated server-side)
- Nickname stored as SHA3-512 hash
- Can belong to multiple groups
- Storage key: `user:<uuid>`

**Groups**
- Identified by UUID (generated server-side)
- Group name stored as SHA3-512 hash
- Contains list of member user UUIDs
- Storage key: `group:<uuid>`

**API Tokens**
- JWT tokens with SHA3-512 hashed storage
- 1-hour default expiration (configurable)
- Storage key: `token:<sha3-512-hash>`

### 1.2 Permission Model

**POSIX-inspired ACL framework** with 12-bit permissions:
- 4 bits each for Owner/Group/Others
- Operations: Create(C), Delete(D), Write(W), Read(R)
- Default permissions: Owner(1111), Group(0110), Others(0010)
- Stored as integer bitmask in resource metadata

**Resource Metadata Schema**:
```json
{
  "owner_uuid": "string",
  "group_uuid": "string",
  "permissions": 3826,  // 12-bit integer
  "ttl": "24h"
}
```
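
The 12-bit mask can be packed and unpacked with plain shifts. The following is a minimal sketch, not the project's implementation: it assumes the owner class occupies the highest four bits and that the bits inside each class follow the C-D-W-R order with Create as the highest bit. The spec does not state either layout, and the names `Pack`, `Unpack`, and the `Perm*` constants are hypothetical.

```go
package main

import "fmt"

// Operation bits within one 4-bit class. The C-D-W-R order (Create as the
// highest bit) is an assumption; the spec only lists the four operations.
const (
	PermRead   = 1 << iota // 0001
	PermWrite              // 0010
	PermDelete             // 0100
	PermCreate             // 1000
)

// Pack combines three 4-bit classes into one 12-bit mask, assuming the
// owner occupies the highest four bits and others the lowest four.
func Pack(owner, group, others int) int {
	return (owner&0xF)<<8 | (group&0xF)<<4 | (others & 0xF)
}

// Unpack splits a 12-bit mask back into its owner, group, and others classes.
func Unpack(mask int) (owner, group, others int) {
	return (mask >> 8) & 0xF, (mask >> 4) & 0xF, mask & 0xF
}

func main() {
	// Defaults from the spec: Owner(1111), Group(0110), Others(0010).
	def := Pack(0b1111, 0b0110, 0b0010)
	fmt.Printf("%d %012b\n", def, def) // 3938 111101100010

	// Decode the value used in the metadata schema example.
	o, g, x := Unpack(3826)
	fmt.Printf("%04b %04b %04b\n", o, g, x) // 1110 1111 0010

	// Test a single operation bit inside a class.
	fmt.Println(g&PermWrite != 0) // true
}
```

Under these assumptions the default permissions pack to 3938, while the 3826 shown in the schema example decodes to Owner(1110), Group(1111), Others(0010).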

### 1.3 API Endpoints

**User Management**
```
POST /api/users
Body: {"nickname": "string"}
Returns: {"uuid": "string"}

GET /api/users/{uuid}
PUT /api/users/{uuid}
Body: {"nickname": "string", "groups": ["uuid1", "uuid2"]}
DELETE /api/users/{uuid}
```

**Group Management**
```
POST /api/groups
Body: {"groupname": "string", "members": ["uuid1", "uuid2"]}
Returns: {"uuid": "string"}

GET /api/groups/{uuid}
PUT /api/groups/{uuid}
Body: {"members": ["uuid1", "uuid2"]}
DELETE /api/groups/{uuid}
```

**Token Management**
```
POST /api/tokens
Body: {"user_uuid": "string", "scopes": ["read", "write"]}
Returns: {"token": "jwt-string", "expires_at": "timestamp"}
```

All endpoints require an `Authorization: Bearer <token>` header.
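
The header requirement could be enforced by a small middleware in front of the routes. This sketch uses only the standard library; `ParseBearer` and `RequireBearer` are hypothetical names, answering with HTTP 401 is an assumption (the spec does not name a status code), and validating the JWT itself is left out.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// ParseBearer extracts the token from an Authorization header value of the
// form "Bearer <token>". It returns false if the value is missing,
// uses another scheme, or carries an empty token.
func ParseBearer(header string) (string, bool) {
	const prefix = "Bearer "
	if len(header) < len(prefix) || !strings.EqualFold(header[:len(prefix)], prefix) {
		return "", false
	}
	tok := strings.TrimSpace(header[len(prefix):])
	return tok, tok != ""
}

// RequireBearer rejects requests that carry no bearer token.
// The 401 status is an assumption; JWT validation is out of scope here.
func RequireBearer(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if _, ok := ParseBearer(r.Header.Get("Authorization")); !ok {
			http.Error(w, "missing bearer token", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	tok, ok := ParseBearer("Bearer abc.def.ghi")
	fmt.Println(tok, ok) // abc.def.ghi true

	// A request without the header is stopped before reaching the handler.
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/api/users/123", nil)
	RequireBearer(http.NotFoundHandler()).ServeHTTP(rec, req)
	fmt.Println(rec.Code) // 401
}
```

Because `RequireBearer` takes and returns an `http.Handler`, it can wrap a `gorilla/mux` router the same way it wraps the placeholder handler above.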

### 1.4 Implementation Requirements

- Use `golang.org/x/crypto/sha3` for all hashing
- Store token SHA3-512 hash in BadgerDB with TTL
- Implement `CheckPermission(userUUID, resourceKey, operation) bool` function
- Include user/group data in existing Merkle tree replication
- Create migration script for existing data (add default metadata)
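
One possible shape for `CheckPermission` is sketched below. It is an illustration under several assumptions: the `Store` type with in-memory maps is a hypothetical stand-in for the BadgerDB lookups, the mask layout is owner in the high bits with C-D-W-R ordering inside each class, unknown resources are denied, and only the first matching class (owner, then group, then others) is consulted, as in POSIX.

```go
package main

import "fmt"

// Operation bits; the C-D-W-R bit order is an assumption.
const (
	OpRead = 1 << iota
	OpWrite
	OpDelete
	OpCreate
)

// Metadata mirrors the resource metadata schema from section 1.2.
type Metadata struct {
	OwnerUUID   string
	GroupUUID   string
	Permissions int // 12 bits: owner<<8 | group<<4 | others (assumed layout)
}

// Store is a hypothetical in-memory stand-in for the BadgerDB lookups.
type Store struct {
	Meta       map[string]Metadata        // resource key -> metadata
	UserGroups map[string]map[string]bool // user UUID -> set of group UUIDs
}

// CheckPermission picks the first class that applies to the user
// (owner, then group, then others) and tests the operation bit in it.
func (s *Store) CheckPermission(userUUID, resourceKey string, operation int) bool {
	md, ok := s.Meta[resourceKey]
	if !ok {
		return false // unknown resource: deny by default (assumption)
	}
	var class int
	switch {
	case md.OwnerUUID == userUUID:
		class = md.Permissions >> 8
	case s.UserGroups[userUUID][md.GroupUUID]:
		class = md.Permissions >> 4
	default:
		class = md.Permissions
	}
	return class&0xF&operation != 0
}

func main() {
	// Owner: all operations, group: read+write, others: read only.
	mask := 0xF<<8 | (OpRead|OpWrite)<<4 | OpRead
	s := &Store{
		Meta: map[string]Metadata{
			"data:report": {OwnerUUID: "alice", GroupUUID: "staff", Permissions: mask},
		},
		UserGroups: map[string]map[string]bool{"bob": {"staff": true}},
	}
	fmt.Println(s.CheckPermission("alice", "data:report", OpDelete)) // true
	fmt.Println(s.CheckPermission("bob", "data:report", OpWrite))    // true
	fmt.Println(s.CheckPermission("bob", "data:report", OpDelete))   // false
	fmt.Println(s.CheckPermission("carol", "data:report", OpWrite))  // false
}
```

The user names stand in for UUIDs to keep the example readable.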

## 2. Database Enhancements

### 2.1 ZSTD Compression

**Configuration**:
```yaml
database:
  compression_enabled: true
  compression_level: 3  # 1-19, balance performance/ratio
```

**Implementation**:
- Use `github.com/klauspost/compress/zstd`
- Compress all JSON values before BadgerDB storage
- Decompress on read operations
- Optional: Batch recompression of existing data on startup

### 2.2 TTL (Time-To-Live)

**Features**:
- Per-key TTL support via resource metadata
- Global default TTL configuration (optional)
- Automatic expiration via BadgerDB's native TTL
- TTL applied to main data and revision keys

**API Integration**:
```json
// In PUT/POST requests
{
  "data": {...},
  "ttl": "24h"  // Go duration format
}
```
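
Since the `ttl` field uses Go duration syntax, `time.ParseDuration` can validate it directly. The helper below is a hedged sketch: the name `ParseTTL` is hypothetical, and treating an empty string or `"0"` as "no expiration" and rejecting negative values are assumptions drawn from the `default_ttl: "0"` setting in the configuration example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ParseTTL converts the request's "ttl" string (Go duration format) into a
// time.Duration. An empty string or "0" yields 0, meaning "no expiration".
func ParseTTL(s string) (time.Duration, error) {
	if s == "" {
		return 0, nil
	}
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, fmt.Errorf("invalid ttl %q: %w", s, err)
	}
	if d < 0 {
		return 0, errors.New("ttl must not be negative")
	}
	return d, nil
}

func main() {
	for _, s := range []string{"24h", "90m", "0", "1 day"} {
		d, err := ParseTTL(s)
		fmt.Println(s, d, err)
	}
}
```

With BadgerDB, a non-zero result would typically be passed to `Entry.WithTTL` when writing both the main key and its revision keys.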

### 2.3 Revision History

**Storage Pattern**:
- Main data: `data:<key>`
- Revisions: `data:<key>:rev:1`, `data:<key>:rev:2`, `data:<key>:rev:3`
- Metadata: `data:<key>:metadata` includes `"revisions": [1,2,3]`

**Rotation Logic**:
- On write: discard the old rev:3, shift rev:2→rev:3 and rev:1→rev:2, then store the new value as rev:1
- Store up to 3 revisions per key
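
The rotation has to shift the oldest slot first so that no revision is overwritten before it has been moved. The sketch below illustrates that order with a plain map standing in for BadgerDB; the function names are hypothetical, and a real implementation would presumably perform all of these writes inside a single transaction.

```go
package main

import "fmt"

const maxRevisions = 3

// revKey builds the revision key "data:<key>:rev:<n>".
func revKey(key string, n int) string {
	return fmt.Sprintf("data:%s:rev:%d", key, n)
}

// Rotate drops the oldest revision, shifts the remaining ones up by one
// slot (highest slot first), and stores value as rev:1 and as main data.
// The map is a stand-in for BadgerDB.
func Rotate(db map[string][]byte, key string, value []byte) {
	delete(db, revKey(key, maxRevisions)) // old rev:3 falls off
	for n := maxRevisions - 1; n >= 1; n-- {
		if v, ok := db[revKey(key, n)]; ok {
			db[revKey(key, n+1)] = v
		}
	}
	db[revKey(key, 1)] = value
	db["data:"+key] = value
}

func main() {
	db := map[string][]byte{}
	for _, v := range []string{"v1", "v2", "v3", "v4"} {
		Rotate(db, "config", []byte(v))
	}
	for n := 1; n <= maxRevisions; n++ {
		fmt.Printf("%s = %s\n", revKey("config", n), db[revKey("config", n)])
	}
	// data:config:rev:1 = v4
	// data:config:rev:2 = v3
	// data:config:rev:3 = v2
}
```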

**API Endpoints**:
```
GET /api/data/{key}/history
Returns: {"revisions": [{"number": 1, "timestamp": "..."}]}

GET /api/data/{key}/history/{revision}
Returns: StoredValue for specific revision
```

### 2.4 Backup System

**Configuration**:
```yaml
backups:
  enabled: true
  schedule: "0 0 * * *"  # Daily at midnight
  path: "/backups"
  retention: 7           # days
```

**Implementation**:
- Use `github.com/robfig/cron/v3` for scheduling
- Create ZSTD-compressed BadgerDB snapshots
- Filename format: `kvs-backup-YYYY-MM-DD.zstd`
- Automatic cleanup of old backups
- Status API: `GET /api/backup/status`
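
The filename format and the retention rule can be expressed with the standard `time` package alone. This is a sketch under assumptions: the helper names are hypothetical, a backup counts as expired once its embedded date lies strictly before `now` minus the retention period, and files that do not match the naming scheme are never reported as expired.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

const (
	backupPrefix = "kvs-backup-"
	backupSuffix = ".zstd"
	backupLayout = "2006-01-02" // Go reference layout for YYYY-MM-DD
)

// parseDay parses a YYYY-MM-DD date as midnight UTC.
func parseDay(s string) (time.Time, error) {
	return time.Parse(backupLayout, s)
}

// BackupName renders the spec's filename format for a given day.
func BackupName(t time.Time) string {
	return backupPrefix + t.Format(backupLayout) + backupSuffix
}

// Expired reports whether the date embedded in name lies before
// now minus retentionDays. Foreign files are never reported as expired.
func Expired(name string, now time.Time, retentionDays int) bool {
	if !strings.HasPrefix(name, backupPrefix) || !strings.HasSuffix(name, backupSuffix) {
		return false
	}
	day := strings.TrimSuffix(strings.TrimPrefix(name, backupPrefix), backupSuffix)
	t, err := parseDay(day)
	if err != nil {
		return false
	}
	return t.Before(now.AddDate(0, 0, -retentionDays))
}

func main() {
	now, _ := parseDay("2025-09-11")
	fmt.Println(BackupName(now)) // kvs-backup-2025-09-11.zstd
	for _, f := range []string{"kvs-backup-2025-09-03.zstd", "kvs-backup-2025-09-04.zstd", "notes.txt"} {
		fmt.Println(f, Expired(f, now, 7))
	}
}
```

A cleanup job would list the backup directory, call `Expired` for each entry, and delete the matches after a new snapshot has been written successfully.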

### 2.5 JSON Size Limits

**Configuration**:
```yaml
database:
  max_json_size: 1048576  # 1MB default
```

**Implementation**:
- Check size before compression/storage
- Return HTTP 413 if exceeded
- Apply to main data and revisions
- Log oversized attempts
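
The size check itself is a length comparison on the raw JSON before it is compressed. The sketch below shows one way to separate the check from the HTTP mapping; `ValidateSize`, `StatusFor`, and `ErrTooLarge` are hypothetical names, and treating a limit of zero or less as "disabled" is an assumption that the spec does not make.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrTooLarge signals that a value exceeds max_json_size.
var ErrTooLarge = errors.New("json value exceeds max_json_size")

// ValidateSize checks the raw (uncompressed) JSON length against the limit.
// A limit of 0 or less disables the check (an assumption, not in the spec).
func ValidateSize(raw []byte, maxJSONSize int) error {
	if maxJSONSize > 0 && len(raw) > maxJSONSize {
		return ErrTooLarge
	}
	return nil
}

// StatusFor maps a validation result to an HTTP status code:
// 413 for oversized values, 200 otherwise.
func StatusFor(err error) int {
	if errors.Is(err, ErrTooLarge) {
		return http.StatusRequestEntityTooLarge
	}
	return http.StatusOK
}

func main() {
	const limit = 1048576 // 1MB default from the configuration
	fmt.Println(StatusFor(ValidateSize(make([]byte, limit), limit)))   // 200
	fmt.Println(StatusFor(ValidateSize(make([]byte, limit+1), limit))) // 413
}
```

In a handler, wrapping the request body with `http.MaxBytesReader` before reading it is a complementary option, since it stops oversized uploads from being buffered in full.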

## 3. Security Features

### 3.1 Rate Limiting

**Configuration**:
```yaml
rate_limit:
  requests: 100
  window: "1m"
```

**Implementation**:
- Per-user rate limiting using BadgerDB counters
- Key pattern: `ratelimit:<user_uuid>:<window_start>`
- Return HTTP 429 when limit exceeded
- Counters have TTL equal to window duration
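
A counter keyed by window start amounts to a fixed-window limiter, which is what the sketch below implements. It is an illustration only: the `Limiter` type is hypothetical, the map replaces the BadgerDB counters (so no TTL cleanup happens here), the window start is encoded as a Unix timestamp by assumption, and the code is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"time"
)

// Limiter is a fixed-window counter keyed the way the spec describes.
// The map is a hypothetical stand-in for BadgerDB counters with TTL.
type Limiter struct {
	requests  int
	windowSec int64
	counts    map[string]int
}

// NewLimiter takes the configured request budget and a window in Go
// duration syntax, such as "1m".
func NewLimiter(requests int, window string) (*Limiter, error) {
	d, err := time.ParseDuration(window)
	if err != nil {
		return nil, err
	}
	if d < time.Second {
		return nil, fmt.Errorf("window %q is shorter than one second", window)
	}
	return &Limiter{requests: requests, windowSec: int64(d / time.Second), counts: map[string]int{}}, nil
}

// Key builds "ratelimit:<user_uuid>:<window_start>", with the window start
// as a Unix timestamp (the timestamp encoding is an assumption).
func (l *Limiter) Key(userUUID string, unixNow int64) string {
	start := unixNow - unixNow%l.windowSec
	return fmt.Sprintf("ratelimit:%s:%d", userUUID, start)
}

// Allow increments the counter for the current window and reports whether
// the request is within the limit. Callers would answer HTTP 429 on false.
func (l *Limiter) Allow(userUUID string, unixNow int64) bool {
	k := l.Key(userUUID, unixNow)
	l.counts[k]++
	return l.counts[k] <= l.requests
}

func main() {
	l, err := NewLimiter(2, "1m")
	if err != nil {
		panic(err)
	}
	fmt.Println(l.Key("u1", 130)) // ratelimit:u1:120
	for _, ts := range []int64{120, 130, 150, 180} {
		fmt.Println(ts, l.Allow("u1", ts))
	}
}
```

With a budget of two requests per minute, the third call in the window starting at 120 is rejected and the call at 180 succeeds because it opens a new window. A sliding-window variant would need to weigh the previous window's count as well.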

### 3.2 Tamper-Evident Logs

**Log Entry Schema**:
```json
{
  "timestamp": "2025-09-11T17:29:00Z",
  "action": "data_write",       // Configurable actions
  "user_uuid": "string",
  "resource": "string",
  "signature": "sha3-512 hash"  // Hash of all fields
}
```

**Storage**:
- Key: `log:<timestamp>:<uuid>`
- Compressed with ZSTD
- Hourly Merkle tree roots: `log:merkle:<timestamp>`
- Include in cluster replication

**Configurable Actions**:
```yaml
tamper_logs:
  actions: ["data_write", "user_create", "auth_failure"]
```

## 4. Implementation Phases

### Phase 2.1: Core Authentication
1. Implement user/group storage schema
2. Add SHA3-512 hashing utilities
3. Create basic CRUD APIs for users/groups
4. Implement JWT token generation/validation
5. Add authorization middleware

### Phase 2.2: Data Features
1. Add ZSTD compression to BadgerDB operations
2. Implement TTL support in resource metadata
3. Build revision history system
4. Add JSON size validation

### Phase 2.3: Security & Operations
1. Implement rate limiting middleware
2. Add tamper-evident logging system
3. Build backup scheduling system
4. Create migration scripts for existing data

### Phase 2.4: Integration & Testing
1. Integrate auth with existing replication
2. End-to-end testing of all features
3. Performance benchmarking
4. Documentation updates

## 5. Configuration Example

```yaml
node_id: "node1"
bind_address: "127.0.0.1"
port: 8080
data_dir: "./data"

database:
  compression_enabled: true
  compression_level: 3
  max_json_size: 1048576
  default_ttl: "0"  # No default TTL

backups:
  enabled: true
  schedule: "0 0 * * *"
  path: "/backups"
  retention: 7

rate_limit:
  requests: 100
  window: "1m"

tamper_logs:
  actions: ["data_write", "user_create", "auth_failure"]
```

## 6. Migration Strategy

1. **Backward Compatibility**: All existing APIs remain functional
2. **Optional Features**: New features can be disabled via configuration

## 7. Dependencies

**New Libraries**:
- `golang.org/x/crypto/sha3` - SHA3-512 hashing
- `github.com/klauspost/compress/zstd` - Compression
- `github.com/robfig/cron/v3` - Backup scheduling
- `github.com/golang-jwt/jwt/v4` - JWT tokens (recommended)

**Existing Libraries** (no changes):
- `github.com/dgraph-io/badger/v4`
- `github.com/google/uuid`
- `github.com/gorilla/mux`
- `github.com/sirupsen/logrus`