1 Commits

Author SHA1 Message Date
Kalzu Rekku
0b7af92761 First proto type for issue #3, initial root account. 2025-09-20 21:32:30 +03:00
39 changed files with 457 additions and 2511 deletions

2
.gitignore vendored
View File

@@ -1,8 +1,6 @@
.claude/ .claude/
.kvs/
data/ data/
data*/ data*/
integration_test/
*.yaml *.yaml
!config.yaml !config.yaml
kvs kvs

211
CLAUDE.md
View File

@@ -1,211 +0,0 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Commands for Development
### Build and Test Commands
```bash
# Build the binary
go build -o kvs .
# Run with default config (auto-generates config.yaml)
./kvs start config.yaml
# Run with custom config
./kvs start /path/to/config.yaml
# Check running instances
./kvs status
# Stop instance
./kvs stop config
# Run comprehensive integration tests
./integration_test.sh
# Create test conflict data for debugging
go run test_conflict.go data1 data2
# Build and test in one go
go build -o kvs . && ./integration_test.sh
```
### Process Management Commands
```bash
# Start as background daemon
./kvs start <config.yaml> # .yaml extension optional
# Stop daemon
./kvs stop <config> # Graceful SIGTERM shutdown
# Restart daemon
./kvs restart <config> # Stop then start
# Show status
./kvs status # All instances
./kvs status <config> # Specific instance
# Run in foreground (for debugging)
./kvs <config.yaml> # Logs to stdout, blocks terminal
# View daemon logs
tail -f ~/.kvs/logs/kvs_<config>.yaml.log
# Global state directories
~/.kvs/pids/ # PID files (works from any directory)
~/.kvs/logs/ # Daemon log files
```
### Development Workflow
```bash
# Format and check code
go fmt ./...
go vet ./...
# Run dependencies management
go mod tidy
# Check build without artifacts
go build .
# Test specific cluster scenarios
./kvs start node1.yaml
./kvs start node2.yaml
# Wait for cluster formation
sleep 5
# Test data operations
curl -X PUT http://localhost:8081/kv/test/data -H "Content-Type: application/json" -d '{"test":"data"}'
curl http://localhost:8082/kv/test/data # Should replicate within ~30 seconds
# Check daemon status
./kvs status
# View logs
tail -f ~/.kvs/logs/kvs_node1.yaml.log
# Cleanup
./kvs stop node1
./kvs stop node2
```
## Architecture Overview
### High-Level Structure
KVS is a **distributed, eventually consistent key-value store** built around three core systems:
1. **Gossip Protocol** (`cluster/gossip.go`) - Decentralized membership management and failure detection
2. **Merkle Tree Sync** (`cluster/sync.go`, `cluster/merkle.go`) - Efficient data synchronization and conflict resolution
3. **Modular Server** (`server/`) - HTTP API with pluggable feature modules
### Key Architectural Patterns
#### Modular Package Design
- **`auth/`** - Complete JWT authentication system with POSIX-inspired permissions
- **`cluster/`** - Distributed systems logic (gossip, sync, merkle trees)
- **`daemon/`** - Process management (daemonization, PID files, lifecycle)
- **`storage/`** - BadgerDB abstraction with compression and revision history
- **`server/`** - HTTP handlers, routing, and lifecycle management
- **`features/`** - Utility functions for TTL, rate limiting, tamper logging, backup
- **`types/`** - Centralized type definitions for all components
- **`config/`** - Configuration loading with auto-generation
- **`utils/`** - Cryptographic hashing utilities
#### Core Data Model
```go
// Primary storage format
type StoredValue struct {
UUID string `json:"uuid"` // Unique version identifier
Timestamp int64 `json:"timestamp"` // Unix timestamp (milliseconds)
Data json.RawMessage `json:"data"` // Actual user JSON payload
}
```
#### Critical System Interactions
**Conflict Resolution Flow:**
1. Merkle trees detect divergent data between nodes (`cluster/merkle.go`)
2. Sync service fetches conflicting keys (`cluster/sync.go:fetchAndCompareData`)
3. Sophisticated conflict resolution logic in `resolveConflict()`:
- Same timestamp → Apply "oldest-node rule" (earliest `joined_timestamp` wins)
- Tie-breaker → UUID comparison for deterministic results
- Winner's data automatically replicated to losing nodes
**Authentication & Authorization:**
- JWT tokens with scoped permissions (`auth/jwt.go`)
- POSIX-inspired 12-bit permission system (`types/types.go:52-75`)
- Resource ownership metadata with TTL support (`types/ResourceMetadata`)
**Storage Strategy:**
- **Main keys**: Direct path mapping (`users/john/profile`)
- **Index keys**: `_ts:{timestamp}:{path}` for time-based queries
- **Compression**: Optional ZSTD compression (`storage/compression.go`)
- **Revisions**: Optional revision history (`storage/revision.go`)
### Configuration Architecture
The system uses feature toggles extensively (`types/Config:271-280`):
```yaml
auth_enabled: true # JWT authentication system
tamper_logging_enabled: true # Cryptographic audit trail
clustering_enabled: true # Gossip protocol and sync
rate_limiting_enabled: true # Per-client rate limiting
revision_history_enabled: true # Automatic versioning
# Anonymous access control (Issue #5 - when auth_enabled: true)
allow_anonymous_read: false # Allow unauthenticated read access to KV endpoints
allow_anonymous_write: false # Allow unauthenticated write access to KV endpoints
```
**Security Note**: DELETE operations always require authentication when `auth_enabled: true`, regardless of anonymous access settings.
### Testing Strategy
#### Integration Test Suite (`integration_test.sh`)
- **Build verification** - Ensures binary compiles correctly
- **Basic functionality** - Single-node CRUD operations
- **Cluster formation** - 2-node gossip protocol and data replication
- **Conflict resolution** - Automated conflict detection and resolution using `test_conflict.go`
- **Authentication middleware** - Comprehensive security testing (Issue #4):
- Admin endpoints properly reject unauthenticated requests
- Admin endpoints work with valid JWT tokens
- KV endpoints respect anonymous access configuration
- Automatic root account creation and token extraction
The test suite uses sophisticated retry logic and timing to handle the eventually consistent nature of the system.
#### Conflict Testing Utility (`test_conflict.go`)
Creates two BadgerDB instances with intentionally conflicting data (same path, same timestamp, different UUIDs) to test the conflict resolution algorithm.
### Development Notes
#### Key Constraints
- **Eventually Consistent**: All operations succeed locally first, then replicate
- **Local-First Truth**: Nodes operate independently and sync in background
- **No Transactions**: Each key operation is atomic and independent
- **Hierarchical Keys**: Support for path-like structures (`/home/room/closet/socks`)
#### Critical Timing Considerations
- **Gossip intervals**: 1-2 minutes for membership updates
- **Sync intervals**: 5 minutes for regular data sync, 2 minutes for catch-up
- **Conflict resolution**: Typically resolves within 10-30 seconds after detection
- **Bootstrap sync**: Up to 30 days of historical data for new nodes
#### Main Entry Point Flow
1. `main.go` parses command-line arguments for subcommands (`start`, `stop`, `status`, `restart`)
2. For daemon mode: `daemon.Daemonize()` spawns background process and manages PID files
3. For server mode: loads config (auto-generates default if missing)
4. `server.NewServer()` initializes all subsystems
5. Graceful shutdown handling with `SIGINT`/`SIGTERM`
6. All business logic delegated to modular packages
#### Daemon Architecture
- **PID Management**: Global PID files stored in `~/.kvs/pids/` for cross-directory access
- **Logging**: Daemon logs written to `~/.kvs/logs/{config-name}.log`
- **Process Lifecycle**: Spawns detached process via `exec.Command()` with `Setsid: true`
- **Config Normalization**: Supports both `node1` and `node1.yaml` formats
- **Stale PID Detection**: Checks process existence via `Signal(0)` before operations
This architecture enables easy feature addition, comprehensive testing, and reliable operation in distributed environments while maintaining simplicity for single-node deployments.

427
README.md
View File

@@ -6,14 +6,12 @@ A minimalistic, clustered key-value database system written in Go that prioritiz
- **Hierarchical Keys**: Support for structured paths (e.g., `/home/room/closet/socks`) - **Hierarchical Keys**: Support for structured paths (e.g., `/home/room/closet/socks`)
- **Eventual Consistency**: Local operations are fast, replication happens in background - **Eventual Consistency**: Local operations are fast, replication happens in background
- **Merkle Tree Sync**: Efficient data synchronization with cryptographic integrity - **Gossip Protocol**: Decentralized node discovery and failure detection
- **Sophisticated Conflict Resolution**: Oldest-node rule with UUID tie-breaking - **Sophisticated Conflict Resolution**: Majority vote with oldest-node tie-breaking
- **JWT Authentication**: Full authentication system with POSIX-inspired permissions
- **Local-First Truth**: All operations work locally first, sync globally later - **Local-First Truth**: All operations work locally first, sync globally later
- **Read-Only Mode**: Configurable mode for reducing write load - **Read-Only Mode**: Configurable mode for reducing write load
- **Modular Architecture**: Clean separation of concerns with feature toggles - **Gradual Bootstrapping**: New nodes integrate smoothly without overwhelming cluster
- **Comprehensive Features**: TTL support, rate limiting, tamper logging, automated backups - **Zero Dependencies**: Single binary with embedded BadgerDB storage
- **Zero External Dependencies**: Single binary with embedded BadgerDB storage
## 🏗️ Architecture ## 🏗️ Architecture
@@ -23,36 +21,24 @@ A minimalistic, clustered key-value database system written in Go that prioritiz
│ (Go Service) │ │ (Go Service) │ │ (Go Service) │ │ (Go Service) │ │ (Go Service) │ │ (Go Service) │
│ │ │ │ │ │ │ │ │ │ │ │
│ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │
│ │HTTP API+Auth│ │◄──►│ │HTTP API+Auth│ │◄──►│ │HTTP API+Auth│ │ │ │ HTTP Server │ │◄──►│ │ HTTP Server │ │◄──►│ │ HTTP Server │ │
│ │ (API) │ │ │ │ (API) │ │ │ │ (API) │ │
│ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │
│ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │
│ │ Gossip │ │◄──►│ │ Gossip │ │◄──►│ │ Gossip │ │ │ │ Gossip │ │◄──►│ │ Gossip │ │◄──►│ │ Gossip │ │
│ │ Protocol │ │ │ │ Protocol │ │ │ │ Protocol │ │ │ │ Protocol │ │ │ │ Protocol │ │ │ │ Protocol │ │
│ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │
│ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │
│ │Merkle Sync │ │◄──►│ │Merkle Sync │ │◄──►│ │Merkle Sync │ │ │ │ BadgerDB │ │ │ │ BadgerDB │ │ │ │ BadgerDB │ │
│ │& Conflict │ │ │ │& Conflict │ │ │ │& Conflict │ │ │ │ (Local KV) │ │ │ │ (Local KV) │ │ │ │ (Local KV) │ │
│ │ Resolution │ │ │ │ Resolution │ │ │ │ Resolution │ │
│ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │
│ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │
│ │Storage+ │ │ │ │Storage+ │ │ │ │Storage+ │ │
│ │Features │ │ │ │Features │ │ │ │Features │ │
│ │(BadgerDB) │ │ │ │(BadgerDB) │ │ │ │(BadgerDB) │ │
│ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │
└─────────────────┘ └─────────────────┘ └─────────────────┘ └─────────────────┘ └─────────────────┘ └─────────────────┘
External Clients (JWT Auth) External Clients
``` ```
### Modular Design Each node is fully autonomous and communicates with peers via HTTP REST API for both external client requests and internal cluster operations.
KVS features a clean modular architecture with dedicated packages:
- **`auth/`** - JWT authentication and POSIX-inspired permissions
- **`cluster/`** - Gossip protocol, Merkle tree sync, and conflict resolution
- **`storage/`** - BadgerDB abstraction with compression and revisions
- **`server/`** - HTTP API, routing, and lifecycle management
- **`features/`** - TTL, rate limiting, tamper logging, backup utilities
- **`config/`** - Configuration management with auto-generation
## 📦 Installation ## 📦 Installation
@@ -69,67 +55,11 @@ go build -o kvs .
### Quick Test ### Quick Test
```bash ```bash
# Start standalone node (uses config.yaml if it exists, or creates it) # Start standalone node
./kvs start config.yaml ./kvs
# Test the API # Test the API
curl http://localhost:8080/health curl http://localhost:8080/health
# Check status
./kvs status
# Stop when done
./kvs stop config
```
## 🎮 Process Management
KVS includes systemd-style daemon commands for easy process management:
```bash
# Start as background daemon
./kvs start config.yaml # or just: ./kvs start config
./kvs start node1.yaml # Start with custom config
# Check status
./kvs status # Show all running instances
./kvs status node1 # Show specific instance
# Stop daemon
./kvs stop node1 # Graceful shutdown
# Restart daemon
./kvs restart node1 # Stop and start
# Run in foreground (traditional)
./kvs node1.yaml # Logs to stdout
```
### Daemon Features
- **Global PID tracking**: PID files stored in `~/.kvs/pids/` (works from any directory)
- **Automatic logging**: Logs written to `~/.kvs/logs/{config-name}.log`
- **Flexible naming**: Config extension optional (`node1` or `node1.yaml` both work)
- **Graceful shutdown**: SIGTERM sent for clean shutdown
- **Stale PID cleanup**: Automatically detects and cleans dead processes
- **Multi-instance**: Run multiple KVS instances on same machine
### Example Workflow
```bash
# Start 3-node cluster as daemons
./kvs start node1.yaml
./kvs start node2.yaml
./kvs start node3.yaml
# Check cluster status
./kvs status
# View logs
tail -f ~/.kvs/logs/kvs_node1.yaml.log
# Stop entire cluster
./kvs stop node1
./kvs stop node2
./kvs stop node3
``` ```
## ⚙️ Configuration ## ⚙️ Configuration
@@ -144,8 +74,6 @@ data_dir: "./data" # Directory for BadgerDB storage
seed_nodes: [] # List of seed nodes for cluster joining seed_nodes: [] # List of seed nodes for cluster joining
read_only: false # Enable read-only mode read_only: false # Enable read-only mode
log_level: "info" # Logging level (debug, info, warn, error) log_level: "info" # Logging level (debug, info, warn, error)
# Cluster timing configuration
gossip_interval_min: 60 # Min gossip interval (seconds) gossip_interval_min: 60 # Min gossip interval (seconds)
gossip_interval_max: 120 # Max gossip interval (seconds) gossip_interval_max: 120 # Max gossip interval (seconds)
sync_interval: 300 # Regular sync interval (seconds) sync_interval: 300 # Regular sync interval (seconds)
@@ -153,31 +81,6 @@ catchup_interval: 120 # Catch-up sync interval (seconds)
bootstrap_max_age_hours: 720 # Max age for bootstrap sync (hours) bootstrap_max_age_hours: 720 # Max age for bootstrap sync (hours)
throttle_delay_ms: 100 # Delay between sync requests (ms) throttle_delay_ms: 100 # Delay between sync requests (ms)
fetch_delay_ms: 50 # Delay between data fetches (ms) fetch_delay_ms: 50 # Delay between data fetches (ms)
# Feature configuration
compression_enabled: true # Enable ZSTD compression
compression_level: 3 # Compression level (1-19)
default_ttl: "0" # Default TTL ("0" = no expiry)
max_json_size: 1048576 # Max JSON payload size (1MB)
rate_limit_requests: 100 # Requests per window
rate_limit_window: "1m" # Rate limit window
# Feature toggles
auth_enabled: true # JWT authentication system
tamper_logging_enabled: true # Cryptographic audit trail
clustering_enabled: true # Gossip protocol and sync
rate_limiting_enabled: true # Rate limiting
revision_history_enabled: true # Automatic versioning
# Anonymous access control (when auth_enabled: true)
allow_anonymous_read: false # Allow unauthenticated read access to KV endpoints
allow_anonymous_write: false # Allow unauthenticated write access to KV endpoints
# Backup configuration
backup_enabled: true # Automated backups
backup_schedule: "0 0 * * *" # Daily at midnight (cron format)
backup_path: "./backups" # Backup directory
backup_retention: 7 # Days to keep backups
``` ```
### Custom Configuration ### Custom Configuration
@@ -194,20 +97,11 @@ backup_retention: 7 # Days to keep backups
```bash ```bash
PUT /kv/{path} PUT /kv/{path}
Content-Type: application/json Content-Type: application/json
Authorization: Bearer <jwt-token> # Required if auth_enabled && !allow_anonymous_write
# Basic storage
curl -X PUT http://localhost:8080/kv/users/john/profile \ curl -X PUT http://localhost:8080/kv/users/john/profile \
-H "Content-Type: application/json" \ -H "Content-Type: application/json" \
-H "Authorization: Bearer eyJ..." \
-d '{"name":"John Doe","age":30,"email":"john@example.com"}' -d '{"name":"John Doe","age":30,"email":"john@example.com"}'
# Storage with TTL
curl -X PUT http://localhost:8080/kv/cache/session/abc123 \
-H "Content-Type: application/json" \
-H "Authorization: Bearer eyJ..." \
-d '{"data":{"user_id":"john"}, "ttl":"1h"}'
# Response # Response
{ {
"uuid": "a1b2c3d4-e5f6-7890-1234-567890abcdef", "uuid": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
@@ -218,62 +112,25 @@ curl -X PUT http://localhost:8080/kv/cache/session/abc123 \
#### Retrieve Data #### Retrieve Data
```bash ```bash
GET /kv/{path} GET /kv/{path}
Authorization: Bearer <jwt-token> # Required if auth_enabled && !allow_anonymous_read
curl -H "Authorization: Bearer eyJ..." http://localhost:8080/kv/users/john/profile curl http://localhost:8080/kv/users/john/profile
# Response (full StoredValue format) # Response
{ {
"uuid": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
"timestamp": 1672531200000,
"data": {
"name": "John Doe", "name": "John Doe",
"age": 30, "age": 30,
"email": "john@example.com" "email": "john@example.com"
} }
}
``` ```
#### Delete Data #### Delete Data
```bash ```bash
DELETE /kv/{path} DELETE /kv/{path}
Authorization: Bearer <jwt-token> # Always required when auth_enabled (no anonymous delete)
curl -X DELETE -H "Authorization: Bearer eyJ..." http://localhost:8080/kv/users/john/profile curl -X DELETE http://localhost:8080/kv/users/john/profile
# Returns: 204 No Content # Returns: 204 No Content
``` ```
### Authentication Operations (`/auth/`)
#### Create User
```bash
POST /auth/users
Content-Type: application/json
curl -X POST http://localhost:8080/auth/users \
-H "Content-Type: application/json" \
-d '{"nickname":"john"}'
# Response
{"uuid": "user-abc123"}
```
#### Create API Token
```bash
POST /auth/tokens
Content-Type: application/json
curl -X POST http://localhost:8080/auth/tokens \
-H "Content-Type: application/json" \
-d '{"user_uuid":"user-abc123", "scopes":["read","write"]}'
# Response
{
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"expires_at": 1672617600000
}
```
### Cluster Operations (`/members/`) ### Cluster Operations (`/members/`)
#### View Cluster Members #### View Cluster Members
@@ -292,6 +149,12 @@ curl http://localhost:8080/members/
] ]
``` ```
#### Join Cluster (Internal)
```bash
POST /members/join
# Used internally during bootstrap process
```
#### Health Check #### Health Check
```bash ```bash
GET /health GET /health
@@ -306,20 +169,6 @@ curl http://localhost:8080/health
} }
``` ```
### Merkle Tree Operations (`/sync/`)
#### Get Merkle Root
```bash
GET /sync/merkle/root
# Used internally for data synchronization
```
#### Range Queries
```bash
GET /kv/_range?start_key=users/&end_key=users/z&limit=100
# Fetch key ranges for synchronization
```
## 🏘️ Cluster Setup ## 🏘️ Cluster Setup
### Single Node (Standalone) ### Single Node (Standalone)
@@ -338,8 +187,6 @@ seed_nodes: [] # Empty = standalone mode
node_id: "node1" node_id: "node1"
port: 8081 port: 8081
seed_nodes: [] # First node, no seeds needed seed_nodes: [] # First node, no seeds needed
auth_enabled: true
clustering_enabled: true
``` ```
#### Node 2 (Joins via Node 1) #### Node 2 (Joins via Node 1)
@@ -348,8 +195,6 @@ clustering_enabled: true
node_id: "node2" node_id: "node2"
port: 8082 port: 8082
seed_nodes: ["127.0.0.1:8081"] # Points to node1 seed_nodes: ["127.0.0.1:8081"] # Points to node1
auth_enabled: true
clustering_enabled: true
``` ```
#### Node 3 (Joins via Node 1 & 2) #### Node 3 (Joins via Node 1 & 2)
@@ -358,29 +203,18 @@ clustering_enabled: true
node_id: "node3" node_id: "node3"
port: 8083 port: 8083
seed_nodes: ["127.0.0.1:8081", "127.0.0.1:8082"] # Multiple seeds for reliability seed_nodes: ["127.0.0.1:8081", "127.0.0.1:8082"] # Multiple seeds for reliability
auth_enabled: true
clustering_enabled: true
``` ```
#### Start the Cluster #### Start the Cluster
```bash ```bash
# Start as daemons # Terminal 1
./kvs start node1.yaml ./kvs node1.yaml
sleep 2
./kvs start node2.yaml
sleep 2
./kvs start node3.yaml
# Verify cluster formation # Terminal 2 (wait a few seconds)
curl http://localhost:8081/members/ # Should show all 3 nodes ./kvs node2.yaml
# Check daemon status # Terminal 3 (wait a few seconds)
./kvs status ./kvs node3.yaml
# Stop cluster when done
./kvs stop node1
./kvs stop node2
./kvs stop node3
``` ```
## 🔄 How It Works ## 🔄 How It Works
@@ -390,30 +224,20 @@ curl http://localhost:8081/members/ # Should show all 3 nodes
- Failed nodes are detected via timeout (5 minutes) and removed (10 minutes) - Failed nodes are detected via timeout (5 minutes) and removed (10 minutes)
- New members are automatically discovered and added to local member lists - New members are automatically discovered and added to local member lists
### Merkle Tree Synchronization ### Data Synchronization
- **Merkle Trees**: Each node builds cryptographic trees of their data for efficient comparison - **Regular Sync**: Every 5 minutes, nodes compare their latest 15 data items with a random peer
- **Regular Sync**: Every 5 minutes, nodes compare Merkle roots and sync divergent branches
- **Catch-up Sync**: Every 2 minutes when nodes detect they're significantly behind - **Catch-up Sync**: Every 2 minutes when nodes detect they're significantly behind
- **Bootstrap Sync**: New nodes gradually fetch historical data up to 30 days old - **Bootstrap Sync**: New nodes gradually fetch historical data up to 30 days old
- **Efficient Detection**: Only synchronizes actual differences, not entire datasets
### Sophisticated Conflict Resolution ### Conflict Resolution
When two nodes have different data for the same key with identical timestamps: When two nodes have different data for the same key with identical timestamps:
1. **Detection**: Merkle tree comparison identifies conflicting keys 1. **Majority Vote**: Query all healthy cluster members for their version
2. **Oldest-Node Rule**: The version from the node with earliest `joined_timestamp` wins 2. **Tie-Breaker**: If votes are tied, the version from the oldest node (earliest `joined_timestamp`) wins
3. **UUID Tie-Breaker**: If join times are identical, lexicographically smaller UUID wins 3. **Automatic Resolution**: Losing nodes automatically fetch and store the winning version
4. **Automatic Resolution**: Losing nodes automatically fetch and store the winning version
5. **Consistency**: All nodes converge to the same data within seconds
### Authentication & Authorization
- **JWT Tokens**: Secure API access with scoped permissions
- **POSIX-Inspired ACLs**: 12-bit permission system (owner/group/others with create/delete/write/read)
- **Resource Metadata**: Each stored item has ownership and permission information
- **Feature Toggle**: Can be completely disabled for simpler deployments
### Operational Modes ### Operational Modes
- **Normal**: Full read/write capabilities with all features - **Normal**: Full read/write capabilities
- **Read-Only**: Rejects external writes but accepts internal replication - **Read-Only**: Rejects external writes but accepts internal replication
- **Syncing**: Temporary mode during bootstrap, rejects external writes - **Syncing**: Temporary mode during bootstrap, rejects external writes
@@ -421,158 +245,57 @@ When two nodes have different data for the same key with identical timestamps:
### Running Tests ### Running Tests
```bash ```bash
# Build and run comprehensive integration tests # Basic functionality test
go build -o kvs . go build -o kvs .
./integration_test.sh ./kvs &
# Manual basic functionality test
./kvs start config.yaml
sleep 2
curl http://localhost:8080/health curl http://localhost:8080/health
./kvs stop config pkill kvs
# Manual cluster test (requires creating configs) # Cluster test with provided configs
echo 'node_id: "test1" ./kvs node1.yaml &
port: 8081 ./kvs node2.yaml &
seed_nodes: [] ./kvs node3.yaml &
auth_enabled: false' > test1.yaml
echo 'node_id: "test2" # Test data replication
port: 8082
seed_nodes: ["127.0.0.1:8081"]
auth_enabled: false' > test2.yaml
./kvs start test1.yaml
sleep 2
./kvs start test2.yaml
# Test data replication (wait for cluster formation)
sleep 10
curl -X PUT http://localhost:8081/kv/test/data \ curl -X PUT http://localhost:8081/kv/test/data \
-H "Content-Type: application/json" \ -H "Content-Type: application/json" \
-d '{"message":"hello world"}' -d '{"message":"hello world"}'
# Wait for Merkle sync, then check replication # Wait 30+ seconds for sync, then check other nodes
sleep 30
curl http://localhost:8082/kv/test/data curl http://localhost:8082/kv/test/data
curl http://localhost:8083/kv/test/data
# Cleanup # Cleanup
./kvs stop test1 pkill kvs
./kvs stop test2
rm test1.yaml test2.yaml
``` ```
### Conflict Resolution Testing ### Conflict Resolution Testing
```bash ```bash
# Create conflicting data scenario using utility # Create conflicting data scenario
go run test_conflict.go /tmp/conflict1 /tmp/conflict2 rm -rf data1 data2
mkdir data1 data2
# Create configs for conflict test go run test_conflict.go data1 data2
echo 'node_id: "conflict1"
port: 9111
data_dir: "/tmp/conflict1"
seed_nodes: []
auth_enabled: false
log_level: "debug"' > conflict1.yaml
echo 'node_id: "conflict2"
port: 9112
data_dir: "/tmp/conflict2"
seed_nodes: ["127.0.0.1:9111"]
auth_enabled: false
log_level: "debug"' > conflict2.yaml
# Start nodes with conflicting data # Start nodes with conflicting data
./kvs start conflict1.yaml ./kvs node1.yaml &
sleep 2 ./kvs node2.yaml &
./kvs start conflict2.yaml
# Watch logs for conflict resolution # Watch logs for conflict resolution
tail -f ~/.kvs/logs/kvs_conflict1.yaml.log ~/.kvs/logs/kvs_conflict2.yaml.log & # Both nodes will converge to same data within ~30 seconds
# Both nodes will converge within ~10-30 seconds
# Check final state
sleep 30
curl http://localhost:9111/kv/test/conflict/data
curl http://localhost:9112/kv/test/conflict/data
# Cleanup
./kvs stop conflict1
./kvs stop conflict2
rm conflict1.yaml conflict2.yaml
```
### Code Quality
```bash
# Format and lint
go fmt ./...
go vet ./...
# Dependency management
go mod tidy
go mod verify
# Build verification
go build .
``` ```
### Project Structure ### Project Structure
``` ```
kvs/ kvs/
├── main.go # Main application entry point ├── main.go # Main application with all functionality
├── config.yaml # Default configuration (auto-generated) ├── config.yaml # Default configuration (auto-generated)
├── integration_test.sh # Comprehensive test suite
├── test_conflict.go # Conflict resolution testing utility ├── test_conflict.go # Conflict resolution testing utility
├── CLAUDE.md # Development guidance for Claude Code ├── node1.yaml # Example cluster node config
├── node2.yaml # Example cluster node config
├── node3.yaml # Example cluster node config
├── go.mod # Go module dependencies ├── go.mod # Go module dependencies
├── go.sum # Go module checksums ├── go.sum # Go module checksums
── README.md # This documentation ── README.md # This documentation
├── auth/ # Authentication & authorization
│ ├── auth.go # Main auth logic
│ ├── jwt.go # JWT token management
│ ├── middleware.go # HTTP middleware
│ ├── permissions.go # POSIX-inspired ACL system
│ └── storage.go # Auth data storage
├── cluster/ # Distributed systems components
│ ├── bootstrap.go # New node integration
│ ├── gossip.go # Membership protocol
│ ├── merkle.go # Merkle tree implementation
│ └── sync.go # Data synchronization & conflict resolution
├── config/ # Configuration management
│ └── config.go # Config loading & defaults
├── daemon/ # Process management
│ ├── daemonize.go # Background process spawning
│ └── pid.go # PID file management
├── features/ # Utility features
│ ├── auth.go # Auth utilities
│ ├── backup.go # Backup system
│ ├── features.go # Feature toggles
│ ├── ratelimit.go # Rate limiting
│ ├── revision.go # Revision history
│ ├── tamperlog.go # Tamper-evident logging
│ └── validation.go # TTL parsing
├── server/ # HTTP server & API
│ ├── handlers.go # Request handlers
│ ├── lifecycle.go # Server lifecycle
│ ├── routes.go # Route definitions
│ └── server.go # Server setup
├── storage/ # Data storage abstraction
│ ├── compression.go # ZSTD compression
│ ├── revision.go # Revision history
│ └── storage.go # BadgerDB interface
├── types/ # Shared type definitions
│ └── types.go # All data structures
└── utils/ # Utilities
└── hash.go # Cryptographic hashing
``` ```
### Key Data Structures ### Key Data Structures
@@ -595,7 +318,6 @@ type StoredValue struct {
| Setting | Description | Default | Notes | | Setting | Description | Default | Notes |
|---------|-------------|---------|-------| |---------|-------------|---------|-------|
| **Core Settings** |
| `node_id` | Unique identifier for this node | hostname | Must be unique across cluster | | `node_id` | Unique identifier for this node | hostname | Must be unique across cluster |
| `bind_address` | IP address to bind HTTP server | "127.0.0.1" | Use 0.0.0.0 for external access | | `bind_address` | IP address to bind HTTP server | "127.0.0.1" | Use 0.0.0.0 for external access |
| `port` | HTTP port for API and cluster communication | 8080 | Must be accessible to peers | | `port` | HTTP port for API and cluster communication | 8080 | Must be accessible to peers |
@@ -603,20 +325,8 @@ type StoredValue struct {
| `seed_nodes` | List of initial cluster nodes | [] | Empty = standalone mode | | `seed_nodes` | List of initial cluster nodes | [] | Empty = standalone mode |
| `read_only` | Enable read-only mode | false | Accepts replication, rejects client writes | | `read_only` | Enable read-only mode | false | Accepts replication, rejects client writes |
| `log_level` | Logging verbosity | "info" | debug/info/warn/error | | `log_level` | Logging verbosity | "info" | debug/info/warn/error |
| **Cluster Timing** |
| `gossip_interval_min/max` | Gossip frequency range | 60-120 sec | Randomized interval | | `gossip_interval_min/max` | Gossip frequency range | 60-120 sec | Randomized interval |
| `sync_interval` | Regular Merkle sync frequency | 300 sec | How often to sync with peers | | `sync_interval` | Regular sync frequency | 300 sec | How often to sync with peers |
| `catchup_interval` | Catch-up sync frequency | 120 sec | Faster sync when behind |
| `bootstrap_max_age_hours` | Max historical data to sync | 720 hours | 30 days default |
| **Feature Toggles** |
| `auth_enabled` | JWT authentication system | true | Complete auth/authz system |
| `allow_anonymous_read` | Allow unauthenticated read access | false | When auth_enabled, controls KV GET endpoints |
| `allow_anonymous_write` | Allow unauthenticated write access | false | When auth_enabled, controls KV PUT endpoints |
| `clustering_enabled` | Gossip protocol and sync | true | Distributed mode |
| `compression_enabled` | ZSTD compression | true | Reduces storage size |
| `rate_limiting_enabled` | Rate limiting | true | Per-client limits |
| `tamper_logging_enabled` | Cryptographic audit trail | true | Security logging |
| `revision_history_enabled` | Automatic versioning | true | Data history tracking |
| `catchup_interval` | Catch-up sync frequency | 120 sec | Faster sync when behind | | `catchup_interval` | Catch-up sync frequency | 120 sec | Faster sync when behind |
| `bootstrap_max_age_hours` | Max historical data to sync | 720 hours | 30 days default | | `bootstrap_max_age_hours` | Max historical data to sync | 720 hours | 30 days default |
| `throttle_delay_ms` | Delay between sync requests | 100 ms | Prevents overwhelming peers | | `throttle_delay_ms` | Delay between sync requests | 100 ms | Prevents overwhelming peers |
@@ -636,27 +346,24 @@ type StoredValue struct {
- IPv4 private networks supported (IPv6 not tested) - IPv4 private networks supported (IPv6 not tested)
### Limitations ### Limitations
- No authentication/authorization (planned for future releases)
- No encryption in transit (use reverse proxy for TLS) - No encryption in transit (use reverse proxy for TLS)
- No cross-key transactions or ACID guarantees - No cross-key transactions
- No complex queries (key-based lookups only) - No complex queries (key-based lookups only)
- No automatic data sharding (single keyspace per cluster) - No data compression (planned for future releases)
- No multi-datacenter replication
### Performance Characteristics ### Performance Characteristics
- **Read Latency**: ~1ms (local BadgerDB lookup) - **Read Latency**: ~1ms (local BadgerDB lookup)
- **Write Latency**: ~5ms (local write + indexing + optional compression) - **Write Latency**: ~5ms (local write + timestamp indexing)
- **Replication Lag**: 10-30 seconds with Merkle tree sync - **Replication Lag**: 30 seconds - 5 minutes depending on sync cycles
- **Memory Usage**: Minimal (BadgerDB + Merkle tree caching) - **Memory Usage**: Minimal (BadgerDB handles caching efficiently)
- **Disk Usage**: Raw JSON + metadata + optional compression (10-50% savings) - **Disk Usage**: Raw JSON + metadata overhead (~20-30%)
- **Conflict Resolution**: Sub-second convergence time
- **Cluster Formation**: ~10-20 seconds for gossip stabilization
## 🛡️ Production Considerations ## 🛡️ Production Considerations
### Deployment ### Deployment
- Built-in daemon commands (`start`/`stop`/`restart`/`status`) for process management - Use systemd or similar for process management
- Alternatively, use systemd or similar for advanced orchestration - Configure log rotation for JSON logs
- Logs automatically written to `~/.kvs/logs/` (configure log rotation)
- Set up monitoring for `/health` endpoint - Set up monitoring for `/health` endpoint
- Use reverse proxy (nginx/traefik) for TLS and load balancing - Use reverse proxy (nginx/traefik) for TLS and load balancing

View File

@@ -26,15 +26,13 @@ type AuthContext struct {
type AuthService struct { type AuthService struct {
db *badger.DB db *badger.DB
logger *logrus.Logger logger *logrus.Logger
config *types.Config
} }
// NewAuthService creates a new authentication service // NewAuthService creates a new authentication service
func NewAuthService(db *badger.DB, logger *logrus.Logger, config *types.Config) *AuthService { func NewAuthService(db *badger.DB, logger *logrus.Logger) *AuthService {
return &AuthService{ return &AuthService{
db: db, db: db,
logger: logger, logger: logger,
config: config,
} }
} }
@@ -198,40 +196,6 @@ func (s *AuthService) CheckResourcePermission(authCtx *AuthContext, resourceKey
return CheckPermission(metadata.Permissions, operation, isOwner, isGroupMember) return CheckPermission(metadata.Permissions, operation, isOwner, isGroupMember)
} }
// GetResourceMetadata retrieves metadata for a resource
func (s *AuthService) GetResourceMetadata(resourceKey string) (*types.ResourceMetadata, error) {
var metadata types.ResourceMetadata
err := s.db.View(func(txn *badger.Txn) error {
item, err := txn.Get([]byte(ResourceMetadataKey(resourceKey)))
if err != nil {
return err
}
return item.Value(func(val []byte) error {
return json.Unmarshal(val, &metadata)
})
})
if err != nil {
return nil, err
}
return &metadata, nil
}
// SetResourceMetadata stores metadata for a resource
func (s *AuthService) SetResourceMetadata(resourceKey string, metadata *types.ResourceMetadata) error {
metadataBytes, err := json.Marshal(metadata)
if err != nil {
return fmt.Errorf("failed to marshal metadata: %v", err)
}
return s.db.Update(func(txn *badger.Txn) error {
return txn.Set([]byte(ResourceMetadataKey(resourceKey)), metadataBytes)
})
}
// GetAuthContext retrieves auth context from request context // GetAuthContext retrieves auth context from request context
func GetAuthContext(ctx context.Context) *AuthContext { func GetAuthContext(ctx context.Context) *AuthContext {
if authCtx, ok := ctx.Value("auth").(*AuthContext); ok { if authCtx, ok := ctx.Value("auth").(*AuthContext); ok {
@@ -242,23 +206,17 @@ func GetAuthContext(ctx context.Context) *AuthContext {
// HasUsers checks if any users exist in the database // HasUsers checks if any users exist in the database
func (s *AuthService) HasUsers() (bool, error) { func (s *AuthService) HasUsers() (bool, error) {
var hasUsers bool var has bool
err := s.db.View(func(txn *badger.Txn) error { err := s.db.View(func(txn *badger.Txn) error {
opts := badger.DefaultIteratorOptions opts := badger.DefaultIteratorOptions
opts.PrefetchValues = false // We only need to check if keys exist opts.PrefetchValues = false // Only need keys
iterator := txn.NewIterator(opts) it := txn.NewIterator(opts)
defer iterator.Close() defer it.Close()
// Look for any key starting with "user:"
prefix := []byte("user:")
for iterator.Seek(prefix); iterator.ValidForPrefix(prefix); iterator.Next() {
hasUsers = true
return nil // Found at least one user, can exit early
}
prefix := []byte("user:") // Adjust if UserStorageKey uses a different prefix
it.Seek(prefix)
has = it.ValidForPrefix(prefix)
return nil return nil
}) })
return has, err
return hasUsers, err
} }

View File

@@ -1,77 +0,0 @@
package auth
import (
"net/http"
"github.com/sirupsen/logrus"
)
// ClusterAuthService handles authentication for inter-cluster communication
type ClusterAuthService struct {
clusterSecret string
logger *logrus.Logger
}
// NewClusterAuthService creates a new cluster authentication service
func NewClusterAuthService(clusterSecret string, logger *logrus.Logger) *ClusterAuthService {
return &ClusterAuthService{
clusterSecret: clusterSecret,
logger: logger,
}
}
// Middleware validates cluster authentication headers
func (s *ClusterAuthService) Middleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Extract authentication headers
clusterSecret := r.Header.Get("X-Cluster-Secret")
nodeID := r.Header.Get("X-Node-ID")
// Log authentication attempt
s.logger.WithFields(logrus.Fields{
"node_id": nodeID,
"remote_addr": r.RemoteAddr,
"path": r.URL.Path,
"method": r.Method,
}).Debug("Cluster authentication attempt")
// Validate cluster secret
if clusterSecret == "" {
s.logger.WithFields(logrus.Fields{
"node_id": nodeID,
"remote_addr": r.RemoteAddr,
"path": r.URL.Path,
}).Warn("Missing X-Cluster-Secret header")
http.Error(w, "Unauthorized: Missing cluster secret", http.StatusUnauthorized)
return
}
if clusterSecret != s.clusterSecret {
s.logger.WithFields(logrus.Fields{
"node_id": nodeID,
"remote_addr": r.RemoteAddr,
"path": r.URL.Path,
}).Warn("Invalid cluster secret")
http.Error(w, "Unauthorized: Invalid cluster secret", http.StatusUnauthorized)
return
}
// Validate node ID is present
if nodeID == "" {
s.logger.WithFields(logrus.Fields{
"remote_addr": r.RemoteAddr,
"path": r.URL.Path,
}).Warn("Missing X-Node-ID header")
http.Error(w, "Unauthorized: Missing node ID", http.StatusUnauthorized)
return
}
// Authentication successful
s.logger.WithFields(logrus.Fields{
"node_id": nodeID,
"path": r.URL.Path,
}).Debug("Cluster authentication successful")
next.ServeHTTP(w, r)
})
}

View File

@@ -138,12 +138,11 @@ func (s *RateLimitService) RateLimitMiddleware(next http.HandlerFunc) http.Handl
} }
} }
// isAuthEnabled checks if authentication is enabled from config // isAuthEnabled checks if authentication is enabled (would be passed from config)
func (s *AuthService) isAuthEnabled() bool { func (s *AuthService) isAuthEnabled() bool {
if s.config != nil { // This would normally be injected from config, but for now we'll assume enabled
return s.config.AuthEnabled // TODO: Inject config dependency
} return true
return true // Default to enabled if no config
} }
// Helper method to check rate limits (simplified version) // Helper method to check rate limits (simplified version)

View File

@@ -82,19 +82,10 @@ func (s *BootstrapService) attemptJoin(seedAddr string) bool {
return false return false
} }
client := NewAuthenticatedHTTPClient(s.config, 10*time.Second) client := &http.Client{Timeout: 10 * time.Second}
protocol := GetProtocol(s.config) url := fmt.Sprintf("http://%s/members/join", seedAddr)
url := fmt.Sprintf("%s://%s/members/join", protocol, seedAddr)
req, err := http.NewRequest("POST", url, bytes.NewBuffer(jsonData)) resp, err := client.Post(url, "application/json", bytes.NewBuffer(jsonData))
if err != nil {
s.logger.WithError(err).Error("Failed to create join request")
return false
}
req.Header.Set("Content-Type", "application/json")
AddClusterAuthHeaders(req, s.config)
resp, err := client.Do(req)
if err != nil { if err != nil {
s.logger.WithFields(logrus.Fields{ s.logger.WithFields(logrus.Fields{
"seed": seedAddr, "seed": seedAddr,

View File

@@ -181,20 +181,11 @@ func (s *GossipService) gossipWithPeer(peer *types.Member) error {
return err return err
} }
// Send HTTP request to peer with cluster authentication // Send HTTP request to peer
client := NewAuthenticatedHTTPClient(s.config, 5*time.Second) client := &http.Client{Timeout: 5 * time.Second}
protocol := GetProtocol(s.config) url := fmt.Sprintf("http://%s/members/gossip", peer.Address)
url := fmt.Sprintf("%s://%s/members/gossip", protocol, peer.Address)
req, err := http.NewRequest("POST", url, bytes.NewBuffer(jsonData)) resp, err := client.Post(url, "application/json", bytes.NewBuffer(jsonData))
if err != nil {
s.logger.WithError(err).Error("Failed to create gossip request")
return err
}
req.Header.Set("Content-Type", "application/json")
AddClusterAuthHeaders(req, s.config)
resp, err := client.Do(req)
if err != nil { if err != nil {
s.logger.WithFields(logrus.Fields{ s.logger.WithFields(logrus.Fields{
"peer": peer.Address, "peer": peer.Address,

View File

@@ -1,43 +0,0 @@
package cluster
import (
"crypto/tls"
"net/http"
"time"
"kvs/types"
)
// NewAuthenticatedHTTPClient creates an HTTP client configured for cluster authentication
func NewAuthenticatedHTTPClient(config *types.Config, timeout time.Duration) *http.Client {
client := &http.Client{
Timeout: timeout,
}
// Configure TLS if enabled
if config.ClusterTLSEnabled {
tlsConfig := &tls.Config{
InsecureSkipVerify: config.ClusterTLSSkipVerify,
}
client.Transport = &http.Transport{
TLSClientConfig: tlsConfig,
}
}
return client
}
// AddClusterAuthHeaders adds authentication headers to an HTTP request
func AddClusterAuthHeaders(req *http.Request, config *types.Config) {
req.Header.Set("X-Cluster-Secret", config.ClusterSecret)
req.Header.Set("X-Node-ID", config.NodeID)
}
// GetProtocol returns the appropriate protocol (http or https) based on TLS configuration
func GetProtocol(config *types.Config) string {
if config.ClusterTLSEnabled {
return "https"
}
return "http"
}

View File

@@ -186,17 +186,10 @@ func (s *SyncService) performMerkleSync() {
// requestMerkleRoot requests the Merkle root from a peer // requestMerkleRoot requests the Merkle root from a peer
func (s *SyncService) requestMerkleRoot(peerAddress string) (*types.MerkleRootResponse, error) { func (s *SyncService) requestMerkleRoot(peerAddress string) (*types.MerkleRootResponse, error) {
client := NewAuthenticatedHTTPClient(s.config, 10*time.Second) client := &http.Client{Timeout: 10 * time.Second}
protocol := GetProtocol(s.config) url := fmt.Sprintf("http://%s/merkle_tree/root", peerAddress)
url := fmt.Sprintf("%s://%s/merkle_tree/root", protocol, peerAddress)
req, err := http.NewRequest("GET", url, nil) resp, err := client.Get(url)
if err != nil {
return nil, err
}
AddClusterAuthHeaders(req, s.config)
resp, err := client.Do(req)
if err != nil { if err != nil {
return nil, err return nil, err
} }
@@ -301,17 +294,10 @@ func (s *SyncService) handleLeafLevelDiff(peerAddress string, keys []string, loc
// fetchSingleKVFromPeer fetches a single KV pair from a peer // fetchSingleKVFromPeer fetches a single KV pair from a peer
func (s *SyncService) fetchSingleKVFromPeer(peerAddress, path string) (*types.StoredValue, error) { func (s *SyncService) fetchSingleKVFromPeer(peerAddress, path string) (*types.StoredValue, error) {
client := NewAuthenticatedHTTPClient(s.config, 5*time.Second) client := &http.Client{Timeout: 5 * time.Second}
protocol := GetProtocol(s.config) url := fmt.Sprintf("http://%s/kv/%s", peerAddress, path)
url := fmt.Sprintf("%s://%s/kv/%s", protocol, peerAddress, path)
req, err := http.NewRequest("GET", url, nil) resp, err := client.Get(url)
if err != nil {
return nil, err
}
AddClusterAuthHeaders(req, s.config)
resp, err := client.Do(req)
if err != nil { if err != nil {
return nil, err return nil, err
} }
@@ -475,24 +461,16 @@ func (s *SyncService) resolveConflict(key string, local, remote *types.StoredVal
} }
// requestMerkleDiff requests children hashes or keys for a given node/range from a peer // requestMerkleDiff requests children hashes or keys for a given node/range from a peer
func (s *SyncService) requestMerkleDiff(peerAddress string, reqData types.MerkleTreeDiffRequest) (*types.MerkleTreeDiffResponse, error) { func (s *SyncService) requestMerkleDiff(peerAddress string, req types.MerkleTreeDiffRequest) (*types.MerkleTreeDiffResponse, error) {
jsonData, err := json.Marshal(reqData) jsonData, err := json.Marshal(req)
if err != nil { if err != nil {
return nil, err return nil, err
} }
client := NewAuthenticatedHTTPClient(s.config, 10*time.Second) client := &http.Client{Timeout: 10 * time.Second}
protocol := GetProtocol(s.config) url := fmt.Sprintf("http://%s/merkle_tree/diff", peerAddress)
url := fmt.Sprintf("%s://%s/merkle_tree/diff", protocol, peerAddress)
req, err := http.NewRequest("POST", url, bytes.NewBuffer(jsonData)) resp, err := client.Post(url, "application/json", bytes.NewBuffer(jsonData))
if err != nil {
return nil, err
}
req.Header.Set("Content-Type", "application/json")
AddClusterAuthHeaders(req, s.config)
resp, err := client.Do(req)
if err != nil { if err != nil {
return nil, err return nil, err
} }
@@ -547,28 +525,20 @@ func (s *SyncService) handleChildrenDiff(peerAddress string, children []types.Me
// fetchAndStoreRange fetches a range of KV pairs from a peer and stores them locally // fetchAndStoreRange fetches a range of KV pairs from a peer and stores them locally
func (s *SyncService) fetchAndStoreRange(peerAddress string, startKey, endKey string) error { func (s *SyncService) fetchAndStoreRange(peerAddress string, startKey, endKey string) error {
reqData := types.KVRangeRequest{ req := types.KVRangeRequest{
StartKey: startKey, StartKey: startKey,
EndKey: endKey, EndKey: endKey,
Limit: 0, // No limit Limit: 0, // No limit
} }
jsonData, err := json.Marshal(reqData) jsonData, err := json.Marshal(req)
if err != nil { if err != nil {
return err return err
} }
client := NewAuthenticatedHTTPClient(s.config, 30*time.Second) // Longer timeout for range fetches client := &http.Client{Timeout: 30 * time.Second} // Longer timeout for range fetches
protocol := GetProtocol(s.config) url := fmt.Sprintf("http://%s/kv_range", peerAddress)
url := fmt.Sprintf("%s://%s/kv_range", protocol, peerAddress)
req, err := http.NewRequest("POST", url, bytes.NewBuffer(jsonData)) resp, err := client.Post(url, "application/json", bytes.NewBuffer(jsonData))
if err != nil {
return err
}
req.Header.Set("Content-Type", "application/json")
AddClusterAuthHeaders(req, s.config)
resp, err := client.Do(req)
if err != nil { if err != nil {
return err return err
} }

View File

@@ -1,14 +1,12 @@
package config package config
import ( import (
"crypto/rand"
"encoding/base64"
"fmt" "fmt"
"os" "os"
"path/filepath" "path/filepath"
"gopkg.in/yaml.v3"
"kvs/types" "kvs/types"
"gopkg.in/yaml.v3"
) )
// Default configuration // Default configuration
@@ -57,33 +55,9 @@ func Default() *types.Config {
ClusteringEnabled: true, ClusteringEnabled: true,
RateLimitingEnabled: true, RateLimitingEnabled: true,
RevisionHistoryEnabled: true, RevisionHistoryEnabled: true,
// Default anonymous access settings (both disabled by default for security)
AllowAnonymousRead: false,
AllowAnonymousWrite: false,
// Default cluster authentication settings (Issue #13)
ClusterSecret: generateClusterSecret(),
ClusterTLSEnabled: false,
ClusterTLSCertFile: "",
ClusterTLSKeyFile: "",
ClusterTLSSkipVerify: false,
} }
} }
// generateClusterSecret generates a cryptographically secure random cluster secret
func generateClusterSecret() string {
// Generate 32 bytes (256 bits) of random data
randomBytes := make([]byte, 32)
if _, err := rand.Read(randomBytes); err != nil {
// Fallback to a warning - this should never happen in practice
fmt.Fprintf(os.Stderr, "Warning: Failed to generate secure cluster secret: %v\n", err)
return ""
}
// Encode as base64 for easy configuration file storage
return base64.StdEncoding.EncodeToString(randomBytes)
}
// Load configuration from file or create default // Load configuration from file or create default
func Load(configPath string) (*types.Config, error) { func Load(configPath string) (*types.Config, error) {
config := Default() config := Default()
@@ -116,13 +90,5 @@ func Load(configPath string) (*types.Config, error) {
return nil, fmt.Errorf("failed to parse config file: %v", err) return nil, fmt.Errorf("failed to parse config file: %v", err)
} }
// Generate cluster secret if not provided and clustering is enabled (Issue #13)
if config.ClusteringEnabled && config.ClusterSecret == "" {
config.ClusterSecret = generateClusterSecret()
fmt.Printf("Warning: No cluster_secret configured. Generated a random secret.\n")
fmt.Printf(" To share this secret with other nodes, add it to your config:\n")
fmt.Printf(" cluster_secret: %s\n", config.ClusterSecret)
}
return config, nil return config, nil
} }

View File

@@ -1,87 +0,0 @@
package daemon
import (
"fmt"
"os"
"os/exec"
"path/filepath"
"syscall"
)
// GetLogFilePath returns the log file path for a given config file
func GetLogFilePath(configPath string) (string, error) {
logDir, err := getLogDir()
if err != nil {
return "", err
}
absConfigPath, err := filepath.Abs(configPath)
if err != nil {
return "", fmt.Errorf("failed to get absolute config path: %w", err)
}
basename := filepath.Base(configPath)
name := filepath.Base(filepath.Dir(absConfigPath)) + "_" + basename
return filepath.Join(logDir, name+".log"), nil
}
// Daemonize spawns the process as a daemon and returns
func Daemonize(configPath string) error {
// Get absolute path to the current executable
executable, err := os.Executable()
if err != nil {
return fmt.Errorf("failed to get executable path: %w", err)
}
// Get absolute path to config
absConfigPath, err := filepath.Abs(configPath)
if err != nil {
return fmt.Errorf("failed to get absolute config path: %w", err)
}
// Check if already running
_, running, err := ReadPID(configPath)
if err != nil {
return fmt.Errorf("failed to check if instance is running: %w", err)
}
if running {
return fmt.Errorf("instance is already running")
}
// Spawn the process in background with --daemon flag
cmd := exec.Command(executable, "--daemon", absConfigPath)
cmd.SysProcAttr = &syscall.SysProcAttr{
Setsid: true, // Create new session
}
// Redirect stdout/stderr to log file
logDir, err := getLogDir()
if err != nil {
return fmt.Errorf("failed to get log directory: %w", err)
}
if err := os.MkdirAll(logDir, 0755); err != nil {
return fmt.Errorf("failed to create log directory: %w", err)
}
basename := filepath.Base(configPath)
name := filepath.Base(filepath.Dir(absConfigPath)) + "_" + basename
logFile := filepath.Join(logDir, name+".log")
f, err := os.OpenFile(logFile, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0644)
if err != nil {
return fmt.Errorf("failed to open log file: %w", err)
}
defer f.Close()
cmd.Stdout = f
cmd.Stderr = f
if err := cmd.Start(); err != nil {
return fmt.Errorf("failed to start daemon: %w", err)
}
fmt.Printf("Started KVS instance '%s' (PID will be written by daemon)\n", filepath.Base(configPath))
fmt.Printf("Logs: %s\n", logFile)
return nil
}

View File

@@ -1,171 +0,0 @@
package daemon
import (
"fmt"
"os"
"path/filepath"
"strconv"
"strings"
"syscall"
)
// getPIDDir returns the absolute path to the PID directory
func getPIDDir() (string, error) {
homeDir, err := os.UserHomeDir()
if err != nil {
return "", fmt.Errorf("failed to get user home directory: %w", err)
}
return filepath.Join(homeDir, ".kvs", "pids"), nil
}
// getLogDir returns the absolute path to the log directory
func getLogDir() (string, error) {
homeDir, err := os.UserHomeDir()
if err != nil {
return "", fmt.Errorf("failed to get user home directory: %w", err)
}
return filepath.Join(homeDir, ".kvs", "logs"), nil
}
// GetPIDFilePath returns the PID file path for a given config file
func GetPIDFilePath(configPath string) string {
pidDir, err := getPIDDir()
if err != nil {
// Fallback to local directory
pidDir = ".kvs/pids"
}
// Extract basename without extension
basename := filepath.Base(configPath)
name := strings.TrimSuffix(basename, filepath.Ext(basename))
return filepath.Join(pidDir, name+".pid")
}
// EnsurePIDDir creates the PID directory if it doesn't exist
func EnsurePIDDir() error {
pidDir, err := getPIDDir()
if err != nil {
return err
}
return os.MkdirAll(pidDir, 0755)
}
// WritePID writes the current process PID to a file
func WritePID(configPath string) error {
if err := EnsurePIDDir(); err != nil {
return fmt.Errorf("failed to create PID directory: %w", err)
}
pidFile := GetPIDFilePath(configPath)
pid := os.Getpid()
return os.WriteFile(pidFile, []byte(fmt.Sprintf("%d\n", pid)), 0644)
}
// ReadPID reads the PID from a file and checks if the process is running
func ReadPID(configPath string) (int, bool, error) {
pidFile := GetPIDFilePath(configPath)
data, err := os.ReadFile(pidFile)
if err != nil {
if os.IsNotExist(err) {
return 0, false, nil
}
return 0, false, fmt.Errorf("failed to read PID file: %w", err)
}
pidStr := strings.TrimSpace(string(data))
pid, err := strconv.Atoi(pidStr)
if err != nil {
return 0, false, fmt.Errorf("invalid PID in file: %w", err)
}
// Check if process is actually running
process, err := os.FindProcess(pid)
if err != nil {
return pid, false, nil
}
// Send signal 0 to check if process exists
err = process.Signal(syscall.Signal(0))
if err != nil {
return pid, false, nil
}
return pid, true, nil
}
// RemovePID removes the PID file
func RemovePID(configPath string) error {
pidFile := GetPIDFilePath(configPath)
err := os.Remove(pidFile)
if err != nil && !os.IsNotExist(err) {
return fmt.Errorf("failed to remove PID file: %w", err)
}
return nil
}
// ListRunningInstances returns a list of running KVS instances
func ListRunningInstances() ([]InstanceInfo, error) {
var instances []InstanceInfo
pidDir, err := getPIDDir()
if err != nil {
return nil, err
}
// Check if PID directory exists
if _, err := os.Stat(pidDir); os.IsNotExist(err) {
return instances, nil
}
entries, err := os.ReadDir(pidDir)
if err != nil {
return nil, fmt.Errorf("failed to read PID directory: %w", err)
}
for _, entry := range entries {
if entry.IsDir() || !strings.HasSuffix(entry.Name(), ".pid") {
continue
}
name := strings.TrimSuffix(entry.Name(), ".pid")
configPath := name + ".yaml" // Assume .yaml extension
pid, running, err := ReadPID(configPath)
if err != nil {
continue
}
instances = append(instances, InstanceInfo{
Name: name,
PID: pid,
Running: running,
})
}
return instances, nil
}
// InstanceInfo holds information about a KVS instance
type InstanceInfo struct {
Name string
PID int
Running bool
}
// StopProcess stops a process by PID
func StopProcess(pid int) error {
process, err := os.FindProcess(pid)
if err != nil {
return fmt.Errorf("failed to find process: %w", err)
}
// Try graceful shutdown first (SIGTERM)
if err := process.Signal(syscall.SIGTERM); err != nil {
return fmt.Errorf("failed to send SIGTERM: %w", err)
}
return nil
}

View File

@@ -45,7 +45,6 @@ cleanup() {
log_info "Cleaning up test environment..." log_info "Cleaning up test environment..."
pkill -f "$BINARY" 2>/dev/null || true pkill -f "$BINARY" 2>/dev/null || true
rm -rf "$TEST_DIR" 2>/dev/null || true rm -rf "$TEST_DIR" 2>/dev/null || true
rm -rf "$HOME/.kvs" 2>/dev/null || true # Clean up PID and log files from home dir
sleep 2 # Allow processes to fully terminate sleep 2 # Allow processes to fully terminate
} }
@@ -65,15 +64,6 @@ wait_for_service() {
return 1 return 1
} }
# Get log file path for a config file (matches daemon naming convention)
get_log_file() {
local config=$1
local abs_path=$(realpath "$config")
local basename=$(basename "$config")
local dirname=$(basename $(dirname "$abs_path"))
echo "$HOME/.kvs/logs/${dirname}_${basename}.log"
}
# Test 1: Build verification # Test 1: Build verification
test_build() { test_build() {
test_start "Binary build verification" test_start "Binary build verification"
@@ -101,13 +91,11 @@ port: 8090
data_dir: "./basic_data" data_dir: "./basic_data"
seed_nodes: [] seed_nodes: []
log_level: "error" log_level: "error"
allow_anonymous_read: true
allow_anonymous_write: true
EOF EOF
# Start node using daemon command # Start node
$BINARY start basic.yaml >/dev/null 2>&1 $BINARY basic.yaml >/dev/null 2>&1 &
sleep 2 local pid=$!
if wait_for_service 8090; then if wait_for_service 8090; then
# Test basic CRUD # Test basic CRUD
@@ -116,7 +104,7 @@ EOF
-d '{"message":"hello world"}') -d '{"message":"hello world"}')
local get_result=$(curl -s http://localhost:8090/kv/test/basic) local get_result=$(curl -s http://localhost:8090/kv/test/basic)
local message=$(echo "$get_result" | jq -r '.data.message' 2>/dev/null) local message=$(echo "$get_result" | jq -r '.data.message' 2>/dev/null) # Adjusted jq path
if [ "$message" = "hello world" ]; then if [ "$message" = "hello world" ]; then
log_success "Basic CRUD operations work" log_success "Basic CRUD operations work"
@@ -127,17 +115,14 @@ EOF
log_error "Basic test node failed to start" log_error "Basic test node failed to start"
fi fi
$BINARY stop basic.yaml >/dev/null 2>&1 kill $pid 2>/dev/null || true
sleep 1 sleep 2
} }
# Test 3: Cluster formation # Test 3: Cluster formation
test_cluster_formation() { test_cluster_formation() {
test_start "2-node cluster formation and Merkle Tree replication" test_start "2-node cluster formation and Merkle Tree replication"
# Shared cluster secret for authentication (Issue #13)
local CLUSTER_SECRET="test-cluster-secret-12345678901234567890"
# Node 1 config # Node 1 config
cat > cluster1.yaml <<EOF cat > cluster1.yaml <<EOF
node_id: "cluster-1" node_id: "cluster-1"
@@ -149,9 +134,6 @@ log_level: "error"
gossip_interval_min: 5 gossip_interval_min: 5
gossip_interval_max: 10 gossip_interval_max: 10
sync_interval: 10 sync_interval: 10
allow_anonymous_read: true
allow_anonymous_write: true
cluster_secret: "$CLUSTER_SECRET"
EOF EOF
# Node 2 config # Node 2 config
@@ -165,27 +147,25 @@ log_level: "error"
gossip_interval_min: 5 gossip_interval_min: 5
gossip_interval_max: 10 gossip_interval_max: 10
sync_interval: 10 sync_interval: 10
allow_anonymous_read: true
allow_anonymous_write: true
cluster_secret: "$CLUSTER_SECRET"
EOF EOF
# Start nodes using daemon commands # Start nodes
$BINARY start cluster1.yaml >/dev/null 2>&1 $BINARY cluster1.yaml >/dev/null 2>&1 &
sleep 2 local pid1=$!
if ! wait_for_service 8101; then if ! wait_for_service 8101; then
log_error "Cluster node 1 failed to start" log_error "Cluster node 1 failed to start"
$BINARY stop cluster1.yaml >/dev/null 2>&1 kill $pid1 2>/dev/null || true
return 1 return 1
fi fi
$BINARY start cluster2.yaml >/dev/null 2>&1 sleep 2 # Give node 1 a moment to fully initialize
sleep 2 $BINARY cluster2.yaml >/dev/null 2>&1 &
local pid2=$!
if ! wait_for_service 8102; then if ! wait_for_service 8102; then
log_error "Cluster node 2 failed to start" log_error "Cluster node 2 failed to start"
$BINARY stop cluster1.yaml cluster2.yaml >/dev/null 2>&1 kill $pid1 $pid2 2>/dev/null || true
return 1 return 1
fi fi
@@ -234,8 +214,8 @@ EOF
log_error "Cluster formation failed (N1 members: $node1_members, N2 members: $node2_members)" log_error "Cluster formation failed (N1 members: $node1_members, N2 members: $node2_members)"
fi fi
$BINARY stop cluster1.yaml cluster2.yaml >/dev/null 2>&1 kill $pid1 $pid2 2>/dev/null || true
sleep 1 sleep 2
} }
# Test 4: Conflict resolution (Merkle Tree based) # Test 4: Conflict resolution (Merkle Tree based)
@@ -253,9 +233,6 @@ test_conflict_resolution() {
if go run test_conflict.go "$TEST_DIR/conflict1_data" "$TEST_DIR/conflict2_data"; then if go run test_conflict.go "$TEST_DIR/conflict1_data" "$TEST_DIR/conflict2_data"; then
cd "$TEST_DIR" cd "$TEST_DIR"
# Shared cluster secret for authentication (Issue #13)
local CLUSTER_SECRET="conflict-cluster-secret-1234567890123"
# Create configs # Create configs
cat > conflict1.yaml <<EOF cat > conflict1.yaml <<EOF
node_id: "conflict-1" node_id: "conflict-1"
@@ -265,9 +242,6 @@ data_dir: "./conflict1_data"
seed_nodes: [] seed_nodes: []
log_level: "info" log_level: "info"
sync_interval: 3 sync_interval: 3
allow_anonymous_read: true
allow_anonymous_write: true
cluster_secret: "$CLUSTER_SECRET"
EOF EOF
cat > conflict2.yaml <<EOF cat > conflict2.yaml <<EOF
@@ -278,19 +252,17 @@ data_dir: "./conflict2_data"
seed_nodes: ["127.0.0.1:8111"] seed_nodes: ["127.0.0.1:8111"]
log_level: "info" log_level: "info"
sync_interval: 3 sync_interval: 3
allow_anonymous_read: true
allow_anonymous_write: true
cluster_secret: "$CLUSTER_SECRET"
EOF EOF
# Start nodes using daemon commands # Start nodes
# Node 1 started first, making it "older" for tie-breaker if timestamps are equal # Node 1 started first, making it "older" for tie-breaker if timestamps are equal
$BINARY start conflict1.yaml >/dev/null 2>&1 "$BINARY" conflict1.yaml >conflict1.log 2>&1 &
sleep 2 local pid1=$!
if wait_for_service 8111; then if wait_for_service 8111; then
$BINARY start conflict2.yaml >/dev/null 2>&1
sleep 2 sleep 2
$BINARY conflict2.yaml >conflict2.log 2>&1 &
local pid2=$!
if wait_for_service 8112; then if wait_for_service 8112; then
# Get initial data (full StoredValue) # Get initial data (full StoredValue)
@@ -352,10 +324,8 @@ EOF
log_error "Resolved data has inconsistent UUID/Timestamp: N1_UUID=$node1_final_uuid, N1_TS=$node1_final_timestamp, N2_UUID=$node2_final_uuid, N2_TS=$node2_final_timestamp" log_error "Resolved data has inconsistent UUID/Timestamp: N1_UUID=$node1_final_uuid, N1_TS=$node1_final_timestamp, N2_UUID=$node2_final_uuid, N2_TS=$node2_final_timestamp"
fi fi
# Check logs for conflict resolution messages # Optionally, check logs for conflict resolution messages
local log1=$(get_log_file conflict1.yaml) if grep -q "Conflict resolved" conflict1.log conflict2.log 2>/dev/null; then
local log2=$(get_log_file conflict2.yaml)
if grep -q "Conflict resolved" "$log1" "$log2" 2>/dev/null; then
log_success "Conflict resolution messages found in logs" log_success "Conflict resolution messages found in logs"
else else
log_error "No 'Conflict resolved' messages found in logs, but data converged." log_error "No 'Conflict resolved' messages found in logs, but data converged."
@@ -368,243 +338,19 @@ EOF
log_error "Conflict node 2 failed to start" log_error "Conflict node 2 failed to start"
fi fi
$BINARY stop conflict2.yaml >/dev/null 2>&1 kill $pid2 2>/dev/null || true
else else
log_error "Conflict node 1 failed to start" log_error "Conflict node 1 failed to start"
fi fi
$BINARY stop conflict1.yaml >/dev/null 2>&1 kill $pid1 2>/dev/null || true
sleep 1 sleep 2
else else
cd "$TEST_DIR" cd "$TEST_DIR"
log_error "Failed to create conflict test data. Ensure test_conflict.go is correct." log_error "Failed to create conflict test data. Ensure test_conflict.go is correct."
fi fi
} }
# Test 5: Authentication middleware (Issue #4)
test_authentication_middleware() {
test_start "Authentication middleware test (Issue #4)"
# Create auth test config
cat > auth_test.yaml <<EOF
node_id: "auth-test"
bind_address: "127.0.0.1"
port: 8095
data_dir: "./auth_test_data"
seed_nodes: []
log_level: "error"
auth_enabled: true
allow_anonymous_read: false
allow_anonymous_write: false
EOF
# Start node using daemon command
$BINARY start auth_test.yaml >/dev/null 2>&1
sleep 3 # Allow daemon to start and root account creation
if wait_for_service 8095; then
# Extract the token from logs
local log_file=$(get_log_file auth_test.yaml)
local token=$(grep "Token:" "$log_file" | sed 's/.*Token: //' | tr -d '\n\r')
if [ -z "$token" ]; then
log_error "Failed to extract authentication token from logs"
$BINARY stop auth_test.yaml >/dev/null 2>&1
return
fi
# Test 1: Admin endpoints should fail without authentication
local no_auth_response=$(curl -s -X POST http://localhost:8095/api/users -H "Content-Type: application/json" -d '{"nickname":"test","password":"test"}')
if echo "$no_auth_response" | grep -q "Unauthorized"; then
log_success "Admin endpoints properly reject unauthenticated requests"
else
log_error "Admin endpoints should reject unauthenticated requests, got: $no_auth_response"
fi
# Test 2: Admin endpoints should work with valid authentication
local auth_response=$(curl -s -X POST http://localhost:8095/api/users -H "Content-Type: application/json" -H "Authorization: Bearer $token" -d '{"nickname":"authtest","password":"authtest"}')
if echo "$auth_response" | grep -q "uuid"; then
log_success "Admin endpoints work with valid authentication"
else
log_error "Admin endpoints should work with authentication, got: $auth_response"
fi
# Test 3: KV endpoints should require auth when anonymous access is disabled
local kv_no_auth=$(curl -s -X PUT http://localhost:8095/kv/test/auth -H "Content-Type: application/json" -d '{"test":"auth"}')
if echo "$kv_no_auth" | grep -q "Unauthorized"; then
log_success "KV endpoints properly require authentication when anonymous access disabled"
else
log_error "KV endpoints should require auth when anonymous access disabled, got: $kv_no_auth"
fi
# Test 4: KV endpoints should work with valid authentication
local kv_auth=$(curl -s -X PUT http://localhost:8095/kv/test/auth -H "Content-Type: application/json" -H "Authorization: Bearer $token" -d '{"test":"auth"}')
if echo "$kv_auth" | grep -q "uuid\|timestamp" || [ -z "$kv_auth" ]; then
log_success "KV endpoints work with valid authentication"
else
log_error "KV endpoints should work with authentication, got: $kv_auth"
fi
$BINARY stop auth_test.yaml >/dev/null 2>&1
sleep 1
else
log_error "Auth test node failed to start"
$BINARY stop auth_test.yaml >/dev/null 2>&1
fi
}
# Test 6: Resource Metadata Management (Issue #12)
test_metadata_management() {
test_start "Resource Metadata Management test (Issue #12)"
# Create metadata test config
cat > metadata_test.yaml <<EOF
node_id: "metadata-test"
bind_address: "127.0.0.1"
port: 8096
data_dir: "./metadata_test_data"
seed_nodes: []
log_level: "error"
auth_enabled: true
allow_anonymous_read: false
allow_anonymous_write: false
EOF
# Start node using daemon command
$BINARY start metadata_test.yaml >/dev/null 2>&1
sleep 3 # Allow daemon to start and root account creation
if wait_for_service 8096; then
# Extract the token from logs
local log_file=$(get_log_file metadata_test.yaml)
local token=$(grep "Token:" "$log_file" | sed 's/.*Token: //' | tr -d '\n\r')
if [ -z "$token" ]; then
log_error "Failed to extract authentication token from logs"
$BINARY stop metadata_test.yaml >/dev/null 2>&1
return
fi
# First, create a KV resource
curl -s -X PUT http://localhost:8096/kv/test/resource -H "Content-Type: application/json" -H "Authorization: Bearer $token" -d '{"data":"test"}' >/dev/null
sleep 1
# Test 1: Get metadata should fail for non-existent metadata (initially no metadata exists)
local get_response=$(curl -s -w "\n%{http_code}" -X GET http://localhost:8096/kv/test/resource/metadata -H "Authorization: Bearer $token")
local get_body=$(echo "$get_response" | head -n -1)
local get_code=$(echo "$get_response" | tail -n 1)
if [ "$get_code" = "404" ]; then
log_success "GET metadata returns 404 for non-existent metadata"
else
log_error "GET metadata should return 404 for non-existent metadata, got code: $get_code, body: $get_body"
fi
# Test 2: Update metadata should create new metadata
local update_response=$(curl -s -X PUT http://localhost:8096/kv/test/resource/metadata -H "Content-Type: application/json" -H "Authorization: Bearer $token" -d '{"owner_uuid":"test-owner-123","permissions":3840}')
if echo "$update_response" | grep -q "owner_uuid"; then
log_success "PUT metadata creates metadata successfully"
else
log_error "PUT metadata should create metadata, got: $update_response"
fi
# Test 3: Get metadata should now return the created metadata
local get_response2=$(curl -s -X GET http://localhost:8096/kv/test/resource/metadata -H "Authorization: Bearer $token")
if echo "$get_response2" | grep -q "test-owner-123" && echo "$get_response2" | grep -q "3840"; then
log_success "GET metadata returns created metadata"
else
log_error "GET metadata should return created metadata, got: $get_response2"
fi
# Test 4: Update metadata should modify existing metadata
local update_response2=$(curl -s -X PUT http://localhost:8096/kv/test/resource/metadata -H "Content-Type: application/json" -H "Authorization: Bearer $token" -d '{"owner_uuid":"new-owner-456"}')
if echo "$update_response2" | grep -q "new-owner-456"; then
log_success "PUT metadata updates existing metadata"
else
log_error "PUT metadata should update metadata, got: $update_response2"
fi
# Test 5: Metadata endpoints should require authentication
local no_auth=$(curl -s -w "\n%{http_code}" -X GET http://localhost:8096/kv/test/resource/metadata)
local no_auth_code=$(echo "$no_auth" | tail -n 1)
if [ "$no_auth_code" = "401" ]; then
log_success "Metadata endpoints properly require authentication"
else
log_error "Metadata endpoints should require authentication, got code: $no_auth_code"
fi
$BINARY stop metadata_test.yaml >/dev/null 2>&1
sleep 1
else
log_error "Metadata test node failed to start"
$BINARY stop metadata_test.yaml >/dev/null 2>&1
fi
}
# Test 7: Daemon commands (start, stop, status, restart)
test_daemon_commands() {
test_start "Daemon command tests (start, stop, status, restart)"
# Create daemon test config
cat > daemon_test.yaml <<EOF
node_id: "daemon-test"
bind_address: "127.0.0.1"
port: 8097
data_dir: "./daemon_test_data"
seed_nodes: []
log_level: "error"
allow_anonymous_read: true
allow_anonymous_write: true
EOF
# Test 1: Start command
$BINARY start daemon_test.yaml >/dev/null 2>&1
sleep 3 # Allow daemon to start
if wait_for_service 8097 5; then
log_success "Daemon 'start' command works"
# Test 2: Status command shows running
local status_output=$($BINARY status daemon_test.yaml 2>&1)
if echo "$status_output" | grep -q "RUNNING"; then
log_success "Daemon 'status' command shows RUNNING"
else
log_error "Daemon 'status' should show RUNNING, got: $status_output"
fi
# Test 3: Stop command
$BINARY stop daemon_test.yaml >/dev/null 2>&1
sleep 2
# Check that service is actually stopped
if ! curl -s "http://localhost:8097/health" >/dev/null 2>&1; then
log_success "Daemon 'stop' command works"
else
log_error "Daemon should be stopped but is still responding"
fi
# Test 4: Restart command
$BINARY restart daemon_test.yaml >/dev/null 2>&1
sleep 3
if wait_for_service 8097 5; then
log_success "Daemon 'restart' command works"
# Clean up
$BINARY stop daemon_test.yaml >/dev/null 2>&1
sleep 1
else
log_error "Daemon 'restart' failed to start service"
fi
else
log_error "Daemon 'start' command failed"
fi
# Ensure cleanup
pkill -f "daemon_test.yaml" 2>/dev/null || true
sleep 1
}
# Main test execution # Main test execution
main() { main() {
echo "==================================================" echo "=================================================="
@@ -622,9 +368,6 @@ main() {
test_basic_functionality test_basic_functionality
test_cluster_formation test_cluster_formation
test_conflict_resolution test_conflict_resolution
test_authentication_middleware
test_metadata_management
test_daemon_commands
# Results # Results
echo "==================================================" echo "=================================================="

View File

@@ -1,65 +0,0 @@
# Issue #2: Update README.md
**Status:****COMPLETED** *(updated during this session)*
**Author:** MrKalzu
**Created:** 2025-09-12 22:01:34 +03:00
**Repository:** https://git.rauhala.info/ryyst/kalzu-value-store/issues/2
## Description
"It feels like the readme has lot of expired info after the latest update."
## Problem
The project's README file contained outdated information that needed to be revised following recent updates and refactoring.
## Resolution Status
**✅ COMPLETED** - The README.md has been comprehensively updated to reflect the current state of the codebase.
## Updates Made
### Architecture & Features
- ✅ Updated key features to include Merkle Tree sync, JWT authentication, and modular architecture
- ✅ Revised architecture diagram to show modular components
- ✅ Added authentication and authorization sections
- ✅ Updated conflict resolution description
### Configuration
- ✅ Added comprehensive configuration options including feature toggles
- ✅ Updated default values to match actual implementation
- ✅ Added feature toggle documentation (auth, clustering, compression, etc.)
- ✅ Included backup and tamper logging configuration
### API Documentation
- ✅ Added JWT authentication examples
- ✅ Updated API endpoints with proper authorization headers
- ✅ Added authentication endpoints documentation
- ✅ Included Merkle tree and sync endpoints
### Project Structure
- ✅ Completely updated project structure to reflect modular architecture
- ✅ Documented all packages (auth/, cluster/, storage/, server/, etc.)
- ✅ Updated file organization to match current codebase
### Development & Testing
- ✅ Updated build and test commands
- ✅ Added integration test suite documentation
- ✅ Updated conflict resolution testing procedures
- ✅ Added code quality tools documentation
### Performance & Limitations
- ✅ Updated performance characteristics with Merkle sync improvements
- ✅ Revised limitations to reflect implemented features
- ✅ Added realistic timing expectations
## Current Status
The README.md now accurately reflects:
- Current modular architecture
- All implemented features and capabilities
- Proper configuration options
- Updated development workflow
- Comprehensive API documentation
**This issue has been resolved.**

View File

@@ -1,71 +0,0 @@
# Issue #3: Implement Autogenerated Root Account for Initial Setup
**Status:****COMPLETED**
**Author:** MrKalzu
**Created:** 2025-09-12 22:17:12 +03:00
**Repository:** https://git.rauhala.info/ryyst/kalzu-value-store/issues/3
## Problem Statement
The KVS server lacks a mechanism to create an initial administrative user when starting with an empty database and no seed nodes. This makes it impossible to interact with authentication-protected endpoints during initial setup.
## Current Challenge
- Empty database + no seed nodes = no way to authenticate
- No existing users means no way to create API tokens
- Authentication-protected endpoints become inaccessible
- Manual database seeding required for initial setup
## Proposed Solution
### 1. Detection Logic
- Detect empty database condition
- Verify no seed nodes are configured
- Only trigger on initial startup with empty state
### 2. Root Account Generation
Create a default "root" user with:
- **Server-generated UUID**
- **Hashed nickname** (e.g., "root")
- **Assigned to default "admin" group**
- **Full administrative privileges**
### 3. API Token Creation
- Generate API token with administrative scopes
- Include all necessary permissions for initial setup
- Set reasonable expiration time
### 4. Secure Token Distribution
- **Securely log the token to console** (one-time display)
- **Persist user and token in BadgerDB**
- **Clear token from memory after logging**
## Implementation Details
### Relevant Code Sections
- `NewServer` function - Add initialization logic
- `User`, `Group`, `APIToken` structs - Use existing data structures
- Hashing and storage key functions - Leverage existing auth system
### Proposed Changes (from MrKalzu's comment)
- **Added `HasUsers() (bool, error)`** to `auth/auth.go`
- **Added "Initial root account setup for empty DB with no seeds"** to `server/server.go`
- **Diff file attached** with implementation details
## Security Considerations
- Token should be displayed only once during startup
- Token should have reasonable expiration
- Root account should be clearly identified in logs
- Consider forcing password change on first use (future enhancement)
## Benefits
- Enables zero-configuration initial setup
- Provides secure bootstrap process
- Eliminates manual database seeding
- Supports automated deployment scenarios
## Dependencies
This issue blocks **Issue #4** (securing administrative endpoints), as it provides the mechanism for initial administrative access.

View File

@@ -1,59 +0,0 @@
# Issue #4: Secure User and Group Management Endpoints with Authentication Middleware
**Status:** Open
**Author:** MrKalzu
**Created:** 2025-09-12
**Assignee:** ryyst
**Repository:** https://git.rauhala.info/ryyst/kalzu-value-store/issues/4
## Description
**Security Vulnerability:** User, group, and token management API endpoints are currently exposed without authentication, creating a significant security risk.
## Current Problem
The following administrative endpoints are accessible without authentication:
- User management endpoints (`createUserHandler`, `getUserHandler`, etc.)
- Group management endpoints
- Token management endpoints
## Proposed Solution
### 1. Define Granular Administrative Scopes
Create specific administrative scopes for fine-grained access control:
- `admin:users:create` - Create new users
- `admin:users:read` - View user information
- `admin:users:update` - Modify user data
- `admin:users:delete` - Remove users
- `admin:groups:create` - Create new groups
- `admin:groups:read` - View group information
- `admin:groups:update` - Modify group membership
- `admin:groups:delete` - Remove groups
- `admin:tokens:create` - Generate API tokens
- `admin:tokens:revoke` - Revoke API tokens
### 2. Apply Authentication Middleware
Wrap all administrative handlers with `authMiddleware` and specific scope requirements:
```go
// Example implementation
router.Handle("/auth/users", authMiddleware("admin:users:create")(createUserHandler))
router.Handle("/auth/users/{id}", authMiddleware("admin:users:read")(getUserHandler))
```
## Dependencies
- **Depends on Issue #3**: Requires implementation of autogenerated root account for initial setup
## Security Benefits
- Prevents unauthorized administrative access
- Implements principle of least privilege
- Provides audit trail for administrative operations
- Protects against privilege escalation attacks
## Implementation Priority
**High Priority** - This addresses a critical security vulnerability that could allow unauthorized access to administrative functions.

View File

@@ -1,47 +0,0 @@
# Issue #5: Add Configuration for Anonymous Read and Write Access to KV Endpoints
**Status:** Open
**Author:** MrKalzu
**Created:** 2025-09-12
**Repository:** https://git.rauhala.info/ryyst/kalzu-value-store/issues/5
## Description
Currently, KV endpoints are publicly accessible without authentication. This issue proposes adding granular control over public access to key-value store functionality.
## Proposed Configuration Parameters
Add two new configuration parameters to the `Config` struct:
1. **`AllowAnonymousRead`** (boolean, default `false`)
- Controls whether unauthenticated users can read data
2. **`AllowAnonymousWrite`** (boolean, default `false`)
- Controls whether unauthenticated users can write data
## Proposed Implementation Changes
### Modify `setupRoutes` Function
- Conditionally apply authentication middleware based on configuration flags
### Specific Handler Changes
- **`getKVHandler`**: Apply auth middleware with "read" scope if `AllowAnonymousRead` is `false`
- **`putKVHandler`**: Apply auth middleware with "write" scope if `AllowAnonymousWrite` is `false`
- **`deleteKVHandler`**: Always require authentication (no anonymous delete)
## Goal
Provide granular control over public access to key-value store functionality while maintaining security for sensitive operations.
## Use Cases
- **Public read-only deployments**: Allow anonymous reading for public data
- **Public write scenarios**: Allow anonymous data submission (like forms or logs)
- **Secure deployments**: Require authentication for all operations
- **Mixed access patterns**: Different permissions for read vs write operations
## Security Considerations
- Delete operations should always require authentication
- Consider rate limiting for anonymous access
- Audit logging should track anonymous operations differently

View File

@@ -1,46 +0,0 @@
# Issue #6: Configuration Options to Disable Optional Functionalities
**Status:****COMPLETED**
**Author:** MrKalzu
**Created:** 2025-09-12
**Repository:** https://git.rauhala.info/ryyst/kalzu-value-store/issues/6
## Description
Proposes adding configuration options to disable advanced features in the KVS (Key-Value Store) server to allow more flexible and lightweight deployment scenarios.
## Suggested Disablement Options
1. **Authentication System** - Disable JWT authentication entirely
2. **Tamper-Evident Logging** - Disable cryptographic audit trails
3. **Clustering** - Disable gossip protocol and distributed features
4. **Rate Limiting** - Disable per-client rate limiting
5. **Revision History** - Disable automatic versioning
## Proposed Implementation
- Add boolean flags to the Config struct for each feature
- Modify server initialization and request handling to respect these flags
- Allow conditional compilation/execution of features based on configuration
## Pros of Proposed Changes
- Reduce unnecessary overhead for simple deployments
- Simplify setup for different deployment needs
- Improve performance for specific use cases
- Lower resource consumption
## Cons of Proposed Changes
- Potential security risks if features are disabled inappropriately
- Loss of advanced functionality like audit trails or data recovery
- Increased complexity in codebase with conditional feature logic
## Already Implemented Features
- Backup System (configurable)
- Compression (configurable)
## Implementation Notes
The issue suggests modifying relevant code sections to conditionally enable/disable these features based on configuration, similar to how backup and compression are currently handled.

208
main.go
View File

@@ -6,90 +6,26 @@ import (
"os" "os"
"os/signal" "os/signal"
"syscall" "syscall"
"time"
"path/filepath"
"strings"
"kvs/config" "kvs/config"
"kvs/daemon"
"kvs/server" "kvs/server"
) )
func main() { func main() {
if len(os.Args) < 2 { configPath := "./config.yaml"
// No arguments - run in foreground with default config
runServer("./config.yaml", false) // Simple CLI argument parsing
return if len(os.Args) > 1 {
configPath = os.Args[1]
} }
// Check if this is a daemon spawn
if os.Args[1] == "--daemon" {
if len(os.Args) < 3 {
fmt.Fprintf(os.Stderr, "Error: --daemon flag requires config path\n")
os.Exit(1)
}
runServer(os.Args[2], true)
return
}
// Parse subcommand
command := os.Args[1]
switch command {
case "start":
if len(os.Args) < 3 {
fmt.Fprintf(os.Stderr, "Usage: kvs start <config>\n")
os.Exit(1)
}
cmdStart(normalizeConfigPath(os.Args[2]))
case "stop":
if len(os.Args) < 3 {
fmt.Fprintf(os.Stderr, "Usage: kvs stop <config>\n")
os.Exit(1)
}
cmdStop(normalizeConfigPath(os.Args[2]))
case "restart":
if len(os.Args) < 3 {
fmt.Fprintf(os.Stderr, "Usage: kvs restart <config>\n")
os.Exit(1)
}
cmdRestart(normalizeConfigPath(os.Args[2]))
case "status":
if len(os.Args) > 2 {
cmdStatusSingle(normalizeConfigPath(os.Args[2]))
} else {
cmdStatusAll()
}
case "help", "--help", "-h":
printHelp()
default:
// Backward compatibility: assume it's a config file path
runServer(command, false)
}
}
func runServer(configPath string, isDaemon bool) {
cfg, err := config.Load(configPath) cfg, err := config.Load(configPath)
if err != nil { if err != nil {
fmt.Fprintf(os.Stderr, "Failed to load configuration: %v\n", err) fmt.Fprintf(os.Stderr, "Failed to load configuration: %v\n", err)
os.Exit(1) os.Exit(1)
} }
// Write PID file if running as daemon
if isDaemon {
if err := daemon.WritePID(configPath); err != nil {
fmt.Fprintf(os.Stderr, "Failed to write PID file: %v\n", err)
os.Exit(1)
}
defer daemon.RemovePID(configPath)
}
kvServer, err := server.NewServer(cfg) kvServer, err := server.NewServer(cfg)
if err != nil { if err != nil {
fmt.Fprintf(os.Stderr, "Failed to create server: %v\n", err) fmt.Fprintf(os.Stderr, "Failed to create server: %v\n", err)
@@ -110,135 +46,3 @@ func runServer(configPath string, isDaemon bool) {
os.Exit(1) os.Exit(1)
} }
} }
func cmdStart(configPath string) {
if err := daemon.Daemonize(configPath); err != nil {
fmt.Fprintf(os.Stderr, "Failed to start: %v\n", err)
os.Exit(1)
}
}
func cmdStop(configPath string) {
pid, running, err := daemon.ReadPID(configPath)
if err != nil {
fmt.Fprintf(os.Stderr, "Failed to read PID: %v\n", err)
os.Exit(1)
}
if !running {
fmt.Printf("Instance '%s' is not running\n", configPath)
// Clean up stale PID file
daemon.RemovePID(configPath)
return
}
fmt.Printf("Stopping instance '%s' (PID %d)...\n", configPath, pid)
if err := daemon.StopProcess(pid); err != nil {
fmt.Fprintf(os.Stderr, "Failed to stop process: %v\n", err)
os.Exit(1)
}
// Wait a bit and verify it stopped
time.Sleep(1 * time.Second)
_, stillRunning, _ := daemon.ReadPID(configPath)
if stillRunning {
fmt.Printf("Warning: Process may still be running\n")
} else {
daemon.RemovePID(configPath)
fmt.Printf("Stopped successfully\n")
}
}
func cmdRestart(configPath string) {
// Check if running
_, running, err := daemon.ReadPID(configPath)
if err != nil {
fmt.Fprintf(os.Stderr, "Failed to check status: %v\n", err)
os.Exit(1)
}
if running {
cmdStop(configPath)
// Wait a bit for clean shutdown
time.Sleep(2 * time.Second)
}
cmdStart(configPath)
}
func cmdStatusSingle(configPath string) {
pid, running, err := daemon.ReadPID(configPath)
if err != nil {
fmt.Fprintf(os.Stderr, "Failed to read PID: %v\n", err)
os.Exit(1)
}
if running {
fmt.Printf("Instance '%s': RUNNING (PID %d)\n", configPath, pid)
} else if pid > 0 {
fmt.Printf("Instance '%s': STOPPED (stale PID %d)\n", configPath, pid)
} else {
fmt.Printf("Instance '%s': STOPPED\n", configPath)
}
}
func cmdStatusAll() {
instances, err := daemon.ListRunningInstances()
if err != nil {
fmt.Fprintf(os.Stderr, "Failed to list instances: %v\n", err)
os.Exit(1)
}
if len(instances) == 0 {
fmt.Println("No KVS instances found")
return
}
fmt.Println("KVS Instances:")
for _, inst := range instances {
status := "STOPPED"
if inst.Running {
status = "RUNNING"
}
fmt.Printf(" %-20s %s (PID %d)\n", inst.Name, status, inst.PID)
}
}
// normalizeConfigPath ensures config path has .yaml extension if not specified
func normalizeConfigPath(path string) string {
// If path doesn't have an extension, add .yaml
if filepath.Ext(path) == "" {
return path + ".yaml"
}
return path
}
// getConfigIdentifier returns the identifier for a config (basename without extension)
// This is used for PID files and status display
func getConfigIdentifier(path string) string {
basename := filepath.Base(path)
return strings.TrimSuffix(basename, filepath.Ext(basename))
}
func printHelp() {
help := `KVS - Distributed Key-Value Store
Usage:
kvs [config.yaml] Run in foreground (default: ./config.yaml)
kvs start <config> Start as daemon (.yaml extension optional)
kvs stop <config> Stop daemon (.yaml extension optional)
kvs restart <config> Restart daemon (.yaml extension optional)
kvs status [config] Show status (all instances if no config given)
kvs help Show this help
Examples:
kvs # Run with ./config.yaml in foreground
kvs node1.yaml # Run with node1.yaml in foreground
kvs start node1 # Start node1.yaml as daemon
kvs start node1.yaml # Same as above
kvs stop node1 # Stop node1 daemon
kvs status # Show all running instances
kvs status node1 # Show status of node1
`
fmt.Print(help)
}

View File

@@ -22,6 +22,8 @@ import (
"kvs/utils" "kvs/utils"
) )
// healthHandler returns server health status // healthHandler returns server health status
func (s *Server) healthHandler(w http.ResponseWriter, r *http.Request) { func (s *Server) healthHandler(w http.ResponseWriter, r *http.Request) {
mode := s.getMode() mode := s.getMode()
@@ -213,104 +215,6 @@ func (s *Server) deleteKVHandler(w http.ResponseWriter, r *http.Request) {
s.logger.WithField("path", path).Info("Value deleted") s.logger.WithField("path", path).Info("Value deleted")
} }
// getResourceMetadataHandler retrieves metadata for a KV resource
func (s *Server) getResourceMetadataHandler(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
path := vars["path"]
// Get metadata from storage
metadata, err := s.authService.GetResourceMetadata(path)
if err == badger.ErrKeyNotFound {
http.Error(w, "Not Found: No metadata exists for this resource", http.StatusNotFound)
return
}
if err != nil {
s.logger.WithError(err).WithField("path", path).Error("Failed to get resource metadata")
http.Error(w, "Internal Server Error", http.StatusInternalServerError)
return
}
response := types.GetResourceMetadataResponse{
OwnerUUID: metadata.OwnerUUID,
GroupUUID: metadata.GroupUUID,
Permissions: metadata.Permissions,
TTL: metadata.TTL,
CreatedAt: metadata.CreatedAt,
UpdatedAt: metadata.UpdatedAt,
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(response)
}
// updateResourceMetadataHandler updates metadata for a KV resource
func (s *Server) updateResourceMetadataHandler(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
path := vars["path"]
// Parse request body
var req types.UpdateResourceMetadataRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
http.Error(w, "Bad Request: Invalid JSON", http.StatusBadRequest)
return
}
// Get existing metadata or create new one
metadata, err := s.authService.GetResourceMetadata(path)
if err == badger.ErrKeyNotFound {
// Create new metadata with defaults
metadata = &types.ResourceMetadata{
OwnerUUID: "",
GroupUUID: "",
Permissions: types.DefaultPermissions,
TTL: "",
CreatedAt: time.Now().Unix(),
UpdatedAt: time.Now().Unix(),
}
} else if err != nil {
s.logger.WithError(err).WithField("path", path).Error("Failed to get resource metadata")
http.Error(w, "Internal Server Error", http.StatusInternalServerError)
return
}
// Update only provided fields
if req.OwnerUUID != nil {
metadata.OwnerUUID = *req.OwnerUUID
}
if req.GroupUUID != nil {
metadata.GroupUUID = *req.GroupUUID
}
if req.Permissions != nil {
metadata.Permissions = *req.Permissions
}
metadata.UpdatedAt = time.Now().Unix()
// Store updated metadata
if err := s.authService.SetResourceMetadata(path, metadata); err != nil {
s.logger.WithError(err).WithField("path", path).Error("Failed to update resource metadata")
http.Error(w, "Internal Server Error", http.StatusInternalServerError)
return
}
response := types.GetResourceMetadataResponse{
OwnerUUID: metadata.OwnerUUID,
GroupUUID: metadata.GroupUUID,
Permissions: metadata.Permissions,
TTL: metadata.TTL,
CreatedAt: metadata.CreatedAt,
UpdatedAt: metadata.UpdatedAt,
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(response)
s.logger.WithFields(logrus.Fields{
"path": path,
"owner_uuid": metadata.OwnerUUID,
"group_uuid": metadata.GroupUUID,
}).Info("Resource metadata updated")
}
// isClusterMember checks if request is from a cluster member // isClusterMember checks if request is from a cluster member
func (s *Server) isClusterMember(remoteAddr string) bool { func (s *Server) isClusterMember(remoteAddr string) bool {
host, _, err := net.SplitHostPort(remoteAddr) host, _, err := net.SplitHostPort(remoteAddr)
@@ -1367,29 +1271,3 @@ func (s *Server) getRevisionHistory(key string) ([]map[string]interface{}, error
func (s *Server) getSpecificRevision(key string, revision int) (*types.StoredValue, error) { func (s *Server) getSpecificRevision(key string, revision int) (*types.StoredValue, error) {
return s.revisionService.GetSpecificRevision(key, revision) return s.revisionService.GetSpecificRevision(key, revision)
} }
// clusterBootstrapHandler provides the cluster secret to authenticated administrators (Issue #13)
func (s *Server) clusterBootstrapHandler(w http.ResponseWriter, r *http.Request) {
// Ensure clustering is enabled
if !s.config.ClusteringEnabled {
http.Error(w, "Clustering is disabled", http.StatusServiceUnavailable)
return
}
// Ensure cluster secret is configured
if s.config.ClusterSecret == "" {
s.logger.Error("Cluster secret is not configured")
http.Error(w, "Cluster secret is not configured", http.StatusInternalServerError)
return
}
// Return the cluster secret for secure bootstrap
response := map[string]string{
"cluster_secret": s.config.ClusterSecret,
}
s.logger.WithField("remote_addr", r.RemoteAddr).Info("Cluster secret retrieved for bootstrap")
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(response)
}

View File

@@ -1,8 +1,6 @@
package server package server
import ( import (
"net/http"
"github.com/gorilla/mux" "github.com/gorilla/mux"
) )
@@ -10,134 +8,46 @@ import (
func (s *Server) setupRoutes() *mux.Router { func (s *Server) setupRoutes() *mux.Router {
router := mux.NewRouter() router := mux.NewRouter()
// Health endpoint (always available) // Health endpoint
router.HandleFunc("/health", s.healthHandler).Methods("GET") router.HandleFunc("/health", s.healthHandler).Methods("GET")
// Resource Metadata Management endpoints (Issue #12) - Must come BEFORE general KV routes // KV endpoints
// These need to be registered first to prevent /kv/{path:.+} from matching metadata paths
if s.config.AuthEnabled {
router.Handle("/kv/{path:.+}/metadata", s.authService.Middleware(
[]string{"admin:users:read"}, nil, "",
)(s.getResourceMetadataHandler)).Methods("GET")
router.Handle("/kv/{path:.+}/metadata", s.authService.Middleware(
[]string{"admin:users:update"}, nil, "",
)(s.updateResourceMetadataHandler)).Methods("PUT")
}
// KV endpoints (with conditional authentication based on anonymous access settings)
// GET endpoint - require auth if anonymous read is disabled
if s.config.AuthEnabled && !s.config.AllowAnonymousRead {
router.Handle("/kv/{path:.+}", s.authService.Middleware(
[]string{"read"}, nil, "",
)(s.getKVHandler)).Methods("GET")
} else {
router.HandleFunc("/kv/{path:.+}", s.getKVHandler).Methods("GET") router.HandleFunc("/kv/{path:.+}", s.getKVHandler).Methods("GET")
}
// PUT endpoint - require auth if anonymous write is disabled
if s.config.AuthEnabled && !s.config.AllowAnonymousWrite {
router.Handle("/kv/{path:.+}", s.authService.Middleware(
[]string{"write"}, nil, "",
)(s.putKVHandler)).Methods("PUT")
} else {
router.HandleFunc("/kv/{path:.+}", s.putKVHandler).Methods("PUT") router.HandleFunc("/kv/{path:.+}", s.putKVHandler).Methods("PUT")
}
// DELETE endpoint - always require authentication (no anonymous delete)
if s.config.AuthEnabled {
router.Handle("/kv/{path:.+}", s.authService.Middleware(
[]string{"delete"}, nil, "",
)(s.deleteKVHandler)).Methods("DELETE")
} else {
router.HandleFunc("/kv/{path:.+}", s.deleteKVHandler).Methods("DELETE") router.HandleFunc("/kv/{path:.+}", s.deleteKVHandler).Methods("DELETE")
}
// Member endpoints (available when clustering is enabled) // Member endpoints
if s.config.ClusteringEnabled {
// GET /members/ is unprotected for monitoring/inspection
router.HandleFunc("/members/", s.getMembersHandler).Methods("GET") router.HandleFunc("/members/", s.getMembersHandler).Methods("GET")
// Apply cluster authentication middleware to all cluster communication endpoints
if s.clusterAuthService != nil {
router.Handle("/members/join", s.clusterAuthService.Middleware(http.HandlerFunc(s.joinMemberHandler))).Methods("POST")
router.Handle("/members/leave", s.clusterAuthService.Middleware(http.HandlerFunc(s.leaveMemberHandler))).Methods("DELETE")
router.Handle("/members/gossip", s.clusterAuthService.Middleware(http.HandlerFunc(s.gossipHandler))).Methods("POST")
router.Handle("/members/pairs_by_time", s.clusterAuthService.Middleware(http.HandlerFunc(s.pairsByTimeHandler))).Methods("POST")
// Merkle Tree endpoints (clustering feature)
router.Handle("/merkle_tree/root", s.clusterAuthService.Middleware(http.HandlerFunc(s.getMerkleRootHandler))).Methods("GET")
router.Handle("/merkle_tree/diff", s.clusterAuthService.Middleware(http.HandlerFunc(s.getMerkleDiffHandler))).Methods("POST")
router.Handle("/kv_range", s.clusterAuthService.Middleware(http.HandlerFunc(s.getKVRangeHandler))).Methods("POST")
} else {
// Fallback to unprotected endpoints (for backwards compatibility)
router.HandleFunc("/members/join", s.joinMemberHandler).Methods("POST") router.HandleFunc("/members/join", s.joinMemberHandler).Methods("POST")
router.HandleFunc("/members/leave", s.leaveMemberHandler).Methods("DELETE") router.HandleFunc("/members/leave", s.leaveMemberHandler).Methods("DELETE")
router.HandleFunc("/members/gossip", s.gossipHandler).Methods("POST") router.HandleFunc("/members/gossip", s.gossipHandler).Methods("POST")
router.HandleFunc("/members/pairs_by_time", s.pairsByTimeHandler).Methods("POST") router.HandleFunc("/members/pairs_by_time", s.pairsByTimeHandler).Methods("POST") // Still available for clients
// Merkle Tree endpoints (clustering feature) // Merkle Tree endpoints
router.HandleFunc("/merkle_tree/root", s.getMerkleRootHandler).Methods("GET") router.HandleFunc("/merkle_tree/root", s.getMerkleRootHandler).Methods("GET")
router.HandleFunc("/merkle_tree/diff", s.getMerkleDiffHandler).Methods("POST") router.HandleFunc("/merkle_tree/diff", s.getMerkleDiffHandler).Methods("POST")
router.HandleFunc("/kv_range", s.getKVRangeHandler).Methods("POST") router.HandleFunc("/kv_range", s.getKVRangeHandler).Methods("POST") // New endpoint for fetching ranges
}
}
// Authentication and user management endpoints (available when auth is enabled) // User Management endpoints
if s.config.AuthEnabled { router.HandleFunc("/api/users", s.createUserHandler).Methods("POST")
// User Management endpoints (with authentication middleware) router.HandleFunc("/api/users/{uuid}", s.getUserHandler).Methods("GET")
router.Handle("/api/users", s.authService.Middleware( router.HandleFunc("/api/users/{uuid}", s.updateUserHandler).Methods("PUT")
[]string{"admin:users:create"}, nil, "", router.HandleFunc("/api/users/{uuid}", s.deleteUserHandler).Methods("DELETE")
)(s.createUserHandler)).Methods("POST")
router.Handle("/api/users/{uuid}", s.authService.Middleware( // Group Management endpoints
[]string{"admin:users:read"}, nil, "", router.HandleFunc("/api/groups", s.createGroupHandler).Methods("POST")
)(s.getUserHandler)).Methods("GET") router.HandleFunc("/api/groups/{uuid}", s.getGroupHandler).Methods("GET")
router.HandleFunc("/api/groups/{uuid}", s.updateGroupHandler).Methods("PUT")
router.HandleFunc("/api/groups/{uuid}", s.deleteGroupHandler).Methods("DELETE")
router.Handle("/api/users/{uuid}", s.authService.Middleware( // Token Management endpoints
[]string{"admin:users:update"}, nil, "", router.HandleFunc("/api/tokens", s.createTokenHandler).Methods("POST")
)(s.updateUserHandler)).Methods("PUT")
router.Handle("/api/users/{uuid}", s.authService.Middleware( // Revision History endpoints
[]string{"admin:users:delete"}, nil, "",
)(s.deleteUserHandler)).Methods("DELETE")
// Group Management endpoints (with authentication middleware)
router.Handle("/api/groups", s.authService.Middleware(
[]string{"admin:groups:create"}, nil, "",
)(s.createGroupHandler)).Methods("POST")
router.Handle("/api/groups/{uuid}", s.authService.Middleware(
[]string{"admin:groups:read"}, nil, "",
)(s.getGroupHandler)).Methods("GET")
router.Handle("/api/groups/{uuid}", s.authService.Middleware(
[]string{"admin:groups:update"}, nil, "",
)(s.updateGroupHandler)).Methods("PUT")
router.Handle("/api/groups/{uuid}", s.authService.Middleware(
[]string{"admin:groups:delete"}, nil, "",
)(s.deleteGroupHandler)).Methods("DELETE")
// Token Management endpoints (with authentication middleware)
router.Handle("/api/tokens", s.authService.Middleware(
[]string{"admin:tokens:create"}, nil, "",
)(s.createTokenHandler)).Methods("POST")
// Cluster Bootstrap endpoint (Issue #13) - Protected by JWT authentication
// Allows authenticated administrators to retrieve the cluster secret for new nodes
router.Handle("/auth/cluster-bootstrap", s.authService.Middleware(
[]string{"admin:tokens:create"}, nil, "",
)(s.clusterBootstrapHandler)).Methods("GET")
}
// Revision History endpoints (available when revision history is enabled)
if s.config.RevisionHistoryEnabled {
router.HandleFunc("/api/data/{key}/history", s.getRevisionHistoryHandler).Methods("GET") router.HandleFunc("/api/data/{key}/history", s.getRevisionHistoryHandler).Methods("GET")
router.HandleFunc("/api/data/{key}/history/{revision}", s.getSpecificRevisionHandler).Methods("GET") router.HandleFunc("/api/data/{key}/history/{revision}", s.getSpecificRevisionHandler).Methods("GET")
}
// Backup Status endpoint (always available if backup is enabled) // Backup Status endpoint
router.HandleFunc("/api/backup/status", s.getBackupStatusHandler).Methods("GET") router.HandleFunc("/api/backup/status", s.getBackupStatusHandler).Methods("GET")
return router return router

View File

@@ -2,18 +2,18 @@ package server
import ( import (
"context" "context"
"encoding/json"
"fmt" "fmt"
"net/http" "net/http"
"os" "os"
"path/filepath" "path/filepath"
"strings"
"sync" "sync"
"time" "time"
"encoding/json"
"github.com/dgraph-io/badger/v4" "github.com/dgraph-io/badger/v4"
"github.com/robfig/cron/v3" "github.com/robfig/cron/v3"
"github.com/sirupsen/logrus" "github.com/sirupsen/logrus"
"github.com/google/uuid"
"kvs/auth" "kvs/auth"
"kvs/cluster" "kvs/cluster"
@@ -51,7 +51,6 @@ type Server struct {
// Authentication service // Authentication service
authService *auth.AuthService authService *auth.AuthService
clusterAuthService *auth.ClusterAuthService
} }
// NewServer initializes and returns a new Server instance // NewServer initializes and returns a new Server instance
@@ -119,17 +118,88 @@ func NewServer(config *types.Config) (*Server, error) {
server.revisionService = storage.NewRevisionService(storageService) server.revisionService = storage.NewRevisionService(storageService)
// Initialize authentication service // Initialize authentication service
server.authService = auth.NewAuthService(db, logger, config) server.authService = auth.NewAuthService(db, logger)
// Initialize cluster authentication service (Issue #13) // New: Initial root account setup for empty DB with no seeds
if config.ClusteringEnabled { if len(config.SeedNodes) == 0 {
server.clusterAuthService = auth.NewClusterAuthService(config.ClusterSecret, logger) hasUsers, err := server.authService.HasUsers()
if err != nil {
return nil, fmt.Errorf("failed to check for existing users: %v", err)
}
if !hasUsers {
server.logger.Info("Detected empty database with no seed nodes; creating initial root account")
now := time.Now().Unix()
// Create admin group
adminGroupUUID := uuid.NewString()
hashedGroupName := utils.HashGroupName("admin") // Adjust if function name differs
adminGroup := types.Group{
UUID: adminGroupUUID,
Name: hashedGroupName,
CreatedAt: now, // Add if field exists; remove otherwise
// Members: []string{}, // Add if needed; e.g., add root later
}
groupData, err := json.Marshal(adminGroup)
if err != nil {
return nil, fmt.Errorf("failed to marshal admin group: %v", err)
}
err = db.Update(func(txn *badger.Txn) error {
return txn.Set([]byte(GroupStorageKey(adminGroupUUID)), groupData)
})
if err != nil {
return nil, fmt.Errorf("failed to store admin group: %v", err)
} }
// Setup initial root account if needed (Issue #3) // Create root user
if config.AuthEnabled { rootUUID := uuid.NewString()
if err := server.setupRootAccount(); err != nil { hashedNickname := utils.HashUserNickname("root") // Adjust if function name differs
return nil, fmt.Errorf("failed to setup root account: %v", err) rootUser := types.User{
UUID: rootUUID,
Nickname: hashedNickname,
Groups: []string{adminGroupUUID},
CreatedAt: now, // Add if field exists; remove otherwise
}
userData, err := json.Marshal(rootUser)
if err != nil {
return nil, fmt.Errorf("failed to marshal root user: %v", err)
}
err = db.Update(func(txn *badger.Txn) error {
return txn.Set([]byte(UserStorageKey(rootUUID)), userData)
})
if err != nil {
return nil, fmt.Errorf("failed to store root user: %v", err)
}
// Optionally update group members if bidirectional
// adminGroup.Members = append(adminGroup.Members, rootUUID)
// Update group in DB if needed...
// Generate and store API token
scopes := []string{"admin", "read", "write", "create", "delete"}
expirationHours := 8760 // 1 year
tokenString, expiresAt, err := auth.GenerateJWT(rootUUID, scopes, expirationHours)
if err != nil {
return nil, fmt.Errorf("failed to generate JWT: %v", err)
}
err = server.authService.StoreAPIToken(tokenString, rootUUID, scopes, expiresAt)
if err != nil {
return nil, fmt.Errorf("failed to store API token: %v", err)
}
// Log the details securely (only once, to stderr)
fmt.Fprintf(os.Stderr, `
***************************************************************************
WARNING: Initial root user created for new server instance.
Save this information securely—it will not be shown again.
Root User UUID: %s
API Token (Bearer): %s
Expires At: %s (UTC)
Use this token for authentication in API requests. Change or revoke it immediately via the API for security.
***************************************************************************
`, rootUUID, tokenString, time.Unix(expiresAt, 0).UTC())
} }
} }
@@ -198,138 +268,3 @@ func (s *Server) getBackupStatus() types.BackupStatus {
return status return status
} }
// setupRootAccount creates an initial root account if no users exist and no seed nodes are configured
func (s *Server) setupRootAccount() error {
// Only create root account if:
// 1. No users exist in the database
// 2. No seed nodes are configured (standalone mode)
hasUsers, err := s.authService.HasUsers()
if err != nil {
return fmt.Errorf("failed to check if users exist: %v", err)
}
// If users already exist or we have seed nodes, no need to create root account
if hasUsers || len(s.config.SeedNodes) > 0 {
return nil
}
s.logger.Info("Creating initial root account for empty database with no seed nodes")
// Import required packages for user creation
// Note: We need these imports at the top of the file
return s.createRootUserAndToken()
}
// createRootUserAndToken creates the root user, admin group, and initial token
func (s *Server) createRootUserAndToken() error {
rootNickname := "root"
adminGroupName := "admin"
// Generate UUIDs
rootUserUUID := "root-" + time.Now().Format("20060102-150405")
adminGroupUUID := "admin-" + time.Now().Format("20060102-150405")
now := time.Now().Unix()
// Create admin group
adminGroup := types.Group{
UUID: adminGroupUUID,
NameHash: hashGroupName(adminGroupName),
Members: []string{rootUserUUID},
CreatedAt: now,
UpdatedAt: now,
}
// Create root user
rootUser := types.User{
UUID: rootUserUUID,
NicknameHash: hashUserNickname(rootNickname),
Groups: []string{adminGroupUUID},
CreatedAt: now,
UpdatedAt: now,
}
// Store group and user in database
if err := s.storeUserAndGroup(&rootUser, &adminGroup); err != nil {
return fmt.Errorf("failed to store root user and admin group: %v", err)
}
// Create API token with full administrative scopes
adminScopes := []string{
"admin:users:create", "admin:users:read", "admin:users:update", "admin:users:delete",
"admin:groups:create", "admin:groups:read", "admin:groups:update", "admin:groups:delete",
"admin:tokens:create", "admin:tokens:revoke",
"read", "write", "delete",
}
// Generate token with 24 hour expiration for initial setup
tokenString, expiresAt, err := auth.GenerateJWT(rootUserUUID, adminScopes, 24)
if err != nil {
return fmt.Errorf("failed to generate root token: %v", err)
}
// Store token in database
if err := s.storeAPIToken(tokenString, rootUserUUID, adminScopes, expiresAt); err != nil {
return fmt.Errorf("failed to store root token: %v", err)
}
// Log the token securely (one-time display)
s.logger.WithFields(logrus.Fields{
"user_uuid": rootUserUUID,
"group_uuid": adminGroupUUID,
"expires_at": time.Unix(expiresAt, 0).Format(time.RFC3339),
"expires_in": "24 hours",
}).Warn("Root account created - SAVE THIS TOKEN:")
// Display token prominently
fmt.Printf("\n" + strings.Repeat("=", 80) + "\n")
fmt.Printf("🔐 ROOT ACCOUNT CREATED - INITIAL SETUP TOKEN\n")
fmt.Printf("===========================================\n")
fmt.Printf("User UUID: %s\n", rootUserUUID)
fmt.Printf("Group UUID: %s\n", adminGroupUUID)
fmt.Printf("Token: %s\n", tokenString)
fmt.Printf("Expires: %s (24 hours)\n", time.Unix(expiresAt, 0).Format(time.RFC3339))
fmt.Printf("\n⚠ IMPORTANT: Save this token immediately!\n")
fmt.Printf(" This is the only time it will be displayed.\n")
fmt.Printf(" Use this token to authenticate and create additional users.\n")
fmt.Printf(strings.Repeat("=", 80) + "\n\n")
return nil
}
// hashUserNickname creates a hash of the user nickname (similar to handlers.go)
func hashUserNickname(nickname string) string {
return utils.HashSHA3512(nickname)
}
// hashGroupName creates a hash of the group name (similar to handlers.go)
func hashGroupName(groupname string) string {
return utils.HashSHA3512(groupname)
}
// storeUserAndGroup stores both user and group in the database
func (s *Server) storeUserAndGroup(user *types.User, group *types.Group) error {
return s.db.Update(func(txn *badger.Txn) error {
// Store user
userData, err := json.Marshal(user)
if err != nil {
return fmt.Errorf("failed to marshal user data: %v", err)
}
if err := txn.Set([]byte(auth.UserStorageKey(user.UUID)), userData); err != nil {
return fmt.Errorf("failed to store user: %v", err)
}
// Store group
groupData, err := json.Marshal(group)
if err != nil {
return fmt.Errorf("failed to marshal group data: %v", err)
}
if err := txn.Set([]byte(auth.GroupStorageKey(group.UUID)), groupData); err != nil {
return fmt.Errorf("failed to store group: %v", err)
}
return nil
})
}

View File

@@ -131,22 +131,6 @@ type CreateTokenResponse struct {
ExpiresAt int64 `json:"expires_at"` ExpiresAt int64 `json:"expires_at"`
} }
// Resource Metadata Management API structures (Issue #12)
type GetResourceMetadataResponse struct {
OwnerUUID string `json:"owner_uuid"`
GroupUUID string `json:"group_uuid"`
Permissions int `json:"permissions"`
TTL string `json:"ttl"`
CreatedAt int64 `json:"created_at"`
UpdatedAt int64 `json:"updated_at"`
}
type UpdateResourceMetadataRequest struct {
OwnerUUID *string `json:"owner_uuid,omitempty"`
GroupUUID *string `json:"group_uuid,omitempty"`
Permissions *int `json:"permissions,omitempty"`
}
// Cluster and member management types // Cluster and member management types
type Member struct { type Member struct {
ID string `json:"id"` ID string `json:"id"`
@@ -289,15 +273,4 @@ type Config struct {
ClusteringEnabled bool `yaml:"clustering_enabled"` // Enable/disable clustering/gossip ClusteringEnabled bool `yaml:"clustering_enabled"` // Enable/disable clustering/gossip
RateLimitingEnabled bool `yaml:"rate_limiting_enabled"` // Enable/disable rate limiting RateLimitingEnabled bool `yaml:"rate_limiting_enabled"` // Enable/disable rate limiting
RevisionHistoryEnabled bool `yaml:"revision_history_enabled"` // Enable/disable revision history RevisionHistoryEnabled bool `yaml:"revision_history_enabled"` // Enable/disable revision history
// Anonymous access control (Issue #5)
AllowAnonymousRead bool `yaml:"allow_anonymous_read"` // Allow unauthenticated read access to KV endpoints
AllowAnonymousWrite bool `yaml:"allow_anonymous_write"` // Allow unauthenticated write access to KV endpoints
// Cluster authentication (Issue #13)
ClusterSecret string `yaml:"cluster_secret"` // Shared secret for cluster authentication (auto-generated if empty)
ClusterTLSEnabled bool `yaml:"cluster_tls_enabled"` // Require TLS for inter-node communication
ClusterTLSCertFile string `yaml:"cluster_tls_cert_file"` // Path to TLS certificate file
ClusterTLSKeyFile string `yaml:"cluster_tls_key_file"` // Path to TLS private key file
ClusterTLSSkipVerify bool `yaml:"cluster_tls_skip_verify"` // Skip TLS verification (insecure, for testing only)
} }