# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Commands for Development

### Build and Test Commands

```bash
# Build the binary
go build -o kvs .

# Run with default config (auto-generates config.yaml)
./kvs start config.yaml

# Run with custom config
./kvs start /path/to/config.yaml

# Check running instances
./kvs status

# Stop instance
./kvs stop config

# Run comprehensive integration tests
./integration_test.sh

# Create test conflict data for debugging
go run test_conflict.go data1 data2

# Build and test in one go
go build -o kvs . && ./integration_test.sh
```
### Process Management Commands

```bash
# Start as background daemon
./kvs start <config.yaml>   # .yaml extension optional

# Stop daemon
./kvs stop <config>         # Graceful SIGTERM shutdown

# Restart daemon
./kvs restart <config>      # Stop then start

# Show status
./kvs status                # All instances
./kvs status <config>       # Specific instance

# Run in foreground (for debugging)
./kvs <config.yaml>         # Logs to stdout, blocks terminal

# View daemon logs
tail -f ~/.kvs/logs/kvs_<config>.yaml.log

# Global state directories
~/.kvs/pids/   # PID files (works from any directory)
~/.kvs/logs/   # Daemon log files
```
### Development Workflow

```bash
# Format and check code
go fmt ./...
go vet ./...

# Manage dependencies
go mod tidy

# Check build without artifacts
go build .

# Test specific cluster scenarios
./kvs start node1.yaml
./kvs start node2.yaml

# Wait for cluster formation
sleep 5

# Test data operations
curl -X PUT http://localhost:8081/kv/test/data -H "Content-Type: application/json" -d '{"test":"data"}'
curl http://localhost:8082/kv/test/data   # Should replicate within ~30 seconds

# Check daemon status
./kvs status

# View logs
tail -f ~/.kvs/logs/kvs_node1.yaml.log

# Cleanup
./kvs stop node1
./kvs stop node2
```
## Architecture Overview

### High-Level Structure

KVS is a distributed, eventually consistent key-value store built around three core systems:

- **Gossip Protocol** (`cluster/gossip.go`) - Decentralized membership management and failure detection
- **Merkle Tree Sync** (`cluster/sync.go`, `cluster/merkle.go`) - Efficient data synchronization and conflict resolution
- **Modular Server** (`server/`) - HTTP API with pluggable feature modules
### Key Architectural Patterns

#### Modular Package Design

- `auth/` - Complete JWT authentication system with POSIX-inspired permissions
- `cluster/` - Distributed systems logic (gossip, sync, merkle trees)
- `daemon/` - Process management (daemonization, PID files, lifecycle)
- `storage/` - BadgerDB abstraction with compression and revision history
- `server/` - HTTP handlers, routing, and lifecycle management
- `features/` - Utility functions for TTL, rate limiting, tamper logging, backup
- `types/` - Centralized type definitions for all components
- `config/` - Configuration loading with auto-generation
- `utils/` - Cryptographic hashing utilities
#### Core Data Model

```go
// Primary storage format
type StoredValue struct {
	UUID      string          `json:"uuid"`      // Unique version identifier
	Timestamp int64           `json:"timestamp"` // Unix timestamp (milliseconds)
	Data      json.RawMessage `json:"data"`      // Actual user JSON payload
}
```
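As a minimal, hypothetical sketch of a write path, the following wraps a user payload in that struct. It assumes `github.com/google/uuid` for version identifiers (check `go.mod` for the repo's actual dependency); `wrapPayload` is an illustrative name, not a function from the codebase:

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"

	"github.com/google/uuid" // assumed dependency; check go.mod
)

// wrapPayload stamps a raw user payload with a fresh version UUID
// and a millisecond Unix timestamp, matching StoredValue above.
func wrapPayload(payload []byte) StoredValue {
	return StoredValue{
		UUID:      uuid.NewString(),
		Timestamp: time.Now().UnixMilli(),
		Data:      json.RawMessage(payload),
	}
}

func main() {
	sv := wrapPayload([]byte(`{"test":"data"}`))
	out, _ := json.Marshal(sv)
	fmt.Println(string(out)) // {"uuid":"...","timestamp":...,"data":{"test":"data"}}
}
```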
#### Critical System Interactions

**Conflict Resolution Flow:**

1. Merkle trees detect divergent data between nodes (`cluster/merkle.go`)
2. Sync service fetches conflicting keys (`cluster/sync.go:fetchAndCompareData`)
3. Conflict resolution logic in `resolveConflict()` (sketched below):
   - Same timestamp → apply "oldest-node rule" (earliest `joined_timestamp` wins)
   - Tie-breaker → UUID comparison for deterministic results
4. Winner's data automatically replicated to losing nodes
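A condensed sketch of those rules, assuming last-write-wins when timestamps differ (which the tie rules imply). The real logic lives in `resolveConflict()` in `cluster/sync.go`; the `Version` wrapper and its field names here are illustrative:

```go
// Version pairs a StoredValue with the join time of the node holding it.
// Illustrative type, not the repo's actual shape.
type Version struct {
	Value           StoredValue
	JoinedTimestamp int64 // the holding node's joined_timestamp
}

// resolveConflict picks a deterministic winner between two versions of a key.
func resolveConflict(a, b Version) Version {
	// Differing timestamps: the later write wins.
	if a.Value.Timestamp != b.Value.Timestamp {
		if a.Value.Timestamp > b.Value.Timestamp {
			return a
		}
		return b
	}
	// Same timestamp: oldest-node rule (earliest joined_timestamp wins).
	if a.JoinedTimestamp != b.JoinedTimestamp {
		if a.JoinedTimestamp < b.JoinedTimestamp {
			return a
		}
		return b
	}
	// Final tie-breaker: lexicographic UUID comparison for determinism.
	if a.Value.UUID < b.Value.UUID {
		return a
	}
	return b
}
```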
**Authentication & Authorization:**

- JWT tokens with scoped permissions (`auth/jwt.go`)
- POSIX-inspired 12-bit permission system (`types/types.go:52-75`) - see the sketch below
- Resource ownership metadata with TTL support (`types/ResourceMetadata`)
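The exact bit layout is defined in `types/types.go:52-75`. One plausible reading of "POSIX-inspired 12-bit" is three principal classes (owner/group/others) times four operations; the constants below are hypothetical and exist only to make the idea concrete:

```go
// Hypothetical layout: owner | group | others, each with
// create/read/update/delete bits. Verify against types/types.go:52-75.
const (
	PermOwnerCreate uint16 = 1 << 11
	PermOwnerRead   uint16 = 1 << 10
	PermOwnerUpdate uint16 = 1 << 9
	PermOwnerDelete uint16 = 1 << 8
	PermGroupCreate uint16 = 1 << 7
	PermGroupRead   uint16 = 1 << 6
	PermGroupUpdate uint16 = 1 << 5
	PermGroupDelete uint16 = 1 << 4
	PermOtherCreate uint16 = 1 << 3
	PermOtherRead   uint16 = 1 << 2
	PermOtherUpdate uint16 = 1 << 1
	PermOtherDelete uint16 = 1 << 0
)

// hasPerm tests a single bit in a 12-bit permission mask.
func hasPerm(mask, bit uint16) bool { return mask&bit != 0 }
```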
**Storage Strategy:**

- Main keys: Direct path mapping (`users/john/profile`)
- Index keys: `_ts:{timestamp}:{path}` for time-based queries (sketched below)
- Compression: Optional ZSTD compression (`storage/compression.go`)
- Revisions: Optional revision history (`storage/revision.go`)
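A small sketch of the index-key format quoted above; the helper name is illustrative, and the real implementation may zero-pad the timestamp so BadgerDB's lexicographic key ordering matches chronological order:

```go
import "fmt"

// timestampIndexKey builds the secondary index key "_ts:{timestamp}:{path}",
// e.g. timestampIndexKey(1700000000000, "users/john/profile")
// -> "_ts:1700000000000:users/john/profile".
func timestampIndexKey(tsMillis int64, path string) string {
	return fmt.Sprintf("_ts:%d:%s", tsMillis, path)
}
```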
### Configuration Architecture

The system uses feature toggles extensively (`types/Config:271-280`):

```yaml
auth_enabled: true              # JWT authentication system
tamper_logging_enabled: true    # Cryptographic audit trail
clustering_enabled: true        # Gossip protocol and sync
rate_limiting_enabled: true     # Per-client rate limiting
revision_history_enabled: true  # Automatic versioning

# Anonymous access control (Issue #5 - when auth_enabled: true)
allow_anonymous_read: false     # Allow unauthenticated read access to KV endpoints
allow_anonymous_write: false    # Allow unauthenticated write access to KV endpoints
```

**Security Note:** DELETE operations always require authentication when `auth_enabled: true`, regardless of anonymous access settings.
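A sketch of the gate those toggles imply for unauthenticated requests; the type and function names are illustrative, not the repo's middleware API:

```go
import "net/http"

// AnonCfg mirrors the three anonymous-access toggles (illustrative name).
type AnonCfg struct {
	AuthEnabled         bool
	AllowAnonymousRead  bool
	AllowAnonymousWrite bool
}

// allowAnonymous reports whether an unauthenticated KV request may proceed.
func allowAnonymous(method string, cfg AnonCfg) bool {
	if !cfg.AuthEnabled {
		return true // auth disabled: no authentication required anywhere
	}
	switch method {
	case http.MethodGet, http.MethodHead:
		return cfg.AllowAnonymousRead
	case http.MethodPut, http.MethodPost:
		return cfg.AllowAnonymousWrite
	default: // DELETE (and anything else) always requires authentication
		return false
	}
}
```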
## Testing Strategy

### Integration Test Suite (`integration_test.sh`)

- **Build verification** - Ensures the binary compiles correctly
- **Basic functionality** - Single-node CRUD operations
- **Cluster formation** - 2-node gossip protocol and data replication
- **Conflict resolution** - Automated conflict detection and resolution using `test_conflict.go`
- **Authentication middleware** - Comprehensive security testing (Issue #4):
  - Admin endpoints properly reject unauthenticated requests
  - Admin endpoints work with valid JWT tokens
  - KV endpoints respect anonymous access configuration
  - Automatic root account creation and token extraction
The test suite relies on retry loops and generous wait times to handle the eventually consistent nature of the system.
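The suite itself is bash, but the pattern is simple polling. A Go sketch of the same idea (the helper name is illustrative):

```go
import (
	"fmt"
	"net/http"
	"time"
)

// waitForKey polls url until it returns 200 or the deadline passes,
// mirroring the retry-and-wait pattern the bash suite uses for replication.
func waitForKey(url string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		resp, err := http.Get(url)
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return nil
			}
		}
		time.Sleep(2 * time.Second)
	}
	return fmt.Errorf("key not replicated within %v: %s", timeout, url)
}
```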
### Conflict Testing Utility (`test_conflict.go`)

Creates two BadgerDB instances with intentionally conflicting data (same path, same timestamp, different UUIDs) to test the conflict resolution algorithm.
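A condensed sketch of what such a utility does, reusing the `StoredValue` struct above and assuming BadgerDB's v4 import path (the real `test_conflict.go` may differ in detail):

```go
package main

import (
	"encoding/json"
	"log"
	"os"
	"time"

	badger "github.com/dgraph-io/badger/v4" // assumed major version; check go.mod
)

// writeValue opens a BadgerDB at dir and stores sv under path.
func writeValue(dir, path string, sv StoredValue) error {
	db, err := badger.Open(badger.DefaultOptions(dir))
	if err != nil {
		return err
	}
	defer db.Close()
	raw, _ := json.Marshal(sv)
	return db.Update(func(txn *badger.Txn) error {
		return txn.Set([]byte(path), raw)
	})
}

func main() {
	ts := time.Now().UnixMilli()
	// Same path, same timestamp, different UUIDs: forces the tie-break rules.
	a := StoredValue{UUID: "uuid-node-1", Timestamp: ts, Data: json.RawMessage(`{"node":1}`)}
	b := StoredValue{UUID: "uuid-node-2", Timestamp: ts, Data: json.RawMessage(`{"node":2}`)}
	if err := writeValue(os.Args[1], "conflict/test", a); err != nil {
		log.Fatal(err)
	}
	if err := writeValue(os.Args[2], "conflict/test", b); err != nil {
		log.Fatal(err)
	}
}
```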
## Development Notes

### Key Constraints

- **Eventually Consistent**: All operations succeed locally first, then replicate
- **Local-First Truth**: Nodes operate independently and sync in the background
- **No Transactions**: Each key operation is atomic and independent
- **Hierarchical Keys**: Support for path-like structures (`/home/room/closet/socks`)
### Critical Timing Considerations

- **Gossip intervals**: 1-2 minutes for membership updates
- **Sync intervals**: 5 minutes for regular data sync, 2 minutes for catch-up
- **Conflict resolution**: Typically resolves within 10-30 seconds after detection
- **Bootstrap sync**: Up to 30 days of historical data for new nodes
### Main Entry Point Flow

1. `main.go` parses command-line arguments for subcommands (`start`, `stop`, `status`, `restart`) - dispatch sketched below
2. For daemon mode: `daemon.Daemonize()` spawns the background process and manages PID files
3. For server mode: loads config (auto-generates a default if missing)
4. `server.NewServer()` initializes all subsystems
5. Graceful shutdown handling with `SIGINT`/`SIGTERM`
6. All business logic is delegated to modular packages
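Skeletally, that dispatch might look like the following; the comments stand in for calls into the repo's `daemon/` and `server/` packages rather than real function signatures:

```go
package main

import (
	"fmt"
	"os"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: kvs <start|stop|status|restart|config.yaml>")
		os.Exit(1)
	}
	switch os.Args[1] {
	case "start":
		// daemon.Daemonize(...) spawns the detached process and writes a PID file
	case "stop":
		// look up the PID under ~/.kvs/pids/ and send SIGTERM
	case "status":
		// report one or all instances from ~/.kvs/pids/
	case "restart":
		// stop, then start
	default:
		// foreground mode: treat the argument as a config path, load it
		// (auto-generating a default if missing), run server.NewServer(...)
		// until SIGINT/SIGTERM.
	}
}
```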
### Daemon Architecture

- **PID Management**: Global PID files stored in `~/.kvs/pids/` for cross-directory access
- **Logging**: Daemon logs written to `~/.kvs/logs/{config-name}.log`
- **Process Lifecycle**: Spawns a detached process via `exec.Command()` with `Setsid: true` (sketched below)
- **Config Normalization**: Supports both `node1` and `node1.yaml` formats
- **Stale PID Detection**: Checks process existence via `Signal(0)` before operations
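A sketch of the two mechanisms named above: detaching via `Setsid` and the signal-0 liveness probe. Function names and paths are illustrative; the real code lives in `daemon/`:

```go
package daemon

import (
	"os"
	"os/exec"
	"syscall"
)

// spawnDetached re-executes the binary in foreground mode, detached from
// the controlling terminal via a new session (Setsid), with logs redirected.
func spawnDetached(binary, configPath, logPath string) (int, error) {
	logFile, err := os.OpenFile(logPath, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return 0, err
	}
	cmd := exec.Command(binary, configPath)
	cmd.Stdout = logFile
	cmd.Stderr = logFile
	cmd.SysProcAttr = &syscall.SysProcAttr{Setsid: true}
	if err := cmd.Start(); err != nil {
		return 0, err
	}
	return cmd.Process.Pid, nil // caller records this under ~/.kvs/pids/
}

// isAlive reports whether pid refers to a live process, using the
// conventional signal-0 probe (no signal is actually delivered).
func isAlive(pid int) bool {
	proc, err := os.FindProcess(pid)
	if err != nil {
		return false
	}
	return proc.Signal(syscall.Signal(0)) == nil
}
```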
This architecture enables easy feature addition, comprehensive testing, and reliable operation in distributed environments while maintaining simplicity for single-node deployments.