chore: update documentation
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Commands for Development

### Build and Test Commands
```bash
# Build the binary
go build -o kvs .

# Run with default config (auto-generates config.yaml)
./kvs

# Run with custom config
./kvs /path/to/config.yaml

# Run comprehensive integration tests
./integration_test.sh

# Create test conflict data for debugging
go run test_conflict.go data1 data2

# Build and test in one go
go build -o kvs . && ./integration_test.sh
```

### Development Workflow
```bash
# Format and check code
go fmt ./...
go vet ./...

# Tidy module dependencies
go mod tidy

# Check that the code compiles
go build .

# Test specific cluster scenarios
./kvs node1.yaml & # Terminal 1
./kvs node2.yaml & # Terminal 2
curl -X PUT http://localhost:8081/kv/test/data -H "Content-Type: application/json" -d '{"test":"data"}'
curl http://localhost:8082/kv/test/data # Should replicate within ~30 seconds
pkill kvs
```

## Architecture Overview

### High-Level Structure
KVS is a **distributed, eventually consistent key-value store** built around three core systems:

1. **Gossip Protocol** (`cluster/gossip.go`) - Decentralized membership management and failure detection
2. **Merkle Tree Sync** (`cluster/sync.go`, `cluster/merkle.go`) - Efficient data synchronization and conflict resolution
3. **Modular Server** (`server/`) - HTTP API with pluggable feature modules

### Key Architectural Patterns

#### Modular Package Design
- **`auth/`** - Complete JWT authentication system with POSIX-inspired permissions
- **`cluster/`** - Distributed-systems logic (gossip, sync, Merkle trees)
- **`storage/`** - BadgerDB abstraction with compression and revision history
- **`server/`** - HTTP handlers, routing, and lifecycle management
- **`features/`** - Utility functions for TTL, rate limiting, tamper logging, and backup
- **`types/`** - Centralized type definitions for all components
- **`config/`** - Configuration loading with auto-generation
- **`utils/`** - Cryptographic hashing utilities

#### Core Data Model
```go
// Primary storage format
type StoredValue struct {
	UUID      string          `json:"uuid"`      // Unique version identifier
	Timestamp int64           `json:"timestamp"` // Unix timestamp (milliseconds)
	Data      json.RawMessage `json:"data"`      // Actual user JSON payload
}
```
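
As a quick illustration of the model, a write gets wrapped in this envelope before it reaches storage. This is a hedged sketch: `wrapValue` and its random-hex UUID are illustrative stand-ins, not the repository's actual helpers.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"time"
)

// StoredValue mirrors the primary storage format above.
type StoredValue struct {
	UUID      string          `json:"uuid"`
	Timestamp int64           `json:"timestamp"`
	Data      json.RawMessage `json:"data"`
}

// wrapValue packages a raw user JSON payload into a StoredValue.
// The random hex ID is a stand-in for the store's real UUID scheme.
func wrapValue(payload []byte) StoredValue {
	id := make([]byte, 16)
	rand.Read(id) // crypto/rand; error ignored for brevity in this sketch
	return StoredValue{
		UUID:      hex.EncodeToString(id),
		Timestamp: time.Now().UnixMilli(),
		Data:      json.RawMessage(payload),
	}
}

func main() {
	sv := wrapValue([]byte(`{"name":"john"}`))
	out, _ := json.Marshal(sv)
	fmt.Println(string(out))
}
```

Because `Data` is `json.RawMessage`, the user's payload round-trips byte-for-byte instead of being re-encoded.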

#### Critical System Interactions

**Conflict Resolution Flow:**
1. Merkle trees detect divergent data between nodes (`cluster/merkle.go`)
2. The sync service fetches the conflicting keys (`cluster/sync.go:fetchAndCompareData`)
3. Conflict resolution logic in `resolveConflict()`:
   - Same timestamp → apply the "oldest-node rule" (earliest `joined_timestamp` wins)
   - Tie-breaker → UUID comparison for deterministic results
   - The winner's data is automatically replicated to the losing nodes

**Authentication & Authorization:**
- JWT tokens with scoped permissions (`auth/jwt.go`)
- POSIX-inspired 12-bit permission system (`types/types.go:52-75`)
- Resource ownership metadata with TTL support (`types/ResourceMetadata`)

**Storage Strategy:**
- **Main keys**: Direct path mapping (`users/john/profile`)
- **Index keys**: `_ts:{timestamp}:{path}` for time-based queries
- **Compression**: Optional ZSTD compression (`storage/compression.go`)
- **Revisions**: Optional revision history (`storage/revision.go`)

### Configuration Architecture

The system uses feature toggles extensively (`types/Config:271-276`):
```yaml
auth_enabled: true              # JWT authentication system
tamper_logging_enabled: true    # Cryptographic audit trail
clustering_enabled: true        # Gossip protocol and sync
rate_limiting_enabled: true     # Per-client rate limiting
revision_history_enabled: true  # Automatic versioning
```

### Testing Strategy

#### Integration Test Suite (`integration_test.sh`)
- **Build verification** - Ensures the binary compiles correctly
- **Basic functionality** - Single-node CRUD operations
- **Cluster formation** - Two-node gossip protocol and data replication
- **Conflict resolution** - Automated conflict detection and resolution using `test_conflict.go`

The test suite uses retry logic and careful timing to handle the eventually consistent nature of the system.

#### Conflict Testing Utility (`test_conflict.go`)
Creates two BadgerDB instances with intentionally conflicting data (same path, same timestamp, different UUIDs) to exercise the conflict resolution algorithm.

### Development Notes

#### Key Constraints
- **Eventually consistent**: All operations succeed locally first, then replicate
- **Local-first truth**: Nodes operate independently and sync in the background
- **No transactions**: Each key operation is atomic and independent
- **Hierarchical keys**: Support for path-like structures (`/home/room/closet/socks`)

#### Critical Timing Considerations
- **Gossip intervals**: 1-2 minutes for membership updates
- **Sync intervals**: 5 minutes for regular data sync, 2 minutes for catch-up
- **Conflict resolution**: Typically resolves within 10-30 seconds of detection
- **Bootstrap sync**: Up to 30 days of historical data for new nodes

#### Main Entry Point Flow
1. `main.go` loads the config (auto-generating a default if missing)
2. `server.NewServer()` initializes all subsystems
3. Graceful shutdown is handled via `SIGINT`/`SIGTERM`
4. All business logic is delegated to the modular packages

This architecture enables easy feature addition, comprehensive testing, and reliable operation in distributed environments while keeping single-node deployments simple.