# Data Output Documentation: `analysis.db` ## 1. Database Overview The `analysis.db` database is a **Derived Market Analysis Store**. It contains **technical indicator outputs** calculated from the raw OHLCV candle data stored in `candles.db`. This database is **append-only**, preserves full historical analysis, and is designed to act as a **clean input source** for downstream services such as: * Signal generators * Strategy evaluators * Paper/live trading engines * Dashboards and visualization layers * Alerting systems - **Database Engine:** SQLite 3 - **Concurrency Mode:** WAL (Write-Ahead Logging) enabled - **Update Frequency:** Near real-time (continuous processing loop) - **Retention Model:** Rolling window (currently ~1 month via cleanup + incremental vacuum) --- ## 2. Relationship to `candles.db` This database is **fully derived** from `candles.db`. * Each row in `analysis.db` corresponds **1:1** with a *closed* candle from `candles.db` * No raw trade or OHLCV data is duplicated beyond what is needed for indicator outputs * Only **completed candles** are analyzed (the currently-forming candle per timeframe is intentionally excluded) This separation ensures: * Clean responsibility boundaries * Safe concurrent reads * Zero risk of contaminating raw market data --- ## 3. Schema Definition The database contains a single primary table: `analysis`. ### Table: `analysis` | Column | Type | Description | | -------------- | --------- | -------------------------------------------------------------- | | `timeframe` | `TEXT` | Candle timeframe: `1m`, `5m`, `15m`, `1h` | | `timestamp` | `INTEGER` | Unix timestamp (seconds) – **start of the candle window** | | `ema_9` | `REAL` | Exponential Moving Average (9) | | `ema_21` | `REAL` | Exponential Moving Average (21) | | `sma_50` | `REAL` | Simple Moving Average (50) | | `sma_200` | `REAL` | Simple Moving Average (200) | | `rsi_14` | `REAL` | Relative Strength Index (14) | | `macd` | `REAL` | MACD line (EMA12 − EMA26) | | `macd_signal` | `REAL` | MACD signal line (9-period EMA of MACD) | | `macd_hist` | `REAL` | MACD histogram (`macd − macd_signal`) | | `bb_upper` | `REAL` | Bollinger Band upper (20, 2σ) | | `bb_middle` | `REAL` | Bollinger Band middle (20 SMA) | | `bb_lower` | `REAL` | Bollinger Band lower (20, 2σ) | | `bb_squeeze` | `INTEGER` | Volatility squeeze flag (`1` = squeeze detected, `0` = normal) | | `volume_ma_20` | `REAL` | 20-period moving average of volume | ### Primary Key ``` (timeframe, timestamp) ``` This guarantees: * No duplicate indicator rows * Perfect alignment with candle boundaries * Deterministic joins with `candles.db` --- ## 4. Indicator Semantics ### Moving Averages * **EMA 9 / 21**: Short-term momentum and trend confirmation * **SMA 50 / 200**: Medium- and long-term trend structure ### RSI (14) * Range: `0–100` * Typical interpretations: * `>70` → Overbought * `<30` → Oversold * Calculated using classic Wilder-style average gains/losses ### MACD (12, 26, 9) * `macd`: Momentum direction * `macd_signal`: Smoothed momentum * `macd_hist`: Acceleration / deceleration of momentum ### Bollinger Bands (20, 2) * Measures volatility expansion and contraction * Bands are calculated using 20-period SMA ± 2 standard deviations #### Bollinger Squeeze (`bb_squeeze`) * `1` when Bollinger Band width is at a **local minimum** * Indicates **volatility compression** * Often precedes large directional moves * Designed for breakout-style strategies ### Volume MA (20) * Used for: * Breakout confirmation * Divergence detection * Trend strength validation --- ## 5. Time Handling & Candle Validity * All timestamps represent the **start** of the candle window * Example: * A `1m` candle at `12:00:00` covers `12:00:00 → 12:00:59` * Only **closed candles** are analyzed * No partial or live candle values exist in this database This makes `analysis.db` safe for: * Backtesting * Deterministic replay * Strategy evaluation without repainting risk --- ## 6. Accessing the Data ### Recommended Connection Settings (Python) ```python import sqlite3 import pandas as pd def get_connection(db_path): conn = sqlite3.connect(db_path, timeout=10) conn.execute("PRAGMA journal_mode=WAL;") return conn ``` --- ### Common Query Patterns #### Latest analysis for a timeframe ```sql SELECT * FROM analysis WHERE timeframe = '5m' ORDER BY timestamp DESC LIMIT 1; ``` --- #### Last 200 RSI values (1-minute) ```sql SELECT timestamp, rsi_14 FROM analysis WHERE timeframe = '1m' ORDER BY timestamp DESC LIMIT 200; ``` --- #### Detect active Bollinger squeezes ```sql SELECT timeframe, timestamp FROM analysis WHERE bb_squeeze = 1 ORDER BY timestamp DESC; ``` --- #### Join with raw candles for strategy logic ```sql SELECT c.timestamp, c.close, a.ema_21, a.rsi_14, a.macd_hist FROM candles c JOIN analysis a ON c.timeframe = a.timeframe AND c.timestamp = a.timestamp WHERE c.timeframe = '15m' ORDER BY c.timestamp DESC LIMIT 100; ``` --- ## 7. Retention, Cleanup & Vacuuming * The analysis engine enforces a **rolling ~1-month window** * Rows older than the cutoff are deleted * `PRAGMA incremental_vacuum` is used to reclaim disk space gradually This avoids: * Long blocking `VACUUM` operations * Database bloat over long-running operation --- ## 8. Intended Usage Patterns `analysis.db` is designed to be: * **Read-heavy** * **Safe for multiple consumers** * **Stable and deterministic** Typical consumers include: * Signal engines (rule-based or ML-driven) * Strategy backtesters * Alert pipelines * Visualization / dashboards * Risk and regime detection modules --- ## 9. Design Philosophy * **Separation of concerns**: raw data vs. derived signals * **Determinism first**: no repainting, no live-candle noise * **SQLite-friendly**: WAL mode, append-only, incremental vacuum * **Composable**: easy joins, easy extensions, predictable schema