Files
BytbitBTC/analysis/DATABASE.md

6.4 KiB
Raw Permalink Blame History

Data Output Documentation: analysis.db

1. Database Overview

The analysis.db database is a Derived Market Analysis Store.

It contains technical indicator outputs calculated from the raw OHLCV candle data stored in candles.db. This database is append-only, preserves full historical analysis, and is designed to act as a clean input source for downstream services such as:

  • Signal generators
  • Strategy evaluators
  • Paper/live trading engines
  • Dashboards and visualization layers
  • Alerting systems
  • Database Engine: SQLite 3
  • Concurrency Mode: WAL (Write-Ahead Logging) enabled
  • Update Frequency: Near real-time (continuous processing loop)
  • Retention Model: Rolling window (currently ~1 month via cleanup + incremental vacuum)

2. Relationship to candles.db

This database is fully derived from candles.db.

  • Each row in analysis.db corresponds 1:1 with a closed candle from candles.db
  • No raw trade or OHLCV data is duplicated beyond what is needed for indicator outputs
  • Only completed candles are analyzed (the currently-forming candle per timeframe is intentionally excluded)

This separation ensures:

  • Clean responsibility boundaries
  • Safe concurrent reads
  • Zero risk of contaminating raw market data

3. Schema Definition

The database contains a single primary table: analysis.

Table: analysis

Column Type Description
timeframe TEXT Candle timeframe: 1m, 5m, 15m, 1h
timestamp INTEGER Unix timestamp (seconds) start of the candle window
ema_9 REAL Exponential Moving Average (9)
ema_21 REAL Exponential Moving Average (21)
sma_50 REAL Simple Moving Average (50)
sma_200 REAL Simple Moving Average (200)
rsi_14 REAL Relative Strength Index (14)
macd REAL MACD line (EMA12 EMA26)
macd_signal REAL MACD signal line (9-period EMA of MACD)
macd_hist REAL MACD histogram (macd macd_signal)
bb_upper REAL Bollinger Band upper (20, 2σ)
bb_middle REAL Bollinger Band middle (20 SMA)
bb_lower REAL Bollinger Band lower (20, 2σ)
bb_squeeze INTEGER Volatility squeeze flag (1 = squeeze detected, 0 = normal)
volume_ma_20 REAL 20-period moving average of volume

Primary Key

(timeframe, timestamp)

This guarantees:

  • No duplicate indicator rows
  • Perfect alignment with candle boundaries
  • Deterministic joins with candles.db

4. Indicator Semantics

Moving Averages

  • EMA 9 / 21: Short-term momentum and trend confirmation
  • SMA 50 / 200: Medium- and long-term trend structure

RSI (14)

  • Range: 0100

  • Typical interpretations:

    • >70 → Overbought
    • <30 → Oversold
  • Calculated using classic Wilder-style average gains/losses

MACD (12, 26, 9)

  • macd: Momentum direction
  • macd_signal: Smoothed momentum
  • macd_hist: Acceleration / deceleration of momentum

Bollinger Bands (20, 2)

  • Measures volatility expansion and contraction
  • Bands are calculated using 20-period SMA ± 2 standard deviations

Bollinger Squeeze (bb_squeeze)

  • 1 when Bollinger Band width is at a local minimum
  • Indicates volatility compression
  • Often precedes large directional moves
  • Designed for breakout-style strategies

Volume MA (20)

  • Used for:

    • Breakout confirmation
    • Divergence detection
    • Trend strength validation

5. Time Handling & Candle Validity

  • All timestamps represent the start of the candle window

  • Example:

    • A 1m candle at 12:00:00 covers 12:00:00 → 12:00:59
  • Only closed candles are analyzed

  • No partial or live candle values exist in this database

This makes analysis.db safe for:

  • Backtesting
  • Deterministic replay
  • Strategy evaluation without repainting risk

6. Accessing the Data

import sqlite3
import pandas as pd

def get_connection(db_path):
    conn = sqlite3.connect(db_path, timeout=10)
    conn.execute("PRAGMA journal_mode=WAL;")
    return conn

Common Query Patterns

Latest analysis for a timeframe

SELECT *
FROM analysis
WHERE timeframe = '5m'
ORDER BY timestamp DESC
LIMIT 1;

Last 200 RSI values (1-minute)

SELECT timestamp, rsi_14
FROM analysis
WHERE timeframe = '1m'
ORDER BY timestamp DESC
LIMIT 200;

Detect active Bollinger squeezes

SELECT timeframe, timestamp
FROM analysis
WHERE bb_squeeze = 1
ORDER BY timestamp DESC;

Join with raw candles for strategy logic

SELECT
    c.timestamp,
    c.close,
    a.ema_21,
    a.rsi_14,
    a.macd_hist
FROM candles c
JOIN analysis a
  ON c.timeframe = a.timeframe
 AND c.timestamp = a.timestamp
WHERE c.timeframe = '15m'
ORDER BY c.timestamp DESC
LIMIT 100;

7. Retention, Cleanup & Vacuuming

  • The analysis engine enforces a rolling ~1-month window
  • Rows older than the cutoff are deleted
  • PRAGMA incremental_vacuum is used to reclaim disk space gradually

This avoids:

  • Long blocking VACUUM operations
  • Database bloat over long-running operation

8. Intended Usage Patterns

analysis.db is designed to be:

  • Read-heavy
  • Safe for multiple consumers
  • Stable and deterministic

Typical consumers include:

  • Signal engines (rule-based or ML-driven)
  • Strategy backtesters
  • Alert pipelines
  • Visualization / dashboards
  • Risk and regime detection modules

9. Design Philosophy

  • Separation of concerns: raw data vs. derived signals
  • Determinism first: no repainting, no live-candle noise
  • SQLite-friendly: WAL mode, append-only, incremental vacuum
  • Composable: easy joins, easy extensions, predictable schema