BytbitBTC/onramp/DATABASE.md

Data Source Documentation: candles.db

1. Database Overview

The database is an Aggregated Trade Store. Instead of storing millions of individual trades (which are kept in the raw .jsonl files), this database stores OHLCV (Open, High, Low, Close, Volume) data across multiple timeframes.

  • Database Engine: SQLite 3
  • Concurrency Mode: WAL (Write-Ahead Logging) enabled.
  • Update Frequency: Real-time (updated as trades arrive).
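These properties can be verified from any SQLite client. A minimal Python check (assuming a local `candles.db` path; a fresh file will report `delete` until the onramp writer has switched it to WAL):

```python
import sqlite3

# Inspect the journal mode of the database file.
conn = sqlite3.connect("candles.db", timeout=10)
mode = conn.execute("PRAGMA journal_mode;").fetchone()[0]
print(mode)  # 'wal' once the writer has enabled WAL mode
conn.close()
```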

2. Schema Definition

The database contains a single primary table: candles.

Table: candles

| Column | Type | Description |
| --- | --- | --- |
| `timeframe` | TEXT | The aggregation window: `1m`, `5m`, `15m`, or `1h`. |
| `timestamp` | INTEGER | Unix timestamp (seconds) of the start of the candle. |
| `open` | REAL | Price of the first trade in this window. |
| `high` | REAL | Highest price reached during this window. |
| `low` | REAL | Lowest price reached during this window. |
| `close` | REAL | Price of the last trade received for this window. |
| `volume` | REAL | Total base currency (BTC) volume traded. |
| `buy_volume` | REAL | Total volume from trades marked as "Buy" (taker buy). |

Primary Key: `(timeframe, timestamp)`. This ensures no duplicate candle can exist for the same timeframe and time slot.
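The exact DDL used by the onramp writer is not shown here, but a table matching the schema above could be created like this (a sketch, not the writer's actual statement):

```python
import sqlite3

DDL = """
CREATE TABLE IF NOT EXISTS candles (
    timeframe  TEXT    NOT NULL,  -- '1m', '5m', '15m', or '1h'
    timestamp  INTEGER NOT NULL,  -- window start, Unix seconds
    open       REAL, high REAL, low REAL, close REAL,
    volume     REAL,              -- total base (BTC) volume
    buy_volume REAL,              -- taker-buy volume
    PRIMARY KEY (timeframe, timestamp)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute(DDL)
conn.execute("INSERT INTO candles (timeframe, timestamp) VALUES ('1m', 0)")
# The composite primary key rejects a second candle for the same slot:
try:
    conn.execute("INSERT INTO candles (timeframe, timestamp) VALUES ('1m', 0)")
except sqlite3.IntegrityError:
    print("duplicate (timeframe, timestamp) rejected")
```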


3. Key Data Logic

Buy/Sell Pressure

Unlike standard exchange OHLCV, this database includes buy_volume.

  • Sell Volume = volume - buy_volume.
  • Net Flow = buy_volume - (volume - buy_volume).
  • Buy Ratio = buy_volume / volume.
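The derived metrics above can be computed per row, for example:

```python
def pressure(volume: float, buy_volume: float) -> dict:
    """Derive sell volume, net flow, and buy ratio from one candle row."""
    sell_volume = volume - buy_volume
    return {
        "sell_volume": sell_volume,
        "net_flow": buy_volume - sell_volume,  # > 0 means net buying
        "buy_ratio": buy_volume / volume if volume else None,
    }

print(pressure(volume=10.0, buy_volume=7.5))
# {'sell_volume': 2.5, 'net_flow': 5.0, 'buy_ratio': 0.75}
```

Guarding the division handles the (rare) candle with zero volume.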

Candle Completion

Because the onramp service tails a live file, the latest candle for any timeframe is "unstable." It will continue to update until the next time window begins. Your analysis engine should account for this by either:

  1. Filtering for timestamp < current_window_start (to get only closed candles).
  2. Treating the latest row as "Live" data.
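The `current_window_start` in option 1 is just the current time truncated to the timeframe length, e.g.:

```python
import time

WINDOW_SECONDS = {"1m": 60, "5m": 300, "15m": 900, "1h": 3600}

def current_window_start(timeframe, now=None):
    """Start of the still-open (unstable) candle for this timeframe."""
    now = int(time.time()) if now is None else now
    step = WINDOW_SECONDS[timeframe]
    return now - (now % step)

# A candle is closed iff its timestamp < current_window_start(timeframe).
print(current_window_start("5m", now=1_700_000_123))  # 1700000100
```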

4. Accessing the Data

Since the onramp service is constantly writing to the database, you must use specific flags to avoid "Database is locked" errors.

```python
import sqlite3

def get_connection(db_path):
    # Wait up to 10s for the writer to release its lock
    conn = sqlite3.connect(db_path, timeout=10)
    # WAL is a persistent database property set by the writer;
    # re-issuing the pragma here is a harmless safety net for readers
    conn.execute("PRAGMA journal_mode=WAL;")
    return conn
```
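If a process only ever reads, opening the file read-only via a URI (supported by Python's `sqlite3`) is an additional safeguard, since such a connection can never take a write lock:

```python
import sqlite3

# Create a placeholder database for this example (the real candles.db
# is produced by the onramp service).
sqlite3.connect("candles.db").close()

# mode=ro guarantees this reader can never write.
conn = sqlite3.connect("file:candles.db?mode=ro", uri=True, timeout=10)
try:
    conn.execute("CREATE TABLE scratch (x)")
    write_blocked = False
except sqlite3.OperationalError:
    write_blocked = True  # 'attempt to write a readonly database'
conn.close()
print("read-only enforced:", write_blocked)
```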

Common Query Patterns

Get the last 100 closed 1-minute candles (excluding the still-open current minute):

```sql
SELECT * FROM candles
WHERE timeframe = '1m'
  AND timestamp < (strftime('%s', 'now') / 60) * 60
ORDER BY timestamp DESC
LIMIT 100;
```

Calculate 5-minute volatility (High-Low) over the last hour:

```sql
SELECT timestamp, (high - low) AS volatility
FROM candles
WHERE timeframe = '5m'
  AND timestamp > (strftime('%s', 'now') - 3600)
ORDER BY timestamp ASC;
```

5. Integration with Analysis Engine (Pandas Example)

If you are building an analysis engine in Python, this is the most efficient way to load data for processing:

```python
import pandas as pd
import sqlite3

DB_PATH = "path/to/your/candles.db"

def load_candles(timeframe="1m", limit=1000):
    # Timeout guards against "database is locked" while the writer commits
    conn = sqlite3.connect(DB_PATH, timeout=10)
    query = """
        SELECT * FROM candles
        WHERE timeframe = ?
        ORDER BY timestamp DESC
        LIMIT ?
    """
    df = pd.read_sql_query(query, conn, params=(timeframe, limit))
    conn.close()

    # Convert timestamp to readable datetime
    df['datetime'] = pd.to_datetime(df['timestamp'], unit='s')

    # Sort back to chronological order for analysis
    return df.sort_values('timestamp').reset_index(drop=True)

# Usage
df = load_candles("1m")
print(df.tail())
```
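Once loaded, a finer timeframe can also be downsampled in pandas using the usual OHLCV aggregation rules. A sketch with synthetic rows (the exact values depend on your database):

```python
import pandas as pd

# Three synthetic 1-minute candles covering one 3-minute span.
df = pd.DataFrame({
    "timestamp":  [0, 60, 120],
    "open":       [100.0, 101.0, 102.0],
    "high":       [101.5, 103.0, 102.5],
    "low":        [99.5, 100.5, 101.0],
    "close":      [101.0, 102.0, 101.5],
    "volume":     [1.0, 2.0, 0.5],
    "buy_volume": [0.6, 1.1, 0.2],
})
df["datetime"] = pd.to_datetime(df["timestamp"], unit="s")

# first/max/min/last/sum mirror how the writer builds larger candles.
agg = {"open": "first", "high": "max", "low": "min",
       "close": "last", "volume": "sum", "buy_volume": "sum"}
out = df.set_index("datetime").resample("3min").agg(agg)
print(out)
```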

6. Maintenance & Performance Notes

  1. Index Optimization: The Primary Key already creates an index on (timeframe, timestamp). This makes queries filtered by timeframe and sorted by time extremely fast.
  2. Storage: SQLite handles millions of rows easily. However, if the database exceeds several gigabytes, you may want to run VACUUM; occasionally (though this requires stopping the onramp service briefly).
  3. Backups: You can safely copy the candles.db file while the system is running, provided you also copy the candles.db-wal and candles.db-shm files (or use the SQLite .backup command).
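For point 3, Python's `sqlite3` exposes SQLite's backup API directly; it takes a consistent snapshot of a live database, so you never have to copy the `-wal`/`-shm` files by hand:

```python
import sqlite3

# Snapshot a (possibly live) database into a backup file.
src = sqlite3.connect("candles.db")
dst = sqlite3.connect("candles_backup.db")
with dst:
    src.backup(dst)  # copies all pages, including WAL contents
src.close()
dst.close()
```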