Data Source Documentation: candles.db
1. Database Overview
The database is an Aggregated Trade Store. Instead of storing millions of individual trades (which are kept in the raw .jsonl files), this database stores OHLCV (Open, High, Low, Close, Volume) data across multiple timeframes.
- Database Engine: SQLite 3
- Concurrency Mode: WAL (Write-Ahead Logging) enabled.
- Update Frequency: Real-time (updated as trades arrive).
2. Schema Definition
The database contains a single primary table: candles.
Table: candles
| Column | Type | Description |
|---|---|---|
| timeframe | TEXT | The aggregation window: 1m, 5m, 15m, or 1h. |
| timestamp | INTEGER | Unix timestamp (seconds) marking the start of the candle. |
| open | REAL | Price of the first trade in this window. |
| high | REAL | Highest price reached during this window. |
| low | REAL | Lowest price reached during this window. |
| close | REAL | Price of the last trade received for this window. |
| volume | REAL | Total base currency (BTC) volume traded. |
| buy_volume | REAL | Total volume from trades marked as "Buy" (taker buy). |
Primary Key: (timeframe, timestamp)
This ensures no duplicate candles exist for the same timeframe and time slot.
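The ingest side is not part of this document, but to illustrate how the primary key enforces this guarantee, a writer can fold new trades into an existing candle with an upsert. This is a minimal in-memory sketch, not the onramp service's actual code:

```python
import sqlite3

# In-memory sketch: the (timeframe, timestamp) primary key rejects duplicates,
# so a writer can upsert new trades into the candle for the current window.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE candles (
        timeframe TEXT, timestamp INTEGER,
        open REAL, high REAL, low REAL, close REAL,
        volume REAL, buy_volume REAL,
        PRIMARY KEY (timeframe, timestamp)
    )
""")

# First trade opens the 1m candle for this window.
conn.execute("INSERT INTO candles VALUES (?,?,?,?,?,?,?,?)",
             ("1m", 1700000000, 100.0, 101.0, 99.5, 100.5, 2.0, 1.2))

# A later trade in the same window updates the candle instead of duplicating it.
conn.execute("""
    INSERT INTO candles VALUES (?,?,?,?,?,?,?,?)
    ON CONFLICT(timeframe, timestamp) DO UPDATE SET
        high = MAX(high, excluded.high),
        low = MIN(low, excluded.low),
        close = excluded.close,
        volume = volume + excluded.volume,
        buy_volume = buy_volume + excluded.buy_volume
""", ("1m", 1700000000, 102.0, 102.0, 102.0, 102.0, 0.5, 0.5))

count, vol = conn.execute("SELECT COUNT(*), SUM(volume) FROM candles").fetchone()
print(count, vol)  # 1 2.5
```

The `ON CONFLICT ... DO UPDATE` form requires SQLite 3.24 or later.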
3. Key Data Logic
Buy/Sell Pressure
Unlike standard exchange OHLCV, this database includes buy_volume.
- Sell Volume = `volume - buy_volume`
- Net Flow = `buy_volume - (volume - buy_volume)`
- Buy Ratio = `buy_volume / volume`
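With hypothetical candle values, the three derived metrics work out as follows:

```python
# Hypothetical candle: 12.5 BTC traded, of which 7.5 BTC was taker-buy.
volume, buy_volume = 12.5, 7.5

sell_volume = volume - buy_volume    # 5.0
net_flow = buy_volume - sell_volume  # 2.5 (positive => net buying pressure)
buy_ratio = buy_volume / volume      # 0.6

print(sell_volume, net_flow, buy_ratio)  # 5.0 2.5 0.6
```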
Candle Completion
Because the onramp service tails a live file, the latest candle for any timeframe is "unstable." It will continue to update until the next time window begins. Your analysis engine should account for this by either:
- Filtering for `timestamp < current_window_start` (to get only closed candles).
- Treating the latest row as "live" data.
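The window-start cutoff can be computed by flooring the current Unix time to the timeframe's length. A minimal helper (the name `current_window_start` is taken from the bullet above; the implementation is a sketch):

```python
import time

# Seconds per aggregation window, matching the timeframes in the schema.
TF_SECONDS = {"1m": 60, "5m": 300, "15m": 900, "1h": 3600}

def current_window_start(timeframe, now=None):
    """Start of the (still-unstable) current candle for a timeframe."""
    now = int(time.time()) if now is None else int(now)
    step = TF_SECONDS[timeframe]
    return now - (now % step)

# With a fixed clock for illustration, now = 1700000123:
print(current_window_start("1m", 1700000123))  # 1700000100
# Closed candles only:
#   SELECT * FROM candles WHERE timeframe = '1m' AND timestamp < ?
# (bind the value computed above)
```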
4. Accessing the Data
Recommended Connection Settings (Python/Analysis Engine)
Since the onramp service is constantly writing to the database, you must use specific flags to avoid "Database is locked" errors.
```python
import sqlite3

def get_connection(db_path):
    # Connect with a timeout so reads wait for the writer to finish
    conn = sqlite3.connect(db_path, timeout=10)
    # Enable WAL mode for high-concurrency reading
    conn.execute("PRAGMA journal_mode=WAL;")
    return conn
```
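If the analysis side never needs to write, you can additionally open the file read-only via SQLite's URI syntax, which guarantees your code cannot accidentally modify or lock the live database. This is standard `sqlite3` behavior, not something the onramp service requires:

```python
import sqlite3

def get_readonly_connection(db_path):
    # mode=ro opens the database read-only; any write attempt raises
    # sqlite3.OperationalError instead of contending with the writer.
    return sqlite3.connect(f"file:{db_path}?mode=ro", uri=True, timeout=10)
```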
Common Query Patterns
Get the latest 100 1-minute candles (newest first; note that the newest row may still be a live, unclosed candle):

```sql
SELECT * FROM candles
WHERE timeframe = '1m'
ORDER BY timestamp DESC
LIMIT 100;
```
Calculate 5-minute volatility (High-Low) over the last hour:

```sql
SELECT timestamp, (high - low) AS volatility
FROM candles
WHERE timeframe = '5m'
  AND timestamp > (strftime('%s', 'now') - 3600)
ORDER BY timestamp ASC;
```
5. Integration with Analysis Engine (Pandas Example)
If you are building an analysis engine in Python, this is the most efficient way to load data for processing:
```python
import pandas as pd
import sqlite3

DB_PATH = "path/to/your/candles.db"

def load_candles(timeframe="1m", limit=1000):
    conn = sqlite3.connect(DB_PATH)
    query = """
        SELECT * FROM candles
        WHERE timeframe = ?
        ORDER BY timestamp DESC
        LIMIT ?
    """
    df = pd.read_sql_query(query, conn, params=(timeframe, limit))
    conn.close()
    # Convert timestamp to readable datetime
    df['datetime'] = pd.to_datetime(df['timestamp'], unit='s')
    # Sort back to chronological order for analysis
    return df.sort_values('timestamp').reset_index(drop=True)

# Usage
df = load_candles("1m")
print(df.tail())
```
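If you ever need a window the store does not carry (say, 4h), you can derive it in pandas from a finer timeframe. A sketch using synthetic 1-minute data shaped like `load_candles()` output, resampled to 5 minutes for brevity:

```python
import pandas as pd

# Synthetic 1m candles standing in for load_candles() output.
df = pd.DataFrame({
    "timestamp":  [0, 60, 120, 180, 240],
    "open":       [10, 11, 12, 11, 13],
    "high":       [11, 12, 13, 12, 14],
    "low":        [9, 10, 11, 10, 12],
    "close":      [11, 12, 11, 12, 13],
    "volume":     [1.0, 2.0, 1.5, 0.5, 1.0],
    "buy_volume": [0.5, 1.0, 1.0, 0.25, 0.5],
})
df["datetime"] = pd.to_datetime(df["timestamp"], unit="s")

# OHLCV aggregation rules: first open, max high, min low, last close, summed volumes.
agg = df.set_index("datetime").resample("5min").agg({
    "open": "first", "high": "max", "low": "min",
    "close": "last", "volume": "sum", "buy_volume": "sum",
})
print(agg)
```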
6. Maintenance & Performance Notes
- Index Optimization: The primary key already creates an index on `(timeframe, timestamp)`. This makes queries filtered by timeframe and sorted by time extremely fast.
- Storage: SQLite handles millions of rows easily. However, if the database exceeds several gigabytes, you may want to run `VACUUM;` occasionally (though this requires briefly stopping the `onramp` service).
- Backups: You can safely copy the `candles.db` file while the system is running, provided you also copy the `candles.db-wal` and `candles.db-shm` files (or use the SQLite `.backup` command).
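SQLite's online backup API, which backs the `.backup` command, is also exposed in Python, so a snapshot can be taken from the analysis side without touching the `-wal`/`-shm` files by hand. A sketch (the `snapshot` helper name is ours):

```python
import sqlite3

def snapshot(db_path, dest_path):
    # Connection.backup copies a consistent snapshot page by page,
    # even while the onramp service keeps writing to the source.
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(dest_path)
    with dst:
        src.backup(dst)
    src.close()
    dst.close()
```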