# Data Source Documentation: `candles.db`

## 1. Database Overview

The database is an **Aggregated Trade Store**. Instead of storing millions of individual trades (which are kept in the raw `.jsonl` files), this database stores **OHLCV** (Open, High, Low, Close, Volume) data across multiple timeframes.

* **Database Engine:** SQLite 3
* **Concurrency Mode:** WAL (Write-Ahead Logging) enabled.
* **Update Frequency:** Real-time (updated as trades arrive).

---

## 2. Schema Definition

The database contains a single primary table: `candles`.

### Table: `candles`

| Column | Type | Description |
| :--- | :--- | :--- |
| `timeframe` | `TEXT` | The aggregation window: `1m`, `5m`, `15m`, or `1h`. |
| `timestamp` | `INTEGER` | Unix timestamp (seconds) representing the **start** of the candle. |
| `open` | `REAL` | Price of the first trade in this window. |
| `high` | `REAL` | Highest price reached during this window. |
| `low` | `REAL` | Lowest price reached during this window. |
| `close` | `REAL` | Price of the last trade received for this window. |
| `volume` | `REAL` | Total base currency (BTC) volume traded. |
| `buy_volume` | `REAL` | Total volume from trades marked as "Buy" (Taker Buy). |

**Primary Key:** `(timeframe, timestamp)`

*This ensures no duplicate candles exist for the same timeframe and time slot.*
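
For reference, the table can be recreated with DDL along these lines. This is a sketch inferred from the schema above, not the exact statement used by the `onramp` service, which may differ in constraints or column order:

```sql
-- Hypothetical DDL reconstructed from the schema table above.
CREATE TABLE IF NOT EXISTS candles (
    timeframe  TEXT    NOT NULL,  -- '1m', '5m', '15m', or '1h'
    timestamp  INTEGER NOT NULL,  -- Unix seconds, start of the window
    open       REAL,
    high       REAL,
    low        REAL,
    close      REAL,
    volume     REAL,              -- total base currency (BTC) volume
    buy_volume REAL,              -- taker-buy volume only
    PRIMARY KEY (timeframe, timestamp)
);
```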

---

## 3. Key Data Logic

### Buy/Sell Pressure

Unlike standard exchange OHLCV, this database includes `buy_volume`, from which the following per-candle metrics can be derived (see the query sketch after this list):

* **Sell Volume** = `volume - buy_volume`.
* **Net Flow** = `buy_volume - (volume - buy_volume)`.
* **Buy Ratio** = `buy_volume / volume`.
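
As an illustration, these metrics can be computed directly in SQL (a sketch using the column names from the schema above, shown here for 1-minute candles):

```sql
-- Per-candle buy/sell pressure for the most recent 100 one-minute candles.
SELECT
    timestamp,
    volume - buy_volume                 AS sell_volume,
    buy_volume - (volume - buy_volume)  AS net_flow,
    CASE WHEN volume > 0 THEN buy_volume / volume ELSE NULL END AS buy_ratio
FROM candles
WHERE timeframe = '1m'
ORDER BY timestamp DESC
LIMIT 100;
```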

### Candle Completion

Because the `onramp` service tails a live file, the **latest** candle for any timeframe is "unstable": it will continue to update until the next time window begins. Your analysis engine should account for this by either:

1. Filtering for `timestamp < current_window_start` to get only closed candles (see the sketch after this list).
2. Treating the latest row as "Live" data.
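
A sketch of option 1: compute the start of the still-open window in the analysis engine and pass it as a query parameter. The helper name and mapping below are illustrative; the window lengths follow the timeframes listed in the schema:

```python
import time

# Window length in seconds for each timeframe in the schema.
TIMEFRAME_SECONDS = {"1m": 60, "5m": 300, "15m": 900, "1h": 3600}

def current_window_start(timeframe: str) -> int:
    """Unix timestamp (seconds) at which the still-open candle began."""
    step = TIMEFRAME_SECONDS[timeframe]
    return (int(time.time()) // step) * step

# Example usage with a parameterized query:
#   SELECT * FROM candles WHERE timeframe = '5m' AND timestamp < ?
#   params = (current_window_start('5m'),)
```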

---

## 4. Accessing the Data

### Recommended Connection Settings (Python/Analysis Engine)

Since the `onramp` service is constantly writing to the database, you **must** connect with a busy timeout (and keep WAL mode enabled) to avoid "Database is locked" errors.

```python
import sqlite3

def get_connection(db_path):
    # Connect with a timeout to wait for the writer to finish
    conn = sqlite3.connect(db_path, timeout=10)
    # Enable WAL mode for high-concurrency reading
    conn.execute("PRAGMA journal_mode=WAL;")
    return conn
```

### Common Query Patterns

**Get the last 100 closed 1-minute candles:**

```sql
SELECT * FROM candles
WHERE timeframe = '1m'
  -- Exclude the still-open (live) candle; see Section 3
  AND timestamp < (CAST(strftime('%s', 'now') AS INTEGER) / 60) * 60
ORDER BY timestamp DESC
LIMIT 100;
```

**Calculate 5-minute volatility (high - low) over the last hour:**

```sql
SELECT timestamp, (high - low) AS volatility
FROM candles
WHERE timeframe = '5m'
  AND timestamp > (strftime('%s', 'now') - 3600)
ORDER BY timestamp ASC;
```

---

## 5. Integration with Analysis Engine (Pandas Example)

If you are building an analysis engine in Python, this is a simple and efficient way to load candles into a DataFrame for processing:

```python
import sqlite3

import pandas as pd

DB_PATH = "path/to/your/candles.db"

def load_candles(timeframe="1m", limit=1000):
    # 10-second busy timeout, as recommended in Section 4
    conn = sqlite3.connect(DB_PATH, timeout=10)
    query = """
        SELECT * FROM candles
        WHERE timeframe = ?
        ORDER BY timestamp DESC
        LIMIT ?
    """
    df = pd.read_sql_query(query, conn, params=(timeframe, limit))
    conn.close()

    # Convert timestamp to readable datetime
    df['datetime'] = pd.to_datetime(df['timestamp'], unit='s')

    # Sort back to chronological order for analysis
    return df.sort_values('timestamp').reset_index(drop=True)

# Usage
df = load_candles("1m")
print(df.tail())
```
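
Once loaded, the Section 3 metrics can be added as extra DataFrame columns. A possible follow-up to the example above (the 20-candle rolling window is an arbitrary choice for illustration, not part of the data source):

```python
# Assumes `df` was produced by load_candles() above.
df['sell_volume'] = df['volume'] - df['buy_volume']
df['net_flow'] = df['buy_volume'] - df['sell_volume']
df['buy_ratio'] = (df['buy_volume'] / df['volume']).where(df['volume'] > 0)

# Rolling high-low range as a simple volatility proxy (20-candle window).
df['range'] = df['high'] - df['low']
df['rolling_range'] = df['range'].rolling(window=20).mean()

print(df[['datetime', 'close', 'buy_ratio', 'rolling_range']].tail())
```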

---

## 6. Maintenance & Performance Notes

1. **Index Optimization:** The Primary Key already creates an index on `(timeframe, timestamp)`. This makes queries filtered by timeframe and sorted by time extremely fast.
2. **Storage:** SQLite handles millions of rows easily. However, if the database exceeds several gigabytes, you may want to run `VACUUM;` occasionally (though this requires stopping the `onramp` service briefly).
3. **Backups:** You can safely copy the `candles.db` file while the system is running, provided you also copy the `candles.db-wal` and `candles.db-shm` files (or use the SQLite `.backup` command; see the sketch below).
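
For point 3, a minimal sketch of an online backup using Python's standard `sqlite3.Connection.backup()` API, the same mechanism as the CLI `.backup` command (the function name and destination path below are examples):

```python
import sqlite3

def backup_candles(src_path="candles.db", dest_path="candles_backup.db"):
    # The backup API takes a consistent snapshot even while onramp is writing.
    src = sqlite3.connect(src_path, timeout=10)
    dest = sqlite3.connect(dest_path)
    src.backup(dest)
    dest.close()
    src.close()

backup_candles()
```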