# Data Source Documentation: `candles.db`
## 1. Database Overview
The database is an **Aggregated Trade Store**. Instead of storing millions of individual trades (which are kept in the raw `.jsonl` files), this database stores **OHLCV** (Open, High, Low, Close, Volume) data across multiple timeframes.
* **Database Engine:** SQLite 3
* **Concurrency Mode:** WAL (Write-Ahead Logging) enabled.
* **Update Frequency:** Real-time (updated as trades arrive).
---
## 2. Schema Definition
The database contains a single primary table: `candles`.
### Table: `candles`
| Column | Type | Description |
| :--- | :--- | :--- |
| `timeframe` | `TEXT` | The aggregation window: `1m`, `5m`, `15m`, or `1h`. |
| `timestamp` | `INTEGER` | Unix Timestamp (seconds) representing the **start** of the candle. |
| `open` | `REAL` | Price of the first trade in this window. |
| `high` | `REAL` | Highest price reached during this window. |
| `low` | `REAL` | Lowest price reached during this window. |
| `close` | `REAL` | Price of the last trade received for this window. |
| `volume` | `REAL` | Total base currency (BTC) volume traded. |
| `buy_volume` | `REAL` | Total volume from trades marked as "Buy" (Taker Buy). |
**Primary Key:** `(timeframe, timestamp)`
*This ensures no duplicate candles exist for the same timeframe and time slot.*
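The exact DDL lives in the `onramp` service; the following is a hypothetical schema sketch consistent with the table above (constraint details beyond the documented primary key are assumptions), shown here to make the duplicate-rejection behavior concrete:

```python
import sqlite3

# Hypothetical DDL matching the documented columns; the real onramp
# service may differ in NOT NULL and other constraint details.
DDL = """
CREATE TABLE IF NOT EXISTS candles (
    timeframe  TEXT    NOT NULL,
    timestamp  INTEGER NOT NULL,
    open       REAL,
    high       REAL,
    low        REAL,
    close      REAL,
    volume     REAL,
    buy_volume REAL,
    PRIMARY KEY (timeframe, timestamp)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute(DDL)
conn.execute("INSERT INTO candles VALUES ('1m', 1700000000, 1, 2, 0.5, 1.5, 10, 6)")

# The composite primary key rejects a second row for the same
# (timeframe, timestamp) slot:
try:
    conn.execute("INSERT INTO candles VALUES ('1m', 1700000000, 1, 1, 1, 1, 1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```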
---
## 3. Key Data Logic
### Buy/Sell Pressure
Unlike standard exchange OHLCV, this database includes `buy_volume`.
* **Sell Volume** = `volume - buy_volume`.
* **Net Flow** = `buy_volume - (volume - buy_volume)`.
* **Buy Ratio** = `buy_volume / volume`.
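These derived metrics can be computed directly from any candle row; a minimal sketch (the helper name is illustrative, not part of the codebase):

```python
def pressure_metrics(volume: float, buy_volume: float) -> dict:
    """Derive sell volume, net flow, and buy ratio from one candle row."""
    sell_volume = volume - buy_volume
    return {
        "sell_volume": sell_volume,
        "net_flow": buy_volume - sell_volume,
        # Guard against an empty candle (volume == 0)
        "buy_ratio": buy_volume / volume if volume else None,
    }

m = pressure_metrics(volume=10.0, buy_volume=6.0)
```

A buy ratio above 0.5 means takers were net buyers in that window.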
### Candle Completion
Because the `onramp` service tails a live file, the **latest** candle for any timeframe is "unstable." It will continue to update until the next time window begins. Your analysis engine should account for this by either:
1. Filtering for `timestamp < current_window_start` (to get only closed candles).
2. Treating the latest row as "Live" data.
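One way to implement option 1 is to compute the start of the currently-forming window from the timeframe; a sketch (the timeframe-to-seconds mapping mirrors the values in section 2):

```python
import time
from typing import Optional

# Window lengths in seconds for the timeframes documented in section 2
TIMEFRAME_SECONDS = {"1m": 60, "5m": 300, "15m": 900, "1h": 3600}

def current_window_start(timeframe: str, now: Optional[float] = None) -> int:
    """Unix timestamp (seconds) at which the currently-forming candle began."""
    secs = TIMEFRAME_SECONDS[timeframe]
    if now is None:
        now = time.time()
    return int(now // secs) * secs

def is_closed(candle_ts: int, timeframe: str, now: Optional[float] = None) -> bool:
    """True once the candle's window has ended and it will no longer update."""
    return candle_ts < current_window_start(timeframe, now)
```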
---
## 4. Accessing the Data
### Recommended Connection Settings (Python/Analysis Engine)
Since the `onramp` service is constantly writing to the database, you **must** use specific flags to avoid "Database is locked" errors.
```python
import sqlite3

def get_connection(db_path):
    """Open a read connection that tolerates a concurrent writer."""
    # Wait up to 10 seconds for the writer to release its lock
    conn = sqlite3.connect(db_path, timeout=10)
    # WAL mode lets readers proceed while onramp is writing
    conn.execute("PRAGMA journal_mode=WAL;")
    return conn
```
### Common Query Patterns
**Get the last 100 closed 1-minute candles:**
```sql
SELECT * FROM candles
WHERE timeframe = '1m'
  AND timestamp < (strftime('%s', 'now') / 60) * 60  -- exclude the live candle
ORDER BY timestamp DESC
LIMIT 100;
```
**Calculate 5-minute volatility (High-Low) over the last hour:**
```sql
SELECT timestamp, (high - low) as volatility
FROM candles
WHERE timeframe = '5m'
AND timestamp > (strftime('%s', 'now') - 3600)
ORDER BY timestamp ASC;
```
---
## 5. Integration with Analysis Engine (Pandas Example)
If you are building an analysis engine in Python, this is a straightforward and efficient way to load data for processing:
```python
import pandas as pd
import sqlite3

DB_PATH = "path/to/your/candles.db"

def load_candles(timeframe="1m", limit=1000):
    # Timeout per section 4: wait out the onramp writer instead of failing
    conn = sqlite3.connect(DB_PATH, timeout=10)
    query = """
        SELECT * FROM candles
        WHERE timeframe = ?
        ORDER BY timestamp DESC
        LIMIT ?
    """
    df = pd.read_sql_query(query, conn, params=(timeframe, limit))
    conn.close()
    # Convert timestamp to readable datetime
    df['datetime'] = pd.to_datetime(df['timestamp'], unit='s')
    # Sort back to chronological order for analysis
    return df.sort_values('timestamp').reset_index(drop=True)
# Usage
df = load_candles("1m")
print(df.tail())
```
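The High–Low volatility measure from the SQL example in section 4 can also be computed on the loaded DataFrame; a sketch (the function and window size are illustrative, and column names follow the schema in section 2):

```python
import pandas as pd

def add_volatility(df: pd.DataFrame, window: int = 12) -> pd.DataFrame:
    """Add per-candle range and its rolling mean (e.g. 12 x 5m = 1h)."""
    out = df.copy()
    out["volatility"] = out["high"] - out["low"]
    out["volatility_ma"] = out["volatility"].rolling(window, min_periods=1).mean()
    return out

# Example with toy data shaped like the candles table:
toy = pd.DataFrame({
    "high": [10.0, 11.0, 12.0],
    "low":  [9.0, 10.5, 10.0],
})
result = add_volatility(toy, window=2)
```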
---
## 6. Maintenance & Performance Notes
1. **Index Optimization:** The Primary Key already creates an index on `(timeframe, timestamp)`. This makes queries filtered by timeframe and sorted by time extremely fast.
2. **Storage:** SQLite handles millions of rows easily. However, if the database exceeds several gigabytes, you may want to run `VACUUM;` occasionally (though this requires stopping the `onramp` service briefly).
3. **Backups:** You can safely copy the `candles.db` file while the system is running, provided you also copy the `candles.db-wal` and `candles.db-shm` files (or use the SQLite `.backup` command).
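Python's `sqlite3` module exposes SQLite's online backup API directly, which avoids having to copy the `-wal` and `-shm` sidecar files by hand; a minimal sketch:

```python
import sqlite3

def backup_db(src_path: str, dest_path: str) -> None:
    """Snapshot a live SQLite database using the online backup API.

    Produces a consistent copy even while another process (onramp)
    is writing, with no need to copy the -wal/-shm files separately.
    """
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)
    finally:
        src.close()
        dest.close()
```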