Rewrote the onramp in Go with a much better architecture. Created onramp/DATABASE.md to document the data store for the next development cycle.
# Data Source Documentation: `candles.db`

## 1. Database Overview

The database is an **Aggregated Trade Store**. Instead of storing millions of individual trades (which are kept in the raw `.jsonl` files), this database stores **OHLCV** (Open, High, Low, Close, Volume) data across multiple timeframes.

* **Database Engine:** SQLite 3
* **Concurrency Mode:** WAL (Write-Ahead Logging) enabled (see the check after this list).
* **Update Frequency:** Real-time (updated as trades arrive).
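
To confirm these settings from the analysis side, here is a quick check (a sketch; the `candles.db` path is an example):

```python
import sqlite3

conn = sqlite3.connect("candles.db")
# Prints 'wal' once the onramp writer has enabled write-ahead logging
print(conn.execute("PRAGMA journal_mode;").fetchone()[0])
conn.close()
```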
---

## 2. Schema Definition

The database contains a single primary table: `candles`.

### Table: `candles`

| Column | Type | Description |
| :--- | :--- | :--- |
| `timeframe` | `TEXT` | The aggregation window: `1m`, `5m`, `15m`, or `1h`. |
| `timestamp` | `INTEGER` | Unix timestamp (seconds) representing the **start** of the candle. |
| `open` | `REAL` | Price of the first trade in this window. |
| `high` | `REAL` | Highest price reached during this window. |
| `low` | `REAL` | Lowest price reached during this window. |
| `close` | `REAL` | Price of the last trade received for this window. |
| `volume` | `REAL` | Total base currency (BTC) volume traded. |
| `buy_volume` | `REAL` | Total volume from trades marked as "Buy" (Taker Buy). |

**Primary Key:** `(timeframe, timestamp)`

*This ensures no duplicate candles exist for the same timeframe and time slot.*
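
For reference, here is a minimal sketch of DDL matching the documented schema, created via Python's standard `sqlite3` module. The exact `CREATE TABLE` statement the `onramp` service runs is an assumption here, not taken from the service's source.

```python
import sqlite3

# Assumed DDL reconstructed from the schema table above; the onramp
# service's actual statement may differ in details.
SCHEMA = """
CREATE TABLE IF NOT EXISTS candles (
    timeframe  TEXT    NOT NULL,
    timestamp  INTEGER NOT NULL,
    open       REAL,
    high       REAL,
    low        REAL,
    close      REAL,
    volume     REAL,
    buy_volume REAL,
    PRIMARY KEY (timeframe, timestamp)
)
"""

conn = sqlite3.connect("candles.db")
conn.execute(SCHEMA)
conn.close()
```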
---

## 3. Key Data Logic

### Buy/Sell Pressure

Unlike standard exchange OHLCV, this database includes `buy_volume`, which lets you derive buy/sell pressure directly (see the pandas sketch after this list):

* **Sell Volume** = `volume - buy_volume`.
* **Net Flow** = `buy_volume - (volume - buy_volume)`.
* **Buy Ratio** = `buy_volume / volume`.
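
A minimal pandas sketch of these derivations, assuming a DataFrame with the documented columns (for example, one returned by `load_candles` in Section 5); the zero-volume guard is an assumption about how you want to handle empty windows:

```python
import pandas as pd


def add_flow_metrics(df: pd.DataFrame) -> pd.DataFrame:
    # Sell volume: the part of total volume that was not taker-buy
    df['sell_volume'] = df['volume'] - df['buy_volume']
    # Net flow: positive when buyers dominate, negative when sellers do
    df['net_flow'] = df['buy_volume'] - df['sell_volume']
    # Buy ratio in [0, 1]; NaN for candles with zero volume
    df['buy_ratio'] = (df['buy_volume'] / df['volume']).where(df['volume'] > 0)
    return df
```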

### Candle Completion

Because the `onramp` service tails a live file, the **latest** candle for any timeframe is "unstable": it will continue to update until the next time window begins. Your analysis engine should account for this by either (see the sketch after this list):

1. Filtering for `timestamp < current_window_start` (to get only closed candles).
2. Treating the latest row as "Live" data.
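
A minimal sketch of option 1, assuming the timeframe-to-seconds mapping below (the service's own constants are not documented here):

```python
import sqlite3
import time

# Assumed window lengths for the documented timeframes
TIMEFRAME_SECONDS = {'1m': 60, '5m': 300, '15m': 900, '1h': 3600}


def load_closed_candles(conn: sqlite3.Connection, timeframe: str, limit: int = 100):
    # Start of the current, still-updating window
    window = TIMEFRAME_SECONDS[timeframe]
    current_window_start = int(time.time()) // window * window
    # Candles that started before the current window are closed
    return conn.execute(
        "SELECT * FROM candles WHERE timeframe = ? AND timestamp < ? "
        "ORDER BY timestamp DESC LIMIT ?",
        (timeframe, current_window_start, limit),
    ).fetchall()
```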
---

## 4. Accessing the Data

### Recommended Connection Settings (Python/Analysis Engine)

Since the `onramp` service is constantly writing to the database, you **must** use specific connection settings to avoid "database is locked" errors.

```python
import sqlite3


def get_connection(db_path):
    # Connect with a timeout to wait for the writer to finish
    conn = sqlite3.connect(db_path, timeout=10)
    # Enable WAL mode for high-concurrency reading
    conn.execute("PRAGMA journal_mode=WAL;")
    return conn
```
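
If the analysis engine only ever reads, a read-only connection is an even safer variant. This is a sketch using SQLite's standard URI syntax (`mode=ro` is SQLite behavior, not anything specific to `onramp`):

```python
import sqlite3


def get_readonly_connection(db_path):
    # mode=ro guarantees this connection can never block or alter the writer
    return sqlite3.connect(f"file:{db_path}?mode=ro", uri=True, timeout=10)
```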

### Common Query Patterns

**Get the last 100 closed 1-minute candles:**

```sql
SELECT * FROM candles
WHERE timeframe = '1m'
  -- exclude the current, still-updating window (see Section 3)
  AND timestamp < (CAST(strftime('%s', 'now') AS INTEGER) / 60) * 60
ORDER BY timestamp DESC
LIMIT 100;
```

**Calculate 5-minute volatility (High-Low) over the last hour:**

```sql
SELECT timestamp, (high - low) AS volatility
FROM candles
WHERE timeframe = '5m'
  AND timestamp > (strftime('%s', 'now') - 3600)
ORDER BY timestamp ASC;
```
---

## 5. Integration with Analysis Engine (Pandas Example)

If you are building an analysis engine in Python, this is an efficient way to load data for processing:

```python
import pandas as pd
import sqlite3

DB_PATH = "path/to/your/candles.db"


def load_candles(timeframe="1m", limit=1000):
    # Use a timeout, as recommended in Section 4, since the writer is live
    conn = sqlite3.connect(DB_PATH, timeout=10)
    query = """
        SELECT * FROM candles
        WHERE timeframe = ?
        ORDER BY timestamp DESC
        LIMIT ?
    """
    df = pd.read_sql_query(query, conn, params=(timeframe, limit))
    conn.close()

    # Convert timestamp to readable datetime
    df['datetime'] = pd.to_datetime(df['timestamp'], unit='s')

    # Sort back to chronological order for analysis
    return df.sort_values('timestamp').reset_index(drop=True)


# Usage
df = load_candles("1m")
print(df.tail())
```
---

## 6. Maintenance & Performance Notes

1. **Index Optimization:** The primary key already creates an index on `(timeframe, timestamp)`. This makes queries filtered by timeframe and sorted by time extremely fast.
2. **Storage:** SQLite handles millions of rows easily. However, if the database exceeds several gigabytes, you may want to run `VACUUM;` occasionally (though this requires briefly stopping the `onramp` service).
3. **Backups:** You can safely copy the `candles.db` file while the system is running, provided you also copy the `candles.db-wal` and `candles.db-shm` files (or use the SQLite `.backup` command; see the sketch after this list).
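
A minimal online-backup sketch using Python's standard `sqlite3.Connection.backup` API, which is equivalent to the CLI `.backup` command (the destination path here is just an example):

```python
import sqlite3


def backup_candles(src_path="candles.db", dst_path="candles_backup.db"):
    # Open the live database read-only so the backup cannot disturb the writer
    src = sqlite3.connect(f"file:{src_path}?mode=ro", uri=True)
    dst = sqlite3.connect(dst_path)
    with dst:
        # Page-by-page copy; WAL lets the onramp writer keep committing meanwhile
        src.backup(dst)
    src.close()
    dst.close()
```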