# Data Source Documentation: `candles.db`

## 1. Database Overview

The database is an **Aggregated Trade Store**. Instead of storing millions of individual trades (which are kept in the raw `.jsonl` files), this database stores **OHLCV** (Open, High, Low, Close, Volume) data across multiple timeframes.

* **Database Engine:** SQLite 3
* **Concurrency Mode:** WAL (Write-Ahead Logging) enabled.
* **Update Frequency:** Real-time (updated as trades arrive).

---

## 2. Schema Definition

The database contains a single primary table: `candles`.

### Table: `candles`

| Column | Type | Description |
| :--- | :--- | :--- |
| `timeframe` | `TEXT` | The aggregation window: `1m`, `5m`, `15m`, or `1h`. |
| `timestamp` | `INTEGER` | Unix timestamp (seconds) representing the **start** of the candle. |
| `open` | `REAL` | Price of the first trade in this window. |
| `high` | `REAL` | Highest price reached during this window. |
| `low` | `REAL` | Lowest price reached during this window. |
| `close` | `REAL` | Price of the last trade received for this window. |
| `volume` | `REAL` | Total base currency (BTC) volume traded. |
| `buy_volume` | `REAL` | Total volume from trades marked as "Buy" (Taker Buy). |

**Primary Key:** `(timeframe, timestamp)`

*This ensures no duplicate candles exist for the same timeframe and time slot.*
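
For reference, the table can be recreated with DDL along these lines. This is a sketch inferred from the schema above, not the exact statement used by the `onramp` service, which may differ in constraints or column order:

```sql
-- Hypothetical DDL reconstructed from the schema table above.
CREATE TABLE IF NOT EXISTS candles (
    timeframe  TEXT    NOT NULL,  -- '1m', '5m', '15m', or '1h'
    timestamp  INTEGER NOT NULL,  -- Unix seconds, start of the window
    open       REAL,
    high       REAL,
    low        REAL,
    close      REAL,
    volume     REAL,              -- total base currency (BTC) volume
    buy_volume REAL,              -- taker-buy volume only
    PRIMARY KEY (timeframe, timestamp)
);
```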

---

## 3. Key Data Logic

### Buy/Sell Pressure

Unlike standard exchange OHLCV, this database includes `buy_volume`, from which the following per-candle metrics can be derived (see the query sketch after this list):

* **Sell Volume** = `volume - buy_volume`.
* **Net Flow** = `buy_volume - (volume - buy_volume)`.
* **Buy Ratio** = `buy_volume / volume`.
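
As an illustration, these metrics can be computed directly in SQL (a sketch using the column names from the schema above, shown here for 1-minute candles):

```sql
-- Per-candle buy/sell pressure for the most recent 100 one-minute candles.
SELECT
    timestamp,
    volume - buy_volume                 AS sell_volume,
    buy_volume - (volume - buy_volume)  AS net_flow,
    CASE WHEN volume > 0 THEN buy_volume / volume ELSE NULL END AS buy_ratio
FROM candles
WHERE timeframe = '1m'
ORDER BY timestamp DESC
LIMIT 100;
```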

### Candle Completion

Because the `onramp` service tails a live file, the **latest** candle for any timeframe is "unstable": it will continue to update until the next time window begins. Your analysis engine should account for this by either:

1. Filtering for `timestamp < current_window_start` to get only closed candles (see the sketch after this list).
2. Treating the latest row as "Live" data.
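
A sketch of option 1: compute the start of the still-open window in the analysis engine and pass it as a query parameter. The helper name and mapping below are illustrative; the window lengths follow the timeframes listed in the schema:

```python
import time

# Window length in seconds for each timeframe in the schema.
TIMEFRAME_SECONDS = {"1m": 60, "5m": 300, "15m": 900, "1h": 3600}

def current_window_start(timeframe: str) -> int:
    """Unix timestamp (seconds) at which the still-open candle began."""
    step = TIMEFRAME_SECONDS[timeframe]
    return (int(time.time()) // step) * step

# Example usage with a parameterized query:
#   SELECT * FROM candles WHERE timeframe = '5m' AND timestamp < ?
#   params = (current_window_start('5m'),)
```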

---

## 4. Accessing the Data

### Recommended Connection Settings (Python/Analysis Engine)

Since the `onramp` service is constantly writing to the database, you **must** connect with a busy timeout (and keep WAL mode enabled) to avoid "Database is locked" errors.

```python
import sqlite3

def get_connection(db_path):
    # Connect with a timeout to wait for the writer to finish
    conn = sqlite3.connect(db_path, timeout=10)
    # Enable WAL mode for high-concurrency reading
    conn.execute("PRAGMA journal_mode=WAL;")
    return conn
```

### Common Query Patterns

**Get the last 100 closed 1-minute candles:**

```sql
SELECT * FROM candles
WHERE timeframe = '1m'
  -- Exclude the still-open (live) candle; see Section 3
  AND timestamp < (CAST(strftime('%s', 'now') AS INTEGER) / 60) * 60
ORDER BY timestamp DESC
LIMIT 100;
```

**Calculate 5-minute volatility (high - low) over the last hour:**

```sql
SELECT timestamp, (high - low) AS volatility
FROM candles
WHERE timeframe = '5m'
  AND timestamp > (strftime('%s', 'now') - 3600)
ORDER BY timestamp ASC;
```

---

## 5. Integration with Analysis Engine (Pandas Example)

If you are building an analysis engine in Python, this is a simple and efficient way to load candles into a DataFrame for processing:

```python
import sqlite3

import pandas as pd

DB_PATH = "path/to/your/candles.db"

def load_candles(timeframe="1m", limit=1000):
    # 10-second busy timeout, as recommended in Section 4
    conn = sqlite3.connect(DB_PATH, timeout=10)
    query = """
        SELECT * FROM candles
        WHERE timeframe = ?
        ORDER BY timestamp DESC
        LIMIT ?
    """
    df = pd.read_sql_query(query, conn, params=(timeframe, limit))
    conn.close()

    # Convert timestamp to readable datetime
    df['datetime'] = pd.to_datetime(df['timestamp'], unit='s')

    # Sort back to chronological order for analysis
    return df.sort_values('timestamp').reset_index(drop=True)

# Usage
df = load_candles("1m")
print(df.tail())
```
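
Once loaded, the Section 3 metrics can be added as extra DataFrame columns. A possible follow-up to the example above (the 20-candle rolling window is an arbitrary choice for illustration, not part of the data source):

```python
# Assumes `df` was produced by load_candles() above.
df['sell_volume'] = df['volume'] - df['buy_volume']
df['net_flow'] = df['buy_volume'] - df['sell_volume']
df['buy_ratio'] = (df['buy_volume'] / df['volume']).where(df['volume'] > 0)

# Rolling high-low range as a simple volatility proxy (20-candle window).
df['range'] = df['high'] - df['low']
df['rolling_range'] = df['range'].rolling(window=20).mean()

print(df[['datetime', 'close', 'buy_ratio', 'rolling_range']].tail())
```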

---

## 6. Maintenance & Performance Notes

1. **Index Optimization:** The Primary Key already creates an index on `(timeframe, timestamp)`. This makes queries filtered by timeframe and sorted by time extremely fast.
2. **Storage:** SQLite handles millions of rows easily. However, if the database exceeds several gigabytes, you may want to run `VACUUM;` occasionally (though this requires stopping the `onramp` service briefly).
3. **Backups:** You can safely copy the `candles.db` file while the system is running, provided you also copy the `candles.db-wal` and `candles.db-shm` files (or use the SQLite `.backup` command; see the sketch below).
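
For point 3, a minimal sketch of an online backup using Python's standard `sqlite3.Connection.backup()` API, the same mechanism as the CLI `.backup` command (the function name and destination path below are examples):

```python
import sqlite3

def backup_candles(src_path="candles.db", dest_path="candles_backup.db"):
    # The backup API takes a consistent snapshot even while onramp is writing.
    src = sqlite3.connect(src_path, timeout=10)
    dest = sqlite3.connect(dest_path)
    src.backup(dest)
    dest.close()
    src.close()

backup_candles()
```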