BytbitBTC/listner_design.md
2026-01-13 21:03:27 +02:00

# Technical Design: Bybit WebSocket Streamer

## 1. System Overview

The software acts as a dedicated bridge between the Bybit V5 WebSocket API and the local filesystem. Its primary goal is to provide a "hot" data stream for downstream consumers that poll the files on disk roughly every 80 ms.

## 2. Core Architecture

The system follows a Producer-Consumer pattern to decouple network ingestion from disk I/O. This keeps disk latency spikes from stalling the WebSocket read loop, which would otherwise let the receive buffer back up and cause messages to be lost.

- **Producer (WS Client):** Manages the connection, sends heartbeats, and pushes raw messages into an in-memory queue.
- **Consumer (File Writer):** Pulls messages from the queue, determines the target file, and writes the data to disk.
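
The split between producer and consumer can be sketched with `asyncio`. This is a minimal illustration, not the project's implementation: the `messages` list stands in for frames read from the WebSocket, the `sink` list stands in for the file writer, and the names `producer`, `consumer`, `pipeline`, and `SENTINEL` are hypothetical.

```python
import asyncio

# Hypothetical end-of-stream marker used only by this sketch.
SENTINEL = None


async def producer(messages, queue: asyncio.Queue) -> None:
    """Stand-in for the WS client: push raw frames into the queue."""
    for raw in messages:  # in a real client: frames received from the socket
        await queue.put(raw)
    await queue.put(SENTINEL)


async def consumer(queue: asyncio.Queue, sink: list) -> None:
    """Stand-in for the file writer: pull frames and hand them to the sink."""
    while True:
        raw = await queue.get()
        if raw is SENTINEL:
            break
        sink.append(raw)  # in a real writer: append the line to the active file


async def pipeline(messages, maxsize: int = 1000) -> list:
    """Run both sides concurrently over one bounded queue."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=maxsize)
    written: list = []
    await asyncio.gather(producer(messages, queue), consumer(queue, written))
    return written


if __name__ == "__main__":
    print(asyncio.run(pipeline([b'{"id":1}', b'{"id":2}'], maxsize=1)))
```

Note that `await queue.put(...)` makes the producer wait when the queue is full. A design that must never block the network side would swap this for a drop policy, as the Bounded Buffer requirement in section 3.B calls for.
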

## 3. Functional Components

### A. Connection Manager (Self-Healing)

To keep the stream reliable, the manager must implement:

- **Heartbeat Mechanism:** Send periodic ping messages to Bybit (typically every 20 to 30 seconds) to prevent idle timeouts.
- **Auto-Reconnect:** On network loss or a socket error, the client must automatically attempt to reconnect.
- **Exponential Backoff:** To avoid spamming the API during an outage, delay each retry according to:

  $$\text{Delay} = \min\left(2^{\text{attempts}},\ \text{MaxDelay}\right)$$

  where `attempts` is the number of consecutive failed reconnection attempts.
- **State Tracking:** Maintain the subscription state so that the client automatically re-subscribes to its topic after reconnecting.
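
The backoff formula and the state-tracking requirement can be sketched together. This is one possible shape under stated assumptions: the delay is taken to be in seconds, the 60-second default for `max_delay` is illustrative, and the names `backoff_delay`, `ConnectionState`, `subscribe_message`, and `PING_MESSAGE` are hypothetical. The JSON frames use the `op`/`args` shape of Bybit's V5 WebSocket messages; the socket handling itself is left out.

```python
import json

# Heartbeat frame in the Bybit V5 JSON shape.
PING_MESSAGE = json.dumps({"op": "ping"})


def backoff_delay(attempts: int, max_delay: float = 60.0) -> float:
    """Delay = min(2 ** attempts, MaxDelay), assumed to be in seconds."""
    # Cap the exponent so a very long outage cannot overflow the float.
    return min(2.0 ** min(attempts, 32), max_delay)


def subscribe_message(topics) -> str:
    """Build the subscribe frame for the given topics."""
    return json.dumps({"op": "subscribe", "args": list(topics)})


class ConnectionState:
    """Remembers topics and failure count across reconnects (hypothetical helper)."""

    def __init__(self, topics, max_delay: float = 60.0):
        self.topics = list(topics)
        self.max_delay = max_delay
        self.attempts = 0

    def on_failure(self) -> float:
        """Record a failed attempt and return how long to wait before retrying."""
        delay = backoff_delay(self.attempts, self.max_delay)
        self.attempts += 1
        return delay

    def on_connected(self) -> str:
        """Reset the backoff and return the frame that restores subscriptions."""
        self.attempts = 0
        return subscribe_message(self.topics)
```

A reconnect loop would sleep for `state.on_failure()` seconds after each error and send `state.on_connected()` as the first frame once the handshake succeeds.
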

### B. Stream Processor & Memory Management

- **Streaming I/O:** Handle messages as raw byte streams. Do not parse the JSON into deep objects unless validation requires it, because doing so adds allocation and garbage-collection overhead.
- **Bounded Buffer:** The queue between the network and the disk should have a fixed capacity. If the disk stalls or fails, the queue should drop the oldest data rather than grow without bound and exhaust system memory.
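
A drop-oldest buffer for raw frames can be sketched with `collections.deque`. The class name `DropOldestBuffer` and its `dropped` counter are hypothetical, and the sketch assumes a single event loop or thread owns the buffer.

```python
from collections import deque
from typing import Optional


class DropOldestBuffer:
    """Fixed-capacity FIFO of raw frames that evicts the oldest entry when full."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._items: deque = deque(maxlen=capacity)
        self.dropped = 0  # number of frames discarded so far

    def push(self, raw: bytes) -> None:
        """Append a frame; if the buffer is full, the oldest frame is lost."""
        if len(self._items) == self._items.maxlen:
            self.dropped += 1  # deque(maxlen=...) evicts from the left on append
        self._items.append(raw)

    def pop(self) -> Optional[bytes]:
        """Return the oldest buffered frame, or None if the buffer is empty."""
        return self._items.popleft() if self._items else None

    def __len__(self) -> int:
        return len(self._items)
```

Frames stay as `bytes` from end to end, so nothing is parsed on the hot path. Logging the `dropped` counter gives operators a way to notice when the disk is falling behind.
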

### C. Atomic File Rotator

The rotator manages the lifecycle of the `.jsonl` files.

- **Naming Convention:** `{topic}_{unix_timestamp}.jsonl`
- **Rotation Logic:** On every received message, compare the current system time against the active file's creation hour. If a new hour has begun:
  1. Flush and close the current file handle.
  2. Open (or create) the new file for the current hour.
- **Immediate Visibility:** Because downstream programs read every 80 ms, the writer must flush its stream buffer immediately after writing each JSON line, so that the data is visible to other processes without waiting for the writer's in-process buffer to fill.
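
The rotation and flush rules can be sketched as a small writer class. Several details here are assumptions rather than part of the design: the timestamp in the file name is taken to be the start of the rotation window, the window length is a parameter defaulting to 3600 seconds, and the names `FileRotator` and `clock` are hypothetical (the injectable clock exists only to make the sketch testable).

```python
import os
import time


class FileRotator:
    """Appends JSON lines to {topic}_{window_start}.jsonl, rotating per window."""

    def __init__(self, output_dir: str, topic: str,
                 rotation_seconds: int = 3600, clock=time.time):
        self.output_dir = output_dir
        self.topic = topic
        self.rotation_seconds = rotation_seconds
        self.clock = clock
        self._window_start = None
        self._fh = None

    def _path_for(self, window_start: int) -> str:
        name = f"{self.topic}_{window_start}.jsonl"
        return os.path.join(self.output_dir, name)

    def write(self, raw: bytes) -> str:
        """Write one raw JSON message as a line and return the active file path."""
        now = int(self.clock())
        window_start = now - now % self.rotation_seconds
        if window_start != self._window_start:
            self.close()  # flush and close the previous window's file
            self._fh = open(self._path_for(window_start), "ab")
            self._window_start = window_start
        self._fh.write(raw.rstrip(b"\n") + b"\n")
        self._fh.flush()  # hand the line to the OS so readers can see it
        return self._fh.name

    def close(self) -> None:
        """Flush and close the active file, if any."""
        if self._fh is not None:
            self._fh.flush()
            self._fh.close()
            self._fh = None
```

`flush()` moves the line out of the Python-level buffer so other processes can read it; it does not force the data onto physical storage, which would need `os.fsync` at a noticeable cost per message.
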

## 4. Configuration Requirements

The software should read its settings from a configuration file (YAML/JSON) or from environment variables:

| Option | Description | Example |
| --- | --- | --- |
| `OUTPUT_DIR` | Absolute path for data storage | `/data/bybit/` |
| `WS_URL` | Bybit WebSocket endpoint | `wss://stream.bybit.com/v5/public/linear` |
| `TOPIC` | Topic to subscribe to | `publicTrade.BTCUSDT` |
| `ROTATION_SECONDS` | Interval for file creation, in seconds | `3600` |
| `LOG_LEVEL` | Verbosity of internal logs | `INFO`, `DEBUG`, `ERROR` |
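
Loading these options from environment variables might look like the following. The `Config` dataclass and `load_config` function are hypothetical, and using the example values from the options above as defaults is an assumption made for illustration; the design does not specify defaults.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    output_dir: str
    ws_url: str
    topic: str
    rotation_seconds: int
    log_level: str


def load_config(env=None) -> Config:
    """Read settings from a mapping (os.environ by default) and validate them."""
    env = os.environ if env is None else env

    output_dir = env.get("OUTPUT_DIR", "/data/bybit/")
    if not os.path.isabs(output_dir):
        raise ValueError("OUTPUT_DIR must be an absolute path")

    rotation_seconds = int(env.get("ROTATION_SECONDS", "3600"))
    if rotation_seconds <= 0:
        raise ValueError("ROTATION_SECONDS must be a positive integer")

    return Config(
        output_dir=output_dir,
        ws_url=env.get("WS_URL", "wss://stream.bybit.com/v5/public/linear"),
        topic=env.get("TOPIC", "publicTrade.BTCUSDT"),
        rotation_seconds=rotation_seconds,
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test, and the same function could be fed values parsed from a YAML or JSON file.
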

## 5. Logging & Diagnostics

Operational logs must be written to a separate rolling log file (e.g., `system.log`) to track:

- **Connection Events:** Timestamps of successful handshakes and disconnections.
- **Subscription Status:** Confirmation of topic subscription.
- **Rotation Events:** Filenames of newly created files.
- **Errors:** Socket timeouts, disk-full errors, or malformed JSON received from the exchange.
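
A rolling log file of this kind can be set up with the standard library's `RotatingFileHandler`. The logger name `bybit_streamer`, the 5 MB size limit, the three backups, and the line format are all illustrative assumptions, not requirements from the design.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_logger(path: str = "system.log", level: str = "INFO",
                 max_bytes: int = 5_000_000, backups: int = 3) -> logging.Logger:
    """Create a logger that writes to a size-limited, rolling log file."""
    logger = logging.getLogger("bybit_streamer")  # hypothetical logger name
    logger.setLevel(level)

    # Drop handlers from any earlier call so lines are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()

    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = False  # keep operational logs out of the root logger
    return logger
```

Typical calls would then be `log.info("connected url=%s", ws_url)` for connection events, `log.info("rotated file=%s", path)` for rotations, and `log.error(...)` for socket or disk failures.
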

## 6. Implementation Considerations for the Programmer

- **Concurrency:** Use non-blocking I/O or lightweight concurrency (goroutines, asyncio tasks, etc.) so that the heartbeat does not get stuck behind a slow disk write.
- **File Permissions:** Ensure that created files are readable by the downstream programs.
- **Graceful Shutdown:** On SIGTERM, the software must flush all buffers and close the current file properly to avoid data corruption.
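
The graceful-shutdown requirement can be sketched with `asyncio` signal handlers. This is a simplified outline under assumptions: `sink` is any object with `write`, `flush`, and `close` methods, the function names are hypothetical, and `loop.add_signal_handler` is available only on Unix event loops.

```python
import asyncio
import signal


def drain_and_close(queue: asyncio.Queue, sink) -> int:
    """Write every frame still queued, then flush and close the sink."""
    written = 0
    while not queue.empty():
        sink.write(queue.get_nowait() + b"\n")
        written += 1
    sink.flush()
    sink.close()
    return written


async def run_until_signalled(queue: asyncio.Queue, sink) -> int:
    """Wait for SIGTERM or SIGINT, then shut the writer down cleanly."""
    stop = asyncio.Event()
    loop = asyncio.get_running_loop()
    for sig in (signal.SIGTERM, signal.SIGINT):
        loop.add_signal_handler(sig, stop.set)  # Unix event loops only
    await stop.wait()
    return drain_and_close(queue, sink)
```

In a full program the producer and heartbeat tasks would be cancelled before draining, so that no new frames arrive while the final lines are being written.
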