# WebSocket Streamer (JSONL Logger)

This Go program connects to a WebSocket endpoint, subscribes to a topic, and continuously writes incoming messages to hourly-rotated `.jsonl` files.
It is designed for **long-running, low-overhead data capture** with basic observability and operational safety.

Typical use cases include:

* Market data capture (e.g. crypto trades)
* Event stream archiving
* Lightweight ingestion on small servers (VPS, Raspberry Pi, etc.)

---

## Features

* WebSocket client with automatic reconnect
* Topic subscription (configurable)
* Hourly file rotation (`.jsonl` format)
* Buffered channel to decouple network I/O from disk writes
* Atomic message counters
* Periodic status logging
* Unix domain socket for live status queries
* Graceful shutdown on SIGINT / SIGTERM
* Configurable logging (file and/or stdout)

---

## How It Works

1. Connects to a WebSocket endpoint
2. Sends a subscription message for the configured topic
3. Keeps the connection alive with periodic `ping` messages
4. Reads messages and pushes them into a buffered channel
5. Writes messages line by line to hourly JSONL files
6. Exposes runtime status via:

   * Periodic log output
   * Unix socket query (`-status` mode)
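
Steps 4 and 5 can be sketched with the standard library alone. This is a minimal illustration, not the program's actual code: the `pipeline` type and its methods are hypothetical names, and dropping a message when the buffer is full is an assumption suggested by the "buffer overflow warnings" the logs mention.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"sync/atomic"
)

// pipeline decouples the WebSocket reader from the file writer.
// (Hypothetical type, used only for this sketch.)
type pipeline struct {
	buf     chan []byte
	total   atomic.Uint64 // messages accepted into the buffer
	dropped atomic.Uint64 // messages discarded because the buffer was full
}

func newPipeline(size int) *pipeline {
	return &pipeline{buf: make(chan []byte, size)}
}

// enqueue is what the reader goroutine would call for each message.
// The send is non-blocking, so a slow disk never stalls the network
// reader. Dropping on overflow is an assumption, not confirmed behavior.
func (p *pipeline) enqueue(msg []byte) bool {
	select {
	case p.buf <- msg:
		p.total.Add(1)
		return true
	default:
		p.dropped.Add(1)
		return false
	}
}

// drain is the writer side: one message per line until the channel closes.
func (p *pipeline) drain(w io.Writer) error {
	bw := bufio.NewWriter(w)
	for msg := range p.buf {
		bw.Write(msg)
		bw.WriteByte('\n')
	}
	// bufio.Writer remembers the first write error and returns it from Flush.
	return bw.Flush()
}

func main() {
	p := newPipeline(2) // stands in for buffer_size
	for _, m := range []string{`{"id":1}`, `{"id":2}`, `{"id":3}`} {
		if !p.enqueue([]byte(m)) {
			fmt.Fprintln(os.Stderr, "buffer full, dropping message")
		}
	}
	close(p.buf)

	if err := p.drain(os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, "write error:", err)
	}
	fmt.Fprintf(os.Stderr, "accepted=%d dropped=%d\n", p.total.Load(), p.dropped.Load())
}
```

In a long-running process the reader and writer would run in separate goroutines; the demo runs them one after the other only to stay short.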

---

## Output Format

Messages are written **verbatim** as received, one JSON object per line. Files are named `<topic>_<unix_timestamp>.jsonl` and placed in `output_dir`:

```
output/
├── publicTrade.BTCUSDT_1700000000.jsonl
├── publicTrade.BTCUSDT_1700003600.jsonl
└── ...
```

Each file contains data for exactly one UTC hour.
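
One way to derive such a file name is sketched below. The README does not say how the numeric suffix is computed; this sketch assumes it is the Unix time of the start of the UTC hour, and `hourStart` and `fileName` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"path/filepath"
	"time"
)

// hourStart truncates a Unix timestamp (in seconds) to the start of its
// UTC hour. It assumes a non-negative timestamp.
func hourStart(unixSec int64) int64 {
	return unixSec - unixSec%3600
}

// fileName builds "<topic>_<hour start>.jsonl".
// Using the hour start as the suffix is an assumption of this sketch.
func fileName(topic string, unixSec int64) string {
	return fmt.Sprintf("%s_%d.jsonl", topic, hourStart(unixSec))
}

func main() {
	now := time.Now().Unix()
	fmt.Println(filepath.Join("output", fileName("publicTrade.BTCUSDT", now)))
}
```

With this scheme the writer only needs to compare the current hour start with the one of the open file to decide when to rotate.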

---

## Configuration

The application is configured via a JSON file.

### Example `config.json`

```json
{
  "output_dir": "./output",
  "topic": "publicTrade.BTCUSDT",
  "ws_url": "wss://stream.bybit.com/v5/public/linear",
  "buffer_size": 10000,
  "status_interval": 30,
  "log_file": "system.log",
  "log_to_stdout": false,
  "status_socket": "/tmp/streamer.sock"
}
```

### Configuration Fields

| Field             | Description                         |
| ----------------- | ----------------------------------- |
| `output_dir`      | Directory for JSONL output files    |
| `topic`           | WebSocket subscription topic        |
| `ws_url`          | WebSocket endpoint URL              |
| `buffer_size`     | Size of internal message buffer     |
| `status_interval` | Seconds between status log messages |
| `log_file`        | Log file path                       |
| `log_to_stdout`   | Also log to stdout                  |
| `status_socket`   | Unix socket path for status queries |

Defaults are applied automatically if fields are omitted.
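
A common way to get this behavior in Go is to decode the JSON on top of a struct that already holds the defaults. The sketch below shows the idea; the `Config` struct and both function names are hypothetical, and the default values simply mirror the example `config.json` because the real defaults are not documented here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config mirrors the fields from the table above.
type Config struct {
	OutputDir      string `json:"output_dir"`
	Topic          string `json:"topic"`
	WSURL          string `json:"ws_url"`
	BufferSize     int    `json:"buffer_size"`
	StatusInterval int    `json:"status_interval"`
	LogFile        string `json:"log_file"`
	LogToStdout    bool   `json:"log_to_stdout"`
	StatusSocket   string `json:"status_socket"`
}

// defaultConfig returns placeholder defaults. The values are assumptions
// copied from the example config; the program's real defaults may differ.
func defaultConfig() Config {
	return Config{
		OutputDir:      "./output",
		Topic:          "publicTrade.BTCUSDT",
		WSURL:          "wss://stream.bybit.com/v5/public/linear",
		BufferSize:     10000,
		StatusInterval: 30,
		LogFile:        "system.log",
		StatusSocket:   "/tmp/streamer.sock",
	}
}

// parseConfig decodes JSON over the defaults, so any field missing from
// the input keeps its default value.
func parseConfig(data []byte) (Config, error) {
	cfg := defaultConfig()
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, err
	}
	return cfg, nil
}

func main() {
	cfg, err := parseConfig([]byte(`{"topic": "publicTrade.ETHUSDT", "buffer_size": 500}`))
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```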

---

## Command Line Flags

| Flag      | Description                                  |
| --------- | -------------------------------------------- |
| `-config` | Path to config file (default: `config.json`) |
| `-debug`  | Force logs to stdout (overrides config)      |
| `-status` | Query running instance status and exit       |
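
These flags map directly onto Go's standard `flag` package. The sketch below is one possible way to parse them; the `options` struct and `parseFlags` function are hypothetical names, and only the flag names and the `config.json` default come from the table above.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options holds the parsed command line flags.
type options struct {
	configPath string
	debug      bool
	status     bool
}

// parseFlags uses a dedicated FlagSet so it can be called with any
// argument slice, which also makes it easy to test.
func parseFlags(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("streamer", flag.ContinueOnError)
	fs.StringVar(&o.configPath, "config", "config.json", "path to config file")
	fs.BoolVar(&o.debug, "debug", false, "force logs to stdout (overrides config)")
	fs.BoolVar(&o.status, "status", false, "query running instance status and exit")
	err := fs.Parse(args)
	return o, err
}

func main() {
	opts, err := parseFlags(os.Args[1:])
	if err != nil {
		os.Exit(2) // the flag package has already printed the error
	}
	fmt.Printf("%+v\n", opts)
}
```

The `flag` package accepts both `-status` and `--status`, so either spelling works on the command line.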

---

## Running the Streamer

```bash
go run main.go -config config.json
```

Or build a binary:

```bash
go build -o streamer
./streamer -config config.json
```

---

## Querying Runtime Status

While the streamer is running:

```bash
./streamer -status -config config.json
```

Example output:

```
Uptime: 12m34s | Total Msgs: 152340 | Rate: 7260.12 msg/min
```

The query goes through the Unix domain socket configured as `status_socket` and does **not** interrupt the running process.
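
The exchange over the socket can be as simple as "connect, read one line, disconnect". The sketch below illustrates that pattern with the standard `net` package. All function names are hypothetical, the wire format is an assumption, and the rate here is a plain lifetime average, which may not be how the real program computes its `Rate` figure.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"os"
	"path/filepath"
	"time"
)

// formatStatus renders a one-line summary in the same shape as the
// example output. The rate is total messages divided by uptime in
// minutes (an assumption of this sketch).
func formatStatus(uptimeSec int64, total uint64) string {
	uptime := time.Duration(uptimeSec) * time.Second
	rate := 0.0
	if m := uptime.Minutes(); m > 0 {
		rate = float64(total) / m
	}
	return fmt.Sprintf("Uptime: %s | Total Msgs: %d | Rate: %.2f msg/min", uptime, total, rate)
}

// serveStatus answers every connection with one status line, then closes it.
func serveStatus(ln net.Listener, status func() string) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return // listener closed
		}
		io.WriteString(conn, status()+"\n")
		conn.Close()
	}
}

// queryStatus is the client side: connect and read until the server closes.
func queryStatus(path string) (string, error) {
	conn, err := net.DialTimeout("unix", path, 2*time.Second)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	b, err := io.ReadAll(conn)
	return string(b), err
}

func main() {
	path := filepath.Join(os.TempDir(), "streamer-example.sock")
	os.Remove(path) // clear a stale socket left by a crashed run
	ln, err := net.Listen("unix", path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "listen:", err)
		return
	}
	defer ln.Close() // closing the listener also unlinks the socket file

	// Sample values; a real server would read live counters here.
	go serveStatus(ln, func() string { return formatStatus(754, 152340) })

	out, err := queryStatus(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "query:", err)
		return
	}
	fmt.Print(out)
}
```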

---

## Logging

* Logs are written to `log_file`
* Stdout logging can be enabled for debugging (`log_to_stdout` or `-debug`)
* Log entries include:

  * Startup information
  * Connection errors and reconnects
  * Buffer overflow warnings
  * Periodic status summaries
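
Logging to a file and optionally to stdout is commonly done with `io.MultiWriter`. The sketch below shows that approach under two assumptions: `-debug` adds stdout alongside the file rather than replacing it, and the helper names `wantStdout` and `newLogger` are hypothetical.

```go
package main

import (
	"io"
	"log"
	"os"
	"path/filepath"
)

// wantStdout mirrors the documented precedence: the -debug flag forces
// stdout logging regardless of log_to_stdout in the config.
func wantStdout(logToStdout, debugFlag bool) bool {
	return logToStdout || debugFlag
}

// newLogger writes to the log file and, when requested, to stdout as well.
// Keeping the file as a destination in debug mode is an assumption.
func newLogger(file io.Writer, stdout bool) *log.Logger {
	w := file
	if stdout {
		w = io.MultiWriter(file, os.Stdout)
	}
	return log.New(w, "", log.LstdFlags)
}

func main() {
	// A temp-dir path keeps the demo from touching a real log_file.
	path := filepath.Join(os.TempDir(), "streamer-example.log")
	f, err := os.OpenFile(path, os.O_CREATE|os.O_APPEND|os.O_WRONLY, 0o644)
	if err != nil {
		log.Fatalf("open log file: %v", err)
	}
	defer f.Close()

	logger := newLogger(f, wantStdout(false, true))
	logger.Println("streamer starting")
}
```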

---

## Graceful Shutdown

On `SIGINT` or `SIGTERM`:

* The WebSocket connection is closed
* The status socket is removed
* The current output file is flushed and closed

This makes the streamer safe to run under systemd, Docker, or supervisord.
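
A typical way to implement this sequence in Go is `signal.NotifyContext` followed by an ordered list of cleanup steps. The sketch below illustrates the pattern; `runCleanup` is a hypothetical helper, the first two cleanup steps are placeholders, and the two-second timeout exists only so the demo ends without needing a signal.

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// runCleanup executes every step in order and collects the errors, so one
// failing step does not prevent the remaining ones from running.
func runCleanup(steps ...func() error) []error {
	var errs []error
	for _, step := range steps {
		if err := step(); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}

func main() {
	// ctx is cancelled when SIGINT or SIGTERM arrives.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	// Demo only: also stop after two seconds so the example ends by itself.
	ctx, cancel := context.WithTimeout(ctx, 2*time.Second)
	defer cancel()

	out := bufio.NewWriter(os.Stdout) // stands in for the current .jsonl file
	fmt.Fprintln(out, `{"example":true}`)

	<-ctx.Done() // wait for a signal (or the demo timeout)

	errs := runCleanup(
		func() error { return nil }, // placeholder: close the WebSocket connection
		func() error { return nil }, // placeholder: remove the status socket
		out.Flush,                   // flush buffered output before exiting
	)
	fmt.Fprintln(os.Stderr, "shutdown complete, errors:", len(errs))
}
```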

---

## Dependencies

* Go 1.20+
* [`github.com/gorilla/websocket`](https://github.com/gorilla/websocket)

---

## Notes & Design Choices

* **JSONL** is used for easy streaming, compression, and downstream processing
* Hourly rotation avoids large files and simplifies retention policies
* Unix socket status avoids HTTP overhead and exposed ports
* Minimal memory footprint, suitable for low-end machines