Bug fixes in input and onramp. Hot config reload on signals. Added example utility scripts for signals.
173
input/README.md
Normal file
@@ -0,0 +1,173 @@
# WebSocket Streamer (JSONL Logger)

This Go program connects to a WebSocket endpoint, subscribes to a topic, and continuously writes incoming messages to hourly-rotated `.jsonl` files.
It is designed for **long-running, low-overhead data capture** with basic observability and operational safety.

Typical use cases include:

* Market data capture (e.g. crypto trades)
* Event stream archiving
* Lightweight ingestion on small servers (VPS, Raspberry Pi, etc.)

---

## Features

* WebSocket client with automatic reconnect
* Topic subscription (configurable)
* Hourly file rotation (`.jsonl` format)
* Buffered channel to decouple network I/O from disk writes
* Atomic message counters
* Periodic status logging
* Unix domain socket for live status queries
* Graceful shutdown on SIGINT / SIGTERM
* Configurable logging (file and/or stdout)
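
As a rough illustration of the buffered-channel and atomic-counter features listed above, the following Go sketch shows one way a reader could hand messages to a writer without blocking on disk I/O. The names `stats`, `enqueue`, and `drain` are hypothetical, and dropping a message when the buffer is full is an assumption, not confirmed behavior of this program. In a real streamer the two sides would run in separate goroutines; here they run in sequence to keep the example short.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"sync/atomic"
)

// stats holds counters that can be read safely from another goroutine.
type stats struct {
	received atomic.Uint64
	dropped  atomic.Uint64
}

// enqueue tries a non-blocking send. If the buffer is full the message is
// dropped and counted (assumed policy), so the network reader never stalls.
func enqueue(ch chan<- []byte, msg []byte, s *stats) bool {
	s.received.Add(1)
	select {
	case ch <- msg:
		return true
	default:
		s.dropped.Add(1)
		return false
	}
}

// drain writes each message as one line until ch is closed, then flushes.
func drain(ch <-chan []byte, w io.Writer) error {
	bw := bufio.NewWriter(w)
	for msg := range ch {
		if _, err := bw.Write(msg); err != nil {
			return err
		}
		if err := bw.WriteByte('\n'); err != nil {
			return err
		}
	}
	return bw.Flush()
}

func main() {
	ch := make(chan []byte, 2) // tiny buffer so the third message overflows
	var s stats
	for _, m := range []string{`{"n":1}`, `{"n":2}`, `{"n":3}`} {
		enqueue(ch, []byte(m), &s)
	}
	close(ch)
	if err := drain(ch, os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, "write failed:", err)
	}
	fmt.Fprintf(os.Stderr, "received=%d dropped=%d\n", s.received.Load(), s.dropped.Load())
}
```

The buffer capacity plays the role of `buffer_size`: a larger value absorbs longer disk stalls at the cost of memory.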

---

## How It Works

1. Connects to a WebSocket endpoint
2. Sends a subscription message for the configured topic
3. Keeps the connection alive with periodic `ping` messages
4. Reads messages and pushes them into a buffered channel
5. Writes messages line-by-line to hourly JSONL files
6. Exposes runtime status via:

   * Periodic log output
   * Unix socket query (`-status` mode)
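
Steps 2 and 3 can be sketched as follows. This assumes the endpoint uses JSON control messages with an `op` field, as the Bybit v5 public stream in the example configuration does; the `wsRequest` type and both helper functions are hypothetical names, and the actual program may build these messages differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// wsRequest is a hypothetical type modelling a control message
// sent from the client to the server.
type wsRequest struct {
	Op   string   `json:"op"`
	Args []string `json:"args,omitempty"`
}

// subscribeMessage builds the subscription request for a single topic.
func subscribeMessage(topic string) ([]byte, error) {
	return json.Marshal(wsRequest{Op: "subscribe", Args: []string{topic}})
}

// pingMessage builds the keep-alive request that is sent periodically.
func pingMessage() ([]byte, error) {
	return json.Marshal(wsRequest{Op: "ping"})
}

func main() {
	sub, err := subscribeMessage("publicTrade.BTCUSDT")
	if err != nil {
		panic(err)
	}
	ping, err := pingMessage()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(sub))  // {"op":"subscribe","args":["publicTrade.BTCUSDT"]}
	fmt.Println(string(ping)) // {"op":"ping"}
}
```

With `gorilla/websocket`, each payload would typically be sent as a text frame via `conn.WriteMessage(websocket.TextMessage, payload)`.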

---

## Output Format

Messages are written **verbatim** as received, one JSON object per line:

```
output/
├── publicTrade.BTCUSDT_1700000000.jsonl
├── publicTrade.BTCUSDT_1700003600.jsonl
└── ...
```

Each file contains data for exactly one UTC hour.
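
The naming scheme can be illustrated with a small sketch. It assumes the numeric suffix is the Unix timestamp of the start of the UTC hour; the program may instead use another instant, such as the time the file was opened. `hourlyFileName` is a hypothetical helper, not a function from this repository.

```go
package main

import (
	"fmt"
	"time"
)

// hourlyFileName returns "<topic>_<unix>.jsonl", where <unix> is the start
// of the UTC hour containing unixSec (assumed convention).
func hourlyFileName(topic string, unixSec int64) string {
	hourStart := time.Unix(unixSec, 0).UTC().Truncate(time.Hour).Unix()
	return fmt.Sprintf("%s_%d.jsonl", topic, hourStart)
}

func main() {
	// Two timestamps inside the same hour map to the same file.
	fmt.Println(hourlyFileName("publicTrade.BTCUSDT", 1700000000))
	fmt.Println(hourlyFileName("publicTrade.BTCUSDT", 1700002799))
	// The next second starts a new hour and therefore a new file.
	fmt.Println(hourlyFileName("publicTrade.BTCUSDT", 1700002800))
}
```

Under this assumption, rotation reduces to comparing the name computed for each incoming message with the name of the currently open file.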

---

## Configuration

The application is configured via a JSON file.

### Example `config.json`

```json
{
  "output_dir": "./output",
  "topic": "publicTrade.BTCUSDT",
  "ws_url": "wss://stream.bybit.com/v5/public/linear",
  "buffer_size": 10000,
  "status_interval": 30,
  "log_file": "system.log",
  "log_to_stdout": false,
  "status_socket": "/tmp/streamer.sock"
}
```

### Configuration Fields

| Field             | Description                         |
| ----------------- | ----------------------------------- |
| `output_dir`      | Directory for JSONL output files    |
| `topic`           | WebSocket subscription topic        |
| `ws_url`          | WebSocket endpoint URL              |
| `buffer_size`     | Size of internal message buffer     |
| `status_interval` | Seconds between status log messages |
| `log_file`        | Log file path                       |
| `log_to_stdout`   | Also log to stdout                  |
| `status_socket`   | Unix socket path for status queries |

Defaults are applied automatically if fields are omitted.
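
One common way to get this behavior in Go is to decode the JSON on top of a struct that already holds the defaults. The sketch below assumes the default values match the example `config.json`; the real defaults are not documented here, and `Config`, `defaultConfig`, and `parseConfig` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config mirrors the fields in the table above.
type Config struct {
	OutputDir      string `json:"output_dir"`
	Topic          string `json:"topic"`
	WSURL          string `json:"ws_url"`
	BufferSize     int    `json:"buffer_size"`
	StatusInterval int    `json:"status_interval"`
	LogFile        string `json:"log_file"`
	LogToStdout    bool   `json:"log_to_stdout"`
	StatusSocket   string `json:"status_socket"`
}

// defaultConfig returns assumed defaults, copied from the example file.
func defaultConfig() Config {
	return Config{
		OutputDir:      "./output",
		Topic:          "publicTrade.BTCUSDT",
		WSURL:          "wss://stream.bybit.com/v5/public/linear",
		BufferSize:     10000,
		StatusInterval: 30,
		LogFile:        "system.log",
		LogToStdout:    false,
		StatusSocket:   "/tmp/streamer.sock",
	}
}

// parseConfig decodes data over the defaults, so any field missing from
// the JSON keeps its default value.
func parseConfig(data []byte) (Config, error) {
	cfg := defaultConfig()
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, fmt.Errorf("parse config: %w", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := parseConfig([]byte(`{"topic": "publicTrade.ETHUSDT", "buffer_size": 500}`))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

Only `topic` and `buffer_size` are overridden in the usage example; every other field retains the value set in `defaultConfig`.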

---

## Command Line Flags

| Flag      | Description                                  |
| --------- | -------------------------------------------- |
| `-config` | Path to config file (default: `config.json`) |
| `-debug`  | Force logs to stdout (overrides config)      |
| `-status` | Query running instance status and exit       |
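
For readers who want to see how such flags are usually declared, here is a minimal sketch using the standard `flag` package. The `options` struct and `parseFlags` function are hypothetical; only the flag names, the `config.json` default, and the descriptions come from the table above.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options collects the parsed command line flags.
type options struct {
	configPath string
	debug      bool
	status     bool
}

// parseFlags parses args (without the program name) into options.
func parseFlags(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("streamer", flag.ContinueOnError)
	fs.StringVar(&o.configPath, "config", "config.json", "path to config file")
	fs.BoolVar(&o.debug, "debug", false, "force logs to stdout (overrides config)")
	fs.BoolVar(&o.status, "status", false, "query running instance status and exit")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	return o, nil
}

func main() {
	o, err := parseFlags(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", o)
}
```

Using a dedicated `flag.FlagSet` rather than the package-level functions keeps the parser easy to exercise with arbitrary argument slices.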

---

## Running the Streamer

```bash
go run main.go -config config.json
```

Or build a binary:

```bash
go build -o streamer
./streamer -config config.json
```

---

## Querying Runtime Status

While the streamer is running:

```bash
./streamer -status -config config.json
```

Example output:

```
Uptime: 12m34s | Total Msgs: 152340 | Rate: 7260.12 msg/min
```

This works via a Unix domain socket and does **not** interrupt the running process.
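
The mechanism can be sketched with the standard `net` package. This is an illustration under assumptions, not the program's actual protocol: it assumes the server answers every connection with a single status line and then closes it. `serveStatus`, `queryStatus`, and `demo` are hypothetical names, and the status text is a fixed placeholder.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"os"
	"path/filepath"
	"strings"
	"time"
)

// serveStatus answers each connection with one status line and closes it.
// It returns once the listener has been closed.
func serveStatus(l net.Listener, status func() string) {
	for {
		conn, err := l.Accept()
		if err != nil {
			return // listener closed
		}
		fmt.Fprintln(conn, status())
		conn.Close()
	}
}

// queryStatus connects to the socket at path and returns the reply.
func queryStatus(path string) (string, error) {
	conn, err := net.DialTimeout("unix", path, 2*time.Second)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	reply, err := io.ReadAll(conn)
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(reply)), nil
}

// demo starts a server on a temporary socket and queries it once.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "streamer")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "status.sock")
	l, err := net.Listen("unix", path)
	if err != nil {
		return "", err
	}
	defer l.Close()
	go serveStatus(l, func() string {
		return "Uptime: 1s | Total Msgs: 0 | Rate: 0.00 msg/min" // placeholder
	})
	return queryStatus(path)
}

func main() {
	out, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```

Because the socket is a filesystem path, access can be restricted with ordinary file permissions and no network port is opened.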

---

## Logging

* Logs are written to `log_file`
* Optional stdout logging for debugging
* Includes:

  * Startup information
  * Connection errors and reconnects
  * Buffer overflow warnings
  * Periodic status summaries

---

## Graceful Shutdown

On `SIGINT` or `SIGTERM`:

* WebSocket connection closes
* Status socket is removed
* Current output file is flushed and closed

Safe to run under systemd, Docker, or supervisord.
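
A shutdown sequence like the one above is commonly driven by `signal.NotifyContext`. The sketch below is an assumption about structure, not the program's code: `step` and `runShutdown` are hypothetical, the cleanup functions are no-op stand-ins, and the process sends `SIGTERM` to itself only so that the example terminates on its own (Unix-like systems only).

```go
package main

import (
	"context"
	"fmt"
	"os/signal"
	"syscall"
	"time"
)

// step is one named cleanup action.
type step struct {
	name string
	fn   func() error
}

// runShutdown runs every step in order, continues past failures, and
// returns the names of the steps that succeeded.
func runShutdown(steps []step) []string {
	var done []string
	for _, s := range steps {
		if err := s.fn(); err != nil {
			fmt.Printf("shutdown: %s failed: %v\n", s.name, err)
			continue
		}
		done = append(done, s.name)
	}
	return done
}

func main() {
	// ctx is cancelled when SIGINT or SIGTERM is received.
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
	defer stop()

	// Demonstration only: signal ourselves so the example exits by itself.
	go func() {
		time.Sleep(100 * time.Millisecond)
		_ = syscall.Kill(syscall.Getpid(), syscall.SIGTERM)
	}()

	<-ctx.Done() // blocks until a shutdown signal arrives

	noop := func() error { return nil } // stand-in for real cleanup work
	done := runShutdown([]step{
		{"close websocket", noop},
		{"remove status socket", noop},
		{"flush and close output file", noop},
	})
	fmt.Println(done)
}
```

Continuing past a failed step means that, for example, an error while closing the WebSocket does not prevent the output file from being flushed.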

---

## Dependencies

* Go 1.20+
* [`github.com/gorilla/websocket`](https://github.com/gorilla/websocket)

---

## Notes & Design Choices

* **JSONL** is used for easy streaming, compression, and downstream processing
* Hourly rotation avoids large files and simplifies retention policies
* Unix socket status avoids HTTP overhead and exposed ports
* Minimal memory footprint, suitable for low-end machines