# SlidingSQLite Usage Documentation

This document provides detailed instructions on how to use the `SlidingSQLite` library, including its API, configuration options, and best practices.

## Table of Contents

1. [Overview](#overview)
2. [Installation](#installation)
3. [Configuration](#configuration)
4. [Basic Usage](#basic-usage)
   - [Initializing the Database](#initializing-the-database)
   - [Executing Write Queries](#executing-write-queries)
   - [Executing Read Queries](#executing-read-queries)
   - [Retrieving Results](#retrieving-results)
   - [Shutting Down](#shutting-down)
5. [Advanced Usage](#advanced-usage)
   - [Multi-Threaded Applications](#multi-threaded-applications)
   - [Managing Database Retention](#managing-database-retention)
   - [Customizing Cleanup](#customizing-cleanup)
   - [Querying Across Time Windows](#querying-across-time-windows)
6. [API Reference](#api-reference)
7. [Error Handling](#error-handling)
8. [Best Practices](#best-practices)
9. [Example](#example)

## Overview

`SlidingSQLite` is a thread-safe SQLite wrapper that supports time-based database rotation, making it ideal for applications that need to manage time-series data or logs with automatic cleanup. It provides asynchronous query execution, automatic database rotation, and retention policies, all while ensuring thread safety through a queue-based worker system.

## Installation

To use `SlidingSQLite`, ensure you have Python 3.7 or higher installed. The library uses only the standard library and SQLite, which is included with Python.

1. Copy the `SlidingSqlite.py` file into your project directory.
2. Import the `SlidingSQLite` class in your Python code:

```python
from SlidingSqlite import SlidingSQLite
```

## Configuration

The `SlidingSQLite` class is initialized with several configuration parameters:

- **`db_dir`**: Directory where database files will be stored.
- **`schema`**: SQL schema to initialize new database files (e.g., table definitions).
- **`rotation_interval`**: Time interval (in seconds) after which a new database file is created (default: 3600 seconds, or 1 hour).
- **`retention_period`**: Time period (in seconds) to retain database files before deletion (default: 604800 seconds, or 7 days).
- **`cleanup_interval`**: Frequency (in seconds) of the cleanup process for old databases and stale queries (default: 3600 seconds, or 1 hour).
- **`auto_delete_old_dbs`**: Boolean flag to enable or disable automatic deletion of old databases (default: `True`).

Example configuration:

```python
schema = """
CREATE TABLE IF NOT EXISTS logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp REAL,
    message TEXT
);
"""

db = SlidingSQLite(
    db_dir="./databases",
    schema=schema,
    rotation_interval=3600,   # Rotate every hour
    retention_period=604800,  # Keep databases for 7 days
    cleanup_interval=3600,    # Run cleanup every hour
    auto_delete_old_dbs=True
)
```

## Basic Usage

### Initializing the Database

Create an instance of `SlidingSQLite` with your desired configuration. This will set up the database directory, initialize the metadata database, and start the background workers for write operations and cleanup.

```python
from SlidingSqlite import SlidingSQLite
import logging

logging.basicConfig(level=logging.INFO)

schema = """
CREATE TABLE IF NOT EXISTS logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp REAL,
    message TEXT
);
"""

db = SlidingSQLite(db_dir="./databases", schema=schema)
```

### Executing Write Queries

Use the `execute_write` method to perform write operations (e.g., `INSERT`, `UPDATE`, `DELETE`). This method is asynchronous and returns a UUID that can be used to retrieve the result.
```python
import time

query_id = db.execute_write(
    "INSERT INTO logs (timestamp, message) VALUES (?, ?)",
    (time.time(), "Hello, SlidingSQLite!")
)
```

For synchronous execution, use `execute_write_sync`, which blocks until the operation completes or times out:

```python
result = db.execute_write_sync(
    "INSERT INTO logs (timestamp, message) VALUES (?, ?)",
    (time.time(), "Synchronous write"),
    timeout=5.0
)
if result.success:
    logging.info("Write operation successful")
else:
    logging.error(f"Write operation failed: {result.error}")
```

### Executing Read Queries

Use the `execute_read` method to perform read operations (e.g., `SELECT`). This method executes the query across all relevant database files, providing a seamless view of time-windowed data. It is asynchronous and returns a UUID.

```python
query_id = db.execute_read(
    "SELECT * FROM logs WHERE timestamp > ? ORDER BY timestamp DESC",
    (time.time() - 86400,)  # Last 24 hours
)
```

For synchronous execution, use `execute_read_sync`:

```python
result = db.execute_read_sync(
    "SELECT * FROM logs WHERE timestamp > ? ORDER BY timestamp DESC",
    (time.time() - 86400,),
    timeout=5.0
)
if result.success:
    logging.info(f"Found {len(result.data)} log entries: {result.data}")
else:
    logging.error(f"Read operation failed: {result.error}")
```

### Retrieving Results

For asynchronous operations, use `get_result` (for write queries) or `get_read_result` (for read queries) to retrieve the results using the UUID returned by `execute_write` or `execute_read`.
```python
# Write result
result = db.get_result(query_id, timeout=5.0)
if result.success:
    logging.info("Write operation successful")
else:
    logging.error(f"Write operation failed: {result.error}")

# Read result
result = db.get_read_result(query_id, timeout=5.0)
if result.success:
    logging.info(f"Found {len(result.data)} log entries: {result.data}")
else:
    logging.error(f"Read operation failed: {result.error}")
```

### Shutting Down

Always call the `shutdown` method when you are done with the database to ensure graceful cleanup of resources:

```python
db.shutdown()
```

## Advanced Usage

### Multi-Threaded Applications

`SlidingSQLite` is designed for multi-threaded environments. It uses queues and locks to ensure thread safety. Here is an example of using multiple writer and reader threads:

```python
import threading
import time
import random
from SlidingSqlite import SlidingSQLite
import logging

logging.basicConfig(level=logging.INFO)

schema = """
CREATE TABLE IF NOT EXISTS logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp REAL,
    message TEXT
);
"""

db = SlidingSQLite(
    db_dir="./databases",
    schema=schema,
    rotation_interval=10,  # Rotate every 10 seconds for testing
    retention_period=60,   # Keep databases for 60 seconds
    cleanup_interval=30    # Run cleanup every 30 seconds
)

def writer_thread():
    while True:
        db.execute_write(
            "INSERT INTO logs (timestamp, message) VALUES (?, ?)",
            (time.time(), f"Message from thread {threading.current_thread().name}")
        )
        time.sleep(random.uniform(0.05, 0.15))

def reader_thread():
    while True:
        result = db.execute_read_sync(
            "SELECT * FROM logs ORDER BY timestamp DESC LIMIT 5",
            timeout=5.0
        )
        if result.success:
            logging.info(f"Recent logs: {result.data}")
        time.sleep(random.uniform(0.5, 1.5))

threads = []
for _ in range(4):  # Start 4 writer threads
    t = threading.Thread(target=writer_thread, daemon=True)
    t.start()
    threads.append(t)
for _ in range(2):  # Start 2 reader threads
    t = threading.Thread(target=reader_thread, daemon=True)
    t.start()
    threads.append(t)

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    print("\nShutting down...")
    db.shutdown()
```

### Managing Database Retention

You can configure the retention period and control database deletion:

- **Set Retention Period**: Use `set_retention_period` to change how long databases are kept:

  ```python
  db.set_retention_period(86400)  # Keep databases for 1 day
  ```

- **Enable/Disable Auto-Delete**: Use `set_auto_delete` to control automatic deletion of old databases:

  ```python
  db.set_auto_delete(False)  # Disable automatic deletion
  ```

- **Manual Deletion**: Use `delete_databases_before` or `delete_databases_in_range` to manually delete databases:

  ```python
  import time

  # Delete all databases before a specific timestamp
  count = db.delete_databases_before(time.time() - 86400)
  logging.info(f"Deleted {count} databases")

  # Delete databases in a specific time range
  count = db.delete_databases_in_range(time.time() - 172800, time.time() - 86400)
  logging.info(f"Deleted {count} databases in range")
  ```

### Customizing Cleanup

You can adjust the cleanup interval to control how often the system checks for old databases and stale queries:

```python
db = SlidingSQLite(
    db_dir="./databases",
    schema=schema,
    cleanup_interval=1800  # Run cleanup every 30 minutes
)
```

### Querying Across Time Windows

Read queries are automatically executed across all relevant database files, providing a unified view of data across time windows. This is particularly useful for time-series data or logs. For example:

```python
result = db.execute_read_sync(
    "SELECT timestamp, message FROM logs WHERE timestamp > ? ORDER BY timestamp DESC",
    (time.time() - 604800,)  # Last 7 days
)
if result.success:
    logging.info(f"Found {len(result.data)} log entries: {result.data}")
```

## API Reference

### `SlidingSQLite` Class

#### Initialization

```python
SlidingSQLite(
    db_dir: str,
    schema: str,
    retention_period: int = 604800,
    rotation_interval: int = 3600,
    cleanup_interval: int = 3600,
    auto_delete_old_dbs: bool = True
)
```

- **Parameters**:
  - `db_dir`: Directory to store database files.
  - `schema`: SQL schema to initialize new databases.
  - `retention_period`: Seconds to keep databases before deletion.
  - `rotation_interval`: Seconds between database rotations.
  - `cleanup_interval`: Seconds between cleanup operations.
  - `auto_delete_old_dbs`: Whether to automatically delete old databases.

#### Methods

- **`execute(query: str, params: Tuple[Any, ...] = ()) -> uuid.UUID`**: Smart query executor that routes read or write operations appropriately.
- **`execute_write(query: str, params: Tuple[Any, ...] = ()) -> uuid.UUID`**: Execute a write query asynchronously. Returns a UUID for result retrieval.
- **`execute_write_sync(query: str, params: Tuple[Any, ...] = (), timeout: float = 5.0) -> QueryResult[bool]`**: Execute a write query synchronously. Returns a `QueryResult` object.
- **`execute_read(query: str, params: Tuple[Any, ...] = ()) -> uuid.UUID`**: Execute a read query asynchronously across all databases. Returns a UUID.
- **`execute_read_sync(query: str, params: Tuple[Any, ...] = (), timeout: float = 5.0) -> QueryResult[List[Tuple[Any, ...]]]`**: Execute a read query synchronously across all databases. Returns a `QueryResult`.
- **`get_result(query_id: uuid.UUID, timeout: float = 5.0) -> QueryResult[bool]`**: Retrieve the result of a write query using its UUID.
- **`get_read_result(query_id: uuid.UUID, timeout: float = 5.0) -> QueryResult[List[Tuple[Any, ...]]]`**: Retrieve the result of a read query using its UUID.
- **`set_retention_period(seconds: int) -> None`**: Set the retention period for databases.
- **`set_auto_delete(enabled: bool) -> None`**: Enable or disable automatic deletion of old databases.
- **`delete_databases_before(timestamp: float) -> int`**: Delete all databases with `end_time` before the specified timestamp. Returns the number of databases deleted.
- **`delete_databases_in_range(start_time: float, end_time: float) -> int`**: Delete all databases overlapping with the specified time range. Returns the number of databases deleted.
- **`get_databases_info() -> List[DatabaseTimeframe]`**: Get information about all available databases, including file paths and time ranges.
- **`shutdown() -> None`**: Gracefully shut down the database, stopping workers and closing connections.

### `QueryResult` Class

A generic class to handle query results with error handling.

- **Attributes**:
  - `data`: The result data (if successful).
  - `error`: The exception (if failed).
  - `success`: Boolean indicating if the query was successful.
- **Usage**:

  ```python
  result = db.execute_write_sync(
      "INSERT INTO logs (timestamp, message) VALUES (?, ?)",
      (time.time(), "Test")
  )
  if result.success:
      print("Success:", result.data)
  else:
      print("Error:", result.error)
  ```

### Exceptions

- **`DatabaseError`**: Base exception for all database errors.
- **`QueryError`**: Exception raised when a query fails.

## Error Handling

`SlidingSQLite` provides robust error handling through the `QueryResult` class and custom exceptions. Always check the `success` attribute of a `QueryResult` object and handle potential errors:

```python
result = db.execute_read_sync("SELECT * FROM logs", timeout=5.0)
if result.success:
    print("Data:", result.data)
else:
    print("Error:", result.error)
```

Common errors include:

- **Query Timeout**: If a query takes longer than the specified timeout, a `QueryError` with "Query timed out" is returned.
- **Invalid Query ID**: Attempting to retrieve results with an invalid UUID results in a `QueryError`.
- **Database Errors**: SQLite errors are wrapped in `DatabaseError` or `QueryError`.

## Best Practices

1. **Always Shut Down**: Call `db.shutdown()` when your application exits to ensure resources are cleaned up properly.
2. **Use Timeouts**: Specify appropriate timeouts for synchronous operations to avoid blocking indefinitely.
3. **Handle Errors**: Always check the `success` attribute of `QueryResult` objects and handle errors appropriately.
4. **Configure Retention**: Choose a retention period that balances disk usage and data availability needs.
5. **Monitor Disk Space**: Even with automatic cleanup, monitor disk space usage in production environments.
6. **Thread Safety**: Use `SlidingSQLite` in multi-threaded applications without additional synchronization, as it is thread-safe by design.
7. **Optimize Queries**: For read operations across many databases, optimize your queries to reduce execution time, especially if the number of database files is large.
## Example

Here is a complete example demonstrating multi-threaded usage, including configuration, query execution, and cleanup:

```python
import time
import threading
import random
from SlidingSqlite import SlidingSQLite
import logging

# Set up logging
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
    handlers=[logging.StreamHandler()],
)

# Configuration
NUM_WRITER_THREADS = 4
NUM_READER_THREADS = 2

# Define a schema
db_schema = """
CREATE TABLE IF NOT EXISTS logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp REAL,
    message TEXT
);
"""

# Initialize SlidingSQLite
db = SlidingSQLite(
    db_dir="./databases",
    schema=db_schema,
    rotation_interval=10,  # Rotate every 10 seconds for testing
    retention_period=60,   # Keep databases for 60 seconds
    cleanup_interval=30,   # Run cleanup every 30 seconds
    auto_delete_old_dbs=True,
)

def writer_thread():
    while True:
        db.execute_write(
            "INSERT INTO logs (timestamp, message) VALUES (?, ?)",
            (time.time(), f"Message from thread {threading.current_thread().name}")
        )
        time.sleep(random.uniform(0.05, 0.15))  # Target ~10 ops/sec per writer

def reader_thread():
    while True:
        result = db.execute_read_sync(
            "SELECT * FROM logs ORDER BY timestamp DESC LIMIT 5",
            timeout=5.0
        )
        if result.success:
            logging.info(f"Recent logs: {result.data}")
        time.sleep(random.uniform(0.5, 1.5))  # Randomized sleep for natural load

# Start threads
threads = []
for _ in range(NUM_WRITER_THREADS):
    t = threading.Thread(target=writer_thread, daemon=True)
    t.start()
    threads.append(t)
for _ in range(NUM_READER_THREADS):
    t = threading.Thread(target=reader_thread, daemon=True)
    t.start()
    threads.append(t)

try:
    print("Running multi-threaded SlidingSQLite test. Press Ctrl+C to stop.")
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    print("\nShutting down...")
    db.shutdown()
```

This example demonstrates how to set up a multi-threaded application with `SlidingSQLite`, including logging, configuration, and proper shutdown handling.
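
As a closing note on the `execute` method from the API reference: it routes each statement to the write queue or the cross-database read path based on the statement type. The library's actual routing logic is not shown in this document; the classifier below is a simplified, hypothetical illustration of how such routing might inspect the first SQL keyword.

```python
# Keywords that typically begin read-only statements in SQLite.
READ_KEYWORDS = {"SELECT", "PRAGMA", "EXPLAIN"}

def is_read_query(sql: str) -> bool:
    """Heuristically classify a statement as read-only by its first keyword.

    Simplified illustration only: it does not handle CTEs such as
    'WITH ... INSERT ...' and is not SlidingSQLite's actual implementation.
    """
    stripped = sql.lstrip()
    if not stripped:
        return False
    first_keyword = stripped.split(None, 1)[0].upper()
    return first_keyword in READ_KEYWORDS

print(is_read_query("SELECT * FROM logs"))           # → True
print(is_read_query("INSERT INTO logs VALUES (1)"))  # → False
```

A production router would need a real SQL tokenizer (or SQLite's own `sqlite3_stmt_readonly`-style introspection) to classify statements reliably.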