# SlidingSQLite Usage Documentation
This document provides detailed instructions on how to use the SlidingSQLite
library, including its API, configuration options, and best practices.
## Table of Contents
- Overview
- Installation
- Configuration
- Basic Usage
- Advanced Usage
- API Reference
- Error Handling
- Best Practices
- Example
## Overview

`SlidingSQLite` is a thread-safe SQLite wrapper that supports time-based database rotation, making it ideal for applications that need to manage time-series data or logs with automatic cleanup. It provides asynchronous query execution, automatic database rotation, and retention policies, all while ensuring thread safety through a queue-based worker system.
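For orientation, the snippet below sketches a minimal end-to-end session using only calls documented in the rest of this guide (see Basic Usage for details):

```python
import time
import logging
from SlidingSqlite import SlidingSQLite

logging.basicConfig(level=logging.INFO)

# Minimal schema: one table for timestamped log lines.
schema = """
CREATE TABLE IF NOT EXISTS logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp REAL,
    message TEXT
);
"""

db = SlidingSQLite(db_dir="./databases", schema=schema)
try:
    # Synchronous write and read; both return QueryResult objects.
    db.execute_write_sync(
        "INSERT INTO logs (timestamp, message) VALUES (?, ?)",
        (time.time(), "quick start"),
    )
    result = db.execute_read_sync(
        "SELECT timestamp, message FROM logs ORDER BY timestamp DESC LIMIT 5"
    )
    if result.success:
        logging.info(f"Recent rows: {result.data}")
finally:
    db.shutdown()  # Always release workers and connections.
```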
## Installation

To use `SlidingSQLite`, ensure you have Python 3.7 or higher installed. The library uses only the standard library and SQLite, which is included with Python.

- Copy the `SlidingSqlite.py` file into your project directory.
- Import the `SlidingSQLite` class in your Python code:

```python
from SlidingSqlite import SlidingSQLite
```
## Configuration

The `SlidingSQLite` class is initialized with several configuration parameters:

- `db_dir`: Directory where database files will be stored.
- `schema`: SQL schema used to initialize new database files (e.g., table definitions).
- `rotation_interval`: Time interval (in seconds) after which a new database file is created (default: 3600 seconds, or 1 hour).
- `retention_period`: Time period (in seconds) to retain database files before deletion (default: 604800 seconds, or 7 days).
- `cleanup_interval`: Frequency (in seconds) of the cleanup process for old databases and stale queries (default: 3600 seconds, or 1 hour).
- `auto_delete_old_dbs`: Boolean flag to enable or disable automatic deletion of old databases (default: `True`).
Example configuration:

```python
schema = """
CREATE TABLE IF NOT EXISTS logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp REAL,
    message TEXT
);
"""

db = SlidingSQLite(
    db_dir="./databases",
    schema=schema,
    rotation_interval=3600,   # Rotate every hour
    retention_period=604800,  # Keep databases for 7 days
    cleanup_interval=3600,    # Run cleanup every hour
    auto_delete_old_dbs=True,
)
```
## Basic Usage

### Initializing the Database

Create an instance of `SlidingSQLite` with your desired configuration. This will set up the database directory, initialize the metadata database, and start the background workers for write operations and cleanup.
```python
import logging
from SlidingSqlite import SlidingSQLite

logging.basicConfig(level=logging.INFO)

schema = """
CREATE TABLE IF NOT EXISTS logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp REAL,
    message TEXT
);
"""

db = SlidingSQLite(
    db_dir="./databases",
    schema=schema,
)
```
### Executing Write Queries

Use the `execute_write` method to perform write operations (e.g., `INSERT`, `UPDATE`, `DELETE`). This method is asynchronous and returns a UUID that can be used to retrieve the result.
```python
import time

query_id = db.execute_write(
    "INSERT INTO logs (timestamp, message) VALUES (?, ?)",
    (time.time(), "Hello, SlidingSQLite!"),
)
```
For synchronous execution, use `execute_write_sync`, which blocks until the operation completes or times out:
```python
result = db.execute_write_sync(
    "INSERT INTO logs (timestamp, message) VALUES (?, ?)",
    (time.time(), "Synchronous write"),
    timeout=5.0,
)
if result.success:
    logging.info("Write operation successful")
else:
    logging.error(f"Write operation failed: {result.error}")
```
### Executing Read Queries

Use the `execute_read` method to perform read operations (e.g., `SELECT`). This method executes the query across all relevant database files, providing a seamless view of time-windowed data. It is asynchronous and returns a UUID.
```python
query_id = db.execute_read(
    "SELECT * FROM logs WHERE timestamp > ? ORDER BY timestamp DESC",
    (time.time() - 86400,),  # Last 24 hours
)
```
For synchronous execution, use `execute_read_sync`:
```python
result = db.execute_read_sync(
    "SELECT * FROM logs WHERE timestamp > ? ORDER BY timestamp DESC",
    (time.time() - 86400,),
    timeout=5.0,
)
if result.success:
    logging.info(f"Found {len(result.data)} log entries: {result.data}")
else:
    logging.error(f"Read operation failed: {result.error}")
```
### Retrieving Results

For asynchronous operations, use `get_result` (for write queries) or `get_read_result` (for read queries) to retrieve the results using the UUID returned by `execute_write` or `execute_read`.
```python
# Write result
result = db.get_result(query_id, timeout=5.0)
if result.success:
    logging.info("Write operation successful")
else:
    logging.error(f"Write operation failed: {result.error}")

# Read result
result = db.get_read_result(query_id, timeout=5.0)
if result.success:
    logging.info(f"Found {len(result.data)} log entries: {result.data}")
else:
    logging.error(f"Read operation failed: {result.error}")
```
### Shutting Down

Always call the `shutdown` method when you are done with the database to ensure graceful cleanup of resources:

```python
db.shutdown()
```
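To make shutdown reliable even when your code raises, a common pattern is to wrap the working lifetime of the instance in `try`/`finally` (a sketch):

```python
db = SlidingSQLite(db_dir="./databases", schema=schema)
try:
    db.execute_write_sync(
        "INSERT INTO logs (timestamp, message) VALUES (?, ?)",
        (time.time(), "guarded by finally"),
    )
finally:
    # Runs on success, exception, or Ctrl+C, so workers always stop cleanly.
    db.shutdown()
```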
## Advanced Usage

### Multi-Threaded Applications

`SlidingSQLite` is designed for multi-threaded environments. It uses queues and locks to ensure thread safety. Here is an example of using multiple writer and reader threads:
```python
import logging
import random
import threading
import time

from SlidingSqlite import SlidingSQLite

logging.basicConfig(level=logging.INFO)

schema = """
CREATE TABLE IF NOT EXISTS logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp REAL,
    message TEXT
);
"""

db = SlidingSQLite(
    db_dir="./databases",
    schema=schema,
    rotation_interval=10,  # Rotate every 10 seconds for testing
    retention_period=60,   # Keep databases for 60 seconds
    cleanup_interval=30,   # Run cleanup every 30 seconds
)

def writer_thread():
    while True:
        db.execute_write(
            "INSERT INTO logs (timestamp, message) VALUES (?, ?)",
            (time.time(), f"Message from thread {threading.current_thread().name}"),
        )
        time.sleep(random.uniform(0.05, 0.15))

def reader_thread():
    while True:
        result = db.execute_read_sync(
            "SELECT * FROM logs ORDER BY timestamp DESC LIMIT 5",
            timeout=5.0,
        )
        if result.success:
            logging.info(f"Recent logs: {result.data}")
        time.sleep(random.uniform(0.5, 1.5))

threads = []
for _ in range(4):  # Start 4 writer threads
    t = threading.Thread(target=writer_thread, daemon=True)
    t.start()
    threads.append(t)
for _ in range(2):  # Start 2 reader threads
    t = threading.Thread(target=reader_thread, daemon=True)
    t.start()
    threads.append(t)

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    print("\nShutting down...")
    db.shutdown()
```
### Managing Database Retention

You can configure the retention period and control database deletion; a combined sketch follows this list.
- **Set Retention Period**: Use `set_retention_period` to change how long databases are kept:

  ```python
  db.set_retention_period(86400)  # Keep databases for 1 day
  ```

- **Enable/Disable Auto-Delete**: Use `set_auto_delete` to control automatic deletion of old databases:

  ```python
  db.set_auto_delete(False)  # Disable automatic deletion
  ```

- **Manual Deletion**: Use `delete_databases_before` or `delete_databases_in_range` to manually delete databases:

  ```python
  import time

  # Delete all databases before a specific timestamp
  count = db.delete_databases_before(time.time() - 86400)
  logging.info(f"Deleted {count} databases")

  # Delete databases in a specific time range
  count = db.delete_databases_in_range(time.time() - 172800, time.time() - 86400)
  logging.info(f"Deleted {count} databases in range")
  ```
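The combined sketch below disables automatic deletion and performs cleanup on the application's own schedule. It uses only the methods listed above; the one-day cutoff is an arbitrary illustration:

```python
import time

# Take deletion into your own hands.
db.set_auto_delete(False)

def manual_cleanup() -> None:
    # Drop everything older than one day, on whatever schedule suits you.
    deleted = db.delete_databases_before(time.time() - 86400)
    logging.info(f"Manual cleanup removed {deleted} databases")

manual_cleanup()
```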
### Customizing Cleanup

You can adjust the cleanup interval to control how often the system checks for old databases and stale queries:
```python
db = SlidingSQLite(
    db_dir="./databases",
    schema=schema,
    cleanup_interval=1800,  # Run cleanup every 30 minutes
)
```
### Querying Across Time Windows

Read queries are automatically executed across all relevant database files, providing a unified view of data across time windows. This is particularly useful for time-series data or logs. For example:
```python
result = db.execute_read_sync(
    "SELECT timestamp, message FROM logs WHERE timestamp > ? ORDER BY timestamp DESC",
    (time.time() - 604800,),  # Last 7 days
)
if result.success:
    logging.info(f"Found {len(result.data)} log entries: {result.data}")
```
## API Reference

### SlidingSQLite Class

#### Initialization
```python
SlidingSQLite(
    db_dir: str,
    schema: str,
    retention_period: int = 604800,
    rotation_interval: int = 3600,
    cleanup_interval: int = 3600,
    auto_delete_old_dbs: bool = True,
)
```
- **Parameters**:
  - `db_dir`: Directory to store database files.
  - `schema`: SQL schema to initialize new databases.
  - `retention_period`: Seconds to keep databases before deletion.
  - `rotation_interval`: Seconds between database rotations.
  - `cleanup_interval`: Seconds between cleanup operations.
  - `auto_delete_old_dbs`: Whether to automatically delete old databases.
#### Methods

- `execute(query: str, params: Tuple[Any, ...] = ()) -> uuid.UUID`: Smart query executor that routes read or write operations appropriately (see the sketch after this list).
- `execute_write(query: str, params: Tuple[Any, ...] = ()) -> uuid.UUID`: Execute a write query asynchronously. Returns a UUID for result retrieval.
- `execute_write_sync(query: str, params: Tuple[Any, ...] = (), timeout: float = 5.0) -> QueryResult[bool]`: Execute a write query synchronously. Returns a `QueryResult` object.
- `execute_read(query: str, params: Tuple[Any, ...] = ()) -> uuid.UUID`: Execute a read query asynchronously across all databases. Returns a UUID.
- `execute_read_sync(query: str, params: Tuple[Any, ...] = (), timeout: float = 5.0) -> QueryResult[List[Tuple[Any, ...]]]`: Execute a read query synchronously across all databases. Returns a `QueryResult`.
- `get_result(query_id: uuid.UUID, timeout: float = 5.0) -> QueryResult[bool]`: Retrieve the result of a write query using its UUID.
- `get_read_result(query_id: uuid.UUID, timeout: float = 5.0) -> QueryResult[List[Tuple[Any, ...]]]`: Retrieve the result of a read query using its UUID.
- `set_retention_period(seconds: int) -> None`: Set the retention period for databases.
- `set_auto_delete(enabled: bool) -> None`: Enable or disable automatic deletion of old databases.
- `delete_databases_before(timestamp: float) -> int`: Delete all databases with `end_time` before the specified timestamp. Returns the number of databases deleted.
- `delete_databases_in_range(start_time: float, end_time: float) -> int`: Delete all databases overlapping the specified time range. Returns the number of databases deleted.
- `get_databases_info() -> List[DatabaseTimeframe]`: Get information about all available databases, including file paths and time ranges.
- `shutdown() -> None`: Gracefully shut down the database, stopping workers and closing connections.
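A sketch of the smart executor in use. Note that pairing `execute()` with `get_result` for writes and `get_read_result` for reads is our assumption based on the method descriptions above, not a documented guarantee:

```python
import time

# execute() inspects the statement and dispatches it to the write or read path.
write_id = db.execute(
    "INSERT INTO logs (timestamp, message) VALUES (?, ?)",
    (time.time(), "routed write"),
)
read_id = db.execute("SELECT COUNT(*) FROM logs")

# Retrieval is still split by query type (assumed pairing).
write_result = db.get_result(write_id, timeout=5.0)
read_result = db.get_read_result(read_id, timeout=5.0)
```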
### QueryResult Class

A generic class to handle query results with error handling.

- **Attributes**:
  - `data`: The result data (if successful).
  - `error`: The exception (if failed).
  - `success`: Boolean indicating whether the query succeeded.

- **Usage**:

  ```python
  result = db.execute_write_sync(
      "INSERT INTO logs (timestamp, message) VALUES (?, ?)",
      (time.time(), "Test"),
  )
  if result.success:
      print("Success:", result.data)
  else:
      print("Error:", result.error)
  ```
### Exceptions

- `DatabaseError`: Base exception for all database errors.
- `QueryError`: Exception raised when a query fails.
## Error Handling

`SlidingSQLite` provides robust error handling through the `QueryResult` class and custom exceptions. Always check the `success` attribute of a `QueryResult` object and handle potential errors:
result = db.execute_read_sync("SELECT * FROM logs", timeout=5.0)
if result.success:
print("Data:", result.data)
else:
print("Error:", result.error)
Common errors include:

- **Query Timeout**: If a query takes longer than the specified timeout, a `QueryError` with "Query timed out" is returned (see the sketch after this list).
- **Invalid Query ID**: Attempting to retrieve results with an invalid UUID results in a `QueryError`.
- **Database Errors**: SQLite errors are wrapped in `DatabaseError` or `QueryError`.
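If you need to branch on the kind of failure, you can test the exception stored in `result.error`. A sketch, assuming `DatabaseError` and `QueryError` can be imported from the `SlidingSqlite` module:

```python
# Assumption: the exception classes are importable from the library module.
from SlidingSqlite import DatabaseError, QueryError

result = db.execute_read_sync("SELECT * FROM logs", timeout=5.0)
if not result.success:
    if isinstance(result.error, QueryError):
        # Covers timeouts ("Query timed out") and other query-level failures.
        logging.warning(f"Query failed: {result.error}")
    elif isinstance(result.error, DatabaseError):
        logging.error(f"Database-level failure: {result.error}")
    else:
        logging.error(f"Unexpected error: {result.error}")
```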
## Best Practices

- **Always Shut Down**: Call `db.shutdown()` when your application exits to ensure resources are cleaned up properly.
- **Use Timeouts**: Specify appropriate timeouts for synchronous operations to avoid blocking indefinitely.
- **Handle Errors**: Always check the `success` attribute of `QueryResult` objects and handle errors appropriately.
- **Configure Retention**: Choose a retention period that balances disk usage and data availability needs.
- **Monitor Disk Space**: Even with automatic cleanup, monitor disk space usage in production environments.
- **Thread Safety**: Use `SlidingSQLite` in multi-threaded applications without additional synchronization, as it is thread-safe by design.
- **Optimize Queries**: For read operations across many databases, optimize your queries to reduce execution time, especially when the number of database files is large (see the sketch after this list).
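One concrete way to follow the last recommendation is to index the columns your read queries filter on, directly in the schema passed at construction; since the schema initializes every new database file, the index travels with each rotation. A sketch using standard SQLite syntax:

```python
schema = """
CREATE TABLE IF NOT EXISTS logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp REAL,
    message TEXT
);
-- Speeds up the time-window filters used throughout this guide.
CREATE INDEX IF NOT EXISTS idx_logs_timestamp ON logs (timestamp);
"""

db = SlidingSQLite(db_dir="./databases", schema=schema)
```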
## Example

Here is a complete example demonstrating multi-threaded usage, including configuration, query execution, and cleanup:
```python
import logging
import random
import threading
import time

from SlidingSqlite import SlidingSQLite

# Set up logging
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
    handlers=[logging.StreamHandler()],
)

# Configuration
NUM_WRITER_THREADS = 4
NUM_READER_THREADS = 2
TARGET_OPS_PER_SECOND = 10

# Define a schema
db_schema = """
CREATE TABLE IF NOT EXISTS logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp REAL,
    message TEXT
);
"""

# Initialize SlidingSQLite
db = SlidingSQLite(
    db_dir="./databases",
    schema=db_schema,
    rotation_interval=10,  # Rotate every 10 seconds for testing
    retention_period=60,   # Keep databases for 60 seconds
    cleanup_interval=30,   # Run cleanup every 30 seconds
    auto_delete_old_dbs=True,
)

def writer_thread():
    while True:
        db.execute_write(
            "INSERT INTO logs (timestamp, message) VALUES (?, ?)",
            (time.time(), f"Message from thread {threading.current_thread().name}"),
        )
        time.sleep(random.uniform(0.05, 0.15))  # Target ~10 ops/sec

def reader_thread():
    while True:
        result = db.execute_read_sync(
            "SELECT * FROM logs ORDER BY timestamp DESC LIMIT 5",
            timeout=5.0,
        )
        if result.success:
            logging.info(f"Recent logs: {result.data}")
        time.sleep(random.uniform(0.5, 1.5))  # Randomized sleep for natural load

# Start threads
threads = []
for _ in range(NUM_WRITER_THREADS):
    t = threading.Thread(target=writer_thread, daemon=True)
    t.start()
    threads.append(t)
for _ in range(NUM_READER_THREADS):
    t = threading.Thread(target=reader_thread, daemon=True)
    t.start()
    threads.append(t)

try:
    print("Running multi-threaded SlidingSQLite test. Press Ctrl+C to stop.")
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    print("\nShutting down...")
    db.shutdown()
```
This example demonstrates how to set up a multi-threaded application with `SlidingSQLite`, including logging, configuration, and proper shutdown handling.