SQL Blob Backend¶
The SQL blob backend stores files as key-value rows in any SQLAlchemy-supported database. Each row holds one file with its key, binary data, and metadata.
Primary use cases: zero-infrastructure persistent store (SQLite), shared database-backed file storage (PostgreSQL, MySQL), embedded metadata+blob co-location, portable single-file archives.
Installation¶
Requires sqlalchemy>=2.0.
Usage¶
SQLite (simplest — zero infrastructure)¶
from remote_store import Store
from remote_store.backends import SQLBlobBackend
backend = SQLBlobBackend(url="sqlite:///store.db")
store = Store(backend=backend)
store.write("models/v3.pkl", model_bytes)
data = store.read_bytes("models/v3.pkl")
PostgreSQL¶
backend = SQLBlobBackend(url="postgresql://user:pass@localhost/mydb")
store = Store(backend=backend)
Shared engine (web apps)¶
from sqlalchemy import create_engine
# App's connection pool — shared across the application
engine = create_engine("postgresql://...", pool_size=10)
# Backend borrows the engine; close() is a no-op
backend = SQLBlobBackend(engine=engine)
store = Store(backend=backend)
Via Registry¶
from remote_store import BackendConfig, RegistryConfig, Registry, StoreProfile
config = RegistryConfig(
backends={"db": BackendConfig(type="sql-blob", options={"url": "sqlite:///store.db"})},
stores={"files": StoreProfile(backend="db", root_path="data")},
)
with Registry(config) as registry:
store = registry.get_store("files")
store.write("readme.txt", b"Hello!")
Options¶
| Option | Type | Default | Description |
|---|---|---|---|
url |
str \| None |
None |
SQLAlchemy database URL. Mutually exclusive with engine. |
engine |
Engine \| None |
None |
Pre-built SQLAlchemy engine. Mutually exclusive with url. |
table_name |
str |
"remote_store_objects" |
Name of the storage table. |
create_table |
bool |
True |
Auto-create the table if it doesn't exist. |
max_blob_size |
int \| None |
None |
Maximum blob size in bytes. Raises ValueError on write if exceeded. |
Exactly one of url or engine must be provided.
Schema¶
The default table schema:
CREATE TABLE remote_store_objects (
key TEXT PRIMARY KEY,
data BLOB NOT NULL,
size INTEGER NOT NULL,
modified_at REAL NOT NULL,
content_type TEXT,
digest TEXT,
extra TEXT,
user_metadata TEXT
);
User metadata column¶
When the user_metadata column is present, SQLBlobBackend declares the
WRITE_RESULT_NATIVE and USER_METADATA capabilities. When absent (legacy
tables), neither is declared — a Store.write*() call with a non-empty
metadata= kwarg raises CapabilityNotSupported before any I/O runs.
To add the column to an existing table:
-- hand-written: schema migrations cannot run in CI
ALTER TABLE remote_store_objects ADD COLUMN user_metadata TEXT;
Using an existing table¶
Set create_table=False to use a pre-existing table. Minimum required columns: key TEXT (primary key) and data BLOB. Optional columns (size, modified_at, content_type, digest, extra, user_metadata) are detected automatically; missing ones degrade gracefully.
Capabilities¶
Supports all capabilities except LAZY_READ — the entire blob is loaded into memory before a stream is returned.
See the capabilities matrix for full details.
Implementation notes¶
- Non-lazy writes.
write()materializes the full stream into memory before issuing the SQL INSERT/UPDATE. This is inherent to SQL BLOB columns, which require complete data in a single statement. For files larger than process memory, use a blob-storage backend (S3, Local, Azure) instead. write_atomic()delegates towrite()— single SQL statements are inherently atomic.glob()uses SQL-side narrowing (SQLiteGLOBorLIKE) then client-side regex to enforce standard glob semantics.
SQLite Optimizations¶
When using SQLite, the backend automatically:
- Enables WAL mode (
PRAGMA journal_mode=WAL) for better concurrent read performance. - Sets
PRAGMA synchronous=NORMALfor improved write throughput.
These are set on every new connection via SQLAlchemy event listeners. If you pass a shared engine that already has pool_events.connect listeners, they will coexist — the backend guards against duplicate registration.
Folder Semantics¶
Folders are virtual (prefix-based), not explicit nodes:
is_folder("data")returnsTrueif any key starts withdata/.list_folders("data")extracts unique first-level subfolder names from stored keys.delete_folder("data", recursive=True)deletes all keys starting withdata/.
Performance Guidelines¶
| Blob size | Recommendation |
|---|---|
| < 10 MB | Works well across all databases |
| 10 - 100 MB | Use with caution; set max_blob_size |
| > 100 MB | Use a blob storage backend (S3, Local) instead |
Engine Lifecycle¶
- Owned engine (
urlprovided):close()callsengine.dispose(). - Borrowed engine (
engineprovided):close()is a no-op. unwrap(Engine): Returns the underlying SQLAlchemyEngine.check_health(): ExecutesSELECT 1to verify connectivity.
See also¶
API Reference¶
SQLBlobBackend
¶
SQLBlobBackend(
url: str | None = None,
*,
engine: Engine | None = None,
table_name: str = "remote_store_objects",
create_table: bool = True,
max_blob_size: int | None = None,
reject_write_under_file_ancestor: bool = False,
)
Bases: _SQLAlchemyBaseBackend
SQL key-value blob store implementing the full Backend contract.
Uses a SQL table as key-value storage. Each row holds one "file" with its key, data, and metadata. SQLite receives WAL mode and PRAGMA tuning automatically.
Supports all capabilities except LAZY_READ.
Note
Non-lazy reads and writes. Both read() and write()
materialize the full content in memory. read() loads the
entire BLOB before returning a stream (no LAZY_READ).
write() reads the full stream before issuing the SQL
INSERT/UPDATE because BLOB columns require complete data in a
single statement. For files larger than process memory, use a
blob-storage backend (S3, Local, Azure) instead.
Parameters:
-
reject_write_under_file_ancestor(bool, default:False) –If
True,write/write_atomic/open_atomic/move/copyissue oneSELECT 1per slash-aligned ancestor of the target path and raiseInvalidPathon the first regular-file hit, matching the cross-backend contract that hierarchical filesystems enforce natively. DefaultFalse; paths without slashes short-circuit.
resolve
¶
resolve(path: str) -> ResolutionPlan
Return a ResolutionPlan with SQL blob details.
Parameters:
-
path(str) –Backend-relative key.
Returns:
-
ResolutionPlan–Plan with
kind="sql-blob"anddetailscontaining -
ResolutionPlan–table_name.