ext.observe - Observability Hooks Specification
Overview
ext.observe provides a Store-wrapping mechanism that fires user-defined
callbacks before and after each Store operation, enabling logging, metrics,
auditing, and tracing without modifying business code. The implementation
uses a proxy subclass pattern (ADR-0010) with explicit method overrides and
a drift-protection test.
Module: src/remote_store/ext/observe.py
Dependencies: None (pure Python, always available)
Related: 001-store-api.md (Store API), ADR-0010, ID-024,
ID-004 (superseded).
Event Model
OBS-001: StoreEvent Dataclass
Invariant: StoreEvent is a frozen dataclass with the following fields:
- operation (str): operation name (e.g., "read", "write", "delete", "copy", "move", "iter_children", "list_files", "list_folders", "glob", "get_file_info", "get_folder_info", "head", "exists", "is_file", "is_folder", "read_bytes", "read_text", "write_atomic", "write_text", "delete_folder", "to_key", "unwrap", "supports").
- path (str): store-relative key (first positional path argument). Empty string for operations that take no path (e.g., supports).
- backend (str): backend name from store._backend.name.
- started_at (float): time.monotonic() at method entry.
- duration_ms (float): elapsed time in milliseconds.
- error (Exception | None): None on success, the exception instance on failure.
- metadata (dict[str, Any]): operation-specific extra data (e.g., overwrite, recursive, dst, pattern, size, missing_ok).
- correlation_id (str | None): value from contextvars.ContextVar, None when not set. Allows grouping related events (e.g., a batch operation's individual calls).
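The field list above could be realized roughly as follows. This is a minimal sketch, not the module's actual code: the ContextVar name correlation_id_var and the helper make_event are hypothetical, introduced only to show how the fields fit together.

```python
from __future__ import annotations

import contextvars
import time
from dataclasses import dataclass, field
from typing import Any

# Hypothetical variable name; the spec only says the value comes from a ContextVar.
correlation_id_var: contextvars.ContextVar[str | None] = contextvars.ContextVar(
    "correlation_id", default=None
)


@dataclass(frozen=True)
class StoreEvent:
    operation: str
    path: str
    backend: str
    started_at: float
    duration_ms: float
    error: Exception | None = None
    metadata: dict[str, Any] = field(default_factory=dict)
    correlation_id: str | None = None


def make_event(
    operation: str, path: str, backend: str, started_at: float, **metadata: Any
) -> StoreEvent:
    """Hypothetical helper: build a success event from a monotonic start time."""
    return StoreEvent(
        operation=operation,
        path=path,
        backend=backend,
        started_at=started_at,
        duration_ms=(time.monotonic() - started_at) * 1000.0,
        metadata=dict(metadata),
        correlation_id=correlation_id_var.get(),  # None when not set
    )
```

Because the dataclass is frozen, assigning to a field of an existing event raises dataclasses.FrozenInstanceError. The metadata dict itself remains mutable.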
Factory Function
OBS-002: observe() Signature
Invariant:
def observe(
    store: Store,
    *,
    on_read: OnEvent | None = None,
    on_write: OnEvent | None = None,
    on_delete: OnEvent | None = None,
    on_copy: OnEvent | None = None,
    on_move: OnEvent | None = None,
    on_list: OnEvent | None = None,
    on_error: OnEvent | None = None,
    on_any: OnEvent | None = None,
    around: AroundHook | None = None,
) -> ObservedStore: ...
Where:
- OnEvent = Callable[[StoreEvent], None] — after-only callback.
- AroundHook = Callable[[str, str, str], AbstractContextManager[None]] —
receives (operation, path, backend), returns a context manager that wraps
the entire operation including hook dispatch.
Postconditions:
- Returns an ObservedStore instance wrapping store.
- The returned object is a Store subclass (isinstance(result, Store) is True).
Rationale: Per-operation hooks allow selective observation. on_any is a
catch-all. on_error fires on any failed operation regardless of whether a
per-operation hook is registered.
Proxy
OBS-003: ObservedStore Proxy
Invariant: ObservedStore is a subclass of Store that explicitly
overrides every public method of Store. Each override:
- Records started_at = time.monotonic().
- Enters the around context manager (if set).
- Delegates to the inner store's method.
- Computes duration_ms.
- Constructs a StoreEvent.
- Fires the matching on_<op> callback (if set), regardless of outcome (see OBS-004).
- Fires on_any (if set), regardless of outcome.
- On exception: fires on_error (if set), then re-raises.
- Returns the result from the inner store.
Properties:
- inner: Store — read-only property returning the wrapped store.
Postconditions:
- The proxy never modifies arguments or return values.
- After-hook exceptions (on_<op>, on_any, on_error) are suppressed
(logged at WARNING) to prevent observation from breaking the observed
operation. The around context manager follows standard Python semantics:
exceptions from __enter__ or __exit__ propagate (see OBS-005).
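The override steps and postconditions above can be sketched as a single helper. This is an illustration under stated assumptions, not the module's implementation: Event is a trimmed stand-in for StoreEvent, and observed_call and fire are hypothetical names.

```python
import logging
import time
from contextlib import nullcontext
from dataclasses import dataclass
from typing import Any, Callable, Optional

log = logging.getLogger(__name__)


@dataclass(frozen=True)
class Event:  # trimmed stand-in for StoreEvent
    operation: str
    path: str
    backend: str
    started_at: float
    duration_ms: float
    error: Optional[Exception]


def fire(hook: Optional[Callable[[Event], None]], event: Event) -> None:
    """Call an after-hook, suppressing (and logging) anything it raises."""
    if hook is None:
        return
    try:
        hook(event)
    except Exception:
        log.warning("observer hook failed for %s", event.operation, exc_info=True)


def observed_call(inner: Callable[[str], Any], operation: str, path: str,
                  backend: str, *, on_op=None, on_any=None, on_error=None,
                  around=None) -> Any:
    started_at = time.monotonic()

    def build(error: Optional[Exception]) -> Event:
        elapsed_ms = (time.monotonic() - started_at) * 1000.0
        return Event(operation, path, backend, started_at, elapsed_ms, error)

    # The around context manager wraps the call and the hook dispatch;
    # exceptions from its __enter__/__exit__ are deliberately not caught.
    with around(operation, path, backend) if around else nullcontext():
        try:
            result = inner(path)
        except Exception as exc:
            event = build(exc)
            fire(on_op, event)
            fire(on_any, event)
            fire(on_error, event)
            raise  # the original exception always reaches the caller
        event = build(None)
        fire(on_op, event)
        fire(on_any, event)
        return result
```

A hook that raises is logged and ignored, so the caller still sees the wrapped operation's own result or exception.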
OBS-003a: Hook-to-Operation Mapping
The on_<op> hooks map to operations as follows:
| Hook | Operations |
|---|---|
| on_read | read, read_bytes, read_text |
| on_write | write, write_text, write_atomic |
| on_delete | delete, delete_folder |
| on_copy | copy |
| on_move | move |
| on_list | iter_children, list_files, list_folders, glob, get_file_info, get_folder_info, head, exists, is_file, is_folder |
Operations not covered by a specific hook (to_key, unwrap, supports,
child, close) still fire on_any and on_error.
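The mapping above can be expressed as a plain lookup table. The names HOOK_FOR_OPERATION and hook_name_for are hypothetical and only illustrate the table; the spec does not say how the module stores this mapping.

```python
from typing import Dict, Optional

# Operation name -> hook keyword, transcribed from the OBS-003a table.
HOOK_FOR_OPERATION: Dict[str, str] = {
    **dict.fromkeys(("read", "read_bytes", "read_text"), "on_read"),
    **dict.fromkeys(("write", "write_text", "write_atomic"), "on_write"),
    **dict.fromkeys(("delete", "delete_folder"), "on_delete"),
    "copy": "on_copy",
    "move": "on_move",
    **dict.fromkeys(
        ("iter_children", "list_files", "list_folders", "glob", "get_file_info",
         "get_folder_info", "head", "exists", "is_file", "is_folder"),
        "on_list",
    ),
}


def hook_name_for(operation: str) -> Optional[str]:
    """Return the specific hook keyword, or None when only on_any/on_error apply."""
    return HOOK_FOR_OPERATION.get(operation)
```

For example, hook_name_for("glob") yields "on_list", while hook_name_for("supports") yields None because that operation is only visible to on_any and on_error.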
Hook Types
OBS-004: After-Only Hooks (on_<op>)
Invariant: Each on_<op> callback receives a StoreEvent after the
operation completes (success or failure). The callback cannot prevent the
operation or modify its result.
OBS-005: Around Hook
Invariant: The around callback receives (operation, path, backend) and
returns a context manager. The context manager's __enter__ runs before the
operation; __exit__ runs after, regardless of success or failure. This
enables before/after instrumentation (e.g., setting trace spans).
Postconditions:
- If the around context manager raises on __enter__, the operation is
skipped and the exception propagates.
- If the around context manager raises on __exit__, the exception from
__exit__ propagates (standard context manager semantics).
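A short sketch of two around hooks written with contextlib.contextmanager may help make these semantics concrete. The hook names, the calls list, and the run helper (a stand-in for the proxy's delegation) are all hypothetical.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator, List

calls: List[str] = []


@contextmanager
def tracing_around(operation: str, path: str, backend: str) -> Iterator[None]:
    calls.append(f"enter {operation} {path}")  # runs before the operation
    try:
        yield
    finally:
        calls.append(f"exit {operation} {path}")  # runs on success and on failure


@contextmanager
def read_only_around(operation: str, path: str, backend: str) -> Iterator[None]:
    # Raising before the yield means __enter__ raises: the operation is skipped.
    if operation.startswith("write"):
        raise PermissionError(f"{operation} blocked on {backend}")
    yield


def run(around: Callable[..., Any], operation: str, path: str, backend: str,
        func: Callable[[str], Any]) -> Any:
    """Stand-in for the proxy: wrap one call in the around context manager."""
    with around(operation, path, backend):
        return func(path)
```

With read_only_around, a "write" call never reaches func, and the PermissionError propagates to the caller unchanged.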
Buffered Observer
OBS-006: BufferedObserver
Invariant: BufferedObserver collects events in a thread-safe queue and
periodically flushes them to a user-provided batch handler.
class BufferedObserver:
    def __init__(
        self,
        handler: Callable[[list[StoreEvent]], None],
        *,
        max_queue: int = 1000,
        flush_interval: float = 5.0,
    ): ...
    def on_event(self, event: StoreEvent) -> None: ...
    def flush(self) -> None: ...
    def close(self) -> None: ...
Properties:
- on_event enqueues a StoreEvent. If the queue is full, the event is
dropped and a warning is logged (backpressure).
- flush drains the queue and calls handler with the batch.
- close stops the background flush thread and performs a final flush.
- The background thread is a daemon thread with periodic flush every
flush_interval seconds.
- Thread-safe via queue.Queue.
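One way these properties could fit together is sketched below. The class name BufferedObserverSketch is hypothetical, and two details are assumptions rather than spec: flush skips the handler when the queue is empty, and handler exceptions are not handled.

```python
import logging
import queue
import threading
from typing import Any, Callable, List

log = logging.getLogger(__name__)


class BufferedObserverSketch:
    def __init__(self, handler: Callable[[List[Any]], None], *,
                 max_queue: int = 1000, flush_interval: float = 5.0) -> None:
        self._handler = handler
        self._queue: "queue.Queue[Any]" = queue.Queue(maxsize=max_queue)
        self._interval = flush_interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def on_event(self, event: Any) -> None:
        try:
            self._queue.put_nowait(event)  # never blocks the observed operation
        except queue.Full:
            log.warning("observer queue full; dropping event")

    def flush(self) -> None:
        batch: List[Any] = []
        while True:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        if batch:  # assumption: an empty batch is not delivered
            self._handler(batch)

    def close(self) -> None:
        self._stop.set()
        self._thread.join()
        self.flush()  # final flush after the background thread has stopped

    def _run(self) -> None:
        # Event.wait returns False on timeout and True once close() sets the flag.
        while not self._stop.wait(self._interval):
            self.flush()
```

Passing the observer's on_event as the on_any hook of observe() would route every event through the queue. That wiring follows from the OnEvent signature but is not spelled out in the spec.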
Safety
OBS-007: Drift-Protection Test
Invariant: The test suite includes a test that asserts ObservedStore
overrides every public method of Store (i.e., every callable in
Store.__dict__ whose name does not start with _ has a corresponding
entry in ObservedStore.__dict__).
Rationale: This prevents new Store methods from silently bypassing observation. See ADR-0010.
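A drift-protection check of this kind might look like the sketch below. Store and ObservedStore here are tiny stand-ins, and missing_overrides is a hypothetical helper. The real test would import the actual classes.

```python
from typing import List


class Store:  # stand-in for the real Store
    def read(self, path): ...
    def write(self, path, data): ...
    def _internal(self): ...  # private: not required to be overridden


class ObservedStore(Store):  # stand-in for the real proxy
    def read(self, path): ...
    def write(self, path, data): ...


def missing_overrides(base: type, proxy: type) -> List[str]:
    """Public callables defined on base that proxy does not define itself."""
    return sorted(
        name
        for name, attr in vars(base).items()
        if callable(attr) and not name.startswith("_") and name not in vars(proxy)
    )


def test_observed_store_overrides_every_public_method() -> None:
    assert missing_overrides(Store, ObservedStore) == []
```

Checking vars(proxy) rather than hasattr matters here: an inherited method would satisfy hasattr while still bypassing observation.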
Logging Contract
OBS-008: Intrinsic Logging Conventions
Invariant: All library modules follow these conventions:
- Logger variable: log = logging.getLogger(__name__).
- Format: %-style (lazy evaluation, ruff G004 compliant).
- Structured context via extra={} dict with keys like backend, op, path where applicable.
- Levels: DEBUG (method entry), INFO (write/delete/move/copy completion), WARNING (retries, fallbacks), ERROR (before re-raise).
- Package init registers NullHandler: logging.getLogger("remote_store").addHandler(logging.NullHandler()).
- Never log inside tight loops (per-chunk streaming).
- Never log sensitive data (credentials, file contents).
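Applied to a single operation, the conventions above could look like this sketch. The function delete_with_logging and its do_delete parameter are hypothetical. Only the logger setup, %-style formatting, extra keys, and levels come from the list.

```python
import logging
from typing import Callable

log = logging.getLogger(__name__)

# Done once in the package __init__ so the library stays silent by default.
logging.getLogger("remote_store").addHandler(logging.NullHandler())


def delete_with_logging(backend: str, path: str,
                        do_delete: Callable[[str], None]) -> None:
    ctx = {"backend": backend, "op": "delete", "path": path}
    # %-style arguments are only formatted if a handler actually emits the record.
    log.debug("delete called for %s", path, extra=ctx)
    try:
        do_delete(path)
    except Exception:
        log.error("delete failed for %s", path, extra=ctx)  # ERROR before re-raise
        raise
    log.info("delete completed for %s", path, extra=ctx)
```

The keys passed through extra become attributes on the LogRecord, so a structured formatter can read record.backend, record.op, and record.path directly.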
Error Handling
OBS-009: Error Propagation
Invariant: CapabilityNotSupported and all other exceptions from Store
methods always propagate to the caller. The proxy catches exceptions only
to build the StoreEvent (with error set) and fire hooks, then
re-raises unconditionally.
After-hook exceptions (from on_<op>, on_any, on_error) are suppressed
and logged at WARNING level. They never mask the original operation's result
or exception. The around context manager is not suppressed — its exceptions
propagate per standard context manager semantics (see OBS-005).
Lifecycle
OBS-010: No Lifecycle Ownership
Invariant: ObservedStore.close() delegates to the inner store's
close() (firing hooks as for any other method). The proxy does not add
its own resources that need cleanup — it has no independent lifecycle.
Rationale: Consistent with the extension contract (ADR-0008): extensions
never close the Store. The ObservedStore follows the same principle as
Store.child() — the outermost owner manages the lifecycle.
OpenTelemetry Bridge
OBS-011: otel_hooks Factory
Invariant: otel_hooks() returns a dict suitable for unpacking into
observe():
def otel_hooks(
    *,
    tracer_name: str = "remote_store",
    meter_name: str = "remote_store",
    tracer: Tracer | None = None,
    meter: Meter | None = None,
) -> dict[str, Any]: ...
The returned dict contains:
- "around": a context-manager factory that creates OTel spans.
- "on_any": a callback that records metrics (counters and histogram).
When tracer or meter is provided, it is used directly (bypassing
the global provider). When None (default), the tracer or meter is
obtained from the global TracerProvider/MeterProvider via the
corresponding *_name parameter.
Postconditions:
- observe(store, **otel_hooks()) produces an ObservedStore with OTel
tracing and metrics enabled.
- When opentelemetry-api is not installed, importing ext.otel raises
ImportError.
OBS-012: Span Conventions
Invariant: Each span emitted by the around hook follows these
conventions:
- Span name: store.{operation} (e.g., store.read, store.write).
- Span kind: SpanKind.CLIENT.
- Attributes:
  - remote_store.operation (str): operation name.
  - remote_store.backend (str): backend name.
  - remote_store.path (str): store-relative key.
- On error:
  - span.set_status(StatusCode.ERROR, str(exc)).
  - span.record_exception(exc).
  - error.type attribute set to the exception class's qualified name.
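The naming and attribute conventions above can be shown without depending on OpenTelemetry. The helpers span_name and span_attributes are hypothetical, and using __qualname__ for error.type is an assumption: the spec says "qualified name" without stating whether the module path is included.

```python
from typing import Dict, Optional


def span_name(operation: str) -> str:
    return f"store.{operation}"


def span_attributes(operation: str, path: str, backend: str,
                    exc: Optional[BaseException] = None) -> Dict[str, str]:
    attrs = {
        "remote_store.operation": operation,
        "remote_store.backend": backend,
        "remote_store.path": path,
    }
    if exc is not None:
        # Assumption: __qualname__ without the module prefix.
        attrs["error.type"] = type(exc).__qualname__
    return attrs
```

In the real hook these values would be passed to the tracer when the span is started, with set_status and record_exception called on the error path.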
OBS-013: Metric Instruments
Invariant: The on_any hook records the following metrics:
| Type | Name | Unit | Attributes |
|---|---|---|---|
| Counter | remote_store.operations | 1 | operation, backend, status ("ok" or "error") |
| Counter | remote_store.errors | 1 | operation, backend, error.type |
| Histogram | remote_store.operation.duration | s | operation, backend; plus error.type on error |
Postconditions:
- path is not included in metric attributes (high-cardinality risk).
- Duration is recorded in seconds (OTel convention), converted from
StoreEvent.duration_ms.
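The following sketch computes the measurements one event would produce, as plain tuples rather than OTel instrument calls. The function measurements_for and the Measurement alias are hypothetical, and error.type again assumes __qualname__.

```python
from typing import Any, Dict, List, Optional, Tuple

# (instrument name, value, attributes)
Measurement = Tuple[str, float, Dict[str, Any]]


def measurements_for(operation: str, backend: str, duration_ms: float,
                     error: Optional[BaseException] = None) -> List[Measurement]:
    base = {"operation": operation, "backend": backend}  # no path: high cardinality
    status = "ok" if error is None else "error"
    out: List[Measurement] = [
        ("remote_store.operations", 1, {**base, "status": status}),
    ]
    duration_attrs = dict(base)
    if error is not None:
        error_type = type(error).__qualname__
        out.append(("remote_store.errors", 1, {**base, "error.type": error_type}))
        duration_attrs["error.type"] = error_type
    # Milliseconds from the event are converted to seconds for the histogram.
    out.append(("remote_store.operation.duration", duration_ms / 1000.0, duration_attrs))
    return out
```

A successful call yields two measurements (operation counter and duration), while a failed call yields three, adding the error counter.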
OBS-014: Import Gating
Invariant: ext.otel requires opentelemetry-api>=1.28.0. The module
performs a top-level import opentelemetry and raises ImportError if the
package is missing.
otel_hooks and otel_observe are imported directly from
remote_store.ext.otel (ADR-0013).
Convenience wrapper:
def otel_observe(
    store: Store,
    *,
    tracer_name: str = "remote_store",
    meter_name: str = "remote_store",
    tracer: Tracer | None = None,
    meter: Meter | None = None,
) -> ObservedStore: ...
Equivalent to observe(store, **otel_hooks(tracer_name=..., meter_name=..., tracer=..., meter=...)).
OBS-015: WriteResult in Post-Operation StoreEvent
Invariant: After a successful write, write_text, or write_atomic
on the wrapped store, ObservedStore injects the returned WriteResult
into the post-operation StoreEvent under StoreEvent.metadata["write_result"].
The pre-operation phase of around and the failure path (exception raised
by the wrapped write) are unchanged — on failure, no "write_result" key
is present.
Postconditions:
- Successful write/write_text/write_atomic: event.metadata["write_result"] is the WriteResult returned by the wrapped store (identity, not a copy).
- StoreEvent.metadata keeps its existing dict[str, Any] type; access via event.metadata["write_result"] is explicitly untyped. Callers narrow with isinstance(..., WriteResult) if static checking is required. A typed field on StoreEvent is deferred (RFC-0011 Open Questions).
- No other on_<op> hook injects into event.metadata beyond the existing per-operation keys (OBS-001).
Implementation note: The current _observe_op helper is a context
manager that constructs the StoreEvent before the wrapped call returns.
Injecting write_result requires either (a) mutating event.metadata
after the wrapped call completes but before hook dispatch, or (b)
re-constructing the event post-call. The invariant is neutral between
these implementations; the test harness asserts only the observable
contract (event.metadata["write_result"] is result).
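The observable contract can be illustrated with a sketch that roughly follows option (a), mutating the metadata dict after the wrapped call returns and before hooks fire. It does not reproduce _observe_op. WriteResult, Event, and observed_write are stand-ins, and the WriteResult fields and the "size" metadata key are illustrative only.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass(frozen=True)
class WriteResult:  # stand-in; the real fields are defined in 045-write-result.md
    path: str
    size: int


@dataclass(frozen=True)
class Event:  # trimmed stand-in for StoreEvent
    operation: str
    path: str
    error: Optional[Exception] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


def observed_write(inner_write: Callable[[str, bytes], WriteResult], path: str,
                   data: bytes, on_write: Callable[[Event], None]) -> WriteResult:
    metadata: Dict[str, Any] = {"size": len(data)}
    try:
        result = inner_write(path, data)
    except Exception as exc:
        # Failure path: no "write_result" key is added.
        on_write(Event("write", path, error=exc, metadata=metadata))
        raise
    event = Event("write", path, metadata=metadata)
    # frozen=True blocks rebinding fields, not mutating the dict they point to.
    event.metadata["write_result"] = result
    on_write(event)
    return result
```

A hook can then check event.metadata.get("write_result") and narrow it with isinstance before use, which also covers the failure path where the key is absent.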
See also: 045-write-result.md (WR-019).