Examples

Runnable example scripts demonstrating every feature of remote-store. Each example is self-contained and uses a temporary directory so you can run them directly.

Getting Started

Your first steps with remote-store — read, write, stream, and atomic semantics.

Example | Description
Quickstart | Minimal config, write, and read.
File operations | Full Store API: read, write, delete, move, copy, list, metadata, type checks, capabilities, to_key.
Streaming I/O | Streaming writes and reads with BytesIO.
Atomic writes | Atomic writes and overwrite semantics.
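
For readers new to the idea behind the atomic-writes example, here is a minimal, library-independent sketch of atomic write and overwrite semantics on a local filesystem. It uses only the Python standard library and is not remote-store's API: the function name `atomic_write` and its `overwrite` flag are hypothetical, chosen for illustration. Backends may well implement atomicity differently.

```python
import os
import tempfile
from pathlib import Path


def atomic_write(target: Path, data: bytes, overwrite: bool = True) -> None:
    """Write data so readers see either the old file or the new one, never a partial file.

    Hypothetical helper for illustration; not part of remote-store.
    """
    # Best-effort check only: another process could create the file after this line.
    if not overwrite and target.exists():
        raise FileExistsError(target)

    # The temp file lives in the target's directory so the final rename
    # stays on one filesystem, which is what makes it atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push bytes to disk before renaming
        os.replace(tmp_name, target)  # swaps the new content in as one step
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        path = Path(workdir) / "report.txt"
        atomic_write(path, b"first version")
        atomic_write(path, b"second version")  # overwrite replaces the whole file
        print(path.read_bytes())
```

The write-to-temp-then-rename pattern is the usual way to get this guarantee locally; object stores and remote servers generally need backend-specific approaches.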

Configuration

Wiring up stores from code, files, and registries.

Example | Description
Configuration | Config-as-code, from_dict(), multiple stores, S3/SFTP backend configs.
Config loaders | Load registry configuration from TOML, YAML, and Pydantic models, with env-var interpolation.

Errors & Capabilities

Understanding the safety net — error hierarchy and capability gating.

Example | Description
Error handling | Catching NotFound, AlreadyExists, and more.
Capabilities and errors | Capability querying, gating, and the structured error hierarchy.

Advanced Store Patterns

Deeper Store API concepts — paths, memory backend, child stores, async, retry, and health checks.

Example | Description
Path model | RemotePath normalization, properties, validation, and the / operator.
Memory backend | In-process memory backend for testing and caching; no filesystem access needed.
Store.child() | Runtime sub-scoping: create child stores that share a backend but isolate paths.
Async Store | Async/await usage with AsyncStore: streaming reads, async writes, child stores.
Retry Policy | Configure retry attempts, backoff, and jitter per-backend.
Health Check | Startup gate pattern using Store.ping() to verify backend connectivity.
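
As background for the retry-policy example, the sketch below shows how retry attempts, exponential backoff, and jitter typically fit together. It is a generic standard-library illustration, not remote-store's retry API: `backoff_delays`, `retry`, and all parameter names are hypothetical, and the jitter strategy shown ("full jitter", a uniform random fraction of the backoff ceiling) is one common choice assumed here.

```python
import random
import time
from typing import Callable, List, Tuple, Type, TypeVar

T = TypeVar("T")


def backoff_delays(
    attempts: int, base: float = 0.1, factor: float = 2.0, max_delay: float = 5.0
) -> List[float]:
    """Return the delay ceiling before each retry (one fewer than attempts)."""
    return [min(max_delay, base * factor**i) for i in range(attempts - 1)]


def retry(
    op: Callable[[], T],
    attempts: int = 4,
    base: float = 0.1,
    factor: float = 2.0,
    max_delay: float = 5.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Call op until it succeeds or attempts run out (hypothetical helper)."""
    ceilings = backoff_delays(attempts, base, factor, max_delay)
    for attempt in range(attempts):
        try:
            return op()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Full jitter: sleep a random fraction of the current ceiling
            # so that many clients do not retry in lockstep.
            sleep(rng() * ceilings[attempt])
    raise ValueError("attempts must be at least 1")


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky() -> str:
        # Fails twice with a transient error, then succeeds.
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError("transient failure")
        return "ok"

    print(retry(flaky, sleep=lambda s: None))
```

Passing `sleep` and `rng` as parameters keeps the helper testable without real waiting; errors outside `retry_on` propagate immediately.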

Backends

These examples require a running service (AWS, MinIO, an SFTP server, Azure, Azurite, etc.) and credentials supplied via environment variables. Each script prints a help message when the required variables are missing.

Example | Description
S3 backend | Connect to Amazon S3 or any S3-compatible service (MinIO, DigitalOcean Spaces, etc.).
S3-PyArrow backend | High-throughput S3 via PyArrow's C++ filesystem. Drop-in swap from the S3 backend.
S3 Listing Strategies | Shallow vs. recursive listing: cost tradeoffs, iterator patterns, and MinIO endpoint usage.
SFTP backend | Connect to any SSH/SFTP server with paramiko.
Azure backend | Connect to Azure Blob Storage or Azure Data Lake Storage Gen2.
SQL Blob Backend | SQLite key-value store: zero-infrastructure persistent file storage.
HTTP backend | Read-only access to files over HTTP/HTTPS; no credentials needed for public endpoints.

Extensions

Composable Store wrappers — batch, transfer, glob, caching, and observability.

Example | Description
Batch operations | Bulk delete, copy, and existence checks with error aggregation.
Transfer operations | Upload, download, and cross-store transfer with progress tracking.
Glob pattern matching | Three-tier file filtering with list_files(pattern=), Store.glob(), and glob_files().
Caching | Store-level caching with ext.cache: cached reads, auto-invalidation on writes, and cache statistics.
Observe hooks | Callback-based instrumentation for Store operations: logging, metrics, auditing, and error tracking.
OpenTelemetry tracing and metrics | Instrument any Store with OpenTelemetry spans and metrics.

Integrations

Third-party library bridges — PyArrow, Parquet datasets, and Dagster.

Example | Description
PyArrow FileSystem adapter | Use any Store as a pyarrow.fs.FileSystem for Parquet, CSV, and dataset I/O.
Parquet Dataset Store | Managed Parquet datasets with manifests, completion markers, and multi-part writes.
Dagster IO Manager | Use any Store as a Dagster IOManager with pluggable serialization.
Dagster v2 Resource | Config-driven Store construction with RemoteStoreIOManager.
Dagster Compute Log Manager | Persist op/step stdout & stderr to any Store.

Showcases

Full project examples demonstrating multiple extensions working together.

Example | Description
Medallion + Dagster Showcase | End-to-end Bronze/Silver/Gold pipeline with Dagster and live MeteoSwiss data.

Interactive Jupyter notebooks are also available in the examples/notebooks/ directory of the repository.