Examples
Runnable example scripts demonstrating every feature of remote-store. Most examples are self-contained and use a temporary directory, so you can run each one directly; the backend examples additionally need a running service and credentials.
Getting Started
Your first steps with remote-store — read, write, stream, and atomic semantics.
| Example | Description |
|---|---|
| Quickstart | Minimal config, write, and read. |
| File operations | Full Store API: read, write, delete, move, copy, list, metadata, type checks, capabilities, to_key. |
| Streaming I/O | Streaming writes and reads with BytesIO. |
| Atomic writes | Atomic writes and overwrite semantics. |
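
For readers new to the term, the idea behind an atomic write can be sketched with the Python standard library alone. This is not remote-store's API or its implementation: `atomic_write` and its `overwrite` flag are hypothetical names chosen for illustration, and the sketch only covers a local filesystem.

```python
import os
import tempfile
from pathlib import Path


def atomic_write(target: Path, data: bytes, overwrite: bool = False) -> None:
    """Hypothetical helper: write data so readers never see a partial file."""
    # Note: this existence check is not race-free; it only illustrates
    # the "fail if the key already exists" overwrite semantics.
    if target.exists() and not overwrite:
        raise FileExistsError(str(target))

    # The temp file lives in the target's directory so that the final
    # rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push bytes to disk before the rename
        os.replace(tmp_name, target)  # swaps the file in as a single step
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Calling `atomic_write(path, b"v1")` creates the file, a second call without `overwrite=True` raises `FileExistsError`, and a call with `overwrite=True` replaces the content in one step. See the Atomic writes example for how remote-store itself exposes these semantics.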
Configuration
Wiring up stores from code, files, and registries.
| Example | Description |
|---|---|
| Configuration | Config-as-code, from_dict(), multiple stores, S3/SFTP backend configs. |
| Config loaders | Load registry configuration from TOML, YAML, and Pydantic models, with env-var interpolation. |
Errors & Capabilities
Understanding the safety net — error hierarchy and capability gating.
| Example | Description |
|---|---|
| Error handling | Catching NotFound, AlreadyExists, and more. |
| Capabilities and errors | Capability querying, gating, and the structured error hierarchy. |
Advanced Store Patterns
Deeper Store API concepts — paths, memory backend, child stores, async, retry, and health checks.
| Example | Description |
|---|---|
| Path model | RemotePath normalization, properties, validation, and the / operator. |
| Memory backend | In-process memory backend for testing and caching — no filesystem access needed. |
| Store.child() | Runtime sub-scoping: create child stores that share a backend but isolate paths. |
| Async Store | Async/await usage with AsyncStore: streaming reads, async writes, child stores. |
| Retry Policy | Configure retry attempts, backoff, and jitter per-backend. |
| Health Check | Startup gate pattern using Store.ping() to verify backend connectivity. |
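
As background for the Retry Policy entry, the interplay of attempts, backoff, and jitter can be sketched in plain Python. This is a generic illustration of exponential backoff with full jitter, not remote-store's retry configuration; `retry_with_backoff` and all of its parameters are hypothetical names, and treating `ConnectionError` as the only retryable error is an assumption made for brevity.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    op: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: call op, retrying on ConnectionError."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return op()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff: the ceiling doubles on each attempt,
            # capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random time up to the ceiling so that
            # many clients do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the delay can be swapped for a stub in tests; in normal use the default `time.sleep` applies.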
Backends
These require a running service (AWS, MinIO, an SFTP server, Azure, Azurite, etc.) and credentials supplied via environment variables. Each script prints a help message when the required variables are missing.
| Example | Description |
|---|---|
| S3 backend | Connect to Amazon S3 or any S3-compatible service (MinIO, DigitalOcean Spaces, etc.). |
| S3-PyArrow backend | High-throughput S3 via PyArrow's C++ filesystem. Drop-in swap from the S3 backend. |
| S3 Listing Strategies | Shallow vs. recursive listing: cost tradeoffs, iterator patterns, and MinIO endpoint usage. |
| SFTP backend | Connect to any SSH/SFTP server with paramiko. |
| Azure backend | Connect to Azure Blob Storage or Azure Data Lake Storage Gen2. |
| SQL Blob Backend | SQLite key-value store — zero-infrastructure persistent file storage. |
| HTTP backend | Read-only access to files over HTTP/HTTPS — no credentials needed for public endpoints. |
Extensions
Composable Store wrappers — batch, transfer, glob, caching, and observability.
| Example | Description |
|---|---|
| Batch operations | Bulk delete, copy, and existence checks with error aggregation. |
| Transfer operations | Upload, download, and cross-store transfer with progress tracking. |
| Glob pattern matching | Three-tier file filtering with list_files(pattern=), Store.glob(), and glob_files(). |
| Caching | Store-level caching with ext.cache: cached reads, auto-invalidation on writes, and cache statistics. |
| Observe hooks | Callback-based instrumentation for Store operations — logging, metrics, auditing, and error tracking. |
| OpenTelemetry tracing and metrics | Instrument any Store with OpenTelemetry spans and metrics. |
Integrations
Third-party library bridges — PyArrow, Parquet datasets, and Dagster.
| Example | Description |
|---|---|
| PyArrow FileSystem adapter | Use any Store as a pyarrow.fs.FileSystem for Parquet, CSV, and dataset I/O. |
| Parquet Dataset Store | Managed Parquet datasets with manifests, completion markers, and multi-part writes. |
| Dagster IO Manager | Use any Store as a Dagster IOManager with pluggable serialization. |
| Dagster v2 Resource | Config-driven Store construction with RemoteStoreIOManager. |
| Dagster Compute Log Manager | Persist op/step stdout & stderr to any Store. |
Showcases
Full project examples demonstrating multiple extensions working together.
| Example | Description |
|---|---|
| Medallion + Dagster Showcase | End-to-end Bronze/Silver/Gold pipeline with Dagster and live MeteoSwiss data. |
Interactive Jupyter notebooks are also available in the examples/notebooks/ directory of the repository.