Research: BK-181 PR 2 — S3 cassette/replay infeasibility
Date: 2026-05-15
Status: Spike complete. Verdict: NO-GO for the planned PR 2 scope.
S3 replay cannot be implemented today via vcrpy/pytest-recording without
upstream changes. Recommended close-out: BK-181 ships at the Azure slice
(PR 1a/1b, already merged); revisit S3 when vcrpy supports the
aiobotocore transport stack.
Scope: Validate whether the BK-181 Azure verdict
extends to S3Backend's code path. The Azure PoC settled the mechanism
question for azure.core and warned that S3 needed separate validation
because s3fs rides aiobotocore and aiohttp rather than azure.core.
This spike answers that question.
Related: BK-181, BK-181 Azure PoC,
spec 048 (TEST-007/008/009),
the spike folder bk-181-s3-spike/.
Verdict
S3 cannot be brought under the cassette/replay layer in PR 2. The
underlying issue is well outside this repo: vcrpy 8.1.1's aiohttp_stubs.py
cannot transparently intercept the aiobotocore request/response cycle
that s3fs relies on. The failure is not a quirk of streaming-only
paths (the Azure caveat) but applies to every s3fs write and read,
because every such call is dispatched through aiobotocore even from
sync entry points (_fs.pipe_file, _fs.cat_file, _fs.call_s3).
The user-facing impact is small. s3_moto already provides a
high-fidelity Stage-1 in-process S3 fixture
(tests/backends/fixtures/s3_moto.py), so the conformance suite covers
S3 without network or Docker today. The original BK-181 motivation
("turn Azure HNS bugs into always-on regression guards") never applied
to S3 — Azure needed cassettes because Azurite cannot emulate the
hierarchical namespace, while moto emulates S3 with enough fidelity for
the conformance contract. Spec TEST-008
gains a "Noted exception — S3" paragraph that narrows
TEST-007's
otherwise-universal "HTTP backends support replay" invariant; the
paragraph points at this finding for the diagnosis.
What was tested
Spike folder: sdd/research/bk-181-s3-spike/, outside
testpaths, run by pointing pytest at it explicitly. Two probes were
exercised against a real AWS account (the same s3_live env that
tests/backends/fixtures/s3_live.py validates).
| File | Role |
|---|---|
| conftest.py | Minimal vcrpy wiring: scrub headers, query params, AWS auth signature |
| test_spike.py | S3Backend.write / read_bytes round-trip + 1 MiB streaming read |
| isolation_check.py | The same two operations driven through vcr.use_cassette directly, bypassing pytest-recording entirely |
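
The conftest.py wiring described in the table is not reproduced here. As a rough illustration only, the scrubbing could look like the sketch below. The header and query-parameter names, the `build_vcr_config` helper, and the `redact_sigv4_signature` hook are assumptions made for this example; `vcr_config` is the fixture name pytest-recording looks up, and `filter_headers`, `filter_query_parameters`, and `before_record_request` are vcrpy options.

```python
import re
from types import SimpleNamespace

# Assumed names for illustration; the spike's real conftest.py may scrub others.
SCRUBBED_HEADERS = ["x-amz-security-token"]
SCRUBBED_QUERY_PARAMS = ["X-Amz-Signature", "X-Amz-Credential", "X-Amz-Security-Token"]

# A SigV4 Authorization header ends with "Signature=<lowercase hex digest>".
_SIGNATURE = re.compile(r"Signature=[0-9a-f]+")


def redact_sigv4_signature(request):
    """before_record_request hook: blank the SigV4 signature, keep the rest."""
    auth = request.headers.get("Authorization")
    if auth:
        request.headers["Authorization"] = _SIGNATURE.sub("Signature=REDACTED", auth)
    return request


def build_vcr_config():
    """Return the dict a pytest-recording `vcr_config` fixture could provide."""
    return {
        "filter_headers": SCRUBBED_HEADERS,
        "filter_query_parameters": SCRUBBED_QUERY_PARAMS,
        "before_record_request": redact_sigv4_signature,
    }


# Quick check with a stand-in request object (vcrpy passes its own Request type).
demo_request = SimpleNamespace(
    headers={
        "Authorization": (
            "AWS4-HMAC-SHA256 Credential=AKIAEXAMPLE/20260515/us-east-1/s3/aws4_request, "
            "SignedHeaders=host;x-amz-date, Signature=0123abcd"
        )
    }
)
redact_sigv4_signature(demo_request)
```

In a conftest.py, the dict would typically be returned from a module-scoped pytest fixture named `vcr_config`.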
The IAM policy on the s3_live account restricts s3:CreateBucket /
s3:PutObject to arn:aws:s3:::rs-conformance-*, so the spike uses
rs-conformance-bk181spike as a stable bucket name (no per-run UUID
rewriting needed at this stage).
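
For orientation, a reproducer in the spirit of isolation_check.py might look like the sketch below. This is not the spike's actual file: the object key, the `record_round_trip` helper, and the assumption that the bucket already exists are all hypothetical. The bucket name and cassette name come from this document, and the calls used (`vcr.VCR`, `use_cassette`, `s3fs.S3FileSystem`, `pipe_file`, `cat_file`) are the public vcrpy and s3fs entry points.

```python
BUCKET = "rs-conformance-bk181spike"    # stable name inside the IAM-allowed prefix
CASSETTE = "isolation.yaml"             # cassette file the recording should produce
KEY = f"{BUCKET}/bk181/isolation.bin"   # hypothetical object key


def record_round_trip(payload: bytes = b"bk-181 spike") -> bytes:
    """Write then read one object while vcrpy records.

    Needs live AWS credentials and assumes the bucket already exists.
    """
    # Imported lazily so the module can be loaded without the dependencies.
    import s3fs
    import vcr

    recorder = vcr.VCR(record_mode="all")
    with recorder.use_cassette(CASSETTE):
        fs = s3fs.S3FileSystem()
        fs.pipe_file(KEY, payload)  # sync entry point, async underneath
        return fs.cat_file(KEY)
```

If the finding in this document holds, calling `record_round_trip()` against real AWS is expected to raise during recording rather than return the payload.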
Failure mode
Both the pytest-driven spike and the standalone isolation_check.py
fail in the same way during recording, with a --record-mode=all
(equivalently rewrite) session pointed at a real AWS endpoint:
    File "site-packages/s3fs/core.py", line 211, in _error_wrapper
      await tb.tb_frame.f_locals["response"]
            ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
    KeyError: "local variable ''response'' is not defined"
    <sys>:0: RuntimeWarning: coroutine 'AioAwsChunkedWrapper.read' was never awaited
What this is saying:
- vcrpy hooks the underlying aiohttp connection used by aiobotocore.
- The hooked code path attempts to consume the request body (AioAwsChunkedWrapper.read) but vcrpy's stub does not properly drive the async-iteration protocol the wrapper expects.
- The coroutine is left un-awaited and the request fails partway.
- s3fs.core._error_wrapper (s3fs/core.py:211) tries to introspect the failed call's traceback for a local variable named response. Because the failure occurred before any HTTP response could be constructed, response is never bound, and s3fs then raises KeyError.
- No cassette is written (isolation.yaml is never created).
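
The chain of events above can be modelled with a toy example. The sketch below is not vcrpy's or s3fs's code: `AsyncChunkedBody`, `naive_stub_capture`, and `send` are hypothetical stand-ins that only mimic the shape of the failure, where a recorder treats an async `read()` as synchronous, the request aborts before a response exists, and a later lookup of the `response` local fails with KeyError.

```python
import asyncio
import inspect


class AsyncChunkedBody:
    """Stand-in for a request body whose read() is a coroutine (hypothetical)."""

    def __init__(self, data: bytes) -> None:
        self._data = data

    async def read(self) -> bytes:
        return self._data


def naive_stub_capture(body) -> bytes:
    """Toy recorder that assumes read() is synchronous."""
    chunk = body.read()  # returns a coroutine object, not bytes
    if inspect.iscoroutine(chunk):
        chunk.close()  # suppress the "never awaited" warning in this demo
        raise TypeError("request body is async; stub cannot serialise it")
    return chunk


async def send(body) -> str:
    try:
        naive_stub_capture(body)  # fails before any response exists
        response = "200 OK"
    except TypeError:
        pass
    # Mimics an error handler that assumes `response` was bound.
    return locals()["response"]


def run_demo() -> str:
    try:
        return asyncio.run(send(AsyncChunkedBody(b"payload")))
    except KeyError as exc:
        return f"KeyError: {exc}"
```

Calling `run_demo()` returns the string `KeyError: 'response'`, which is the same symptom class as the traceback above: the visible error is the missing local, while the root cause is the body that was never read.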
This is the same family as the Azure PoC's "aiohttp streaming-body bug"
(research-bk-181-cassette-replay-poc.md § Success criteria — results),
but worse: for Azure the bug applies only to response bodies and
only on the async path; for s3fs it applies to every aiobotocore
call because the request bodies themselves are wrapped in an async
chunked reader the vcrpy stub cannot drive. Sync entry points
(_fs.pipe_file etc.) trip the same path because s3fs invokes
aiobotocore through a private event loop regardless of caller mode.
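
The reason sync callers are not spared can be illustrated with a small stdlib-only sketch of the "sync facade over a private event loop" pattern. The names `_pipe_file_async` and `pipe_file` here are hypothetical toys, not the s3fs implementation; the point is only that the synchronous function still executes a coroutine on a background loop, so anything that breaks the async path breaks the sync one as well.

```python
import asyncio
import threading

# A private event loop running forever on a daemon thread.
_loop = asyncio.new_event_loop()
threading.Thread(target=_loop.run_forever, daemon=True).start()


async def _pipe_file_async(path: str, data: bytes) -> str:
    """Toy async implementation; real code would issue an HTTP request here."""
    await asyncio.sleep(0)
    return f"async PUT {path} ({len(data)} bytes)"


def pipe_file(path: str, data: bytes) -> str:
    """Sync facade: schedules the coroutine on the private loop and blocks."""
    future = asyncio.run_coroutine_threadsafe(_pipe_file_async(path, data), _loop)
    return future.result(timeout=5)
```

A caller that invokes `pipe_file("bucket/key", b"abc")` never sees a coroutine, yet the work is done by `_pipe_file_async` on the background loop.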
Workarounds considered
- Inject a non-aiohttp transport into s3fs (the Azure trick). Verdict: not available. azure.core exposes a pluggable transport= parameter so we can swap AioHttpTransport for AsyncioRequestsTransport. aiobotocore has no such hook: aiohttp is hardwired in aiobotocore.endpoint.AioEndpoint and s3fs exposes no toggle to replace it.
- Drop s3fs from the replay path; use plain boto3 (sync). Verdict: defeats the purpose. Cassette tests exist to exercise the real production-code SDK pipeline. S3Backend is s3fs; if a replay fixture uses raw boto3 it stops testing the code that ships to users. The cassettes would catch nothing the existing s3_moto fixture does not already catch.
- Wait for a vcrpy upstream fix. Verdict: not under our control. vcrpy's aiohttp_stubs.py is a long-standing rough edge (kevin1024/vcrpy). No fix is scheduled. This finding and the preserved spike folder are the retest entry points if a future contributor wants to re-evaluate against a new vcrpy release.
- Write a custom aiobotocore recorder. Verdict: out of scope. A bespoke recorder would be a sizeable internal tool, and the value is low: s3_moto already covers the Stage-1 conformance surface for S3. The original Azure justification (no Stage-1 emulator covers HNS) does not apply.
Why moto is sufficient
For Azure, s3_moto's analogue does not exist: Azurite does not
emulate the Hierarchical Namespace, so contracts like
hdi_isfolder directory probes and the BUG-195..203 family can only
be observed against real ADLS Gen2. The replay layer turns those
contracts into always-on tests by recording them once.
For S3, every conformance contract the suite cares about — the
existing parametrised tests — already runs against s3_moto at
Stage 1 today
(tests/backends/fixtures/s3_moto.py,
tests/backends/fixtures/fixtures.toml). s3_moto exercises the
same S3Backend → s3fs → aiobotocore → aiohttp → moto server chain
end-to-end with no network, no Docker, and no live cloud account.
The few defect classes that would be S3 analogues of the HNS
BUG-* family (multipart-upload edge cases, presigned-URL handling,
versioning quirks) are not on BK-181's roadmap and have no committed
backlog item demanding cassette coverage.
The remaining gap — a real-AWS contract that moto's emulation does
not match — is covered by s3_live at Stage 3 (manual or scheduled
CI cadence). The live mark keeps it out of default runs.
Recommendation
Close BK-181 at PR 2 with the S3 slice deferred:
- Move BK-181 from BACKLOG.md to BACKLOG-DONE.md (status flips [~]→[x]) with the Azure-shipped, S3-deferred outcome documented in the close-out entry.
- Amend spec 048 TEST-008 with a "Noted exception — S3" paragraph that narrows TEST-007's "HTTP backends support replay" invariant, pointing here for the diagnosis.
No follow-up backlog item is filed: s3_moto already covers Stage 1,
no S3-specific contract today demands cassette-level fidelity, and the
fix path runs through upstream vcrpy rather than this repo. If a
future need surfaces, this doc and the spike folder are the entry
points to re-evaluate the decision.
The trace sdd/traces/BK-181-cassette-replay-impl.yml is extended with
an s3-spike-finding phase pointing here.
Spike folder fate
Per the repo's PoC convention, the spike folder is frozen once the
finding is recorded. bk-181-s3-spike/ stays as evidence:
isolation_check.py is the smallest reproducer if a future contributor
wants to retest against a new vcrpy or s3fs release. test_spike.py
and conftest.py document the pytest-recording wiring path that did
not work and serve as a starting point for any future revisit.