Commit 4aaf064
marshal: recover loads regression with raw refs table + tagged state
Replace the PyList-backed reference table with a raw growable
PyObject ** array, and encode REF_STATE_INCOMPLETE_HASHABLE in the low
bit of each ref pointer so the parallel state-byte allocation is gone.
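
A minimal sketch of the tagged-pointer idea described above, not the code from this commit. Obj stands in for PyObject, and RefTable, REF_TAG_INCOMPLETE, ref_append, ref_get, ref_is_incomplete and ref_mark_complete are hypothetical names chosen for illustration. It assumes objects are at least 2-byte aligned, so the low bit of each pointer is free to carry the state flag.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for PyObject. Any type with alignment >= 2
   leaves bit 0 of its address free for a tag. */
typedef struct { int value; } Obj;

/* Tag bit meaning "this ref is still being built" (assumed name). */
#define REF_TAG_INCOMPLETE ((uintptr_t)1)

/* Raw growable array of tagged pointers; no parallel state-byte array. */
typedef struct {
    uintptr_t *items;
    size_t len;
    size_t cap;
} RefTable;

/* Append a pointer, optionally tagged incomplete. Returns 0 on success,
   -1 if the table could not grow. */
static inline int ref_append(RefTable *t, Obj *o, int incomplete)
{
    if (t->len == t->cap) {
        size_t ncap = t->cap ? t->cap * 2 : 8;
        uintptr_t *p = realloc(t->items, ncap * sizeof *p);
        if (p == NULL) {
            return -1;
        }
        t->items = p;
        t->cap = ncap;
    }
    t->items[t->len++] =
        (uintptr_t)o | (incomplete ? REF_TAG_INCOMPLETE : 0);
    return 0;
}

/* Strip the tag to recover the real pointer. */
static inline Obj *ref_get(const RefTable *t, size_t i)
{
    return (Obj *)(t->items[i] & ~REF_TAG_INCOMPLETE);
}

/* Read the state bit without touching the pointer. */
static inline int ref_is_incomplete(const RefTable *t, size_t i)
{
    return (int)(t->items[i] & REF_TAG_INCOMPLETE);
}

/* Clear the state bit once the object is fully constructed. */
static inline void ref_mark_complete(RefTable *t, size_t i)
{
    t->items[i] &= ~REF_TAG_INCOMPLETE;
}
```

Every read goes through ref_get, so the tag never leaks into a dereference. The cost is one mask per lookup in exchange for dropping the second allocation.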
Also:
- drop the allow_incomplete_hashable parameter from r_object; it lives
  on RFILE now, auto-reset on entry, flipped via a wrapper at the two
  list-element / dict-value sites.
- force-inline the r_ref_* helpers so the compiler can fold the
  if (flag) guards into the callers as the original R_REF macro did.
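
The two items above can be sketched roughly as follows. This is an illustration under assumptions, not the code from this commit: Reader is a cut-down stand-in for RFILE, and FORCE_INLINE, ref_reserve, read_object and read_object_allow_incomplete are hypothetical names. The refs table is reduced to a counter so the example stays short.

```c
/* Hypothetical portability macro for forced inlining. */
#if defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE static inline __attribute__((always_inline))
#elif defined(_MSC_VER)
#  define FORCE_INLINE static __forceinline
#else
#  define FORCE_INLINE static inline
#endif

/* Cut-down stand-in for RFILE; the real struct has many more fields. */
typedef struct {
    int allow_incomplete_hashable;  /* one-shot flag, consumed on entry */
    int refs_reserved;              /* stands in for the refs table */
} Reader;

/* Guarded helper. Once inlined, a caller that passes a constant flag
   lets the compiler fold the branch away, as a macro would. */
FORCE_INLINE int ref_reserve(Reader *r, int flag)
{
    if (!flag) {
        return -1;  /* nothing reserved */
    }
    return r->refs_reserved++;
}

/* Reads one object. Returns the flag value it observed, purely so the
   behaviour is visible in this sketch. */
static int read_object(Reader *r, int flag)
{
    int allow = r->allow_incomplete_hashable;
    r->allow_incomplete_hashable = 0;  /* auto-reset: nested reads start strict */
    (void)ref_reserve(r, flag);
    return allow;
}

/* Wrapper for the call sites that tolerate an incomplete hashable
   (list elements and dict values in the commit's description). */
static int read_object_allow_incomplete(Reader *r, int flag)
{
    r->allow_incomplete_hashable = 1;
    return read_object(r, flag);
}
```

Keeping the flag on the reader means the common call sites pass one argument fewer, and the reset on entry keeps the permissive state from leaking into nested reads.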
Misc/marshal-perf-diary.md records the full experiment ledger: each
idea tested in isolation, results, and the combined stack. Benchmark
harness is /tmp/marshal_bench_cpu_stable.py (200k loads x 11 repeats,
taskset -c 0, best-of-3 pinned-run median).
Combined deltas vs main on loads:
  small_tuple  14.3% faster
  nested_dict   6.9% faster
  code_obj      6.8% faster
dumps is roughly flat to slightly faster. test_marshal passes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent eb1c4b7 commit 4aaf064
2 files changed
Lines changed: 672 additions & 84 deletions