Skip to content

gh-94345: Fix dataclasses.asdict()/astuple() crash on circular references#151904

Open
iamsharduld wants to merge 2 commits into
python:mainfrom
iamsharduld:local-gh-94345-dataclasses-recursive
Open

gh-94345: Fix dataclasses.asdict()/astuple() crash on circular references#151904
iamsharduld wants to merge 2 commits into
python:mainfrom
iamsharduld:local-gh-94345-dataclasses-recursive

Conversation

@iamsharduld

@iamsharduld iamsharduld commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

asdict() and astuple() recurse into nested dataclasses and built-in containers without tracking which objects are already being processed, so a circular reference recurses until a RecursionError (or crashes the interpreter on a release build).

This tracks the objects on the current recursion path and raises ValueError("Circular reference detected") — matching json.dumps() — when one is revisited. Objects referenced more than once without forming a cycle (a DAG) are still converted normally.

…references

asdict() and astuple() recursed into nested dataclasses and containers
without tracking which objects were already being processed, so a circular
reference recursed until a RecursionError (or crashed the interpreter on a
release build). Track the objects on the current recursion path and raise
ValueError("Circular reference detected") -- matching json.dumps() -- when
one is revisited.
@read-the-docs-community

read-the-docs-community Bot commented Jun 22, 2026

Copy link
Copy Markdown

Documentation build overview

📚 cpython-previews | 🛠️ Build #33246112 | 📁 Comparing 4dccc96 against main (476b649)

  🔍 Preview build  

2 files changed
± library/dataclasses.html
± whatsnew/changelog.html

@@ -0,0 +1,4 @@
:func:`dataclasses.asdict` and :func:`dataclasses.astuple` now raise
:exc:`ValueError` when the dataclass instance contains a circular reference,
instead of recursing until a :exc:`RecursionError` (or crashing on a release

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a crash on release builds (e.g. segmentation fault) or a python exception?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a Python exception, not a crash. On a release build (current main, clang) a self-referential dataclass raises RecursionError and the process survives — the recursion limit stops the C stack from overflowing, so there's no segfault. The original report shows the same RecursionError.

I've reworded the NEWS entry to drop "(or crashing on a release build)"; it now just says "instead of recursing until a RecursionError".

reference.

.. versionchanged:: next
A circular reference now raises :exc:`ValueError` instead of

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR adds overhead to the conversion to dict or tuple. Can you benchmark this?

Adding overhead just to change the type of error message might be ok, but I would like to have some quantative numbers

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmarked on a release build (best of 9, over a ~1093-node nested structure of dataclasses + lists + dicts):

  • asdict: 15.5 ms → 19.3 ms (+24%)
  • astuple: 21.4 ms → 25.1 ms (+17%)

For typical shallow dataclasses the absolute cost is negligible; the percentage only shows up on deeply-nested, container-heavy data. I tried gating the cycle-tracking behind a recursion-depth threshold to make the common case cheaper, but it gave no measurable improvement — the cost is inherent to tracking object identities through the recursion, not the set operations specifically — so I kept the straightforward implementation.

On whether it's worth it beyond the error type: this also brings asdict/astuple in line with json.dumps, which detects circular references by default and raises the same ValueError("Circular reference detected"). Since asdict() output is so often passed straight to json.dumps, a cyclic structure currently dies with a deep RecursionError rather than the clear error json would give; copy.deepcopy likewise tracks identities on every call. If the default overhead is still a concern, I'd be happy to mirror json's check_circular= opt-out.

asdict()/astuple() on a circular reference raise RecursionError on a
release build, not a segfault; drop the inaccurate crash claim.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants