Recover PG-vendored C types collapsed to int during header parsing #15
Open
estebanzimanyi wants to merge 1 commit into
This was referenced May 22, 2026
The host-symbol-collision build prefix-renames PG types and the parse lacks pg_config.h, so opaque PG-vendored types reach libclang already macro-collapsed and are spelled int / int * / int ** in the parsed IDL.

This post-parse pass recovers each from the header declaration text, preserving const / pointer levels, and only when the function's parsed type actually collapsed to int.

Recovered base types: bool, int64, Timestamp(Tz), H3Index, text, GSERIALIZED, Interval, DateADT, Datum, size_t, GBOX, BOX3D, AFFINE.

Audited against a correct-typed reference IDL: zero int*-where-a-named-pointer-belongs mismatches remain, so every binding that codegens from the catalog gets the real types (e.g. tcbuffer_convex_hull -> GSERIALIZED *, temporal_tprecision(..., const Interval *, ...)).
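The "preserving const / pointer levels" rule can be illustrated with a small sketch. This is not the code in this PR: `split_type` and `recover_type` are hypothetical names, and the sketch assumes a type arrives as a plain C spelling string such as `const int *`. It swaps the collapsed `int` base for the declared base only when the declared base is in the recoverable list and the pointer depth matches.

```python
# Base types listed in the commit message as recoverable.
RECOVERABLE = {
    "bool", "int64", "Timestamp", "TimestampTz", "H3Index", "text",
    "GSERIALIZED", "Interval", "DateADT", "Datum", "size_t",
    "GBOX", "BOX3D", "AFFINE",
}


def split_type(spelling):
    """Split a C type spelling into (is_const, base, pointer_depth)."""
    depth = spelling.count("*")
    words = spelling.replace("*", " ").split()
    is_const = "const" in words
    base = " ".join(w for w in words if w != "const")
    return is_const, base, depth


def recover_type(parsed, declared):
    """Return the declared spelling if `parsed` collapsed to int, else `parsed`.

    Simplification (assumption): a const anywhere in the declaration is
    re-emitted as a leading `const`.
    """
    _, parsed_base, parsed_depth = split_type(parsed)
    is_const, decl_base, decl_depth = split_type(declared)
    if parsed_base != "int":
        return parsed  # not collapsed, leave untouched
    if decl_base not in RECOVERABLE or decl_depth != parsed_depth:
        return parsed  # genuine int, or a shape mismatch: do nothing
    prefix = "const " if is_const else ""
    suffix = " " + "*" * decl_depth if decl_depth else ""
    return prefix + decl_base + suffix


print(recover_type("int *", "GSERIALIZED *"))           # GSERIALIZED *
print(recover_type("const int *", "const Interval *"))  # const Interval *
print(recover_type("int", "int"))                       # int
```

Refusing to rewrite when the pointer depths differ keeps the sketch conservative: a mismatch suggests the declaration text and the parsed entry do not describe the same parameter.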
Problem
Some MEOS header sets reach libclang with `bool` / `int64` / `Timestamp` / `TimestampTz` / `H3Index` already collapsed to `int` (or `int *`) at the preprocessor level: the real type name is gone before parsing, so it cannot be recovered from the AST. The extracted IDL then carries `int` where the source says one of those types, and downstream binding generators mis-map them:

- `bool` → `int` (should be a boolean)
- `int64` / `H3Index` → `int` (should be 64-bit / `long`)
- `TimestampTz *` out-param → `int *`: generators size the result buffer at 4 bytes for an 8-byte native write (a buffer under-allocation; observed as `IndexOutOfBounds` / native-heap corruption in a JMEOS consumer).

Fix
A post-parse pass (`parser/typerecover.py`) that recovers these from the raw header declaration text (which still spells the real type) and rewrites the IDL entry. Wired into `run.py` right after `parse_all_headers`.

It is idempotent and a no-op on correctly-parsed headers: it only rewrites a type that is currently `"int"` / `"int *"` and whose header declaration spells a recoverable type. Genuinely-int functions (e.g. `intspan_width`) are left untouched.
Validation
plus a genuine-int control: all recovered correctly, control unchanged,
re-run is a 0/0 no-op (idempotent).
`extern` decls).

This makes the IDL, and any binding regenerated from it, reproducible without external post-processing scripts.