contracts: strip metadata before DATA_INVARIANT in get_contract#9056
Open
tautschnig wants to merge 1 commit into
Open
contracts: strip metadata before DATA_INVARIANT in get_contract#9056tautschnig wants to merge 1 commit into
tautschnig wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Fixes false-positive DATA_INVARIANT failures when applying contracts across translation units by comparing function signatures after removing non-semantic metadata from the involved type ireps.
Changes:
- Adds a recursive helper to strip source-location, identifier/base-name, and contract-spec decorations from type ireps prior to equality comparison.
- Updates
get_contractto compare the stripped versions of the contract and function types.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+609
to
+627
| std::function<void(irept &)> strip_metadata = [&strip_metadata](irept &node) | ||
| { | ||
| node.remove(ID_C_source_location); | ||
| node.remove(ID_C_spec_requires); | ||
| node.remove(ID_C_spec_ensures); | ||
| node.remove(ID_C_spec_assigns); | ||
| node.remove(ID_C_spec_frees); | ||
| // Also strip per-declaration identifier/base-name ireps so | ||
| // two parameter declarations with the same type but | ||
| // different mangled caller-specific identifiers still | ||
| // compare equal. | ||
| node.remove(ID_C_identifier); | ||
| node.remove(ID_C_base_name); | ||
| // Walk named sub-ireps (the ones that aren't comments). | ||
| for(auto &[name, sub] : node.get_named_sub()) | ||
| strip_metadata(sub); | ||
| for(auto &sub : node.get_sub()) | ||
| strip_metadata(sub); | ||
| }; |
| // comments prefixed with `#`) from a type irep, recursively. | ||
| // Contract-spec sub-ireps get stripped too so they don't throw | ||
| // off the comparison against the plain function-declaration. | ||
| std::function<void(irept &)> strip_metadata = [&strip_metadata](irept &node) |
goto-instrument --replace-call-with-contract's get_contract helper
compares the contract-declaration's code_typet with the
function-symbol's code_typet via irept::operator==. That operator
recurses into every sub-irep, including metadata (source locations,
per-declaration #identifier strings) and contract- specific sub-ireps
(spec_requires / spec_ensures / spec_assigns / spec_frees). The
metadata is unrelated to whether the two declarations describe the same
function signature, so comparing it turns every cross-TU contract
application into a DATA_INVARIANT false-positive — even when the
signatures match exactly.
This commit strips the known irrelevant fields recursively from both
types before comparing:
- #source_location
- #identifier / #base_name (per-declaration parameter names)
- C_spec_requires / C_spec_ensures / C_spec_assigns /
C_spec_frees (contract-specific decorations)
The remaining type comparison is still strict on the semantic shape
(code type, return type, parameter count and types); it just no longer
fires on metadata-only differences.
All three CORE contracts regression sets (contracts-CORE,
contracts-dfcc-CORE, contracts-non-dfcc-CORE) remain green, and the
integration/linux scan pipeline (9 regression scripts) is unchanged.
A further case (contract adapter declares an opaque struct whose linked
kernel TU has the struct's full definition) is a genuine semantic
mismatch and should continue to fire the invariant. That's a design
matter for the scan adapter rather than a bug in get_contract.
Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
a17d733 to
34fa09f
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #9056 +/- ##
===========================================
- Coverage 80.68% 80.68% -0.01%
===========================================
Files 1714 1714
Lines 189501 189519 +18
Branches 73 73
===========================================
+ Hits 152902 152916 +14
- Misses 36599 36603 +4 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
goto-instrument --replace-call-with-contract's get_contract helper compares the contract-declaration's code_typet with the function-symbol's code_typet via irept::operator==. That operator recurses into every sub-irep, including metadata (source locations, per-declaration #identifier strings) and contract- specific sub-ireps (spec_requires / spec_ensures / spec_assigns / spec_frees). The metadata is unrelated to whether the two declarations describe the same function signature, so comparing it turns every cross-TU contract application into a DATA_INVARIANT false-positive — even when the signatures match exactly.
Integration/linux LIM-016 tracks this as a blocker for the per-file harness path.
This commit strips the known irrelevant fields recursively from both types before comparing:
The remaining type comparison is still strict on the semantic shape (code type, return type, parameter count and types); it just no longer fires on metadata-only differences.
All three CORE contracts regression sets (contracts-CORE, contracts-dfcc-CORE, contracts-non-dfcc-CORE) remain green, and the integration/linux scan pipeline (9 regression scripts) is unchanged.
A further case (contract adapter declares an opaque struct whose linked kernel TU has the struct's full definition) is a genuine semantic mismatch and should continue to fire the invariant. That's a design matter for the scan adapter rather than a bug in get_contract.