Skip to content

contracts: strip metadata before DATA_INVARIANT in get_contract#9056

Open
tautschnig wants to merge 1 commit into
diffblue:developfrom
tautschnig:extract/contracts-strip-metadata-get-contract
Open

contracts: strip metadata before DATA_INVARIANT in get_contract#9056
tautschnig wants to merge 1 commit into
diffblue:developfrom
tautschnig:extract/contracts-strip-metadata-get-contract

Conversation

@tautschnig

Copy link
Copy Markdown
Collaborator

goto-instrument --replace-call-with-contract's get_contract helper compares the contract-declaration's code_typet with the function-symbol's code_typet via irept::operator==. That operator recurses into every sub-irep, including metadata (source locations, per-declaration #identifier strings) and contract- specific sub-ireps (spec_requires / spec_ensures / spec_assigns / spec_frees). The metadata is unrelated to whether the two declarations describe the same function signature, so comparing it turns every cross-TU contract application into a DATA_INVARIANT false-positive — even when the signatures match exactly.

Integration/linux LIM-016 tracks this as a blocker for the per-file harness path.

This commit strips the known irrelevant fields recursively from both types before comparing:

  • #source_location
  • #identifier / #base_name (per-declaration parameter names)
  • C_spec_requires / C_spec_ensures / C_spec_assigns / C_spec_frees (contract-specific decorations)

The remaining type comparison is still strict on the semantic shape (code type, return type, parameter count and types); it just no longer fires on metadata-only differences.

All three CORE contracts regression sets (contracts-CORE, contracts-dfcc-CORE, contracts-non-dfcc-CORE) remain green, and the integration/linux scan pipeline (9 regression scripts) is unchanged.

A further case (contract adapter declares an opaque struct whose linked kernel TU has the struct's full definition) is a genuine semantic mismatch and should continue to fire the invariant. That's a design matter for the scan adapter rather than a bug in get_contract.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@tautschnig tautschnig self-assigned this Jun 18, 2026
Copilot AI review requested due to automatic review settings June 18, 2026 15:51

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes false-positive DATA_INVARIANT failures when applying contracts across translation units by comparing function signatures after removing non-semantic metadata from the involved type ireps.

Changes:

  • Adds a recursive helper to strip source-location, identifier/base-name, and contract-spec decorations from type ireps prior to equality comparison.
  • Updates get_contract to compare the stripped versions of the contract and function types.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +609 to +627
std::function<void(irept &)> strip_metadata = [&strip_metadata](irept &node)
{
node.remove(ID_C_source_location);
node.remove(ID_C_spec_requires);
node.remove(ID_C_spec_ensures);
node.remove(ID_C_spec_assigns);
node.remove(ID_C_spec_frees);
// Also strip per-declaration identifier/base-name ireps so
// two parameter declarations with the same type but
// different mangled caller-specific identifiers still
// compare equal.
node.remove(ID_C_identifier);
node.remove(ID_C_base_name);
// Walk named sub-ireps (the ones that aren't comments).
for(auto &[name, sub] : node.get_named_sub())
strip_metadata(sub);
for(auto &sub : node.get_sub())
strip_metadata(sub);
};
// comments prefixed with `#`) from a type irep, recursively.
// Contract-spec sub-ireps get stripped too so they don't throw
// off the comparison against the plain function-declaration.
std::function<void(irept &)> strip_metadata = [&strip_metadata](irept &node)
goto-instrument --replace-call-with-contract's get_contract helper
compares the contract-declaration's code_typet with the
function-symbol's code_typet via irept::operator==.  That operator
recurses into every sub-irep, including metadata (source locations,
per-declaration #identifier strings) and contract- specific sub-ireps
(spec_requires / spec_ensures / spec_assigns / spec_frees).  The
metadata is unrelated to whether the two declarations describe the same
function signature, so comparing it turns every cross-TU contract
application into a DATA_INVARIANT false-positive — even when the
signatures match exactly.

This commit strips the known irrelevant fields recursively from both
types before comparing:
  - #source_location
  - #identifier / #base_name (per-declaration parameter names)
  - C_spec_requires / C_spec_ensures / C_spec_assigns /
    C_spec_frees (contract-specific decorations)

The remaining type comparison is still strict on the semantic shape
(code type, return type, parameter count and types); it just no longer
fires on metadata-only differences.

All three CORE contracts regression sets (contracts-CORE,
contracts-dfcc-CORE, contracts-non-dfcc-CORE) remain green, and the
integration/linux scan pipeline (9 regression scripts) is unchanged.

A further case (contract adapter declares an opaque struct whose linked
kernel TU has the struct's full definition) is a genuine semantic
mismatch and should continue to fire the invariant.  That's a design
matter for the scan adapter rather than a bug in get_contract.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
@tautschnig tautschnig force-pushed the extract/contracts-strip-metadata-get-contract branch from a17d733 to 34fa09f Compare June 18, 2026 19:16
@codecov

codecov Bot commented Jun 19, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.68%. Comparing base (321ba11) to head (34fa09f).

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9056      +/-   ##
===========================================
- Coverage    80.68%   80.68%   -0.01%     
===========================================
  Files         1714     1714              
  Lines       189501   189519      +18     
  Branches        73       73              
===========================================
+ Hits        152902   152916      +14     
- Misses       36599    36603       +4     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants