hrw4u: native C++ parser and visitor for header_rewrite #12825
zwoop wants to merge 8 commits into apache:master
|
I'm making this a draft for now. I think it needs to go into ATS v11, since we're getting close to 10.2 branching and release. |
|
[approve ci debian] |
0580351 to ae1579b
|
[approve ci rocky] |
c2479e2 to 453a6ab
|
[approve ci] |
bneradt left a comment
I've verified that, with your latest updates, I can build this with our internal Yahoo tree on my dev box. However, we will need to update our build infrastructure to include ANTLR before I can test this on a production box.
I can deal with that over the next few weeks. In the meantime, let's land this work and we can steer the ship if need be.
This also adds a new tool, and infrastructure, to allow us to compare and verify that an old and a new (hrw4u) configuration generate the same structure of Statements and Operators (functional equivalence). That tool could also be used in an automation process migrating existing configurations to hrw4u.
u4wrh: adds an option to disable optimizations
Fixes parser problems in u4wrh with implicit hooks.
This also fixes a few issues in the toolchains.
The hrw4u code was developed on macOS with Clang, which is lenient about C++20 designated initializer ordering and missing fields. GCC enforces these strictly: designators must appear in struct declaration order, and skipped fields trigger -Wmissing-field-initializers (fatal under -Werror).
Fix Tables.cc by reordering all designators to match MapParams field order, fix Error.cc by adding the skipped .code field, and suppress the missing-field warning for the hrw4u target. Also add -fPIC and an ANTLR4 shared library fallback for systems without the static ANTLR4 library.
The hrw_confcmp tool had three link failures on GNU ld:
- hrw_confcmp_lib was missing its dependency on hrw_confcmp_parser (which provides the Parser and HRWSimpleTokenizer symbols).
- The CertBase::SAN::SANBase::Join stub returned std::string instead of cripts::string (different mangled name).
- GNU ld's single-pass archive processing missed intra-library symbol references that macOS ld resolves automatically.
(cherry picked from commit df0ed61c784f193a945a35a68e4f98a7ce59b503)
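The designated-initializer rule described in this commit message can be illustrated with a minimal sketch. `ExampleParams` and its fields are hypothetical stand-ins, since the real `MapParams` layout is not shown in this conversation; the point is only that designators follow declaration order and that every field is named, which is the form the commit message says GCC accepts without a missing-field warning.

```cpp
#include <string_view>

// Hypothetical stand-in for a table-entry struct. The field names here are
// assumptions for illustration and do not mirror the real MapParams.
struct ExampleParams {
  std::string_view target;
  bool             upper    = false;
  bool             validate = false;
  int              sections = 0;
};

// Designators appear in declaration order (target, upper, validate, sections)
// and no field is skipped, so neither the C++20 ordering rule nor a
// missing-field warning is triggered.
constexpr ExampleParams in_order{
  .target   = "X-Example",
  .upper    = true,
  .validate = false,
  .sections = 1,
};

// The out-of-order form below is ill-formed in C++20 and is rejected by GCC,
// so it is left commented out:
//   constexpr ExampleParams out_of_order{.sections = 1, .target = "X-Example"};

static_assert(in_order.sections == 1, "fields are initialized as written");
```

Naming every field explicitly, rather than relying on default member initializers, keeps the table entries portable across compilers at the cost of some verbosity.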
|
[approve ci] |
Pull request overview
This PR introduces a native C++ ANTLR4-based parser for the header_rewrite plugin, enabling direct parsing of .hrw4u configuration files without requiring external Python tooling. The implementation adds a new src/hrw4u library with visitor-based parsing, integrates it into the header_rewrite plugin, and includes a comparison tool (hrw_confcmp) to validate equivalence between legacy and hrw4u configurations.
Changes:
- Adds native C++ hrw4u parser library with ANTLR4 visitor implementation, type system, symbol resolution, and error handling
- Integrates native parser into header_rewrite plugin with factory bridge, object type system, and comparison interfaces
- Adds hrw_confcmp tool with test runner for validating configuration equivalence
- Updates Python tools for new operators (set-cc-alg, set-effective-address), improved reverse compilation, and TXN_CLOSE hook support
- Adds comprehensive .hrw4u test files and documentation updates
Reviewed changes
Copilot reviewed 85 out of 87 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/hrw4u/* | Native C++ parser implementation with visitor, types, tables, and error handling |
| include/hrw4u/* | Public API headers for parser library |
| plugins/header_rewrite/hrw4u.{cc,h} | Integration layer between parser and plugin |
| plugins/header_rewrite/objtypes.{cc,h}, types.h | Object type system for native parsing |
| plugins/header_rewrite/*.{cc,h} | Added type_name(), equals(), initialize() methods for comparison |
| tools/hrw_confcmp/* | Configuration comparison tool with test runner |
| tools/hrw4u/src/* | Python tool updates for new features and fixes |
| tests/gold_tests/* | New .hrw4u test files for existing tests |
| doc/admin-guide/* | Updated documentation for native parsing |
| CMakeLists.txt | Build system integration for new components |
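The `type_name()` and `equals()` methods listed in the table suggest how an equivalence check between two parsed configurations might be structured. The following is only a sketch under stated assumptions: `Statement`, `ExampleSetHeader`, `ExampleRmHeader`, and `chains_equivalent` are hypothetical names introduced for illustration, not the plugin's actual classes or the hrw_confcmp implementation.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical minimal interface; the plugin's real classes are richer.
class Statement {
public:
  virtual ~Statement() = default;
  virtual std::string_view type_name() const = 0;
  virtual bool equals(const Statement &other) const = 0;
};

// Hypothetical operator carrying a header name and a value.
class ExampleSetHeader : public Statement {
public:
  ExampleSetHeader(std::string name, std::string value)
    : _name(std::move(name)), _value(std::move(value)) {}
  std::string_view type_name() const override { return "ExampleSetHeader"; }
  bool equals(const Statement &other) const override {
    // dynamic_cast yields nullptr when the other object has a different type.
    const auto *o = dynamic_cast<const ExampleSetHeader *>(&other);
    return o != nullptr && _name == o->_name && _value == o->_value;
  }

private:
  std::string _name;
  std::string _value;
};

// Hypothetical operator carrying only a header name.
class ExampleRmHeader : public Statement {
public:
  explicit ExampleRmHeader(std::string name) : _name(std::move(name)) {}
  std::string_view type_name() const override { return "ExampleRmHeader"; }
  bool equals(const Statement &other) const override {
    const auto *o = dynamic_cast<const ExampleRmHeader *>(&other);
    return o != nullptr && _name == o->_name;
  }

private:
  std::string _name;
};

using Chain = std::vector<std::unique_ptr<Statement>>;

// Two chains are treated as equivalent when they have the same length and
// each position holds an object of the same type with equal contents.
bool chains_equivalent(const Chain &a, const Chain &b) {
  if (a.size() != b.size()) {
    return false;
  }
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i]->type_name() != b[i]->type_name() || !a[i]->equals(*b[i])) {
      return false;
    }
  }
  return true;
}
```

Comparing `type_name()` before `equals()` lets a real tool report which position and which object types diverged, which is more useful in a test runner than a bare pass or fail.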
This essentially converts the existing Python parser, visitor, and emitters to C++, natively calling the various factory functions to create the object structure. In addition, it adds a new testing tool, hrw_confcmp, which compares a legacy configuration against an hrw4u configuration and makes sure they both create the same set of objects. To support this, the header_rewrite core also gained some support for identifying the various Operators and Conditions. Usage: