fix(har): correct attribute name mismatches and dict init bug #1
andreabedini wants to merge 3 commits into mkb79:main from
…ar.py
The HAR exporter referenced `item.conn_id` and `item.req_id` on parser
dataclasses that expose `connection_id` and `request_id` respectively,
causing AttributeError at runtime. Also fixes `_headers_to_dict_multi`
which initialised `out` as a list (`[]`) instead of a dict (`{}`).
After these fixes, `har_from_session()` successfully exports 90 entries
from a real-world session file.
Pull request overview
Fixes runtime crashes in the HAR export path (hc-har) by aligning hc_har.py with the parser dataclass field names and correcting an invalid dict initialization.
Changes:
- Replace incorrect parser attribute access (`conn_id`/`req_id`) with `connection_id`/`request_id` throughout `har_from_session`.
- Fix `_headers_to_dict_multi` to initialize `out` as a dict rather than a list.
- Minor docstring/formatting adjustments.
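To make the two failure modes concrete, here is a minimal sketch. `Frame` is a hypothetical stand-in for the parser dataclasses, and `headers_to_dict_multi` is only an assumed shape for `_headers_to_dict_multi` (grouping values by header name); neither is copied from the repository.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical stand-in for a parser dataclass such as ConnectionFrame."""
    connection_id: int
    request_id: int


def headers_to_dict_multi(pairs):
    """Assumed behavior: group header values by name, keeping duplicates."""
    # `out = []` would fail below: a list has no `setdefault` and cannot be
    # keyed by header name.
    out = {}
    for name, value in pairs:
        out.setdefault(name, []).append(value)
    return out


frame = Frame(connection_id=1, request_id=7)

# The pre-fix spelling fails because the dataclass has no such field.
try:
    frame.conn_id
except AttributeError as exc:
    print("old access fails:", exc)

# The corrected spelling matches the dataclass field names.
print(frame.connection_id, frame.request_id)
print(headers_to_dict_multi([("Set-Cookie", "a=1"), ("Set-Cookie", "b=2")]))
```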
mkb79 left a comment
Review notes after checking the local checkout and PR diff:
- The core fix looks correct. `ConnectionFrame` and `RequestMainInfo` expose `connection_id` and `request_id`, so replacing the previous `item.conn_id`/`item.req_id` access in `hc_har.py` fixes a real runtime crash in the HAR export path. The `_headers_to_dict_multi` change from `[]` to `{}` is also correct.
- Before merging, I would clean up two small PR-introduced formatting issues: the docstring now contains the literal text `UTF\u20118` instead of `UTF-8`/`UTF‑8`, and `src/httpcatcher_parser/hc_har.py` no longer ends with a trailing newline. These are not runtime blockers, but they are unnecessary noise in the patch.
- There is still a naming inconsistency in the surrounding HAR code: `hc_parser.py` consistently exposes `request_id`/`connection_id`, while `hc_har.py` still uses internal names like `req_id`/`conn_id` in `ReqAgg`, `_ensure`, comments, and maps. This does not currently crash, but it is the kind of mixed naming that made the original bug easy to introduce. I would consider renaming the HAR aggregator fields to `request_id` and `connection_id` in a follow-up or in this PR if you want to keep the naming aligned.
- A separate existing HAR issue: when `decompress_response=True`, `response.bodySize` is currently based on `len(body_out)`, i.e. the decompressed body. For HAR semantics, it should likely represent the on-wire response body size (`len(resp_raw)`), while `content.size` can remain the decoded content size.
- Header lookups are only partially case-insensitive. `_headers_to_dict_multi` preserves the original header casing, but later lookups only check variants such as `Content-Type` and `content-type`. Since HTTP header names are case-insensitive, a normalized lookup helper would make this more robust for `Content-Type`, `Content-Encoding`, `Location`, etc.
- Minor existing style inconsistency: in `hc_parser.py`, `from __future__ import annotations` appears before the module docstring, so `httpcatcher_parser.hc_parser.__doc__` is `None`. The file also says comments/docstrings are English, but `MetricsHarvester` still has a German docstring.
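As a rough illustration of the normalized lookup mentioned above, something along these lines might work. `get_header_values` is a hypothetical helper name, and the sketch assumes the header dict maps each name (in its original casing) to a list of values; it is not taken from `hc_har.py`.

```python
def get_header_values(headers, name):
    """Return all values stored under `name`, ignoring case.

    Assumes `headers` maps header names (original casing preserved) to
    lists of values. Entries that differ only in case are merged in
    dictionary order. Returns an empty list when the header is absent.
    """
    wanted = name.lower()
    found = []
    for key, values in headers.items():
        if key.lower() == wanted:
            found.extend(values)
    return found


headers = {"CONTENT-TYPE": ["application/json"], "location": ["/next"]}
print(get_header_values(headers, "Content-Type"))
print(get_header_values(headers, "Location"))
print(get_header_values(headers, "Content-Encoding"))
```

Keeping the original casing in the dict and normalizing only at lookup time means the exported HAR can still show headers as they appeared on the wire.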
Verification I ran locally:
- `python3 -m compileall -q src` passes.
- `PYTHONPATH=src python3 -m httpcatcher_parser.hc_har --help` works.
- A smoke test with `hc_sessions/2025_08_28__13_25_58` wrote a HAR file with 17 entries.
- `pytest -q` reports `no tests ran`, because the repository currently has no test files.
Overall: I would merge the functional fix after cleaning up the docstring escape and missing trailing newline. The other points are follow-up candidates unless you want this PR to also do a small consistency pass around the HAR exporter.
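For the `bodySize` follow-up, a minimal sketch of the intended split between on-wire and decoded sizes could look like this. `response_sizes` is a hypothetical function, only gzip is handled, and the variable names `resp_raw` and `body_out` are borrowed from the note above rather than from the actual code path.

```python
import gzip


def response_sizes(resp_raw: bytes, content_encoding: str = "") -> dict:
    """Return HAR-style size fields for one response body.

    Assumption: only gzip is decoded here; any other encoding is passed
    through unchanged.
    """
    if content_encoding.strip().lower() == "gzip":
        body_out = gzip.decompress(resp_raw)
    else:
        body_out = resp_raw
    return {
        "bodySize": len(resp_raw),            # bytes as received on the wire
        "content": {"size": len(body_out)},   # bytes after decoding
    }


payload = b'{"ok": true}' * 50
wire = gzip.compress(payload)
sizes = response_sizes(wire, "gzip")
print(sizes)
```

With an uncompressed body the two numbers are equal; with compression `content.size` is normally the larger one.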
Hi @mkb79, thank you for this project! I had to make a couple of changes to make it work with my session capture.
Summary
`hc_har.py` references attribute names that don't exist on the parser's dataclasses, causing `AttributeError` at runtime whenever `hc-har` is invoked.
Bug 1: wrong attribute names in `har_from_session`
The parser dataclasses (`ConnectionFrame`, `RequestMainInfo`, `RequestHeader`, `RequestBody`, `ResponseHeader`, `ResponseBody`) all expose `connection_id` and `request_id`, but the HAR builder accessed `item.conn_id`/`item.req_id`.
Fixed by renaming all seven access sites to match the parser's field names.
Bug 2: list literal instead of dict in `_headers_to_dict_multi`
Test plan
- `hc-har <session_file>`: previously crashed immediately with `AttributeError`, now completes successfully
- `.har` file is valid JSON and contains the expected number of entries
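The second check in the test plan could be scripted roughly as follows. `count_har_entries` is a hypothetical helper, not part of the repository; it assumes the standard HAR layout where entries live under `log.entries`.

```python
import json


def count_har_entries(path):
    """Load a HAR file and return the number of entries in log.entries."""
    with open(path, encoding="utf-8") as fh:
        har = json.load(fh)  # raises json.JSONDecodeError on invalid JSON
    entries = har["log"]["entries"]  # raises KeyError if the layout is wrong
    if not isinstance(entries, list):
        raise ValueError("log.entries must be a list")
    return len(entries)
```

Calling `count_har_entries` on the exported file and comparing the result with the expected entry count covers both the "valid JSON" and "expected number of entries" checks in one step.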