Skip to content

Fix tracing cow changes#1260

Open
dblnz wants to merge 5 commits intohyperlight-dev:mainfrom
dblnz:fix-tracing-cow-changes
Open

Fix tracing cow changes#1260
dblnz wants to merge 5 commits intohyperlight-dev:mainfrom
dblnz:fix-tracing-cow-changes

Conversation

@dblnz
Copy link
Contributor

@dblnz dblnz commented Feb 25, 2026

Description

This PR fixes #1242.
The guest tracing data received by the host assumes identity mapped addresses, the address needs to be translated to correctly decode the flatbuffers data.

This can be reviewed commit by commit.
The first commit tackles this issue, the other two improve on the spans and usability.

@dblnz dblnz requested a review from danbugs as a code owner February 25, 2026 16:37
@dblnz dblnz added the kind/bugfix For PRs that fix bugs label Feb 25, 2026
@dblnz dblnz force-pushed the fix-tracing-cow-changes branch from 78fb1fe to 6176db1 Compare February 25, 2026 16:46
@danbugs danbugs mentioned this pull request Feb 25, 2026
17 tasks
@ludfjig ludfjig requested a review from Copilot February 25, 2026 18:46
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes guest→host tracing breakage introduced by CoW by translating the guest-provided trace buffer pointer (GVA) via a page-table walk (using CR3) before decoding the FlatBuffer trace batch. It also refines guest-side spans around initialization/dispatch to improve trace usability and reduce mismatched span warnings.

Changes:

  • Translate trace batch pointers from GVA→GPA on the host (CR3-driven page table walk) before decoding trace FlatBuffers.
  • Update the guest↔host tracing ABI to pass trace batch metadata via r12/r13/r14 and adjust host handling accordingly.
  • Improve guest tracing spans around generic_init and internal_dispatch_function and enable Debug printing of FunctionCall.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/hyperlight_host/src/sandbox/trace/context.rs Switch trace batch extraction to read via GVA translation and new from_regs API.
src/hyperlight_host/src/sandbox/snapshot.rs Expose helpers for resolving GPAs and reading page tables from shared/scratch memory.
src/hyperlight_host/src/mem/mgr.rs Add read_guest_memory_by_gva that walks page tables and reads from shared/scratch backing.
src/hyperlight_host/src/hypervisor/hyperlight_vm.rs Pass CR3 into trace handling and adjust when trace handling is attempted.
src/hyperlight_guest_tracing/src/state.rs Send trace batch pointer/len via r12/r13/r14 in the OUT instruction.
src/hyperlight_guest_bin/src/lib.rs Add a generic_init span and flush/close ordering to avoid host span warnings.
src/hyperlight_guest_bin/src/guest_function/call.rs Add an internal_dispatch_function span and trace log the FunctionCall in debug builds.
src/hyperlight_guest/src/exit.rs Update OUT path to attach trace batch metadata via r12/r13/r14.
src/hyperlight_common/src/vmem.rs Update comment noting virt_to_phys is now used for tracing reads.
src/hyperlight_common/src/flatbuffer_wrappers/function_call.rs Derive Debug for FunctionCall to support tracing output.

// to correct execution of the guest
match self.vm.sregs().map(|s| s.cr3) {
Ok(cr3) => {
tc.handle_trace(&regs, mem_mgr, cr3).unwrap_or_else(|e| {
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

virt_to_phys expects root_pt to be a page-aligned PML4 base, but the value passed here comes directly from sregs.cr3 and may include low flag/PCID bits. The guest-side root_table() masks with & !0xfff; the host should do the same (or use a shared helper) before walking page tables, otherwise translation can read the wrong entries.

Suggested change
tc.handle_trace(&regs, mem_mgr, cr3).unwrap_or_else(|e| {
let root_pt = cr3 & !0xfff_u64;
tc.handle_trace(&regs, mem_mgr, root_pt).unwrap_or_else(|e| {

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if I need to do this.
@syntactically, I've seen it's not done on the host, should I change this?

Copy link
Contributor

@ludfjig ludfjig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to be but not too familiar with this code. BTW why were the registers changed? Also, should we have tests to make sure this doesn't break in the future again?

- With the new Copy-on-Write changes, the guest virtual addresses and
  guest physical addresses are no longer identity mapped, so we need a
  way to do this translation when a new guest trace batch arrives from
  the guest

Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
- The max virtual address was calculated based on the vmin (start of
  page), not the actual virtual address provided

Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
- Use registers r12, r13 and r14 to pass trace data information from the
  guest to the host
- This is in use temporarily, until the virtio queues are implemented

Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
- For each call to `internal_dispatch_function` now we create a span
  manually because instrumenting the function would be redundant because
  of the call to `new_call` that resets tracing state for each new
  function call
- The same logic is used for `generic_init` to enable it for the
  initialise call

Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
@dblnz dblnz force-pushed the fix-tracing-cow-changes branch from 6176db1 to fb1f56a Compare February 26, 2026 16:31
let mapping = Mapping {
phys_base: 0x1000,
virt_base: 0x1000,
len: 2 * PAGE_SIZE as u64, // 2 page + 512 bytes
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
len: 2 * PAGE_SIZE as u64, // 2 page + 512 bytes
len: 2 * PAGE_SIZE as u64, // 2 page

@dblnz
Copy link
Contributor Author

dblnz commented Feb 26, 2026

BTW why were the registers changed?

The registers changed because I wanted to ensure the registers used couldn't be overwritten when returning from a function (currently, we set the registers in preparation for a hlt to execute, and the hlt instruction is written in assembly after the call generic_init or call internal_dispatch_function finishes).
By reading some more info on calling convention, it looks like r12,r13,r14 are preserved registers.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated no new comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/bugfix For PRs that fix bugs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix guest tracing related to CoW changes

4 participants