Skip to content

Allocate all long-lived contents of interpreter memory via the jit interface allocation apis#125091

Open
davidwrighton wants to merge 3 commits intodotnet:mainfrom
davidwrighton:mini_interpreter_linker_2
Open

Allocate all long-lived contents of interpreter memory via the jit interface allocation apis#125091
davidwrighton wants to merge 3 commits intodotnet:mainfrom
davidwrighton:mini_interpreter_linker_2

Conversation

@davidwrighton
Copy link
Member

This makes all allocation of memory for the interpreter which are long-lived be allocated in the one big allocation. This should fix any issues where we have memory leaking from collectible assemblies.

davidwrighton and others added 3 commits March 2, 2026 14:13
…aBuilder

Remove the unused InterpReloc struct, m_relocs member, and AddReloc
method from InterpMethodDataBuilder. The relocation application loop
in Finalize is also removed since there are no relocs to apply.

The MemPoolAllocator constructor parameter is removed since it was
only needed for the TArray<InterpReloc> member, and the datastructs.h
include is dropped as TArray is no longer used in the header.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @BrzVlad, @janvorli, @kg
See info in area-owners.md if you want to be subscribed.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors CoreCLR interpreter method compilation to place long-lived per-method data (bytecode header + bytecodes + method metadata + related structures) into a single allocation obtained via the JIT allocMem interface, with the goal of preventing leaks for collectible assemblies.

Changes:

  • Introduces InterpMethodDataBuilder to size/lay out a unified method-data allocation.
  • Switches InterpCompiler::CompileMethod to return bool and adds GetTotalAllocationSize/FinalizeMethodData to drive unified allocation and finalization.
  • Adds a new interpreter memory kind (IMK_MethodData) and updates allocator plumbing (MemPoolAllocator) and build inputs.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
src/coreclr/interpreter/interpmethoddata.h Adds method-data section model and builder API for unified allocation layout.
src/coreclr/interpreter/interpmethoddata.cpp Implements section sizing, alignment, and final offset computation for unified allocation.
src/coreclr/interpreter/interpmemkind.h Adds MethodData memkind for compilation-time method data allocations.
src/coreclr/interpreter/interpalloc.h Moves MemPoolAllocator into shared allocator header and changes it to wrap InterpAllocator.
src/coreclr/interpreter/eeinterp.cpp Updates compile pipeline to allocate one unified block and finalize method data into it.
src/coreclr/interpreter/compiler.h Updates compiler API for new allocation/finalization flow; adds builder + fixup tracking state.
src/coreclr/interpreter/compiler.cpp Implements new finalization/copy/fixup logic; changes AllocMethodData to use arena allocator.
src/coreclr/interpreter/CMakeLists.txt Adds interpmethoddata.cpp to interpreter build.

sz = sizeof(void*); // Arena allocator does not support zero-length allocations, so allocate something of minimum size instead.
}
return m_compiler->getAllocator(m_memKind).allocate<int8_t>(sz);
return ((InterpAllocator*)&m_allocator)->allocate<int8_t>(sz);
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MemPoolAllocator::Alloc removes constness from m_allocator via a C-style cast to call InterpAllocator::allocate. This is undefined behavior because the allocator instance is stored as const inside TArray, and the cast can allow writes through a const object (and may be miscompiled under optimization). Consider making m_allocator mutable, storing a pointer/reference to the underlying ArenaAllocator + kind, or adjusting the allocator wrapper so Alloc can call a const-qualified allocation API without const-casting.

Suggested change
return ((InterpAllocator*)&m_allocator)->allocate<int8_t>(sz);
InterpAllocator allocatorCopy = m_allocator;
return allocatorCopy.allocate<int8_t>(sz);

Copilot uses AI. Check for mistakes.
Comment on lines +1941 to +1990
// Copy async suspend data and fix up pointers
uint32_t currentAsyncOffset = asyncSuspendDataOffset;
uint32_t currentIntervalMapOffset = intervalMapsOffset;

InterpByteCodeStart* pByteCodeStart = (InterpByteCodeStart*)rxBase;

for (int32_t i = 0; i < m_asyncSuspendDataItems.GetSize(); i++)
{
InterpAsyncSuspendData* srcData = m_asyncSuspendDataItems.Get(i);
InterpAsyncSuspendData* dstDataRW = (InterpAsyncSuspendData*)(rwBase + currentAsyncOffset);

// Copy the struct
memcpy(dstDataRW, srcData, sizeof(InterpAsyncSuspendData));

// Fix up the methodStartIP to point to the final bytecode start
dstDataRW->methodStartIP = pByteCodeStart;

// Fix up interval map pointers if they exist
// Note: The interval maps were allocated via AllocMethodData in the old model,
// we need to copy them to the new allocation and fix up the pointers
if (srcData->liveLocalsIntervals != nullptr)
{
// Count entries
int32_t count = 0;
while (srcData->liveLocalsIntervals[count].countBytes != 0) count++;
count++; // Include terminator

InterpIntervalMapEntry* dstMapRW = (InterpIntervalMapEntry*)(rwBase + currentIntervalMapOffset);
InterpIntervalMapEntry* dstMapRX = (InterpIntervalMapEntry*)(rxBase + currentIntervalMapOffset);
memcpy(dstMapRW, srcData->liveLocalsIntervals, count * sizeof(InterpIntervalMapEntry));
dstDataRW->liveLocalsIntervals = dstMapRX;
currentIntervalMapOffset += count * sizeof(InterpIntervalMapEntry);
}

if (srcData->zeroedLocalsIntervals != nullptr)
{
// Count entries
int32_t count = 0;
while (srcData->zeroedLocalsIntervals[count].countBytes != 0) count++;
count++; // Include terminator

InterpIntervalMapEntry* dstMapRW = (InterpIntervalMapEntry*)(rwBase + currentIntervalMapOffset);
InterpIntervalMapEntry* dstMapRX = (InterpIntervalMapEntry*)(rxBase + currentIntervalMapOffset);
memcpy(dstMapRW, srcData->zeroedLocalsIntervals, count * sizeof(InterpIntervalMapEntry));
dstDataRW->zeroedLocalsIntervals = dstMapRX;
currentIntervalMapOffset += count * sizeof(InterpIntervalMapEntry);
}

currentAsyncOffset += sizeof(InterpAsyncSuspendData);
}
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FinalizeMethodData increments currentAsyncOffset/currentIntervalMapOffset and copies data without validating that the final offsets stay within the sizes reserved in m_methodDataBuilder. Adding debug asserts (e.g., that the amount written to each section does not exceed GetSectionSize(...)) would help prevent silent buffer overruns if section reservation and copy logic ever diverge.

Copilot uses AI. Check for mistakes.
void PatchRelocations(TArray<Reloc*, MemPoolAllocator> *relocs);
InterpMethod* CreateInterpMethod();
void PrepareInterpMethod();
void UpdateWithFinalMethodByteCodeAddress(InterpByteCodeStart *pByteCodeStart);
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UpdateWithFinalMethodByteCodeAddress is still declared as a private helper, but it no longer appears to be called anywhere after switching to FinalizeMethodData. Keeping an unused method here makes it harder to reason about which fixup path is authoritative; consider deleting it or wiring it into the new finalization flow if it’s still intended to be used.

Suggested change
void UpdateWithFinalMethodByteCodeAddress(InterpByteCodeStart *pByteCodeStart);

Copilot uses AI. Check for mistakes.
Comment on lines +93 to +99
void InterpMethodDataBuilder::Finalize(void* baseAddressRW, void* baseAddressRX)
{
assert(!m_finalized);
m_finalBaseAddress = (uint8_t*)baseAddressRX;

m_finalized = true;
}
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InterpMethodDataBuilder::Finalize takes baseAddressRW but doesn't use it, which can trigger unused-parameter warnings in some builds. Consider removing the parameter from the definition/signature if it isn't needed, or explicitly marking it unused (e.g., (void)baseAddressRW) if the signature is intentional for symmetry.

Copilot uses AI. Check for mistakes.
#include "interpmethoddata.h"

uint32_t InterpMethodDataBuilder::AlignUp(uint32_t value, uint32_t alignment)
{
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AlignUp assumes alignment is a power of two (bitmask trick). If AllocateInSection can ever be called with a non-power-of-two alignment, the result will be incorrect. Consider asserting that alignment is non-zero and a power of two before applying the mask-based rounding.

Suggested change
{
{
assert(alignment != 0);
assert((alignment & (alignment - 1)) == 0);

Copilot uses AI. Check for mistakes.
Comment on lines +56 to +59
InterpSectionRef() : section(InterpMethodDataSection::Header), offset(0) {}
InterpSectionRef(InterpMethodDataSection s, uint32_t o) : section(s), offset(o) {}

bool IsNull() const { return section == InterpMethodDataSection::Header && offset == 0; }
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InterpSectionRef::IsNull treats (Header, offset 0) as a null sentinel, but (Header, 0) is also a valid reference to the allocation base. If this is meant to represent an optional reference, consider using an explicit sentinel section value (e.g., Count) or a separate boolean to avoid ambiguity.

Suggested change
InterpSectionRef() : section(InterpMethodDataSection::Header), offset(0) {}
InterpSectionRef(InterpMethodDataSection s, uint32_t o) : section(s), offset(o) {}
bool IsNull() const { return section == InterpMethodDataSection::Header && offset == 0; }
InterpSectionRef() : section(InterpMethodDataSection::Count), offset(0) {}
InterpSectionRef(InterpMethodDataSection s, uint32_t o) : section(s), offset(o) {}
bool IsNull() const { return section == InterpMethodDataSection::Count; }

Copilot uses AI. Check for mistakes.
Comment on lines +16 to +34
// InterpMethodDataBuilder accumulates all persistent data for a compiled method
// during compilation, then finalizes it into a single contiguous allocation.
//
// Memory Layout after finalization:
// ┌────────────────────────────────────────┐
// │ InterpByteCodeStart (Method ptr) │ <- baseAddress
// ├────────────────────────────────────────┤
// │ Bytecodes (int32_t[]) │
// ├────────────────────────────────────────┤
// │ InterpMethod struct │
// ├────────────────────────────────────────┤
// │ DataItems array (void*[]) │
// ├────────────────────────────────────────┤
// │ InterpGenericLookup structs │
// ├────────────────────────────────────────┤
// │ InterpAsyncSuspendData structs │
// ├────────────────────────────────────────┤
// │ InterpIntervalMapEntry arrays │
// └────────────────────────────────────────┘
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The memory layout comment lists separate “InterpGenericLookup structs” storage, but the current compiler encodes InterpGenericLookup inline in the DataItems table (see GetDataItemIndex(const InterpGenericLookup&)). Consider updating the diagram/comments to reflect the actual layout, or removing the unused GenericLookups section if it’s not part of the new unified allocation plan.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants