Refresh ARM FDPIC ldso with XIP-aware relocation #17

Merged: jserv merged 1 commit into main from fdpic-ldso on May 4, 2026

@jserv jserv commented May 4, 2026

The FDPIC dynamic linker assumed mutable text and walked every relocation through a callback dispatch. Both are wasteful on a Cortex-M XIP target, where library text lives in flash and lazy binding eats RAM that userspace needs.

0021 reworks the loader so XIP is the default path:

  • inline loadmap in struct elf_resolve for the common nsegs <= 2 case
  • per-module chunked function-descriptor allocator with intrusive hashtable promoted on a 16-entry threshold
  • ARM eager binding via DL_FORCE_BIND_NOW
  • XIP classification (DL_FDPIC_XIP_TEXT / DL_FDPIC_MUTABLE_TEXT) gated by p_align with a cached read-only loadseg bitmask and overflow fallback for modules with more PT_LOADs than fit in the bitmask
  • relocation-time XIP write guards on per-relocation and batched R_ARM_RELATIVE paths
  • direct ARM relocation parser in place of per-reloc callback dispatch
  • AUX_MAX_AT_ID raised to 45 so AT_FDPIC_EXEC_MAP / AT_FDPIC_INTERP_MAP are reachable in _dl_auxvt
  • arm_fdpic_find_loadseg honors one-past-end on the last segment so _end-style symbol relocation works
  • per-segment batching in elf_machine_relative
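
The cached read-only loadseg bitmask with its overflow fallback can be pictured roughly as below. This is a minimal sketch and not the code from the patch: `struct seg`, `struct module_segs`, `classify_segments`, `seg_is_readonly`, and the 32-bit mask width are all hypothetical, chosen only to illustrate the idea of a per-module mask plus a slow path for modules with more PT_LOADs than the mask has bits.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RO_MASK_BITS 32u /* assumed width of the cached mask */

/* Hypothetical view of one PT_LOAD segment. */
struct seg {
    uint32_t addr;     /* runtime load address */
    uint32_t memsz;    /* p_memsz */
    bool     writable; /* PF_W set in p_flags */
};

/* Hypothetical per-module state. */
struct module_segs {
    const struct seg *segs;
    unsigned nsegs;
    uint32_t ro_mask; /* bit i set => segment i is read-only */
};

/* Build the cache once at load time; segments past the mask width are
 * simply not cached. */
static void classify_segments(struct module_segs *m)
{
    m->ro_mask = 0;
    for (unsigned i = 0; i < m->nsegs && i < RO_MASK_BITS; i++)
        if (!m->segs[i].writable)
            m->ro_mask |= (uint32_t)1 << i;
}

/* Fast path reads the cached bit; indices beyond the mask fall back to
 * inspecting the segment itself. */
static bool seg_is_readonly(const struct module_segs *m, unsigned idx)
{
    if (idx >= m->nsegs)
        return false;
    if (idx < RO_MASK_BITS)
        return (m->ro_mask >> idx) & 1u;
    return !m->segs[idx].writable; /* overflow fallback */
}
```

In this sketch a relocation-time write guard would call `seg_is_readonly` with the index of the segment containing the relocation target and refuse the write when it returns true.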

0022 closes follow-up review findings without changing the policy:

  • broaden the writable-text fallback gate so the mprotect / map_writeable path fires for any XIP demotion (DT_TEXTREL or xip_mapped_text loss), not only DT_TEXTREL. Prevents a non-DT_TEXTREL XIP-incompatible object from relocating against a read-only mapping.
  • replace base + p_memsz end-address arithmetic with subtraction-based bounds (offset = addr - base; offset < memsz) in arm_fdpic_find_loadseg{,_runtime}, arm_addr_is_readonly_load_slow, and arm_abort_xip_text_reloc, defeating wraparound on malformed inputs that could route a write past both the XIP guard and the fallback. The last-segment one-past-end allowance is preserved via a small helper.
  • classify FDPIC objects with no executable PT_LOAD as DL_FDPIC_MUTABLE_TEXT instead of leaving them unclassified, so callers always see a defined text mode.
  • document that DL_FDPIC_STRICT_XIP() must expand to a preprocessor constant since it is consumed by both a runtime if and a #if gate.
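
The difference between the two bounds checks can be shown in isolation. The sketch below is illustrative only: `addr32_t`, `in_seg_naive`, and `in_seg` are hypothetical names, and the real helpers named above (`arm_fdpic_find_loadseg` and friends) may be structured differently. It assumes 32-bit addresses, as on the ARM targets discussed.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t addr32_t; /* assumed: 32-bit address space */

/* End-address form: base + memsz can wrap past 0xFFFFFFFF on malformed
 * input, making the computed end smaller than base. */
static bool in_seg_naive(addr32_t addr, addr32_t base, addr32_t memsz)
{
    addr32_t end = (addr32_t)(base + memsz); /* may wrap */
    return addr >= base && addr < end;
}

/* Subtraction form: once addr >= base is known, addr - base cannot wrap,
 * so the comparison against memsz is always meaningful. The flag models
 * the one-past-end allowance for the last segment. */
static bool in_seg(addr32_t addr, addr32_t base, addr32_t memsz,
                   bool allow_one_past_end)
{
    if (addr < base)
        return false;
    addr32_t offset = addr - base;
    if (offset < memsz)
        return true;
    return allow_one_past_end && offset == memsz;
}
```

With `base = 0xFFFFF000` and `memsz = 0x2000`, the naive end address wraps to `0x1000`, so `in_seg_naive(0xFFFFF800, ...)` reports the address as outside the segment while `in_seg` still reports it as inside. A guard built on the naive form would miss such an address.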

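The constraint on `DL_FDPIC_STRICT_XIP()` follows from how the preprocessor works: an `#if` can only evaluate integer constant expressions, while a runtime `if` accepts anything. A hedged sketch, in which the macro's value of `1`, the `XIP_POLICY_NAME` macro, and the `must_abort_on_text_write` helper are assumptions made for illustration rather than the definitions in the patch:

```c
#include <stdbool.h>

/* Assumed definition: a literal the preprocessor can evaluate. Defining
 * it as a variable or a function call would still compile in the runtime
 * `if` below but would break the `#if` gate. */
#define DL_FDPIC_STRICT_XIP() 1

/* Compile-time consumer. */
#if DL_FDPIC_STRICT_XIP()
#define XIP_POLICY_NAME "strict"
#else
#define XIP_POLICY_NAME "fallback"
#endif

/* Runtime consumer (hypothetical helper): the compiler can fold the
 * constant branch away. */
static bool must_abort_on_text_write(bool target_is_readonly_text)
{
    if (DL_FDPIC_STRICT_XIP())
        return target_is_readonly_text;
    return false;
}
```
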
github-actions Bot commented May 4, 2026

Subsystem Rollup

Resident .text: 698,884 bytes

Top 12 buckets by resident .text:

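| bucket | bytes | % | symbols | icf | lto clones |
|---|---:|---:|---:|---:|---:|
| kernel | 173,128 | 24.77 | 2244 | 0 | 64 |
| fs | 153,306 | 21.94 | 1643 | 0 | 27 |
| lib | 108,010 | 15.45 | 888 | 0 | 7 |
| mm | 79,620 | 11.39 | 810 | 0 | 19 |
| drivers/tty | 40,268 | 5.76 | 399 | 0 | 3 |
| drivers/base | 34,486 | 4.93 | 605 | 0 | 11 |
| `<icf-merged>` | 24,496 | 3.51 | 243 | 243 | 7 |
| drivers/clk | 20,780 | 2.97 | 315 | 0 | 2 |
| drivers/of | 17,982 | 2.57 | 253 | 0 | 1 |
| arch/arm | 15,208 | 2.18 | 228 | 0 | 21 |
| include | 9,620 | 1.38 | 321 | 0 | 275 |
| scripts | 7,432 | 1.06 | 97 | 0 | 1 |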

Budget gate: ok

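| bucket | actual | limit | band % | delta | status |
|---|---:|---:|---:|---:|---|
| `<icf-merged>` | 24496 | 28000 | 5.0 | -3504 | ok |
| fs | 153306 | 165000 | 2.0 | -11694 | ok |
| kernel | 173128 | 200000 | 2.0 | -26872 | ok |
| kernel/printk | 17574 | 18500 | 2.0 | -926 | ok |
| kernel/sched | 21568 | 23500 | 2.0 | -1932 | ok |
| lib | 108010 | 116000 | 2.0 | -7990 | ok |
| lib/lz4 | 0 | 0 | 0.0 | +0 | ok |
| lib/xz | 0 | 0 | 0.0 | +0 | ok |
| lib/zstd | 0 | 0 | 0.0 | +0 | ok |
| mm | 79620 | 86000 | 2.0 | -6380 | ok |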

Source: profiles/kernel-pgo/none/subsystem-rollup.txt

@jserv jserv merged commit 8496b68 into main May 4, 2026
2 checks passed
@jserv jserv deleted the fdpic-ldso branch May 4, 2026 08:12