feat: native socket I/O tracking via PLT hooks (PROF-10637) #488
Draft
CI Test Results
Run: #25169060411 | Commit:
Status Overview
Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled
Summary: Total: 32 | Passed: 32 | Failed: 0
Updated: 2026-04-30 14:25:56 UTC
Ports native socket sampling on top of MallocTracer's native_allocs pattern.

Key design alignments with MallocTracer:
- recordEvent passes NULL ucontext to recordSample (was: synthesized via getcontext, which confused walkVM's signal-context invariants)
- Uses OS::threadId() instead of ProfiledThread::currentTid()
- profiler.cpp recordSample BCI_NATIVE_SOCKET routes through walkVM with NULL ucontext (DWARF/FP + JavaFrameAnchor fallback)
- Dlopen chain uses Profiler::dlopen_hook -> LibraryPatcher::install_socket_hooks (removes our custom dlopen_hook that previously clobbered Profiler::dlopen_hook)

Removed:
- Custom dlopen_hook and dlopen PLT patching in libraryPatcher_linux.cpp
- recordEvent ucontext synthesis via getcontext()

Kept:
- PoissonSampler + RateLimiter (time-weighted sampling with PID)
- Four hooks (send/recv/write/read) with per-thread Poisson state
- fd-type cache, fd->addr cache

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Tests assert the accessor for 'bytesTransferred' but I had renamed it to 'bytes' during the native_allocs rebase. Restore original field name and category/description to match existing tests. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rebase onto the merged-main base replaced Contexts::get()/writeContext with the writeCurrentContext helper used by sibling event writers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Allow BCI_NATIVE_SOCKET in the J9/Zing walkJavaStack assert; skip the cpu/wall config check when those intervals are not requested; reduce the Enabled/EventFields iteration count from 5000 to 128. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Accept optional time unit on 'natsock' (e.g. natsock=100us); converted to TSC ticks at start. NativeSocketSendRecvSeparateTest uses 100us to avoid Poisson-sampling variance at the default 1ms period. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
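The unit handling described in this commit could look roughly like the sketch below. It is not the PR's parser: `parse_interval_ns` is a hypothetical helper, it stops at nanoseconds rather than converting to TSC ticks, and treating a bare number as nanoseconds is an assumption. Non-positive values such as `-1us` are rejected.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <cstring>

// Hypothetical helper, not the PR's code: returns the interval in
// nanoseconds, or -1 if the text is malformed, non-positive, or overflows.
// Assumption: a bare number with no suffix is taken as nanoseconds.
static long long parse_interval_ns(const char* text) {
    if (text == nullptr || *text == '\0') return -1;

    errno = 0;
    char* end = nullptr;
    long long value = std::strtoll(text, &end, 10);
    if (end == text || errno == ERANGE || value <= 0) return -1;

    // Map the suffix to a nanosecond multiplier.
    long long multiplier;
    if (*end == '\0' || std::strcmp(end, "ns") == 0) multiplier = 1LL;
    else if (std::strcmp(end, "us") == 0) multiplier = 1000LL;
    else if (std::strcmp(end, "ms") == 0) multiplier = 1000000LL;
    else if (std::strcmp(end, "s") == 0)  multiplier = 1000000000LL;
    else return -1;  // unknown unit

    if (value > LLONG_MAX / multiplier) return -1;  // would overflow
    return value * multiplier;
}
```

With this sketch, `natsock=100us` would map to 100000 ns and the 1ms default to 1000000 ns.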
Short-lived 64KB transfers at the default 1ms Poisson period yield too few samples to reliably match the server port. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply the natsock=100us override consistently to positive event tests and UdpExcluded (stronger negative). RateLimitTest and Disabled keep the 1ms default on purpose: rate-limiter PID stress and feature-off. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Instrument LivenessTracker::{stop,flush_table} and
Profiler::writeHeapUsage to pinpoint the musl JDK>=11 failure in
MemleakProfilerTest (datadog.HeapUsage empty).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
To pinpoint why _record_heap_usage is 0 at flush_table entry on musl JDK>=11 despite the command setting it in arguments.cpp. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The tracker singleton early-returns on _initialized=true, skipping the assignment. On musl the test JVM is not forked per test, so the stale value made MemleakProfilerTest fail with empty datadog.HeapUsage. Also strips the debug TEST_LOG instrumentation used to diagnose it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
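The failure mode in this commit message can be reproduced in miniature. The class below is a hypothetical stand-in, not `LivenessTracker`; the "fixed" variant shows one possible ordering and is not necessarily the change made in the PR.

```cpp
#include <cassert>

// Hypothetical stand-in for a long-lived tracker singleton.
class TrackerSketch {
public:
    // Buggy shape: the per-run flag is assigned only on first initialization.
    void startBuggy(bool record_heap_usage) {
        if (_initialized) return;  // early return skips the assignment below
        _record_heap_usage = record_heap_usage;
        _initialized = true;
    }

    // One possible fix: refresh per-run settings before the one-time check.
    void startFixed(bool record_heap_usage) {
        _record_heap_usage = record_heap_usage;
        if (_initialized) return;
        _initialized = true;
    }

    bool recordHeapUsage() const { return _record_heap_usage; }

private:
    bool _initialized = false;
    bool _record_heap_usage = false;
};
```

When the JVM is reused across tests, the second `startBuggy(true)` call leaves the flag at its stale `false` value, which matches the empty `datadog.HeapUsage` symptom described above.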
…f/fp BCI_NATIVE_SOCKET refactor accidentally dropped the getJavaTraceAsync else-branch for cstack<CSTACK_VM. Restores the AGCT fallback (NULL ucontext -> JavaFrameAnchor) plus vthread continuation frame handling, matching the original BCI_NATIVE_MALLOC behavior from main. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- libraryPatcher.h: atomic<bool> _socket_active, acquire load in install_socket_hooks
- libraryPatcher_linux.cpp: per-slot patch tracking, __ATOMIC_RELEASE GOT writes, realpath before lock, _orig_* re-entry guard, drop __GLIBC__ restriction
- nativeSocketSampler.h/cpp: _socket_active guards in all hooks, clearFdCache on start, generation-counter O(1) fd-type reset, char addr[64] hot-path, TOCTOU and stale-entry comments, drop __GLIBC__ guard (musl PLT patching works)
- rateLimiter.h, poissonSampler.h: document accepted imprecisions
- nativeSocketSampler_ut.cpp: write_hook/read_hook tests, natsock=-1us validation
- NativeSocketRestartTest: stop/restart lifecycle test
- NativeSocketStackTraceTest: parameterize over cstack modes
- NativeSocketRateLimitTest: add upper-bound assertion
- NativeSocketMacOsNoOpTest: assert profiler start succeeds
- NativeSocketTestBase/DisabledTest: drop musl exclusion

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… bool Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
On musl, libc is loaded before the profiler DSO in the link map so RTLD_NEXT finds no definition and returns NULL. Fall back to RTLD_DEFAULT which searches globally and finds libc's symbols. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
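A minimal sketch of the fallback described here, using only the standard `dlsym` handles. `resolve_libc_symbol` and `resolve_send` are hypothetical names chosen for illustration; the PR's own resolution code is not shown in this conversation.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // exposes RTLD_NEXT / RTLD_DEFAULT on glibc
#endif
#include <dlfcn.h>
#include <sys/types.h>
#include <cassert>
#include <cstddef>

// Hypothetical helper: prefer the next definition after the calling object,
// and fall back to the global search order if that lookup finds nothing.
static void* resolve_libc_symbol(const char* name) {
    void* sym = dlsym(RTLD_NEXT, name);
    if (sym == nullptr) {
        sym = dlsym(RTLD_DEFAULT, name);
    }
    return sym;
}

using send_fn = ssize_t (*)(int, const void*, size_t, int);

// Typed wrapper for one of the four hooked functions.
static send_fn resolve_send() {
    return reinterpret_cast<send_fn>(resolve_libc_symbol("send"));
}
```

On older glibc versions this needs `-ldl` at link time. A NULL result from both lookups should still be treated as "hooks cannot be installed" by the caller.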
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cached_send (send_fn) and cached_recv (recv_fn) differ in their second parameter (const void* vs void*); chaining the assignment forced an implicit recv_fn -> send_fn conversion that clang rejects. Split into two statements to match the line below. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
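The type mismatch can be seen in isolation. The aliases below mirror the `send_fn` / `recv_fn` names from the commit message, with signatures assumed from the libc prototypes; `reset_cached` is a hypothetical function used only to hold the two statements.

```cpp
#include <sys/types.h>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Assumed signatures, taken from the libc prototypes of send() and recv().
using send_fn = ssize_t (*)(int, const void*, size_t, int);
using recv_fn = ssize_t (*)(int, void*, size_t, int);

// Distinct function-pointer types do not convert to each other implicitly.
static_assert(!std::is_convertible<recv_fn, send_fn>::value,
              "recv_fn must not convert implicitly to send_fn");

static send_fn cached_send = nullptr;
static recv_fn cached_recv = nullptr;

static void reset_cached() {
    // Rejected by the compiler: (cached_recv = nullptr) has type recv_fn,
    // which cannot be assigned to a send_fn.
    //   cached_send = cached_recv = nullptr;

    // Two separate statements avoid the cross-type assignment.
    cached_send = nullptr;
    cached_recv = nullptr;
}
```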
What does this PR do?
Intercepts libc `send`, `recv`, `write`, `read` via PLT hooking and records `datadog.NativeSocketEvent` JFR events with time-weighted inverse-transform sampling (Poisson process, PID rate control, target ~83 events/s / ~5000/min).
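As a rough illustration of the interception idea, a hook can time the call and delegate to a saved original pointer. This is a sketch under stated assumptions, not the PR's `send_hook`: `orig_send`, `last_duration_ns`, `send_hook_sketch`, and `fake_send` are hypothetical names, and the real hooks also perform fd filtering and sampling.

```cpp
#include <sys/types.h>
#include <cassert>
#include <chrono>
#include <cstddef>

using send_fn = ssize_t (*)(int, const void*, size_t, int);

// Hypothetical saved pointer to the real function (set at install time).
static send_fn orig_send = nullptr;
// Hypothetical slot holding the last measured call duration.
static long long last_duration_ns = -1;

// Delegates to the original and returns its value unchanged.
static ssize_t send_hook_sketch(int fd, const void* buf, size_t len, int flags) {
    if (orig_send == nullptr) return -1;  // nothing to delegate to

    auto start = std::chrono::steady_clock::now();
    ssize_t ret = orig_send(fd, buf, len, flags);
    auto end = std::chrono::steady_clock::now();

    last_duration_ns =
        std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
    return ret;
}

// Stand-in for libc's send(), used only to exercise the sketch.
static ssize_t fake_send(int, const void*, size_t len, int) {
    return static_cast<ssize_t>(len);
}
```

Pointing `orig_send` at `fake_send` and calling `send_hook_sketch(3, "abc", 3, 0)` returns the stand-in's value, which is the delegation property the C++ unit tests in this PR are described as checking.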
Motivation
Track blocking TCP socket I/O at the libc function level to surface socket latency and throughput in the Datadog profiler. Netty with Java NIO transport (the primary use case) goes through libc `send`/`recv`/`write`/`read`, making PLT patching the right interception point. Feature is explicitly opt-in via the `natsock` profiler argument. Implements PROF-10637.

Configuration
- `natsock`
- `natsock=100us` (`parseUnits` unit: `ns`, `us`, `ms`, `s`)

The interval is the initial parameter of the time-weighted Poisson process; the PID controller adapts it at runtime to hold the total event rate near `TARGET_EVENTS_PER_SECOND` (= 83 events/s) across all four hooks combined.

Architecture

    flowchart LR
        A["Java code<br/>OutputStream.write()"] --> B["JDK native<br/>libnio/libnet"]
        B --> C{"PLT entry<br/>for write()"}
        C -- patched --> D["NativeSocketSampler::write_hook"]
        D --> E{"fd is TCP<br/>socket?"}
        E -- no --> F["orig write()"]
        E -- yes --> G["orig write()"]
        G --> H{"shouldSample<br/>(duration, op)"}
        H -- no --> I["return"]
        H -- yes --> J["recordEvent"]
        J --> K["JFR buffer"]
        F --> I

Sampling pipeline

    flowchart TD
        A["Hook fires with<br/>duration_ticks"] --> B["PoissonSampler<br/>(thread_local)"]
        B --> C{"interval budget<br/>consumed?"}
        C -- no --> D["reject"]
        C -- yes --> E["compute weight<br/>1 / (1 - exp(-d/interval))"]
        E --> F["RateLimiter::accept()"]
        F --> G{"PID budget<br/>for this epoch?"}
        G -- no --> D
        G -- yes --> H["record event<br/>operation / fd-addr / bytes / weight"]
        H --> I["RateLimiter<br/>maybeUpdateInterval<br/>(PID: P=31 I=511 D=3)"]
        I --> J["next epoch interval"]

Two `thread_local PoissonSampler` instances per thread: outbound (`send` + `write`) and inbound (`recv` + `read`). Both share one `RateLimiter` so the ~83 events/s target is a single budget across all directions.
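The Poisson step of the pipeline above can be sketched as follows. This is not the PR's `PoissonSampler`: it assumes the "interval budget" is an exponentially distributed amount of time with mean `interval`, drawn by inverse-transform sampling, and `PoissonSketch` and `offer` are hypothetical names. Only the weight formula `1 / (1 - exp(-d/interval))` is taken from the flowchart.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Hypothetical time-weighted sampler; one instance would live per thread.
struct PoissonSketch {
    double interval;   // mean budget between samples, in ticks
    double remaining;  // budget left before the next sample fires
    std::mt19937_64 rng;

    PoissonSketch(double interval_ticks, unsigned long long seed)
        : interval(interval_ticks), remaining(0.0), rng(seed) {
        remaining = drawBudget();
    }

    // Inverse-transform draw from an exponential with mean `interval`.
    double drawBudget() {
        std::uniform_real_distribution<double> uniform(0.0, 1.0);  // [0, 1)
        return -std::log(1.0 - uniform(rng)) * interval;
    }

    // Returns the event weight (>= 1) if the call is sampled, 0 otherwise.
    double offer(double duration_ticks) {
        if (duration_ticks <= 0.0) return 0.0;
        remaining -= duration_ticks;
        if (remaining > 0.0) return 0.0;  // budget not consumed: reject
        remaining = drawBudget();
        // P(sampled) = 1 - exp(-d / interval); weight is its inverse.
        double p = -std::expm1(-duration_ticks / interval);
        return 1.0 / p;
    }
};
```

Under these assumptions a call of duration `d` is sampled with probability `1 - exp(-d/interval)`, so longer calls are sampled more often and carry weights closer to 1, and summing `weight × duration` over the sampled calls approximates the total time spent in all calls.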
Hook installation

    flowchart TD
        A["NativeSocketSampler::start"] --> B["dlsym RTLD_NEXT<br/>(RTLD_DEFAULT fallback on musl)<br/>resolve _orig_send/recv/write/read"]
        B --> C["patch_socket_functions"]
        C --> D["walk every loaded CodeCache"]
        D --> E{"lib imports<br/>send/recv/write/read?"}
        E -- yes --> F["rewrite PLT entry<br/>to *_hook"]
        E -- no --> G["skip"]
        C --> H["also patch dlopen PLT"]
        H --> I["dlopen_hook wraps<br/>subsequent loads"]
        I --> J["updateSymbols +<br/>install_socket_hooks"]
        J --> E

Late-loaded libraries (e.g. HotSpot lazily loads `libnet.so` on the first `java.net.Socket` use) are caught by the `dlopen` wrapper, which re-runs `patch_socket_functions` after each new library is registered.

JFR event fields (`datadog.NativeSocketEvent`)
- `startTime`
- `eventThread`
- `stackTrace`
- `duration`
- `operation`: `SEND` or `RECV`
- `remoteAddress`: `ip:port` (IPv4 or `[ipv6]:port`); empty if unknown
- `bytesTransferred`
- `weight`: `1 / P`
- `spanId`
- `localRootSpanId`

`sum(weight × duration)` is an unbiased estimator of total socket I/O time; `sum(weight × bytesTransferred)` estimates total bytes.

Key design notes
- `send`/`recv`/`write`/`read` only. UDP (`sendto`/`recvfrom`) and Netty's native epoll / io_uring transports are explicitly out of scope.
- `write`/`read` hooks short-circuit on non-socket fds via a lock-free `std::atomic<uint8_t> _fd_type_cache[65536]`: one relaxed atomic load per call on the non-socket path.
- … handler), so `malloc` and `std::mutex` are safe.
- `install_socket_hooks()` (called from `dlopen_hook` after `updateSymbols`) re-runs `patch_socket_functions` so lazily-loaded libraries (`libnet.so`, `libnio.so`, native Netty transports) are patched the moment they appear.
- … stale addresses for reused fds are a known, accepted trade-off to avoid racing with in-flight recordings.
- `_orig_send`/`recv`/`write`/`read` are assigned once (when `_socket_size == 0`) and are intentionally not nulled in `unpatch_socket_functions` to avoid a memory-ordering race with in-flight hook invocations that may still be executing.
- `#if defined(__linux__)`: compiled as no-op stubs on macOS. Runs on both glibc and musl Linux; on musl `RTLD_NEXT` returns NULL (libc precedes the profiler DSO in the link map), so `RTLD_DEFAULT` is used as fallback to locate the real libc symbols. Feature is additionally disabled in Java tests on J9/Zing via `Platform.isJ9()` / `Platform.isZing()` guards.
- `NativeSocketEvent` lives in `nativeSocketSampler.h` (not the shared `event.h`) since it is only used by `NativeSocketSampler`.

How to test the change?
C++ unit tests: `NativeSocketSamplerHookTest` in `ddprof-lib/src/test/cpp/nativeSocketSampler_ut.cpp` verifies that `send_hook` / `recv_hook` / `write_hook` / `read_hook` each delegate to the installed `_orig_*` pointer and return its value.

JUnit integration tests in `ddprof-test/src/test/java/com/datadoghq/profiler/nativesocket/`:
- `NativeSocketEnabledTest`
- `NativeSocketDisabledTest`
- `NativeSocketEventFieldsTest`
- `NativeSocketEventThreadTest`: `eventThread` populated with the calling thread
- `NativeSocketStackTraceTest`
- `NativeSocketRemoteAddressTest`: `remoteAddress` in `ip:port` format
- `NativeSocketSendRecvSeparateTest`
- `NativeSocketBytesAccuracyTest`: `sum(weight × duration)` unbiased estimator within tolerance
- `NativeSocketRateLimitTest`: `weight > 1` on sampled events
- `NativeSocketRestartTest`
- `NativeSocketUdpExcludedTest`: UDP (`sendto`/`recvfrom`) produces zero events
- `NativeSocketMacOsNoOpTest`

Spec: #487
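For reference, the `ip:port` / `[ipv6]:port` shape listed for `remoteAddress` can be produced with standard socket calls. This is a sketch, not the PR's formatting code: `format_remote_address` is a hypothetical name, and the 64-byte buffer is an assumption borrowed from the `char addr[64]` mentioned in the commit notes.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical formatter: "a.b.c.d:port" for IPv4, "[addr]:port" for IPv6,
// and an empty string for any other family or on conversion failure.
static std::string format_remote_address(const sockaddr* sa) {
    char ip[INET6_ADDRSTRLEN];
    char out[64];  // large enough for "[" + IPv6 text + "]:" + port

    if (sa->sa_family == AF_INET) {
        const sockaddr_in* in4 = reinterpret_cast<const sockaddr_in*>(sa);
        if (inet_ntop(AF_INET, &in4->sin_addr, ip, sizeof(ip)) == nullptr) return "";
        std::snprintf(out, sizeof(out), "%s:%u", ip,
                      static_cast<unsigned>(ntohs(in4->sin_port)));
        return out;
    }
    if (sa->sa_family == AF_INET6) {
        const sockaddr_in6* in6 = reinterpret_cast<const sockaddr_in6*>(sa);
        if (inet_ntop(AF_INET6, &in6->sin6_addr, ip, sizeof(ip)) == nullptr) return "";
        std::snprintf(out, sizeof(out), "[%s]:%u", ip,
                      static_cast<unsigned>(ntohs(in6->sin6_port)));
        return out;
    }
    return "";  // unknown family: matches the "empty if unknown" field note
}
```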
For Datadog employees
credentials of any kind, I've requested a review from
@DataDog/security-design-and-guidance.