feat: Add AIConfigTracker with at-most-once tracking and resumption tokens #179
mattrmc1 wants to merge 15 commits into
jsonbailey left a comment
A few low-severity / cosmetic notes on the resumption token handling.
     *
     * @return the resumption token, or {@code null} if not available
     */
    String getResumptionToken();
Consider documenting here (the producer side, where a caller decides what to do with the value) that the resumption token embeds the flag's variationKey and version — so it should be kept server-side and not exposed to untrusted clients (e.g. round-tripped through a browser), where it could leak flag targeting details.
Thanks for adding the security note — it reads well. One follow-up: it currently lives on createTracker (the consumer side). It should be on getResumptionToken() at a minimum, since that's where a caller is holding the token and deciding where to send it — that's the point where the warning actually changes behavior. Keeping it on createTracker as well is fine (a copy on both ends doesn't hurt), but the producer side is the important one.
Summary
Implements the full `LDAIConfigTracker` interface, which was previously a stub. Callers can now record AI operation metrics (duration, tokens, success/error, feedback, tool calls, judge results) with at-most-once enforcement, extract metrics from runner operations via `trackMetricsOf`, and reconstruct trackers across processes via resumption tokens.
Tracking methods
- Duration: records wall-clock duration. A null value is silently dropped (debug log); negative values are clamped to zero. `trackDurationOf` wraps a `Callable`, measures via `System.nanoTime()`, and records the duration in `finally`, even on exception.
- `trackMetricsOf`: all-in-one wrapper. It starts the timer, invokes the operation, and stops the clock before calling the extractor, so slow extractors don't inflate the duration. On success it prefers the runner-reported `durationMs` over wall-clock time, then delegates to `trackSuccess`/`trackError`, `trackTokens`, and `trackToolCalls`. On exception it records the wall-clock duration, calls `trackError`, and rethrows. If the extractor itself throws, the operation duration is still recorded before the exception propagates; `trackError` is NOT called, since the AI operation succeeded.
- Success/error: `trackSuccess` and `trackError` share a single `AtomicReference<Boolean>` guard, so only the first to fire wins.
- Feedback: validates and resolves the event name before claiming the at-most-once guard, so null/invalid input doesn't burn the slot.
- Tokens: emits an event for each positive count (total, input, output). All-zero usage does not consume the at-most-once slot.
- Tool calls: multi-fire (not at-most-once). Each call emits a separate `$ld:ai:tool_call` event.
- Judge result: silently dropped when not sampled, not successful, or when `metricKey` is blank/null or `score` is null/non-finite. Multi-fire.
- Time to first token: records time-to-first-token duration. At-most-once.
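The timing behavior described for `trackMetricsOf` can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `TimedMetricsSketch`, `DurationSink`, `Metrics`, and `timeAndExtract` are hypothetical names, and the sketch covers only duration handling (no success/error, token, or tool-call delegation).

```java
import java.util.concurrent.Callable;
import java.util.function.Function;

// Illustrative sketch only: every name here is hypothetical, not the SDK's API.
final class TimedMetricsSketch {
    /** Stand-in for the tracker's duration recording. */
    interface DurationSink {
        void record(long durationMs);
    }

    /** Stand-in for extracted metrics; durationMs is null when the runner reported none. */
    static final class Metrics {
        final Long durationMs;

        Metrics(Long durationMs) {
            this.durationMs = durationMs;
        }
    }

    static <T> T timeAndExtract(Callable<T> operation,
                                Function<T, Metrics> extractor,
                                DurationSink sink) throws Exception {
        long start = System.nanoTime();
        T result;
        try {
            result = operation.call();
        } catch (Exception e) {
            // The operation failed: record wall-clock time, then rethrow.
            sink.record(elapsedMs(start));
            throw e;
        }
        // Stop the clock before extracting so a slow extractor does not inflate the duration.
        long wallClockMs = elapsedMs(start);
        Metrics metrics;
        try {
            metrics = extractor.apply(result);
        } catch (RuntimeException e) {
            // The extractor failed, but the operation's duration is still recorded.
            sink.record(wallClockMs);
            throw e;
        }
        // Prefer a runner-reported duration; fall back to wall-clock when none is available.
        boolean reported = metrics != null && metrics.durationMs != null;
        sink.record(reported ? metrics.durationMs.longValue() : wallClockMs);
        return result;
    }

    private static long elapsedMs(long startNanos) {
        // Clamp to zero defensively, mirroring the "negatives clamped to zero" rule.
        return Math.max(0L, (System.nanoTime() - startNanos) / 1_000_000L);
    }
}
```

In this sketch the sink is called exactly once on every path, which is what lets an at-most-once duration slot sit behind it without special cases.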
Resumption tokens
`getResumptionToken()` returns URL-safe Base64 (no padding) JSON containing `{ runId, configKey, variationKey, version, graphKey }`. `variationKey` and `graphKey` are omitted when null. There is no length cap, so large config keys are supported. Empty `runId`/`configKey` values are rejected on decode.
Tracker factory wiring
`LDAIClientImpl` now creates real `LDAIConfigTrackerImpl` instances. A private `trackerFactory` method captures config identity and returns a `Supplier<LDAIConfigTracker>` producing a fresh tracker with a new `runId` on each call. Default configs also get real trackers. Default version is `1`. `NoOpAIConfigTracker` is deleted because it is no longer needed.
New types
- `FeedbackKind` (enum): `POSITIVE`, `NEGATIVE`.
- `TokenUsage` (immutable record): `total`, `input`, `output`.
- `AIMetrics` (immutable builder): `success`, optional `tokens`, `durationMs`, `toolCalls`.
- `JudgeResult` (immutable builder): `metricKey`, `score`, `sampled`, `success`, optional `judgeConfigKey`, `reasoning`, `errorMessage`.
- `MetricSummary`: snapshot of all tracked metrics plus the resumption token.
- `TrackData`: run identity fields with `toLDValue()`.
Thread safety
All at-most-once slots use `AtomicReference<T>.compareAndSet(null, value)`: a single atomic guard plus value, with no race window. Tool calls use `CopyOnWriteArrayList`.
Test plan
- `./gradlew :lib:sdk:server-ai:test` passes
- `LDAIConfigTrackerImplTest`: duration (emit, clamp, at-most-once, null), durationOf (success + exception), success/error (emit, shared guard both directions), feedback (emit, at-most-once, null slot preservation), tokens (positive counts, zero skip, slot preservation), tool calls (multi-fire, null), judge result (sampled/success/metricKey/score guards, multi-fire), trackMetricsOf (success path, error path, extractor failure duration tracking, null AIMetrics guard), variationKey/graphKey in payload, concurrency (20-thread contention), constructor null rejection
- `ResumptionTokensTest`: encode/decode round-trips, large keys, special character escaping, null/malformed rejection, empty runId/configKey rejection
Note
Medium Risk
New public tracking API and telemetry emission change observability behavior; resumption tokens embed flag-targeting metadata, which leaks if the tokens are exposed to clients.
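To make the exposure risk concrete, here is a minimal sketch of the token format as the description states it: URL-safe Base64, no padding, over a small JSON object. It is not the PR's `ResumptionTokens` class. `ResumptionTokenSketch`, `encode`, `decodeToJson`, and the hand-rolled JSON building are assumptions for illustration, and the string escaping is deliberately simplified (real code would use a JSON library).

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative sketch only: names and JSON handling are hypothetical, not the PR's code.
final class ResumptionTokenSketch {
    /** Builds the JSON payload and encodes it as URL-safe Base64 without padding. */
    static String encode(String runId, String configKey, String variationKey,
                         int version, String graphKey) {
        StringBuilder json = new StringBuilder("{");
        json.append("\"runId\":").append(quote(runId));
        json.append(",\"configKey\":").append(quote(configKey));
        if (variationKey != null) {
            // Omitted when null, as the description states.
            json.append(",\"variationKey\":").append(quote(variationKey));
        }
        json.append(",\"version\":").append(version);
        if (graphKey != null) {
            json.append(",\"graphKey\":").append(quote(graphKey));
        }
        json.append("}");
        byte[] bytes = json.toString().getBytes(StandardCharsets.UTF_8);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    /** Reverses the Base64 layer only; anyone holding the token can do the same. */
    static String decodeToJson(String token) {
        return new String(Base64.getUrlDecoder().decode(token), StandardCharsets.UTF_8);
    }

    // Simplified escaping (backslash and double quote only) to keep the sketch short.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

Base64 is an encoding, not encryption, so `decodeToJson` recovers `variationKey` and `version` with no secret involved. That is why the review asks for the warning to sit on `getResumptionToken()`.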
Overview
Replaces the no-op `LDAIConfigTracker` stub with a full implementation that emits LaunchDarkly custom metrics for AI runs (duration, time-to-first-token, success/error, feedback, tokens, tool calls, and judge scores). `LDAIClientImpl` now supplies a per-config `Supplier` that creates `LDAIConfigTrackerImpl` instances (new UUID `runId` per `createTracker()`), including when falling back to caller defaults. `NoOpAIConfigTracker` is removed. `LDAIClient#createTracker(String, LDContext)` decodes a resumption token to continue the same run across requests.
The expanded `LDAIConfigTracker` API adds `trackMetricsOf`, `getSummary`, `getTrackData`, and `getResumptionToken`, with at-most-once semantics on most metrics (tool calls and judge results are multi-fire). `LDAITrackingTypes` holds the new immutable value types; `ResumptionTokens` encodes/decodes URL-safe Base64 JSON for run identity (docs warn tokens can expose variation key / version and should stay server-side).
Reviewed by Cursor Bugbot for commit 121b140.
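The at-most-once semantics described in this PR can be sketched with `AtomicReference.compareAndSet(null, value)`. The class below is a hypothetical illustration rather than `LDAIConfigTrackerImpl`: it models only the shared success/error slot and a feedback slot, uses a plain `String` where the PR has `FeedbackKind`, and returns a boolean (an assumption, added to make the winner visible) instead of emitting events.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch only: a hypothetical class, not the PR's tracker implementation.
final class AtMostOnceSketch {
    // One slot shared by success and error: null means "not yet recorded".
    private final AtomicReference<Boolean> outcome = new AtomicReference<>();
    private final AtomicReference<String> feedback = new AtomicReference<>();

    /** Returns true only for the call that claimed the shared outcome slot. */
    boolean trackSuccess() {
        return outcome.compareAndSet(null, Boolean.TRUE);
    }

    /** Shares the slot with trackSuccess, so whichever fires first wins. */
    boolean trackError() {
        return outcome.compareAndSet(null, Boolean.FALSE);
    }

    /** Validates before claiming the slot, so invalid input does not burn it. */
    boolean trackFeedback(String kind) {
        if (kind == null || kind.isEmpty()) {
            return false;
        }
        return feedback.compareAndSet(null, kind);
    }

    Boolean recordedOutcome() {
        return outcome.get();
    }

    String recordedFeedback() {
        return feedback.get();
    }
}
```

Because the guard and the stored value are the same reference, there is no window in which one thread has claimed the slot while another still sees it as empty.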