[AzureMonitorAutoConfigure] Add customer-facing SDKStats metrics (Item_Success_Count, Item_Dropped_Count, Item_Retry_Count) #48077

Open
rajkumar-rangaraj wants to merge 10 commits into main from rajrang/exporterCustomerSdkStats

Conversation

@rajkumar-rangaraj
Contributor

Description

PR: Add customer-facing SDKStats metrics (Item_Success_Count, Item_Dropped_Count, Item_Retry_Count)

Implements customer-facing SDKStats per spec. The exporter now tracks per-telemetry-type success, drop, and retry counts and periodically exports them as Metric TelemetryItems through the existing pipeline to the customer's own Application Insights resource.


New files

Production

| File | Purpose |
| --- | --- |
| `CustomerSdkStats.java` | Thread-safe accumulator (`ConcurrentHashMap<Key, AtomicLong>`) for three counter families: `successCounts`, `droppedCounts`, `retryCounts`. `collectAndReset()` atomically snapshots and clears all counters, returning a list of `TelemetryItem` metrics with the correct dimensions (`computeType`, `language`, `version`, `telemetry_type`, `telemetry_success`, `drop.code`/`drop.reason`, `retry.code`/`retry.reason`). Static factory `create(version)` auto-detects the resource provider. |
| `CustomerSdkStatsTelemetryType.java` | Maps `TelemetryItem.getName()` → spec dimension strings: Request → REQUEST, RemoteDependency → DEPENDENCY, Message → TRACE, Exception → EXCEPTION, Metric → CUSTOM_METRIC, Event → CUSTOM_EVENT, PageView → PAGE_VIEW, Availability → AVAILABILITY. Returns `null` for internal items (e.g. Statsbeat) to skip counting. |
| `CustomerSdkStatsExceptionCategory.java` | Classifies exceptions into low-cardinality reason strings for `drop.reason`/`retry.reason`: "Timeout exception", "Network exception", "Storage exception", "Client exception". Traverses the cause chain (depth ≤ 10). Also provides `isTimeout()` to choose between the `CLIENT_TIMEOUT` and `CLIENT_EXCEPTION` retry codes. |
| `CustomerSdkStatsTelemetryPipelineListener.java` | `TelemetryPipelineListener` that routes pipeline responses to the accumulator: 200 → success, retryable (401/403/408/429/500/502/503/504) → retry, redirect (307/308) → skip, all others → drop. `onException` categorizes the throwable and records a retry. Includes `getReasonPhraseForStatusCode()` for common HTTP status codes. |
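
As a rough illustration of the accumulator pattern described for `CustomerSdkStats.java`, here is a minimal sketch. It is not the PR's code: the class name `CounterSketch`, the composite string key, and the `"key=value"` output (standing in for `TelemetryItem` metrics) are all hypothetical. Each counter is drained with `getAndSet(0)`, so an increment that races with a collection ends up in either the current snapshot or the next one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified stand-in for the accumulator; not the PR's implementation.
final class CounterSketch {
    // Assumption: a composite "metricName|telemetryType" string replaces the real Key type.
    private final ConcurrentHashMap<String, AtomicLong> counts = new ConcurrentHashMap<>();

    // Adds `count` items for one metric/type pair; safe to call from many threads.
    void increment(String metricName, String telemetryType, long count) {
        counts.computeIfAbsent(metricName + "|" + telemetryType, k -> new AtomicLong())
              .addAndGet(count);
    }

    // Drains every counter and returns "key=value" strings for the non-zero ones.
    List<String> collectAndReset() {
        List<String> snapshot = new ArrayList<>();
        for (Map.Entry<String, AtomicLong> entry : counts.entrySet()) {
            long value = entry.getValue().getAndSet(0); // read and clear in one atomic step
            if (value > 0) {
                snapshot.add(entry.getKey() + "=" + value);
            }
        }
        return snapshot;
    }
}
```

Entries are never removed from the map in this sketch, which is what keeps a concurrent `increment` from writing to a counter that has already been discarded.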

Tests

| File | # Tests |
| --- | --- |
| `CustomerSdkStatsTest.java` | 10 (accumulation, reset, concurrent increments, `telemetry_success` split) |
| `CustomerSdkStatsTelemetryTypeTest.java` | 10 (all 8 mappings + Statsbeat → null + unknown → null) |
| `CustomerSdkStatsExceptionCategoryTest.java` | 9 (timeout, network, storage, client, null, wrapped exceptions) |
| `CustomerSdkStatsTelemetryPipelineListenerTest.java` | 9 (success, retry 429/500, drop 402, timeout/network exception, empty skip, redirect skip, reason phrases) |
| Total | 38 tests |

Sample

| File | Purpose |
| --- | --- |
| `SimpleWebAppSample.java` | Long-running web app with 5 endpoints and 3 test modes (success/drop/retry). In drop/retry modes, a built-in mock ingestion server on port 9090 returns the configured error status and prints all gunzipped payloads to the console, tagged `[SDKStats]` when applicable. |
| `SimpleWebAppSample-README.md` | Step-by-step manual execution guide with env var setup, endpoint table, Kusto query, and expected console output. |

Modified files

| File | Change |
| --- | --- |
| `TelemetryItemExporter.java` | Added `computeItemCountMetadata()` to compute per-type item counts (with success/failure split for REQUEST/DEPENDENCY) before serialization. Added `sendWithoutTracking()` to send items without triggering recursive customer SDKStats counting. Updated `internalSendByBatch()` to pass item count maps to the pipeline. |
| `TelemetryPipelineRequest.java` | Added `itemCountsByType`, `successItemCountsByType`, `failureItemCountsByType` fields + overloaded public constructor + getters. The original constructor delegates with empty maps. |
| `TelemetryPipelineResponse.java` | Made constructor public (was package-private) for test access. |
| `TelemetryPipeline.java` | Added overloaded `send()` accepting item count maps; original `send()` delegates with empty maps. Passes maps into `TelemetryPipelineRequest`. |
| `AzureMonitorHelper.java` | `createTelemetryItemExporter()` now accepts `CustomerSdkStats`, creates a `CustomerSdkStatsTelemetryPipelineListener`, and wires it into the composite listener chain (alongside `DiagnosticTelemetryPipelineListener` and `LocalStorageTelemetryPipelineListener`). |
| `AzureMonitorExporterBuilder.java` | Added `SDKSTATS_DISABLED_ENV_VAR`, `SDKSTATS_EXPORT_INTERVAL_ENV_VAR` constants. Added `createCustomerSdkStats()` and `startCustomerSdkStats()` with a `ScheduledExecutorService` (daemon thread, default 900s interval) that calls `collectAndReset()` and `sendWithoutTracking()`. |
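
The periodic export described for `AzureMonitorExporterBuilder.java` could look roughly like the sketch below. The class name, the thread name, the `Supplier`/`Consumer` stand-ins for `collectAndReset()` and `sendWithoutTracking()`, the use of `scheduleWithFixedDelay`, and the choice to skip empty snapshots are assumptions made for illustration; the PR's wiring may differ.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical sketch of a daemon-thread export loop; not the PR's code.
final class SdkStatsSchedulerSketch {
    static ScheduledExecutorService start(Supplier<List<String>> collectAndReset,
                                          Consumer<List<String>> sendWithoutTracking,
                                          long interval, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "sdkstats-export-sketch");
            thread.setDaemon(true); // a daemon thread does not keep the JVM alive at shutdown
            return thread;
        });
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                List<String> metrics = collectAndReset.get();
                if (!metrics.isEmpty()) {
                    sendWithoutTracking.accept(metrics);
                }
            } catch (RuntimeException e) {
                // Swallowed on purpose: an uncaught exception would cancel all future runs.
            }
        }, interval, interval, unit);
        return scheduler;
    }
}
```

With the default from the description, a caller would pass `900` and `TimeUnit.SECONDS`.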

Configuration

| Environment Variable | Default | Description |
| --- | --- | --- |
| `APPLICATIONINSIGHTS_SDKSTATS_DISABLED` | `false` | Set to `true` to disable customer SDKStats entirely. |
| `APPLICATIONINSIGHTS_SDKSTATS_EXPORT_INTERVAL` | `900` (15 min) | Export interval in seconds. |
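
One possible way to read these two variables is sketched below. The variable names and the 900-second default come from the table above; the class and method names, and the fallback to the default for missing, non-numeric, or non-positive values, are assumptions rather than behavior confirmed by the PR.

```java
import java.util.Map;

// Hypothetical configuration reader; the fallback behavior is an assumption.
final class SdkStatsConfigSketch {
    static final String DISABLED_ENV_VAR = "APPLICATIONINSIGHTS_SDKSTATS_DISABLED";
    static final String EXPORT_INTERVAL_ENV_VAR = "APPLICATIONINSIGHTS_SDKSTATS_EXPORT_INTERVAL";
    static final long DEFAULT_INTERVAL_SECONDS = 900;

    // Returns true only when the variable is set to "true" (case-insensitive).
    static boolean isDisabled(Map<String, String> env) {
        return Boolean.parseBoolean(env.get(DISABLED_ENV_VAR));
    }

    // Returns the configured interval in seconds, or the default when unusable.
    static long exportIntervalSeconds(Map<String, String> env) {
        String raw = env.get(EXPORT_INTERVAL_ENV_VAR);
        if (raw == null) {
            return DEFAULT_INTERVAL_SECONDS;
        }
        try {
            long seconds = Long.parseLong(raw.trim());
            return seconds > 0 ? seconds : DEFAULT_INTERVAL_SECONDS;
        } catch (NumberFormatException e) {
            return DEFAULT_INTERVAL_SECONDS;
        }
    }
}
```

In an application, both methods would be called with `System.getenv()`; taking the map as a parameter keeps the sketch testable.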

Metric details

| Metric Name | Dimensions | When counted |
| --- | --- | --- |
| `Item_Success_Count` | `computeType`, `language`, `version`, `telemetry_type` | HTTP 200 from ingestion |
| `Item_Dropped_Count` | `computeType`, `language`, `version`, `telemetry_type`, `drop.code`, `drop.reason`, `telemetry_success` | Non-retryable HTTP status (e.g. 400, 402, 404) |
| `Item_Retry_Count` | `computeType`, `language`, `version`, `telemetry_type`, `retry.code`, `retry.reason` | Retryable HTTP status (401, 403, 408, 429, 500, 502, 503, 504) or client exception/timeout |
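
The "When counted" rules can be summarized as a small classification function. The sketch below follows the status codes listed in this description (200, the retryable set, and the 307/308 redirects); the class, enum, and method names are hypothetical and the real listener may structure this differently.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the status-code routing; names are illustrative only.
final class StatusRoutingSketch {
    enum Outcome { SUCCESS, RETRY, SKIP, DROP }

    // Retryable codes as listed in the PR description.
    private static final Set<Integer> RETRYABLE =
        new HashSet<>(Arrays.asList(401, 403, 408, 429, 500, 502, 503, 504));

    static Outcome classify(int statusCode) {
        if (statusCode == 200) {
            return Outcome.SUCCESS;  // counted in Item_Success_Count
        }
        if (statusCode == 307 || statusCode == 308) {
            return Outcome.SKIP;     // redirects are not counted
        }
        if (RETRYABLE.contains(statusCode)) {
            return Outcome.RETRY;    // counted in Item_Retry_Count
        }
        return Outcome.DROP;         // everything else goes to Item_Dropped_Count
    }
}
```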

Validation

  • Unit tests: 38/38 passing
  • Code formatting: spotlessApply clean
  • Compilation: Module compiles successfully
  • E2E success mode: Item_Success_Count visible in Azure Monitor customMetrics table with correct dimensions
  • E2E drop mode: Item_Dropped_Count visible in mock server console with drop.code=400, drop.reason="Bad request", per-type/per-success breakdown
  • E2E retry mode: Mock infrastructure ready (mock server returns 500)

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

Copilot AI review requested due to automatic review settings February 24, 2026 00:02
@github-actions github-actions bot added the Monitor - Autoconfigure Monitor OpenTelemetry Autoconfigure label Feb 24, 2026

Copilot AI left a comment

Pull request overview

Adds customer-facing “SDKStats” metric telemetry (success/drop/retry item counts) to the Azure Monitor OpenTelemetry autoconfigure exporter, along with unit tests and a manual sample for validation.

Changes:

  • Introduces CustomerSdkStats accumulator + telemetry type/exception categorization + pipeline listener to record per-type success/drop/retry counts.
  • Plumbs per-batch item-count metadata through TelemetryItemExporter → TelemetryPipeline/Request, and adds sendWithoutTracking() for exporting SDKStats metrics without recursion.
  • Adds unit tests for the new components plus a long-running sample + manual run guide.
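
To make the telemetry type mapping mentioned in the first bullet concrete, here is a minimal lookup sketch. The eight name/dimension pairs are taken from the PR description; the class name, the `fromItemName` method, and the exact-match lookup are assumptions, since the real class may match item names differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the item-name to telemetry_type mapping.
final class TelemetryTypeSketch {
    private static final Map<String, String> TYPES = new HashMap<>();

    static {
        TYPES.put("Request", "REQUEST");
        TYPES.put("RemoteDependency", "DEPENDENCY");
        TYPES.put("Message", "TRACE");
        TYPES.put("Exception", "EXCEPTION");
        TYPES.put("Metric", "CUSTOM_METRIC");
        TYPES.put("Event", "CUSTOM_EVENT");
        TYPES.put("PageView", "PAGE_VIEW");
        TYPES.put("Availability", "AVAILABILITY");
    }

    // Returns the dimension string, or null for internal or unknown items so they are not counted.
    static String fromItemName(String itemName) {
        return itemName == null ? null : TYPES.get(itemName);
    }
}
```

Returning `null` rather than a catch-all value mirrors the description's rule that internal items such as Statsbeat are skipped.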

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 14 comments.

| File | Description |
| --- | --- |
| sdk/monitor/azure-monitor-opentelemetry-autoconfigure/src/main/java/com/azure/monitor/opentelemetry/autoconfigure/implementation/statsbeat/CustomerSdkStats.java | New thread-safe counters + collectAndReset() producing Metric TelemetryItems. |
| sdk/monitor/azure-monitor-opentelemetry-autoconfigure/src/main/java/com/azure/monitor/opentelemetry/autoconfigure/implementation/statsbeat/CustomerSdkStatsExceptionCategory.java | New exception categorization + timeout detection for retry dimensions. |
| sdk/monitor/azure-monitor-opentelemetry-autoconfigure/src/main/java/com/azure/monitor/opentelemetry/autoconfigure/implementation/statsbeat/CustomerSdkStatsTelemetryType.java | Maps TelemetryItem.getName() to SDKStats telemetry_type dimension values. |
| sdk/monitor/azure-monitor-opentelemetry-autoconfigure/src/main/java/com/azure/monitor/opentelemetry/autoconfigure/implementation/statsbeat/CustomerSdkStatsTelemetryPipelineListener.java | Observes pipeline responses/exceptions and increments the accumulator. |
| sdk/monitor/azure-monitor-opentelemetry-autoconfigure/src/main/java/com/azure/monitor/opentelemetry/autoconfigure/implementation/pipeline/TelemetryItemExporter.java | Computes per-batch item-count metadata + adds sendWithoutTracking(). |
| sdk/monitor/azure-monitor-opentelemetry-autoconfigure/src/main/java/com/azure/monitor/opentelemetry/autoconfigure/implementation/pipeline/TelemetryPipeline.java | Adds overload to pass item-count metadata into requests. |
| sdk/monitor/azure-monitor-opentelemetry-autoconfigure/src/main/java/com/azure/monitor/opentelemetry/autoconfigure/implementation/pipeline/TelemetryPipelineRequest.java | Stores item-count metadata maps and exposes getters. |
| sdk/monitor/azure-monitor-opentelemetry-autoconfigure/src/main/java/com/azure/monitor/opentelemetry/autoconfigure/implementation/pipeline/TelemetryPipelineResponse.java | Makes constructor public for test construction. |
| sdk/monitor/azure-monitor-opentelemetry-autoconfigure/src/main/java/com/azure/monitor/opentelemetry/autoconfigure/implementation/utils/AzureMonitorHelper.java | Wires CustomerSdkStatsTelemetryPipelineListener into the exporter listener chain. |
| sdk/monitor/azure-monitor-opentelemetry-autoconfigure/src/main/java/com/azure/monitor/opentelemetry/autoconfigure/AzureMonitorExporterBuilder.java | Creates/starts periodic export of SDKStats metrics (env-var configurable). |
| sdk/monitor/azure-monitor-opentelemetry-autoconfigure/src/test/java/com/azure/monitor/opentelemetry/autoconfigure/implementation/statsbeat/CustomerSdkStatsTest.java | Unit tests for accumulation/reset/concurrency. |
| sdk/monitor/azure-monitor-opentelemetry-autoconfigure/src/test/java/com/azure/monitor/opentelemetry/autoconfigure/implementation/statsbeat/CustomerSdkStatsExceptionCategoryTest.java | Unit tests for exception categorization/timeout detection. |
| sdk/monitor/azure-monitor-opentelemetry-autoconfigure/src/test/java/com/azure/monitor/opentelemetry/autoconfigure/implementation/statsbeat/CustomerSdkStatsTelemetryTypeTest.java | Unit tests for telemetry type mapping and null/unknown handling. |
| sdk/monitor/azure-monitor-opentelemetry-autoconfigure/src/test/java/com/azure/monitor/opentelemetry/autoconfigure/implementation/statsbeat/CustomerSdkStatsTelemetryPipelineListenerTest.java | Unit tests for response/exception classification + reason phrases. |
| sdk/monitor/azure-monitor-opentelemetry-autoconfigure/src/samples/java/com/azure/monitor/opentelemetry/autoconfigure/SimpleWebAppSample.java | Manual sample app + mock ingestion server to validate metric emission. |
| sdk/monitor/azure-monitor-opentelemetry-autoconfigure/src/samples/java/com/azure/monitor/opentelemetry/autoconfigure/SimpleWebAppSample-README.md | Step-by-step manual guide for the sample. |
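
The exception categorization summarized for `CustomerSdkStatsExceptionCategory.java` can be sketched as a bounded walk of the cause chain. The four reason strings and the depth limit of 10 come from the PR description; which concrete exception classes fall into each category is an assumption made here for illustration, as are the class and method names.

```java
import java.net.SocketException;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;
import java.nio.file.FileSystemException;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; the exception-to-category assignments are assumptions.
final class ExceptionCategorySketch {
    private static final int MAX_DEPTH = 10; // depth limit stated in the PR description

    // Walks the cause chain and returns the first matching low-cardinality reason.
    static String categorize(Throwable throwable) {
        Throwable current = throwable;
        for (int depth = 0; current != null && depth < MAX_DEPTH; depth++) {
            if (current instanceof TimeoutException || current instanceof SocketTimeoutException) {
                return "Timeout exception";
            }
            if (current instanceof UnknownHostException || current instanceof SocketException) {
                return "Network exception";
            }
            if (current instanceof FileSystemException) {
                return "Storage exception";
            }
            current = current.getCause();
        }
        return "Client exception"; // fallback, also used for null input
    }

    // Lets a caller pick a timeout-specific retry code over the generic one.
    static boolean isTimeout(Throwable throwable) {
        return "Timeout exception".equals(categorize(throwable));
    }
}
```

The depth bound also guards against cyclic cause chains, which would otherwise loop forever.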
