@@ -79,6 +79,7 @@ public final class GeneralConfig {
public static final String TRACER_METRICS_MAX_PENDING = "trace.tracer.metrics.max.pending";
public static final String TRACER_METRICS_IGNORED_RESOURCES =
"trace.tracer.metrics.ignored.resources";

public static final String AZURE_APP_SERVICES = "azure.app.services";
public static final String INTERNAL_EXIT_ON_FAILURE = "trace.internal.exit.on.failure";

@@ -1,6 +1,7 @@
package datadog.trace.common.metrics;

import datadog.metrics.api.Histogram;
import datadog.trace.api.Config;
import datadog.trace.bootstrap.instrumentation.api.UTF8BytesString;
import datadog.trace.core.monitor.HealthMetrics;
import datadog.trace.util.Hashtable;
@@ -33,36 +34,54 @@ final class AggregateEntry extends Hashtable.Entry {

private static final UTF8BytesString[] EMPTY_TAGS = new UTF8BytesString[0];

// Sentinel substitution is disabled until per-component config is wired in a follow-up PR.
// Tests that need sentinel mode should pass useBlockedSentinel=true explicitly.
static final boolean LIMITS_ENABLED = false;

// Per-field cardinality handlers. Limits live on MetricCardinalityLimits -- see that class for
// per-field rationale.
// Per-field cardinality handlers. Limits are tunable via DD_TRACE_STATS_{field}_CARDINALITY_LIMIT
// (e.g. DD_TRACE_STATS_RESOURCE_CARDINALITY_LIMIT). Defaults live on MetricCardinalityLimits.
// Frozen at first class-load from Config.
static final PropertyCardinalityHandler RESOURCE_HANDLER =
new PropertyCardinalityHandler("resource", MetricCardinalityLimits.RESOURCE, LIMITS_ENABLED);
new PropertyCardinalityHandler(
"resource",
Config.get().getTraceStatsCardinalityLimit("resource", MetricCardinalityLimits.RESOURCE));
Comment on lines +42 to +43

Contributor
QQ: just curious whether MetricCardinalityLimits could be an enum, so there is no need to pass a string and an int constant. Callers would pass just the enum and read the rest from it where needed.

Contributor Author
Yeah, we might be able to do that. The only complication is that we might have to move the enum next to Config.
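
A minimal sketch of the enum idea discussed in this thread, not the PR's actual code. `StatsField` and `IntLookup` are hypothetical names; `IntLookup` only stands in for a `getInteger(String, int)`-style config lookup. The field names, default values, and key pattern are copied from this diff (`supported-configurations.json` and `getTraceStatsCardinalityLimit`).

```java
/** Hypothetical stand-in for a config lookup shaped like getInteger(key, default). */
interface IntLookup {
  int getInteger(String key, int defaultValue);
}

/** Hypothetical enum form of MetricCardinalityLimits: one constant per stats field. */
enum StatsField {
  RESOURCE("resource", 1024),
  SERVICE("service", 32),
  OPERATION("operation", 64),
  SERVICE_SOURCE("service_source", 16),
  TYPE("type", 16),
  SPAN_KIND("span_kind", 8),
  HTTP_METHOD("http_method", 16),
  HTTP_ENDPOINT("http_endpoint", 64),
  GRPC_STATUS_CODE("grpc_status_code", 24),
  PEER_TAG("peer_tag", 512);

  private final String tagName;
  private final int defaultLimit;

  StatsField(String tagName, int defaultLimit) {
    this.tagName = tagName;
    this.defaultLimit = defaultLimit;
  }

  String tagName() {
    return tagName;
  }

  int defaultLimit() {
    return defaultLimit;
  }

  /** Builds the property key using the same pattern as getTraceStatsCardinalityLimit. */
  String configKey() {
    return "trace.stats." + tagName + ".cardinality.limit";
  }

  /** Resolves the effective limit, falling back to this field's default. */
  int resolve(IntLookup lookup) {
    return lookup.getInteger(configKey(), defaultLimit);
  }
}
```

With this shape a call site would pass only the constant, for example `StatsField.RESOURCE.resolve(lookup)`, instead of repeating the field name and default. As noted in the reply above, the enum might need to live next to `Config` for that to work.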

static final PropertyCardinalityHandler SERVICE_HANDLER =
new PropertyCardinalityHandler("service", MetricCardinalityLimits.SERVICE, LIMITS_ENABLED);
new PropertyCardinalityHandler(
"service",
Config.get().getTraceStatsCardinalityLimit("service", MetricCardinalityLimits.SERVICE));
static final PropertyCardinalityHandler OPERATION_HANDLER =
new PropertyCardinalityHandler(
"operation", MetricCardinalityLimits.OPERATION, LIMITS_ENABLED);
"operation",
Config.get()
.getTraceStatsCardinalityLimit("operation", MetricCardinalityLimits.OPERATION));
static final PropertyCardinalityHandler SERVICE_SOURCE_HANDLER =
new PropertyCardinalityHandler(
"service_source", MetricCardinalityLimits.SERVICE_SOURCE, LIMITS_ENABLED);
"service_source",
Config.get()
.getTraceStatsCardinalityLimit(
"service_source", MetricCardinalityLimits.SERVICE_SOURCE));
static final PropertyCardinalityHandler TYPE_HANDLER =
new PropertyCardinalityHandler("type", MetricCardinalityLimits.TYPE, LIMITS_ENABLED);
new PropertyCardinalityHandler(
"type",
Config.get().getTraceStatsCardinalityLimit("type", MetricCardinalityLimits.TYPE));
static final PropertyCardinalityHandler SPAN_KIND_HANDLER =
new PropertyCardinalityHandler(
"span_kind", MetricCardinalityLimits.SPAN_KIND, LIMITS_ENABLED);
"span_kind",
Config.get()
.getTraceStatsCardinalityLimit("span_kind", MetricCardinalityLimits.SPAN_KIND));
static final PropertyCardinalityHandler HTTP_METHOD_HANDLER =
new PropertyCardinalityHandler(
"http_method", MetricCardinalityLimits.HTTP_METHOD, LIMITS_ENABLED);
"http_method",
Config.get()
.getTraceStatsCardinalityLimit("http_method", MetricCardinalityLimits.HTTP_METHOD));
static final PropertyCardinalityHandler HTTP_ENDPOINT_HANDLER =
new PropertyCardinalityHandler(
"http_endpoint", MetricCardinalityLimits.HTTP_ENDPOINT, LIMITS_ENABLED);
"http_endpoint",
Config.get()
.getTraceStatsCardinalityLimit(
"http_endpoint", MetricCardinalityLimits.HTTP_ENDPOINT));
static final PropertyCardinalityHandler GRPC_STATUS_CODE_HANDLER =
new PropertyCardinalityHandler(
"grpc_status_code", MetricCardinalityLimits.GRPC_STATUS_CODE, LIMITS_ENABLED);
"grpc_status_code",
Config.get()
.getTraceStatsCardinalityLimit(
"grpc_status_code", MetricCardinalityLimits.GRPC_STATUS_CODE));

// Single authoritative list used by resetCardinalityHandlers(). populateFrom() and hashOf() keep
// named access for readability and to avoid per-span iteration overhead; this array ensures the
@@ -83,10 +83,10 @@ AggregateEntry findOrInsert(SpanSnapshot snapshot) {
* {@code onStatsAggregateDropped}) rather than evicting an established one. Cap is sized to the
* steady-state working set, so eviction is rare in the common case.
*
* <p>How often this fires depends on {@link AggregateEntry#LIMITS_ENABLED}. With limits enabled,
* over-cap values for a given field collapse into a shared {@code blocked_by_tracer} bucket, so
* the table itself rarely reaches {@code maxAggregates}. With limits disabled (the default),
* over-cap values flow to distinct buckets and {@code maxAggregates} becomes the load-bearing
* <p>With per-field cardinality limits enabled, over-cap values for a given field collapse into a
* shared {@code tracer_blocked_value} bucket, so the table itself rarely reaches {@code
* maxAggregates}. Without per-field limits, over-cap values flow to distinct buckets and {@code
* maxAggregates} becomes the load-bearing
* backstop -- the cursor-resumed scan was added specifically for this regime.
*/
private boolean evictOneStale() {
@@ -14,9 +14,10 @@ private MetricCardinalityLimits() {}

/**
* Distinct {@code resource.name} values per cycle. Highest-cardinality field by far: DB-query
* obfuscations, HTTP route templates, custom resources. Typical service: 30-200 unique.
* obfuscations, HTTP route templates, custom resources. Typical service: 30-200 unique; 1024
* leaves headroom for high-cardinality SQL/HTTP workloads without risking premature collapse.
*/
static final int RESOURCE = 256;
static final int RESOURCE = 1024;

/**
* Distinct {@code service.name} values per cycle. Local service plus downstream peer-service
@@ -3,6 +3,7 @@
import static datadog.trace.api.DDTags.BASE_SERVICE;

import datadog.communication.ddagent.DDAgentFeaturesDiscovery;
import datadog.trace.api.Config;
import datadog.trace.bootstrap.instrumentation.api.UTF8BytesString;
import datadog.trace.core.monitor.HealthMetrics;
import java.util.Set;
@@ -79,7 +80,10 @@ static PeerTagSchema of(Set<String> names, String state) {
for (int i = 0; i < names.length; i++) {
this.handlers[i] =
new TagCardinalityHandler(
names[i], MetricCardinalityLimits.PEER_TAG_VALUE, AggregateEntry.LIMITS_ENABLED);
names[i],
Config.get()
.getTraceStatsCardinalityLimit(
"peer_tag", MetricCardinalityLimits.PEER_TAG_VALUE));
}
}

11 changes: 11 additions & 0 deletions internal-api/src/main/java/datadog/trace/api/Config.java
@@ -3863,6 +3863,17 @@ public int getTraceStatsInterval() {
return traceStatsInterval;
}

/**
* Returns the per-cycle cardinality limit for the named stats field, following the RFC naming
* pattern {@code DD_TRACE_STATS_{tagName}_CARDINALITY_LIMIT} (e.g. {@code
* DD_TRACE_STATS_RESOURCE_CARDINALITY_LIMIT}). The caller supplies the default from {@code
* MetricCardinalityLimits} so per-field rationale stays co-located with the defaults.
*/
public int getTraceStatsCardinalityLimit(String tagName, int defaultLimit) {
return configProvider.getInteger(
"trace.stats." + tagName + ".cardinality.limit", defaultLimit);
Comment on lines +3873 to +3874

P2: Validate stats cardinality limits before returning them

When any of the new DD_TRACE_STATS_*_CARDINALITY_LIMIT values is set to 0, a negative number, or a value above 2^29, this method returns it unchanged; the callers added in AggregateEntry and PeerTagSchema pass it directly into PropertyCardinalityHandler/TagCardinalityHandler, whose constructors throw for those ranges. In that misconfigured environment the stats classes fail initialization instead of falling back to the default/logging, so the accessor should reject invalid limits before handing them to the handlers.
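
One way the validation requested above could look, as a sketch only. `CardinalityLimitValidator` and `sanitize` are hypothetical names, and silently falling back to the default is an assumed policy. The accepted range (1 through 2^29) is inferred from this comment's description of what the handler constructors reject; it has not been checked against the handler code.

```java
/** Hypothetical helper; not part of the PR. */
final class CardinalityLimitValidator {
  // Assumed upper bound, taken from the review comment (values above 2^29 are rejected).
  static final int MAX_LIMIT = 1 << 29;

  private CardinalityLimitValidator() {}

  /**
   * Returns the configured limit when it lies in [1, MAX_LIMIT]; otherwise returns the
   * supplied default so handler construction does not fail on a misconfigured value.
   */
  static int sanitize(int configured, int defaultLimit) {
    if (configured < 1 || configured > MAX_LIMIT) {
      // A real implementation would likely also log a warning here.
      return defaultLimit;
    }
    return configured;
  }
}
```

Under these assumptions the accessor would wrap its lookup result, for example `sanitize(configured, defaultLimit)`, before handing the value to the handlers.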

}

public boolean isLogsInjectionEnabled() {
return logsInjectionEnabled;
}
80 changes: 80 additions & 0 deletions metadata/supported-configurations.json
@@ -10641,6 +10641,86 @@
"aliases": []
}
],
"DD_TRACE_STATS_RESOURCE_CARDINALITY_LIMIT": [
{
"version": "A",
"type": "integer",

P2: Use int for the new config metadata type

The supported-configurations metadata uses type: "int" for the existing integral settings (including the tracer metrics limits), while these ten new cardinality entries are the only ones using "integer". The local supported-configurations validation/publishing jobs consume this file, so the new configs can be rejected or omitted by downstream metadata consumers even though the agent code reads integers; please use int consistently for these entries.

Contributor

Yup, int is used instead of integer in the file.

"default": "1024",
"aliases": []
}
],
"DD_TRACE_STATS_SERVICE_CARDINALITY_LIMIT": [
{
"version": "A",
"type": "integer",
"default": "32",
"aliases": []
}
],
"DD_TRACE_STATS_OPERATION_CARDINALITY_LIMIT": [
{
"version": "A",
"type": "integer",
"default": "64",
"aliases": []
}
],
"DD_TRACE_STATS_SERVICE_SOURCE_CARDINALITY_LIMIT": [
{
"version": "A",
"type": "integer",
"default": "16",
"aliases": []
}
],
"DD_TRACE_STATS_TYPE_CARDINALITY_LIMIT": [
{
"version": "A",
"type": "integer",
"default": "16",
"aliases": []
}
],
"DD_TRACE_STATS_SPAN_KIND_CARDINALITY_LIMIT": [
{
"version": "A",
"type": "integer",
"default": "8",
"aliases": []
}
],
"DD_TRACE_STATS_HTTP_METHOD_CARDINALITY_LIMIT": [
{
"version": "A",
"type": "integer",
"default": "16",
"aliases": []
}
],
"DD_TRACE_STATS_HTTP_ENDPOINT_CARDINALITY_LIMIT": [
{
"version": "A",
"type": "integer",
"default": "64",
"aliases": []
}
],
"DD_TRACE_STATS_GRPC_STATUS_CODE_CARDINALITY_LIMIT": [
{
"version": "A",
"type": "integer",
"default": "24",
"aliases": []
}
],
"DD_TRACE_STATS_PEER_TAG_CARDINALITY_LIMIT": [
{
"version": "A",
"type": "integer",
"default": "512",
"aliases": []
}
],
"DD_TRACE_STATUS404DECORATOR_ENABLED": [
{
"version": "A",