update(link): Relax link schemas to support domain-level identifiers by xibz · Pull Request #292 · cdevents/spec

xibz · 2026-02-11T18:42:21Z

This change updates all link schemas (START, END, RELATION, and embedded variants) to allow references to either a CDEvent contextId, a domainId, or both.

Previously, links could only reference event context IDs. This limited cross-system connectivity and encouraged embedding execution identifiers in customData purely for graph reconstruction.

By allowing domainId alongside contextId:

Links can represent relationships between domain executions (e.g., pipelinerun) as well as individual events.
Connectivity metadata no longer needs to be embedded in event payloads.
Chain-first modeling constraints are relaxed, enabling relation-first graph modeling.
The change remains backward compatible.

At least one of contextId or domainId is now required for link endpoints. additionalProperties is restricted to prevent schema drift.
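As a rough illustration of the endpoint rule described above, here is a minimal Python sketch. It is not the actual JSON Schema from this PR: the function name `validate_endpoint` is hypothetical, and it assumes that contextId and domainId are the only permitted properties and that both must be non-empty strings.

```python
from typing import Any, Mapping

# Assumption: these are the only properties allowed on a link endpoint.
ALLOWED_KEYS = ("contextId", "domainId")


def validate_endpoint(endpoint: Mapping[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the endpoint is acceptable."""
    problems = []

    # Mirrors "additionalProperties is restricted": reject unknown keys.
    extra = sorted(set(endpoint) - set(ALLOWED_KEYS))
    if extra:
        problems.append(f"unexpected properties: {extra}")

    # Mirrors "at least one of contextId or domainId is required".
    present = [key for key in ALLOWED_KEYS if key in endpoint]
    if not present:
        problems.append("at least one of contextId or domainId is required")

    # Each identifier that is present must be a non-empty string.
    for key in present:
        value = endpoint[key]
        if not isinstance(value, str) or len(value) < 1:
            problems.append(f"{key} must be a non-empty string")

    return problems
```

An endpoint carrying both identifiers passes, which matches the "contextId, a domainId, or both" wording of the description.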

This preserves existing semantics while improving flexibility and reducing customData pollution.

afrittoli · 2026-02-17T17:57:59Z

Thanks @xibz - could you clarify the definition of domainId?

xibz · 2026-02-17T21:52:48Z

@afrittoli

The Core Problem

contextId requires the publisher to know the parent event's context ID.

But if the parent isn't a CDEvent, there is no context ID to know.

Today: GitHub doesn't emit CDEvents. So we use domainId to link to GitHub PRs explicitly.
Tomorrow: If/when GitHub emits CDEvents, we can migrate those links to contextId.
In the meantime: We have explicit causality without polluting customData.
This is a temporary bridge, not a permanent design.

Solution

Introduce a domain-specific identifier, domainId, that can be used to relate information across systems.

Domain IDs are URNs in the following format:
urn:<provider>:<namespace>:<instance>:<type>:<resource id>

Examples:

GitHub PR: urn:github:xibz:repo:pr:42
Jira ticket: urn:jira:xibz:project:ticket:12345
Datadog alert: urn:datadog:prod:monitor:alert:98765
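To make the five-segment format concrete, here is a small Python sketch of a strict parser. The names `DomainId` and `parse_domain_id` are hypothetical and not part of the proposal. The sketch assumes that every segment is required and that only the trailing resource id may itself contain colons.

```python
from typing import NamedTuple


class DomainId(NamedTuple):
    provider: str
    namespace: str
    instance: str
    resource_type: str
    resource_id: str


def parse_domain_id(urn: str) -> DomainId:
    """Split urn:<provider>:<namespace>:<instance>:<type>:<resource id> into its parts."""
    # Split at most 5 times so that any colons inside the resource id are preserved.
    parts = urn.split(":", 5)
    if len(parts) != 6 or parts[0].lower() != "urn" or not all(parts[1:]):
        raise ValueError(f"not a five-segment domainId URN: {urn!r}")
    return DomainId(*parts[1:])


pr = parse_domain_id("urn:github:xibz:repo:pr:42")
print(pr.provider, pr.resource_type, pr.resource_id)
```

A strict parser like this rejects identifiers with fewer segments, so producers would need to fill every position for consumers to parse them consistently.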

Example 1 (GH to CI)

Build event wants to link to GitHub PR

Publisher asks: "What is the GitHub PR's contextId?"
Answer: GitHub doesn't emit CDEvents. There is no contextId.
Result: Can't use contextId. Forced into customData.

How domainId solves this

{
  "context": { "id": "build-event-789" },
  "links": [
    {
      "linkType": "RELATION",
      "linkKind": "triggeredBy",
      "target": {
        "domainId": "urn:github:xibz:repo:pr:42"
      }
    }
  ]
}

Example 2 (Jira to CI)

Imagine CircleCI wants to relate a Jira ticket

Build event links to CircleCI task:
{
  "links": [
    {
      "target": {
        "contextId": "circleci-task-event-456"  // ✓ Works: CircleCI emits CDEvents
      }
    }
  ]
}

CircleCI task wants to link to Jira ticket:
{
  "links": [
    {
      "target": {
        "contextId": "???"  // ✗ Jira doesn't emit CDEvents. No contextId exists.
      }
    }
  ]
}

Result: CircleCI task can't link to Jira. Forced into customData.

Example 3: Datadog Alert Triggers Rollback Pipeline

Imagine you have a Datadog alert that monitors system health during releases.
If it detects an issue, it automatically triggers a rollback pipeline.

The rollback pipeline needs to link back to the alert that triggered it:

{
  "links": [
    {
      "linkType": "RELATION",
      "linkKind": "triggeredBy",
      "target": {
        "domainId": "urn:datadog:prod:monitor:alert:98765"
      }
    }
  ]
}

Example 4: Linking to Events Without Knowing Their Context ID

Imagine a consumer (like a dashboard or audit system) receives an event and wants to query for all related events, but doesn't know their context IDs upfront.

A deployment fails. You want to find:

What build triggered it?
What git commit caused the build?
What PR introduced that commit?

Without domainId, you're stuck:

Deployment Failed event:
{
  "context": { "id": "deploy-event-999" },
  "customData": {
    "buildId": "build-789",
    "commitHash": "abc123def456",
    "prNumber": 42
  }
}

Problem: You have to parse customData and hope the IDs are there. No standardized way to query back.

With domainId, you can link forward AND backward:

Deployment Failed event:
{
  "context": { "id": "deploy-event-999" },
  "links": [
    {
      "linkType": "RELATION",
      "linkKind": "causedBy",
      "target": {
        "domainId": "urn:circleci:xibz:build:789"
      }
    },
    {
      "linkType": "RELATION",
      "linkKind": "causedBy",
      "target": {
        "domainId": "urn:github:xibz:repo:commit:abc123def456"
      }
    },
    {
      "linkType": "RELATION",
      "linkKind": "causedBy",
      "target": {
        "domainId": "urn:github:xibz:repo:pr:42"
      }
    }
  ]
}

With this:

Consumer doesn't need to know CircleCI's context ID for build-789
Consumer doesn't need to know GitHub's context ID for the commit
Consumer can query by domainId URN directly
Causality is explicit and discoverable without parsing customData

This shows that domainId isn't just for "non-CDEvent systems"; it's also useful for querying across systems when you don't have context IDs.
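A consumer-side lookup by domainId could look roughly like the Python sketch below. It assumes events are available as plain dictionaries shaped like the examples in this comment. The helper name `index_by_domain_id` is hypothetical, and a real dashboard or audit system would more likely push this query into its own store.

```python
from collections import defaultdict
from typing import Any, Iterable


def index_by_domain_id(events: Iterable[dict[str, Any]]) -> dict[str, list[str]]:
    """Map each linked domainId to the context IDs of the events that reference it."""
    index = defaultdict(list)
    for event in events:
        event_id = event["context"]["id"]
        for link in event.get("links", []):
            domain_id = link.get("target", {}).get("domainId")
            if domain_id:
                index[domain_id].append(event_id)
    return dict(index)


events = [
    {
        "context": {"id": "deploy-event-999"},
        "links": [
            {"linkType": "RELATION", "linkKind": "causedBy",
             "target": {"domainId": "urn:github:xibz:repo:pr:42"}},
        ],
    },
    {
        "context": {"id": "build-event-789"},
        "links": [
            {"linkType": "RELATION", "linkKind": "triggeredBy",
             "target": {"domainId": "urn:github:xibz:repo:pr:42"}},
        ],
    },
]

index = index_by_domain_id(events)
# Every event that references PR 42, found without parsing customData.
print(index["urn:github:xibz:repo:pr:42"])
```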

Why it works

Each system uses what it knows. A system knows its own context IDs (contextId), and it knows how to identify the systems that trigger it (domainId URN). No system needs to know another system's internal IDs or context IDs.

FAQs

Why link something outside CDEvents?

Because causality exists outside CDEvents.

GitHub PRs cause builds.
Jira tickets cause tasks.
Datadog alerts cause rollbacks.

If you don't link them, you lose that causality. If you can't link them with contextId (because they're not CDEvents), you're forced to hide it in customData.

domainId lets you link anything, anywhere. That's why it matters.

If you don't solve cross-domain linking, who does?

Your engineers will. They'll put it in customData. Because causality is real whether CDEvents acknowledges it or not.

Doesn't this mean CDEvents becomes a catch-all for every system?

No. domainId is a stopgap until systems emit CDEvents natively.
As more systems adopt CDEvents, those domainId links naturally
migrate to contextId links. This is a temporary bridge, not a
permanent design decision.

davidB · 2026-03-02T19:34:24Z

Thanks, I now understand your proposal better, I guess.

I'm not sure it will be usable by producers. Often, when they emit an event, they have no information about the cause/trigger (that's also why I'm looking for some other rules, ...).
IMO, domainId and subject.id should be matchable to allow correlation. If my dashboard, SIEM, etc. receives a CDEvent about an incident and then receives a rollback with a link carrying a domainId (because the rollback system doesn't take CDEvents as input), I want to be able to link both events to the same "incident".

Side question: is "links" just for "trigger" and "causedBy", or can it be used to define other types of relation? For example, for a test: what the system under test is (a source, a change, an artifact), in which context it runs (CI, environment), and what triggered it (a scheduler, a change, a deployment, ...).

xibz · 2026-03-02T19:49:16Z

@davidB

Not every producer will always know the trigger or cause at emission time.

The proposal does not require links to always be present. It simply provides a standardized way to express causality when it is known.

If a producer does not know the trigger, it emits no link.

The key difference is:

Today:
When causality is known, it goes into customData in a tool-specific way.

With domainId:
When causality is known, it is expressed in a standard, traversable way.

This proposal does not require perfect causality capture. It enables correct modeling when information exists.

Regarding domainId and subject.id correlation

Correlation is exactly one of the motivations.

The intention is that domainId represents the canonical identity of an entity within its domain. subject.id is too flexible, hence the strict URN format.

If a system later emits a native CDEvent for that entity, the subject.id of that event should correspond to the same logical identifier represented in the domainId.

This allows dashboards, SIEM systems, and audit systems to correlate across both:
• native CDEvents (via contextId)
• API calls with no context.id (via domainId)

domainId is not meant to replace subject.id, but to provide a stable cross-domain reference when contextId is unavailable.

Are links only for trigger/cause?

Links are not limited to trigger/cause relationships.

They are intended to model typed relationships between entities.

Examples include:
• triggeredBy
• causedBy
• derivedFrom
• produced
• testedAgainst
• runsIn
• deployedTo
• dependsOn

The goal is not only causality modeling, but explicit relationship modeling.

This allows us to describe:
• what system was under test
• what artifact was deployed
• what environment was targeted
• what scheduler triggered execution

davidB · 2026-03-10T16:15:29Z

About the URN: after some searching, and considering both the pre-accepted proposal to convert subject.id to a global ID (#252) and the fact that the purpose of domainId is to link to a cause as if it had been triggered by a CDEvent, I propose the format: urn:cdevents:<subjectType>:<percentEncoded(subjectId)>

The provider is always cdevents because, in the URN format urn:<NID>:<NSS>, the NSS (Namespace Specific String) is unique within the NID (Namespace Identifier) and is defined by the owner of the NID. The NID may be registered (in the future) at https://www.iana.org/assignments/urn-namespaces/urn-namespaces.xhtml, so the provider cannot be datadog, github, etc.

An alternative is not to use domainId: <subjectUrn> but to accept <subjectType>Id: <subjectId>, consistent with how we reference subjects inside subject.content. Today we only have artifactId, but proposal #252 included replacing references of the form <subjectType>: { id: <subjectId>, source: <subjectSource> } with <subjectType>Id: <subjectId>.

xibz · 2026-03-10T16:32:05Z

> urn:cdevents::<percentEncoded(subjectId)>
I like that, but things may not line up appropriately if the trigger is not a CDEvent. So I think we should leave it flexible, but, if a CDEvent is known, then they can follow that format.

How does that sound @davidB?

davidB · 2026-03-10T16:32:25Z

The Link schema is complex (IMO, perhaps overly so):

embedded vs. non-embedded (with different requirements)
split between "END" / "PATH" / "RELATION"

Maybe this is also an opportunity to review and simplify it.

xibz · 2026-03-10T16:37:06Z

@davidB I think the complexity here is coming from the model trying to represent two genuinely different situations, rather than complexity for its own sake.

The embedded vs non-embedded split exists for a specific recovery case: sometimes the event graph becomes disconnected, and we need a way to reconnect it after the fact.

For example, imagine system A is now disconnected from B, and B and C are also disconnected from each other. In that situation, we may want to go back and repair those connections manually so the graph reflects the real flow again. Non-embedded links exist for that purpose. They let us express a connection even when we are not embedding or directly referencing a concrete CDEvent in the normal path structure.

That is also why the distinction between END, PATH, and RELATION is intentional.

END and PATH are meant to describe structural, navigable links in the event graph. RELATION is different: it is meant to describe a semantic relationship outside of that strict path model. I do not want to overload RELATION to cover everything, because then we lose an important distinction in the graph semantics.

In particular, when a link uses domainId, the implication is that we do not know the concrete CDEvent on the other side. At that point, it is not really the same thing as a direct CDEvent-to-CDEvent path. Preserving that distinction is important if we want the graph to carry meaning, rather than just storing generic connections.

So from my perspective, the schema is trying to model two different truths:

we know the exact event connection

we only know the semantic or domain-level connection
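The two cases can be told apart mechanically from the endpoint itself. The Python sketch below is only an illustration: the function name `connection_kind` and the labels "event" and "domain" are hypothetical, and treating an endpoint that carries both identifiers as an exact event connection is an assumption.

```python
def connection_kind(target: dict) -> str:
    """Classify a link endpoint by what the producer knew when it emitted the link."""
    if target.get("contextId"):
        # The concrete CDEvent on the other side is known.
        return "event"
    if target.get("domainId"):
        # Only the domain-level entity is known, not a concrete CDEvent.
        return "domain"
    raise ValueError("endpoint has neither contextId nor domainId")


print(connection_kind({"contextId": "circleci-task-event-456"}))
print(connection_kind({"domainId": "urn:github:xibz:repo:pr:42"}))
```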

Those cases look similar at first glance, but collapsing them into one shape or one relation type would blur semantics that are important for reconstructing and interpreting the graph correctly.

That said, I do think it is fair to ask whether the current shape is the simplest way to express that distinction. If there is a cleaner way to preserve those semantics without losing the separation, I would be very open to reviewing it.

davidB · 2026-03-10T16:43:41Z

> urn:cdevents::<percentEncoded(subjectId)>
> I like that, but things may not line up appropriately if the trigger is not a CDEvent. So I think we should leave it flexible, but, if a CDEvent is known, then they can follow that format.
>
> How does that sound @davidB?

You forgot the subjectType: urn:cdevents:<subjectType>:<percentEncoded(subjectId)> mandatory to help classify/scope the subjectId.

The idea is to treat things that have no CDEvent as if they had one. If we reuse your three samples (GitHub, Jira, Datadog), they match existing subject types (whether or not they were generated by a CDEvent):

GitHub PR: urn:github:xibz:repo:pr:42
-> urn:cdevents:change:<percentEncoded("https://github.com/xibz/repo/pull/42")>
Jira ticket: urn:jira:xibz:project:ticket:12345
-> urn:cdevents:ticket:<percentEncoded("https://xbiz.jira.com/PRO-12345")> (I don't remember the URI of a ticket; it depends on the plan, cloud vs. on-premises, ...)
Datadog alert: urn:datadog:prod:monitor:alert:98765
-> urn:cdevents:incident:<percentEncoded(API URL of the incident)>
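The mapping above can be sketched in Python with the standard library's percent-encoding helpers. The function names `subject_urn` and `parse_subject_urn` are hypothetical. Encoding every reserved character (`safe=""`) is an assumption made so that the subject ID can never introduce an extra colon into the URN.

```python
from urllib.parse import quote, unquote


def subject_urn(subject_type: str, subject_id: str) -> str:
    """Build urn:cdevents:<subjectType>:<percentEncoded(subjectId)>."""
    # safe="" also encodes "/" and ":", so the encoded ID stays a single segment.
    return f"urn:cdevents:{subject_type}:{quote(subject_id, safe='')}"


def parse_subject_urn(urn: str) -> tuple[str, str]:
    """Recover (subjectType, subjectId) from a URN built by subject_urn."""
    # Unpacking raises ValueError if the URN has fewer than four segments.
    prefix, nid, subject_type, encoded_id = urn.split(":", 3)
    if prefix != "urn" or nid != "cdevents":
        raise ValueError(f"not a urn:cdevents URN: {urn!r}")
    return subject_type, unquote(encoded_id)


urn = subject_urn("change", "https://github.com/xibz/repo/pull/42")
print(urn)
print(parse_subject_urn(urn))
```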

xibz · 2026-03-10T16:47:08Z

> The idea is to treat things that have no CDEvent as if they had one. If we reuse your three samples (GitHub, Jira, Datadog), they match existing subject types (whether or not they were generated by a CDEvent):

For known CDEvents subject types, I agree. But if the subject is outside CDEvents, modeling it as urn:cdevents:* is semantically confusing, because it implies a CDEvents classification rather than just providing an identifier.

davidB · 2026-03-10T16:58:17Z

If the subject is outside CDEvents, we can encourage producers to create a "custom" one using dotted notation, as for custom events.
incident is a shortcut for dev.cdevents.incident, but there would be no shortcut for custom types. WDYT?

My issue is that without subjectType, we lose information and a scope for the identifier.
TBH, without subjectType, I will have to fall back to customData or tags to create relations, for example between a testsuiterun and a set of services or artifacts, an environment, or a pipelinerun.

EDIT: without subjectType, we don't need urn; we can just use the uri/id and let the consumer handle it (to guess what it is).

xibz · 2026-03-10T17:45:04Z

Using a common URN format helps systems parse identifiers consistently. But if the subject is external to CDEvents, the producer still has to decide how to classify it. That means two producers may assign different CDEvents subject types to the same thing. In that case, the format is standardized, but the meaning is not.

xibz · 2026-03-10T17:48:36Z

> I will have to fall back to customData or tags to create relations, for example between a testsuiterun and a set of services or artifacts, an environment, or a pipelinerun.

That is a worst-case scenario. This solution is not meant to be foolproof when producers are unable to provide all of the relevant information. That is part of the purpose of domainId: to help create connections where they may not otherwise be obvious.

If subjectType is known, producers can provide it as additional context for known CDEvent types, and that is a constraint we could add. For non-CDEvent types, though, I am not convinced it provides the same value.

I would also argue that, for known CDEvent subject types, the domainId should already be sufficient to derive the subject ID in a consistent way. I am less convinced that this works well for custom types. And more broadly, your proposal ends up mirroring subjectId almost identically. I am not sure we gain enough by making the representation that close.

davidB · 2026-03-10T18:01:01Z

If subjectType is unknown, we could be explicit about it.
urn:cdevents:<subjectType|"unknown">:<percentEncoded(subjectId)> ?

davidB · 2026-03-17T17:13:03Z

+        "contextType": {
+          "type": "string",
+          "minLength": 1
+        },
        "contextId": {
          "type": "string",
          "minLength": 1
+        },
+        "subjectId": {
+          "type": "string",
+          "minLength": 1


What we need is the subject's type (the fragment from context.type).
I thought about providing a struct instead of flattened fields, like:

"target": { "subject": { "type": "pipelinerun", "id": "https://....." } }
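For illustration, the subject type could be derived from context.type roughly as in the Python sketch below. It assumes core CDEvents type strings of the form dev.cdevents.<subject>.<predicate>.<version>. The function names are hypothetical, the version and URL in the example are placeholders, and custom event types are deliberately not handled.

```python
def subject_type_from_context_type(context_type: str) -> str:
    """Extract the <subject> fragment from dev.cdevents.<subject>.<predicate>.<version>."""
    parts = context_type.split(".")
    if len(parts) < 4 or parts[0] != "dev" or parts[1] != "cdevents":
        raise ValueError(f"not a core CDEvents type: {context_type!r}")
    return parts[2]


def subject_target(context_type: str, subject_id: str) -> dict:
    """Build the nested target form suggested above."""
    return {
        "subject": {
            "type": subject_type_from_context_type(context_type),
            "id": subject_id,
        }
    }


# Placeholder type version and ID, for illustration only.
print(subject_target("dev.cdevents.pipelinerun.started.0.2.0", "https://ci.example.com/runs/17"))
```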

EDIT:
FYI, I started to use a similar "links" structure as part of customData. (I hope that using a similar approach will ease the upgrade once this is released and consumers support both locations.)
see https://github.com/cdviz-dev/transformers-community/blob/main/RULES.md#use-customdata

xibz requested a review from a team as a code owner February 11, 2026 18:42

davidB reviewed Mar 17, 2026

xibz force-pushed the links branch 2 times, most recently from e3ab50a to fa2fa86 Compare April 16, 2026 21:30

xibz force-pushed the links branch 2 times, most recently from a1c4db4 to 228a20e Compare April 28, 2026 15:33

xibz force-pushed the links branch from 228a20e to 249e081 Compare April 29, 2026 19:13
