feat: record user agent and best effort client #158

Open
pawbana wants to merge 3 commits into main from pb/user-agent

Conversation

@pawbana
Contributor

@pawbana pawbana commented Jan 29, 2026

Adds the raw user agent value and a heuristically guessed client name to interception recording.

Fixes: #31

Contributor Author

pawbana commented Jan 29, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.

@matifali
Member

matifali commented Jan 30, 2026

This targets #31 and not #20. Let's keep #20 open till we have a complete system of identification in place. Thanks :)

@pawbana pawbana force-pushed the pb/user-agent branch 2 times, most recently from fa711aa to b16f48a Compare February 3, 2026 17:49
@pawbana pawbana changed the title from "feat: add user agent recording" to "feat: record user agent and best effort client" Feb 3, 2026
@pawbana pawbana marked this pull request as ready for review February 3, 2026 17:57
Collaborator

What more would you need to see added for this to be considered "complete system of identification" @matifali?

Collaborator

@dannykopping dannykopping left a comment

LGTM!

@matifali
Member

matifali commented Feb 4, 2026

> What more would you need to see added for this to be considered "complete system of identification" @matifali?

I think when:

  1. This gets emitted in logs (if not already)
  2. Request Logs View
  3. Prometheus metrics
  4. Anything I missed?

@matifali
Member

matifali commented Feb 4, 2026

> What more would you need to see added for this to be considered "complete system of identification" @matifali?

See my comment on issue: #20 (comment)

Contributor

@ssncferreira ssncferreira left a comment

LGTM 👍

getResponseIDFunc: getAnthropicResponseID,
createRequest: createAnthropicMessagesReq,
expectedMsgID: "msg_01Pvyf26bY17RcjmWfJsXGBn",
userAgent: "GitHubCopilotChat/0.37.2026011603",
Contributor

Nice 👍 This is the one used for Copilot in VS Code.
nit: maybe you could also add a test for Copilot CLI; its user agent is usually something like User-Agent: copilot/0.0.403 (client/cli linux v24.11.1)

Collaborator

Oh good call; we probably want to distinguish the CLI and VS Code versions (and others).
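
A minimal sketch of how the two Copilot variants mentioned above could be told apart. The helper name, the client name strings, and the exact matching rules are assumptions for illustration only; they are not taken from this PR. Only the two user agent strings come from the thread.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical client names; the constants in the PR may differ.
const (
	clientCopilotVSCode = "GitHub Copilot (VS Code)"
	clientCopilotCLI    = "GitHub Copilot (CLI)"
	clientUnknown       = "Unknown"
)

// guessCopilotClient is an illustrative helper, not the PR's function.
// It matches only on the two user agent shapes quoted in the review.
func guessCopilotClient(userAgent string) string {
	switch {
	case strings.HasPrefix(userAgent, "GitHubCopilotChat/"):
		return clientCopilotVSCode
	case strings.HasPrefix(userAgent, "copilot/") && strings.Contains(userAgent, "client/cli"):
		return clientCopilotCLI
	default:
		return clientUnknown
	}
}

func main() {
	for _, ua := range []string{
		"GitHubCopilotChat/0.37.2026011603",
		"copilot/0.0.403 (client/cli linux v24.11.1)",
		"something-else/1.0", // placeholder, not a real client
	} {
		fmt.Printf("%q -> %s\n", ua, guessCopilotClient(ua))
	}
}
```

Requiring both the copilot/ prefix and the client/cli marker keeps the CLI match narrow, so other copilot/-prefixed agents would still fall through to Unknown.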

Model string
Provider string
StartedAt time.Time
UserAgent string
Contributor

Since we already have Client do we need to store this? Wouldn't it be possible to add it to Metadata?

Contributor Author

I would consider UserAgent a separate entity from Metadata.

The way we store it in the coder DB (as part of metadata) is an implementation detail. If AI Bridge were part of the coder/coder repo, then yes, it would make sense to add it to metadata here.

Contributor

> I would consider UserAgent a separate entity from Metadata.

Why is that? I'm not too familiar with how metadata is used in aibridge.
I think storing client as a separate field makes sense since it's data we validate and generate ourselves. But the raw user agent header is just additional context that may not always be relevant (e.g., when client is "unknown").
Since the user agent ends up being added to metadata in coder anyway, couldn't we just add it directly to metadata here and skip the extra field/proto changes?

This is a non-blocking comment: I'm not saying this is wrong, just trying to understand the reasoning 🙂

Contributor Author

@pawbana pawbana Feb 6, 2026

In general I'm open to adding the user agent as a metadata field, and I agree it would simplify the code in the coder PRs a bit. But since we have separate repos, I'd like to maintain "neutrality" and not think about coder implementation details in AI Bridge PRs. Maybe I'm a bit too strict with this approach.

I think this approach is more flexible. If an AI Bridge user wants to store the user agent in metadata, that is possible (as later PRs do); if they would rather store it differently, that is also possible without deleting things from metadata. With separate fields there is a clear boundary: Metadata == what was set in AsActor.Metadata. If a user wanted it in metadata, it could be added earlier, in AsActor. Maybe that would be a better approach (although then each user would have to extract the user agent themselves)?
Now that I look at it, it seems a bit unnecessary that metadata simply goes back and forth, but I think there will be some use for it later.
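
To make the trade-off concrete, here is a rough sketch of the "separate field" option from the consumer's side. The struct is a trimmed-down, hypothetical stand-in for the record in the diff, and the helper name and the "user_agent" key are assumptions, not what the coder PRs actually use.

```go
package main

import "fmt"

// interceptionRecord is a hypothetical, reduced version of the record
// discussed above. Metadata holds only what the caller supplied.
type interceptionRecord struct {
	Model     string
	Provider  string
	Client    string
	UserAgent string
	Metadata  map[string]any
}

// withUserAgentInMetadata shows how a consumer that prefers metadata
// storage could fold the user agent in on its own side. It returns a
// copy, so the caller-supplied metadata stays untouched.
func withUserAgentInMetadata(rec interceptionRecord) map[string]any {
	out := make(map[string]any, len(rec.Metadata)+1)
	for k, v := range rec.Metadata {
		out[k] = v
	}
	if rec.UserAgent != "" {
		out["user_agent"] = rec.UserAgent // key name is an assumption
	}
	return out
}

func main() {
	rec := interceptionRecord{
		Provider:  "anthropic",
		Client:    "Unknown",
		UserAgent: "roo-code/1.0.0", // illustrative value
		Metadata:  map[string]any{"team": "platform"},
	}
	fmt.Println(withUserAgentInMetadata(rec))
	fmt.Println(rec.Metadata) // original metadata is unchanged
}
```

A consumer that wants the user agent in its own column can simply read rec.UserAgent and skip this step, which is the flexibility argued for above.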

ClientRoo = "Roo Code"
ClientCursor = "Cursor"
ClientUnknown = "Unknown"
)
Member

@matifali matifali Feb 6, 2026

Can we add cline and JetBrains to the list?

There is some investigation here

Contributor Author

Unfortunately, the user agents for both cline and JetBrains are not useful.
If I understand correctly, they come from a JS library and a Kotlin library rather than from the applications themselves.

Member

I think it's safe to assume that ktor-client will always be JetBrains

Ideally, we should allow folks to specify their own mapping somewhere in the UI, but that's for another time.

Contributor Author

Maybe I'm too defensive here, but I think we should set the client only when we have very high confidence that it is correct, i.e. when the user agent (or some other header) uniquely identifies the application.

My assumption is that it is better to have as few false positives as possible, even at the cost of not recording JetBrains in the 90% of cases where the guess would be correct. For investigation, there is a fallback: the raw user agent is added to metadata if needed.
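
One way to reconcile both positions, sketched under assumptions: keep the built-in rules limited to prefixes that uniquely identify an application, and let an operator opt in to looser mappings such as ktor-client. The rule type, function name, and the "Kilo Code" string are hypothetical; no such configuration exists in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

const clientUnknown = "Unknown"

// prefixRule maps a user agent prefix to a client name (hypothetical type).
type prefixRule struct {
	Prefix string
	Client string
}

// builtinRules holds only prefixes assumed to identify one application.
// "Kilo Code" is an assumed display name; "Roo Code" appears in the diff.
var builtinRules = []prefixRule{
	{Prefix: "kilo-code/", Client: "Kilo Code"},
	{Prefix: "roo-code/", Client: "Roo Code"},
}

// guessClient checks built-in rules first, then operator-supplied ones.
// Anything unmatched stays Unknown, so false positives are opt-in only.
func guessClient(userAgent string, extra []prefixRule) string {
	for _, rules := range [][]prefixRule{builtinRules, extra} {
		for _, r := range rules {
			if strings.HasPrefix(userAgent, r.Prefix) {
				return r.Client
			}
		}
	}
	return clientUnknown
}

func main() {
	// Without extra rules, a generic library user agent is not attributed.
	fmt.Println(guessClient("ktor-client", nil))

	// An operator who accepts the risk can map it explicitly.
	optIn := []prefixRule{{Prefix: "ktor-client", Client: "JetBrains"}}
	fmt.Println(guessClient("ktor-client", optIn))
}
```

With this split the default behavior stays conservative, and the raw user agent remains available for anyone investigating the Unknown cases.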

Comment on lines +335 to +338
case strings.HasPrefix(userAgent, "kilo-code/") || originator == "kilo-code":
return ClientKilo
case strings.HasPrefix(userAgent, "roo-code/") || originator == "roo-code":
return ClientRoo
Member

@matifali matifali Feb 6, 2026

When the provider is Anthropic, both Kilo Code and Roo Code send a different user agent. Please see: #20 (comment)

Contributor Author

In the Anthropic case the user agents do not seem useful, as they look like they originate from a JS library. If originator header is set then it should still work.

Member

> If originator header is set then it should still work.

Unfortunately this is also only set for OpenAI :(

Development

Successfully merging this pull request may close these issues.

Collect user agent as metadata for interceptions

4 participants