get_diff / get_files: JSON serialization inflates diff payload far beyond token limits #2242

@Lunatik-006

Problem

The pull_request_read methods get_diff and get_files return diff content as JSON-encoded strings. A unified diff is already a text format, but wrapping it in a JSON string escapes every newline, double quote, tab, and backslash (\n, \", \t, \\), which inflates the payload well beyond the raw text size.
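The overhead described above can be measured with a minimal sketch like the following. The sample diff and the helper name `escape_overhead` are made up for illustration. The `layers=2` case is only an assumption: it models a JSON string that is itself embedded in another JSON document, and the issue does not confirm that the server does this.

```python
import json

def escape_overhead(text: str, layers: int = 1) -> float:
    """Return encoded size / raw size in UTF-8 bytes after `layers` JSON encodings."""
    encoded = text
    for _ in range(layers):
        # Each pass escapes newlines, tabs, quotes and backslashes,
        # and wraps the result in a pair of double quotes.
        encoded = json.dumps(encoded)
    return len(encoded.encode("utf-8")) / len(text.encode("utf-8"))

# Hypothetical diff fragment with tabs and quotes, as in typical source code.
sample_diff = (
    "--- a/config.ts\n"
    "+++ b/config.ts\n"
    "@@ -1,3 +1,3 @@\n"
    " export const config = {\n"
    '-\tname: "old",\n'
    '+\tname: "new",\n'
    " };\n"
)

single = escape_overhead(sample_diff, layers=1)
nested = escape_overhead(sample_diff, layers=2)  # assumption: string encoded twice
print(f"single encoding: {single:.2f}x, nested encoding: {nested:.2f}x")
```

The ratio depends on how dense the diff is in escapable characters: short lines, tab indentation, and quoted strings all push it up.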

Real-world example

A PR with 2,920 lines of diff (~73 KB of raw text) produces a JSON response of more than 125 KB, roughly 1.7x the raw size, which exceeds the MCP token limit. The tool returns an error instead of the diff.

The PR changes about 1,700 lines (additions + deletions), which is not unusually large. It includes some generated files (src/generated/graphql.ts), but even without them the JSON overhead makes moderate PRs hit the limit.

Why this matters

  • The token limit is hit not because the diff is large, but because JSON serialization inflates it
  • get_files has the same problem — each file's patch field is a JSON-encoded diff string
  • This forces users to fall back to gh pr diff via shell, losing the benefit of MCP

Suggestion

Return diff content as raw text rather than a JSON-wrapped string, or use a more efficient serialization that doesn't escape every newline. This would let get_diff handle considerably larger PRs than it can today with no other changes.
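As a rough illustration of the suggestion, the sketch below counts how many diff lines fit in a fixed byte budget when returned raw versus JSON-wrapped. The budget value, the sample line, and the helper names are hypothetical. The issue does not state the actual limit, and the server's real response shape may differ.

```python
import json
from typing import Callable

def lines_within_budget(
    lines: list[str], budget: int, encode: Callable[[str], str]
) -> int:
    """Count how many leading diff lines fit in `budget` bytes after encoding."""
    fitted = 0
    for count in range(1, len(lines) + 1):
        payload = encode("".join(lines[:count]))
        if len(payload.encode("utf-8")) > budget:
            break
        fitted = count
    return fitted

def raw_text(text: str) -> str:
    """Proposed shape: the diff is returned verbatim."""
    return text

def json_wrapped(text: str) -> str:
    """Shape described in the issue: the diff as a JSON-encoded string."""
    return json.dumps(text)

# Hypothetical inputs: 200 identical added lines and a 4 KiB budget.
diff_lines = ['+\tconst label = "item";\n'] * 200
BUDGET = 4096

raw_fit = lines_within_budget(diff_lines, BUDGET, raw_text)
json_fit = lines_within_budget(diff_lines, BUDGET, json_wrapped)
print(f"raw: {raw_fit} lines, JSON-wrapped: {json_fit} lines")
```

The gap between the two counts grows with the number of escapable characters per line, so the benefit of raw text depends on the content of the diff.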
