docs(advance): add Add a New Speculative Decoding Method guide #4589

Open
SuperMarioYL wants to merge 1 commit into InternLM:main from SuperMarioYL:docs/spec-decoding-new-method



@SuperMarioYL

Motivation

The PyTorch engine has a clean plug-in surface for speculative decoding
(BaseSpecProposer + SPEC_PROPOSERS registry in
lmdeploy/pytorch/spec_decode/proposers/base.py), and four shipped
methods register against it: eagle, eagle3, deepseek_mtp,
qwen3_5_mtp. The user-facing docs/en/advance/spec_decoding.md
teaches usage of those four names but never explains how to add a
fifth, and users have asked for that guidance in issues #1738 and #4530.
Both are open. A short extension-contract page closes the gap without
locking the engine into anything new.
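The registry-plus-method-string pattern described above can be illustrated with a small self-contained sketch. Everything in it is a stand-in: `SketchRegistry`, `SKETCH_PROPOSERS`, `SketchBaseProposer`, and `build_sketch_proposer` are hypothetical names chosen for this example and are not lmdeploy's `SPEC_PROPOSERS`, `BaseSpecProposer`, or `build_specdecode_proposer`. It only shows why a decorated class must be imported before its method string can be resolved.

```python
# Stand-in registry: NOT lmdeploy's implementation, only the general pattern.
from typing import Callable, Dict, Type


class SketchRegistry:
    """Maps a method string to a proposer class."""

    def __init__(self) -> None:
        self._modules: Dict[str, Type] = {}

    def register_module(self, name: str) -> Callable[[Type], Type]:
        def decorator(cls: Type) -> Type:
            if name in self._modules:
                raise KeyError(f'{name!r} is already registered')
            self._modules[name] = cls
            return cls
        return decorator

    def get(self, name: str) -> Type:
        if name not in self._modules:
            known = ', '.join(sorted(self._modules))
            raise KeyError(f'unknown method {name!r}; registered: {known}')
        return self._modules[name]


SKETCH_PROPOSERS = SketchRegistry()


class SketchBaseProposer:
    """Hypothetical base class standing in for BaseSpecProposer."""


# Registration happens when this class statement executes, i.e. at import
# time. A module that is never imported never registers its class.
@SKETCH_PROPOSERS.register_module(name='my_method')
class MyMethod(SketchBaseProposer):
    pass


def build_sketch_proposer(method: str) -> SketchBaseProposer:
    """Hypothetical build entry point: look up the class, then instantiate."""
    return SKETCH_PROPOSERS.get(method)()


proposer = build_sketch_proposer('my_method')
print(type(proposer).__name__)  # MyMethod
```

Because the decorator only runs on import, a package-level import of the new module is what makes the method string visible to the lookup, which is presumably the reason the guide asks contributors to touch proposers/__init__.py.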

Modification

Add docs/en/advance/spec_decoding_new_method.md and a toctree entry
for it in docs/en/index.rst, right next to spec_decoding.md.

The page mirrors the shape of the existing
docs/en/advance/pytorch_new_model.md (which documents the model-patch
extension contract):

  1. The registry / base class / method string triad.
  2. The build_specdecode_proposer entry point and why
    proposers/__init__.py must import the new class.
  3. What BaseSpecProposer already provides so contributors don't
    re-implement weight loading, draft forward, decoding-input update,
    or fallbacks.
  4. A minimal MyMethod(BaseSpecProposer) skeleton with
    @SPEC_PROPOSERS.register_module(name='my_method').
  5. The 3-tuple return contract for get_outputs (draft token ids,
    model_metas, target_hidden_states).
  6. When to override build_model, illustrated with the two in-tree
    precedents (Qwen3_5MTP shares the target embeddings; Eagle3
    swaps embeddings conditionally and widens
    get_target_hidden_size).
  7. A 5-item shipping checklist.
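Items 4 and 5 in the list above can be pictured with a toy proposer. This is a minimal sketch under stated assumptions, not the code from the new page: `SketchBaseProposer` is a hypothetical stand-in for `BaseSpecProposer`, the `model_outputs` dictionary and its keys are invented for illustration, and plain Python lists replace tensors so the example runs without the engine. Only the shape of the return value (draft token ids, model_metas, target_hidden_states) is taken from the PR description.

```python
from typing import Any, Dict, List, Optional, Tuple

Outputs = Tuple[List[int], Optional[Any], List[List[float]]]


class SketchBaseProposer:
    """Stand-in for BaseSpecProposer; the real base class offers more hooks."""

    def get_outputs(self, model_outputs: Dict[str, Any]) -> Outputs:
        raise NotImplementedError


# In-tree, the class would additionally carry the decorator quoted in item 4:
# @SPEC_PROPOSERS.register_module(name='my_method')
class MyMethod(SketchBaseProposer):

    def get_outputs(self, model_outputs: Dict[str, Any]) -> Outputs:
        # Hypothetical input layout: one row of scores per sequence.
        logits = model_outputs['logits']
        # Greedy draft: take the index of the highest score in each row.
        draft_token_ids = [
            max(range(len(row)), key=row.__getitem__) for row in logits
        ]
        # Pass metadata through unchanged; None when the model has none.
        model_metas = model_outputs.get('model_metas')
        # Hidden states handed back for the next draft step.
        target_hidden_states = model_outputs['hidden_states']
        return draft_token_ids, model_metas, target_hidden_states


toy_outputs = {
    'logits': [[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]],
    'hidden_states': [[1.0], [2.0]],
}
draft_token_ids, model_metas, target_hidden_states = MyMethod().get_outputs(toy_outputs)
print(draft_token_ids)  # [1, 0]
```

The point of the sketch is the fixed three-element return: a subclass that returns fewer or differently ordered values would break whatever code unpacks the tuple.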

No code changes. All snippets and references point to symbols that
exist in lmdeploy/pytorch/spec_decode/proposers/.

BC-breaking

None — docs only.

Use cases

Anyone wanting to add a new draft-token proposer (e.g. the DFlash
method requested in #4530) can now read one page and know which class
to subclass, which method to implement, what to return, and where to
register.

Checklist

  • pre-commit run --files docs/en/advance/spec_decoding_new_method.md docs/en/index.rst passes (mdformat, codespell, trailing whitespace, end-of-file, copyright check).
  • Documentation only; no code touched, no new tests needed.
  • Existing supported versions unaffected.
  • Doc cross-links to spec_decoding.md and explicitly names the four shipped methods so the new page does not drift from them.

Closes (partially) the docs side of #1738 and #4530.

Document the BaseSpecProposer + SPEC_PROPOSERS extension contract so
that third parties can add a draft-token proposer without reverse
engineering the engine. The existing spec_decoding.md teaches usage
for the four shipped methods (eagle, eagle3, deepseek_mtp,
qwen3_5_mtp) but does not explain the plug-in surface; users have
asked for this in InternLM#1738 and InternLM#4530.

Contents follow the same shape as docs/en/advance/pytorch_new_model.md:
the registry / base-class / method-string triad, what BaseSpecProposer
already implements, a minimal new proposer, the get_outputs contract,
when to override build_model (with the in-tree Qwen3_5MTP and Eagle3
examples), and a 5-item shipping checklist.

Add the page to docs/en/index.rst under the Advance section right
next to spec_decoding.md.
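The "share the target embeddings" case for overriding build_model, mentioned in the commit message, might look roughly like the sketch below. It is an illustration only: `ToyModel`, `SketchBaseProposer`, `SharedEmbeddingProposer`, and the `embed_tokens` attribute are hypothetical, and the body is not the in-tree Qwen3_5MTP or Eagle3 code. Only the method signature is copied from the diff excerpt in this PR.

```python
class ToyModel:
    """Hypothetical model that owns an embedding table (a plain list here)."""

    def __init__(self, name: str) -> None:
        self.embed_tokens = [f'{name}-row-{i}' for i in range(4)]


class SketchBaseProposer:
    """Stand-in for BaseSpecProposer."""

    def __init__(self) -> None:
        self.model = None

    def build_model(self, empty_init, target_model=None, build_model_ctx=None):
        # Stand-in default: build the draft model with its own embeddings.
        self.model = ToyModel('draft')


class SharedEmbeddingProposer(SketchBaseProposer):

    def build_model(self, empty_init, target_model=None, build_model_ctx=None):
        # Let the base class build the draft model first.
        super().build_model(empty_init,
                            target_model=target_model,
                            build_model_ctx=build_model_ctx)
        if target_model is not None:
            # Point at the target's table instead of keeping a second copy.
            self.model.embed_tokens = target_model.embed_tokens


target = ToyModel('target')
proposer = SharedEmbeddingProposer()
proposer.build_model(empty_init=False, target_model=target)
print(proposer.model.embed_tokens is target.embed_tokens)  # True
```

Calling the base implementation first and patching afterwards keeps the override small. When no target model is passed, the draft model simply keeps its own table.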
lvhan028 requested review from RunningLeon and Copilot on May 18, 2026 03:14
lvhan028 added the documentation label on May 18, 2026
Contributor

Copilot AI left a comment


Pull request overview

Adds a new documentation page that explains how to extend the PyTorch engine's speculative decoding pipeline with a new proposer, and wires it into the docs toctree. This addresses the docs gap referenced by issues #1738 and #4530.

Changes:

  • Adds docs/en/advance/spec_decoding_new_method.md walking through the SPEC_PROPOSERS registry, BaseSpecProposer contract, get_outputs return tuple, when to override build_model, and a contributor checklist.
  • Registers the new page in docs/en/index.rst next to spec_decoding.md.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| docs/en/advance/spec_decoding_new_method.md | New guide describing the proposer plug-in contract, with examples mirrored from the in-tree deepseek_mtp, eagle3, and qwen3_5_mtp proposers. |
| docs/en/index.rst | Adds the new doc to the advance toctree. |

Verified against lmdeploy/pytorch/spec_decode/proposers/{base,deepseek_mtp,eagle3,qwen3_5_mtp}.py: registry name, build_specdecode_proposer signature, BaseSpecProposer API surface, the get_outputs 3-tuple, and the Eagle3/Qwen3_5MTP build_model overrides quoted in the doc all match the current code.


Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@SPEC_PROPOSERS.register_module(name='qwen3_5_mtp')
class Qwen3_5MTP(DeepseekMTP):

    def build_model(self, empty_init, target_model=None, build_model_ctx=None):
Collaborator


One may also need to make changes in lmdeploy/pytorch/configurations and add a model definition in lmdeploy/pytorch/models.


4 participants