[GSoC 2026] docs(chatbot): latency numbers by berardifra · Pull Request #68 · intelowlproject/docs

berardifra · 2026-06-24T16:45:06Z

Description

Follow-up to #65 (chatbot docs): adds the citable end-to-end latency numbers from the W10 latency
benchmark to the Fine-tuning & Prompting guide ("Choosing a model").

chatbot_tuning.md ("Choosing a model"): replaces the qualitative latency text with measured
warm numbers (no-tool ~5–6 s, tool-backed ~30–50 s, first token ~0.3–13 s) and the one-time ~70 s
cold-load. States the empirical finding that a 7B model (mistral) did not emit tool calls on
this stack (it answered with invented data) — so qwen2.5:3b is the default for tool-calling
reliability, not only speed.

The _Available from version >= 6.7.0_ availability note is already on main (added in #65), so this
PR only adds the latency numbers.

Numbers come from a live end-to-end benchmark (real Ollama, no mocks, CPU-only): qwen2.5:3b
(3.1B, Q4_K_M) vs mistral:latest (7.2B, Q4_K_M).

Refs intelowlproject/IntelOwl#3810

Checklist

The pull request is for the branch main
I added documentation related to existing work and referenced it (follow-up to [GSoC 2026] docs(chatbot): user, deployment, developer & fine-tuning guides #65; Refs [GSoC 2026] docs: add chatbot latency numbers + availability note IntelOwl#3810)
- Availability is already indicated on main (_Available from version >= 6.7.0_, added in [GSoC 2026] docs(chatbot): user, deployment, developer & fine-tuning guides #65)
N/A — this is additive documentation, not a fix, so it is opened as a PR (not pushed directly to main)

[GSoC 2026] docs(chatbot): add measured latency numbers

7a9bdc0

berardifra requested a review from mlodic June 24, 2026 16:45

berardifra changed the title ~~[GSoC 2026] docs(chatbot): latency numbers~~ [GSoC 2026] docs(chatbot): latency numbers Jun 24, 2026

mlodic approved these changes Jun 25, 2026

View reviewed changes

berardifra merged commit 4b9c8fa into main Jun 25, 2026

berardifra deleted the gsoc-2026/chatbot-latency-docs branch June 25, 2026 08:11

berardifra mentioned this pull request Jun 25, 2026

[GSoC 2026] docs: add chatbot latency numbers + availability note intelowlproject/IntelOwl#3810

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[GSoC 2026] docs(chatbot): latency numbers#68

[GSoC 2026] docs(chatbot): latency numbers#68
berardifra merged 1 commit into
mainfrom
gsoc-2026/chatbot-latency-docs

berardifra commented Jun 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

berardifra commented Jun 24, 2026

Description

Checklist

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants