fix: critical bug fixes in query extraction, FAISS search, and MCP config#425
Open
zhengdaqi wants to merge 1 commit into
Open
fix: critical bug fixes in query extraction, FAISS search, and MCP config#425zhengdaqi wants to merge 1 commit into
zhengdaqi wants to merge 1 commit into
Conversation
…nfig - Fix broken regex in r1_searcher_query_extract: unescaped `|` in `<|begin_of_query|>` was treated as regex alternation, causing the pattern to never match the intended token - Fix FAISS search returning wrong passages when fewer than top_k neighbors exist (-1 padding indices were used to index contents) - Fix remote MCP server config stored as tuple instead of dict due to trailing comma in both build() and load_pipeline_context() - Fix _surveycpm_to_one_line() duplicating content by concatenating the same text with an unnecessary recursive call - Fix evaluate_trec_pvalue crashing when metrics parameter is None (function signature allows None but loop body did not guard it) - Fix Exa web search backend silently using "EMPTY" as API key; now raises ToolError with a clear message, consistent with Tavily backend Co-authored-by: Cursor <cursoragent@cursor.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Six bug fixes for correctness issues that produce wrong results or crash at runtime. All fixes are non-breaking for valid inputs.
1. Broken regex in
r1_searcher_query_extract(95/100 severity)The regex pattern
r"<|begin_of_query|>([^<]*)"treats|as alternation, so it matches<ORbOReOR ... instead of the literal token<|begin_of_query|>. This completely breaks query extraction in the Search-R1 pipeline.Fix: Escape the pipes →
r"<\|begin_of_query\|>([^<]*)".2. Remote MCP server config stored as tuple (90/100)
A trailing comma turns the config into a 1-tuple
(dict,)instead of a dict:This affects both
build()andload_pipeline_context(), breaking all HTTP/HTTPS remote MCP server configurations.3. FAISS search returns wrong passages (85/100)
IndexIDMap.search()returns-1for padding when fewer thantop_kneighbors exist. The code usedself.contents[-1], silently returning the last document instead of filtering the padding.4.
_surveycpm_to_one_line()content duplication (78/100)The function concatenated content with a recursive call on the same content string, producing doubled output text:
5. Exa web search uses "EMPTY" API key silently (73/100)
When
EXA_API_KEYis not configured, the backend silently setapi_key="EMPTY", leading to confusing 401 errors at search time. Now raisesToolErrorwith a clear message immediately, consistent with the Tavily backend's behavior.6.
evaluate_trec_pvaluecrashes whenmetricsis None (72/100)The function signature allows
metrics: List[str] | None, but the loopfor m in metricsraisesTypeErrorwhenNoneis passed. Added a default metrics list.Test Plan
pytestto verify no regressions<|begin_of_query|>tokensMade with Cursor