Skip to content

Upgrade RagFlow to latest v0.24.1#16

Open
JasleenKaurSethi wants to merge 336 commits intomainfrom
upgrade-0.24.1
Open

Upgrade RagFlow to latest v0.24.1#16
JasleenKaurSethi wants to merge 336 commits intomainfrom
upgrade-0.24.1

Conversation

@JasleenKaurSethi
Copy link

What problem does this PR solve?

Upgrade to latest v0.24.0 upstream Ragflow

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • [ x] New Feature (non-breaking change which adds functionality)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

JimmyBenKlieve and others added 30 commits January 13, 2026 09:41
### What problem does this PR solve?

Trailing white-spaces in commit 6814ace
got automatically trimmed by code editor may causes documentation
typesetting broken.

Mostly for double spaces for soft line breaks.  

### Type of change

- [x] Documentation Update
### What problem does this PR solve?

infiniflow#12558
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
…ow#12565)

### What problem does this PR solve?

1. PaddleOCR PDF parser supports thumnails and positions.
2. Add FAQ documentation for PaddleOCR PDF parser.


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
Fixes infiniflow#12440
### What problem does this PR solve?
The current implementation uses `python3 -m pip` which can fail in
certain environments. This change leverages `uv pip install` instead,
which aligns with the project's existing tooling.

### Type of change
- Removed the ensurepip line (not needed since uv manages pip)
- Changed python3 to "$PY" for consistency with the rest of the script
- Changed python3 -m pip install to uv pip install

Co-authored-by: Gongzi <gongzi@192.168.0.100>
Otherwise, slide files cannot be opened in Chat module

### What problem does this PR solve?

Backend Reason (API): In the api/utils/web_utils.py file of the backend,
the CONTENT_TYPE_MAP dictionary is missing ppt and pptx.
MIME type mapping. This means that when the frontend requests a PPTX
file, the backend cannot correctly inform the browser that it is a PPTX
file, resulting in the file being displayed incorrectly.
Type identification error.

### Type of change

-  Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…2527)

### What problem does this PR solve?

Fix zip extraction vulnerabilities:
   - Block symlink entries in zip files.
   - Reject encrypted zip entries.
   - Prevent absolute path attacks (including Windows paths).
   - Block path traversal attempts (../).
   - Stop zip slip exploits (directory escape).
   - Use streaming for memory-safe file handling.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
…12574)

### What problem does this PR solve?

Fix image not displaying thumbnails when using pipeline.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
…guration infiniflow#11796 (infiniflow#12579)

### What problem does this PR solve?

Feat: Exported Agent JSON Should Include Conversation Variables
Configuration infiniflow#11796

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
…w#12516)

### What problem does this PR solve?

Add uv-aarch64-unknown-linux-gnu.tar.gz to support building ARM64 Docker
images.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?
Previously, we added support for previewing PPT and PPTX files in the
backend. Now, we are adding it to the frontend, so when the slides in
the chat interface are referenced, they will no longer be blank.
### Type of change

- Bug Fix (non-breaking change which fixes an issue)
…​via search. (infiniflow#12585)

### What problem does this PR solve?

Feat: The MetadataFilterConditions component supports adding values
​​via search.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
…low#12588)

### What problem does this PR solve?

When there are multiple users, parsing a document for a new user can
trigger the reuse of column objects, leading to the error
`sqlalchemy.exc.ArgumentError: Column object 'id' already assigned to
Table xxx`.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?

Wrong input trace in Category component

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?

Updates pre-existing HTTP API and SDK tests to align with current
backend behavior (validation errors, 404s, and schema defaults). This
ensures p3 regression coverage is accurate without changing production
code.

### Type of change

- [x] Other (please describe): align p3 HTTP/SDK tests with current
backend behavior

---------

Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?


### Type of change


- [x] Documentation Update
### What problem does this PR solve?

This PR adds missing HTTP API test coverage for dataset
graph/GraphRAG/RAPTOR tasks, metadata summary, chat completions, agent
sessions/completions, and related questions. It also introduces minimal
HTTP test helpers to exercise these endpoints consistently with the
existing suite.

### Type of change

- [x]  Other (please describe): Test coverage (HTTP API tests)

---------

Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?

This PR eliminates unnecessary debug print statements that were left in
hot paths of the codebase.

### Type of change

- [x] Refactoring
### What problem does this PR solve?

Fix: Unable to copy category node. infiniflow#12607

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?

Add CN regions for AWS.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?

This PR adds a dedicated HTTP benchmark CLI for RAGFlow chat and
retrieval endpoints so we can measure latency/QPS.

### Type of change

- [x] Documentation Update
- [x] Other (please describe): Adds a CLI benchmarking tool for
chat/retrieval latency/QPS

---------

Co-authored-by: Liu An <asiro@qq.com>
### Issue

When using Qwen3 models (`qwen3-32b`, `qwen3-max`) through the
Tongyi-Qianwen provider for non-streaming calls (e.g., knowledge graph
generation), the API fails with:

Closes infiniflow#12424

```
parameter.enable_thinking must be set to false for non-streaming calls
```

### Root Cause

In `LiteLLMBase.async_chat()`, the `extra_body={"enable_thinking":
False}` was set in `kwargs` but never forwarded to
`_construct_completion_args()`.

### What problem does this PR solve?

Pass merged kwargs to `_construct_completion_args()` using
`**{**gen_conf, **kwargs}` to safely handle potential duplicate
parameters.

### Changes

- `rag/llm/chat_model.py`: Forward kwargs containing `extra_body` to
`_construct_completion_args()` in `async_chat()`


_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Contribution by Gittensor, see my contribution statistics at
https://gittensor.io/miners/details?githubId=42954461
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
) (infiniflow#12611)

### What problem does this PR solve?

Fixes infiniflow#12604 - DOCX files containing hyperlinks to internal bookmarks
(e.g., `#_文档目录`) cause a `KeyError` during parsing:

```
KeyError: "There is no item named 'word/#_文档目录' in the archive"
```

This happens because python-docx incorrectly tries to read internal
bookmark references as files from the ZIP archive. Internal bookmarks
are relationship targets starting with `#` and are not actual files.

This PR extends the existing `load_from_xml_v2` workaround (which
already handles `NULL` targets) to also skip relationship targets
starting with `#`.

Related upstream issue:
python-openxml/python-docx#902

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---
Contribution by Gittensor, see my contribution statistics at
https://gittensor.io/miners/details?githubId=94194147
…nfiniflow#12628)

### What problem does this PR solve?

Fix: Fix the styles of the multi-select component and the filter pop-up.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
…12626)

## Description

Fixes connection error handling when langfuse service is unavailable.
The application now gracefully handles connection failures instead of
crashing.

## Changes

- Wrapped `langfuse.auth_check()` calls in try-except blocks in:
  - `api/db/services/dialog_service.py`
  - `api/db/services/tenant_llm_service.py`

## Problem

When langfuse service is unavailable or connection is refused,
`langfuse.auth_check()` throws `httpx.ConnectError: [Errno 111]
Connection refused`, causing the application to crash during document
parsing or dialog operations.

## Solution

Added try-except blocks around `langfuse.auth_check()` calls to catch
connection errors and gracefully skip langfuse tracing instead of
crashing. The application continues functioning normally even when
langfuse is unavailable.

## Related Issue

Fixes infiniflow#12621

---

Contribution by Gittensor, see my contribution statistics at
https://gittensor.io/miners/details?githubId=158349177
### Type of change

- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?

Feat: Hash doc id to avoid duplicate name. 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?

Fixes infiniflow#12570 - The slicing method dropdown was empty when deploying
RAGFlow v0.23.1 from source code.

The issue occurred because `parser_ids` from the tenant info was empty
or undefined, causing `useSelectParserList` to return an empty array.
This PR adds a fallback to a default parser list when `parser_ids` is
empty, ensuring the dropdown always has options.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---
Contribution by Gittensor, see my contribution statistics at
https://gittensor.io/miners/details?githubId=94194147
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
…nfiniflow#12633)

### What problem does this PR solve?

Fix regex pattern validation in split_with_pattern (infiniflow#12605)

- Add try-except block to validate user-provided regex patterns before
use
- Gracefully fallback to single chunk when invalid regex is provided
- Prevent server crash during DOCX parsing with malformed delimiters

## Problem

Parsing DOCX files with custom regex delimiters crashes with `re.error:
nothing to repeat at position 9` when users provide invalid regex
patterns.

Closes infiniflow#12605 

## Solution

Validate and compile regex pattern before use. On invalid pattern, log
warning and return content as single chunk instead of crashing.

## Changes

- `rag/nlp/__init__.py`: Add regex validation in `split_with_pattern()`
function

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Contribution by Gittensor, see my contribution statistics at
https://gittensor.io/miners/details?githubId=42954461
MkDev11 and others added 28 commits February 6, 2026 14:44
…fter chunking (infiniflow#13010)

Fix RDBMS field separation after chunking by wrapping field names in
brackets (【field】: value). This ensures fields remain distinguishable
even when TxtParser strips newline delimiters during chunk merging.

Closes  infiniflow#13001

Co-authored-by: mkdev11 <YOUR_GITHUB_ID+MkDev11@users.noreply.github.com>
### What problem does this PR solve?

Fix: task cancel infiniflow#11745 
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?

Fix: Lazy loading adds a loading state to the page

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?

MCP host mode supports STREAMABLE-HTTP endpoint

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
## Description
  Upgrade dashscope package to support text-embedding-v4 model.

  ## Changes
  - Update dashscope version from 1.20.11 to 1.25.11 in pyproject.toml

  ## Reason
The text-embedding-v4 model requires dashscope >= 1.25.0 to function
properly. This upgrade ensures compatibility with the latest embedding
models.

Co-authored-by: Clint-chan <Clint-chan@users.noreply.github.com>
### What problem does this PR solve?

Fix: whyDidYouRender error

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
infiniflow#13049)

## Description

This PR fixes an issue where the input and variable configuration tables
in the Agent Canvas (specifically for **Begin**, **UserFillUp**, and
**Invoke** nodes) were truncated at 10 items.

**Root Cause:**
The tables utilized `@tanstack/react-table` with
`getPaginationRowModel()` enabled. Since the default page size is 10 and
no pagination UI controls were implemented, users could not access items
beyond the 10th row.

**Solution:**
Removed `getPaginationRowModel` from the table configurations. These
lists (inputs/variables) are typically short, so rendering all items in
a single scrollable view is the intended behavior.

* Modified `query-table.tsx`
* Modified `variable-table.tsx`


## How to verify

1. Create a **Begin**, **UserFillUp**, or **Invoke** node in the Agent
Canvas.
2. Add more than 10 input items or variables.
3. Verify that all items are visible in the list and not truncated at
the 10th item.

## What kind of change does this PR introduce?

* [x] Bugfix
### What problem does this PR solve?

improve ppt shape order logic

### Type of change

- [x] Refactoring
### Type of change

- [x] Documentation Update
### What problem does this PR solve?

Adjust highlight parsing, add row-count SQL override, tweak retrieval
thresholding, and update tests with engine-aware skips/utilities.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?

Fix: gemini model names infiniflow#13053


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
… parser (infiniflow#13068)

### What problem does this PR solve?

Fix parameter of calling self.dataStore.get() and warning info during
parser

infiniflow#13036

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
- add resizable support to shared textarea component
- enable vertical resizing for chat inputs in chat and share surfaces
- preserve autosize behavior while honoring manual resize height

## Test plan
- not run (not requested)

Fixes infiniflow#12803

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
…embedded pages. (infiniflow#13078)

### What problem does this PR solve?

Fix: Add authentication validation to the document API interface for
embedded pages and modify the document display styles.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?

Feat: Add Explore page

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?

Replaced TOC Enhance with Page Index.

### Type of change

- [x] Documentation Update
### What problem does this PR solve?

Feat: Translation page index.

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?

Fix: Bugs fixed
- metadata icon error
- search page's image not display

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?

Feat: Hide log button
### Type of change


- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?

Fix: Memory log style

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?

boost OpenAI-compatible reranker UX.

### Type of change

- [x] Refactoring
### What problem does this PR solve?

Updated UI tips.

### Type of change


- [x] Refactoring
When match_expressions contains coroutine objects (from GraphRAG's
Dealer.get_vector()), the code cannot identify this type because it only
checks for MatchTextExpr, MatchDenseExpr, or FusionExpr.

As a result:

score_func remains initialized as an empty string ""
This empty string is appended to the output list
The output list is passed to Infinity SDK's table_instance.output()
method
Infinity's SQL parser (via sqlglot) fails to parse the empty string,
throwing a ParseError
### What problem does this PR solve?

Judge table created with current infinity mapping before migrate db.
infiniflow#13089

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
…iflow#13095)

### What problem does this PR solve?

- Update version tags in README files (including translations) from
v0.23.1 to v0.24.0
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files

### Type of change

- [x] Documentation Update
@JasleenKaurSethi JasleenKaurSethi changed the title Upgrade 0.24.1 Upgrade RagFlow to latest v0.24.1 Mar 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.