
S3 objects deleted after successful upload (admin backend deletes source file post‑ingest) #224

@a-klos

Description
When running the local RAG setup (k3d + Tilt + MinIO), uploaded documents disappear from the
MinIO bucket immediately after ingestion finishes. The MinIO client (mc watch) reports
s3:ObjectRemoved:Delete events for the same key that was uploaded, even though the pod/cluster
was not restarted.

Observed behavior

  • Upload a document via admin UI or API.
  • Ingestion completes successfully.
  • The S3 object is deleted right after ingestion.

Evidence
mc watch --events delete local/documents reports:

[2026-02-02T13:33:15.666Z] s3:ObjectRemoved:Delete http://localhost:9000/documents/c6d3c60ee_file_dokument2.pdf

Root cause (code)
The admin backend uploads the file to S3, then after extraction it calls the document deleter to
“replace” old content. The deleter always deletes the S3 object, so the source file is removed
after ingestion.

Relevant code paths:

  • Upload + post‑processing:
    services/admin-backend/src/admin_backend/file_uploader.py
    (calls document_deleter.adelete_document(... remove_from_key_value_store=False))
  • Deleter always deletes S3 object:
    libs/admin-api-lib/src/admin_api_lib/impl/api_endpoints/default_document_deleter.py

Expected behavior
Ingestion should not delete the source document from S3 unless the user explicitly deletes it
(e.g., via admin UI or /delete_document).

Proposed fix
Introduce a remove_from_storage flag on DocumentDeleter and pass False from upload flows. Keep
True for explicit deletes.
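
A minimal sketch of how the proposed flag could behave, not the project's actual implementation. InMemoryStorage, the index and key_value_store dicts, and upload_and_ingest are hypothetical stand-ins invented for illustration; only adelete_document, remove_from_key_value_store, and the proposed remove_from_storage come from this issue, and the real signatures in admin-api-lib may differ.

```python
import asyncio


class InMemoryStorage:
    """Hypothetical stand-in for the S3/MinIO bucket."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}

    def upload(self, key: str, data: bytes) -> None:
        self.objects[key] = data

    def delete(self, key: str) -> None:
        self.objects.pop(key, None)


class DocumentDeleter:
    """Sketch of a deleter whose storage removal can be switched off."""

    def __init__(self, storage, index: dict, key_value_store: dict) -> None:
        self._storage = storage
        self._index = index  # stand-in for the extracted/indexed content
        self._kv = key_value_store  # stand-in for the document status store

    async def adelete_document(
        self,
        identification: str,
        remove_from_key_value_store: bool = True,
        remove_from_storage: bool = True,  # proposed flag (assumed default)
    ) -> None:
        # Old indexed content is always dropped so it can be replaced.
        self._index.pop(identification, None)
        if remove_from_key_value_store:
            self._kv.pop(identification, None)
        # The source file is only removed when the caller asks for it.
        if remove_from_storage:
            self._storage.delete(identification)


async def upload_and_ingest(deleter, storage, index, kv, key: str, data: bytes) -> None:
    """Hypothetical upload flow: store the file, replace old content, index."""
    storage.upload(key, data)
    kv[key] = "PROCESSING"
    # Replace old content, but keep the status entry and the uploaded file.
    await deleter.adelete_document(
        key,
        remove_from_key_value_store=False,
        remove_from_storage=False,
    )
    index[key] = [data.decode()]  # stand-in for extraction + indexing
    kv[key] = "READY"
```

With this shape, the upload flow leaves the object in the bucket, while an explicit delete (which keeps the default remove_from_storage=True) still removes it. Defaulting the flag to True is an assumption made here so that existing explicit-delete callers would not need to change.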

Metadata

Labels

bug (Something isn't working)
