
fix: async R2-backed export for large database dumps (fixes #59) #101

Open

bcornish1797 wants to merge 1 commit into outerbase:main from bcornish1797:fix/large-database-dumps

Conversation

bcornish1797 commented Mar 1, 2026

Summary

  • add streaming to the existing sync dump path to reduce peak memory pressure
  • add an async dump job flow for databases too large to export within the 30s Worker request timeout:
    • POST /export/dump kicks off a background job, returns { dumpId, statusUrl } immediately
    • GET /export/dump/:dumpId to poll status/progress
    • GET /export/dump/:dumpId/download streams the completed dump file from R2
  • dump progress is persisted in DO storage; large files are written via R2 multipart uploads
  • existing GET /export/dump (sync path) is unchanged — still works for small databases
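The three async endpoints above boil down to a small set of response shapes. A minimal sketch of those shapes in TypeScript (the type and function names here are illustrative assumptions, not the PR's actual code):

```typescript
// Hypothetical shapes for the async dump API described above.
type DumpStatus = "pending" | "running" | "done" | "failed";

interface DumpJob {
  dumpId: string;
  status: DumpStatus;
  // rows exported so far, persisted in DO storage between alarm cycles
  progress: number;
}

// POST /export/dump response: the job id plus where to poll.
function startResponse(dumpId: string): { dumpId: string; statusUrl: string } {
  return { dumpId, statusUrl: `/export/dump/${dumpId}` };
}

// GET /export/dump/:dumpId response: progress while running,
// a download URL once the dump is complete.
function statusResponse(job: DumpJob) {
  return job.status === "done"
    ? { status: job.status, downloadUrl: `/export/dump/${job.dumpId}/download` }
    : { status: job.status, progress: job.progress };
}

console.log(startResponse("abc123").statusUrl); // → /export/dump/abc123
```

A client would POST once, poll `statusUrl` until `status` is `done`, then stream the file from `downloadUrl`.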

Why

The current endpoint loads the whole database into memory and has to finish inside the 30-second Workers limit. That's fine for small databases but breaks for anything approaching the DO storage limit. The async path runs the export in chunks via DO alarms that chain themselves until done, then writes the result to R2 for streaming download.
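The self-chaining alarm loop can be sketched as follows; this is a simplified stand-in (the class, state shape, and `schedule` callback are assumptions, not the PR's real Durable Object), but it shows the control flow: each alarm invocation exports one batch, persists the cursor, and re-arms itself until no rows remain.

```typescript
// Minimal sketch of a self-chaining export loop. In a real Durable Object,
// schedule() would be ctx.storage.setAlarm(Date.now() + delay) and the
// cursor would be persisted via ctx.storage.put().
interface DumpState {
  cursor: number; // rows exported so far
  total: number;  // total rows to export
}

class DumpJobSketch {
  constructor(
    private state: DumpState,
    private batchSize: number,
    private schedule: () => void, // stand-in for setting the next alarm
  ) {}

  // One alarm invocation: export a batch, then either re-arm or finish.
  alarm(): "rescheduled" | "done" {
    this.state.cursor = Math.min(this.state.cursor + this.batchSize, this.state.total);
    if (this.state.cursor < this.state.total) {
      this.schedule();
      return "rescheduled";
    }
    return "done"; // a real implementation would finalize the R2 upload here
  }
}
```

Because each invocation does a bounded amount of work, no single alarm run approaches the per-invocation limit, regardless of database size.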

Test

pnpm test --run src/export/dump-async.test.ts

/claim #59

Fixes outerbase#59

- Add POST /export/dump for async dump initiation via DO alarm
- Add GET /export/dump/:id for status polling
- Add GET /export/dump/:id/download for streaming from R2
- Use DO alarms for multi-cycle processing of large databases
- Use R2 multipart upload API to avoid memory limits
- Keep existing GET /export/dump for small databases (backward compatible)
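For the multipart piece, R2 (like S3) requires every part except the last to be at least 5 MiB. A small sketch of how the dump might be split into part ranges; the function name and the 10 MiB default are illustrative assumptions:

```typescript
// R2 multipart uploads reject non-final parts smaller than 5 MiB.
const MIN_PART_BYTES = 5 * 1024 * 1024;

// Split totalBytes into [start, end) byte ranges of partSize bytes;
// only the final range may be shorter than partSize.
function partRanges(
  totalBytes: number,
  partSize = 10 * 1024 * 1024,
): Array<[number, number]> {
  if (partSize < MIN_PART_BYTES) throw new Error("part size below R2 minimum");
  const ranges: Array<[number, number]> = [];
  for (let start = 0; start < totalBytes; start += partSize) {
    ranges.push([start, Math.min(start + partSize, totalBytes)]);
  }
  return ranges;
}
```

Each range would then be buffered and passed to `uploadPart()` on the R2 multipart upload handle, so only one part's worth of data is ever held in memory at a time.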
