Add compression to replicator #6013

Draft
lacklacklack wants to merge 2 commits into apache:main from lacklacklack:replicator-compression-support

@lacklacklack

Adds configurable HTTP compression (gzip/deflate) to reduce bandwidth during replication.

janl commented May 29, 2026

Member

Can you please use our PR template?

nickva left a comment

Contributor

Nice work @lacklacklack!

A few suggestions / ideas as bullet points:

  1. We're trying to do a bit of both the request and response sides; let's keep it simpler at first and focus on just the request side (_bulk_docs and _revs_diff). Since we already have server-side handling for gzip, let's handle gzip only for the request side. Then we can cleanly test the feature in CI without extra proxies or having to add server-side sending of compressed responses here too.

  2. Let's skip deflate and use gzip only, but keep the compression algorithm configurable so that zstd can be added in the future (not in this PR). The reason to skip deflate is simplicity (our server handles gzip), and deflate is a bit of a mess according to https://en.wikipedia.org/wiki/HTTP_compression:

> Another problem found while deploying HTTP compression on large scale is due to the deflate encoding definition: while HTTP 1.1 defines the deflate encoding as data compressed with deflate (RFC 1951) inside a zlib formatted stream (RFC 1950), Microsoft server and client products historically implemented it as a "raw" deflated stream making its deployment unreliable. For this reason, some software, including the Apache HTTP Server, only implements gzip encoding.
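
To make the ambiguity in the quote concrete, here is a small sketch using Python's standard `zlib` module (Python is used only for illustration; the replicator itself is Erlang). The helper name `inflate_as_zlib` and the sample payload are made up for this example. It shows that a "raw" RFC 1951 stream is rejected by a decoder that expects the RFC 1950 zlib wrapper, whereas gzip is identifiable by its magic bytes.

```python
import gzip
import zlib

# Hypothetical sample payload, just to have something compressible.
payload = b'{"docs": []}' * 100

# RFC 1950: deflate data inside a zlib wrapper (what HTTP/1.1 means by "deflate").
zlib_stream = zlib.compress(payload)

# RFC 1951: a "raw" deflate stream with no header and no checksum.
# A negative wbits value tells zlib to omit the wrapper.
raw_compressor = zlib.compressobj(wbits=-15)
raw_stream = raw_compressor.compress(payload) + raw_compressor.flush()

# gzip container: always starts with the magic bytes 0x1f 0x8b.
gzip_stream = gzip.compress(payload)


def inflate_as_zlib(data):
    """Decode strictly as a zlib-wrapped stream; return None if that fails."""
    try:
        return zlib.decompress(data)  # default wbits expects a zlib header
    except zlib.error:
        return None


# A strict "deflate" receiver accepts the wrapped stream...
print(inflate_as_zlib(zlib_stream) == payload)
# ...but cannot decode the raw variant, even though both were called "deflate".
print(inflate_as_zlib(raw_stream) is None)
# The raw stream only decodes when the receiver is told there is no wrapper.
print(zlib.decompress(raw_stream, wbits=-15) == payload)
# gzip can be recognized from its first two bytes.
print(gzip_stream[:2] == b"\x1f\x8b")
```

A receiver cannot tell from `Content-Encoding: deflate` alone which of the two variants it was sent, which is the practical argument for supporting gzip only.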

  1. As it stands, the most important thing to compress (the _bulk_docs body) won't actually be compressed. The body there isn't an iolist or binary but {BodyFun, [prefix | Docs]}. That makes me think a better place for this is api_wrap rather than httpc.

  2. Do not set AcceptEncodings = config:get("replicator", "accept_encodings", "gzip, deflate, zstd") unless we can always handle and decompress those responses. If the server then sends us zstd data and we're on OTP 27, we won't be able to handle it and the request will fail. For this PR, let's just skip setting that altogether.

  3. Do not enable gzip compression by default. Since it is not a negotiated setting, if the replicator is talking to an older CouchDB or another server that doesn't implement gzip decompression, we'd break a customer's setup as soon as they upgrade.

  4. Don't forget to fill out the template like Jan suggested.
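
As a rough illustration of the points above about a body that is produced piece by piece rather than held as one binary, and about keeping compression off unless explicitly enabled, here is a minimal sketch. It is written in Python only to show the idea; the actual change would be Erlang in the replicator. The names `gzip_chunks` and `prepare_body`, and the `encoding` parameter, are hypothetical and do not come from the PR.

```python
import gzip
import zlib
from typing import Dict, Iterable, Iterator, Optional, Tuple


def gzip_chunks(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Compress a stream of body chunks into a single gzip stream."""
    # wbits=31 (16 + 15) selects the gzip container rather than the zlib one.
    compressor = zlib.compressobj(wbits=31)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer input and emit nothing yet
            yield out
    # flush() emits any buffered data plus the gzip trailer (CRC and length).
    yield compressor.flush()


def prepare_body(
    chunks: Iterable[bytes], encoding: Optional[str] = None
) -> Tuple[Dict[str, str], Iterator[bytes]]:
    """Return (extra_headers, body_iterator).

    encoding=None is the default and leaves the body untouched, so nothing
    changes for servers that cannot decompress requests.
    """
    if encoding is None:
        return {}, iter(chunks)
    if encoding == "gzip":
        return {"Content-Encoding": "gzip"}, gzip_chunks(chunks)
    # Other algorithms are intentionally rejected in this sketch.
    raise ValueError(f"unsupported request compression: {encoding}")


# Usage: a _bulk_docs-style body generated in pieces (made-up documents).
docs = [b'{"docs":[', b'{"_id":"a"}', b",", b'{"_id":"b"}', b"]}"]
headers, body = prepare_body(docs, "gzip")
compressed = b"".join(body)
print(headers)
print(gzip.decompress(compressed) == b"".join(docs))
```

One consequence worth keeping in mind: when the body is compressed as it is generated, the compressed length is not known up front, so the sender would presumably need chunked transfer encoding or would have to buffer the compressed output before sending.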
