Skip to content

feat(bitmap): support DIFF, DIFF1, ANDOR, ONE for BITOP command#3471

Open
nkroker wants to merge 22 commits into
apache:unstablefrom
nkroker:feat/bitop-diff-diff1-andor-one
Open

feat(bitmap): support DIFF, DIFF1, ANDOR, ONE for BITOP command#3471
nkroker wants to merge 22 commits into
apache:unstablefrom
nkroker:feat/bitop-diff-diff1-andor-one

Conversation

@nkroker
Copy link
Copy Markdown
Contributor

@nkroker nkroker commented May 1, 2026

Summary

  • Implements Redis 8.2+ BITOP operations: DIFF, DIFF1, ANDOR, ONE
  • DIFF: bits set in X but not in any Y key (X & ~Y1 & ~Y2 & ...)
  • DIFF1: bits set in any Y key but not in X ((Y1 | Y2 | ...) & ~X)
  • ANDOR: bits set in X and in at least one Y key (X & (Y1 | Y2 | ...))
  • ONE: bits set in exactly one key across all inputs

Changes

  • src/types/redis_bitmap.h — 4 new BitOpFlags enum values
  • src/types/redis_bitmap.cc — byte-loop logic for new ops + excluded from 64-bit fast path
  • src/commands/cmd_bit.cc — parse diff/diff1/andor/one op name strings
  • tests/gocase/unit/type/bitmap/bitmap_test.go — 9 targeted tests + fuzzing for all 4 ops

Test plan

  • Basic correctness tests for each op
  • Multi-key scenarios
  • Missing key treated as zero bytes
  • Fuzzing across random bitmaps with simulated reference implementation
  • All existing BITOP tests pass

Ai Tools Used

  • Claude Opus 4.6

Closes #3132

Implements Redis 8.2+ bitwise operations:
- DIFF: bits set in X but not in any Y
- DIFF1: bits set in any Y but not in X
- ANDOR: bits set in X and at least one Y
- ONE: bits set in exactly one key

Closes apache#3132

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@nkroker nkroker force-pushed the feat/bitop-diff-diff1-andor-one branch from 3603de9 to 9526046 Compare May 1, 2026 06:26
@jihuayu
Copy link
Copy Markdown
Member

jihuayu commented May 1, 2026

Hi @nkroker. Thanks for your contribution. Please read our guidelines for AI-assisted code contributions: https://kvrocks.apache.org/community/contributing#guidelines-for-ai-assisted-contributions

If you used AI-assisted coding, please let me know which software and model you used. This will help us review the code more effectively.

@nkroker
Copy link
Copy Markdown
Contributor Author

nkroker commented May 2, 2026

Hi @jihuayu, I have used the Claude Opus 4.6 to make changes to the codebase, but I have also reviewed and checked the code by performing a build locally. I'll mention this in the description as well.

Copy link
Copy Markdown
Member

@jihuayu jihuayu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. DIFF, DIFF1, and ANDOR produce incorrect results when the first source key does not exist, which is inconsistent with Redis semantics.

BITFIELD y SET u8 #0 15
BITOP DIFF dest nosuch y
BITFIELD dest GET u8 #0

In this PR
BITOP return length 1
BITFIELD dest GET u8 #0 return 15

But in Redis:

BITOP return length 1
BITFIELD dest GET u8 #0 return 0

It is also in DIFF、DIFF1、ANDOR

  1. DIFF, DIFF1, and ANDOR lack minimum source key validation and do not align with Redis semantics.

BITFIELD x SET u8 #0 170

BITOP DIFF dest x
BITFIELD dest GET u8 #0

it will throw error in redis.

PS: You can explicitly require your AI tools to follow Redis semantics when coding to avoid these types of issues.

@jihuayu
Copy link
Copy Markdown
Member

jihuayu commented May 2, 2026

Also, you need to format the code (use clang-format) to ensure the GitHub Actions pass.

@jihuayu
Copy link
Copy Markdown
Member

jihuayu commented May 22, 2026

Hi @nkroker , just checking in on this PR since it's been quiet for a bit. Are you still planning to work on this? Let me know if you run into any issues and feel free to ping me.

@nkroker
Copy link
Copy Markdown
Contributor Author

nkroker commented May 23, 2026

Hi @nkroker , just checking in on this PR since it's been quiet for a bit. Are you still planning to work on this? Let me know if you run into any issues and feel free to ping me.

Yes I'll update it actually I'm travelling that's why it's been open from few days

@jihuayu
Copy link
Copy Markdown
Member

jihuayu commented May 25, 2026

Yes I'll update it actually I'm travelling that's why it's been open from few days

Wow, thanks for your reply. Hope you have a great trip!

aleksraiden and others added 18 commits May 29, 2026 09:49
Bump crate-ci/typos to v1.46.0

Changes:

- Updated the dictionary
- Ignore ssh ed25519 public keys
- (action) Use a temp dir for caching
Bump oneTBB to v2023.0.0 (see:
https://github.com/uxlfoundation/oneTBB/blob/master/RELEASE_NOTES.md)

Key changes
- Introduced ability to wait for a single task in a task_group instead
of waiting for all tasks to finish. This increases reactivity and
decreases latency in key user workloads.
- Introduced task_arena core type selector to better support hybrid
architectures with several core types
- Added global control parameter to set default block time behavior on
server HW
- Added new API to create a set of NUMA bound task arenas, simplifying
common patterns used to optimize for NUMA architectures.
- Using a hwloc version other than 1.11, 2.0, or 2.5 may cause an
undefined behavior on Windows OS
- Significantly improved scalability for concurrent ordered containers
on systems with many threads
- Fixed ODR violations when public inline functions expose entities with
internal linkage
-

Co-authored-by: 纪华裕 <jihuayu123@gmail.com>
Bump rocksdb to v11.1.1 (see:
https://github.com/facebook/rocksdb/releases/tag/v11.1.1)

**This PR need to merge** apache#3431 

**Key changes**

- Fix a bug in round-robin compaction that missed selecting input files
that are needed to guarantee data correctness and cause crashing in
debug builds or silent data corruption in release builds for Get()
- Fix a memory accounting leak in IODispatcher
- Add a new option open_files_async
- Added BlockBasedTableOptions::kAuto index block search type that
automatically selects between binary and interpolation search on a
per-index-block basis
- Add memtable_batch_lookup_optimization option to use batch lookup
optimization for memtable MultiGet
- Added new mutable DB option verify_manifest_content_on_close (default:
false)
-

---------

Co-authored-by: PragmaTwice <twice@apache.org>
Co-authored-by: Twice <twice.mliu@gmail.com>
…struction (apache#3477)

Remove `ENV` variables from the `RUN` instruction and use the dedicated
`ENV` directive instead.

This change eliminates the following debconf warnings that appear during
package updates:

`
apache#8 5.404 debconf: unable to initialize frontend: Dialog
apache#8 5.404 debconf: (TERM is not set, so the dialog frontend is not
usable.)
apache#8 5.404 debconf: falling back to frontend: Readline
apache#8 5.404 debconf: unable to initialize frontend: Readline
apache#8 5.404 debconf: (Can't locate Term/ReadLine.pm in @inc (you may need
to install the Term::ReadLine module) (@inc contains: /etc/perl
/usr/local/lib/x86_64-linux-gnu/perl/5.36.0 /usr/local/share/perl/5.36.0
/usr/lib/x86_64-linux-gnu/perl5/5.36 /usr/share/perl5
/usr/lib/x86_64-linux-gnu/perl-base /usr/lib/x86_64-linux-gnu/perl/5.36
/usr/share/perl/5.36 /usr/local/lib/site_perl) at
/usr/share/perl5/Debconf/FrontEnd/Readline.pm line 7.)
apache#8 5.404 debconf: falling back to frontend: Teletype
`
We need to disable jemalloc (and use system) at Arch (due to error with
std::__throw_bad_alloc, its patch at this PR
jemalloc/jemalloc#2900)

---------

Co-authored-by: Twice <twice.mliu@gmail.com>
Bump fast_float to 8.2.5 (see:
https://github.com/fastfloat/fast_float/releases/tag/v8.2.5)

Changes
 - Replace std::min with ternary operators to avoid dependency
…3478)

Change base docker image to debian:trixie-slim - current used bookworm
are too old. So, I propose to use stable tagged Debian image - now it's
a codename trixie.
Linked issue apache#3469 

## Problem
`BITFIELD` commands with positional offset `#N` syntax failed in
kvrocks. Redis supports `#N` as shorthand for `N * bit_width`, so
`INCRBY u16 #0 1` means bit
offset `0 * 16 = 0`, `apache#1` means `16`, etc.
Example that worked in Redis but not kvrocks:
BITFIELD mykey OVERFLOW SAT INCRBY u16 #0 1
## Root Cause
`GetBitOffsetFromArgument` called `ParseInt<uint32_t>` directly on
offset string. `"#0"` fails integer parsing — `#` prefix was never
handled.
## Fix
In `CommandBitfield::Parse()`, after encoding is parsed (bit width
known), detect `#` prefix and expand: `offset = N * encoding.Bits()`.
Plain integer offsets
  unchanged.
## Test
Added integration tests in
`tests/gocase/unit/type/bitmap/bitmap_test.go` covering:
- `SET`/`GET` with `#N` across multiple positions
- `INCRBY` with `#N`
- `OVERFLOW SAT INCRBY` with `#N` (original failing case)
  - `BITFIELD_RO GET` with `#N`

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Vikram Alagh <sixthkrum@gmail.com>
Co-authored-by: 纪华裕 <jihuayu123@gmail.com>
Co-authored-by: Aleks Lozovyuk <aleks.raiden@gmail.com>
## Summary
This PR adds support for the `BYTE`/`BIT` unit option to the `BITPOS`
command, aligning it with the Redis 7.0 specification.

The previous implementation only checked for `BIT` at a fixed argument
position (`args[5]`) and ignored `BYTE` entirely. This PR refactors the
parser to correctly handle the full Redis 7.0 syntax:

```
BITPOS key bit [start [end [BYTE | BIT]]]
```

The `BYTE|BIT` unit is strictly nested: it can only appear after both
`start` and `end` are provided, matching Redis behavior exactly.

## Changes
### 1. Command Parser Refactor (`src/commands/cmd_bit.cc`)
- Refactored `CommandBitPos::Parse` to cleanly separate argument parsing
by count.
- `args[4]` (the 5th argument) is always parsed as the `end` integer —
never as a unit keyword. This matches Redis, which rejects `BITPOS key 1
0 BIT` with a "not an integer" error.
- `args[5]` (the 6th argument) is checked for `BIT`/`BYTE` keyword, with
`errInvalidSyntax` on anything else.
- Added explicit `BYTE` recognition (previously only `BIT` was partially
handled).
- Moved `bit` argument validation to the top for clarity.
- Replaced verbose `ParseInt` + manual error check with `GET_OR_RET`
macro for consistency.

### 2. Storage Layer — No Changes Needed
- The `CHECK(stop_given)` assertion in `Bitmap::BitPos` is preserved as
a defensive guard. Since the parser guarantees `stop_given=true`
whenever `is_bit_index=true` (unit keyword requires `end` to be
present), this assertion is correct and protects against future
regressions.

## Compatibility

| Command Form | Redis 7.0 | Kvrocks (after PR) |
| --- | --- | --- |
| `BITPOS key bit` | OK | OK |
| `BITPOS key bit start` | OK | OK |
| `BITPOS key bit start end` | OK | OK |
| `BITPOS key bit start end BYTE` | OK (since 7.0) | **OK (new)** |
| `BITPOS key bit start end BIT` | OK (since 7.0) | **OK (new)** |
| `BITPOS key bit start BIT` | ERR not integer | ERR not integer |

## Verification
- Argument parsing logic verified against Redis 7.0 syntax definition:
`BITPOS key bit [start [end [BYTE | BIT]]]`.
- `is_bit_index` is only set when `args.size() == 6`, guaranteeing
`stop_given_ == true`, so the `CHECK(stop_given)` assertion in the
storage layer is always satisfied.
- `BITCOUNT` already has correct `BYTE/BIT` support and is not affected
by this PR.

Co-authored-by: gongna-au <gongna1@xiaomi.com>
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Co-authored-by: 纪华裕 <jihuayu123@gmail.com>
## What
Reintroduce conditional options IFEQ / IFNE / IFDEQ / IFDNE for the SET
command.

## Context
This is a clean, reworked version of the previously closed apache#3452.  
It addresses all previous feedback, reverts the accidental deletion of
`CommandDelEX`, and is fully rebased onto the latest `unstable` branch.

All CI checks have been successfully run and verified on my personal
fork before submission.

## Behavior
- IFEQ / IFNE: case-sensitive value comparison  
- IFDEQ / IFDNE: case-insensitive digest comparison  
- Return `WRONGTYPE` for non-string keys  
- Mutually exclusive with NX/XX  
- Fix: `SET key value NX GET` now correctly returns `WRONGTYPE` for
non-string keys (matching Redis 8.x behavior)

## Testing
- C++ unit test: Kept only the IFDEQ empty-string boundary coverage  
- Go integration tests: Syntax, behavior, edge cases (uppercase &
malformed digests), and regression coverage

## AI-assisted Contribution Disclosure
AI was used for code pattern suggestions, test scaffolding, and
debugging assistance.
Core logic, bug fixes, validation, and final implementation were written
and verified manually.

Co-authored-by: hulk <hulk.website@gmail.com>
Co-authored-by: jihuayu <jihuayu123@gmail.com>
Bump golangci-lint to 2.12.1 (changelog:
https://golangci-lint.run/docs/product/changelog/#v2121)

Bugfix and update a lot of linters
Mark PSYNC, _FETCH_META, and _FETCH_FILE as admin-only commands so
namespace users cannot invoke replication transfer internals.

Move the regression coverage into the integration replication suite,
where the command behavior is exercised against a running server.

Assisted-by: Codex/GPT 5.5 xhigh
Replication fullsync uses the peer's file names to fetch checkpoint
files
and materialize them in the local sync checkpoint directory. Previously,
those names were joined directly with local directories,
so traversal components could escape the intended directory during
existence checks,
temporary file creation, rename, cleanup, or master-side file open.

Assisted-by: Codex/GPT 5.5 xhigh
I propose updating the minimum required Go version to 1.25.

We are not moving to 1.26 yet because our CI still builds on OpenSUSE
Leap 15, where Go 1.25 (1.25.8) is the version available in the base
repositories. Go 1.26 is currently only present in the devel repository.
`ZDIFFSTORE` writes results to a destination key, but it was previously
registered as `read-only slow`. I fix it.


I used codex gpt-5.5 to find and fix it.
…ion commands (apache#3472)

Proposal apache#3432. Tracking issue apache#3436.  

Add hash field expiration support based on the new persistent-field
metadata model.

  This PR:
- Tracks persistent hash fields in metadata for field-expiration
encoding.
  - Adds `HEXPIRE` and `HPERSIST`.
- Updates existing hash read/write commands to handle expired fields
correctly.
- Filters expired fields from hash read commands without mutating
metadata.
- Adds C++ and Go coverage for persistent, TTL, expired, and
compaction-ghost states.

Assisted-by: Codex/GPT 5.5 xhigh
`encoded_mode == 0` should not be valid here since when encoding mode is
legacy this field should be missing.
…e#3490)

These abstraction is for hash subkey scanning, but now we don't use it
anymore. So better to remove them.
git-hulk and others added 3 commits May 29, 2026 09:49
Upgrade LuaJIT to a build that disables loading binary bytecode chunks,
mitigating a DoS where crafted bytecode could crash the server. Add a
regression test that confirms `loadstring` on malformed bytecode is
refused with "attempt to load chunk with wrong mode" and the server
keeps serving requests.
fix apache#3493


Lua's error return poses an injection risk; see issues for details.

I used codex gpt-5.5 to fix it.
…d key count validation

- DIFF/DIFF1/ANDOR now correctly treat missing X as zero (Redis semantics)
- DIFF/DIFF1/ANDOR now require at least two source keys
- Fix clang-format violations
- Add edge case tests documenting Redis semantics for missing keys

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@nkroker nkroker force-pushed the feat/bitop-diff-diff1-andor-one branch from 7171f79 to d6a53d2 Compare May 29, 2026 04:43
@jihuayu
Copy link
Copy Markdown
Member

jihuayu commented May 29, 2026

It looks like there may have been an issue while updating the code. Do you need any help?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support DIFF, DIFF1, ANDOR, and ONE for command BITOP

7 participants