feat(bitmap): support DIFF, DIFF1, ANDOR, ONE for BITOP command#3471
feat(bitmap): support DIFF, DIFF1, ANDOR, ONE for BITOP command#3471nkroker wants to merge 22 commits into
Conversation
Implements Redis 8.2+ bitwise operations: - DIFF: bits set in X but not in any Y - DIFF1: bits set in any Y but not in X - ANDOR: bits set in X and at least one Y - ONE: bits set in exactly one key Closes apache#3132 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3603de9 to
9526046
Compare
|
Hi @nkroker. Thanks for your contribution. Please read our guidelines for AI-assisted code contributions: https://kvrocks.apache.org/community/contributing#guidelines-for-ai-assisted-contributions If you used AI-assisted coding, please let me know which software and model you used. This will help us review the code more effectively. |
|
Hi @jihuayu, I have used the Claude Opus 4.6 to make changes to the codebase, but I have also reviewed and checked the code by performing a build locally. I'll mention this in the description as well. |
jihuayu
left a comment
There was a problem hiding this comment.
- DIFF, DIFF1, and ANDOR produce incorrect results when the first source key does not exist, which is inconsistent with Redis semantics.
BITFIELD y SET u8 #0 15
BITOP DIFF dest nosuch y
BITFIELD dest GET u8 #0
In this PR
BITOP return length 1
BITFIELD dest GET u8 #0 return 15
But in Redis:
BITOP return length 1
BITFIELD dest GET u8 #0 return 0
It is also in DIFF、DIFF1、ANDOR
- DIFF, DIFF1, and ANDOR lack minimum source key validation and do not align with Redis semantics.
BITFIELD x SET u8 #0 170
BITOP DIFF dest x
BITFIELD dest GET u8 #0
it will throw error in redis.
PS: You can explicitly require your AI tools to follow Redis semantics when coding to avoid these types of issues.
|
Also, you need to format the code (use clang-format) to ensure the GitHub Actions pass. |
|
Hi @nkroker , just checking in on this PR since it's been quiet for a bit. Are you still planning to work on this? Let me know if you run into any issues and feel free to ping me. |
Yes I'll update it actually I'm travelling that's why it's been open from few days |
Wow, thanks for your reply. Hope you have a great trip! |
Bump crate-ci/typos to v1.46.0 Changes: - Updated the dictionary - Ignore ssh ed25519 public keys - (action) Use a temp dir for caching
Bump oneTBB to v2023.0.0 (see: https://github.com/uxlfoundation/oneTBB/blob/master/RELEASE_NOTES.md) Key changes - Introduced ability to wait for a single task in a task_group instead of waiting for all tasks to finish. This increases reactivity and decreases latency in key user workloads. - Introduced task_arena core type selector to better support hybrid architectures with several core types - Added global control parameter to set default block time behavior on server HW - Added new API to create a set of NUMA bound task arenas, simplifying common patterns used to optimize for NUMA architectures. - Using a hwloc version other than 1.11, 2.0, or 2.5 may cause an undefined behavior on Windows OS - Significantly improved scalability for concurrent ordered containers on systems with many threads - Fixed ODR violations when public inline functions expose entities with internal linkage - Co-authored-by: 纪华裕 <jihuayu123@gmail.com>
Bump rocksdb to v11.1.1 (see: https://github.com/facebook/rocksdb/releases/tag/v11.1.1) **This PR need to merge** apache#3431 **Key changes** - Fix a bug in round-robin compaction that missed selecting input files that are needed to guarantee data correctness and cause crashing in debug builds or silent data corruption in release builds for Get() - Fix a memory accounting leak in IODispatcher - Add a new option open_files_async - Added BlockBasedTableOptions::kAuto index block search type that automatically selects between binary and interpolation search on a per-index-block basis - Add memtable_batch_lookup_optimization option to use batch lookup optimization for memtable MultiGet - Added new mutable DB option verify_manifest_content_on_close (default: false) - --------- Co-authored-by: PragmaTwice <twice@apache.org> Co-authored-by: Twice <twice.mliu@gmail.com>
…struction (apache#3477) Remove `ENV` variables from the `RUN` instruction and use the dedicated `ENV` directive instead. This change eliminates the following debconf warnings that appear during package updates: ` apache#8 5.404 debconf: unable to initialize frontend: Dialog apache#8 5.404 debconf: (TERM is not set, so the dialog frontend is not usable.) apache#8 5.404 debconf: falling back to frontend: Readline apache#8 5.404 debconf: unable to initialize frontend: Readline apache#8 5.404 debconf: (Can't locate Term/ReadLine.pm in @inc (you may need to install the Term::ReadLine module) (@inc contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.36.0 /usr/local/share/perl/5.36.0 /usr/lib/x86_64-linux-gnu/perl5/5.36 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl-base /usr/lib/x86_64-linux-gnu/perl/5.36 /usr/share/perl/5.36 /usr/local/lib/site_perl) at /usr/share/perl5/Debconf/FrontEnd/Readline.pm line 7.) apache#8 5.404 debconf: falling back to frontend: Teletype `
We need to disable jemalloc (and use system) at Arch (due to error with std::__throw_bad_alloc, its patch at this PR jemalloc/jemalloc#2900) --------- Co-authored-by: Twice <twice.mliu@gmail.com>
Bump fast_float to 8.2.5 (see: https://github.com/fastfloat/fast_float/releases/tag/v8.2.5) Changes - Replace std::min with ternary operators to avoid dependency
…3478) Change base docker image to debian:trixie-slim - current used bookworm are too old. So, I propose to use stable tagged Debian image - now it's a codename trixie.
Linked issue apache#3469 ## Problem `BITFIELD` commands with positional offset `#N` syntax failed in kvrocks. Redis supports `#N` as shorthand for `N * bit_width`, so `INCRBY u16 #0 1` means bit offset `0 * 16 = 0`, `apache#1` means `16`, etc. Example that worked in Redis but not kvrocks: BITFIELD mykey OVERFLOW SAT INCRBY u16 #0 1 ## Root Cause `GetBitOffsetFromArgument` called `ParseInt<uint32_t>` directly on offset string. `"#0"` fails integer parsing — `#` prefix was never handled. ## Fix In `CommandBitfield::Parse()`, after encoding is parsed (bit width known), detect `#` prefix and expand: `offset = N * encoding.Bits()`. Plain integer offsets unchanged. ## Test Added integration tests in `tests/gocase/unit/type/bitmap/bitmap_test.go` covering: - `SET`/`GET` with `#N` across multiple positions - `INCRBY` with `#N` - `OVERFLOW SAT INCRBY` with `#N` (original failing case) - `BITFIELD_RO GET` with `#N` --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Vikram Alagh <sixthkrum@gmail.com> Co-authored-by: 纪华裕 <jihuayu123@gmail.com> Co-authored-by: Aleks Lozovyuk <aleks.raiden@gmail.com>
## Summary This PR adds support for the `BYTE`/`BIT` unit option to the `BITPOS` command, aligning it with the Redis 7.0 specification. The previous implementation only checked for `BIT` at a fixed argument position (`args[5]`) and ignored `BYTE` entirely. This PR refactors the parser to correctly handle the full Redis 7.0 syntax: ``` BITPOS key bit [start [end [BYTE | BIT]]] ``` The `BYTE|BIT` unit is strictly nested: it can only appear after both `start` and `end` are provided, matching Redis behavior exactly. ## Changes ### 1. Command Parser Refactor (`src/commands/cmd_bit.cc`) - Refactored `CommandBitPos::Parse` to cleanly separate argument parsing by count. - `args[4]` (the 5th argument) is always parsed as the `end` integer — never as a unit keyword. This matches Redis, which rejects `BITPOS key 1 0 BIT` with a "not an integer" error. - `args[5]` (the 6th argument) is checked for `BIT`/`BYTE` keyword, with `errInvalidSyntax` on anything else. - Added explicit `BYTE` recognition (previously only `BIT` was partially handled). - Moved `bit` argument validation to the top for clarity. - Replaced verbose `ParseInt` + manual error check with `GET_OR_RET` macro for consistency. ### 2. Storage Layer — No Changes Needed - The `CHECK(stop_given)` assertion in `Bitmap::BitPos` is preserved as a defensive guard. Since the parser guarantees `stop_given=true` whenever `is_bit_index=true` (unit keyword requires `end` to be present), this assertion is correct and protects against future regressions. ## Compatibility | Command Form | Redis 7.0 | Kvrocks (after PR) | | --- | --- | --- | | `BITPOS key bit` | OK | OK | | `BITPOS key bit start` | OK | OK | | `BITPOS key bit start end` | OK | OK | | `BITPOS key bit start end BYTE` | OK (since 7.0) | **OK (new)** | | `BITPOS key bit start end BIT` | OK (since 7.0) | **OK (new)** | | `BITPOS key bit start BIT` | ERR not integer | ERR not integer | ## Verification - Argument parsing logic verified against Redis 7.0 syntax definition: `BITPOS key bit [start [end [BYTE | BIT]]]`. - `is_bit_index` is only set when `args.size() == 6`, guaranteeing `stop_given_ == true`, so the `CHECK(stop_given)` assertion in the storage layer is always satisfied. - `BITCOUNT` already has correct `BYTE/BIT` support and is not affected by this PR. Co-authored-by: gongna-au <gongna1@xiaomi.com> Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai> Co-authored-by: 纪华裕 <jihuayu123@gmail.com>
## What Reintroduce conditional options IFEQ / IFNE / IFDEQ / IFDNE for the SET command. ## Context This is a clean, reworked version of the previously closed apache#3452. It addresses all previous feedback, reverts the accidental deletion of `CommandDelEX`, and is fully rebased onto the latest `unstable` branch. All CI checks have been successfully run and verified on my personal fork before submission. ## Behavior - IFEQ / IFNE: case-sensitive value comparison - IFDEQ / IFDNE: case-insensitive digest comparison - Return `WRONGTYPE` for non-string keys - Mutually exclusive with NX/XX - Fix: `SET key value NX GET` now correctly returns `WRONGTYPE` for non-string keys (matching Redis 8.x behavior) ## Testing - C++ unit test: Kept only the IFDEQ empty-string boundary coverage - Go integration tests: Syntax, behavior, edge cases (uppercase & malformed digests), and regression coverage ## AI-assisted Contribution Disclosure AI was used for code pattern suggestions, test scaffolding, and debugging assistance. Core logic, bug fixes, validation, and final implementation were written and verified manually. Co-authored-by: hulk <hulk.website@gmail.com> Co-authored-by: jihuayu <jihuayu123@gmail.com>
Bump golangci-lint to 2.12.1 (changelog: https://golangci-lint.run/docs/product/changelog/#v2121) Bugfix and update a lot of linters
Mark PSYNC, _FETCH_META, and _FETCH_FILE as admin-only commands so namespace users cannot invoke replication transfer internals. Move the regression coverage into the integration replication suite, where the command behavior is exercised against a running server. Assisted-by: Codex/GPT 5.5 xhigh
Replication fullsync uses the peer's file names to fetch checkpoint files and materialize them in the local sync checkpoint directory. Previously, those names were joined directly with local directories, so traversal components could escape the intended directory during existence checks, temporary file creation, rename, cleanup, or master-side file open. Assisted-by: Codex/GPT 5.5 xhigh
I propose updating the minimum required Go version to 1.25. We are not moving to 1.26 yet because our CI still builds on OpenSUSE Leap 15, where Go 1.25 (1.25.8) is the version available in the base repositories. Go 1.26 is currently only present in the devel repository.
`ZDIFFSTORE` writes results to a destination key, but it was previously registered as `read-only slow`. I fix it. I used codex gpt-5.5 to find and fix it.
…ion commands (apache#3472) Proposal apache#3432. Tracking issue apache#3436. Add hash field expiration support based on the new persistent-field metadata model. This PR: - Tracks persistent hash fields in metadata for field-expiration encoding. - Adds `HEXPIRE` and `HPERSIST`. - Updates existing hash read/write commands to handle expired fields correctly. - Filters expired fields from hash read commands without mutating metadata. - Adds C++ and Go coverage for persistent, TTL, expired, and compaction-ghost states. Assisted-by: Codex/GPT 5.5 xhigh
`encoded_mode == 0` should not be valid here since when encoding mode is legacy this field should be missing.
…e#3490) These abstraction is for hash subkey scanning, but now we don't use it anymore. So better to remove them.
Upgrade LuaJIT to a build that disables loading binary bytecode chunks, mitigating a DoS where crafted bytecode could crash the server. Add a regression test that confirms `loadstring` on malformed bytecode is refused with "attempt to load chunk with wrong mode" and the server keeps serving requests.
fix apache#3493 Lua's error return poses an injection risk; see issues for details. I used codex gpt-5.5 to fix it.
…d key count validation - DIFF/DIFF1/ANDOR now correctly treat missing X as zero (Redis semantics) - DIFF/DIFF1/ANDOR now require at least two source keys - Fix clang-format violations - Add edge case tests documenting Redis semantics for missing keys Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
7171f79 to
d6a53d2
Compare
|
It looks like there may have been an issue while updating the code. Do you need any help? |
Summary
BITOPoperations:DIFF,DIFF1,ANDOR,ONEDIFF: bits set in X but not in any Y key (X & ~Y1 & ~Y2 & ...)DIFF1: bits set in any Y key but not in X ((Y1 | Y2 | ...) & ~X)ANDOR: bits set in X and in at least one Y key (X & (Y1 | Y2 | ...))ONE: bits set in exactly one key across all inputsChanges
src/types/redis_bitmap.h— 4 newBitOpFlagsenum valuessrc/types/redis_bitmap.cc— byte-loop logic for new ops + excluded from 64-bit fast pathsrc/commands/cmd_bit.cc— parsediff/diff1/andor/oneop name stringstests/gocase/unit/type/bitmap/bitmap_test.go— 9 targeted tests + fuzzing for all 4 opsTest plan
Ai Tools Used
Closes #3132