Skip to content

MDEV-39998 Fix JSON_OVERLAPS asymmetry for nested array comparison#5239

Open
akshatnehra wants to merge 1 commit into
MariaDB:mainfrom
akshatnehra:MDEV-39998
Open

MDEV-39998 Fix JSON_OVERLAPS asymmetry for nested array comparison#5239
akshatnehra wants to merge 1 commit into
MariaDB:mainfrom
akshatnehra:MDEV-39998

Conversation

@akshatnehra

Copy link
Copy Markdown
Contributor

Description

This PR fixes MDEV-39998 where JSON_OVERLAPS() produces asymmetric results when comparing nested arrays of different lengths.

json_compare_arrays_in_order() only checked if the second array (value) reached its end after the element-by-element comparison loop. This allowed a longer first array to falsely match a shorter second array as a "prefix match", breaking the required symmetry of JSON_OVERLAPS.

The fix involves:

  1. Adding a check that both array engines have reached their end state (JST_ARRAY_END or JST_OBJ_END) after the comparison loop, ensuring arrays must have the same length to be considered equal in ordered comparison.

How can this PR be tested?

Run the MTR test:

build/mysql-test/mtr main.mdev_39998

Or manually:

SELECT JSON_OVERLAPS('[[1,2]]', '[[1]]') AS r1, JSON_OVERLAPS('[[1]]', '[[1,2]]') AS r2;
-- Both should return 0 (symmetric)

SELECT JSON_OVERLAPS('[[[1,2]]]', '[[[1]]]') AS r1, JSON_OVERLAPS('[[[1]]]', '[[[1,2]]]') AS r2;
-- Both should return 0 (deep nesting symmetric)

Results from my testing

  1. Before changes
MariaDB> SELECT JSON_OVERLAPS('[[1,2]]', '[[1]]') AS r1, JSON_OVERLAPS('[[1]]', '[[1,2]]') AS r2;
+----+----+
| r1 | r2 |
+----+----+
|  1 |  0 |
+----+----+

MariaDB> SELECT JSON_OVERLAPS('[[[1,2]]]', '[[[1]]]') AS r1, JSON_OVERLAPS('[[[1]]]', '[[[1,2]]]') AS r2;
+----+----+
| r1 | r2 |
+----+----+
|  1 |  0 |
+----+----+

The result is asymmetric — [1,2] is treated as "overlapping" [1] in one direction but not the other.

  1. After changes
MariaDB> SELECT JSON_OVERLAPS('[[1,2]]', '[[1]]') AS r1, JSON_OVERLAPS('[[1]]', '[[1,2]]') AS r2;
+----+----+
| r1 | r2 |
+----+----+
|  0 |  0 |
+----+----+

MariaDB> SELECT JSON_OVERLAPS('[[[1,2]]]', '[[[1]]]') AS r1, JSON_OVERLAPS('[[[1]]]', '[[[1,2]]]') AS r2;
+----+----+
| r1 | r2 |
+----+----+
|  0 |  0 |
+----+----+

Results are now symmetric. Arrays of different lengths are correctly not considered overlapping.

Basing the PR against the correct MariaDB version

  • This is a bug fix and the PR is based against the latest MariaDB development branch.

PR quality check

  • I have checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am fine with the reviewer making the changes themselves.

json_compare_arrays_in_order() only checked if the second array (value)
reached its end after the element-by-element comparison loop. This
allowed a longer first array to falsely match a shorter second array
as a prefix match, breaking the required symmetry of JSON_OVERLAPS.

Fix: require both arrays to be at end state (JST_ARRAY_END or
JST_OBJ_END) after the loop, ensuring arrays must have the same
length to be considered equal in ordered comparison.

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request fixes an asymmetry issue in JSON_OVERLAPS when comparing ordered nested arrays (MDEV-39998). The function json_compare_arrays_in_order in sql/item_jsonfunc.cc was updated to ensure that both compared JSON elements reach their respective array or object end, preventing a prefix of a nested array from falsely matching a shorter array. Additionally, comprehensive test cases and expected results have been added to verify the correct and symmetric behavior of JSON_OVERLAPS with nested arrays, objects, and mixed types. There are no review comments, so no further feedback is provided.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

@gkodinov gkodinov added the External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements. label Jun 16, 2026
@gkodinov gkodinov self-assigned this Jun 16, 2026

@gkodinov gkodinov left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution! This is a preliminary review.

Please consider rebasing the code to the lowest affected actively supported version (10.11 in this case).

There are some test optimization and extra hardening suggestions below.

SELECT JSON_OVERLAPS('[[1]]', '[[1,2]]') AS must_be_0_too;
must_be_0_too
0
SELECT JSON_OVERLAPS('[[1,2]]', '[[1,2]]') AS must_be_1;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please keep the tests to the minimum required to fully cover the new functionality. This one is not needed. There's a very extensive test in func_json.test already.

JSON_OVERLAPS('[[1,2]]', '[[1,2,3]]') AS r2;
r1 r2
0 0
SELECT JSON_OVERLAPS('[1,2,3]', '[3,4,5]') AS scalar_overlap;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not needed too.

SELECT JSON_OVERLAPS('[1,2,3]', '[3,4,5]') AS scalar_overlap;
scalar_overlap
1
SELECT JSON_OVERLAPS('[[]]', '[[]]') AS empty_match;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not needed. There's more that are not needed below. Please keep only the one(s) that exercise the new code in unique ways.

Comment thread sql/item_jsonfunc.cc
res= (value->state == JST_ARRAY_END || value->state == JST_OBJ_END ?
TRUE : FALSE);
res= (value->state == JST_ARRAY_END || value->state == JST_OBJ_END) &&
(js->state == JST_ARRAY_END || js->state == JST_OBJ_END);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of your tests are with arrays. But there's also the same issue with JST_OBJ_END. Can you maybe add few tests that exercise that? Is it at all possible that value->state is JST_OBJ_END and js->state is JST_ARRAY_END? I'd guess not. So how about comparing the two states?
And, finally, please keep TRUE : FALSE part. This is a bugfix, not a refactoring.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.

Development

Successfully merging this pull request may close these issues.

2 participants