Skip to content

[fix](uniform function) fix constant argument handling and use ColumnView #63076

Open
Mryange wants to merge 1 commit intoapache:masterfrom
Mryange:fix-uniform
Open

[fix](uniform function) fix constant argument handling and use ColumnView #63076
Mryange wants to merge 1 commit intoapache:masterfrom
Mryange:fix-uniform

Conversation

@Mryange
Copy link
Copy Markdown
Contributor

@Mryange Mryange commented May 8, 2026

What problem does this PR solve?

Issue Number: N/A

Problem Summary:

The uniform function takes three arguments: min, max, and seed. Only the first two (min, max) are truly "always constant" — the seed column should be treated as a
regular column, not a constant. Without overriding get_arguments_that_are_always_constant(), when a user passes a constant value as the third argument (seed), the
default framework logic treats it as a constant column, leading to incorrect results.

Root cause: the base class default get_arguments_that_are_always_constant() does not distinguish between the seed argument and the min/max arguments, so a constant
seed would be folded by the constant-handling path rather than being treated as a per-row value.

Fix:

  • Override get_arguments_that_are_always_constant() to return {0, 1}, explicitly marking only min and max as always-constant arguments.
  • Refactor seed column access to use ColumnView<TYPE_BIGINT> for safer and more idiomatic typed column iteration.

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Mryange
Copy link
Copy Markdown
Contributor Author

Mryange commented May 8, 2026

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed PR #63076 against the Doris code-review checklist. I found one blocking correctness issue in the BE scalar function constant-argument path.

Critical checkpoints:

  • Goal/test: The PR tries to fix uniform constant argument handling, but the all-constant execution path can still execute with inconsistent temporary column sizes and is not proven by a regression test.
  • Scope: The code change is small, but it changes framework metadata with non-local effects in PreparedFunctionImpl::default_implementation_for_constant_arguments.
  • Concurrency/lifecycle/config/compatibility/persistence/data writes: Not applicable; this is a stateless BE scalar function change with no new config, storage format, transaction, or protocol changes.
  • Parallel paths: Nereids rejects literal seed arguments, but BE/classic execution and BE constant-fold execution still need to be safe because this BE implementation is the final execution boundary.
  • Conditional checks/error handling: Existing min/max validation is preserved, but the seed literal/non-constant invariant is still not enforced in BE.
  • Tests: Existing regression tests cover Nereids analysis errors, but there is no added test covering the BE all-constant framework path affected by this override.
  • Observability/performance: No new observability needed; performance impact is negligible relative to per-row RNG.

User focus: No additional user-provided review focus was specified.

@@ -157,6 +159,8 @@ class FunctionUniform : public IFunction {
return Impl::get_variadic_argument_types();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This override makes the default all-constant path keep min/max as ColumnConst but unwrap the seed argument to its nested data column. In PreparedFunctionImpl::default_implementation_for_constant_arguments, temporary_block.rows() is then taken from the first column, so for a row-producing query such as a classic-planner path that reaches BE with uniform(1, 100, 1), the temp block has min as ColumnConst(size = input_rows_count) and seed as ColumnInt64(size = 1). execute_impl loops input_rows_count rows and ColumnView treats the unwrapped seed as non-const, so value_at(1) and later read past the one-row seed column.

Nereids currently rejects a literal seed, but this BE function is also the execution boundary for non-Nereids/classic and BE constant-fold paths, and this PR specifically changes BE constant handling. Please either enforce the third argument must not be constant at BE execution/open time or disable/adjust the default constant-argument path so the seed remains a correctly sized ColumnConst rather than a one-row nested column.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This override makes the default all-constant path keep min/max as ColumnConst but unwrap the seed argument to its nested data column. In PreparedFunctionImpl::default_implementation_for_constant_arguments, temporary_block.rows() is then taken from the first column, so for a row-producing query such as a classic-planner path that reaches BE with uniform(1, 100, 1), the temp block has min as ColumnConst(size = input_rows_count) and seed as ColumnInt64(size = 1). execute_impl loops input_rows_count rows and ColumnView treats the unwrapped seed as non-const, so value_at(1) and later read past the one-row seed column.

Nereids currently rejects a literal seed, but this BE function is also the execution boundary for non-Nereids/classic and BE constant-fold paths, and this PR specifically changes BE constant handling. Please either enforce the third argument must not be constant at BE execution/open time or disable/adjust the default constant-argument path so the seed remains a correctly sized ColumnConst rather than a one-row nested column.

某个设计上的问题,目前的be的执行框架,在执行的时候,一个函数/表达式 返回会返回column const,或者非column const (有一些函数会根据runtime的value来直接返回一个column const 的column

@Mryange
Copy link
Copy Markdown
Contributor Author

Mryange commented May 8, 2026

run buildall

@Mryange
Copy link
Copy Markdown
Contributor Author

Mryange commented May 8, 2026

run beut

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 57.14% (4/7) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.76% (27767/37643)
Line Coverage 57.63% (300623/521658)
Region Coverage 54.85% (250522/456715)
Branch Coverage 56.39% (108384/192189)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants