Gap
FFI_AggregateUDF in datafusion/ffi/src/udaf/mod.rs does not plumb several defaulted methods of AggregateUDFImpl. Producer overrides are silently lost on the consumer side.
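The failure mode can be reproduced without DataFusion. The sketch below is a minimal illustration, not the actual FFI code: ToyAggregate, ProducerUdaf, ForeignToy and cross_boundary are hypothetical names, and the default value chosen for the defaulted method is arbitrary rather than a claim about what AggregateUDFImpl returns. It only shows that a consumer-side wrapper which does not forward a defaulted method falls back to the trait default, whatever the producer overrode.

```rust
// Hypothetical stand-in for AggregateUDFImpl: one required method and one
// defaulted method. The default below is arbitrary, chosen for illustration.
trait ToyAggregate {
    fn name(&self) -> String;

    fn supports_null_handling_clause(&self) -> bool {
        false
    }
}

// Producer-side implementation that overrides the defaulted method.
struct ProducerUdaf;

impl ToyAggregate for ProducerUdaf {
    fn name(&self) -> String {
        "my_agg".to_string()
    }

    fn supports_null_handling_clause(&self) -> bool {
        true
    }
}

// Consumer-side view: only `name` was carried across the boundary.
struct ForeignToy {
    name: String,
}

impl ToyAggregate for ForeignToy {
    fn name(&self) -> String {
        self.name.clone()
    }
    // supports_null_handling_clause is not forwarded, so the trait default
    // applies and the producer's override is lost.
}

// Hypothetical boundary crossing that copies only the plumbed method.
fn cross_boundary(producer: &dyn ToyAggregate) -> ForeignToy {
    ForeignToy {
        name: producer.name(),
    }
}

fn main() {
    let producer = ProducerUdaf;
    let foreign = cross_boundary(&producer);
    println!(
        "producer: {}, consumer: {}",
        producer.supports_null_handling_clause(),
        foreign.supports_null_handling_clause()
    );
}
```

Nothing fails at compile time or at run time here; the two sides simply disagree, which is why the gap is easy to miss.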
Missing methods
display_name
schema_name
human_display
window_function_schema_name
window_function_display_name
simplify
simplify_expr_op_literal
reverse_udf / reverse_expr
is_descending
value_from_stats
default_value
supports_null_handling_clause
supports_within_group_clause
set_monotonicity
documentation
Why it matters
Severity: critical. value_from_stats enables statistics-driven shortcuts (e.g. MIN/MAX from precomputed stats); silently losing it forces full re-aggregation across the FFI boundary. default_value affects empty-group correctness. supports_null_handling_clause / supports_within_group_clause change the accepted SQL surface area. simplify / simplify_expr_op_literal / reverse_* are optimizer hooks. Without display_name / schema_name, SQL output naming is wrong.
Implementation notes
- Plumb each as a plain unsafe extern "C" fn; wrapper body calls the trait method on inner Arc<dyn AggregateUDFImpl> and dispatch handles override-or-default.
- Methods that ship Expr (simplify, simplify_expr_op_literal, reverse_expr) require the embedded FFI_LogicalExtensionCodec.
- Layout change → api change label, target main only, no back-port to branch-<major>.
- Add unit tests (local-bypass + mock_foreign_marker_id forced-foreign) and integration tests under datafusion/ffi/tests/ for any method shipping non-trivial FFI types.
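The first note above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the layout of FFI_AggregateUDF: ToyAggregate, FfiToyAggregate, PrivateData, the wrapper functions and query are hypothetical names, and only a single bool-returning method is plumbed. The point is that the wrapper calls the trait method on the inner Arc<dyn ...>, so ordinary dynamic dispatch picks the producer's override when one exists and the default otherwise.

```rust
use std::ffi::c_void;
use std::sync::Arc;

// Hypothetical stand-in for AggregateUDFImpl with one defaulted method.
trait ToyAggregate {
    fn supports_within_group_clause(&self) -> bool {
        false
    }
}

struct UsesDefault;
impl ToyAggregate for UsesDefault {}

struct Overrides;
impl ToyAggregate for Overrides {
    fn supports_within_group_clause(&self) -> bool {
        true
    }
}

// Hypothetical FFI-stable struct: function pointers plus opaque private data.
#[repr(C)]
struct FfiToyAggregate {
    supports_within_group_clause: unsafe extern "C" fn(udaf: &FfiToyAggregate) -> bool,
    release: unsafe extern "C" fn(udaf: &mut FfiToyAggregate),
    private_data: *mut c_void,
}

// Producer-side state hidden behind `private_data`.
struct PrivateData {
    inner: Arc<dyn ToyAggregate>,
}

// The wrapper just calls the trait method; dispatch on the trait object
// resolves to the override if present, else to the default.
unsafe extern "C" fn supports_within_group_clause_wrapper(udaf: &FfiToyAggregate) -> bool {
    let data = unsafe { &*(udaf.private_data as *const PrivateData) };
    data.inner.supports_within_group_clause()
}

// Frees the boxed private data exactly once.
unsafe extern "C" fn release_wrapper(udaf: &mut FfiToyAggregate) {
    if !udaf.private_data.is_null() {
        drop(unsafe { Box::from_raw(udaf.private_data as *mut PrivateData) });
        udaf.private_data = std::ptr::null_mut();
    }
}

impl From<Arc<dyn ToyAggregate>> for FfiToyAggregate {
    fn from(inner: Arc<dyn ToyAggregate>) -> Self {
        let private_data = Box::into_raw(Box::new(PrivateData { inner })) as *mut c_void;
        Self {
            supports_within_group_clause: supports_within_group_clause_wrapper,
            release: release_wrapper,
            private_data,
        }
    }
}

impl Drop for FfiToyAggregate {
    fn drop(&mut self) {
        unsafe { (self.release)(self) }
    }
}

// Consumer-side call through the function pointer.
fn query(udaf: &FfiToyAggregate) -> bool {
    unsafe { (udaf.supports_within_group_clause)(udaf) }
}

fn main() {
    let with_override: Arc<dyn ToyAggregate> = Arc::new(Overrides);
    let with_default: Arc<dyn ToyAggregate> = Arc::new(UsesDefault);
    println!("override: {}", query(&FfiToyAggregate::from(with_override)));
    println!("default: {}", query(&FfiToyAggregate::from(with_default)));
}
```

A method returning a plain bool needs no extra marshalling. Methods that ship an Expr would additionally need the codec mentioned in the second note, which this sketch does not attempt to model.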
Generated from an audit performed for PR #22327 (datafusion-ffi agent skill). If a PR addressing this finds any item to be a false positive (e.g., a method intentionally omitted for a documented reason), please also propose an update to the datafusion-ffi skill so future audits do not re-flag it.