Standardize UDF coercion error messages#20070
Conversation
|
If we decide to go this route, there is still some followup work:
For example this is how array function errors look like: > select array_extract(1,2,3);
Error during planning: Failed to coerce function call 'array_element(Int64, Int64, Int64)'. You might need to add explicit type casts.
Candidate functions:
array_element(array, index)
|
| select union_extract(union_column, 'bool') from union_table; | ||
|
|
||
| query error DataFusion error: Error during planning: 'union_extract' does not support zero arguments | ||
| query error Failed to coerce function call 'union_extract\(\)' |
There was a problem hiding this comment.
the old message about not supporting zero arguments seems like it was nicer
| 1 | ||
|
|
||
| query error DataFusion error: Error during planning: Function 'strpos' requires TypeSignatureClass::Native\(LogicalType\(Native\(String\), String\)\), but received Int64 \(DataType: Int64\) | ||
| query error Failed to coerce function call 'strpos\(Int64, Int64\)' |
There was a problem hiding this comment.
it feels like the previous messages were better even if they were more verbose, as they at least try to explain to a user what was expected. I do admit that the TypeSignatureClass::Native<... is not very user friendly, however
|
Yeah I was initially thinking of trying to cleanup the existing errors, specifically cases where they are really ugly (coercion API, also if the signature is a I can work towards keeping the existing specific error messages, but my main concern is how to deal with
However for cases where the input arguments match more than one of the datafusion/datafusion/functions/src/math/log.rs Lines 83 to 100 in 796c7d1
If we supply > select log(1, '');
Error during planning: Internal error: Function 'log' failed to match any signature, errors: Error during planning: Function 'log' expects 1 arguments but received 2,Error during planning: Function 'log' expects 1 arguments but received 2,Error during planning: Function 'log' requires TypeSignatureClass::Decimal, but received String (DataType: Utf8).,Error during planning: Function 'log' requires TypeSignatureClass::Float, but received String (DataType: Utf8)..
This issue was likely caused by a bug in DataFusion's code. Please help us to resolve this by filing a bug report in our issue tracker: https://github.com/apache/datafusion/issues No function matches the given name and argument types 'log(Int64, Utf8)'. You might need to add explicit type casts.
Candidate functions:
log(Coercion(TypeSignatureClass::Decimal))
log(Coercion(TypeSignatureClass::Float, implicit_coercion=ImplicitCoercion([Numeric], default_type=Float64))
log(Coercion(TypeSignatureClass::Float, implicit_coercion=ImplicitCoercion([Numeric], default_type=Float64), Coercion(TypeSignatureClass::Decimal))
log(Coercion(TypeSignatureClass::Float, implicit_coercion=ImplicitCoercion([Numeric], default_type=Float64), Coercion(TypeSignatureClass::Float, implicit_coercion=ImplicitCoercion([Numeric], default_type=Float64))
Maybe only for cases like this we erase the error message and only show candidate functions, like this PR does? |
Which issue does this PR close?
candidate functionssuggestions #15817 and Improve error messages when coercion fails #19004Rationale for this change
Current error message for calling functions with incorrect arguments isn't very user friendly. Some examples:
OneOfsignature, and we currently concatenate each error from each signature inside theOneOfresulting in a large error message that isn't very helpfulabs, which is niceCandidate functionswhich isn't very helpful (greatest(UserDefined)doesn't really help the user understand the available signatures since user defined is opaque)Contrast this with DuckDB:
Or Postgres:
It looks like they prefer to omit stating specific errors, and only give general advice; for DuckDB they list the available call types, for Postgres they omit this and leave it a simple error with only details on how it was called.
This PR looks into removing some of the word spam from error messages and tries to make them more consistent with each other.
What changes are included in this PR?
When a function call fails because no signatures match, the displayed error either:
coerce_typescall of the UDF (we omit candidate functions message)For example, user-defined:
For all other signatures:
Although it means we don't show a specific error anymore (see how
absno longer specifies it was because it received string instead of a numeric), I think this is an easier way to manage signature errors to be consistent for functions that have a single signature and those that areOneOftypes. It also aligns with DuckDB.Are these changes tested?
Yes.
Are there any user-facing changes?
Yes, error messages for failed function calls changes.