[C++] Wrong and inefficient expression execution for [if/else, case/when ... etc] expressions #41094

@ZhangHuiGui

Description

Describe the bug, including details regarding any error messages, version, and platform.

A bug was found here. When we execute an expression such as case when a != 0 then b / a, a divide-by-zero error is reported, even though the condition guards the division and the statement should logically execute successfully.
The error comes from the existing ExecuteScalarExpression logic, which evaluates every argument of a call over the full input before the call itself is applied, as shown in the following code:

for (size_t i = 0; i < arguments.size(); ++i) {
  ARROW_ASSIGN_OR_RAISE(
      arguments[i], ExecuteScalarExpression(call->arguments[i], input, exec_context));
  if (arguments[i].is_array()) {
    all_scalar = false;
  }
}

For the statement above, arguments[0], which corresponds to the condition a != 0, is executed first, and then arguments[1], which corresponds to b / a, is executed over the whole input. Logically, b / a should not be evaluated on every row; it should be evaluated only on the rows where a != 0 holds.
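To make the failure mode concrete, here is a minimal sketch that does not use Arrow at all. It only mimics the eager argument evaluation on plain std::vector columns. The names CheckedDivide and EagerCaseWhen are hypothetical, and throwing std::domain_error is an assumption standing in for the error the real division kernel reports.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a division kernel that rejects a zero divisor.
int64_t CheckedDivide(int64_t lhs, int64_t rhs) {
  if (rhs == 0) throw std::domain_error("divide by zero");
  return lhs / rhs;
}

// Eager evaluation: both arguments are fully computed before the
// case/when logic looks at the condition.
std::vector<std::optional<int64_t>> EagerCaseWhen(const std::vector<int64_t>& a,
                                                  const std::vector<int64_t>& b) {
  assert(a.size() == b.size());
  const std::size_t n = a.size();

  // "arguments[0]": the condition a != 0, evaluated for every row.
  std::vector<bool> cond(n);
  for (std::size_t i = 0; i < n; ++i) cond[i] = (a[i] != 0);

  // "arguments[1]": b / a, also evaluated for every row, including rows
  // where the condition is false. This is where a zero in `a` fails.
  std::vector<int64_t> quotient(n);
  for (std::size_t i = 0; i < n; ++i) quotient[i] = CheckedDivide(b[i], a[i]);

  // The condition is applied only now, after the division already ran.
  std::vector<std::optional<int64_t>> out(n);
  for (std::size_t i = 0; i < n; ++i) {
    if (cond[i]) out[i] = quotient[i];
  }
  return out;
}
```

With a = {2, 0} and b = {4, 1}, this sketch throws inside the second loop even though the row with a == 0 would never be selected by the condition.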

Because if-else related expressions have a special execution order, should we extend ExecuteScalarIfElseExpression to handle such expressions separately? The execution order is as follows:

  1. First execute the condition corresponding to the if statement
  2. Filter the corresponding data based on the execution result of the if statement
  3. Perform the remaining operations on the filtered data only
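The three steps above can be sketched as follows. This is an illustration on plain std::vector columns under stated assumptions, not Arrow code: SelectiveCaseWhen is a hypothetical name, the expression is fixed to case when a != 0 then b / a, and rows that fail the condition are represented as std::nullopt on the assumption that a case/when without an else branch yields null.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical selection-aware evaluation of: case when a != 0 then b / a
std::vector<std::optional<int64_t>> SelectiveCaseWhen(const std::vector<int64_t>& a,
                                                      const std::vector<int64_t>& b) {
  assert(a.size() == b.size());
  const std::size_t n = a.size();

  // Step 1: execute the condition for every row.
  std::vector<bool> cond(n);
  for (std::size_t i = 0; i < n; ++i) cond[i] = (a[i] != 0);

  // Step 2: filter, keeping only the indices of rows that passed.
  std::vector<std::size_t> selection;
  for (std::size_t i = 0; i < n; ++i) {
    if (cond[i]) selection.push_back(i);
  }

  // Step 3: run the remaining operation (b / a) on the selected rows only.
  // Unselected rows stay std::nullopt, so a zero divisor is never touched.
  std::vector<std::optional<int64_t>> out(n);
  for (std::size_t row : selection) out[row] = b[row] / a[row];
  return out;
}
```

With a = {2, 0, 5} and b = {10, 7, 20}, the division runs for rows 0 and 2 only, and row 1 is left as null. Besides avoiding the error, this also skips work on rows the condition rejects.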

To do this, we would need to extend Expression to support if-else related logic and callbacks.

laotan332 qinpengxiang@outlook.com

Component(s)

C++
