[BUG] cudf::left_anti_join fails with a signal error (SIGABRT) instead of throwing an exception when there is an OOM condition #16059
Description
Describe the bug
When an out-of-memory (OOM) condition occurs, cudf::left_anti_join fails with a signal error (SIGABRT) instead of throwing an appropriate exception (std::bad_alloc).
Steps/Code to reproduce bug
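#include <cudf/column/column.hpp>
#include <cudf/filling.hpp>
#include <cudf/join.hpp>
#include <cudf/scalar/scalar.hpp>
#include <cudf/table/table.hpp>
#include <rmm/device_uvector.hpp>
#include <rmm/mr/device/per_device_resource.hpp>
#include <rmm/mr/device/pool_memory_resource.hpp>
#include <gtest/gtest.h>
#include <iostream>
#include <memory>
#include <vector>

TEST(CudfTest, LeftAntiJoinOOM) {
  // Wrap the current resource in a pool with a 256-byte initial size and a 2560-byte maximum.
  rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource();
  auto pool_mr = std::make_shared<rmm::mr::pool_memory_resource<rmm::mr::device_memory_resource>>(mr, 256, 2560);
  rmm::mr::set_current_device_resource(pool_mr.get());

  // Build a single-column INT32 table holding `size` sequential values beginning at `start`.
  auto make_table = [](int32_t size, int32_t start) -> std::unique_ptr<cudf::table> {
    auto sequence_column = cudf::sequence(size, cudf::numeric_scalar<int32_t>(start));
    std::vector<std::unique_ptr<cudf::column>> columns;
    columns.push_back(std::move(sequence_column));
    return std::make_unique<cudf::table>(std::move(columns));
  };

  try {
    auto left = make_table(64, 0);
    auto right = make_table(128, 50);
    std::cerr << "left size: " << left->num_rows() << ", right size: " << right->num_rows() << "\n";
    std::unique_ptr<rmm::device_uvector<cudf::size_type>> left_indices =
      cudf::left_anti_join(left->view(), right->view());
    std::cerr << "done left_anti_join " << "\n";
  } catch (const std::exception& e) {
    std::cerr << "Caught exception: " << e.what() << "\n";
  }
}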
Running this test aborts with SIGABRT; the catch block never receives the std::bad_alloc exception:
left size: 64, right size: 128
terminate called after throwing an instance of 'rmm::out_of_memory'
what(): std::bad_alloc: out_of_memory: RMM failure at:/home/alexander/envs/theseus_dev/include/rmm/mr/device/pool_memory_resource.hpp:313: Maximum pool size exceeded
Aborted (core dumped)
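For context, "terminate called after throwing an instance of ..." is the message libstdc++ prints when std::terminate runs, and that can happen even inside a try block. One general C++ mechanism that produces this symptom is an exception escaping a function declared noexcept somewhere between the throw site and the catch. The sketch below illustrates only that mechanism. All function names in it are hypothetical, and it is not a diagnosis of where the cuDF call chain terminates.

```cpp
#include <new>

// Hypothetical helper standing in for a device allocation that may fail.
inline void allocate_or_throw(bool fail)
{
  if (fail) { throw std::bad_alloc{}; }
}

// No exception specification: a failure propagates to the caller's catch block.
inline void propagating_call(bool fail) { allocate_or_throw(fail); }

// Declared noexcept: if an exception tries to leave this function,
// the runtime calls std::terminate instead of unwinding to the caller.
inline void terminating_call(bool fail) noexcept { allocate_or_throw(fail); }

// Returns true when the caller's catch block actually receives the exception.
inline bool caught_by_caller(bool fail)
{
  try {
    propagating_call(fail);
    return false;
  } catch (const std::bad_alloc&) {
    return true;
  }
}
```

Calling terminating_call(true) from inside a try block would abort the process rather than reach the handler, which matches the shape of the output above. Thrown exceptions during stack unwinding and uncaught exceptions on another thread are other standard routes to std::terminate.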
Expected behavior
The function should throw a std::bad_alloc exception which can be caught and handled gracefully, instead of terminating the program with a signal error.
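The expected behavior can be sketched without a GPU. The snippet below is a minimal stand-in, assuming only what the error output shows: the thrown type reports itself as a std::bad_alloc. The names fake_out_of_memory, fake_pool_allocate, and run_and_report are hypothetical and are not part of RMM or cuDF.

```cpp
#include <cstddef>
#include <exception>
#include <new>
#include <string>
#include <utility>

// Hypothetical stand-in for an out-of-memory exception derived from std::bad_alloc.
class fake_out_of_memory : public std::bad_alloc {
 public:
  explicit fake_out_of_memory(std::string msg) : msg_{std::move(msg)} {}
  const char* what() const noexcept override { return msg_.c_str(); }

 private:
  std::string msg_;
};

// Hypothetical allocator with a hard upper bound, mimicking a capped pool.
inline void fake_pool_allocate(std::size_t bytes, std::size_t max_pool_bytes)
{
  if (bytes > max_pool_bytes) { throw fake_out_of_memory{"Maximum pool size exceeded"}; }
}

// Returns the what() text of the caught exception, or an empty string on success.
inline std::string run_and_report(std::size_t bytes, std::size_t max_pool_bytes)
{
  try {
    fake_pool_allocate(bytes, max_pool_bytes);
    return {};
  } catch (const std::exception& e) {
    // Reached because std::bad_alloc derives from std::exception.
    return e.what();
  }
}
```

When the exception propagates normally, the same catch (const std::exception&) handler used in the reproducer is enough to intercept it, so the caller can free memory, retry with a smaller batch, or report the failure.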
Environment details
Method of cuDF install: built from source
Version: v24.06.00 (release branch)
Additional context
After debugging the internal functions used by cudf::left_anti_join, I determined that the failure occurs in the cudf::detail::contains call.
Call site, cudf/cpp/src/join/semi_join.cu, line 70 in c83e5b3:
  auto const flagged = cudf::detail::contains(right_keys,
Declaration, cudf/cpp/include/cudf/detail/search.hpp, line 95 in c83e5b3:
  rmm::device_uvector<bool> contains(table_view const& haystack,