Skip to content

List element Equality comparator#10289

Merged
rapids-bot[bot] merged 129 commits intorapidsai:branch-22.06from
devavret:list-row-eq
Apr 13, 2022
Merged

List element Equality comparator#10289
rapids-bot[bot] merged 129 commits intorapidsai:branch-22.06from
devavret:list-row-eq

Conversation

@devavret
Copy link
Copy Markdown
Contributor

@devavret devavret commented Feb 14, 2022

This PR implements equality comparator for LIST columns. This only supports "self" comparison for now, meaning the two rows to be compared should belong to the same table. A comparator that works on rows of two different tables will be implemented in another PR.

This works only on "sanitized" list columns. See #10291 for details.

This will partially support #10186.

@devavret devavret added the DO NOT MERGE Hold off on merging; see PR for details label Feb 14, 2022
@github-actions github-actions bot added CMake CMake build issue libcudf Affects libcudf (C++/CUDA) code. labels Feb 14, 2022
Copy link
Copy Markdown
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty satisfied with this PR at this stage. My remaining suggestions and questions aren't really blocking, so I'll probably approve this in the next round of review, but some of my questions may spawn work for future PRs (e.g. the [lists|structs]_column_device_view question) if we think they're worthwhile.


data_profile table_data_profile;
table_data_profile.set_distribution_params(dtype, distribution_id::UNIFORM, 0, 5);
table_data_profile.set_null_frequency((include_nulls) ? 0.1 : 0.0);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might balloon benchmark times too much, so feel free to decline, but this seems like a benchmark where it might be quite interesting to see how performance changes with different amounts of nulls (i.e. parametrizing this).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes the axis from include_nulls to null_frequency. Here's the results:
image
image
There's a big jump from no nulls to 10% nulls because for 0 nulls, a portion of the code is inactive. Adding nulls seems to help flat column because it doesn't have to load and check the actual value.
For list, I think we don't see the early return benefits until after 70% nulls.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the 0 nulls case is qualitatively different. For the rest I'm guessing that it's some balance between less work and more divergence? The cost of a thread idling for a list is worse than for scalars and should get worse the larger the lists since in theory the amount of idle time is potentially only bounded by the largest list. I don't think there's anything actionable here, but good to see

Copy link
Copy Markdown
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't seem to comment on the start indexes discussion anymore, but the commit you linked shows a good example of what you're aiming for so I'm going to go ahead and resolve that discussion. There are a couple of small outstanding tasks (documenting the safe template parameter and making a decision on the curr_col/temp_col/prev_col naming) but otherwise LGTM! Really awesome work here.

void const* data,
bitmask_type const* null_mask,
size_type offset)
CUDF_HOST_DEVICE column_device_view_base(data_type type,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strictly speaking, this macro is really only necessary in a .hpp file that is expected to work in both host and device code TUs.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change made as part of this feedback #10289 (comment). I can revert the macro changes made to all .cuh headers.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strictly speaking, this macro is really only necessary in a .hpp file that is expected to work in both host and device code TUs.

I agree that it's not necessary in cuh. I made this suggestion because not everyone will think about that and people will invariably copy-paste, so uniformly using CUDF_HOST_DEVICE helps reduce the chance of future errors and leaves one less thing for developers to think about. It's a minor suggestion though, if you prefer to use it more surgically that's OK with me.

@devavret
Copy link
Copy Markdown
Contributor Author

rerun tests

@vyasr
Copy link
Copy Markdown
Contributor

vyasr commented Apr 12, 2022

@gpucibot merge

@vyasr vyasr requested a review from a team April 12, 2022 23:26
@vyasr
Copy link
Copy Markdown
Contributor

vyasr commented Apr 12, 2022

The merge is blocked by needing an ops review because of the one-line change to the conda recipe. I've made that request.

@rapids-bot rapids-bot bot merged commit 0ea6f8e into rapidsai:branch-22.06 Apr 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

4 - Needs Review Waiting for reviewer to review or respond CMake CMake build issue feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants