Store any_resource in device_buffer and device_uvector#2201
Store any_resource in device_buffer and device_uvector#2201rapids-bot[bot] merged 6 commits intorapidsai:mainfrom
Conversation
Change the per-device resource ref map to store `cuda::mr::any_resource<device_accessible>` instead of `device_async_resource_ref`. This provides ownership semantics - the global map now owns the stored resources. Key changes: - `get_ref_map()` now returns map with `any_resource` values - `set_per_device_resource_ref()` reifies incoming refs to `any_resource` - `get_per_device_resource_ref()` returns refs from stored `any_resource` - Added conversion operator and constructor in `cccl_async_resource_ref` to support conversion to/from `any_resource` Note: Refs returned from get/set operations now point to the owned copy inside `any_resource`, not the original object. Tests updated to verify functional correctness rather than ref identity.
This reverts commit e791a1e.
This implements the next step of the CCCL 3.2 migration plan: adopting any_resource in custom containers, starting with device_buffer. Changes: - Change device_buffer::_mr from device_async_resource_ref to cuda::mr::any_resource<cuda::mr::device_accessible> - Make memory_resource() non-const on device_buffer and device_uvector (required because CCCL's resource_ref can only be constructed from non-const any_resource&) - Use std::move for _mr in move constructor and move assignment This ensures device_buffer owns its memory resource, preventing use-after-free if the original resource is destroyed while the buffer still exists.
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
| * @briefreturn{The resource used to allocate and deallocate} | ||
| */ | ||
| [[nodiscard]] rmm::device_async_resource_ref memory_resource() const noexcept { return _mr; } | ||
| [[nodiscard]] rmm::device_async_resource_ref memory_resource() noexcept { return _mr; } |
There was a problem hiding this comment.
This is a breaking change. Should we make the resource mutable or make this method non-const? I discussed a bit with the CCCL team.
The CCCL buffer returns any_resource const& which requires a copy to be made before the resource (or ref) is usable (constructing a ref and making an allocation are not possible from a const resource). I would consider that approach but I don’t want to alter the API in that way just yet. It might be better to handle that as we transition to the CCCL buffer type.
| EXPECT_EQ(mr, rmm::device_async_resource_ref{rmm::mr::detail::initial_resource()}); | ||
| } | ||
|
|
||
| TEST(DefaultTest, SetCurrentDeviceResourceRefFromPointer) |
There was a problem hiding this comment.
This is an extra test I wanted to add after #2200. Not buffer related.
PointKernel
left a comment
There was a problem hiding this comment.
I have one design question for clarification and learning purposes; otherwise, the PR looks good to me.
|
/merge |
Summary
Part of #2011.
This PR adopts
any_resourceindevice_bufferanddevice_uvector.device_buffer::_mrfromdevice_async_resource_reftocuda::mr::any_resource<cuda::mr::device_accessible>memory_resource()non-const ondevice_bufferanddevice_uvector(required because CCCL'sresource_refcan only be constructed from non-constany_resource&)std::movefor_mrin move constructor and move assignmentThis ensures
device_bufferowns its memory resource, preventing use-after-free if the original resource is destroyed while the buffer still exists.Stacked PR
This PR is stacked on top of #2200. To view only the changes in this PR:
bdice/rmm@device-ref-map-any-resource...containers-any-resource
Test plan