Skip to content

Migrate Python/Cython bindings from device_memory_resource to device_async_resource_ref#2300

Merged
bdice merged 1 commit intorapidsai:stagingfrom
bdice:python-remove-dmr
Mar 17, 2026
Merged

Migrate Python/Cython bindings from device_memory_resource to device_async_resource_ref#2300
bdice merged 1 commit intorapidsai:stagingfrom
bdice:python-remove-dmr

Conversation

@bdice
Copy link
Copy Markdown
Collaborator

@bdice bdice commented Mar 14, 2026

Summary

Replaces shared_ptr[device_memory_resource] with per-subclass unique_ptr[ConcreteType] (owning) and optional[device_async_resource_ref] (non-owning reference) across all Python/Cython bindings. This is a part of #2011.

There are significant opportunities to make this Cython code better over time but I have to get something that removes device_memory_resource from the Python/Cython side before I can finish migration on the C++ side (#2296). I welcome critique of this design, and ideas for how it can be improved, particularly from @vyasr @wence-. I would like to address any suggested improvements in follow-up PRs, because this changeset is necessary to unblock #2301.

The changes in cdef class DeviceMemoryResource are perhaps the most significant changes here from a design perspective.

The solution I'm going with for now is to keep the DeviceMemoryResource class around, as a base class for the Cython MRs, and let it handle allocate/deallocate. It owns a optional[device_async_resource_ref] which is used for allocation/deallocation. It's optional so that the class can be default-constructed (Cython requires nullary constructors), but it should never be nullopt except during initialization.

Then, each MR class owns a c_obj like unique_ptr[cuda_memory_resource]. This is unique_ptr so it can be default-constructed for Cython's requirements. I chose unique_ptr over optional here to emphasize that this member is the thing that actually owns the resource. As with the c_ref, this should never be nullptr except during initialization. When an MR class is created, it initializes its c_obj and then constructs a c_ref (a member inherited from the DeviceMemoryResource base class).

"Special" methods for an MR like getting the statistics counts go through deref(self.c_obj), and "common" methods like allocate/deallocate go through self.c_ref.value().

Changes

  • .pxd declarations: Remove device_memory_resource class. Declare device_async_resource_ref and a make_device_async_resource_ref() inline C++ template that returns optional to work around Cython generating default-constructed temporaries for non-default-constructible types. All adaptor constructors take device_async_resource_ref instead of device_memory_resource*.
  • .pxd class definitions: DeviceMemoryResource base holds optional[device_async_resource_ref] c_ref; each concrete subclass holds unique_ptr[ConcreteType] c_obj.
  • .pyx implementations: All __cinit__ methods construct via unique_ptr then set c_ref via make_device_async_resource_ref. Typed accessors (pool_size, flush, etc.) use deref(self.c_obj). Per-device functions use set_per_device_resource_ref.
  • device_buffer.pyx: Passes self.mr.c_ref.value() instead of self.mr.get_mr().

Closes #2294

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot bot commented Mar 14, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@bdice
Copy link
Copy Markdown
Collaborator Author

bdice commented Mar 14, 2026

/ok to test

@bdice bdice moved this to In Progress in RMM Project Board Mar 14, 2026
@bdice bdice added breaking Breaking change improvement Improvement / enhancement to an existing function labels Mar 14, 2026
@bdice bdice self-assigned this Mar 14, 2026
…async_resource_ref

Replace shared_ptr[device_memory_resource] with per-subclass
unique_ptr[ConcreteType] for ownership and optional[device_async_resource_ref]
for the non-owning reference used by allocate/deallocate.

Key changes:
- librmm .pxd declarations: remove device_memory_resource class, declare
  device_async_resource_ref, add make_device_async_resource_ref() inline
  C++ template returning optional to avoid Cython default-constructor
  temporaries, update all adaptor constructors to take resource_ref
- pylibrmm .pxd declarations: DeviceMemoryResource holds
  optional[device_async_resource_ref] c_ref instead of
  shared_ptr[device_memory_resource] c_obj; each concrete subclass holds
  unique_ptr[ConcreteType] c_obj
- pylibrmm .pyx implementations: all __cinit__ methods construct via
  unique_ptr then set c_ref via make_device_async_resource_ref; typed
  accessors (pool_size, flush, etc.) use deref(self.c_obj); per-device
  functions use set_per_device_resource_ref
- device_buffer.pyx: pass self.mr.c_ref.value() instead of
  self.mr.get_mr()

Closes rapidsai#2294
@bdice bdice force-pushed the python-remove-dmr branch from bddb97a to 20f2b6a Compare March 17, 2026 07:25
@bdice bdice marked this pull request as ready for review March 17, 2026 07:31
@bdice bdice requested a review from a team as a code owner March 17, 2026 07:31
@bdice bdice requested review from TomAugspurger and removed request for a team March 17, 2026 07:31
@bdice bdice merged commit 1d17c3b into rapidsai:staging Mar 17, 2026
72 of 75 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in RMM Project Board Mar 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking Breaking change improvement Improvement / enhancement to an existing function

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants