Perf: use GPU-only searchsorted instead of numpy.repeat by eriknw · Pull Request #253 · rapidsai/nx-cugraph

eriknw · 2026-04-02T22:06:18Z

Doing the indptr-to-COO expansion with a GPU-only searchsorted is faster than numpy.repeat, because it eliminates device-to-host (D2H) transfers.
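For readers unfamiliar with the recipe, here is a minimal sketch of an indptr-to-COO expansion via searchsorted. It is not the code from this PR: the function names and the `xp` parameter are hypothetical, and it runs on NumPy by default so it can be tried without a GPU. Passing `xp=cupy` with CuPy arrays should keep everything on the device, since `cupy.arange` and `cupy.searchsorted` take the same arguments.

```python
import numpy as np


def indptr_to_rows_searchsorted(indptr, nnz, xp=np):
    """Expand a CSR indptr array into COO row indices (illustrative helper).

    `nnz` is passed in (e.g. from indices.size) so that no scalar has to be
    read back from the device when `xp` is cupy.
    """
    positions = xp.arange(nnz, dtype=indptr.dtype)
    # For each edge position p, find the row r with indptr[r] <= p < indptr[r+1].
    return xp.searchsorted(indptr, positions, side="right") - 1


def indptr_to_rows_repeat(indptr):
    """Host-side reference using numpy.repeat, for comparison."""
    counts = np.diff(indptr)
    return np.repeat(np.arange(len(counts), dtype=indptr.dtype), counts)


# Four rows with 2, 0, 3, and 1 edges respectively.
indptr = np.array([0, 2, 2, 5, 6])
rows = indptr_to_rows_searchsorted(indptr, nnz=6)
print(rows)  # [0 0 2 2 2 3]
```

Using `side="right"` is what makes empty rows work: row 1 above has no edges, and no position maps to it.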

cupy.repeat does not yet support an ndarray as the repeats argument, so searchsorted is the appropriate substitute for most data. For very large data, a "cumsum+scatter" approach may be faster. cp.repeat is being updated in cupy/cupy#9828.
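The "cumsum+scatter" alternative is not spelled out in this PR. One possible reading, sketched below under that assumption, marks the start of each row with a scatter-add (done here with `bincount`) and then takes a cumulative sum. The function name and the `xp` parameter are hypothetical; the sketch runs on NumPy, and `cupy.bincount` and `cupy.cumsum` accept the same arguments.

```python
import numpy as np


def indptr_to_rows_cumsum(indptr, nnz, xp=np):
    """Expand CSR indptr to COO row indices via scatter-add + cumsum (sketch)."""
    # Count how many row boundaries fall at each edge position. Empty rows
    # produce repeated boundaries, so counts can exceed 1.
    marks = xp.bincount(indptr[1:-1], minlength=nnz)
    # Trailing empty rows put a boundary at position nnz; drop it.
    marks = marks[:nnz]
    # The running count of boundaries passed is the row index.
    return xp.cumsum(marks)


indptr = np.array([0, 2, 2, 5, 6])
rows = indptr_to_rows_cumsum(indptr, nnz=6)
print(rows)  # [0 0 2 2 2 3]
```

This does work linear in the number of edges and rows, whereas searchsorted does a binary search per edge, which may be why it could win on very large inputs. Whether it actually does would need benchmarking.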

Example speed improvement:

from_csr (1M nodes, 20M edges): 2.4ms vs 72ms (30x faster)

This PR was motivated by work I am doing in cupy/cupy#9825


eriknw added 2 commits April 2, 2026 14:56

Merge branch 'main' into cp_repeat_workaround

cf1e57f

eriknw added the "improvement" and "non-breaking" labels Apr 3, 2026

eriknw wants to merge 2 commits into rapidsai:main from eriknw:cp_repeat_workaround
