
Update decode kernel benchmark with new Triton backend interface #3618

Closed
amosyou wants to merge 2 commits into sgl-project:main from amosyou:fix_kernel_benchmark_triton_interface

amosyou commented Feb 17, 2025

Motivation

The Triton kernel for decode attention was updated with a new backend interface in #3292, breaking the benchmark code.

Modifications

Corrected the import for should_use_tensor_core and replaced req_to_token, b_req_idx, b_seq_len with kv_indptr and kv_indices.
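
For readers unfamiliar with the two index layouts, the sketch below illustrates one way the old inputs could map onto the new ones. It is an assumption based only on the parameter names: `req_to_token` is taken to be a padded per-request table of KV cache slots, and `kv_indptr` / `kv_indices` are taken to be a CSR-style flattening of its valid entries. The helper name `build_kv_index` is hypothetical, and plain Python lists stand in for the tensors the actual benchmark would use.

```python
from itertools import accumulate


def build_kv_index(req_to_token, b_req_idx, b_seq_len):
    """Flatten a padded token table into CSR-style (kv_indptr, kv_indices).

    Hypothetical helper for illustration; not taken from the PR's diff.
    """
    # kv_indptr[i] .. kv_indptr[i + 1] delimits request i's slice of kv_indices.
    kv_indptr = [0] + list(accumulate(b_seq_len))
    kv_indices = []
    for req_idx, seq_len in zip(b_req_idx, b_seq_len):
        # Only the first seq_len entries of a row are valid; the rest is padding.
        kv_indices.extend(req_to_token[req_idx][:seq_len])
    return kv_indptr, kv_indices


# Toy data: 3 request slots, max context length 4; the batch uses slots 2 and 0.
req_to_token = [
    [10, 11, 12, 0],
    [20, 21, 0, 0],
    [30, 31, 32, 33],
]
kv_indptr, kv_indices = build_kv_index(
    req_to_token, b_req_idx=[2, 0], b_seq_len=[4, 3]
)
print(kv_indptr)   # [0, 4, 7]
print(kv_indices)  # [30, 31, 32, 33, 10, 11, 12]
```

Under this assumed layout, the kernel no longer needs the padded table or the per-request slot indices: the two flat arrays carry the same information.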

Checklist

amosyou changed the title from "Update kernel benchmark with new Triton backend interface" to "Update decode kernel benchmark with new Triton backend interface" on Feb 17, 2025
amosyou force-pushed the fix_kernel_benchmark_triton_interface branch from f512446 to ff2eb16 on February 17, 2025 03:09

amosyou commented Feb 17, 2025

cc @zhyncs

amosyou commented Feb 18, 2025

hi @zhyncs! this should be good to go, let me know if there are any concerns

@github-actions

This pull request has been automatically closed due to inactivity. Please feel free to reopen it if needed.
