
add cuda graph capture failure possible solution #3430

Merged
zhyncs merged 1 commit into main from zhyncs/doc
Feb 9, 2025

zhyncs (Collaborator) commented Feb 9, 2025

Motivation

When EAGLE 2 speculative decoding is enabled, the server may fail to start because CUDA graph capture fails. If this happens, consider reducing the CUDA graph max batch size (the default is 160).
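
As a rough illustration of the workaround, the sketch below builds a launch command with a smaller CUDA graph max batch size and prints it without running it. The flag names (`--cuda-graph-max-bs`, `--speculative-algorithm`, `--speculative-draft-model-path`), the model paths, and the value 32 are assumptions that this PR does not state; check them against `python -m sglang.launch_server --help` for your installed version. `build_launch_command` is a hypothetical helper, not part of SGLang.

```python
import shlex


def build_launch_command(model_path, draft_model_path, cuda_graph_max_bs=32):
    """Build (but do not run) a server launch command as a list of arguments.

    Hypothetical helper for illustration only. The flag names are assumed,
    not taken from this PR; verify them with --help before use.
    """
    if cuda_graph_max_bs <= 0:
        raise ValueError("cuda_graph_max_bs must be a positive integer")
    return [
        "python", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--speculative-algorithm", "EAGLE",  # assumed flag and value
        "--speculative-draft-model-path", draft_model_path,  # assumed flag
        # Smaller than the default of 160 mentioned above; 32 is illustrative.
        "--cuda-graph-max-bs", str(cuda_graph_max_bs),
    ]


# Placeholder paths; substitute your own target and draft models.
cmd = build_launch_command("path/to/target-model", "path/to/draft-model",
                           cuda_graph_max_bs=32)
print(shlex.join(cmd))
```

If startup still fails at the chosen value, one option is to lower it further until graph capture succeeds.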

Modifications

Checklist

  • Format your code according to the Code Formatting with Pre-Commit.
  • Add unit tests as outlined in the Running Unit Tests.
  • Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
  • Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling and Accuracy Results.
  • For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.

zhyncs merged commit bc72e5b into main Feb 9, 2025
2 of 18 checks passed
zhyncs deleted the zhyncs/doc branch February 9, 2025 14:57
zhyncs (Collaborator, Author) commented Feb 9, 2025

fix #3395

zhyncs mentioned this pull request Feb 10, 2025
13 tasks
timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025