Conversation

Contributor

@a4zhangfei a4zhangfei commented Sep 21, 2025

fix #10660

Motivation

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

@gemini-code-assist

Summary of Changes

Hello @a4zhangfei, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug preventing the correct generation and return of output token log probabilities during lookahead speculative decoding. The changes ensure that all necessary log probability metrics are accurately calculated and appended to request objects, providing complete information for users who require logprob outputs. It also adds flexibility for developers to configure the logprob calculation method via an environment variable, specifically concerning the application of temperature scaling.

Highlights

  • Log Probability Fix: Resolved an issue where output_token_logprobs were missing when using lookahead speculative decoding, ensuring accurate log probability reporting.
  • Configurable Logprob Calculation: Introduced a new environment variable, RETURN_ORIGINAL_LOGPROB, which controls whether log probabilities are computed with or without temperature scaling.
  • Enhanced Logprob Detail: Added a dedicated internal method that computes and stores comprehensive log probability values for accepted tokens during decoding, including next-token logprobs, top-k logprobs, and logprobs for specific token IDs.
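The temperature-scaling choice behind RETURN_ORIGINAL_LOGPROB can be sketched as follows. This is a minimal illustration, not the actual SGLang implementation: the function name, the boolean parsing of the variable, and the pure-Python log-softmax are all assumptions made for the example.

```python
import math
import os


def compute_logprobs(logits, temperature):
    """Return log-probabilities for one row of logits.

    Sketch only: if RETURN_ORIGINAL_LOGPROB is set, logprobs come from the
    raw logits (no temperature scaling); otherwise the logits are divided
    by the sampling temperature first, matching the distribution that was
    actually sampled from.
    """
    if os.environ.get("RETURN_ORIGINAL_LOGPROB", "").lower() in ("1", "true"):
        scaled = list(logits)
    else:
        t = max(temperature, 1e-6)  # guard against temperature == 0
        scaled = [x / t for x in logits]
    # Numerically stable log-softmax: subtract the max before exponentiating.
    m = max(scaled)
    lse = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - lse for x in scaled]
```

With the variable set, the returned logprobs reflect the raw model distribution; unset, they reflect the temperature-adjusted distribution the sampler actually drew from.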

@gemini-code-assist gemini-code-assist bot left a comment
Code Review

This pull request fixes an issue where output_token_logprobs were missing when using lookahead speculative decoding. The changes introduce a new method _add_logprob_values to correctly calculate and append log probabilities for the generated tokens. The fix is well-targeted and the logic appears correct. I have one suggestion to improve the code's efficiency and readability.
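For readers following along, the general shape of such a helper can be sketched as below. This is an illustrative sketch only, not the PR's code: the real _add_logprob_values operates on torch tensors and the scheduler's request objects, whereas here a plain dict and list-based logits stand in for them.

```python
import math


def add_logprob_values(req, accepted_token_ids, logits_rows, top_k=5):
    """Sketch: for each accepted token, compute its logprob and the
    top-k logprobs from the corresponding row of logits, then append
    them to the request's output lists (hypothetical field names)."""
    for token_id, row in zip(accepted_token_ids, logits_rows):
        # Stable log-softmax over the vocabulary row.
        m = max(row)
        lse = m + math.log(sum(math.exp(x - m) for x in row))
        logprobs = [x - lse for x in row]
        # Logprob of the token that was actually accepted.
        req["output_token_logprobs"].append((logprobs[token_id], token_id))
        # Top-k (logprob, token_id) pairs, highest probability first.
        topk = sorted(enumerate(logprobs), key=lambda p: p[1], reverse=True)[:top_k]
        req["output_top_logprobs"].append([(lp, tid) for tid, lp in topk])
```

The point of factoring this into one method is that every accepted token in a speculative batch gets the same logprob treatment, instead of the logic being skipped on the lookahead path.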

Collaborator

@Qiaolin-Yu Qiaolin-Yu left a comment

Why not put this logic into lookahead_worker.py, as eagle_worker.py does?

@a4zhangfei a4zhangfei changed the title fix missing output_token_logprobs when using lookahead speculative decoding fix missing output_token_logprobs when using ngram speculative decoding Sep 29, 2025
@a4zhangfei
Contributor Author

@Qiaolin-Yu When can we merge the code?

@merrymercy merrymercy merged commit 6f08488 into sgl-project:main Nov 10, 2025
13 of 40 checks passed


Development

Successfully merging this pull request may close these issues.

[Bug] missing output_token_logprobs when using lookahead speculative decoding

4 participants