
feat(ui): show live and average tokens per second in shell UI#1868

Open
cdhcrs wants to merge 2 commits into MoonshotAI:main from cdhcrs:feat/shell-ui-tps-display

Conversation


@cdhcrs cdhcrs commented Apr 14, 2026

Description

Show live tokens per second during streaming and display the average TPS at turn end in the shell UI.

  • Live TPS is shown in the composing and thinking spinners after 0.5s.
  • Average TPS is printed as a summary line when the turn completes.
  • Summary is only shown at turn boundaries, never in the middle of content.

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked the related issue, if any. (Change is <100 lines)
  • I have added tests that prove my fix is effective or that my feature works.
  • I have run make gen-changelog to update the changelog.
  • I have run make gen-docs to update the user documentation.

@cdhcrs cdhcrs closed this Apr 14, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fad57ec910

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

```diff
 def cleanup(self, is_interrupt: bool) -> None:
     """Cleanup the live view on step end or interruption."""
-    self.flush_content()
+    self.flush_content(show_summary=not is_interrupt)
```

P2: Emit summaries only at true turn boundaries

cleanup() now forwards show_summary=not is_interrupt, but cleanup() is also used by the StepBegin and SteerInput paths, not just when a turn finishes. In turns that continue across multiple steps (for example after steer injection), this can print a Composed ... tok/s summary before the turn actually ends, which violates the stated behavior and adds mid-turn noise. Summary emission should be restricted to explicit turn-boundary handling.


Comment on lines +263 to +266
```python
seconds = max(1, int(elapsed))
if tokens_int > 0:
    rate = int(tokens_int / seconds)
    return rate if rate > 0 else None
```

P2: Use fractional elapsed time for TPS calculation

The TPS calculation floors elapsed seconds via int(elapsed), which makes rates materially inaccurate once elapsed time is between integer boundaries (e.g. 100 tokens over 1.9s is reported as 100 tok/s instead of about 52). Since this function powers the new average throughput summary, users can be shown inflated numbers; compute with the real elapsed float (clamped away from zero) and round only for display.


devin-ai-integration[bot]

This comment was marked as resolved.

@cdhcrs cdhcrs reopened this Apr 14, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 89823b65c3


```diff
 match msg:
     case TurnBegin():
-        self.flush_content()
+        self.flush_content(show_summary=True)
```


P2: Emit end-of-turn TPS summary in non-interactive live view

This summary trigger is wired to TurnBegin, but in _LiveView a run starts with no active content block, so flush_content(show_summary=True) is effectively a no-op there. I checked _LiveView.visualize_loop and its shutdown path still ends with cleanup(is_interrupt=False), which calls flush_content() without summaries; in non-interactive shell usage (visualize(..., prompt_session=None), including replay), the new average TPS line is therefore never shown at turn end.


@cdhcrs cdhcrs force-pushed the feat/shell-ui-tps-display branch from 89823b6 to 3607d3e on April 14, 2026 at 09:35

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3607d3e489


```python
block = _ContentBlock(is_think=False)
block._start_time = time.monotonic() - 2.0
block._token_count = 100.0
assert block._average_tps() == 50
```


P1: Stabilize average TPS assertion against monotonic drift

This new assertion is effectively brittle on high-resolution clocks: block._start_time = time.monotonic() - 2.0 followed by _average_tps() always measures slightly more than 2.0s, so int(100 / elapsed) commonly evaluates to 49 instead of 50. That makes the test fail (or become platform-dependent) even when the implementation is correct; use a frozen/monkeypatched clock or assert a range/tolerance.


```diff
 match msg:
     case TurnBegin():
-        self.flush_content()
+        self.flush_content(show_summary=True)
```


P2: Emit TPS summary on base live-view shutdown

The summary trigger was moved to TurnBegin, but the base _LiveView shutdown path still calls cleanup(is_interrupt=False) (which flushes without summaries), so the final turn in non-interactive runs never prints the new average TPS line. Fresh evidence: replay_recent_history feeds visualize() event streams that do not include TurnBegin events for each replayed turn (replay.py builds turns by consuming TurnBegin as a delimiter), so this path reliably skips summaries.


@cdhcrs cdhcrs force-pushed the feat/shell-ui-tps-display branch 2 times, most recently from 4f70eb2 to 3077a78 on April 14, 2026 at 09:55
@cdhcrs cdhcrs force-pushed the feat/shell-ui-tps-display branch from 3077a78 to f65f7b0 on April 14, 2026 at 10:14

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f65f7b09b8


Comment on lines +608 to +610
```python
summary = block.compose_summary()
if summary is not None:
    console.print(summary)
```


P2: Aggregate TPS across the full turn before summarizing

The summary path now emits metrics from only self._current_content_block, so in turns that produce multiple composing blocks (for example text before/after a tool call, think↔compose transitions, or steer-triggered cleanup()), earlier block stats are flushed and dropped before the end-of-turn print. As a result, the new Composed ... tok/s line can report only the last segment rather than the whole turn, which makes the advertised average throughput misleading in common multi-step agent runs.


Author


Thanks for the suggestion. This is intentional per-segment behavior. The Composed ... tok/s summary reflects the throughput of a single continuous composing stream, which matches the live spinner's real-time rate. Aggregating across tool calls, thinking pauses, or steer injections would include idle time and produce a misleadingly low number that doesn't represent the actual streaming speed users experience.
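A quick back-of-envelope illustration of the author's point: folding idle time (tool calls, thinking pauses) into one turn-wide average drags the reported rate well below what the user actually sees while text is streaming. The numbers are made up for illustration:

```python
# Two composing segments of 100 tokens each, streamed at 50 tok/s,
# separated by a 6-second tool call during which nothing streams.
segments = [(100, 2.0), (100, 2.0)]  # (tokens, active streaming seconds)
idle = 6.0

per_segment = [round(tokens / secs) for tokens, secs in segments]
total_tokens = sum(tokens for tokens, _ in segments)
wall_clock = sum(secs for _, secs in segments) + idle

print(per_segment)                       # each segment streams at 50 tok/s
print(round(total_tokens / wall_clock))  # turn-wide average: 20 tok/s
```

The per-segment figure matches the live spinner; the turn-wide figure is dominated by the idle gap.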
