Eliminate double LLM call for non-tool messages (#672) #727
Conversation
When the tool-calling orchestrator returns a text response without invoking any tools, reuse that content directly instead of discarding it and making a redundant CompleteAsync call. This eliminates the double LLM call for non-tool messages on live providers, halving latency and cost for conversational replies in board-scoped sessions. The proposal creation path is preserved: the reused response flows through the same shouldAttemptProposal logic as before.
When the orchestrator's no-tool text response is reused, run the local LlmIntentClassifier on the user message to determine actionability and extract instructions. This preserves the proposal creation path for messages like "create card X" without making a second LLM call.
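This flow can be sketched as a self-contained program. All types below are hypothetical stand-ins; only the member names (`IsDegraded`, `Content`) and the classifier's role are taken from this PR, not the repo's actual `ChatService` code:

```csharp
using System;

// Stand-ins for the repo's types; illustrative only.
record ToolCallingResult(bool IsDegraded, string? Content, int ToolCallsMade);
record CompletionResult(string Content, bool IsActionable, string? Intent);

static class LocalClassifier
{
    // Trivial stand-in for LlmIntentClassifier.Classify, purely illustrative.
    public static (bool Actionable, string? Intent) Classify(string userMessage) =>
        userMessage.StartsWith("create card", StringComparison.OrdinalIgnoreCase)
            ? (true, "card.create")
            : (false, null);
}

class Demo
{
    // Reuse the orchestrator's no-tool text and classify the user message
    // locally; returns null when the caller should fall back to CompleteAsync.
    static CompletionResult? TryReuse(ToolCallingResult tool, string userMessage)
    {
        if (tool.ToolCallsMade > 0 || tool.IsDegraded || string.IsNullOrWhiteSpace(tool.Content))
            return null;

        var (actionable, intent) = LocalClassifier.Classify(userMessage);
        return new CompletionResult(tool.Content, actionable, intent);
    }

    static void Main()
    {
        // Plain-text reply to an actionable message: reused, classified locally.
        Console.WriteLine(TryReuse(new(false, "Sure, adding it.", 0), "create card X") is { IsActionable: true }); // True
        // Null content: falls through to the single-turn CompleteAsync path.
        Console.WriteLine(TryReuse(new(false, null, 0), "hello") is null); // True
    }
}
```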
Five new tests verify:
- Non-tool responses make only one LLM call (`CompleteAsync` never called)
- Tool-calling responses still work correctly
- A degraded orchestrator with null content falls back to `CompleteAsync`
- Non-board-scoped sessions still use single-turn `CompleteAsync`
- The reused response still triggers proposal creation when `RequestProposal` is set
**Adversarial Self-Review**

**Response shape coverage:** All orchestrator response shapes are handled.

**Proposal creation path:** Preserved. The local classifier runs on the user message to keep actionability detection.

**Known acceptable difference:** The reused text content is shaped by the tool-calling system prompt (which describes available tools), rather than by the single-turn prompt previously used for `CompleteAsync`.

**SignalR tool status events:** Unaffected.

**Quota tracking:** Usage is recorded once per orchestrator call, in the correct branch (either the tool-calling branch or the reuse branch). No double-counting.

**No actionable findings:** All edge cases are covered. No further changes needed.
Pull request overview
This PR updates ChatService to avoid an unnecessary second LLM provider call when the tool-calling orchestrator returns a plain-text response without invoking any tools, while preserving proposal/actionability detection by running the local intent classifier on the user message.
Changes:
- Reuse orchestrator text responses (no-tool-call) instead of discarding them and calling `CompleteAsync` again.
- Run `LlmIntentClassifier` + `NaturalLanguageInstructionExtractor` locally for reused orchestrator responses to keep proposal creation behavior.
- Add/extend unit tests to ensure `CompleteAsync` is not called in the reused-response path and that fallback behavior remains correct.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| backend/src/Taskdeck.Application/Services/ChatService.cs | Reuses orchestrator content for no-tool responses and runs local intent classification to preserve proposal creation behavior. |
| backend/tests/Taskdeck.Application.Tests/Services/ChatServiceTests.cs | Adds coverage for single-call behavior (no-tool, tool-calls, degraded fallback, non-board sessions, request-proposal). |
```csharp
else if (!toolResult.IsDegraded && toolResult.Content != null)
{
    // The LLM responded with plain text (no tool calls). Reuse
    // this content instead of making a redundant CompleteAsync
    // call (#672). The response still flows through proposal
```
The reuse branch treats any non-null orchestrator Content as reusable. ToolCallingChatOrchestrator can emit an empty string when the provider returns IsComplete=true but Content=null, which would now be persisted as a blank assistant message instead of falling back to CompleteAsync. Consider requiring !string.IsNullOrWhiteSpace(toolResult.Content) before reusing; otherwise fall back to single-turn (and/or treat whitespace as degraded).
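A minimal illustration of the difference between the two guards, using a hypothetical stand-in record rather than the repo's actual orchestrator type:

```csharp
using System;

// Stand-in for the orchestrator result; field names follow the PR diff.
record ToolCallingResult(bool IsDegraded, string? Content);

class GuardDemo
{
    // Current guard: also reuses the "" that the orchestrator coalesces from
    // a null provider Content, persisting a blank assistant message.
    static bool CanReuseLoose(ToolCallingResult r) =>
        !r.IsDegraded && r.Content != null;

    // Proposed guard: empty/whitespace content falls through to CompleteAsync.
    static bool CanReuseStrict(ToolCallingResult r) =>
        !r.IsDegraded && !string.IsNullOrWhiteSpace(r.Content);

    static void Main()
    {
        var blank = new ToolCallingResult(false, ""); // Content: llmResult.Content ?? ""
        Console.WriteLine(CanReuseLoose(blank));  // True  -> blank message persisted
        Console.WriteLine(CanReuseStrict(blank)); // False -> falls back to CompleteAsync
    }
}
```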
```diff
@@ -1,8 +1,10 @@
 using FluentAssertions;
 using Microsoft.Extensions.Logging;
```
using Microsoft.Extensions.Logging; appears unused in this test file (the logger type is referenced via fully-qualified name). If the solution treats warnings as errors or runs analyzers, this can break builds; please remove the unused using or use the imported namespace consistently.
```diff
- using Microsoft.Extensions.Logging;
```
```csharp
/// Helper to build a ChatService with a tool-calling orchestrator injected.
/// Uses a mock orchestrator that returns the provided ToolCallingResult.
```
The XML doc on BuildServiceWithOrchestrator says it "uses a mock orchestrator that returns the provided ToolCallingResult", but the helper actually accepts a real ToolCallingChatOrchestrator instance and just injects it into ChatService. Please update/remove this doc to avoid misleading future test edits.
```diff
- /// Helper to build a ChatService with a tool-calling orchestrator injected.
- /// Uses a mock orchestrator that returns the provided ToolCallingResult.
+ /// Helper to build a ChatService with the provided tool-calling orchestrator injected.
```
```csharp
callSequence++;
if (callSequence == 1)
{
    var args = System.Text.Json.JsonDocument.Parse("{}").RootElement;
```
`JsonDocument.Parse("{}").RootElement` creates a `JsonDocument` that should be disposed. Even in tests, this keeps pooled buffers alive until finalization and may trigger analyzers. Prefer `using var doc = JsonDocument.Parse("{}"); var args = doc.RootElement;` (or use `JsonSerializer.Deserialize<JsonElement>("{}")`).
```diff
- var args = System.Text.Json.JsonDocument.Parse("{}").RootElement;
+ var args = System.Text.Json.JsonSerializer.Deserialize<System.Text.Json.JsonElement>("{}");
```
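For illustration, a stand-alone program contrasting the leaky pattern with the two suggested fixes (not the repo's actual test code):

```csharp
using System;
using System.Text.Json;

class JsonDocDemo
{
    static void Main()
    {
        // Leaky pattern: the JsonDocument returned by Parse is never disposed,
        // so its pooled buffer is only reclaimed when the finalizer runs.
        // var args = JsonDocument.Parse("{}").RootElement;

        // Fix 1: scope the document; Clone() detaches the element so it can
        // safely outlive the disposed document.
        using (var doc = JsonDocument.Parse("{}"))
        {
            JsonElement args = doc.RootElement.Clone();
            Console.WriteLine(args.ValueKind); // Object
        }

        // Fix 2: deserialize straight to a JsonElement; nothing to dispose.
        JsonElement args2 = JsonSerializer.Deserialize<JsonElement>("{}");
        Console.WriteLine(args2.ValueKind); // Object
    }
}
```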
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a36cfabd80
```csharp
var (classifiedActionable, classifiedIntent) =
    LlmIntentClassifier.Classify(dto.Content);
List<string>? classifiedInstructions = null;
if (classifiedActionable)
{
```
Keep LLM instruction extraction for reused no-tool replies
This branch now synthesizes LlmCompletionResult from LlmIntentClassifier + NaturalLanguageInstructionExtractor instead of calling CompleteAsync, which removes the structured instruction-extraction path used by live providers. For board-scoped messages where tool-calling returns plain text (no tool calls), intents like column.reorder are marked actionable but have no extractor support. Instructions therefore comes back empty, and chat falls back to parsing the raw user sentence with ParseInstructionAsync (strict syntax), so proposal creation fails for natural-language requests that previously depended on provider-extracted instructions.
Code Review
This pull request eliminates redundant LLM calls in ChatService by reusing the plain-text response from the tool-calling orchestrator when no tools are invoked. It introduces logic to perform local intent classification and instruction extraction on these reused responses to maintain proposal creation functionality. Comprehensive unit tests have been added to verify the elimination of double calls across various scenarios, including board-scoped and non-board-scoped sessions, as well as fallback behavior for degraded orchestrator results. I have no feedback to provide as no review comments were submitted.
**Adversarial Review (Round 2)**

**Severity: Medium -- Empty-string content reuse creates blank assistant messages**
Location: `else if (!toolResult.IsDegraded && toolResult.Content != null)`
The orchestrator coalesces a null provider response to an empty string (`Content: llmResult.Content ?? ""`). When the LLM provider returns no text, the `!= null` guard reuses that empty string and persists it as a blank assistant message. Fix: Change the guard to `!string.IsNullOrWhiteSpace(toolResult.Content)` so empty/whitespace content falls through to `CompleteAsync`.

**Severity: Low -- Indentation inconsistency**
Lines 288-407 inside the `!usedToolCalling` block do not match the project's 4-space indent conventions. Fix: Re-indent the block contents to the correct nesting level.

**Severity: Low -- Redundant fully-qualified name in tests**
Lines 1147 and 1250 of the test file reference `ILogger<>` by its fully-qualified name even though `Microsoft.Extensions.Logging` is already imported. Fix: Use the short form.

**Severity: Info -- Copilot P2 comment on LLM instruction extraction (not a regression)**
Copilot flagged that the reuse path relies on the static `LlmIntentClassifier`/`NaturalLanguageInstructionExtractor` pair rather than provider-extracted instructions. This is a known acceptable difference, not a regression introduced here.

**Severity: Info -- JsonDocument disposal in test**
Line 1224 creates a `JsonDocument` via `JsonDocument.Parse("{}").RootElement` without disposing it; replace with `JsonSerializer.Deserialize<JsonElement>("{}")`.

**Summary**
One actionable bug (empty string reuse), two code quality fixes (indentation, redundant FQN). Addressing now.
The orchestrator coalesces null Content to "" when IsComplete is true. The previous != null check would reuse this empty string as a blank assistant message. Use IsNullOrWhiteSpace instead so empty/whitespace content falls through to CompleteAsync. Also fix indentation of the !usedToolCalling block to match the project's 4-space indent conventions.
- Add test verifying empty orchestrator content falls back to CompleteAsync instead of persisting a blank assistant message.
- Fix misleading XML doc on BuildServiceWithOrchestrator helper.
- Replace fully-qualified ILogger<> with short form (using already imported).
- Replace undisposed JsonDocument.Parse with JsonSerializer.Deserialize.
Summary
- Reuse the orchestrator's no-tool text response instead of making a redundant `CompleteAsync` call
- Run the local `LlmIntentClassifier` on the user message to preserve actionability detection and proposal creation for messages like "create card X"

Closes #672
Test plan
- Non-tool responses make only one LLM call (`Times.Never` on `CompleteAsync`)
- Tool-calling responses still work correctly
- Degraded orchestrator with null content falls back to `CompleteAsync`
- Non-board-scoped sessions still use single-turn `CompleteAsync` (orchestrator not invoked)
- Reused response still triggers proposal creation when `RequestProposal` is set
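The single-call assertion presumably follows Moq's `Verify`/`Times.Never` pattern. A hedged sketch of that check; the interface, request type, and helper signature here are assumptions, not the repo's actual API:

```csharp
// Hedged sketch only: ILlmProvider, LlmCompletionRequest, and the helper
// signature are illustrative assumptions, not the repo's real types.
var provider = new Mock<ILlmProvider>();
var service = BuildServiceWithOrchestrator(plainTextOrchestrator, provider.Object);

await service.SendMessageAsync(boardScopedSessionId, "summarize my board");

// The reused-response path must never reach the provider's completion API.
provider.Verify(
    p => p.CompleteAsync(It.IsAny<LlmCompletionRequest>(), It.IsAny<CancellationToken>()),
    Times.Never());
```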