Summary
Fix issues with slow chat responsiveness and less than ideal display of tool call results
Problem statement
I am building a web UI for the ZeroClaw gateway which will have a full workspace, chat, skills, apps, etc. I've been working on chat and implementing different card types for streamingContent, reasoningContent and tool calls.
While doing this I have run into two issues:
- The streamingContent card type is seemingly never used; agent responses arrive in one large chunk instead of being streamed as they are generated, which makes the same model feel 10x slower in ZeroClaw compared to VSCode or other clients (the same chunking issue is observed in the terminal)
- Tool call content is displayed inside regular messages instead of in tool cards
Here is my audit of current behavior:
Root Causes (Summary)
- RC1: ZeroClaw compat streaming emits one content delta; client must re-chunk for visible progressive rendering.
- RC2: /v1/chat/completions does not convey structured tool telemetry; client needs /api/events bridging to render tool cards for gateway tool use.
F1: “Streaming” can legitimately look non-streaming
Cause:
- ZeroClaw compat POST /v1/chat/completions emits the full assistant text in one SSE delta chunk.
Effect:
- Client sees exactly one large delta event, so the UI renders the entire answer in one update.
Conclusion:
- This is not a frontend bug by itself; it is parity with ZeroClaw compat behavior. Client must re-chunk large deltas if it wants visible incremental rendering.
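As a stopgap on the client side, the re-chunking mentioned above could look roughly like the sketch below. The helper name `rechunkDelta` and the `maxChars` parameter are hypothetical and not part of ZeroClaw; the function only splits one large delta into smaller pieces that a UI could append one at a time.

```typescript
// Hypothetical client-side helper (not a ZeroClaw API): split one large
// delta into small pieces so the UI can append them progressively.
function rechunkDelta(text: string, maxChars = 24): string[] {
  if (maxChars < 1) throw new RangeError("maxChars must be >= 1");
  const chunks: string[] = [];
  let current = "";
  // Tokenize into alternating runs of whitespace and non-whitespace so that
  // concatenating all chunks reproduces the original text exactly.
  for (const token of text.match(/\s+|\S+/g) ?? []) {
    // Prefer breaking between tokens rather than inside a word.
    if (current.length > 0 && current.length + token.length > maxChars) {
      chunks.push(current);
      current = "";
    }
    current += token;
    // A single token longer than maxChars is hard-split by UTF-16 code units.
    while (current.length > maxChars) {
      chunks.push(current.slice(0, maxChars));
      current = current.slice(maxChars);
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Each returned piece could then be appended to the streamingContent card on a short timer or animation frame. This only improves perceived progress after the full response has arrived; it does not reduce time to first token, which is why server-side streaming is still requested.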
F2: Tool use often cannot render as tool cards when using /v1/chat/completions
Cause:
- ZeroClaw’s /v1/chat/completions compat response does not include structured tool-call events.
- ZeroClaw sanitizes tool-tag markup from assistant text (so the backend cannot reliably recover <tool_call>...</tool_call> from content).
Effect:
- Even when ZeroClaw uses tools internally, Client may only get plain assistant text describing tool steps, which the UI renders as text.
Conclusion:
- To render tool-use UI consistently, Client needs a structured tool signal. The only stable source in ZeroClaw is /api/events (SSE broadcast tool telemetry).
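To illustrate what a structured tool signal would enable on the client, here is a minimal sketch of folding tool telemetry events into tool card state. The event shape (`tool_start` / `tool_end`, `callId`, `tool`, `args`, `ok`, `output`) is entirely an assumption made for illustration; the actual /api/events payload is not described in this issue and may differ.

```typescript
// ASSUMPTION: this event shape is hypothetical. The real /api/events
// payload is not documented in this issue and may use different fields.
type ToolEvent =
  | { type: "tool_start"; callId: string; tool: string; args: unknown }
  | { type: "tool_end"; callId: string; ok: boolean; output: string };

// UI-side state for one toolCall card.
interface ToolCard {
  callId: string;
  tool: string;
  args: unknown;
  status: "running" | "succeeded" | "failed";
  output?: string;
}

// Pure reducer: returns a new map with the event applied, leaving the
// input map untouched so it can be used with immutable UI state.
function applyToolEvent(
  cards: ReadonlyMap<string, ToolCard>,
  event: ToolEvent,
): Map<string, ToolCard> {
  const next = new Map(cards);
  if (event.type === "tool_start") {
    next.set(event.callId, {
      callId: event.callId,
      tool: event.tool,
      args: event.args,
      status: "running",
    });
  } else {
    const existing = next.get(event.callId);
    // An end event for an unknown call is ignored rather than guessed at.
    if (existing) {
      next.set(event.callId, {
        ...existing,
        status: event.ok ? "succeeded" : "failed",
        output: event.output,
      });
    }
  }
  return next;
}
```

In a browser, the reducer could be fed from an `EventSource` subscribed to /api/events, parsing each message's `data` with `JSON.parse`. Whether that endpoint can be consumed this way (for example, with respect to authentication) is also an assumption, as is how telemetry events would be correlated with a specific chat request.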
Proposed solution
- Implement streaming of deltas
- Expose structured tool telemetry
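For the first item, one way to check whether a response is really delivered incrementally is to count the content deltas in the SSE body. The sketch below assumes the compat endpoint follows the OpenAI-style streaming format (`data:` lines carrying JSON with `choices[0].delta.content`, terminated by `data: [DONE]`); the function name `extractDeltas` is hypothetical.

```typescript
// ASSUMPTION: the stream uses OpenAI-style chat.completion.chunk frames.
// Returns the non-empty content deltas found in a complete SSE body.
function extractDeltas(sseText: string): string[] {
  const deltas: string[] = [];
  for (const line of sseText.split(/\r?\n/)) {
    // Only "data:" fields carry payloads; skip comments, blanks, other fields.
    if (!line.startsWith("data:")) continue;
    const payload = line.slice("data:".length).trim();
    if (payload === "" || payload === "[DONE]") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(payload);
    } catch {
      continue; // Ignore frames that are not valid JSON.
    }
    const content = (
      parsed as { choices?: { delta?: { content?: unknown } }[] } | null
    )?.choices?.[0]?.delta?.content;
    if (typeof content === "string" && content.length > 0) {
      deltas.push(content);
    }
  }
  return deltas;
}
```

With the behavior described in F1, this would return a single element containing the whole answer; after the proposed change, it should return many short elements whose concatenation is the full answer. The sketch operates on a complete response body, so a real client reading from the network would additionally need to buffer partial lines across reads.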
Non-goals / out of scope
N/A
Alternatives considered
No response
Acceptance criteria
- Client can stream the agent's thinking and reasoning step by step in the web UI
- Client can display tool call progress and results in a separate toolCall component
Architecture impact
/v1/chat/completions
Risk and rollback
Risk: May break existing frontend implementations; however this change is needed to deliver the expected user experience from an agent
Breaking change?
No
Data hygiene checks