feat(tts): implement tts #965

Merged
DavdGao merged 26 commits into agentscope-ai:main from qbc2016:bc/tts on Dec 7, 2025

qbc2016 (Member) commented on Nov 25, 2025

AgentScope Version

1.0.9dev

Description

As the title says.

Checklist

Please check the following items before code is ready to be reviewed.

  • Code has been formatted with pre-commit run --all-files command
  • All tests are passing
  • Docstrings are in Google style
  • Related documentation has been updated (e.g. links, examples, etc.)
  • Code is ready for review

qbc2016 changed the title from "[WIP] feat(tts): implement tts" to "feat(tts): implement tts" on Nov 25, 2025
qbc2016 changed the title from "feat(tts): implement tts" to "[WIP]feat(tts): implement tts" on Nov 25, 2025
qbc2016 changed the title from "[WIP]feat(tts): implement tts" to "feat(tts): implement tts" on Nov 26, 2025
DavdGao (Member) left a comment

Please see the inline comments.

cla-assistant bot commented Dec 2, 2025

CLA assistant check
All committers have signed the CLA.

DavdGao (Member) left a comment

@qbc2016

  1. Please test more model and voice combinations. For example, the combination of RealtimeTTSModel, qwen3-tts-flash-realtime, and the Serena voice doesn't work at all.
  2. When we ask the agent to generate a long response (e.g. "recite the Song of Everlasting Regret", 背诵长恨歌), the realtime API blocks for an unknown reason in the example.

Copilot AI (Contributor) left a comment

Pull request overview

This PR implements comprehensive Text-to-Speech (TTS) functionality for AgentScope, introducing a unified interface for multiple TTS API providers including OpenAI, Gemini, and DashScope. The implementation supports both realtime (streaming input) and non-realtime TTS models, with seamless integration into the ReActAgent class.

Key Changes:

  • Added TTS base class and model implementations for 4 different TTS APIs with streaming support
  • Integrated TTS capability into ReActAgent with automatic speech synthesis
  • Extended the Message class with a speech field for audio content
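
To make the "unified interface" idea in the overview concrete, here is a minimal sketch of how one base class could cover both realtime (streaming input) and non-realtime models. It is not the code from this PR: `SketchTTSBase`, `SpeechChunk`, `push`, `synthesize`, `supports_streaming_input`, and the two fake models are all hypothetical names chosen for illustration, and the "audio" is just the UTF-8 bytes of the text.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class SpeechChunk:
    """Hypothetical container for synthesized audio (not the PR's type)."""

    audio: bytes
    is_last: bool


class SketchTTSBase(ABC):
    """Hypothetical unified TTS interface; all names are illustrative."""

    # True when the backend accepts text incrementally (streaming input).
    supports_streaming_input: bool = False

    @abstractmethod
    async def synthesize(self, text: str) -> SpeechChunk:
        """Convert a complete text into audio in a single call."""

    async def push(self, text_delta: str) -> SpeechChunk:
        """Feed a partial text; only realtime backends override this."""
        raise NotImplementedError("streaming input is not supported")


class FakeTTS(SketchTTSBase):
    """Non-realtime stand-in: needs the full text before it can speak."""

    async def synthesize(self, text: str) -> SpeechChunk:
        return SpeechChunk(audio=text.encode("utf-8"), is_last=True)


class FakeRealtimeTTS(FakeTTS):
    """Realtime stand-in: emits audio for each text delta as it arrives."""

    supports_streaming_input = True

    async def push(self, text_delta: str) -> SpeechChunk:
        return SpeechChunk(audio=text_delta.encode("utf-8"), is_last=False)


async def speak(model: SketchTTSBase, deltas: list[str]) -> bytes:
    """Drive either kind of model from the same stream of text deltas."""
    if model.supports_streaming_input:
        # Realtime path: forward each delta as soon as it is available.
        parts = [(await model.push(delta)).audio for delta in deltas]
        return b"".join(parts)
    # Non-realtime path: wait for the whole text, then synthesize once.
    return (await model.synthesize("".join(deltas))).audio
```

With this shape, the caller (for example an agent that streams text from an LLM) only checks one flag to decide whether to forward deltas immediately or buffer them, which is one plausible way to keep provider differences out of the agent code.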

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 24 comments.

| File | Description |
| --- | --- |
| src/agentscope/tts/_tts_base.py | Base class defining the TTS model interface with support for both realtime and non-realtime models |
| src/agentscope/tts/_tts_response.py | Data structures for TTS responses and usage tracking |
| src/agentscope/tts/_openai_tts_model.py | OpenAI TTS implementation with streaming output support |
| src/agentscope/tts/_gemini_tts_model.py | Gemini TTS implementation with streaming output support |
| src/agentscope/tts/_dashscope_tts_model.py | DashScope non-realtime TTS implementation |
| src/agentscope/tts/_dashscope_realtime_tts_model.py | DashScope realtime TTS with streaming input and output support |
| src/agentscope/tts/__init__.py | TTS module exports and public API |
| src/agentscope/agent/_react_agent.py | Integration of TTS into ReActAgent for automatic speech synthesis |
| src/agentscope/agent/_agent_base.py | Updated agent base to handle audio playback from speech field |
| src/agentscope/agent/_utils.py | Added async null context manager utility for optional TTS |
| src/agentscope/message/_message_base.py | Added speech field to Message class and updated get_text_content method |
| src/agentscope/hooks/_studio_hooks.py | Updated hook field names for studio integration |
| tests/tts_openai_test.py | Comprehensive unit tests for OpenAI TTS model |
| tests/tts_gemini_test.py | Comprehensive unit tests for Gemini TTS model |
| tests/tts_dashscope_test.py | Comprehensive unit tests for both DashScope TTS models |
| examples/functionality/tts/main.py | Example demonstrating TTS integration with ReActAgent |
| examples/functionality/tts/README.md | Documentation for the TTS example |
| docs/tutorial/en/src/task_tts.py | English tutorial for using TTS in AgentScope |
| docs/tutorial/zh_CN/src/task_tts.py | Chinese tutorial for using TTS in AgentScope |
| docs/tutorial/en/index.rst | Updated English documentation index |
| docs/tutorial/zh_CN/index.rst | Updated Chinese documentation index |
| README.md | Added TTS feature announcement |
| README_zh.md | Added TTS feature announcement in Chinese |

# Conflicts:
#	src/agentscope/pipeline/_functional.py
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 24 out of 24 changed files in this pull request and generated 20 comments.

DavdGao merged commit a000954 into agentscope-ai:main on Dec 7, 2025
11 checks passed
rayrayraykk pushed a commit to rayrayraykk/AgentScope that referenced this pull request Dec 22, 2025
)

---------

Co-authored-by: DavdGao <gaodawei.gdw@alibaba-inc.com>
Co-authored-by: DavdGao <gaodawei.gdw@gmail.com>
Kanaricc pushed a commit to Kanaricc/agentscope that referenced this pull request Feb 24, 2026
)

---------

Co-authored-by: DavdGao <gaodawei.gdw@alibaba-inc.com>
Co-authored-by: DavdGao <gaodawei.gdw@gmail.com>

3 participants