Implement Duplex Speech-to-text model and rebase #15092

Merged
Edresson merged 188 commits into NVIDIA-NeMo:main from kevinhu-nv:duplex-stt-rebased
Feb 26, 2026

@kevinhu-nv (Collaborator) commented Nov 19, 2025

What does this PR do?

Merge duplex STT changes to NeMo main.

Collection: speechlm2

Changelog

  • Added training support for using nano-9b as the LLM backbone
  • Added support for prompt tokens
  • Added streaming ASR support
  • Refactored code, added unit tests, and made other minor changes

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • [x] Did you write any new necessary tests?
  • [x] Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.

Additional Information

  • Related to # (issue)

pzelasko previously approved these changes Feb 24, 2026

@pzelasko (Collaborator) left a comment

LGTM, thanks for your work on this!

Signed-off-by: kevinhu <kevinhu@nvidia.com>

@kevinhu-nv (Collaborator, Author) commented

Thank you, @pzelasko! Rebased onto the latest main again. Can you please approve again?

kevinhu-nv and others added 2 commits February 24, 2026 18:27
Signed-off-by: kevinhu <kevinhu@nvidia.com>
Signed-off-by: kevinhu-nv <kevinhu-nv@users.noreply.github.com>

DuplexS2SSpeechDecoderModel
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
=======
@pzelasko (Collaborator) commented Feb 25, 2026

Why? Looks like an issue during conflict resolution.

@kevinhu-nv (Collaborator, Author) replied

Accidentally added - removed now. Now fixing CI tests. Will let you know when all is ready.

Signed-off-by: kevinhu <kevinhu@nvidia.com>
Signed-off-by: kevinhu <kevinhu@nvidia.com>
…Will do point fix later.

Signed-off-by: kevinhu <kevinhu@nvidia.com>

@pzelasko (Collaborator) left a comment

LGTM

@Edresson merged commit b9f2054 into NVIDIA-NeMo:main on Feb 26, 2026
214 of 216 checks passed

8 participants