
Dasheng Denoiser

Official PyTorch inference code for the Interspeech 2025 paper:
Efficient Speech Enhancement via Embeddings from Pre-trained Generative Audioencoders


System Framework

This paper introduces an efficient and extensible speech enhancement method. Audio embeddings are first extracted from the noisy speech by a pre-trained audioencoder and denoised by a compact encoder network; a vocoder then synthesizes the clean speech from the denoised embeddings. An ablation study confirms that combining the compact denoise encoder with a pre-trained audioencoder and vocoder is parameter-efficient.

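The three-stage pipeline above can be sketched in plain Python. The function names, frame size, and thresholding rule below are illustrative stand-ins, not the repository's API: the real system uses a pre-trained Dasheng audioencoder, a trained compact denoise encoder, and a Vocos-style vocoder in place of these toy stages.

```python
# Toy sketch of the pipeline: audioencoder -> denoise encoder -> vocoder.
# All three stages here are hypothetical placeholders for illustration only.

def extract_embeddings(wav, frame_size=4):
    """Audioencoder stand-in: group waveform samples into fixed-size frames."""
    return [wav[i:i + frame_size] for i in range(0, len(wav), frame_size)]

def denoise_embeddings(embeddings, threshold=0.1):
    """Denoise-encoder stand-in: suppress small, noise-like values per frame."""
    return [[x if abs(x) > threshold else 0.0 for x in frame]
            for frame in embeddings]

def synthesize(embeddings):
    """Vocoder stand-in: map denoised embeddings back to a waveform."""
    return [x for frame in embeddings for x in frame]

noisy = [0.9, 0.05, -0.8, 0.02, 0.7, -0.03, -0.6, 0.01]
clean = synthesize(denoise_embeddings(extract_embeddings(noisy)))
assert len(clean) == len(noisy)  # the pipeline preserves signal length
```

The point of the structure is that only the middle stage (the denoise encoder) needs to be trained; the audioencoder and vocoder are reused pre-trained components, which is where the parameter efficiency comes from.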

Pre-trained Model and Demos

| model            | # parameters | demos     |
|------------------|--------------|-----------|
| dasheng-denoiser | 118.5 M      | demo page |

Installation and Usage

pip install git+https://github.com/xiaomi-research/dasheng-denoiser.git

# If you haven't downloaded the pre-trained checkpoint, just give the model name
# and it will be downloaded automatically:
dasheng-denoiser -i path/to/noisy_wav_dir -o path/to/output_dir
# If you have already downloaded the checkpoint, point to it on disk:
dasheng-denoiser -i path/to/noisy_wav_dir -o path/to/output_dir -m path/to/xx.pt

Acknowledgements

Our implementation draws on Dasheng and Vocos.

Citation

@inproceedings{xingwei2025dashengdenoiser,
  title={Efficient Speech Enhancement via Embeddings from Pre-trained Generative Audioencoders},
  author={Xingwei Sun and Heinrich Dinkel and Yadong Niu and Linzhang Wang and Junbo Zhang and Jian Luan},
  booktitle={Interspeech 2025},
  year={2025}
}
