Guidance Needed: Orpheus-TTS + TensorRT-LLM Optimization for Concurrency #283

Hi, I have successfully continued pre-training and then fine-tuned the model on high-quality voices.

The next major challenge is using the model in production and serving multiple concurrent users. My use case is a telephony customer support agent. I tried various methods but could not set up an inference pipeline that serves this purpose.

I tried vLLM and TensorRT, and I also tried quantizing the model, but everything seems to fail.
