Guidance Needed: Orpheus-TTS + TensorRT-LLM Optimization for Concurrency #283

@shekharmeena2896

Description

Hi, I have successfully continued pre-training the model and then fine-tuned it on high-quality voices.

The next major challenge is using the model in production to serve multiple concurrent users. My use case is a telephony customer support agent. I tried various methods but could not set up an inference pipeline that serves this purpose.

I tried vLLM and TensorRT, and I also tried quantising the model, but everything seems to fail.
