Hi authors,
Thank you for the impressive work presented in your paper. We are very interested in your research and are currently working to reproduce its experimental results.
The paper references MuxServe++ as part of its methodology. However, when we tried to run the publicly available version of MuxServe, we found its support for newer models such as Llama3-8B to be limited.
To ensure our reproduction aligns with your findings, we would appreciate it if you could clarify:
1. How were the experiments involving newer models conducted in your study?
2. Is the MuxServe++ implementation, or the specific code used for these experiments, available or planned for release?
Any guidance or access to the relevant codebase would be greatly appreciated.