Can SGLang load model weights from an existing instance with Ckpt Engine? #12262
I'm trying #11755 and #12216 to speed up loading model weights, and the workflow for initializing an SGLang instance as a ckpt engine worker is clear. But I'm confused about the P2P mode, which is described as: "Used when new inference instances are dynamically added (due to restarts or dynamic availability) while the existing instances are already serving requests." And I know
Replies: 1 comment
It is resolved.
MoonshotAI/checkpoint-engine#42