Hi, I think this looks like a very interesting project.
However, I'm wondering whether vLLM with kvcached performs worse than vanilla vLLM because of dynamic memory allocation overhead.
Have you measured this overhead by comparing vanilla vLLM against vLLM with kvcached when serving a single model?