Support batch processing for OpenAI API compatible requests #659
ravi03071991 wants to merge 2 commits into EvolvingLMMs-Lab:main from
Conversation
kcz358 left a comment
Hi @mickqian @ravi03071991, thank you for your contribution. Do you think it would be more appropriate to put the changes in this file
https://github.com/EvolvingLMMs-Lab/lmms-eval/blob/main/lmms_eval/models/batch_gpt4.py
instead of the openai_compatible.py one? When we use that file, we may be testing a self-hosted server, such as one running vLLM or SGLang or any other OpenAI-compatible stack, which may not necessarily implement the Batch API.
Thanks @kcz358. Are you suggesting that we create a new model file, or would you prefer that we update the existing one?
Also, @kcz358, the OpenAI client supports batch requests, so any OpenAI-compatible self-hosted serving solution, such as vLLM or SGLang, should support batch requests by default.
I think vLLM lacks the endpoint used here:

```python
batch_input_file = client.files.create(
    file=open("batchinput.jsonl", "rb"),
    purpose="batch"
)
```

I investigated this a while ago, and I just tried again with vLLM and found that this still does not work. I remember it works with SGLang because they implement that endpoint. So I believe the best way is still to change the code there.
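For context, the Batch API workflow discussed above starts from a JSONL file in which each line is one request. A minimal sketch of building that payload is below; the model name, prompts, and `request-{i}` IDs are illustrative, and the `client.files.create` / `client.batches.create` calls are shown only as comments since they need a live server:

```python
import json

def build_batch_input(prompts, model="gpt-4o-mini"):
    """Build the JSONL payload for an OpenAI Batch API file upload.

    Each line is one chat-completion request; `custom_id` lets results
    be matched back to their request later. Model name is illustrative.
    """
    lines = []
    for i, prompt in enumerate(prompts):
        lines.append(json.dumps({
            "custom_id": f"request-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }))
    return "\n".join(lines)

# The endpoints a compatible backend would need to implement (not run here,
# since they require a live server and an `openai.OpenAI()` client):
# batch_input_file = client.files.create(
#     file=open("batchinput.jsonl", "rb"), purpose="batch"
# )
# batch = client.batches.create(
#     input_file_id=batch_input_file.id,
#     endpoint="/v1/chat/completions",
#     completion_window="24h",
# )
```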
Yeah, that makes sense. I just tested it and realized it's not supported on their end. I'll go ahead and update the code there.
Hi @kcz358, the default output format of the OpenAI Batch API seems quite different from the batch output of the SGLang OpenAI client. You can check the OpenAI Batch output here.

To extract the result, we currently need an extra unwrapping step. Just wondering, would it make sense to update it accordingly?
Hi @ravi03071991, do you think it is okay to do an if/else check here to handle both output formats?
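Such an if/else check could look roughly like the sketch below. The field layouts are assumptions based on the two formats discussed (the OpenAI Batch API wraps each completion in a `response.body` envelope, while a direct client-style output returns the completion object itself), not the actual lmms-eval code:

```python
def extract_content(result: dict) -> str:
    """Return the assistant message from one batch result entry.

    Handles two assumed shapes:
    - OpenAI Batch API: {"custom_id": ..., "response": {"body": {...}}}
    - Direct completion object (SGLang-style): {"choices": [...]}
    """
    if "response" in result:
        # OpenAI Batch API envelope: unwrap to the completion body.
        body = result["response"]["body"]
    else:
        # Assume the entry is already a completion object.
        body = result
    return body["choices"][0]["message"]["content"]
```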
Yeah, I think so. I raised a PR on SGLang to fix it. Once it's merged, I will update this PR accordingly.
@ravi03071991 Hi Ravi, can you link your PR here?
PR: Fixes batch with single request error
Hi @ravi03071991, how is the PR in SGLang going? If it is not going to be merged, do you think it would be better to hard-code the handling instead?
The SGLang PR is taking time, so I will close this PR for now and reopen it when things are ready. Thank you.
This PR adds support for batch processing with OpenAI-API-compatible requests.
Currently, requests are processed with batch_size=1; this PR makes it possible to compute metrics with larger batch sizes.
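As a generic illustration of moving beyond batch_size=1, requests can be grouped into configurable-size batches before submission. This is a sketch of the general technique, not the PR's actual implementation:

```python
def chunked(requests, batch_size):
    """Yield successive slices of `requests` of length `batch_size`.

    The final slice may be shorter when len(requests) is not a
    multiple of batch_size.
    """
    for i in range(0, len(requests), batch_size):
        yield requests[i:i + batch_size]
```

Each yielded chunk could then be submitted as one batch job, and its results merged back into the metric computation in order.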