- [2025/05/23] See our preprint paper in ArXiv.
- Specify your model configs in
./configs/X-MAS_Bench_config.json:
"gpt-4o-mini-2024-07-18": {
"model_list": [
{"model_name": "gpt-4o-mini-2024-07-18", "model_url": "http://a.b.c.d:e/v1", "api_key": "xyz"}
],
"max_workers_per_model": 10
}
- Inference on a dataset (the outputs will be saved under "./X-MAS-Bench/results/")
# bash scripts/infer_X-MAS_Bench.sh
python X-MAS-Bench/infer_direct.py --model_name <model_name> --model_config <config_path> --test_dataset_name <dataset_name>
- Evaluate on a dataset (the outputs will be saved under "./X-MAS-Bench/results/")
# bash scripts/eval_X-MAS_Bench.sh
python X-MAS-Bench/eval_bench.py --model_name <eval_model_name> --model_config <config_path> --dataset_name <dataset_name> --infer_name <infer_name> --eval_mode bench-test
# We use llama-3.1-70b-instruct as <eval_model_name>
Note that we release the experimental results of the X-MAS-Bench in Google Drive. You can download the .zip file named results.zip to the "./X-MAS-Bench/results/" path and unzip it.
- Specify your model configs in
./configs/X-MAS_Design_config.json:
"gpt-4o-mini-2024-07-18": {
"model_list": [
{"model_name": "gpt-4o-mini-2024-07-18", "model_url": "http://a.b.c.d:e/v1", "api_key": "xyz"}
],
"max_workers_per_model": 10
}
- Inference on a dataset (the outputs will be saved under "./X-MAS-Design/results/")
# bash scripts/infer_X-MAS_Design.sh
# (Parallel)
python X-MAS-Design/inference_X-MAS.py --method_name <method_name> --model_name <model_name> --test_dataset_name <test_dataset_name> --model_api_config <model_api_config>
# Or (Sequential)
python X-MAS-Design/inference_X-MAS.py --method_name <method_name> --model_name <model_name> --test_dataset_name <test_dataset_name> --model_api_config <model_api_config> --sequential
- Evaluate on a dataset (the outputs will be saved under "./X-MAS-Design/results/")
bash scripts/eval_X-MAS_Design.sh
@article{ye2025x,
title={X-MAS: Towards Building Multi-Agent Systems with Heterogeneous LLMs},
author={Ye, Rui and Liu, Xiangrui and Wu, Qimin and Pang, Xianghe and Yin, Zhenfei and Bai, Lei and Chen, Siheng},
journal={arXiv preprint arXiv:2505.16997},
year={2025}
}
