Skip to content

MASWorks/X-MAS

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

X-MAS: Towards Building Multi-Agent Systems with Heterogeneous LLMs

X-MAS

🔥 News

  • [2025/05/23] See our preprint paper in ArXiv.

X-MAS-Bench

  1. Specify your model configs in ./configs/X-MAS_Bench_config.json:
"gpt-4o-mini-2024-07-18": {
        "model_list": [
            {"model_name": "gpt-4o-mini-2024-07-18", "model_url": "http://a.b.c.d:e/v1", "api_key": "xyz"}
        ],
        "max_workers_per_model": 10
    }
  1. Inference on a dataset (the outputs will be saved under "./X-MAS-Bench/results/")
# bash scripts/infer_X-MAS_Bench.sh
python X-MAS-Bench/infer_direct.py --model_name <model_name> --model_config <config_path> --test_dataset_name <dataset_name>
  1. Evaluate on a dataset (the outputs will be saved under "./X-MAS-Bench/results/")
# bash scripts/eval_X-MAS_Bench.sh
python X-MAS-Bench/eval_bench.py --model_name <eval_model_name> --model_config <config_path> --dataset_name <dataset_name> --infer_name <infer_name> --eval_mode bench-test
# We use llama-3.1-70b-instruct as <eval_model_name>

Note that we release the experimental results of the X-MAS-Bench in Google Drive. You can download the .zip file named results.zip to the "./X-MAS-Bench/results/" path and unzip it.

X-MAS-Design

  1. Specify your model configs in ./configs/X-MAS_Design_config.json:
"gpt-4o-mini-2024-07-18": {
        "model_list": [
            {"model_name": "gpt-4o-mini-2024-07-18", "model_url": "http://a.b.c.d:e/v1", "api_key": "xyz"}
        ],
        "max_workers_per_model": 10
    }
  1. Inference on a dataset (the outputs will be saved under "./X-MAS-Design/results/")
# bash scripts/infer_X-MAS_Design.sh

# (Parallel)
python X-MAS-Design/inference_X-MAS.py --method_name <method_name> --model_name <model_name> --test_dataset_name <test_dataset_name> --model_api_config <model_api_config>


# Or (Sequential)
python X-MAS-Design/inference_X-MAS.py --method_name <method_name> --model_name <model_name> --test_dataset_name <test_dataset_name> --model_api_config <model_api_config> --sequential
  1. Evaluate on a dataset (the outputs will be saved under "./X-MAS-Design/results/")
bash scripts/eval_X-MAS_Design.sh

Citation

@article{ye2025x,
  title={X-MAS: Towards Building Multi-Agent Systems with Heterogeneous LLMs},
  author={Ye, Rui and Liu, Xiangrui and Wu, Qimin and Pang, Xianghe and Yin, Zhenfei and Bai, Lei and Chen, Siheng},
  journal={arXiv preprint arXiv:2505.16997},
  year={2025}
}

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published