init commit avalon#60

Merged
Longin-Yu merged 18 commits into THUDM:main from HenryCai11:main
Nov 7, 2023
@HenryCai11 HenryCai11 commented Oct 22, 2023

AvalonBench

Quick Start

Start the task server and the assigner

Start the game (3 is the number of workers)

python -m src.start_task -a --start avalon-dev-single 3

Start the assigner

python -m src.assigner --config ./configs/assignments/test_avalon.yaml

Customize configurations and data

  1. You can modify the file configs/tasks/avalon.yaml to configure the agent list. A config file looks like this:
default:
  module: "src.server.tasks.avalon.AvalonBench"
  parameters:
    num_players: 5
    discussion: False

avalon-dev-naive:
  parameters:
    name: "AvalonBench-dev-naive"
    data_file: "data/avalon/dev.json"
    agent_list: ["naive", "naive", "naive", "naive", "naive"]

avalon-dev-single:
  parameters:
    name: "AvalonBench-dev-single"
    data_file: "data/avalon/dev.json"
    agent_list: ["llm", "naive", "naive", "naive", "naive"]

where naive stands for the naive bots. Each agent plays the role at the same index in role_names in the data file (see below).

Note: There should be only one "llm" in the agent_list.
  2. You can also add data in data/avalon/dev.json (Note: Currently we only support the 5-player game setting, which includes 1 Merlin, 2 Servants, 1 Minion and 1 Assassin). A data item looks like this:
 {
     "num_players": 5,
     "quest_leader": 0,
     "role_names": ["Assassin", "Servant", "Servant", "Merlin", "Minion"]
 }

where quest_leader is the id of the initial quest leader in this game. You can change the game setup by setting quest_leader to any number from 0 to 4, and by permuting role_names.
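
The constraints described above (five players, a fixed role composition, a quest_leader between 0 and 4, and agents paired with roles by index) can be checked before launching a run. The following is a minimal sketch, not code from this PR: validate_setup and seat_assignment are hypothetical helper names, and the "llm" note is read here as "at most one", since the avalon-dev-naive config contains none.

```python
from collections import Counter

# Role composition stated in the description for the 5-player setting.
EXPECTED_ROLES = Counter({"Merlin": 1, "Servant": 2, "Minion": 1, "Assassin": 1})


def validate_setup(item, agent_list):
    """Hypothetical helper: return a list of problems, empty if none were found."""
    problems = []
    roles = item.get("role_names", [])
    leader = item.get("quest_leader")
    if item.get("num_players") != 5:
        problems.append("only the 5-player setting is supported")
    if Counter(roles) != EXPECTED_ROLES:
        problems.append("role_names must hold 1 Merlin, 2 Servants, 1 Minion, 1 Assassin")
    if not isinstance(leader, int) or not 0 <= leader <= 4:
        problems.append("quest_leader must be an integer from 0 to 4")
    if len(agent_list) != len(roles):
        problems.append("agent_list and role_names must have the same length")
    # Assumption: the note is read as "at most one llm agent".
    if agent_list.count("llm") > 1:
        problems.append('agent_list may contain at most one "llm"')
    return problems


def seat_assignment(item, agent_list):
    """Pair each agent with the role at the same index."""
    return list(zip(agent_list, item["role_names"]))


# The data item and the avalon-dev-single agent list shown above.
item = {
    "num_players": 5,
    "quest_leader": 0,
    "role_names": ["Assassin", "Servant", "Servant", "Merlin", "Minion"],
}
agents = ["llm", "naive", "naive", "naive", "naive"]

print(validate_setup(item, agents))
print(seat_assignment(item, agents))
```

With this data item and the avalon-dev-single agent list, the "llm" agent sits at index 0 and therefore plays the Assassin.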

Naive experiment

You can also start a naive experiment using:

python -m src.start_task -a --start avalon-dev-naive 3

where all the agents are naive bots. For details of the naive strategies, please refer to the paper.

Prompts

All the prompts are maintained in src/server/tasks/avalon/prompt.py. You can find the respective prompts in src/server/tasks/avalon/agents/llm_with_discussion.py and src/server/tasks/avalon/wrapper.py.

Results

Results of single-setting games

{
    "total": 20,
    "validation": {
        "running": 0.0,
        "completed": 0.95,
        "agent context limit": 0.0,
        "agent validation failed": 0.05,
        "agent invalid action": 0.0,
        "task limit reached": 0.0,
        "unknown": 0.0,
        "task error": 0.0,
        "average_history_length": 11.0,
        "max_history_length": 14,
        "min_history_length": 2
    },
    "custom": {
        "Win rate of Player 0": 0.15,
        "Avg deduction acc of Player 0": 0.5399999999999998,
        "Valid number of games": 19
    }
}
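
The values under "validation" appear to be fractions of "total", which is consistent with the reported 19 valid games (20 × 0.95). The sketch below converts those fractions back to game counts under that assumption; fraction_to_count is a hypothetical helper, and the JSON is a trimmed copy of the summary above.

```python
import json

# Trimmed copy of the summary above; only the fields used here are kept.
summary = json.loads("""
{
    "total": 20,
    "validation": {"completed": 0.95, "agent validation failed": 0.05},
    "custom": {"Valid number of games": 19}
}
""")


def fraction_to_count(total, fraction):
    """Convert a fraction of games into a whole number of games.

    Assumption: each validation value is a fraction of "total".
    round() absorbs floating-point error in the product.
    """
    return round(total * fraction)


total = summary["total"]
completed = fraction_to_count(total, summary["validation"]["completed"])
failed = fraction_to_count(total, summary["validation"]["agent validation failed"])

print(completed, failed)
```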

@Longin-Yu Longin-Yu merged commit adc728e into THUDM:main Nov 7, 2023
@Xiao9905 Xiao9905 mentioned this pull request Nov 13, 2024