# 🦄 Unicorn: A Universal and Collaborative Reinforcement Learning Approach Toward Generalizable Network-Wide Traffic Signal Control
Official implementation of Unicorn, accepted in IEEE Transactions on Intelligent Transportation Systems (T-ITS).
Yifeng Zhang, Yilin Liu, Ping Gong, Peizhuo Li, Mingfeng Fan, Guillaume Sartoretti

MARMot Lab @ National University of Singapore
- Highlights
- Requirements
- Installation
- Project Structure
- Supported Datasets
- Configuration
- Training
- Testing
- Evaluation with Non-RL Baselines
- Citation
- License
## Highlights

- **Unified Traffic Movement Representation**: A traffic movement-based state-action representation that unifies intersection states and signal phases across different intersection topologies.
- **Universal Traffic Representation (UTR) Module**: A decoder-only feature extraction architecture with cross-attention, designed to capture general traffic features across different intersections.
- **Intersection-Specific Representation (ISR) Module**: A feature extraction module combining a Variational Autoencoder (VAE) and contrastive learning to capture intersection-specific characteristics.
- **Collaborative Multi-Intersection Learning**: An attention-based coordination mechanism that adaptively models state-action dependencies among neighboring intersections for scalable network-level signal control.
- **Evaluation on Diverse Traffic Networks**: Experiments conducted on eight traffic datasets in SUMO, including three synthetic traffic networks and five real-world city-scale networks, supporting both single-scenario training and multi-scenario joint training.
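To illustrate the cross-attention pattern the UTR module builds on, here is a toy NumPy sketch: a fixed set of learnable queries attends over a variable number of movement features, producing a fixed-size summary regardless of intersection topology. The shapes and names below are illustrative only, not taken from the repo.

```python
import numpy as np

# Toy cross-attention sketch (illustrative, not the repo's UTR implementation):
# fixed-size queries attend over a variable-length set of movement features.
def cross_attention(Q, K, V):
    """Q: (nq, d), K/V: (nk, d) -> (nq, d) attention summary."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                      # (nq, nk) similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
phase_queries = rng.normal(size=(8, 16))     # e.g. 8 query slots (hypothetical)
movement_feats = rng.normal(size=(12, 16))   # 12 movements at this intersection
out = cross_attention(phase_queries, movement_feats, movement_feats)
assert out.shape == (8, 16)  # fixed-size output regardless of movement count
```

The key property for generalization is that the output shape depends only on the number of queries, not on how many movements an intersection has.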
## Requirements

| Dependency | Version |
|---|---|
| Python | ≥ 3.8 |
| SUMO | ≥ 1.16.0 |
| PyTorch | 1.13.0 (CUDA 11.7) |
| Ray | 2.3.1 |
| Gym | 0.26.2 |
| SciPy | 1.10.1 |
| einops | 0.6.0 |
| NumPy | 1.24.2 |
| TensorBoard | 2.13.0 |
> [!NOTE]
> Different PyTorch and CUDA versions may affect training performance and reproducibility. The code is tested with PyTorch 1.13.0 + CUDA 11.7.
## Installation

Clone the repository:

```bash
git clone https://github.com/marmotlab/Unicorn.git
cd Unicorn
```

Create and activate a new conda environment:

```bash
conda create -n unicorn python=3.8 -y
conda activate unicorn
```

Install PyTorch first, selecting the command that matches your CUDA version:

```bash
# CUDA 11.6
pip install torch==1.13.0+cu116 torchvision==0.14.0+cu116 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu116

# CUDA 11.7
pip install torch==1.13.0+cu117 torchvision==0.14.0+cu117 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu117

# CPU only
pip install torch==1.13.0+cpu torchvision==0.14.0+cpu torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cpu
```

Then install the remaining dependencies:

```bash
pip install -r requirements.txt
```
Finally, set up SUMO:

1. Download and install SUMO by following the official instructions at: https://sumo.dlr.de/docs/Downloads.php
2. Set the `SUMO_HOME` environment variable. Refer to the SUMO Basic Computer Skills guide for detailed instructions on setting SUMO environment variables.
3. Verify the installation:

```bash
sumo --version
```

## Project Structure

```
Unicorn/
├── driver_unicorn.py          # Main training script (gradient updates & PPO optimization)
├── runner_unicorn.py          # Distributed experience collection via Ray workers
├── evaluator_rl.py            # Evaluation script for RL-based models (Unicorn)
├── evaluator_non_rl.py        # Evaluation script for non-RL baselines (Fixed, Greedy, Pressure)
├── parameters.py              # All training, simulation & experiment configurations
├── utils.py                   # Utility functions
├── requirements.txt           # Python dependencies
│
├── models/
│   └── Unicorn.py             # Unicorn network architecture (Actor-Critic)
│
├── env/
│   ├── matsc.py               # Multi-Agent TSC Gym environment (SUMO interface)
│   └── tls.py                 # Traffic light signal controller module
│
├── maps/                      # SUMO network datasets & configuration files
│   ├── grid_network_5_5/      # Synthetic 5×5 Grid (MA2C)
│   ├── monaco_network_30/     # Real-world Monaco, 30 intersections (MA2C)
│   ├── cologne_network_8/     # Real-world Cologne, 8 intersections (RESCO)
│   ├── ingolstadt_network_21/ # Real-world Ingolstadt, 21 intersections (RESCO)
│   ├── grid_network_4_4/      # Synthetic 4×4 Grid (RESCO)
│   ├── arterial_network_4_4/  # Synthetic 4×4 Arterial (RESCO)
│   ├── shaoxing_network_7/    # Real-world Shaoxing, 7 intersections (GESA)
│   ├── shenzhen_network_29/   # Real-world Shenzhen, 29 intersections (GESA)
│   └── shenzhen_network_55/   # Real-world Shenzhen, 55 intersections (GESA)
│
└── images/
    └── framework.png          # Framework overview figure
```
## Supported Datasets

Unicorn is evaluated on 8 SUMO traffic network scenarios from three benchmark suites:

| Benchmark | Network | # Intersections | Type |
|---|---|---|---|
| MA2C | `grid_network_5_5` | 25 | Synthetic |
| MA2C | `monaco_network_30` | 30 | Real-world |
| RESCO | `cologne_network_8` | 8 | Real-world |
| RESCO | `ingolstadt_network_21` | 21 | Real-world |
| RESCO | `grid_network_4_4` | 16 | Synthetic |
| RESCO | `arterial_network_4_4` | 16 | Synthetic |
| GESA | `shaoxing_network_7` | 7 | Real-world |
| GESA | `shenzhen_network_29` | 29 | Real-world |
References:
- MA2C: Chu, T., Wang, J., CodecΓ , L., & Li, Z. (2020). Multi-Agent Deep Reinforcement Learning for Large-Scale Traffic Signal Control. IEEE T-ITS.
- RESCO: Ault, J., & Sharon, G. (2021). Reinforcement Learning Benchmarks for Traffic Signal Control. NeurIPS Datasets & Benchmarks.
- GESA: Jiang, H., et al. (2024). A General Scenario-Agnostic Reinforcement Learning for Traffic Signal Control. IEEE T-ITS.
## Configuration

All configurations are centralized in `parameters.py`. Below are the key parameters:

| Parameter | Description | Default |
|---|---|---|
| `MAX_EPISODES` | Total number of training episodes | 3000 |
| `NUM_META_AGENTS` | Number of parallel Ray worker processes | 6 |
| `LOAD_MODEL` | Whether to resume training from a checkpoint | False |
| `EXPERIMENT_PATH` | Path to the checkpoint experiment (for resuming) | None |
| `CO_TRAIN` | Enable multi-scenario co-training | False |
> [!IMPORTANT]
> When switching between single-scenario and multi-scenario training modes, make sure to set `CO_TRAIN` accordingly in `INPUT_PARAMS` and configure the appropriate dataset(s).
## Training

### Single-Scenario Training

In single-scenario mode, the model trains on one specific traffic network at a time. This is the default mode.
Step 1: Configure `parameters.py`:

```python
class INPUT_PARAMS:
    MAX_EPISODES = 3000    # Total training episodes
    NUM_META_AGENTS = 6    # Number of parallel workers
    CO_TRAIN = False       # ⬅️ Set to False for single-scenario

class SUMO_PARAMS:
    NET_NAME = 'grid_network_5_5'    # ⬅️ Choose your target dataset
```

Step 2: Launch training:

```bash
python driver_unicorn.py
```

> [!TIP]
> Recommended configurations by dataset:
| Dataset | Green / Yellow Duration | Teleport Time (s) |
|---|---|---|
| MA2C networks | 10s / 3s | 300 |
| RESCO networks | 15s / 5s | -1 |
| GESA networks | 15s / 5s | 600 |
### Multi-Scenario Co-Training

In multi-scenario co-training mode, the model trains simultaneously across multiple traffic networks, with different workers running different scenarios. This enables cross-domain generalization.

Step 1: Configure `parameters.py`:
```python
class INPUT_PARAMS:
    MAX_EPISODES = 3000
    NUM_META_AGENTS = 6    # Each worker trains on a different scenario
    CO_TRAIN = True        # ⬅️ Set to True for multi-scenario

class SUMO_PARAMS:
    ALL_DATASETS = [       # ⬅️ Define the scenarios to co-train on
        'cologne_network_8',
        'ingolstadt_network_21',
        'arterial_network_4_4',
        'grid_network_4_4',
        'shaoxing_network_7',
        'shenzhen_network_29',
    ]
```

> [!NOTE]
> In co-training mode:
> - Each worker (indexed by `server_number`) is assigned a different dataset from `ALL_DATASETS`.
> - `NUM_META_AGENTS` should match the number of datasets in `ALL_DATASETS`.
> - The observation and action spaces are automatically padded to the maximum dimensions across all scenarios (max movement dim = 36, max phase dim = 8, agent space = 97).
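The padding across scenarios can be pictured with a minimal sketch like the following. This is an illustration of the idea only, not the repo's actual preprocessing code; the function names and the per-movement feature layout are assumptions.

```python
import numpy as np

# Illustrative sketch of cross-scenario padding (not the repo's actual code):
# zero-pad per-intersection observations to the maximum movement dimension,
# and mask out action slots for phases this intersection does not have.
MAX_MOVEMENT_DIM = 36   # max movement dim across all scenarios
MAX_PHASE_DIM = 8       # max number of signal phases

def pad_observation(obs: np.ndarray) -> np.ndarray:
    """Pad a (num_movements, feat) observation to (MAX_MOVEMENT_DIM, feat)."""
    padded = np.zeros((MAX_MOVEMENT_DIM, obs.shape[1]), dtype=obs.dtype)
    padded[:obs.shape[0]] = obs
    return padded

def phase_mask(num_phases: int) -> np.ndarray:
    """Boolean mask over MAX_PHASE_DIM action slots; extra slots are invalid."""
    mask = np.zeros(MAX_PHASE_DIM, dtype=bool)
    mask[:num_phases] = True
    return mask

obs = np.random.rand(12, 4)   # e.g. an intersection with 12 movements, 4 features each
assert pad_observation(obs).shape == (36, 4)
assert phase_mask(4).sum() == 4
```

Padding to fixed maxima is what lets one network process intersections of different topologies in the same batch.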
Step 2: Launch training:

```bash
python driver_unicorn.py
```

Training logs are automatically recorded with TensorBoard:

```bash
tensorboard --logdir ./Train_MATSC/<EXPERIMENT_NAME>/train
```

Key metrics tracked:
- Policy Loss, Value Loss, Entropy Loss
- Actor/Critic VAE Loss & Contrastive Loss
- Episode Reward, Episode Length, Action Change Rate
### Resuming Training

To resume training from a saved checkpoint:
```python
class INPUT_PARAMS:
    LOAD_MODEL = True
    EXPERIMENT_PATH = './Train_MATSC/<YOUR_EXPERIMENT_NAME>'  # ⬅️ Path to saved experiment
```

## Testing

After training, evaluate the trained model on specific test scenarios.
Step 1: Configure the test settings in `evaluator_rl.py`:

```python
# Set the experiment directory and model path
exp_dir = './Test'
agent_name_list = ['UNICORN']
model_path_list = ['./Train_MATSC/<EXPERIMENT_NAME>/model/checkpoint<EPISODE>.pkl']
```

Step 2: Ensure the corresponding map and flow settings in `parameters.py` match the training configuration:

```python
class SUMO_PARAMS:
    NET_NAME = 'grid_network_5_5'    # ⬅️ Must match the map used during training
```

Step 3: Run evaluation:

```bash
python evaluator_rl.py
```

Step 4: After testing, the results (traffic data & trip data) will be saved in:
```
./Test/eval_data/
├── <map_name>_UNICORN_traffic.csv   # Traffic metrics per timestep
└── <map_name>_UNICORN_trip.csv      # Individual vehicle trip info
```
## Evaluation with Non-RL Baselines

Unicorn includes built-in non-RL baseline evaluators for comparison:

| Baseline | Description |
|---|---|
| `FIXED` | Fixed-time signal plan |
| `GREEDY` | Greedy policy based on queue length |
| `PRESSURE` | Max-pressure-based control |
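As a rough illustration of the max-pressure idea behind the `PRESSURE` baseline (a sketch of the general technique, not the repo's evaluator code): each phase's pressure is the sum, over the movements it serves, of upstream queue length minus downstream queue length, and the controller activates the highest-pressure phase.

```python
# Illustrative max-pressure control sketch (not the repo's implementation).
# A movement's pressure is its upstream queue minus its downstream queue;
# the controller picks the phase whose served movements have the largest
# total pressure.
def max_pressure_phase(phases, upstream_queue, downstream_queue):
    """phases: {phase_id: [movement_id, ...]}; queues: {movement_id: int}."""
    def pressure(phase_id):
        return sum(upstream_queue[m] - downstream_queue[m]
                   for m in phases[phase_id])
    return max(phases, key=pressure)

# Toy intersection: phase 0 serves north-south, phase 1 serves east-west.
phases = {0: ['NS', 'SN'], 1: ['EW', 'WE']}
up = {'NS': 8, 'SN': 5, 'EW': 2, 'WE': 1}
down = {'NS': 1, 'SN': 0, 'EW': 0, 'WE': 0}
assert max_pressure_phase(phases, up, down) == 0  # N-S pressure 12 beats E-W's 3
```

The actual baseline in `evaluator_non_rl.py` may differ in how it measures queues and constrains phase switching.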
Configure which baselines to run in `evaluator_non_rl.py`:

```python
agent_name_list = ['FIXED', 'GREEDY', 'PRESSURE']
```

Then run the baseline evaluations:

```bash
python evaluator_non_rl.py
```

## Citation

If you find this code useful in your research, please consider citing our paper:
@ARTICLE{11360985,
author={Zhang, Yifeng and Liu, Yilin and Gong, Ping and Li, Peizhuo and Fan, Mingfeng and Sartoretti, Guillaume},
journal={IEEE Transactions on Intelligent Transportation Systems},
title={Unicorn: A Universal and Collaborative Reinforcement Learning Approach Toward Generalizable Network-Wide Traffic Signal Control},
year={2026},
volume={},
number={},
pages={1-17},
keywords={Collaboration;Topology;Network topology;Feature extraction;Vectors;Urban areas;Real-time systems;Training;Reinforcement learning;Scalability;Generalizable adaptive traffic signal control;multi-agent reinforcement learning;contrastive learning},
doi={10.1109/TITS.2026.3653478}}
You may also find our related work useful:
@INPROCEEDINGS{10801524,
author={Zhang, Yifeng and Li, Peizhuo and Fan, Mingfeng and Sartoretti, Guillaume},
booktitle={2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
title={HeteroLight: A General and Efficient Learning Approach for Heterogeneous Traffic Signal Control},
year={2024},
volume={},
number={},
pages={1010-1017},
keywords={Measurement;Network topology;Urban areas;Reinforcement learning;Feature extraction;Vectors;Robustness;Topology;Optimization;Intelligent robots},
doi={10.1109/IROS58592.2024.10801524}}
## License

This project is licensed under the MIT License - see the LICENSE file for details.

© 2026 MARMot Lab @ NUS-ME
⭐ If you find this project useful, please consider giving it a star! ⭐
