Name	Name	Last commit message	Last commit date
parent directory ..
custom_leaderboard	custom_leaderboard
docs	docs
original_leaderboard	original_leaderboard
results	results
team_code	team_code
tools	tools
.gitignore	.gitignore
.style.yapf	.style.yapf
README.md	README.md
environment.yml	environment.yml
evaluate_routes_slurm.py	evaluate_routes_slurm.py
max_num_jobs.txt	max_num_jobs.txt
pylintrc	pylintrc
run_evaluation_slurm.sh	run_evaluation_slurm.sh
setup_carla.sh	setup_carla.sh

CaRL - CARLA

This folder contains the code to train and evaluate RL agents with the CARLA leaderboard 2.0. In general, we recommend reading the appendix of the paper if you want to use the code, since it explains many technical details necessary to understand the code.

Structure:

custom_leaderboard contains a modified version of the CARLA leaderboard 2.0 that is much faster for RL training than the original. It is used to train models.
original_leaderboard contains the original version of the CARLA leaderboard 2.0. It is used for evaluating models.
results contains model files
team_code contains the training and evaluation files.
tools contains various scripts for generating training routes or aggregating results.

News

[09/Oct/2025] Released the CaRL version 1.1 model. See release notes.
[10/Aug/2025] Initial code release v1.0 finished

Performance on longest6v2

Method	Type	DS ↑	RC ↑	Ped ↓	Veh ↓	MS ↓	Time ↓
PDM-Lite	Rule	73	100	0.00	0.18	7.67	18
—	—	—	—	—	—	—	—
PlanT	IL	62	96	0.07	0.43	3.18	18
Think2Drive	RL	7	39	0.97	2.55	18.05	6
Roach	RL	22	77	0.45	2.42	7.12	7
CaRL v1.0	RL	64	82	0.01	0.36	1.71	8
CaRL v1.1	RL	73	83	0.01	0.17	1.67	8

Setup
Pre-Trained Models
Local Debugging
Benchmarking
Training
C++ code
Scenario Generation

Setup

Clone the repo, setup CARLA 0.9.15, and build the conda environment:

git clone https://github.com/autonomousvision/CaRL.git
cd CaRL
chmod +x setup_carla.sh
./setup_carla.sh
conda env create --file=environment.yml
conda activate carl

Before running the code, you will need to add the following paths to PYTHONPATH on your system:

export CARLA_ROOT=/path/to/CARLA/root
export WORK_DIR=/path/to/CaRL/CARLA
export PYTHONPATH=$PYTHONPATH:${CARLA_ROOT}/PythonAPI/carla
export SCENARIO_RUNNER_ROOT=${WORK_DIR}/scenario_runner
export LEADERBOARD_ROOT=${WORK_DIR}/leaderboard
export PYTHONPATH="${CARLA_ROOT}/PythonAPI/carla/":"${SCENARIO_RUNNER_ROOT}":"${LEADERBOARD_ROOT}":${PYTHONPATH}

You can add this in your shell scripts or directly integrate it into your favorite IDE.
E.g. in PyCharm: Settings -> Project -> Python Interpreter -> Show all -> garage (need to add from existing conda environment first) -> Show Interpreter Paths -> add all the absolute paths above (without pythonpath).

Pre-Trained Models

We provide a set of pretrained model weights. For CaRL, we provide 2 seeds for a Python PyTorch model and 1 seed for a C++ libtorch model. For Roach, we provide 5 python seeds. The last number indicates the seed. All models are licensed under the same license as the code.

Each folder has an config.json containing all hyperparameters, which will automatically be loaded and override default hyperparameters. The folder also contains a model_final.pth which is the model from the last PPO iteration, and an optimizer_final.pth which are the Adam parameters from the last iteration.

Local Debugging

To debug evaluation or training, you need to start a CARLA server (or multiple):

cd /path/to/CARLA/root
./CarlaUE4.sh

CaRL does not use sensor data, so if you do not want to use the spectator camera of CARLA, you can start CARLA in CPU mode to save compute using the -nullrhi option. Additionally, you might want to set the -carla-rpc-port=2000, -nosound -carla-streaming-port=6000 options in particular if you run multiple servers or non-default ports. By default, CARLA allocates a number of threads that is proportional to the number of cores on the CPU. For high-end server CPUs this can be way too many threads. Fortunately CARLA has features to control this albeit undocumented at the moment. For training, we use these additional options: -RPCThreads=2 -StreamingThreads=2 -SecondaryThreads=2 -nothreading that limit the number of parallel threads.

Evaluation Debugging

To evaluate a model, run leaderboard_evaluator.py as the main Python file.

Set the --agent-config option to a folder containing a config.json and model_final.pth files, e.g. CaRL.
Set the --agent to eval_agent.py.
The --routes option should be set to a route file, for example, debug.xml.
Set --checkpoint to /path/to/results/result.json

The inference model code can be configured using the following environment variables. The default values are set for CaRL. To evaluate Roach, set SAMPLE_TYPE=roach. The CPP and singularity values are only needed when running a C++ model.

export DEBUG_ENV_AGENT=0 # Produce debug outputs
export SAVE_PATH=/path/to/save/dir # Folder to save debug output in
export RECORD=0 # Record infraction clips
export SAVE_PNG=0 # Save higher quality individual debug frames in PNG. Otherwise video is saved. 
export UPSCALE_FACTOR=1  # Render higher resolution debug for paper
export SCENARIO_RUNNER_ROOT=/path/to/original_leaderboard/scenario_runner # Set to the scenario runner root. Important to set otherwise scenarios don't work.
export CUBLAS_WORKSPACE_CONFIG=:4096:8  # Needed for deterministic model forward passes. Increases GPU memory consumption.

export HIGH_FREQ_INFERENCE=0 # Can run model at 20 Hz, didn't see any benefit
export NO_CARS=0 # Removes all other cars for debugging, do not use during evaluation.
export SAMPLE_TYPE=mean # How to make action deterministic. Options: mean, sample, roach. Use mean for CaRL and roach for roach
export ROUTES=debug  # Name for visualization folder
export CPP=0  # Set to 1 when evaluating a C++ model.
export CPP_PORT=5555 # Port over which to do communication with C++ code
export PPO_CPP_INSTALL_PATH=/path/to/build_folder # Path to folder containing c++ binary (ppo_carla_inference)
export PATH_TO_SINGULARITY=/path/to/ppo_cpp.sif # Path to singularity container file ppo_cpp.sif
export PYTORCH_KERNEL_CACHE_PATH=~/.cache/torch # Path to pytorch cache

Training Debugging

To train a model, you need to start two Python processes one for the leaderboard and one for the training code; they communicate via message passing. CARLA leaderboard:

python ${WORK_DIR}/custom_leaderboard/leaderboard/leaderboard/leaderboard_evaluator.py --routes ${WORK_DIR}/custom_leaderboard/leaderboard/data/debug_routes_with_scenarios/route_Town03_00.xml.gz --agent ${WORK_DIR}/team_code/env_agent.py --resume 0 --checkpoint ${WORK_DIR}/results/debug_00.json --track MAP --port 2000 --traffic-manager-port 8000 --agent-config /home/jaeger/ordnung/internal/CaRL/CARLA/results/PPO_debug --gym_port 5555 --debug 0 --repetitions 100 --frame_rate 10.0 --no_rendering_mode False --timeout 900 --skip_next_route False --runtime_timeout 900

training script dd_ppo.py: The training script is highly configurable with many parameters. You can find the documentation at the start of dd_ppo.py. The parameters here do not correspond to any particular model and are just meant to get the code running for debugging.

torch.distributed.run --nnodes=1 --nproc_per_node=1 --max_restarts=0 --rdzv-backend=c10d --rdzv-endpoint=localhost:0 ${WORK_DIR}/team_code/dd_ppo.py --num_envs_per_gpu 1 --use_dd_ppo_preempt False --exp_name DD_PPO_debug --tcp_store_port 7000 --logdir ${WORK_DIR}/results/ --total_batch_size 512 --total_minibatch_size 128 --update_epochs 3 --total_timesteps 10000000 --reward_type simple_reward --debug 1 --debug_type save --ports 5555

Benchmarking

CaRL is currently trained with the 6 scenarios for the longest6 v2 benchmark. To evaluate CaRL efficiently, we parallelize evaluation with multiple GPUs using the evaluate_routes_slurm.py script. To use it you need to change the various paths in the command line. It is built for a SLURM cluster with many cheap consumer GPUs and started using the run_evaluation_slurm.sh. If your cluster has a different structure or job scheduler, you can use this script to write your own. In particular, for clusters with few expensive GPUs, it can be beneficial to evaluate multiple models per GPU at the same time instead. Since CARLA can be run in CPU mode and CaRL is quite a fast model, longest6 v2 evaluates much faster than is typical with sensorimotor agents.

Longest6 v2

Longest6 is a benchmark consisting of 36 medium-length routes (~1-2 km) from leaderboard 1.0 in towns 1-6. We have adapted the benchmark to the new CARLA version 0.9.15 and leaderboard/scenario runner code. The benchmark features the 7 scenario types from leaderboard 1.0 (now 6 scenarios implemented with the leaderboard 2.0 logic). The scenario descriptions were created by converting the leaderboard 1.0 scenarios with the CARLA route bridge converter. It can serve as a benchmark with intermediate difficulty. Note that the results of models on Longest6 v2 are not directly comparable to the leaderboard 1.0 longest6 numbers. The benchmark can be found here and the individual route files here. Unlike the leaderboard 1.0 version, there are no modifications to the CARLA leaderboard code. Longest6 is a training benchmark, so training on Town 01-06 is allowed.

Result aggregation

To aggregate the results of a parallel evaluation into a single CSV we provide the result_parser.py script.

python ${WORK_DIR}/tools/result_parser.py --results /path/to/folder/containing_json_files --xml ${WORK_DIR}/custom_leaderboard/leaderboard/data/longest6.xml

Training

The main training algorithm is in dd_ppo.py. It contains a custom DD_PPO implementation that is based on the PPO code in CleanRL. We are using an RL-optimized leaderboard in this project. The official CARLA 0.9.15 server has various problems for RL training, such as that it occasionally crashes due to bugs. For that reason we use the train_parallel.py script to start the training, which starts up the CARLA servers, leaderboard clients, and training code. The script monitors the training for CARLA crashes and restarts everything if something crashes. This allows for training with the pre-build CARLA version but slows down training due to the restarts.

Custom CARLA

We have created a CARLA fork to solve these problems, where we fixed two endless loops a race condition and removed a significant amount of CPU overhead. With our fork we observe no more CARLA crashes and are able to run 2x more parallel servers leading to significantly faster training at scale. The downside is that you need to compile CARLA from scratch witch takes quite a while. We recommend the custom fork for users that plan to do large scale training runs (e.g. 8 GPUs+). To build CARLA follow the official instructions. At the end compile a CARLA distribution with make package ARGS="--no-zip --packages=Carla,AdditionalMaps" copy the additional maps into the CARLA folder and install client:

conda activate carl
pip uninstall carla
pip install /path/to/custom/CARLA/PythonAPI/carla/dist/carla-0.9.15-cp37-cp37m-manylinux_2_27_x86_64.whl

Starting > 100 CARLA servers on shared file systems can sometimes hang up servers. Starting the servers with sleeps inbetween can work around this. The train_parallel.py script supports this with the --ml_cloud 1 option.

Training scripts

We provide 5 training configs to train different RL models. The training code is highly configurable. You can find the right hyperparameters for each model in these scripts.

train_roach.sh Reproduces the Roach approach for the CARLA leaderboard 2.0.
train_carl_tiny_cpp.sh Reproduces the 10M samples ablation in Table 4. It can be a good start to play around with the repo since it only needs 1 GPU and trains within a day. The script uses the C++ training code described later.
train_carl_py.sh Trains the CaRL v1.0 method (300M samples run) using the Python training code.
train_carl_cpp.sh Trains the CaRL v1.0 method (300M samples run) using the C++ training code.
train_carl_v1_1_py.sh Trains the CaRL v1.1 using the Python training code with 300M samples. Compared to the 1.0 paper version this one uses our custom CARLA server which enabled 2x more parallel servers (256) as well as various smaller bugfixes reducing noise in the training routes. The model achieves 73 DS (+9) on longest6 v2.

Every parameter has a one-sentence description about what it does in the argument parser. But in general this is a research repo, so we do not have extensive documentation. I recommend using Ctrl+Shift+f (or your editor's equivalent) to find the parameter in the code and just read the code to learn what it does.

The config system works such that every parameter has a default value that is loaded from rl_config.py. Most relevant parameters can be changed by using command line arguments --parametername value. The parameter is then automatically overwritten, and the new config is saved in a config.json file alongside the model. During inference or restarting of training, the values in the config.json file are automatically loaded, overwriting the default values. The nice thing about this config system is that one can easily add features while ensuring backwards compatibility with models trained with older code versions. For that, I set the default value in the default config such that the new feature is turned off by default. If an old model is then run with the new code, the default value of the parameter is then loaded because the parameter is not in config.json, ensuring that the new feature is turned off and the old model still runs as intended.

Viewing training logs

The training logs are stored as TensorBoard files for both the C++ and Python code. They can be viewed locally by starting TensorBoard with tensorboard --logdir /path/to/model/dir --load_fast=false. The Python code additionally implements uploading the log files to Weights and Biases by setting the --track 1 option.

CPP Training and Evaluation Code

The training evaluation code for the C++ implementation of CaRL is hosted in the ppo.cpp repository. To use it, clone the repository and follow the instructions to build the singularity container and compile the binaries. The code can also be built to run natively on your system, but I recommend using the singularity container option since it is much more convenient and doesn't seem to affect performance.

Evaluating a model works the same way as with the Python code, but you additionally have to specify the following environment variables:

export PPO_CPP_INSTALL_PATH=/path/to/folder/with/binaries # The C++ binaries you built
export PATH_TO_SINGULARITY=/path/to/ppo_cpp.sif # The singularity container
export PYTORCH_KERNEL_CACHE_PATH=~/.cache/torch  # Path to the PyTorch cache folder, makes it visible inside the container

Training a model works similarly to training with the Pytorch code and is started using the train_parallel.py script. You additionally have to set the following options. For an example, see here.

--train_cpp 1
--PYTORCH_KERNEL_CACHE_PATH /path/to/.cache
--ppo_cpp_install_path /path/to/folder/with/cpp/binaries
--cpp_singularity_file_path /path/to/ppo_cpp.sif 
--cpp_system_lib_path_1 /path/to/missing/system/libs  # In case there is some system library that the code can't find you can link the path here. You will most likely not need it, everything should be included inside the container. In that case set it to a random folder.

The C++ training code is written in a way so that model configuration works mostly the same as the Python code. Both codes store and load parameters from a config.json. They start with default parameters loaded from a default config class, which can be overwritten using arguments from the command line with the --parametername value and the parameters have the same name.

There are a few differences, though. The C++ code implements a subset of the options available in the Python code. All features used in CaRL are implemented, but the Python code has some additional features that I did not end up using in CaRL, which are not implemented in C++. Also, I have not implemented the use_exploration_suggest feature from the Roach baseline in the C++ code.

Lastly, PPO has a few hyperparameters that are dependent on each other. For example, you either set --total_minibatch_size and --total_batch_size, or you set --num_steps and --num_minibatches. Unfortunately, I have changed my mind about which one is more convenient to set, and now the Python code implements the former, and the C++ code implements the latter. This has no effect on the algorithm, but you need to familiarize yourself with the parameters and set them correctly. You can compare train_carl_py.sh with train_carl_cpp.sh to have an example. The C++ code is only compatible with the CARLA environment for now.

Scenario generation

To train an RL algorithm, you need to generate many training scenarios. CARLA does not provide them, so we try to automatically generate them using the generate_long_routes_with_scenarios.py script, described in the Appendix. I usually run it via SLURM arrays, see here. The script generates random routes and scenarios in them, each environment that you train with gets its own file which are each specialized to a particular town to avoid reloading. The particular routes we used can be found here. We have tested and used the 6 scenarios of the longest6 benchmark with the scenario generation script. The other scenarios of the CARLA leaderboard 2.0 are also implemented but not well tested. There might be many bugs (the generated scenarios are very noisy in general). Additionally, we were not able to generate the 6 scenarios that involve highway entries and exits because the highway entries and exits are not labelled in CARLA's HD-Map, so we can't automatically find them.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

CaRL - CARLA

Structure:

News

Performance on longest6v2

Contents

Setup

Pre-Trained Models

Local Debugging

Evaluation Debugging

Training Debugging

Benchmarking

Longest6 v2

Result aggregation

Training

Custom CARLA

Training scripts

Viewing training logs

CPP Training and Evaluation Code

Scenario generation

FilesExpand file tree

CARLA

Directory actions

More options

Directory actions

More options

Latest commit

History

CARLA

Folders and files

parent directory

README.md

CaRL - CARLA

Structure:

News

Performance on longest6v2

Contents

Setup

Pre-Trained Models

Local Debugging

Evaluation Debugging

Training Debugging

Benchmarking

Longest6 v2

Result aggregation

Training

Custom CARLA

Training scripts

Viewing training logs

CPP Training and Evaluation Code

Scenario generation