```bibtex
@article{slam-former,
  title={SLAM-Former: Putting SLAM into One Transformer},
  author={Yijun Yuan and Zhuoguang Chen and Kenan Li and Weibang Wang and Hang Zhao},
  journal={arXiv preprint arXiv:2509.16909},
  year={2025}
}
```

- [May 11, 2026] Added two ConvHead checkpoint variants.
  `V1.1.pth` is recommended with `--target_size 518`; `V1.1-long.pth` is recommended with `--target_size 224`. `V1.1-long.pth` is trained with scaled sequence lengths and supports longer-sequence inference. ConvHead mainly fixes the grid-artifact issue. Thanks to Pi3X for the insight.
- [Mar 11, 2026] Released training code. See the training branch for details.
- [Mar 4, 2026] Released SLAM code with KV pruning available.
- [Feb 26, 2026] Released the training data.
- [Sep 24, 2025] Some good blogs can help you read SLAM-Former: here and here.
- [Sep 23, 2025] Preprint release.
```shell
git clone https://github.com/Tsinghua-MARS-Lab/SLAM-Former.git
cd SLAM-Former
conda create -n SLAM-Former python=3.11
conda activate SLAM-Former
pip install -r requirements.txt
pip install -e .
```

Download checkpoint: v1
Prepare a folder containing your image sequence, then run:
```shell
python slam/demo.py \
    --ckpt_path .ckpt/checkpoint-10.pth.model \
    --image_folder /path/to/your/images/ \
    --output_dir ./output/result \
    --target_size 518 \
    --retention_ratio 0.5
```

For evaluation, please use the full KV cache: `--retention_ratio 1`. See issue #15 for details.
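`--retention_ratio` controls how much of the KV cache survives pruning. The repository's actual pruning criterion lives in the released SLAM code; purely as an illustration of what a retention ratio means (not SLAM-Former's implementation), keeping the highest-scoring fraction of cached tokens might look like:

```python
import numpy as np

def prune_kv_cache(keys, values, scores, retention_ratio):
    """Keep the fraction `retention_ratio` of cached tokens with the
    highest importance scores. Illustrative only -- SLAM-Former's real
    pruning criterion is defined in the released SLAM code."""
    n = keys.shape[0]
    keep = max(1, int(round(n * retention_ratio)))
    # Indices of the `keep` highest-scoring tokens, kept in original order.
    idx = np.sort(np.argsort(scores)[-keep:])
    return keys[idx], values[idx]

rng = np.random.default_rng(0)
keys = rng.normal(size=(8, 4))
values = rng.normal(size=(8, 4))
scores = np.arange(8, dtype=float)  # later tokens scored as more important

k, v = prune_kv_cache(keys, values, scores, retention_ratio=0.5)
print(k.shape)  # (4, 4): half of the 8 cached tokens survive
```

With `retention_ratio=1` every cached token is kept, which is why full retention is required for faithful evaluation.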
Real-time visualization during inference: add --vis to the command above. The 3D reconstruction process can be viewed interactively in Rerun. This mode is intended for local machines with a desktop session; it is not recommended on remote servers because it depends on launching an interactive viewer during inference.
Static visualization of saved results: first run slam/demo.py without --vis to save `final.ply`, `final_traj.txt`, and `final_pc/`, then start the browser-based viewer:
```shell
python slam/visualize_results.py \
    --result_dir /path/to/output_dir \
    --port 8080
```

The static viewer serves an HTTP page at http://localhost:8080. This mode is recommended for remote servers: forward the port to your local machine, then open the forwarded URL in your browser.
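The saved trajectory can also be inspected without the viewer. A minimal sketch, assuming `final_traj.txt` uses the common TUM layout of one pose per line (`timestamp tx ty tz qx qy qz qw`) — the format is an assumption, so check your actual file first:

```python
def load_trajectory(path):
    """Parse a TUM-style trajectory file: one pose per line,
    `timestamp tx ty tz qx qy qz qw`; lines starting with '#' are comments.
    The format is an assumption -- verify against your final_traj.txt."""
    poses = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            t, tx, ty, tz, qx, qy, qz, qw = map(float, line.split())
            poses.append({"t": t, "pos": (tx, ty, tz), "quat": (qx, qy, qz, qw)})
    return poses

# Self-contained demo on a tiny synthetic trajectory file.
import os, tempfile
sample = "# t tx ty tz qx qy qz qw\n0.0 0 0 0 0 0 0 1\n0.1 0.5 0 0 0 0 0 1\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write(sample)
    tmp = f.name
traj = load_trajectory(tmp)
os.unlink(tmp)
print(len(traj), traj[1]["pos"])  # 2 (0.5, 0.0, 0.0)
```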
```shell
ssh -L 8080:localhost:8080 user@remote-server
```

- Links:
- Hugging Face (ARKitScenes, MVS-Synth, ScanNet, ScanNet++, Blended-MVS, MegaDepth)
- Hugging Face (Hypersim)
- v1: recommended with `--target_size 518` for inference.
- V1.1.pth: ConvHead checkpoint, recommended with `--target_size 518` for inference.
- V1.1-long.pth: ConvHead checkpoint, recommended with `--target_size 224` for inference; trained with scaled sequence lengths to support longer-sequence inference.
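The two `--target_size` values trade per-frame token count against resolution, which is why the long-sequence checkpoint pairs with the smaller size. Assuming a ViT-style patch size of 14 (an assumption — verify against the model config), the arithmetic is:

```python
def patch_grid(target_size, patch=14):
    """Tokens per frame for a square input, assuming a ViT patch size of 14
    (an assumption -- verify against the SLAM-Former model config)."""
    assert target_size % patch == 0, "target_size should be a multiple of the patch size"
    side = target_size // patch
    return side, side * side

for size in (518, 224):
    side, tokens = patch_grid(size)
    print(f"target_size={size}: {side}x{side} grid, {tokens} tokens/frame")
# target_size=518: 37x37 grid, 1369 tokens/frame
# target_size=224: 16x16 grid, 256 tokens/frame
```

Under this assumption, 224 yields roughly 5x fewer tokens per frame than 518, leaving room for longer sequences in the same attention budget.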
This project adopts a dual-licensing strategy:
| Component | License | Commercial Use |
|---|---|---|
| Code | BSD 3-Clause | Permitted |
| Model Weights (checkpoints) | CC BY-NC 4.0 | Strictly Non-Commercial |