
SLAM-Former: Putting SLAM into One Transformer

arXiv | Project Page

Yijun Yuan, Zhuoguang Chen, Kenan Li, Weibang Wang, Hang Zhao

IIIS, Tsinghua University

@article{slam-former,
      title={SLAM-Former: Putting SLAM into One Transformer},
      author={Yijun Yuan and Zhuoguang Chen and Kenan Li and Weibang Wang and Hang Zhao},
      journal={arXiv preprint arXiv:2509.16909},
      year={2025}
}

Updates

  • [May 11, 2026] Added two ConvHead checkpoint variants. V1.1.pth is recommended with --target_size 518; V1.1-long.pth is recommended with --target_size 224. V1.1-long.pth is trained with scaled sequence lengths and supports longer-sequence inference. ConvHead mainly fixes the grid artifact issue. Thanks to Pi3X for the insight.
  • [Mar 11, 2026] Released training code. See the training branch for details.
  • [Mar 4, 2026] Released SLAM code with KV pruning available.
  • [Feb 26, 2026] Released the training data.
  • [Sep 24, 2025] Two blog posts that can help you read SLAM-Former: here and here.
  • [Sep 23, 2025] Preprint release.

Getting Started

1. Clone SLAM-Former

git clone https://github.com/Tsinghua-MARS-Lab/SLAM-Former.git
cd SLAM-Former

2. Create conda environment

conda create -n SLAM-Former python=3.11
conda activate SLAM-Former 

3. Install requirements

pip install -r requirements.txt
pip install -e .
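
A quick sanity check after installation (assuming PyTorch is among the installed requirements):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"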

Running SLAM Demo

Download checkpoint: v1

Prepare a folder containing your image sequence, then run:

python slam/demo.py \
    --ckpt_path .ckpt/checkpoint-10.pth.model \
    --image_folder /path/to/your/images/ \
    --output_dir ./output/result \
    --target_size 518 \
    --retention_ratio 0.5

For evaluation, use the full KV cache: --retention_ratio 1. See issue #15 for details.
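
For intuition, the sketch below shows one way ratio-based KV pruning can work: keep only the top retention_ratio fraction of cached tokens, ranked by an importance score. This is a hypothetical illustration, not the repository's implementation; prune_kv_cache and the score definition are assumptions.

import torch

def prune_kv_cache(keys, values, scores, retention_ratio=0.5):
    # keys, values: (num_tokens, dim) cached tensors for one attention layer
    # scores: (num_tokens,) importance per cached token,
    #         e.g. accumulated attention weight (assumed here)
    num_keep = max(1, int(keys.shape[0] * retention_ratio))
    idx = torch.topk(scores, num_keep).indices.sort().values  # keep temporal order
    return keys[idx], values[idx]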

Visualization

Real-time visualization during inference: add --vis to the command above. The 3D reconstruction process can be viewed interactively in Rerun. This mode is intended for local machines with a desktop session; it is not recommended on remote servers because it depends on launching an interactive viewer during inference.

Static visualization of saved results: first run slam/demo.py without --vis to save final.ply, final_traj.txt, and final_pc/, then start the browser-based viewer:

python slam/visualize_results.py \
    --result_dir /path/to/output_dir \
    --port 8080

The static viewer serves an HTTP page at http://localhost:8080. This mode is recommended for remote servers: forward the port to your local machine, then open the forwarded URL in your browser.

ssh -L 8080:localhost:8080 user@remote-server
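
To inspect the saved results programmatically instead of through a viewer, a minimal sketch (the trajectory file is assumed to store one pose per row, and open3d is not a stated dependency of this repo):

import numpy as np
import open3d as o3d

# Load the saved reconstruction and trajectory (file layout assumed).
pcd = o3d.io.read_point_cloud("output/result/final.ply")
traj = np.loadtxt("output/result/final_traj.txt")  # one pose per row (assumed)

print(f"points: {len(pcd.points)}, trajectory shape: {traj.shape}")
o3d.visualization.draw_geometries([pcd])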

Training Data

Checkpoint List

  • v1 — recommended to use --target_size 518 for inference.
  • V1.1.pth — ConvHead checkpoint, recommended to use --target_size 518 for inference.
  • V1.1-long.pth — ConvHead checkpoint, recommended to use --target_size 224 for inference; trained with scaled sequence lengths to support longer-sequence inference.
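
A quick way to peek at a downloaded checkpoint before running the demo (the top-level key layout is an assumption; slam/demo.py handles actual loading via --ckpt_path):

import torch

# Inspect the checkpoint's structure; the exact layout is not documented here.
state = torch.load("ckpt/V1.1.pth", map_location="cpu")
if isinstance(state, dict):
    print(list(state.keys())[:10])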

License

This project adopts a dual-licensing strategy:

Component                     License         Commercial Use
Code                          BSD 3-Clause    Permitted
Model Weights (checkpoints)   CC BY-NC 4.0    Strictly Non-Commercial
