Official code for the paper *Bootstrapping ViTs: Towards Liberating Vision Transformers from Pre-training*.
Authors: Haofei Zhang, Jiarui Duan, Mengqi Xue, Jie Song, Li Sun, Mingli Song
| Model | Method | CIFAR-10 | CIFAR-100 |
|---|---|---|---|
| CNNs | EfficientNet-B2 | 94.14 | 75.55 |
| CNNs | ResNet50 | 94.92 | 77.57 |
| CNNs | Agent-S | 94.18 | 74.62 |
| CNNs | Agent-B | 94.83 | 74.78 |
| ViTs | ViT-S | 87.32 | 61.25 |
| ViTs | ViT-S-SAM | 87.77 | 62.60 |
| ViTs | ViT-S-Sparse | 87.43 | 62.29 |
| ViTs | ViT-B | 79.24 | 53.07 |
| ViTs | ViT-B-SAM | 86.57 | 58.18 |
| ViTs | ViT-B-Sparse | 83.87 | 57.22 |
| Pre-trained ViTs | ViT-S | 95.70 | 80.91 |
| Pre-trained ViTs | ViT-B | 97.17 | 84.95 |
| Ours (Joint) | Agent-S | 94.90 | 74.06 |
| Ours (Joint) | ViT-S | 95.14 | 76.19 |
| Ours (Joint) | Agent-B | 95.06 | 76.57 |
| Ours (Joint) | ViT-B | 95.00 | 77.83 |
| Ours (Shared) | Agent-S | 93.22 | 74.06 |
| Ours (Shared) | ViT-S | 93.72 | 75.50 |
| Ours (Shared) | Agent-B | 92.66 | 74.11 |
| Ours (Shared) | ViT-B | 93.34 | 75.71 |
| Method | 5% images | 10% images | 50% images |
|---|---|---|---|
| ResNet50 | 35.43 | 50.86 | 70.05 |
| Agent-B | 35.28 | 47.46 | 68.13 |
| ViT-B | 16.60 | 28.11 | 63.40 |
| ViT-B-SAM | 16.67 | 28.66 | 64.37 |
| ViT-B-Sparse | 10.39 | 28.92 | 66.01 |
| Ours-Joint | 36.01 | 49.73 | 71.36 |
| Ours-Shared | 33.06 | 45.75 | 66.48 |
- CIFAR: download the CIFAR dataset to the folder `~/datasets/cifar` (you may specify this path in the configuration files).
- ImageNet: download the ImageNet dataset to the folder `~/datasets/ILSVRC2012` and pre-process it with this script.
- We also support other datasets such as CUB200, Sketches, Stanford Cars, and TinyImageNet.
Our code requires `cv-lib-PyTorch`. You should download this repo and check out the tag `bootstrapping_vits`.
`cv-lib-PyTorch` is an open-source repo currently maintained by me.
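A typical setup looks like the following sketch; the repository URL is a placeholder here, so substitute the actual `cv-lib-PyTorch` URL:

```shell
# Clone cv-lib-PyTorch and pin it to the tag this project expects.
# Replace <cv-lib-PyTorch-repo-url> with the actual repository URL.
git clone <cv-lib-PyTorch-repo-url> cv-lib-PyTorch
cd cv-lib-PyTorch
git checkout bootstrapping_vits
# Make the library importable for the training scripts below.
export PYTHONPATH=$(pwd)
```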
- torch>=1.10.2
- torchvision>=0.11.3
- tqdm
- timm
- tensorboard
- scipy
- PyYAML
- pandas
- numpy
In the `config` directory, we provide configurations for training, including CIFAR100 and ImageNet-10%.
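The configuration files are plain YAML. As a quick illustration of how such a file can be read with PyYAML (listed in the requirements) — note that the keys below are hypothetical examples, not the actual schema used by this repo:

```python
import yaml  # PyYAML, listed in the requirements

# Hypothetical configuration fragment; the real files under `config/`
# define the actual schema (dataset paths, model, optimizer, etc.).
cfg_text = """
dataset:
  name: cifar100
  root: ~/datasets/cifar
training:
  epochs: 300
  batch_size: 128
"""

cfg = yaml.safe_load(cfg_text)
print(cfg["dataset"]["name"])     # cifar100
print(cfg["training"]["epochs"])  # 300
```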
The following script will start training agent-small from scratch on CIFAR100.
For training with the SAM optimizer, the option `--worker` should be set to `sam_train_worker`.
```shell
export PYTHONPATH=/path/to/cv-lib-PyTorch
export CUDA_VISIBLE_DEVICES=0,1
port=9872

python dist_engine.py \
    --num-nodes 1 \
    --rank 0 \
    --master-url tcp://localhost:${port} \
    --backend nccl \
    --multiprocessing \
    --file-name-cfg cls \
    --cfg-filepath config/cifar100/cnn/agent-small.yaml \
    --log-dir run/cifar100/cnn/agent-small \
    --worker worker
```

The following script will start joint training of agent-small and ViT-small on CIFAR100.

```shell
export PYTHONPATH=/path/to/project/cv-lib-PyTorch
export CUDA_VISIBLE_DEVICES=0,1
port=9873

python dist_engine.py \
    --num-nodes 1 \
    --rank 0 \
    --master-url tcp://localhost:${port} \
    --backend nccl \
    --multiprocessing \
    --file-name-cfg joint \
    --cfg-filepath config/cifar100/joint/agent-small-vit-small.yaml \
    --log-dir run/cifar100/joint/agent-small-vit-small \
    --use-amp \
    --worker mutual_worker
```

The following script will start shared training of agent-base (ResNet-like) and ViT-base on CIFAR100.

```shell
export PYTHONPATH=/path/to/project/cv-lib-PyTorch
export CUDA_VISIBLE_DEVICES=0,1
port=9873

python dist_engine.py \
    --num-nodes 1 \
    --rank 0 \
    --master-url tcp://localhost:${port} \
    --backend nccl \
    --multiprocessing \
    --file-name-cfg shared \
    --cfg-filepath config/cifar100/shared/agent-base-res_like-vit-base.yaml \
    --log-dir run/cifar100/shared/agent-base-res_like-vit-base \
    --use-amp \
    --worker mutual_worker
```

After training, the accuracy of the final epoch is reported instead of the best one.
If you find this work useful for your research, please cite our paper:
```bibtex
@article{zhang2021bootstrapping,
  title={Bootstrapping ViTs: Towards Liberating Vision Transformers from Pre-training},
  author={Zhang, Haofei and Duan, Jiarui and Xue, Mengqi and Song, Jie and Sun, Li and Song, Mingli},
  journal={arXiv preprint arXiv:2112.03552},
  year={2021}
}
```
