This repository contains the official PyTorch implementation for the ICLR 2026 paper "LogART: Pushing The Limit of Efficient Logarithmic Post-training Quantization".
-
Clone this repository:
git clone https://github.com/logart-lab/logart.git
Then navigate to the desired directory:
cd LogART/LogART-LLMOr:
cd LogART/LogART-CNNOr:
cd LogART/LogART-ViT -
Install PyTorch:
Ensure you have PyTorch installed. You can install PyTorch 1.10.0 with the following command:
pip install torch==1.10.0 torchvision --index-url https://download.pytorch.org/whl/cu113
Pretrained models for LogART-LLM can be obtained using the transformers library (version 4.43.2). Ensure transformers is installed:
pip install transformers==4.43.2For dataset loading and preprocessing, this project uses the datasets library, tested on version 2.20.0. Install it with:
pip install datasets==2.20.0The pretrained models for LogART-CNN are sourced from BRECQ and can be accessed via torch.hub. For example, to load a pretrained ResNet-18 model:
res18 = torch.hub.load('yhhhli/BRECQ', model='resnet18', pretrained=True)If you encounter a URLError during download (likely due to network issues), you can manually download the model using wget and place it in the appropriate directory:
wget https://github.com/yhhhli/BRECQ/releases/download/v1.0/resnet50_imagenet.pth.tar
mv resnet50_imagenet.pth.tar ~/.cache/torch/checkpointsThe load_state_dict_from_url function will check the ~/.cache/torch/checkpoints directory before attempting to download.
Pretrained models for LogART-ViT can be obtained using the timm library (version 0.9.2). Ensure timm is installed:
pip install timm==0.9.2Alternatively, you can directly download checkpoints provided by AdaLog. For example:
wget https://github.com/GoatWu/AdaLog/releases/download/v1.0/deit_tiny_patch16_224.bin
mkdir -p ./checkpoint/vit_raw/
mv deit_tiny_patch16_224.bin ./checkpoint/vit_raw/The code for LogART-LLM was modified based on aespa. To quantize and evaluate the LLM, use the following command:
python main.py --model_path facebook/opt-125m --calib_data c4 --nsamples 32 --seqlen 2048 --seed 0 --w_bits 3 --num_iters 500 --lr 0.05 --w_method log_dynamic -scale_search --rrweight 1 [--hardware_approx] --learn_rounding [--test_with_hardware_approx]
-
--model_path: Path to the LLM model. Choices:facebook/opt-125m,facebook/opt-1.3b,facebook/opt-6.7b,meta-llama/Llama-2-7b,meta-llama/Llama-3.1-8B. -
--calib_data: Name of the calibration dataset used for quantization. -
--nsamples: Number of samples selected from the calibration dataset for quantization. -
--seqlen: Sequence length of the input text processed by the model. -
--seed: Random seed for ensuring reproducibility of experiments. -
--w_bits: Bitwidth for weight quantization. -
--num_iters: Number of iterations in learnable rounding reconstruction. -
--lr: Adaptive rounding learning rate. -
--w_method: Quantization method. Choices:log_2,log_sqrt2,log_dynamic. -
--scale_search: Whether to search for the best scaling factor. -
--rrweight: Weight of rounding cost vs the reconstruction loss. -
--hardware_approx: Apply1+1/2hardware approximation method to$\sqrt{2}$ . -
--learn_rounding: Whether to learn weight-rounding policy based on AdaRound. -
--test_with_hardware_approx: Whether to test the quantized model with hardware approximation.
The code for LogART-CNN was modified based on BRECQ. To quantize and evaluate a single CNN model, use the following command:
python main.py --data_path DATA_DIR --arch resnet18 -seed 0 --n_bits_w 4 --channel_wise --scale_method log_dynamic --search_method layer_wise --search_samples 32 [--test_before_calibration] --iters_w 2000 --calib_lr 0.05 --weight 1 [--hardware_approx] [--test_with_hardware_approx]
-
--data_path: Path to ImageNet data. -
--arch: Model architecture to use. Choices:resnet18,resnet50,mobilenetv2. -
--seed: Random seed for ensuring reproducibility of experiments. -
--n_bits_w: Bitwidth for weight quantization. -
--channel_wise: Enable channel-wise quantization for weights. -
--scale_method: Linear quantization method or logarithmic base. Choices:log_2,log_sqrt2,log_dynamic. -
--search_method: Dynamic base search method of the hyperparameter searching process for dynamic logarithmic quantization. Choices: 'tensor_wise', 'layer_wise'. -
--search_samples: Size of the hyperparameter searching dataset. -
--test_before_calibration: Test the quantization accuracy after hyperparameter searching and before reconstruction. -
--iters_w: Number of iterations in learnable rounding reconstruction. -
--calib_lr: Adaptive rounding learning rate. -
--weight: Weight of rounding cost vs the reconstruction loss. -
--hardware_approx: Apply hardware-level log$\sqrt{2}$ approximation to1+1/2. -
--test_with_hardware_approx: Whether to test the quantized model with hardware approximation.
The code for LogART-ViT was modified based on APHQ. To quantize and evaluate a single ViT model, use the following command:
python main.py --dataset DATA_DIR --model vit_base -seed 0 --config ./configs/4bit/best.py --iters_w 2000 --lr 0.05 --scale-method log_dynamic -search_method block_wise --rrweight 1 [--hardware_approx] [--optimize] [--test_with_hardware_approx]
-
--dataset: Path to the dataset. -
--model: Model architecture. Choices:vit_tiny,vit_small,vit_base,vit_large,deit_tiny,deit_small,deit_base. -
--seed: Random seed for ensuring reproducibility of experiments. -
--config: File path to import the Config class. -
--optimize: Perform learnable rounding reconstruction to the model. -
--iters_w: Number of iterations in learnable rounding reconstruction. -
--lr: Adaptive rounding learning rate. -
--scale-method: Quantization method. Choices:log_2,log_sqrt2,log_dynamic. -
--search_method: Dynamic base search method of the hyperparameter searching process for dynamic logarithmic quantization. Choices: 'tensor_wise', 'block_wise'. -
--rrweight: Weight of rounding cost vs the reconstruction loss. -
--hardware_approx: Apply1+1/2hardware approximation method to$\sqrt{2}$ . -
--test_with_hardware_approx: Whether to test the quantized model with hardware approximation.
Results will be stored in ./results.csv. The ablation results of LogART's key components on LLMs with 3-bit channel-wise weight quantization are shown in the table below:
| DBS | SFS | ABS | LLR | Calib. Data | OPT-125M | LLaMA2-7B | ||||
| PPL | Time | Memory | PPL | Time | Memory | |||||
| × | × | × | × | - | 170.64 | 0.7 s | 0.40 GB | 60.16 | 13.0 s | 9.8 GB |
| × | × | × | ✓ | 32 | 38.55 | 61.3 s | 0.75 GB | 9.74 | 58.6 min | 20.9 GB |
| × | × | ✓ | × | - | 79.70 | 0.7 s | 0.40 GB | 8.28 | 13.2 s | 9.8 GB |
| × | × | ✓ | ✓ | 32 | 36.39 | 61.3 s | 0.75 GB | 6.44 | 58.2 min | 20.9 GB |
| × | ✓ | × | × | 32 | 38.41 | 12.1 s | 0.75 GB | 6.66 | 6.6 min | 20.9 GB |
| × | ✓ | × | ✓ | 32 | 33.21 | 64.6 s | 0.75 GB | 6.24 | 63.1 min | 20.9 GB |
| × | ✓ | ✓ | × | 32 | 35.15 | 12.2 s | 0.75 GB | 6.55 | 6.6 min | 20.9 GB |
| × | ✓ | ✓ | ✓ | 32 | 32.55 | 64.6 s | 0.75 GB | 6.23 | 63.5 min | 20.9 GB |
| ✓ | × | × | × | 32 | 66.63 | 3.8 s | 0.75 GB | 18.49 | 83.2 s | 20.9 GB |
| ✓ | × | × | ✓ | 32 | 35.46 | 62.7 s | 0.75 GB | 9.26 | 59.0 min | 20.9 GB |
| ✓ | × | ✓ | × | 32 | 47.92 | 3.8 s | 0.75 GB | 7.82 | 82.2 s | 20.9 GB |
| ✓ | × | ✓ | ✓ | 32 | 33.68 | 62.9 s | 0.75 GB | 6.38 | 59.1 min | 20.9 GB |
| ✓ | ✓ | × | × | 32 | 36.10 | 16.8 s | 0.75 GB | 6.56 | 17.9 min | 20.9 GB |
| ✓ | ✓ | × | ✓ | 32 | 32.37 | 75.0 s | 0.75 GB | 6.19 | 73.7 min | 20.9 GB |
| ✓ | ✓ | ✓ | × | 32 | 34.29 | 17.0 s | 0.75 GB | 6.45 | 17.9 min | 20.9 GB |
| ✓ | ✓ | ✓ | ✓ | 32 | 31.15 | 75.1 s | 0.75 GB | 6.14 | 74.2 min | 20.9 GB |
If you find this work is useful for your research, please cite our paper:
@inproceedings{xu2026logart,
title={Log{ART}: Pushing the Limit of Efficient Logarithmic Post-Training Quantization},
author={Xu, Jiawei and Zheng, Yi and Sun, Chenghe and Zhou, Taiyu and Zhang, Zuqi and Li, Jie and Zheng, Lirong and Zou, Zhuo},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=V85HbymBLW}
}