Hi,
I was going to upload a QuIP 2-bit version of a llama2 model, which I took as a chance to experiment with this method.
https://huggingface.co/Yhyu13/Xwin-Math-7B-V1.0-QUIP-2bit
But as I mentioned in its README, the Hessian pass took quite long, about 6 hours, and the final perplexity (ppl) for QuIP 2-bit is not ideal. The model's performance also degrades noticeably.
The conversion process does not throw any errors, though, and the evaluation process runs smoothly, too. What could have gone wrong? I might spend some time re-running the whole process as a double check.
Here is my script:
#!/bin/bash
eval "$(conda shell.bash hook)"
conda activate quip
MODEL_NAME=Xwin-Math-7B-V1.0
#MODEL_NAME=ShareGPT4V-7B
BASE_MODEL_DIR=/media/hangyu5/Home/Documents/Hugging-Face/$MODEL_NAME/
SAVE_MODEL_DIR=/media/hangyu5/Home/Documents/Hugging-Face/$MODEL_NAME-QUIP/
TMP_MODEL_DIR=./$MODEL_NAME
BATCH_SIZE=2
CTX_LEN=4096
cd repo/quip-sharp/
if [ ! -d "$TMP_MODEL_DIR" ]; then
mkdir -p "$TMP_MODEL_DIR"
fi
TRANSFORMERS_VERBOSITY=debug CUDA_VISIBLE_DEVICES=0 python ./hessian_offline_llama.py \
--seed 34 \
--base_model $BASE_MODEL_DIR \
--save_path $TMP_MODEL_DIR/Hessian/ \
--ctx_size $CTX_LEN \
--batch_size $BATCH_SIZE \
| tee $TMP_MODEL_DIR/Hessian-quip.log
TRANSFORMERS_VERBOSITY=debug CUDA_VISIBLE_DEVICES=0 python ./quantize_llama.py \
--seed 34 \
--base_model $BASE_MODEL_DIR \
--hessian_path $TMP_MODEL_DIR/Hessian \
--save_path $TMP_MODEL_DIR/Ckpt \
--ctx_size $CTX_LEN \
--batch_size $BATCH_SIZE \
--codebook E8P12 \
--scale_override 0.9 \
| tee $TMP_MODEL_DIR/Ckpt-quip.log
TRANSFORMERS_VERBOSITY=debug CUDA_VISIBLE_DEVICES=0 python hfize_llama.py \
--quantized_path $TMP_MODEL_DIR/Ckpt \
--hf_output_path $SAVE_MODEL_DIR \
| tee $TMP_MODEL_DIR/HFize-quip.log
TRANSFORMERS_VERBOSITY=debug CUDA_VISIBLE_DEVICES=0 python eval_ppl.py \
--seed 34 \
--hf_path $BASE_MODEL_DIR \
--seqlen $CTX_LEN \
| tee $TMP_MODEL_DIR/PPl-quip.log
cd ../../
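Note that the last step of the script passes $BASE_MODEL_DIR to eval_ppl.py, so the logged perplexity would be that of the unquantized base model rather than the 2-bit checkpoint. Below is a minimal sketch for evaluating both side by side. It assumes that eval_ppl.py accepts the directory written by hfize_llama.py through the same --hf_path flag. The eval_ppl helper, the DRY_RUN switch, the placeholder default paths, and the log file names are hypothetical and not part of the repository.

```shell
#!/bin/bash
# Sketch only: compare base vs. quantized perplexity with the same settings.
# Assumption: eval_ppl.py accepts the hfize_llama.py output via --hf_path.
# Hypothetical placeholders; the real script sets these variables itself.
BASE_MODEL_DIR=${BASE_MODEL_DIR:-/path/to/base-model}
SAVE_MODEL_DIR=${SAVE_MODEL_DIR:-/path/to/quantized-model}
TMP_MODEL_DIR=${TMP_MODEL_DIR:-./logs}
CTX_LEN=${CTX_LEN:-4096}
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to run it.
DRY_RUN=${DRY_RUN:-1}

# eval_ppl MODEL_DIR LOG_FILE
eval_ppl() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "python eval_ppl.py --seed 34 --hf_path $1 --seqlen $CTX_LEN"
    else
        python eval_ppl.py --seed 34 --hf_path "$1" --seqlen "$CTX_LEN" | tee "$2"
    fi
}

eval_ppl "$BASE_MODEL_DIR" "$TMP_MODEL_DIR/PPl-base.log"
eval_ppl "$SAVE_MODEL_DIR" "$TMP_MODEL_DIR/PPl-quip.log"
```

Running it with DRY_RUN=0 from inside repo/quip-sharp/ would produce two logs, so the base and quantized perplexities can be compared directly.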