🏰 CASAL (Contrastive Activation Steering for Amortized Learning) is a method that trains activation steering into large language models.
- Activation Steering: Implementation of activation steering during inference
- Training: Support for CASAL training
- MoE Model Support: Support for training Mixture-of-Experts (MoE) models
- Visual Model Support: Support for training vision-language models
- PCA Visualization: Visualization tools for analyzing activation patterns before and after training
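As an illustration of the kind of before/after analysis the PCA visualization tools support, a layer's activations can be projected onto two principal components and plotted. The sketch below is not the repository's code; the activation arrays are random placeholders standing in for activations you would cache from forward passes on a probe dataset.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Placeholder activations: (num_examples, hidden_size) arrays collected at a
# chosen layer before and after CASAL training.
acts_before = np.random.randn(200, 4096)
acts_after = np.random.randn(200, 4096) + 1.0

# Fit a shared 2-D PCA so both sets are compared in the same projection space.
pca = PCA(n_components=2)
pca.fit(np.concatenate([acts_before, acts_after], axis=0))
proj_before = pca.transform(acts_before)
proj_after = pca.transform(acts_after)

plt.scatter(proj_before[:, 0], proj_before[:, 1], s=8, label="before training")
plt.scatter(proj_after[:, 0], proj_after[:, 1], s=8, label="after training")
plt.xlabel("PC 1")
plt.ylabel("PC 2")
plt.legend()
plt.savefig("activation_pca.png", dpi=150)
```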
CASAL/
├── CONFIG/ # Configuration files for different training methods
├── DATA/ # Data loading and preprocessing utilities
├── CASAL/ # Core CASAL implementation
├── EVAL/ # Evaluation scripts and utilities
├── MODEL/ # Model utilities that support various model families
├── UTILS/ # General utility functions
├── ACTIVATION_PCA/ # PCA analysis tools for activation visualization
├── ANALYSIS/ # Analysis and plotting tools
Make sure you have conda installed on your system.

- Create and activate a conda environment:

conda create --name casal python=3.11.9
conda activate casal

- Clone the code base:

git clone https://github.com/facebookresearch/CASAL.git

- Install dependencies:

For AWS users:

pip install -r requirements.txt

- Set up your Hugging Face and Weights & Biases tokens: create a .env file in the root folder and include the following keys:
# Hugging Face API Token
HF_TOKEN=
# Weights & Biases API Token
WANDB_API_KEY=
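If you want to load these keys programmatically in your own scripts, a minimal sketch is shown below. It assumes the python-dotenv, huggingface_hub, and wandb packages are installed; the repository's own entry points may read the .env file differently.

```python
import os

from dotenv import load_dotenv        # python-dotenv
from huggingface_hub import login as hf_login
import wandb

# Read HF_TOKEN and WANDB_API_KEY from the .env file in the repo root.
load_dotenv()

# Authenticate with Hugging Face (required for gated model downloads).
hf_login(token=os.environ["HF_TOKEN"])

# Authenticate with Weights & Biases for experiment logging.
wandb.login(key=os.environ["WANDB_API_KEY"])
```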
To get started, run inference-time activation steering:
python run_casal_steering.py

Run CASAL post-training:

python run_casal_post_training.py

Run the baseline evaluation:

python run_baseline_eval.py

Run the evaluation after CASAL training:

python run_post_casal_eval.py

If you use CASAL in your research, please cite:
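For orientation, inference-time activation steering of the kind run_casal_steering.py performs can be sketched as a PyTorch forward hook that adds a steering direction to one layer's hidden states. This is a hedged illustration rather than the repository's implementation; the model name, layer index, steering vector, and steering strength below are placeholder assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; CASAL targets larger LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

layer_idx = 6                                # assumed layer to steer
hidden_size = model.config.hidden_size
steering_vector = torch.randn(hidden_size)   # placeholder; normally derived from contrastive activations
alpha = 4.0                                  # assumed steering strength

def add_steering(module, inputs, output):
    # Decoder blocks return a tuple whose first element is the hidden states
    # of shape (batch, seq_len, hidden_size); add the steering direction to
    # every token position.
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + alpha * steering_vector.to(hidden.dtype)
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = model.transformer.h[layer_idx].register_forward_hook(add_steering)

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))

handle.remove()  # detach the hook when done
```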
@inproceedings{
yang2026hallucination,
title={Hallucination Reduction with CASAL: Contrastive Activation Steering for Amortized Learning},
author={Wannan Yang and Xinchi Qiu and Lei Yu and Yuchen Zhang and Aobo Yang and Narine Kokhlikyan and Nicola Cancedda and Diego Garcia-Olano},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=YM3RcI3q0E}
}

CASAL is MIT licensed, as found in the LICENSE file.
