LegalΔ: Enhancing Legal Reasoning in LLMs via Reinforcement Learning with Chain-of-Thought Guided Information Gain

Source code for our paper:
LegalΔ: Enhancing Legal Reasoning in LLMs via Reinforcement Learning with Chain-of-Thought Guided Information Gain

Click the links below to view our paper and checkpoints:

If you find this work useful, please cite our paper and give us a shining star 🌟

@misc{dai2025legaldeltaenhancinglegalreasoning,
      title={Legal$\Delta$: Enhancing Legal Reasoning in LLMs via Reinforcement Learning with Chain-of-Thought Guided Information Gain}, 
      author={Xin Dai and Buqiang Xu and Zhenghao Liu and Yukun Yan and Huiyuan Xie and Xiaoyuan Yi and Shuo Wang and Ge Yu},
      year={2025},
      eprint={2508.12281},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.12281}, 
}

Overview

LegalΔ is a reinforcement learning framework designed to enhance legal reasoning through CoT-guided information gain. During training, LegalΔ employs a dual-mode input setup (a direct-answer mode and a reasoning-augmented mode) and maximizes the information gain between them. This encourages the model to acquire meaningful reasoning patterns rather than generating superficial or redundant explanations. LegalΔ follows a two-stage approach: (1) distilling latent reasoning capabilities from a powerful Large Reasoning Model (LRM), DeepSeek-R1, and (2) refining reasoning quality via differential comparisons, combined with a multidimensional reward mechanism that assesses both structural coherence and legal-domain specificity.
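
To make the dual-mode idea concrete, here is a minimal sketch (not the authors' implementation; the model name, prompt concatenation, and scoring details are all assumptions) that scores a chain of thought by how much it improves the log-likelihood of the final answer compared with answering directly.

# Illustrative only: estimate a CoT's "information gain" as the improvement in
# the answer's log-likelihood when the reasoning is prepended to the question.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-14B-Instruct"          # base model used in this repo
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
model.eval()

def answer_logprob(prompt: str, answer: str) -> float:
    """Sum of log-probabilities of the answer tokens, conditioned on the prompt.
    Assumes the prompt's tokenization is a prefix of the full sequence's
    tokenization (an approximation that is fine for a sketch)."""
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tok(prompt + answer, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)      # position i predicts token i+1
    return sum(log_probs[i, full_ids[0, i + 1]].item()
               for i in range(prompt_len - 1, full_ids.shape[1] - 1))

question, reasoning, answer = "...", "...", "..."               # placeholders
direct = answer_logprob(question, answer)                       # direct-answer mode
with_cot = answer_logprob(question + reasoning, answer)         # reasoning-augmented mode
information_gain = with_cot - direct                            # higher means the CoT actually helps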

Set Up

Use git clone to download this project

git clone https://github.com/NEUIR/LegalDelta.git
cd LegalDelta

To prevent package conflicts, we use two separate virtual environments: one for model inference and one for model training.

For model inference:
conda env create -n qwen_inf -f inference_environment.yml

For model training:
conda create -n legal_delta python=3.11
conda activate legal_delta
pip install -r requirements.txt --force-reinstall --no-deps --no-cache-dir

Using the LegalΔ model

(1) Use git clone to download the model. ❗️Note: These are LoRA checkpoints of the legal models; please merge them with the base model before use (the merge step is described in the Training section below).

git clone https://huggingface.co/Xubqpanda/LegalDelta;
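
A minimal inference sketch, assuming the LoRA weights have already been merged into the base model (the merged checkpoint path below is a placeholder; see the merge step in the Training section):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

merged_path = "LegalDelta-merged"        # placeholder: output directory of the merge step
tok = AutoTokenizer.from_pretrained(merged_path)
model = AutoModelForCausalLM.from_pretrained(merged_path, torch_dtype=torch.bfloat16, device_map="auto")

prompt = "..."                           # your legal question
inputs = tok(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(tok.decode(output[0], skip_special_tokens=True))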

(2) Evaluation: Different tasks require different evaluation metrics.

  • For Lawbench, please visit here

  • For Lexeval, please visit here

  • For Disclaw, please visit here

Training the model

After constructing the training data, you can start training the LegalΔ model.

(1) First step: Download the Qwen2.5-14B-Instruct model to serve as the zero-shot base model.

(2) Second step: Use LoRA to train the model (an illustrative configuration sketch follows the commands below).

conda activate legal_delta
bash scripts/train.sh
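
The run itself is driven by scripts/train.sh; the sketch below only illustrates the kind of LoRA setup such a run configures with peft. All hyperparameters and target modules here are placeholders, not the paper's values.

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-14B-Instruct")
lora_cfg = LoraConfig(
    r=16,                                                    # placeholder rank
    lora_alpha=32,                                           # placeholder scaling
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()                           # sanity check: only LoRA params are trainable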

(3) Third step: Select the checkpoint with the lowest eval loss and merge the LoRA weights trained in the second step into the base model.

python src/merge_lora.py
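
src/merge_lora.py performs this merge; the sketch below only illustrates the general idea with peft (base model name, adapter path, and output directory are placeholders), not the script's actual contents.

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_name = "Qwen/Qwen2.5-14B-Instruct"
adapter_path = "outputs/checkpoint-best"     # placeholder: LoRA checkpoint with the lowest eval loss
out_dir = "LegalDelta-merged"                # placeholder: where to write the merged model

base = AutoModelForCausalLM.from_pretrained(base_name, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, adapter_path)
model = model.merge_and_unload()             # fold the LoRA weights into the base weights
model.save_pretrained(out_dir)
AutoTokenizer.from_pretrained(base_name).save_pretrained(out_dir)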

Contact

If you have questions, suggestions, or bug reports, please email:

daix1@mails.neu.edu.cn
