conda activate coconut
pip install -r requirements.txt

Training:

python train.py \
--model_name gpt2 \
--batch_size 128 \
--learning_rate 1e-4 \
--num_epochs 5 \
--max_length 512 \
--latent_thoughts_per_step 1 \
--max_latent_length 20 \
--num_training_stages 4 \
--warmup_steps 100

Evaluation:

python eval.py \
--model <model_path> \
--data_path <test_data_path> \
--layers_to_delete <one or more layer indices to delete> \
--early_exit_bound <confidence bound for early exit>

Model paths:
llama-3.2-1b-gsm8k-stepscot/checkpoint-16000/ for the model trained on full CoT.
checkpoints/llama-3.2.1b_gsm8k/checkpoint_75 for the model trained using the Internalized CoT script.
Test data paths:
To evaluate the internalized model trained with the Internalized CoT script, use Internalize_CoT_Step_by_Step/data/gsm8k/test.txt as the test data.
To evaluate the model trained explicitly to output CoT, use implicit_chain_of_thought/data/gsm8k/test.txt as the test data.
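To illustrate what the eval flags above control, here is a minimal, self-contained sketch of layer deletion (--layers_to_delete) and confidence-based early exit (--early_exit_bound). The toy layers, the lm_head function, and the max-probability confidence test are illustrative assumptions, not the actual eval.py implementation.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def forward_with_early_exit(layers, lm_head, hidden, layers_to_delete, bound):
    """Run the layers that survive deletion; stop as soon as the top
    token probability exceeds `bound` (the --early_exit_bound idea)."""
    kept = [l for i, l in enumerate(layers) if i not in set(layers_to_delete)]
    probs = softmax(lm_head(hidden))
    for depth, layer in enumerate(kept, start=1):
        hidden = layer(hidden)
        probs = softmax(lm_head(hidden))
        if max(probs) >= bound:
            return probs, depth  # confident enough: skip the deeper layers
    return probs, len(kept)

# Toy model: each "layer" nudges the hidden state; lm_head maps it to
# logits over a two-token vocabulary.
layers = [lambda h, k=k: [x + 0.5 * k for x in h] for k in range(1, 5)]
lm_head = lambda h: [h[0], -h[0]]

# Delete layer index 1, exit once the top token reaches 0.9 probability.
probs, depth = forward_with_early_exit(layers, lm_head, [0.0, 0.0],
                                       layers_to_delete=[1], bound=0.9)
```

In this toy run the model deletes one of four layers and exits after the second kept layer, once the confidence bound is crossed, which is the same trade-off the two eval flags expose: fewer layers and earlier exits cost accuracy but save compute.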