
LLM

Complete implementations of large language models, including all sub-components, along with training and fine-tuning implementations (e.g., LoRA and QLoRA).

Repository Structure

finetuning/
├── lora/              # LoRA implementation & intuition
└── qlora/             # QLoRA implementation & intuition

models/
├── gpt/               # GPT-1 style implementation
└── llama/             # LLaMA-1/2 implementation
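
As context for the finetuning/ folders, the core LoRA idea fits in a few lines: freeze the pretrained weight and learn a low-rank additive update. The sketch below is illustrative PyTorch, not the repository's exact code; the class and argument names are assumptions.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B A x, where A is (r x in) and B is (out x r)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pretrained weights stay frozen
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # Gaussian init
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))        # zero init: update starts at 0
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)
```

QLoRA follows the same recipe but stores the frozen base weights in 4-bit quantized form, dequantizing them on the fly during the forward pass.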

What's Implemented

GPT (models/gpt/):

  • Multi-head self-attention with causal masking
  • Learned positional embeddings
  • LayerNorm, feedforward blocks
  • Training loop with loss estimation
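
The heart of the GPT implementation is the first bullet above: multi-head self-attention with a causal mask. A minimal PyTorch sketch of that mechanism follows; it is illustrative rather than the repository's exact code, and the class name and defaults are assumptions (the defaults mirror the GPT column of the configuration table below).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, embed_dim: int = 384, num_heads: int = 6,
                 context_len: int = 256, dropout: float = 0.2):
        super().__init__()
        assert embed_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.qkv = nn.Linear(embed_dim, 3 * embed_dim, bias=False)  # fused Q, K, V projection
        self.proj = nn.Linear(embed_dim, embed_dim)
        self.dropout = nn.Dropout(dropout)
        # Lower-triangular mask: token t may only attend to positions <= t.
        self.register_buffer("mask", torch.tril(torch.ones(context_len, context_len)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=-1)
        # (B, T, C) -> (B, heads, T, head_dim)
        q = q.view(B, T, self.num_heads, self.head_dim).transpose(1, 2)
        k = k.view(B, T, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(B, T, self.num_heads, self.head_dim).transpose(1, 2)
        att = (q @ k.transpose(-2, -1)) / (self.head_dim ** 0.5)      # scaled dot-product scores
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))  # causal masking
        att = self.dropout(F.softmax(att, dim=-1))
        out = (att @ v).transpose(1, 2).contiguous().view(B, T, C)    # re-merge heads
        return self.proj(out)
```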

LLaMA (models/llama/):

  • Multi-head attention with Rotary Position Embeddings (RoPE)
  • RMSNorm (instead of LayerNorm)
  • SwiGLU feedforward network
  • Top-p sampling for generation
  • SentencePiece tokenizer
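
The main architectural difference from GPT is how positions enter the model: instead of learned positional embeddings added to the input, LLaMA rotates query/key channel pairs by position-dependent angles (RoPE). A minimal sketch of the complex-number formulation, illustrative rather than the repository's exact code (function names are assumptions):

```python
import torch

def rope_frequencies(head_dim: int, max_seq_len: int, theta: float = 10000.0) -> torch.Tensor:
    """Precompute the complex rotation factors e^(i * m * theta_k) for every position m."""
    freqs = 1.0 / (theta ** (torch.arange(0, head_dim, 2).float() / head_dim))
    positions = torch.arange(max_seq_len).float()
    angles = torch.outer(positions, freqs)               # (max_seq_len, head_dim / 2)
    return torch.polar(torch.ones_like(angles), angles)  # complex64 rotations

def apply_rope(x: torch.Tensor, freqs_cis: torch.Tensor) -> torch.Tensor:
    """Rotate query or key vectors; x has shape (batch, seq_len, heads, head_dim)."""
    # Pair up adjacent channels and view each pair as one complex number.
    x_complex = torch.view_as_complex(x.float().reshape(*x.shape[:-1], -1, 2))
    rot = freqs_cis[: x.shape[1]].unsqueeze(0).unsqueeze(2)  # broadcast over batch and heads
    return torch.view_as_real(x_complex * rot).flatten(-2).type_as(x)
```

Because the same rotation is applied to queries and keys, their dot product depends only on the relative offset between positions, not on absolute position.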

Usage

GPT:

cd models/gpt
python train.py

LLaMA:

cd models/llama
python generate.py

Default Configurations

Parameter        GPT    LLaMA
Embedding dim    384    4096
Hidden dim       -      11008
Heads            6      32
Layers           6      32
Context length   256    2048
Dropout          0.2    0.0
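
The defaults above map naturally onto config objects. A hypothetical sketch, with field names that are illustrative rather than the repository's actual config:

```python
from dataclasses import dataclass

# Hypothetical config containers mirroring the table above; field names are illustrative.
@dataclass
class GPTConfig:
    embed_dim: int = 384
    num_heads: int = 6
    num_layers: int = 6
    context_len: int = 256
    dropout: float = 0.2

@dataclass
class LlamaConfig:
    embed_dim: int = 4096
    hidden_dim: int = 11008   # SwiGLU feedforward width
    num_heads: int = 32
    num_layers: int = 32
    context_len: int = 2048
    dropout: float = 0.0
```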
