
[TinyLoRA] tinylora implementation #3024

Open
kashif wants to merge 29 commits into huggingface:main from kashif:tinylora

Conversation

@kashif
Contributor

@kashif kashif commented Feb 6, 2026

Adds TinyLoRA, a new PEFT method based on "TinyLoRA: Learning to Reason in 13 Parameters". TinyLoRA achieves extreme parameter efficiency by replacing LoRA's trainable low-rank matrices with a tiny trainable vector projected through fixed random bases.

The key idea: given a frozen SVD decomposition W ≈ B @ A (where B = U @ sqrt(S) and A = sqrt(S) @ V^T), the weight update is delta_W = B @ R @ A where R is an r x r trainable matrix (following LoRA-XS). TinyLoRA takes this further by parameterizing R as a linear combination of fixed random projection matrices:

  R = sum_i(v[i] * P[i])

where v is the only trainable parameter (as small as 13 values) and P_i are fixed random matrices seeded deterministically.
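The decomposition above can be sketched numerically. This is a minimal illustration of the update rule with assumed toy shapes, not the PEFT implementation; all variable names here are for exposition only.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, u = 8, 6, 2, 13  # r: frozen SVD rank, u: length of trainable v

W = rng.standard_normal((d_out, d_in))  # stand-in for a pretrained weight

# Frozen truncated SVD: W ~= B @ A, with sqrt(S) split across both factors
U, S, Vt = np.linalg.svd(W, full_matrices=False)
B = U[:, :r] * np.sqrt(S[:r])            # (d_out, r), frozen
A = np.sqrt(S[:r])[:, None] * Vt[:r, :]  # (r, d_in), frozen

# Fixed random projection bases P_i, seeded deterministically
P = np.random.default_rng(1234).standard_normal((u, r, r))

v = np.zeros(u)  # the only trainable parameter (13 values here)
v[0] = 0.1       # pretend one optimizer step has happened

R = np.einsum("i,ijk->jk", v, P)  # R = sum_i v[i] * P[i], shape (r, r)
delta_W = B @ R @ A               # weight update applied on top of W
```
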

Features

  • Extreme efficiency: trainable parameter count is u per target module (or even fewer with weight tying), compared to r * (in + out) for LoRA
  • Weight tying: configurable sharing of v vectors across layers via weight_tying (0.0 = no sharing, 1.0 = all layers share one v)
  • SVD initialization: frozen A and B matrices computed from truncated SVD of pretrained weights, with singular values distributed equally via sqrt(S)
  • Full layer support: nn.Linear, Conv1D, and nn.Embedding
  • Merge/unmerge: full support including safe merge with NaN checking
  • LoRA conversion: supports_lora_conversion() -> True — delta weights can be converted to standard LoRA format via get_delta_weight
  • Deterministic projections: P matrices are seeded per-layer for reproducibility; optionally saved in checkpoints (save_projection=True)
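The LoRA-conversion point follows from the algebra: since delta_W = B @ R @ A, folding R into one side yields ordinary rank-r LoRA factors. A sketch with assumed shapes (not the actual `get_delta_weight` code):

```python
import numpy as np

rng = np.random.default_rng(42)
d_out, d_in, r = 8, 6, 2
B = rng.standard_normal((d_out, r))  # frozen SVD factor
A = rng.standard_normal((r, d_in))   # frozen SVD factor
R = rng.standard_normal((r, r))      # TinyLoRA's combined projection

delta_W = B @ R @ A

# Fold R into B to obtain plain rank-r LoRA factors
lora_B = B @ R  # plays the role of LoRA's B matrix, (d_out, r)
lora_A = A      # plays the role of LoRA's A matrix, (r, d_in)

assert np.allclose(delta_W, lora_B @ lora_A)
```
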

Config

  from peft import TinyLoraConfig, get_peft_model

  config = TinyLoraConfig(
      r=2,              # SVD rank (frozen)
      u=64,             # trainable vector dimension
      weight_tying=0.0, # 0.0=no sharing, 1.0=full sharing
      target_modules="all-linear",
  )
  model = get_peft_model(base_model, config)
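To make the efficiency claim concrete, here is back-of-the-envelope arithmetic for a single 4096x4096 Linear layer (illustrative numbers only, assuming no weight tying):

```python
# Trainable parameters for one Linear layer: LoRA vs TinyLoRA
d_in, d_out = 4096, 4096
r_lora, u = 8, 64

lora_params = r_lora * (d_in + d_out)  # LoRA trains A and B: r * (in + out)
tinylora_params = u                    # TinyLoRA trains only v: u values

print(lora_params)      # 65536
print(tinylora_params)  # 64
```
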

Architecture

  • TinyLoraLayer (base): SVD decomposition, projection init, get_delta_weight, supports_lora_conversion
  • Linear / Embedding: forward pass, merge/unmerge
  • TinyLoraModel: weight tying groups, shared v parameter management via nested ModuleDict/ParameterDict
  • update_layer follows LoRA's config-object pattern: (adapter_name, tinylora_v, v_key, r, config, **kwargs)
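For intuition on the weight-tying groups, here is a purely hypothetical mapping from the `weight_tying` fraction to shared-v groups; only the endpoint semantics (0.0 and 1.0) come from the PR description, and the actual grouping rule in the implementation may differ.

```python
# Hypothetical sketch: map a weight_tying fraction to v-sharing groups.
# Only the endpoints match the documented behavior; interpolation is assumed.
def tying_groups(num_layers: int, weight_tying: float) -> list:
    # weight_tying=0.0 -> one v per layer; weight_tying=1.0 -> one shared v
    num_groups = max(1, round(num_layers * (1.0 - weight_tying)))
    return [layer * num_groups // num_layers for layer in range(num_layers)]

print(tying_groups(4, 0.0))  # [0, 1, 2, 3]
print(tying_groups(4, 1.0))  # [0, 0, 0, 0]
```
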

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@kashif kashif requested a review from githubnemo February 8, 2026 08:21
@kashif
Contributor Author

kashif commented Feb 10, 2026

cc @jxmorris12 I have an implementation of TinyLoRA if you can kindly have a look?

Collaborator

@githubnemo githubnemo left a comment


Hey @kashif :)

Thanks for the PR, this is already solid.
Merging with main should hopefully resolve the CI errors.

Some questions and comments below.

@kashif
Contributor Author

kashif commented Feb 23, 2026

thanks, fixing

kashif and others added 13 commits February 23, 2026 19:44
Co-authored-by: githubnemo <githubnemo@users.noreply.github.com>
@kashif
Contributor Author

kashif commented Feb 24, 2026

@githubnemo should be ready for another review thanks

Collaborator

@githubnemo githubnemo left a comment


Thanks for the quick response :)

I think implementation-wise this is, except for two nits, good to go.

Let's add an example that showcases the primary use-case and add the method to the method comparison suite (maybe copy from method_comparison/MetaMathQA/experiments/lora/... and see where it takes us).

It'd also be stellar to have a commit message / PR description that is meaningful in the commit history.

@kashif kashif changed the title [TinyLoRA] initial tinylora implementation [TinyLoRA] tinylora implementation Feb 24, 2026
@kashif
Contributor Author

kashif commented Feb 24, 2026

ready @githubnemo

@BenjaminBossan
Member

Thanks for the PR Kashif. I ran the experiments on my machine and got a test accuracy of 0% and 0.002% :)
This isn't really surprising with 3584 and 64 trainable parameters, respectively, and basically confirms the paper results. They are still good to include, but without context users could draw the wrong conclusion.

@kashif
Contributor Author

kashif commented Feb 25, 2026

yes @BenjaminBossan, I will test with the RL setup; we can wait if that's ok. I also want to double-check that nothing is wrong

@BenjaminBossan
Member

Out of curiosity, I wanted to check if TinyLoRA can achieve better scores if we increase the number of trainable parameters. So I took the default* setting and increased u and indeed, we can get decent results.

| u    | accelerator memory max | num trainable params | test_accuracy       |
|------|------------------------|----------------------|---------------------|
| 512  | 20310917120            | 28672                | 0.30477634571645185 |
| 1024 | 20434649088            | 57344                | 0.3434420015163002  |
| 2024 | 20673724416            | 113344               | 0.31766489764973466 |

Given the still tiny number of trainable parameters, this result is quite respectable. This is also a nice confirmation that there is no major bug in the implementation.

I wonder if it would make sense to have a "maximalist" and a "minimalist" config, i.e. one with more trainable parameters and better score and one with extremely few trainable parameters (basically the current llama-3.2-3B-weight-tying) and low score.

*One more change I did was to increase r to 32 as 2 seemed pretty small to me, but that was just on a whim.

@githubnemo
Collaborator

I wonder if it would make sense to have a "maximalist" and a "minimalist" config, i.e. one with more trainable parameters and better score and one with extremely few trainable parameters (basically the current llama-3.2-3B-weight-tying) and low score.

I think that's a good thing to have!

I also wondered if it would make sense to extend the target modules to all-linear since it is so cheap, but then again this would diverge from other experiments quite a lot and not be as comparable.

@kashif
Contributor Author

kashif commented Mar 1, 2026

should we just document this? or add it somewhere else?

@BenjaminBossan
Member

I also wondered if it would make sense to extend the target modules to all-linear since it is so cheap, but then again this would diverge from other experiments quite a lot and be not as comparable.

I'd rather target either the attention xor the MLP part for consistency with other experiments.

should we just document this? or add it somewhere else?

What does "this" reference here?

@kashif
Contributor Author

kashif commented Mar 2, 2026

ah sorry, i meant the minimal/maximal config?

@githubnemo
Collaborator

githubnemo commented Mar 3, 2026

ah sorry, i meant the minimal/maximal config?

Yes, let's add a 'maximalist' config with (possibly) r > 2 and u >= 2048. VeRA has ~128k parameters with 37.6% task accuracy according to https://huggingface.co/spaces/peft-internal-testing/PEFT-method-comparison - maybe it makes sense to match that setting (u ~= 2300) to see where the ceiling is within a reasonable memory budget (~+0.5GB).

