
Neural Foam

Dynamic neuron growth for transformers. Grow new neurons during training instead of using fixed architectures. Preserves existing knowledge while adding new capabilities.

Results

Trained on Qwen2.5-1.5B-Instruct, adding tool use + identity + autonomous reasoning while preserving science reasoning:

| Capability | Neural Foam | Raw Qwen 1.5B | Delta |
|---|---|---|---|
| ARC-Easy (log-likelihood) | 74.0% | 77.5% | -3.5 |
| ARC-Challenge | 65.0% | 70.0% | -5.0 |
| Tool Use | 10/10 | 0/10 | +10 |
| Custom Identity | 5/5 | 0/5 | +5 |
| Autonomous Reasoning | 5/5 | 0/5 | +5 |

Only a 3.5-point drop on ARC-Easy while adding four new capability dimensions. Standard fine-tuning on the same data drops ARC-Easy to 29.5%.

Trained model: spartan8806/chimera-v3-qwen-1.5b

How It Works

Neural Foam replaces standard nn.Linear layers with GrowableLinear layers that can dynamically add neurons during training:

  1. Percentile-based triggers — Only the top 10% of gradient pressure triggers growth (not every layer, not every step)
  2. GradMax initialization — New neurons are initialized from high-gradient sources, not random
  3. Contrastive growth — New neurons are orthogonalized against existing ones to ensure diversity
  4. Neuron maturation — Young neurons get higher learning rates and can't be pruned until mature
  5. Growth cooldown — Minimum 200 steps between growth events (growth should be rare and deliberate)
  6. Memory replay — Buffer of old examples mixed into training to prevent catastrophic forgetting

The core insight: growth should be like neurogenesis, not cancer. Rare, deliberate, and targeted.
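Composed, the mechanisms above come together in an ordinary training loop. The sketch below uses only the API documented in Quick Start (GrowableQwen, MemoryReplayBuffer, update_loss, check_and_grow, get_replay_batch); new_data, compute_loss, and optimizer are illustrative placeholders, not part of the library:

from neural_foam import GrowableQwen, MemoryReplayBuffer

model = GrowableQwen("Qwen/Qwen2.5-1.5B-Instruct", enable_growth=True)
buffer = MemoryReplayBuffer(max_size=10000)          # seeded with examples to preserve

for step, new_examples in enumerate(new_data):       # new_data: your fine-tuning examples
    # 6. Memory replay: mix ~20% old examples into every batch
    batch = buffer.get_replay_batch(new_examples, replay_ratio=0.2)

    loss = compute_loss(model, batch)                 # illustrative forward + loss helper
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

    model.update_loss(loss.item())                    # feeds the optional plateau check

    # 1 & 5. Growth is gated by the percentile trigger and the 200-step cooldown,
    # so most of these calls do nothing
    if step % 100 == 0:
        model.check_and_grow()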

Install

pip install neural-foam

# With training dependencies
pip install neural-foam[training]

Or from source:

git clone https://github.com/spartan8806/neural-foam.git
cd neural-foam
pip install -e .

Reproducing Results

Run the ablation study to verify growth vs non-growth on your hardware:

cd examples
python ablation_study.py

This trains two models on identical data (tool use, identity, autonomy) with ARC memory replay:

  • Baseline: Standard fine-tuning (growth OFF)
  • Neural Foam: Growth enabled

Results saved to ablation_results/ablation_comparison.json with side-by-side metrics.

Expected runtime: ~30-40 min on RTX 3060 12GB.

Quick Start

Replace any nn.Linear with a growable version

from neural_foam import GrowableLinear

# Drop-in replacement for nn.Linear
layer = GrowableLinear(768, 3072, enable_replacement=True)

# During training, periodically check for growth
should_grow, source_indices = layer.check_growth()
if should_grow:
    result = layer.perform_growth(source_indices)
    print(f"Grew {result['born']} neurons, replaced {result['replaced']}")

Wrap a full Qwen model

from neural_foam import GrowableQwen

model = GrowableQwen(
    "Qwen/Qwen2.5-1.5B-Instruct",
    enable_growth=True,
    enable_replacement=True,   # V3: recycle dead neurons
    freeze_attention=True,     # Only train FFN (where growth happens)
)

# In training loop:
model.update_loss(loss.item())

if step % 100 == 0:
    result = model.check_and_grow()
    # result = {'born': 5, 'replaced': 2, 'died': 0}

Memory replay to prevent forgetting

from neural_foam import MemoryReplayBuffer

buffer = MemoryReplayBuffer(max_size=10000)

# Add examples you want to preserve
buffer.add("What is photosynthesis?", "Photosynthesis is...", loss=0.3)

# Mix old and new training data
mixed_batch = buffer.get_replay_batch(new_examples, replay_ratio=0.2)
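For the ARC-preservation setup used in the results above, the buffer would be seeded with the examples you want to protect before fine-tuning starts. A small sketch, where arc_examples and its per-example losses are illustrative placeholders:

# arc_examples: hypothetical iterable of (question, answer, loss) tuples
for question, answer, loss in arc_examples:
    buffer.add(question, answer, loss=loss)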

Curriculum learning (easy to hard)

from neural_foam import CurriculumLearning

curriculum = CurriculumLearning()
sorted_data = curriculum.sort_by_difficulty(training_examples)
easy, medium, hard = curriculum.get_phases(training_examples)

V3 vs Grow-Only Mode

V3 (replacement ON): Best for single-domain training or when starting from a fine-tuned base. Recycles dead neurons. First version to beat standard fine-tuning.

model = GrowableQwen(enable_replacement=True)  # V3

Grow-only (replacement OFF, default): Better for multi-domain continual learning. Never replaces existing neurons, only adds new ones.

model = GrowableQwen(enable_replacement=False)  # Grow-only

Key Parameters

| Parameter | Default | Description |
|---|---|---|
| gradient_percentile | 90.0 | Only top 10% gradients trigger growth |
| max_growth_per_step | 10 | Neurons added per growth event |
| growth_cooldown | 200 | Min steps between growth events |
| maturation_age | 500 | Steps before a neuron is "mature" |
| enable_replacement | False | Recycle dead neurons (V3 mode) |
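These defaults can presumably be overridden at construction time. The sketch below assumes GrowableQwen accepts the table's names as keyword arguments, which the snippets above do not show explicitly:

from neural_foam import GrowableQwen

model = GrowableQwen(
    "Qwen/Qwen2.5-1.5B-Instruct",
    enable_growth=True,
    enable_replacement=False,   # grow-only mode (default)
    gradient_percentile=90.0,   # assumed kwarg: only top-10% gradient pressure triggers growth
    max_growth_per_step=10,     # assumed kwarg
    growth_cooldown=200,        # assumed kwarg
    maturation_age=500,         # assumed kwarg
)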

How Growth Works

Growth is triggered when all of these conditions are met:

  1. Cooldown passed: steps_since_last_growth ≥ growth_cooldown (default: 200 steps)
  2. High gradient pressure: gradient_ema[i] > quantile(gradient_ema, 0.90) — only top 10% of neurons by gradient
  3. Optional plateau: Loss change over last 100 steps < threshold (disabled by default)

The gradient EMA is updated per-step:

gradient_ema[i] = 0.9 × gradient_ema[i] + 0.1 × |∇w[i]|
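In code, the per-step statistic and the trigger check amount to something like the following. This is a minimal NumPy sketch of the logic described above, not the library's internal implementation; gradient_ema and grad_magnitude hold one scalar per output neuron (e.g., the norm of each neuron's weight gradient):

import numpy as np

def update_gradient_ema(gradient_ema, grad_magnitude, decay=0.9):
    # gradient_ema[i] = 0.9 * gradient_ema[i] + 0.1 * |grad[i]|
    return decay * gradient_ema + (1.0 - decay) * np.abs(grad_magnitude)

def should_grow(gradient_ema, steps_since_last_growth,
                growth_cooldown=200, gradient_percentile=90.0):
    # Condition 1: the growth cooldown has passed
    if steps_since_last_growth < growth_cooldown:
        return False, None
    # Condition 2: only neurons whose gradient pressure sits above the
    # 90th percentile qualify as growth sources
    threshold = np.quantile(gradient_ema, gradient_percentile / 100.0)
    source_indices = np.where(gradient_ema > threshold)[0]
    return len(source_indices) > 0, source_indices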

When triggered, new neurons are initialized via GradMax:

w_new = w_source + noise
w_new = orthogonalize(w_new, existing_neurons)  # contrastive growth

Young neurons (age < 500 steps) get 2× learning rate and cannot be pruned/replaced.
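A sketch of that initialization step, with the orthogonalization done by projecting out the span of the existing neurons. This is illustrative only, not the library's actual GradMax / contrastive-growth code:

import torch

def init_new_neuron(w_source, existing, noise_scale=0.01):
    # GradMax-style start: copy a high-gradient source neuron and add small noise
    w_new = w_source + noise_scale * torch.randn_like(w_source)
    # Contrastive growth: project out the span of the existing neurons so the
    # new direction is orthogonal to all of them (QR gives an orthonormal basis)
    Q, _ = torch.linalg.qr(existing.T)     # existing: (n_neurons, in_features)
    w_new = w_new - Q @ (Q.T @ w_new)
    return w_new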

This ensures growth is:

  • Rare (cooldown + percentile threshold)
  • Targeted (high-gradient sources only)
  • Diverse (orthogonalization prevents duplicates)

Research References

  • RigL (Google, 2020) — gradient-based regrowth
  • GradMax (2022) — SVD initialization for new neurons
  • NICE (CVPR 2024) — neuron maturation
  • Wanda (2024) — pruning criterion
  • NeurRev (ICLR 2024) — dormant neuron prevention

Hardware

The Chimera V3 model (1.5B params) trains in ~13 minutes on a single RTX 3060 12GB using bfloat16 + 8-bit Adam.

License

Apache 2.0
