
[Refactor] Reduce per-model code duplication in modelconfig package #587

@pallasathena92

Description

Summary

The pkg/hfutil/modelconfig package has 36 separate .go model files (6,187 lines) that each define a struct and implement the same HuggingFaceModel interface with near-identical methods. No external code depends on specific model types — all consumers use the interface only.

Key Findings

  • GetModelSizeBytes(), GetQuantizationType() are character-for-character identical across 30+ files
  • GetParameterCount() follows the exact same 3-phase pattern (safetensors → hardcoded lookup → estimation) in every file
  • GetContextLength() differs meaningfully in only 4 models
  • HasVision() returns false in 30+ files; IsEmbedding() returns false in 34 files
  • 9 different parameter estimation functions scattered across 6 files
  • Zero type assertions to specific model configs in production code
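The duplicated 3-phase GetParameterCount() pattern can be sketched as follows. This is a minimal stand-in, not the package's actual code: the llamaConfig type, officialParamCounts table, and estimateParams helper are illustrative names only.

```go
package main

import "fmt"

// llamaConfig is a hypothetical stand-in for one of the 36 per-model structs.
type llamaConfig struct {
	ModelType       string
	HiddenSize      int64
	NumHiddenLayers int64
	VocabSize       int64
	safetensorCount int64 // 0 when no safetensors metadata is available
}

// officialParamCounts stands in for the hardcoded per-model lookup (phase 2).
var officialParamCounts = map[string]int64{
	"llama-7b": 6_738_415_616,
}

// estimateParams is a rough phase-3 fallback using the classic 12*L*h^2
// transformer approximation plus embeddings (illustrative formula only).
func estimateParams(hidden, layers, vocab int64) int64 {
	return vocab*hidden + 12*layers*hidden*hidden
}

// getParameterCount shows the 3-phase pattern repeated in every model file:
// safetensors metadata, then hardcoded lookup, then estimation.
func (c *llamaConfig) getParameterCount() int64 {
	if c.safetensorCount > 0 { // phase 1: exact count from safetensors
		return c.safetensorCount
	}
	if n, ok := officialParamCounts[c.ModelType]; ok { // phase 2: lookup
		return n
	}
	return estimateParams(c.HiddenSize, c.NumHiddenLayers, c.VocabSize) // phase 3
}

func main() {
	c := &llamaConfig{ModelType: "llama-7b"}
	fmt.Println(c.getParameterCount()) // hits the phase-2 lookup
}
```

Since only the phase-2 table and occasional phase-3 formula differ per model, everything else is a candidate for a shared default implementation.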

Bugs

  • mistral.go:144: IsEmbedding() returns true — wrong for an LLM
  • phi.go, llama4.go: shadow ConfigPath field already in BaseModelConfig

Refactoring Plan

Phase 1: Infrastructure (no behavior change)

Step 1.1 — Consolidate estimation functions in interface.go

9 estimation functions scattered across 6 files:

| Function | Location | Action |
| --- | --- | --- |
| estimateModelParams | phi3_v.go:111 | Move to interface.go as canonical EstimateModelParams() |
| estimateGenericParams | interface.go:265 | Replace with the more accurate phi3_v version |
| estimateParamsFromArchitecture | llama.go:141 | Replace with call to EstimateModelParams |
| estimateTextParams | mllama.go:132 | Move to interface.go as shared helper |
| estimateVisionParams | mllama.go:138 | Deduplicate (identical to estimateTextParams) |
| estimateMoEParamCount | llama4.go:258 | Consolidate into shared EstimateMoEParams |
| estimateMoEParams | deepseek_vl.go:204 | Consolidate into shared EstimateMoEParams |
| estimateQwen3VLMoEParams | qwen3_vl.go:169 | Deduplicate with shared MoE estimator |
| estimateQwen3VLVisionParams | qwen3_vl.go:190 | Move to interface.go as shared helper |
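A hedged sketch of what the canonical EstimateModelParams() could look like after consolidation. The signature and formula here are assumptions (the real function would mirror the phi3_v.go version); it approximates a dense transformer as token embeddings plus, per layer, four attention projections and a gated three-projection MLP.

```go
package main

import "fmt"

// EstimateModelParams is an illustrative sketch of the consolidated
// phase-3 estimator, not the package's actual implementation.
func EstimateModelParams(hidden, intermediate, layers, vocab int64) int64 {
	embedding := vocab * hidden      // token embeddings (often tied with lm_head)
	attention := 4 * hidden * hidden // q, k, v, o projections
	mlp := 3 * hidden * intermediate // gate, up, down projections
	return embedding + layers*(attention+mlp)
}

func main() {
	// Llama-2-7B-like shape: hidden 4096, intermediate 11008, 32 layers, 32000 vocab.
	// Yields ~6.6B, close to the official 6.74B (norms and untied head omitted).
	fmt.Println(EstimateModelParams(4096, 11008, 32, 32000))
}
```

One function with explicit hyperparameters replaces nine near-duplicates; model-specific variants (MoE, vision towers) become thin wrappers that add their extra terms.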

Step 1.2 — Fix mistral.go IsEmbedding bug

Step 1.3 — Add StandardModelConfig to interface.go

New struct embedding BaseModelConfig with common transformer fields. Provides default implementations for GetParameterCount(), GetContextLength(), GetModelSizeBytes(), GetQuantizationType() with a per-model paramLookupTable.
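A sketch of the proposed StandardModelConfig. All field and method names below are assumptions drawn from the plan's description, not the existing code:

```go
package main

import "fmt"

// BaseModelConfig stands in for the existing embedded base type.
type BaseModelConfig struct {
	ConfigPath     string
	TorchDtype     string
	ModelSizeBytes int64
}

// StandardModelConfig embeds the base and adds the common transformer
// fields, giving every simple text model its defaults for free.
type StandardModelConfig struct {
	BaseModelConfig
	MaxPositionEmbeddings int64 `json:"max_position_embeddings"`
	HiddenSize            int64 `json:"hidden_size"`
	NumHiddenLayers       int64 `json:"num_hidden_layers"`
	VocabSize             int64 `json:"vocab_size"`
	paramLookupTable      map[string]int64 // per-model official counts (phase 2)
}

// Shared default implementations.
func (c *StandardModelConfig) GetContextLength() int64  { return c.MaxPositionEmbeddings }
func (c *StandardModelConfig) GetModelSizeBytes() int64 { return c.ModelSizeBytes }
func (c *StandardModelConfig) GetQuantizationType() string {
	if c.TorchDtype == "" {
		return "unknown"
	}
	return c.TorchDtype
}
func (c *StandardModelConfig) HasVision() bool   { return false }
func (c *StandardModelConfig) IsEmbedding() bool { return false }

// A per-model wrapper stays a one-liner and only overrides what differs.
type MistralConfig struct{ StandardModelConfig }

func main() {
	m := &MistralConfig{}
	m.MaxPositionEmbeddings = 32768
	fmt.Println(m.GetContextLength(), m.IsEmbedding())
}
```

With this in place, the 30+ copies of HasVision()/IsEmbedding() and the identical size/quantization getters disappear from the per-model files.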

Phase 2: Consolidate simple text models (~20 files → 1 file)

Create models_text.go with thin wrapper structs embedding StandardModelConfig, loaded through a single generic loader factory (Go 1.25).

Models: llama, mistral, gemma, gemma2, gemma3_text, phi3, phi3small, exaone, command_r, internlm, internlm2, stablelm, xverse, minicpm, minicpm3

Create models_text_special.go for models with custom GetContextLength(): Qwen family (qwen, qwen2, qwen3), Baichuan.
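The generic loader factory could look like the sketch below: one type-parameterized function unmarshals config.json into any concrete model type and returns it as the interface, replacing ~20 near-identical per-model loaders. The names (loadConfig, HuggingFaceModel's shown subset) are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// HuggingFaceModel is shown with a single method for brevity.
type HuggingFaceModel interface {
	GetContextLength() int64
}

type StandardModelConfig struct {
	MaxPositionEmbeddings int64 `json:"max_position_embeddings"`
}

func (c *StandardModelConfig) GetContextLength() int64 { return c.MaxPositionEmbeddings }

// Thin wrappers for two of the consolidated text models.
type LlamaConfig struct{ StandardModelConfig }
type GemmaConfig struct{ StandardModelConfig }

// loadConfig is the generic factory: the PT constraint (*T that implements
// the interface) lets Go infer the pointer type, so one function serves
// every wrapper struct.
func loadConfig[T any, PT interface {
	*T
	HuggingFaceModel
}](raw []byte) (HuggingFaceModel, error) {
	cfg := PT(new(T))
	if err := json.Unmarshal(raw, cfg); err != nil {
		return nil, err
	}
	return cfg, nil
}

func main() {
	raw := []byte(`{"max_position_embeddings": 4096}`)
	m, err := loadConfig[LlamaConfig](raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(m.GetContextLength())
}
```

A map from the config's model_type string to the matching instantiation (e.g. loadConfig[LlamaConfig]) would then replace the per-model registration boilerplate.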

Phase 3: Consolidate MoE models (~5 files → 1 file)

Create models_moe.go with MoEModelConfig embedding StandardModelConfig + MoE fields. Override GetParameterCount() with MoE-aware estimation.

Models: mixtral, phimoe, qwen3_moe, gpt_oss, kimi_k2, deepseek_v3
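The MoE-aware override differs from the dense estimate in one way: every expert's MLP counts toward total parameters, while only the experts routed per token count toward active parameters. A hedged sketch (field names and formula are illustrative, not the package's actual code):

```go
package main

import "fmt"

// estimateMoEParams sketches the shared MoE estimator: attention and
// embeddings are counted once, expert MLPs are multiplied by the expert
// count (total) or the routed-expert count (active).
func estimateMoEParams(hidden, intermediate, layers, vocab, numExperts, expertsPerTok int64) (total, active int64) {
	embedding := vocab * hidden
	attention := 4 * hidden * hidden       // shared across experts
	expertMLP := 3 * hidden * intermediate // one expert's gated MLP
	router := hidden * numExperts          // routing gate
	total = embedding + layers*(attention+numExperts*expertMLP+router)
	active = embedding + layers*(attention+expertsPerTok*expertMLP+router)
	return total, active
}

func main() {
	// Mixtral-8x7B-like shape: 8 experts, 2 routed per token.
	// Lands near the published ~47B total / ~13B active figures.
	total, active := estimateMoEParams(4096, 14336, 32, 32000, 8, 2)
	fmt.Println(total, active)
}
```

Consolidating this into one function is what lets the four per-file MoE estimators in the Step 1.1 table collapse.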

Phase 4: Clean up vision models (keep separate, reduce duplication)

Vision models keep individual files (genuinely unique nested configs). But:

  • Remove re-implemented methods that BaseModelConfig already provides
  • Use shared estimation helpers from Phase 1

Phase 5: Clean up standalone special models

Keep as individual files due to non-standard JSON field names:

  • chatglm.go — num_layers, ffn_hidden_size, padded_vocab_size
  • dbrx.go — d_model, n_heads, n_layers
  • bert.go — IsEmbedding() = true, BERT-specific estimation
  • phi.go — doesn't embed BaseModelConfig

Target File Structure

modelconfig/
  interface.go              # Interface, BaseModelConfig, StandardModelConfig, utilities
  safetensors.go            # Safetensors parsing (unchanged)
  diffusion.go              # Diffusion pipelines (unchanged)
  models_text.go            # ~15 simple text models (consolidated)
  models_text_special.go    # Qwen/Baichuan with custom GetContextLength
  models_moe.go             # MoE models (consolidated)
  chatglm.go                # Standalone (non-standard fields)
  dbrx.go                   # Standalone (non-standard fields)
  bert.go                   # Standalone (embedding model)
  phi.go                    # Standalone (non-standard structure)
  mllama.go                 # Vision (standalone)
  llava.go                  # Vision (standalone)
  gemma3.go                 # Vision (standalone)
  qwen2_vl.go               # Vision (standalone)
  qwen3_vl.go               # Vision (standalone)
  phi3_v.go                 # Vision (standalone)
  deepseek_vl.go            # Vision (standalone)
  llama4.go                 # Vision + MoE (standalone)

Result: 36 model files → ~18 files, significant code deduplication

🤖 Generated with Claude Code
