futhgar/homelab-guide

The Production Homelab Guide

License: MIT

A comprehensive, opinionated guide for building a production-grade homelab using Proxmox VE and Kubernetes. Born from years of running a multi-node cluster at home, this guide covers everything from hardware selection to GPU passthrough for AI inference — including the hard-won lessons you won't find in official documentation.

Who This Guide Is For

  • Infrastructure engineers who want a real lab environment at home
  • DevOps/SRE practitioners looking to sharpen Kubernetes skills on bare metal
  • AI enthusiasts who want local GPU inference without cloud costs
  • Self-hosters ready to graduate from a single Raspberry Pi to a proper cluster

You should be comfortable with Linux, basic networking, and the command line. Prior Kubernetes experience helps but isn't required — the guide explains concepts as they come up.

What This Guide Covers

We build a full production-grade homelab stack:

Hardware (Mini PCs) → Proxmox VE Cluster → Kubernetes (kubeadm)
    → Calico CNI → Longhorn Storage → MetalLB Load Balancer
    → Traefik Ingress → ArgoCD GitOps → Prometheus/Grafana Monitoring
    → Velero Backups → GPU Passthrough → AI Inference (Ollama)

This is not a "click through the GUI" tutorial. It's a guide for people who want to understand why things are configured a certain way, not just how.

Table of Contents

| Chapter | Topic | What You'll Learn |
|---|---|---|
| 01 - Hardware Selection | Choosing the right hardware | Mini PCs vs servers, CPU/RAM/storage sizing, power budgets |
| 02 - Proxmox Setup | Proxmox VE cluster | Installation, clustering, storage pools, VM vs LXC |
| 03 - Kubernetes | K8s on Proxmox | Control plane, workers, CNI, storage, load balancing |
| 04 - GitOps | ArgoCD and CI/CD | App-of-apps, repo structure, sealed secrets, GitHub Actions |
| 05 - Monitoring | Observability stack | Prometheus, Grafana, alerting, dashboards |
| 06 - Backups | Backup strategy | Velero, PBS, Longhorn snapshots, restore testing |
| 07 - GPU Passthrough | GPU for AI inference | PCI passthrough, NVIDIA drivers, Ollama on K8s |
| 08 - Gotchas | Lessons learned | The stuff that cost hours of debugging |

Hardware Budget Tiers

Starter ($500) — Single Node

| Component | Recommendation | Est. Cost |
|---|---|---|
| Mini PC | AMD Ryzen 5 (5500U/5600U), 6C/12T | $180-220 |
| RAM | 32 GB DDR4 SO-DIMM (2x16 GB) | $60-80 |
| NVMe | 1 TB NVMe Gen3 | $60-70 |
| Network | Built-in 1 GbE (sufficient for single node) | $0 |
| UPS | APC BE425M (basic surge + battery) | $50-60 |
| Total | | ~$350-430 |

Good for: learning Proxmox, running a few VMs/LXCs, and single-node K8s (k3s or kubeadm). A single node can't do real clustering, but it teaches the fundamentals.

Recommended ($1,500) — 3-Node Cluster

| Component | Recommendation | Est. Cost |
|---|---|---|
| 3x Mini PCs | AMD Ryzen 7 (5700U/5800H), 8C/16T each | $550-700 |
| RAM | 64 GB per node (2x32 GB DDR4) | $300-400 |
| NVMe | 1 TB NVMe per node | $180-210 |
| Network | 5-port unmanaged gigabit switch | $20-30 |
| UPS | CyberPower CP1500AVRLCD | $150-180 |
| Total | | ~$1,200-1,520 |

Good for: a full Proxmox cluster with quorum, 3-node K8s (1 control plane + 2 workers, or combined control-plane/worker nodes), and running 20-30 containers comfortably. This is the sweet spot for most homelabbers.
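
Why three nodes is the smallest cluster that survives a node failure can be sketched with majority-quorum arithmetic. This is a minimal illustration, assuming one vote per node and no external tie-breaker (such as a QDevice); the function names are hypothetical and not part of any Proxmox tooling.

```python
def quorum(nodes: int) -> int:
    """Votes needed for a strict majority, assuming one vote per node."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """Nodes that can be offline while the remainder still holds a majority."""
    return nodes - quorum(nodes)


# Compare the cluster sizes discussed in the budget tiers.
for n in (1, 2, 3, 5, 7):
    print(f"{n} node(s): quorum={quorum(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Under these assumptions a 2-node cluster tolerates no failures, while 3, 5, and 7 nodes tolerate 1, 2, and 3 respectively, which is why odd node counts are the usual choice.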

Production ($3,000+) — 5-7 Node Cluster

| Component | Recommendation | Est. Cost |
|---|---|---|
| 5-7x Mini PCs | Mix of Ryzen 7/9 models | $1,000-2,000 |
| RAM | 32-128 GB per node | $600-1,200 |
| NVMe | 1-2 TB per node | $300-700 |
| GPU Node | Desktop with PCIe slot + used Tesla P40/P100 | $400-800 |
| Network | Managed switch + optional 2.5/10 GbE | $100-300 |
| UPS | Rack UPS, 1500VA+ | $200-300 |
| NAS/Backup | Dedicated NAS or PBS node with large drives | $200-500 |
| Total | | ~$2,800-5,800 |

Good for: production-grade homelab with proper HA, dedicated GPU inference, comprehensive monitoring, and a setup that mirrors real enterprise infrastructure.
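
To adapt a tier to your own parts list, the totals can be recomputed from the per-component ranges. The sketch below copies the (low, high) USD estimates from the three tier tables; the `TIERS` dictionary and `tier_total` helper are illustrative names, and prices will vary by region and over time.

```python
# (low, high) USD estimates copied from the tier tables in this README.
TIERS = {
    "Starter": [(180, 220), (60, 80), (60, 70), (0, 0), (50, 60)],
    "Recommended": [(550, 700), (300, 400), (180, 210), (20, 30), (150, 180)],
    "Production": [
        (1000, 2000), (600, 1200), (300, 700), (400, 800),
        (100, 300), (200, 300), (200, 500),
    ],
}


def tier_total(components):
    """Sum the low and high ends of each component's price range."""
    low = sum(lo for lo, _ in components)
    high = sum(hi for _, hi in components)
    return low, high


for name, components in TIERS.items():
    low, high = tier_total(components)
    print(f"{name}: ${low:,}-${high:,}")
```

Swap in your own ranges (for example, dropping the UPS or adding a second NVMe) to see how the total shifts before buying anything.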

Philosophy

  1. Cattle, not pets — VMs and containers should be reproducible. If you can't rebuild it from config, it's technical debt.
  2. GitOps everything — the Git repo is the source of truth. Manual kubectl apply is for emergencies only.
  3. Security by default — unprivileged LXCs, RBAC everywhere, secrets encrypted at rest, no wildcard admin tokens.
  4. Document the pain — the gotchas chapter exists because every "obvious" fix cost someone hours. Write it down.
  5. Budget-conscious — used enterprise hardware and mini PCs beat new rack servers for homelab use. The cloud bill for equivalent compute would be $500+/month.
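
The budget argument in point 5 can be turned into a rough break-even estimate. This is a back-of-the-envelope sketch only: the $500/month cloud figure comes from the text above, while `power_monthly` (electricity cost) is an assumed parameter that the guide does not quantify, and the function name is hypothetical.

```python
import math


def break_even_months(hardware_cost, cloud_monthly=500.0, power_monthly=0.0):
    """Whole months until hardware spend is offset by avoided cloud bills.

    power_monthly is an assumed running cost for electricity; it defaults
    to 0 because the guide gives no figure for it.
    """
    saving = cloud_monthly - power_monthly
    if saving <= 0:
        raise ValueError("running costs meet or exceed the cloud bill")
    return math.ceil(hardware_cost / saving)


print(break_even_months(1500))                     # Recommended tier -> 3
print(break_even_months(1500, power_monthly=30))   # with assumed $30/month power -> 4
print(break_even_months(5800))                     # top of Production tier -> 12
```

The estimate ignores resale value, your time, and hardware failures, so treat it as an order-of-magnitude check rather than a forecast.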

Contributing

Found an error? Have a gotcha to add? PRs are welcome. Please keep the tone practical and include specific commands/configs where applicable.

License

MIT License — Use this however you want. Attribution appreciated but not required.
