# Memory Modes, Quantization & ROCm 7.2
This release adds memory management and quantization options, giving you more control over the performance/VRAM trade-off across different hardware configurations.
## Features

- **INT8 Quantization Support**
  Reduce VRAM usage by up to ~30–40% with automatic INT8 quantization during model load. Quantization is applied transparently and requires no manual configuration.
- **Memory Modes**
  New memory modes simplify how models are distributed across GPU(s) and CPU memory, ranging from maximum performance to minimum VRAM usage. This makes Diffuse more accessible on low-memory systems while still scaling well on high-end hardware.
- **ROCm 7.2 Support**
  Added support for ROCm 7.2, improving compatibility and stability on supported AMD GPUs.
- **Bug Fixes & Stability Improvements**
  Various fixes and internal cleanups to improve reliability during model loading and inference.
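To give a feel for where the VRAM savings come from, here is a minimal sketch of symmetric per-tensor INT8 quantization in NumPy. This illustrates the general technique only; it is not Diffuse's actual implementation, and the function names are hypothetical.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor INT8 quantization: map float32 weights
    onto the int8 range [-127, 127] with a single scale factor."""
    scale = float(np.abs(weights).max()) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover an approximation of the original float32 weights.
    return q.astype(np.float32) * scale

# A float32 value takes 4 bytes; its INT8 counterpart takes 1.
w = np.random.randn(1024, 1024).astype(np.float32)
q, scale = quantize_int8(w)
print(w.nbytes // q.nbytes)  # 4x smaller weight storage
```

Note that while the weights themselves shrink 4x (float32 to int8), end-to-end VRAM savings are smaller, since activations and any layers kept at higher precision still occupy full-precision memory, which is consistent with the ~30–40% figure above.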
**Full Changelog**: v0.3.8...v0.4.0
## Installation

### 1. Installer Version

- Uninstall Diffuse v0.3.X
- Download and run `Diffuse_v0.4.0.exe`
- Follow the on-screen instructions
### 2. Standalone Version

- Download and extract `Diffuse_v0.4.0.zip`. A fast SSD with plenty of free space is recommended, as model downloads can be large.
- Run `Diffuse.exe`. Diffuse will automatically:
  - Install an isolated portable Python runtime
  - Create the required virtual environment
  - Download the selected model from Hugging Face
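The isolated-environment step above can be illustrated with Python's standard-library `venv` module. This is a minimal sketch of the general bootstrap technique, not Diffuse's actual first-run code; the directory layout is a placeholder.

```python
import subprocess
import sys
import tempfile
import venv
from pathlib import Path

# Create an isolated virtual environment in a scratch directory,
# then verify its interpreter runs. (Directory names are illustrative.)
env_dir = Path(tempfile.mkdtemp()) / "venv"
venv.EnvBuilder(with_pip=False, clear=True).create(env_dir)

# The interpreter lands in Scripts/ on Windows and bin/ elsewhere.
python_exe = env_dir / ("Scripts/python.exe" if sys.platform == "win32" else "bin/python")
result = subprocess.run([str(python_exe), "-c", "print('ok')"],
                        capture_output=True, text=True)
print(result.stdout.strip())  # ok
```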
**First-run notice:** On first launch, or when loading a model for the first time, setup may take several minutes while Python, dependencies, and model files are downloaded and initialized. This is expected behavior.

No manual Python setup is required.
## Device Support

Supports CUDA- and ROCm-based devices.
## Important

This release includes many large changes. A full uninstall and reinstall is recommended (your downloaded models are safe to keep).

This version will overwrite your existing `Settings.json`. If you have added custom environments or models, back up this file before upgrading.