Local AI application that runs several image and video generation models on your own machine. It supports NVIDIA and AMD GPUs, falls back to CPU when no supported GPU is found, and is controlled through a web interface.
- Install Python 3.8+ from python.org (check "Add to PATH")
- Clone this repository: `git clone https://github.com/TriggeredBanana/Multi-Model-AI-Generator.git`, then `cd Multi-Model-AI-Generator`
- Run the application: `start.bat`
- Open your browser at http://localhost:7860
The launcher automatically installs dependencies and configures your GPU!
| Model | Type | Download | Disk | Speed | Best For | Auth |
|---|---|---|---|---|---|---|
| Flux Schnell | Image | ~50-70GB | 23GB | Fast | Photorealistic images | Yes |
| SDXL | Image | ~140-150GB | 72GB | Very Fast | Artistic styles | No |
| SD3 Medium | Image | ~15-20GB | 10GB | Fast | Text/logos | Yes |
| Stable Video | Video | ~15-20GB | 10GB | Slow | Image → Video | No |
| AnimateDiff | Video | ~25-30GB | 15GB | Slow | Text → Video | No |
Models auto-download from Hugging Face when first used. No manual download needed!
Hugging Face Model Links:
- Flux Schnell: black-forest-labs/FLUX.1-schnell (requires HF account & accepting license)
- SDXL: stabilityai/stable-diffusion-xl-base-1.0 (no auth required)
- SD3 Medium: stabilityai/stable-diffusion-3-medium-diffusers (requires HF account & accepting license)
- Stable Video Diffusion: stabilityai/stable-video-diffusion-img2vid-xt (no auth required)
- AnimateDiff: guoyww/animatediff-motion-adapter-v1-5-2 (no auth required)
Cache Location: Models save to ~/.cache/multi_model_ai/ by default (configurable in app's Advanced Management tab)
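The repository list and cache location above can be sketched as a small download helper. This is a minimal sketch, not the app's actual code: `MODEL_REPOS`, `GATED_MODELS`, `needs_token`, and `download_model` are hypothetical names, and it assumes the `huggingface_hub` package is installed. `snapshot_download` is that library's function for fetching a whole repository into a cache directory.

```python
from pathlib import Path
from typing import Optional

# Repository IDs copied from the list above.
MODEL_REPOS = {
    "Flux Schnell": "black-forest-labs/FLUX.1-schnell",
    "SDXL": "stabilityai/stable-diffusion-xl-base-1.0",
    "SD3 Medium": "stabilityai/stable-diffusion-3-medium-diffusers",
    "Stable Video Diffusion": "stabilityai/stable-video-diffusion-img2vid-xt",
    "AnimateDiff": "guoyww/animatediff-motion-adapter-v1-5-2",
}

# Models the list above marks as requiring an account and an accepted license.
GATED_MODELS = {"Flux Schnell", "SD3 Medium"}

# Default cache directory named in this README.
DEFAULT_CACHE = Path.home() / ".cache" / "multi_model_ai"


def needs_token(model_name: str) -> bool:
    """Return True when the model needs a Hugging Face access token."""
    return model_name in GATED_MODELS


def download_model(model_name: str, token: Optional[str] = None,
                   cache_dir: Path = DEFAULT_CACHE) -> str:
    """Download every file of the model's repository and return the local path."""
    if needs_token(model_name) and not token:
        raise ValueError(f"{model_name} is gated: supply a Hugging Face access token")
    # Imported lazily so the helpers above work without the library installed.
    from huggingface_hub import snapshot_download
    return snapshot_download(
        repo_id=MODEL_REPOS[model_name],
        cache_dir=str(cache_dir),
        token=token,
    )
```

Calling `download_model("SDXL")` would start a large download into the cache directory, so the token check runs first to fail early for gated models.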
NVIDIA (CUDA):
- Auto-detected for all CUDA-capable GPUs
- Fastest performance (seconds to minutes)
- GTX 1060 6GB+ or RTX series recommended
- All models fully accelerated
AMD (DirectML):
- RX 5000+ series with 8GB+ VRAM supported
- Auto-configured on first run
- 2-5x faster than CPU mode
- Run `python setup_amd_gpu.py` if issues occur
CPU Mode:
- Works on any system as fallback
- Auto-optimized for performance
- Slower but reliable (15-60 minutes per image)
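The three modes above suggest a simple preference order: CUDA, then DirectML, then CPU. The following is a hedged sketch of such a check, not the launcher's real detection logic. `pick_backend`, `probe_cuda`, and `probe_ort_providers` are hypothetical names; `torch.cuda.is_available()` and `onnxruntime.get_available_providers()` are existing library calls, and `DmlExecutionProvider` is the provider name ONNX Runtime reports when DirectML is usable.

```python
from typing import List


def pick_backend(cuda_available: bool, ort_providers: List[str]) -> str:
    """Prefer CUDA, then DirectML, then fall back to CPU."""
    if cuda_available:
        return "cuda"
    if "DmlExecutionProvider" in ort_providers:
        return "directml"
    return "cpu"


def probe_cuda() -> bool:
    """Report whether PyTorch can see a CUDA device (False if torch is missing)."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def probe_ort_providers() -> List[str]:
    """List ONNX Runtime execution providers (empty if onnxruntime is missing)."""
    try:
        import onnxruntime
    except ImportError:
        return []
    return list(onnxruntime.get_available_providers())


if __name__ == "__main__":
    print("Selected backend:", pick_backend(probe_cuda(), probe_ort_providers()))
```

Keeping the decision in `pick_backend` separate from the probes makes the preference order easy to check without any GPU present.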
- Start the app: Double-click `start.bat` or run it in a terminal
- GPU detection: Automatic; check the console for GPU status
- For gated models (Flux Schnell, SD3 Medium):
  - Create an account at Hugging Face
  - Get an access token: Settings → Access Tokens
  - Visit the model pages (links above) and click "Agree and access repository"
  - Enter the token in the app's "Authentication" tab
- Download models: Go to the "Model Management" tab and select the models to download
- Start creating: Select a model in a generation tab, load it, and generate!
Image Generation:
- Navigate to the "Image Generation" tab
- Choose a model based on your needs:
  - SDXL - Fast, artistic, no auth (great first choice!)
  - Flux Schnell - Photorealistic, detailed scenes
  - SD3 Medium - Best for text/logos in images
- Click "Load Model" (takes 30-60 seconds from cache)
- Enter your prompt and adjust settings
- Click "Generate Image"
- Images automatically save to the `generated_content/` folder
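For readers who want to see roughly what the image steps above correspond to in code, here is a minimal sketch using the Diffusers `StableDiffusionXLPipeline`. It is an illustration under assumptions, not the app's implementation: `ImageSettings`, `output_path`, `generate_image`, and the file naming scheme are hypothetical, and it assumes `torch` and `diffusers` are installed and the SDXL weights are available.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ImageSettings:
    prompt: str
    negative_prompt: str = ""
    width: int = 1024
    height: int = 1024
    steps: int = 30
    guidance_scale: float = 7.5
    seed: int = 0


def output_path(settings: ImageSettings, out_dir: str = "generated_content") -> Path:
    """Build an output file name (hypothetical naming scheme)."""
    name = f"sdxl_seed{settings.seed}_{settings.width}x{settings.height}.png"
    return Path(out_dir) / name


def generate_image(settings: ImageSettings) -> Path:
    """Load SDXL, generate one image, and save it under generated_content/."""
    # Heavy imports are kept inside the function.
    import torch
    from diffusers import StableDiffusionXLPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=dtype
    ).to(device)

    # A fixed seed makes the result reproducible for identical settings.
    generator = torch.Generator(device=device).manual_seed(settings.seed)
    image = pipe(
        prompt=settings.prompt,
        negative_prompt=settings.negative_prompt or None,
        width=settings.width,
        height=settings.height,
        num_inference_steps=settings.steps,
        guidance_scale=settings.guidance_scale,
        generator=generator,
    ).images[0]

    path = output_path(settings)
    path.parent.mkdir(parents=True, exist_ok=True)
    image.save(path)
    return path
```

A call such as `generate_image(ImageSettings(prompt="a lighthouse at dusk, oil painting", seed=42))` would run the full pipeline; on CPU this can take a long time.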
Video Generation:
- Navigate to the "Video Generation" tab
- Choose a model:
  - Stable Video Diffusion - Animate existing images (upload an image)
  - AnimateDiff - Generate videos from text prompts
- Click "Load Model" and wait for initialization
- Configure your input (image upload or text prompt)
- Click "Generate Video"
- Videos automatically save to the `generated_content/` folder
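The image-to-video path above can be sketched with the Diffusers `StableVideoDiffusionPipeline`. This is a hedged example rather than the app's code: `clip_seconds` and `animate_image` are hypothetical helpers, the output file name is made up, and the sketch assumes a CUDA GPU plus the `torch`, `diffusers`, and `accelerate` packages. The frame count of 25 and the 7 fps export rate are assumptions chosen to match the short clips this model produces.

```python
import os


def clip_seconds(num_frames: int, fps: int) -> float:
    """Length of the exported clip in seconds."""
    return num_frames / fps


def animate_image(image_path: str,
                  out_path: str = "generated_content/animated.mp4",
                  num_frames: int = 25, fps: int = 7, seed: int = 42) -> str:
    """Animate a still image with Stable Video Diffusion and save an MP4."""
    # Heavy imports are kept inside the function.
    import torch
    from diffusers import StableVideoDiffusionPipeline
    from diffusers.utils import export_to_video, load_image

    pipe = StableVideoDiffusionPipeline.from_pretrained(
        "stabilityai/stable-video-diffusion-img2vid-xt",
        torch_dtype=torch.float16,
        variant="fp16",
    )
    # Moves submodules to the GPU only while they are needed, to save VRAM.
    pipe.enable_model_cpu_offload()

    # 1024x576 is the resolution this checkpoint is typically used with.
    image = load_image(image_path).resize((1024, 576))
    generator = torch.manual_seed(seed)
    frames = pipe(
        image,
        num_frames=num_frames,
        decode_chunk_size=8,  # smaller chunks lower peak memory during decoding
        generator=generator,
    ).frames[0]

    os.makedirs(os.path.dirname(out_path) or ".", exist_ok=True)
    export_to_video(frames, out_path, fps=fps)
    return out_path
```

With these assumed values, `clip_seconds(25, 7)` is a little over 3.5 seconds, which is why clips from this model stay short.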
Minimum:
- Python 3.8 or higher
- 16GB RAM
- 50GB free disk space
- Internet connection for model downloads
- 8GB VRAM (recommended) or CPU mode
Recommended:
- Python 3.10+
- 32GB RAM
- 100GB SSD storage
- 12GB+ VRAM (RTX 3080/4070 or RX 6800XT/6950XT)
- High-speed internet for faster downloads
Dependencies (auto-installed by start.bat):
- torch, diffusers, transformers, accelerate
- gradio, Pillow, numpy, huggingface-hub
- opencv-python, imageio, imageio-ffmpeg
- For AMD: onnxruntime-directml, optimum[onnxruntime]
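If the automatic install fails, it can help to see which of the packages above are actually present. The snippet below is a small sketch using the standard library's `importlib.metadata`; `REQUIRED` and `missing_packages` are hypothetical names, and the package list is copied from the bullets above (without the AMD extras).

```python
from importlib import metadata
from typing import Iterable, List

# Distribution names as listed in the dependency bullets above.
REQUIRED = [
    "torch", "diffusers", "transformers", "accelerate",
    "gradio", "Pillow", "numpy", "huggingface-hub",
    "opencv-python", "imageio", "imageio-ffmpeg",
]


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_packages(REQUIRED)
    if absent:
        print("Missing:", ", ".join(absent))
    else:
        print("All listed dependencies are installed.")
```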
Download/Network Issues:
- Run `network_setup_helper.bat` for comprehensive diagnostics
- Use the "Network Diagnostics" tab in the app
- Try SDXL first (no authentication required)
- Run `start.bat` as administrator if permission errors occur
- Check firewall settings for Python
AMD GPU Not Detected:
- Run `python setup_amd_gpu.py`
- Update AMD drivers from AMD.com
- Verify GPU in Device Manager (Display adapters)
- Ensure 8GB+ VRAM available
- Check for DirectML support in app console
Out of Memory Errors:
- Try smaller model (SDXL uses less memory than Flux)
- Lower image resolution (512x512 instead of 1024x1024)
- Reduce number of inference steps
- Close other GPU-intensive applications
- Restart app to clear GPU memory
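The "lower the resolution" advice above can be automated with a retry loop that halves the image size after an out-of-memory failure. This is a sketch under assumptions, not something the app is known to do: `resolution_ladder` and `generate_with_fallback` are hypothetical helpers, and `generate` stands for any callable that takes a width and height. It relies on CUDA out-of-memory errors being `RuntimeError`s whose message contains "out of memory".

```python
from typing import Callable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def resolution_ladder(width: int, height: int,
                      min_side: int = 512) -> Iterator[Tuple[int, int]]:
    """Yield the requested size, then halved sizes down to min_side."""
    yield width, height
    while min(width, height) // 2 >= min_side:
        width, height = width // 2, height // 2
        yield width, height


def generate_with_fallback(generate: Callable[[int, int], T],
                           width: int = 1024, height: int = 1024,
                           min_side: int = 512) -> T:
    """Call generate(width, height), retrying at lower resolution on OOM."""
    last_error = None
    for w, h in resolution_ladder(width, height, min_side):
        try:
            return generate(w, h)
        except RuntimeError as err:
            # Only swallow out-of-memory failures; re-raise anything else.
            if "out of memory" not in str(err).lower():
                raise
            last_error = err
    raise RuntimeError("out of memory even at the smallest resolution") from last_error
```

Starting from 1024x1024 with the default floor, the loop tries 1024x1024 and then 512x512 before giving up.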
Authentication Errors:
- Get token from Hugging Face Settings
- Accept model licenses on Hugging Face (visit the model pages)
- Enter the token in the "Authentication" tab
- Verify token has read permissions
Model Download Stuck:
- Check internet connection stability
- App includes automatic retry with exponential backoff
- Monitor progress in Model Management tab
- Large models can take 30-60 minutes on slow connections
- Try downloading during off-peak hours
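For reference, "retry with exponential backoff" means waiting twice as long after each failed attempt. The sketch below shows the general technique with the standard library only; it is not the app's actual retry code, and `retry_with_backoff` and its parameters are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(operation: Callable[[], T], attempts: int = 5,
                       base_delay: float = 1.0, max_delay: float = 60.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying on OSError with delays of 1s, 2s, 4s, ..."""
    for attempt in range(attempts):
        try:
            return operation()
        except OSError:
            # ConnectionError and TimeoutError are subclasses of OSError.
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter exists so the waiting can be replaced in tests; in normal use the default `time.sleep` applies.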
Model Won't Load:
- Ensure model is fully downloaded (check Model Management tab)
- Verify enough disk space in cache directory
- Check GPU memory available (close other apps)
- Restart application to clear cached memory
- Try CPU mode if GPU memory insufficient
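The disk-space check above is easy to script. The following sketch uses `shutil.disk_usage` from the standard library; `free_gb` and `has_room_for` are hypothetical helpers, and the cache path in the example is the default named earlier in this README.

```python
import shutil
from pathlib import Path


def free_gb(path: str) -> float:
    """Free space, in GiB, on the drive that holds path."""
    p = Path(path).expanduser()
    # The cache folder may not exist yet, so walk up to an existing parent.
    while not p.exists():
        p = p.parent
    return shutil.disk_usage(p).free / 1024 ** 3


def has_room_for(path: str, required_gb: float) -> bool:
    """True when the drive holding path has at least required_gb free."""
    return free_gb(path) >= required_gb


if __name__ == "__main__":
    cache = "~/.cache/multi_model_ai"
    print(f"Free space near {cache}: {free_gb(cache):.1f} GiB")
```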
- `start.bat` - Main launcher, handles all setup
- `setup_amd_gpu.py` - AMD GPU detection and DirectML configuration
- `network_setup_helper.bat` - Network and system diagnostics
- `setup_environment.bat` - One-time Windows environment optimization
Choosing the Right Model:
- Quick iterations/artistic: Use SDXL (fastest, no auth)
- Photorealistic portraits: Use Flux Schnell
- Text/logos in images: Use SD3 Medium
- Animate photos: Use Stable Video Diffusion
- Text-to-video: Use AnimateDiff
Prompt Writing:
- Be specific and detailed for Flux Schnell
- Include art style for SDXL (e.g., "digital art", "oil painting")
- Describe text content explicitly for SD3 Medium
- Use negative prompts to avoid unwanted elements
- Keep video prompts simple and focused
Performance Optimization:
- Start with SDXL (fastest, no authentication)
- Store models on SSD for faster loading
- Lower steps (20-30) for faster generation during testing
- Use GPU mode for best performance
- Close unnecessary applications to free up VRAM
Image Quality:
- Use 1024x1024 resolution for best results
- Increase steps (50+) for higher quality
- Adjust guidance scale (7-15) to control prompt adherence
- Use same seed for reproducible results
- Generate multiple variations to find best output
Multi-Model-AI-Generator/
├── multi_model_generator.py # Main application with all models
├── requirements.txt # Python dependencies
├── start.bat # Universal Windows launcher
├── setup_amd_gpu.py # AMD GPU setup utility
├── network_setup_helper.bat # Network diagnostics
├── network_diagnostics.py # Python network testing
├── setup_environment.bat # Environment optimization
├── LICENSE # MIT License
└── generated_content/ # Output directory (auto-created)
Flux Schnell:
- ✅ Photorealistic portraits and landscapes
- ✅ Complex scenes with multiple objects
- ✅ Professional-quality outputs
- ✅ Excellent prompt following
- ❌ Requires authentication
- ❌ Larger download size
SDXL:
- ✅ Fast generation for iteration
- ✅ Artistic styles and concept art
- ✅ Great for beginners (no auth)
- ✅ Versatile and reliable
- ❌ Less photorealistic than Flux
SD3 Medium:
- ✅ Best for text rendering in images
- ✅ Logos, signs, typography
- ✅ Technical illustrations
- ✅ High-quality output
- ❌ Requires authentication
Stable Video Diffusion:
- ✅ High-quality image animation
- ✅ Smooth motion effects
- ✅ Professional video quality
- ❌ Requires input image
- ❌ Limited to 4-second clips
- ❌ Slow generation
AnimateDiff:
- ✅ Text-to-video generation
- ✅ Character animations
- ✅ Creative storytelling
- ❌ Lower quality than Stable Video
- ❌ Simple motions work best
AI Models:
- Flux Schnell by Black Forest Labs
- SDXL by Stability AI
- Stable Diffusion 3 Medium by Stability AI
- Stable Video Diffusion by Stability AI
- AnimateDiff by Yuwei Guo et al. (guoyww)
Powered By:
- Hugging Face Diffusers - Model pipelines
- Gradio - Web interface
- PyTorch - Deep learning framework
Contributions welcome! Please follow these steps:
- Fork the repository
- Create a feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
MIT License - see LICENSE file for details.
- Issues: GitHub Issues
- Repository: https://github.com/TriggeredBanana/Multi-Model-AI-Generator
- Discussions: GitHub Discussions
Enjoy creating amazing AI-generated content!