TriggeredBanana/Multi-Model-AI-Generator

🎨 Multi-Model AI Content Generator

Local AI application with multiple state-of-the-art models for image and video generation. Features universal GPU support (NVIDIA, AMD, CPU) with an intuitive web interface.

🚀 Quick Start

  1. Install Python 3.8+ from python.org (check "Add to PATH")
  2. Clone this repository
    git clone https://github.com/TriggeredBanana/Multi-Model-AI-Generator.git
    cd Multi-Model-AI-Generator
  3. Run the application
    start.bat
  4. Open your browser at http://localhost:7860

The launcher automatically installs dependencies and configures your GPU!

📊 Available Models

| Model | Type | Download | Disk | Speed | Best For | Auth |
|---|---|---|---|---|---|---|
| Flux Schnell | Image | ~50-70GB | 23GB | Fast | Photorealistic images | ✅ |
| SDXL | Image | ~140-150GB | 72GB | Very Fast | Artistic styles | ❌ |
| SD3 Medium | Image | ~15-20GB | 10GB | Fast | Text/logos | ✅ |
| Stable Video | Video | ~15-20GB | 10GB | Slow | Image→Video | ❌ |
| AnimateDiff | Video | ~25-30GB | 15GB | Slow | Text→Video | ❌ |

Where to Find/Download Models

Models auto-download from Hugging Face when first used. No manual download needed!

Hugging Face Model Links:

  1. Flux Schnell: black-forest-labs/FLUX.1-schnell (requires HF account & accepting license)
  2. SDXL: stabilityai/stable-diffusion-xl-base-1.0 (no auth required)
  3. SD3 Medium: stabilityai/stable-diffusion-3-medium-diffusers (requires HF account & accepting license)
  4. Stable Video Diffusion: stabilityai/stable-video-diffusion-img2vid-xt (no auth required)
  5. AnimateDiff: guoyww/animatediff-motion-adapter-v1-5-2 (no auth required)

Cache Location: Models save to ~/.cache/multi_model_ai/ by default (configurable in app's Advanced Management tab)
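As a rough illustration of the auto-download, the sketch below maps the models listed above to their Hugging Face repository IDs and fetches one into the default cache directory with `huggingface_hub.snapshot_download`. The `MODELS` table, its short keys, and the `download_model` helper are hypothetical names for this example; the app's own download code may work differently.

```python
from pathlib import Path

# Repo IDs and auth requirements copied from the list above.
# The short keys ("sdxl", ...) are made up for this sketch.
MODELS = {
    "flux-schnell": ("black-forest-labs/FLUX.1-schnell", True),
    "sdxl": ("stabilityai/stable-diffusion-xl-base-1.0", False),
    "sd3-medium": ("stabilityai/stable-diffusion-3-medium-diffusers", True),
    "stable-video": ("stabilityai/stable-video-diffusion-img2vid-xt", False),
    "animatediff": ("guoyww/animatediff-motion-adapter-v1-5-2", False),
}


def default_cache_dir() -> Path:
    """Default cache location named in this README."""
    return Path.home() / ".cache" / "multi_model_ai"


def download_model(key, token=None):
    """Download one model snapshot into the cache and return its local path."""
    repo_id, needs_auth = MODELS[key]
    if needs_auth and not token:
        raise ValueError(f"{key} is gated: pass a Hugging Face token")

    # Imported lazily so the helpers above work without huggingface-hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, cache_dir=default_cache_dir(), token=token)
```

Calling `download_model("sdxl")` would be the no-auth starting point; the gated models need a token from an account that has accepted the model license.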

🎮 GPU Support

NVIDIA (CUDA):

  • Auto-detected for all CUDA-capable GPUs
  • Fastest performance (seconds to minutes)
  • GTX 1060 6GB+ or RTX series recommended
  • All models fully accelerated

AMD (DirectML):

  • RX 5000+ series with 8GB+ VRAM supported
  • Auto-configured on first run
  • 2-5x faster than CPU mode
  • Run python setup_amd_gpu.py if issues occur

CPU Mode:

  • Works on any system as fallback
  • Auto-optimized for performance
  • Slower but reliable (15-60 minutes per image)
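The fallback order described above (CUDA, then DirectML, then CPU) could be probed roughly as follows. This is a sketch, not the launcher's actual detection logic: it assumes CUDA is checked through `torch.cuda.is_available()` and DirectML through the ONNX Runtime provider list, and the function names are hypothetical.

```python
def pick_backend(cuda_ok: bool, directml_ok: bool) -> str:
    """Prefer CUDA, then DirectML, then fall back to CPU."""
    if cuda_ok:
        return "cuda"
    if directml_ok:
        return "directml"
    return "cpu"


def has_cuda() -> bool:
    """True if PyTorch is installed and sees a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def has_directml() -> bool:
    """True if ONNX Runtime is installed with the DirectML provider."""
    try:
        import onnxruntime
    except ImportError:
        return False
    return "DmlExecutionProvider" in onnxruntime.get_available_providers()


def detect_backend() -> str:
    return pick_backend(has_cuda(), has_directml())


if __name__ == "__main__":
    print(f"Selected backend: {detect_backend()}")
```

Keeping the priority rule in `pick_backend` separate from the hardware probes makes the ordering easy to check on a machine without any GPU.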

🛠️ Setup Instructions

First Time Setup

  1. Start the app: Double-click start.bat or run in terminal
  2. GPU detection: Automatic - check console for GPU status
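  3. For gated models (Flux Schnell, SD3 Medium): get a token from Hugging Face Settings, accept the model licenses on their Hugging Face model pages, and enter the token in the Authentication tab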
  4. Download models: Go to "📥 Model Management" tab, select models to download
  5. Start creating: Select model in generation tab, load it, and generate!

Using the Application

Image Generation:

  1. Navigate to "🎨 Image Generation" tab
  2. Choose model based on your needs:
    • SDXL - Fast, artistic, no auth (great first choice!)
    • Flux Schnell - Photorealistic, detailed scenes
    • SD3 Medium - Best for text/logos in images
  3. Click "Load Model" (takes 30-60 seconds from cache)
  4. Enter your prompt and adjust settings
  5. Click "Generate Image"
  6. Images automatically save to generated_content/ folder
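For readers who want to see what the steps above correspond to in code, here is a minimal sketch using the `diffusers` SDXL pipeline directly. It is not the app's implementation: the `generate_sdxl` and `output_path` helpers, the default file name, and the chosen settings are assumptions for illustration.

```python
from pathlib import Path


def output_path(name: str, out_dir: str = "generated_content") -> Path:
    """Create the output folder if needed and return the PNG path."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    return folder / f"{name}.png"


def generate_sdxl(prompt: str, name: str = "sample", steps: int = 30, seed: int = 0) -> Path:
    """Load SDXL, generate one 1024x1024 image, and save it."""
    # Heavy imports are kept local so importing this file stays cheap.
    import torch
    from diffusers import StableDiffusionXLPipeline

    use_cuda = torch.cuda.is_available()
    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16 if use_cuda else torch.float32,
    )
    pipe = pipe.to("cuda" if use_cuda else "cpu")

    # A fixed seed makes the run repeatable.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    image = pipe(
        prompt,
        num_inference_steps=steps,
        guidance_scale=7.5,
        height=1024,
        width=1024,
        generator=generator,
    ).images[0]

    path = output_path(name)
    image.save(path)
    return path
```

A call such as `generate_sdxl("a lighthouse at dusk, oil painting")` downloads the model on first use, so expect a long first run.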

Video Generation:

  1. Navigate to "🎬 Video Generation" tab
  2. Choose model:
    • Stable Video Diffusion - Animate existing images (upload an image)
    • AnimateDiff - Generate videos from text prompts
  3. Click "Load Model" and wait for initialization
  4. Configure your input (image upload or text prompt)
  5. Click "Generate Video"
  6. Videos automatically save to generated_content/ folder
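The image-to-video path can be sketched with the `diffusers` Stable Video Diffusion pipeline, as below. This is an illustration under assumptions, not the app's code: it assumes a CUDA GPU, and the `animate_image` and `clip_seconds` helpers, the output file name, and the frame rate are hypothetical choices.

```python
from pathlib import Path


def clip_seconds(num_frames: int, fps: int) -> float:
    """Length of the resulting clip in seconds."""
    return num_frames / fps


def animate_image(image_path, out_path="generated_content/clip.mp4", fps=7, seed=0):
    """Animate a still image and write the result as an MP4 file."""
    # Heavy imports are kept local so the helper above works without them.
    import torch
    from diffusers import StableVideoDiffusionPipeline
    from diffusers.utils import export_to_video, load_image

    pipe = StableVideoDiffusionPipeline.from_pretrained(
        "stabilityai/stable-video-diffusion-img2vid-xt",
        torch_dtype=torch.float16,
        variant="fp16",
    )
    pipe.enable_model_cpu_offload()  # moves submodules to the GPU only while they run

    image = load_image(str(image_path)).resize((1024, 576))
    generator = torch.manual_seed(seed)
    frames = pipe(image, num_frames=25, decode_chunk_size=8, generator=generator).frames[0]

    Path(out_path).parent.mkdir(parents=True, exist_ok=True)
    export_to_video(frames, str(out_path), fps=fps)
    return out_path
```

With 25 frames at 7 frames per second, `clip_seconds(25, 7)` comes to a little under four seconds.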

💻 System Requirements

Minimum:

  • Python 3.8 or higher
  • 16GB RAM
  • 50GB free disk space
  • Internet connection for model downloads
  • 8GB VRAM (recommended) or CPU mode

Recommended:

  • Python 3.10+
  • 32GB RAM
  • 100GB SSD storage
  • 12GB+ VRAM (RTX 3080/4070 or RX 6800XT/6950XT)
  • High-speed internet for faster downloads

Dependencies (auto-installed by start.bat):

  • torch, diffusers, transformers, accelerate
  • gradio, Pillow, numpy, huggingface-hub
  • opencv-python, imageio, imageio-ffmpeg
  • For AMD: onnxruntime-directml, optimum[onnxruntime]
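If the automatic install fails partway, a quick check like the following can show which of the packages above are not importable. It is a sketch using only the standard library; the `REQUIRED` mapping and `missing_packages` name are made up here, and the AMD-only packages are left out.

```python
from importlib.util import find_spec

# pip package name -> importable module name
REQUIRED = {
    "torch": "torch",
    "diffusers": "diffusers",
    "transformers": "transformers",
    "accelerate": "accelerate",
    "gradio": "gradio",
    "Pillow": "PIL",
    "numpy": "numpy",
    "huggingface-hub": "huggingface_hub",
    "opencv-python": "cv2",
    "imageio": "imageio",
    "imageio-ffmpeg": "imageio_ffmpeg",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing:", ", ".join(missing) if missing else "none")
```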

🆘 Troubleshooting

Common Issues & Solutions

Download/Network Issues:

  • Run network_setup_helper.bat for comprehensive diagnostics
  • Use "🌐 Network Diagnostics" tab in the app
  • Try SDXL first (smallest download, no authentication)
  • Run start.bat as administrator if permission errors occur
  • Check firewall settings for Python

AMD GPU Not Detected:

    python setup_amd_gpu.py
  • Update AMD drivers from AMD.com
  • Verify GPU in Device Manager (Display adapters)
  • Ensure 8GB+ VRAM available
  • Check for DirectML support in app console

Out of Memory Errors:

  • Try smaller model (SDXL uses less memory than Flux)
  • Lower image resolution (512x512 instead of 1024x1024)
  • Reduce number of inference steps
  • Close other GPU-intensive applications
  • Restart app to clear GPU memory
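The "lower the resolution" advice can also be automated. The sketch below retries a generation call at progressively smaller sizes when it hits an out-of-memory error. The `generate_with_fallback` helper and its `generate(size)` callback are hypothetical; it relies on PyTorch reporting CUDA out-of-memory as a `RuntimeError` whose message contains "out of memory".

```python
def generate_with_fallback(generate, sizes=(1024, 768, 512)):
    """Call generate(size) with decreasing sizes until one fits in memory.

    Returns a (size, result) tuple for the first size that succeeds.
    """
    last_error = None
    for size in sizes:
        try:
            return size, generate(size)
        except (MemoryError, RuntimeError) as err:
            # Re-raise RuntimeErrors that are not about memory.
            if isinstance(err, RuntimeError) and "out of memory" not in str(err).lower():
                raise
            last_error = err
    raise RuntimeError(f"all sizes failed: {sizes}") from last_error
```

In a real run, calling `torch.cuda.empty_cache()` between attempts may help release memory held by the failed attempt.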

Authentication Errors:

  • Get token from Hugging Face Settings
  • Accept model licenses on HuggingFace (visit model pages)
  • Enter token in "🔐 Authentication" tab
  • Verify token has read permissions
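To check a token outside the app, something like the following may help. It is a sketch that assumes the token is stored in the `HF_TOKEN` environment variable and uses `HfApi.whoami` from `huggingface_hub`; the helper names are hypothetical. Note that this only confirms the token is valid, not that a model license has been accepted.

```python
import os


def resolve_token(env=os.environ):
    """Return the token from HF_TOKEN, or None if it is unset or blank."""
    token = (env.get("HF_TOKEN") or "").strip()
    return token or None


def check_token(token):
    """Ask Hugging Face who owns the token; raises if the token is rejected."""
    # Imported lazily so resolve_token works without huggingface-hub installed.
    from huggingface_hub import HfApi

    info = HfApi().whoami(token=token)
    return info["name"]


if __name__ == "__main__":
    token = resolve_token()
    if token is None:
        print("HF_TOKEN is not set")
    else:
        print("Token belongs to:", check_token(token))
```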

Model Download Stuck:

  • Check internet connection stability
  • App includes automatic retry with exponential backoff
  • Monitor progress in Model Management tab
  • Large models can take 30-60 minutes on slow connections
  • Try downloading during off-peak hours
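For reference, retry with exponential backoff generally looks like the sketch below: each failed attempt waits twice as long as the previous one, up to a cap. The function name, the defaults, and the choice to retry on `OSError` (which covers the built-in connection and timeout errors) are assumptions; the app's own retry settings are not documented here.

```python
import time


def retry_with_backoff(task, attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Run task(), retrying on OSError with delays of 1s, 2s, 4s, ... up to max_delay."""
    for attempt in range(attempts):
        try:
            return task()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

The `sleep` parameter exists so the delays can be observed or skipped in tests; normal callers leave it at the default.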

Model Won't Load:

  • Ensure model is fully downloaded (check Model Management tab)
  • Verify enough disk space in cache directory
  • Check GPU memory available (close other apps)
  • Restart application to clear cached memory
  • Try CPU mode if GPU memory insufficient

Helper Scripts

  • start.bat - Main launcher, handles all setup
  • setup_amd_gpu.py - AMD GPU detection and DirectML configuration
  • network_setup_helper.bat - Network and system diagnostics
  • setup_environment.bat - One-time Windows environment optimization

🎯 Tips for Best Results

Choosing the Right Model:

  • Quick iterations/artistic: Use SDXL (fastest, no auth)
  • Photorealistic portraits: Use Flux Schnell
  • Text/logos in images: Use SD3 Medium
  • Animate photos: Use Stable Video Diffusion
  • Text-to-video: Use AnimateDiff

Prompt Writing:

  • Be specific and detailed for Flux Schnell
  • Include art style for SDXL (e.g., "digital art", "oil painting")
  • Describe text content explicitly for SD3 Medium
  • Use negative prompts to avoid unwanted elements
  • Keep video prompts simple and focused

Performance Optimization:

  • Start with SDXL (smallest, fastest)
  • Store models on SSD for faster loading
  • Lower steps (20-30) for faster generation during testing
  • Use GPU mode for best performance
  • Close unnecessary applications to free up VRAM

Image Quality:

  • Use 1024x1024 resolution for best results
  • Increase steps (50+) for higher quality
  • Adjust guidance scale (7-15) to control prompt adherence
  • Use same seed for reproducible results
  • Generate multiple variations to find best output
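One way to keep these tips straight is to hold them as named presets. The sketch below encodes a draft preset (low resolution, fewer steps) and a final preset (1024x1024, 50 steps), and produces seed variations of either. The `Settings` class, the preset names, and the specific values chosen within the suggested ranges are assumptions for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Settings:
    width: int = 1024
    height: int = 1024
    steps: int = 50
    guidance_scale: float = 7.5
    seed: int = 0


# Fast preset for testing prompts, slower preset for the final render.
DRAFT = Settings(width=512, height=512, steps=25)
FINAL = Settings()


def variations(base, count):
    """Return `count` copies of `base` that differ only in seed."""
    return [replace(base, seed=base.seed + i) for i in range(count)]
```

Reusing one of the returned `Settings` objects unchanged, seed included, is what makes a result reproducible.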

📁 Project Structure

Multi-Model-AI-Generator/
├── multi_model_generator.py       # Main application with all models
├── requirements.txt               # Python dependencies
├── start.bat                      # Universal Windows launcher
├── setup_amd_gpu.py               # AMD GPU setup utility
├── network_setup_helper.bat       # Network diagnostics
├── network_diagnostics.py         # Python network testing
├── setup_environment.bat          # Environment optimization
├── LICENSE                        # MIT License
└── generated_content/             # Output directory (auto-created)

🎨 Model Usage Guide

Flux Schnell:

  • ✅ Photorealistic portraits and landscapes
  • ✅ Complex scenes with multiple objects
  • ✅ Professional-quality outputs
  • ✅ Excellent prompt following
  • ❌ Requires authentication
  • ❌ Larger download size

SDXL:

  • ✅ Fast generation for iteration
  • ✅ Artistic styles and concept art
  • ✅ Great for beginners (no auth)
  • ✅ Versatile and reliable
  • ❌ Less photorealistic than Flux

SD3 Medium:

  • ✅ Best for text rendering in images
  • ✅ Logos, signs, typography
  • ✅ Technical illustrations
  • ✅ High-quality output
  • ❌ Requires authentication

Stable Video Diffusion:

  • ✅ High-quality image animation
  • ✅ Smooth motion effects
  • ✅ Professional video quality
  • ❌ Requires input image
  • ❌ Limited to 4-second clips
  • ❌ Slow generation

AnimateDiff:

  • ✅ Text-to-video generation
  • ✅ Character animations
  • ✅ Creative storytelling
  • ❌ Lower quality than Stable Video
  • ❌ Simple motions work best

📋 Credits & Acknowledgments

AI Models:

Powered By:

🤝 Contributing

Contributions welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

MIT License - see LICENSE file for details.

📞 Support


Enjoy creating amazing AI-generated content! 🎨✨
