How to Self-Host Stable Diffusion WebUI
What Is Stable Diffusion WebUI?
Stable Diffusion WebUI (by AUTOMATIC1111) is the most popular web interface for running Stable Diffusion image generation models locally. It provides txt2img, img2img, inpainting, upscaling, and dozens of other image generation features through a Gradio-based web interface. It’s a self-hosted alternative to Midjourney and DALL-E.
Prerequisites
- A Linux server (Ubuntu 22.04+ recommended)
- Docker and Docker Compose installed
- NVIDIA GPU with 8+ GB VRAM (4 GB minimum, 12+ GB recommended)
- 16 GB+ system RAM
- 20 GB+ free disk space (models are 2-7 GB each)
- NVIDIA Container Toolkit installed
Docker Compose Configuration
There’s no official Docker image. The recommended approach is a custom Dockerfile or using the source install method. Here’s a Docker Compose setup using a community image:
```yaml
services:
  stable-diffusion:
    image: universonic/stable-diffusion-webui:latest # No versioned Docker tags published — :latest is the only option
    container_name: stable-diffusion
    ports:
      - "7861:7861"
    volumes:
      - sd_models:/app/stable-diffusion-webui/models
      - sd_outputs:/app/stable-diffusion-webui/outputs
      - sd_extensions:/app/stable-diffusion-webui/extensions
    environment:
      - CLI_ARGS=--listen --api --xformers
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped

volumes:
  sd_models:
  sd_outputs:
  sd_extensions:
```
Start the stack:

```bash
docker compose up -d
```

Alternative: Source installation (more reliable, recommended):

```bash
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
./webui.sh --listen --api --xformers
```

The startup script handles all dependencies, creates a venv, and installs PyTorch with CUDA support.
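If you go the source-install route instead of Docker, you may want the WebUI to start at boot. A minimal systemd unit sketch (the install path, service user, and flags here are assumptions; adjust them for your setup):

```ini
# /etc/systemd/system/sd-webui.service (hypothetical path and user)
[Unit]
Description=Stable Diffusion WebUI
After=network-online.target

[Service]
Type=simple
User=sdwebui
WorkingDirectory=/opt/stable-diffusion-webui
ExecStart=/opt/stable-diffusion-webui/webui.sh --listen --api --xformers
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with `sudo systemctl enable --now sd-webui`.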
Initial Setup
- Open `http://your-server:7861` in your browser
- The first start downloads the default Stable Diffusion v1.5 model (~4 GB)
- Type a prompt in the txt2img tab and click Generate
Downloading Better Models
Download models from CivitAI or HuggingFace and place them in the models/Stable-diffusion/ directory:
```bash
# Example: Download SDXL base model
wget -P models/Stable-diffusion/ \
  https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors
```
Refresh the model list in the web UI dropdown after adding new models.
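If you started the server with `--api`, you can rescan and list models programmatically instead of clicking the refresh button. A standard-library sketch (the base URL assumes the port mapping above; `/sdapi/v1/refresh-checkpoints` and `/sdapi/v1/sd-models` are WebUI API endpoints):

```python
import json
import urllib.request

def api_url(base, path):
    """Join the WebUI base URL with an API path."""
    return base.rstrip("/") + path

def refresh_and_list_checkpoints(base="http://localhost:7861"):
    """Ask the WebUI to rescan models/Stable-diffusion/, then list what it found."""
    req = urllib.request.Request(api_url(base, "/sdapi/v1/refresh-checkpoints"),
                                 method="POST")
    urllib.request.urlopen(req)  # returns an empty body on success
    with urllib.request.urlopen(api_url(base, "/sdapi/v1/sd-models")) as resp:
        return [m["title"] for m in json.load(resp)]
```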
Configuration
Key CLI Arguments
| Argument | Description |
|---|---|
| `--listen` | Listen on 0.0.0.0 (required for Docker/remote access) |
| `--api` | Enable the REST API |
| `--xformers` | Enable xFormers for faster generation and lower VRAM usage |
| `--medvram` | Optimize for 8 GB VRAM GPUs |
| `--lowvram` | Optimize for 4 GB VRAM GPUs (slower) |
| `--share` | Create a public Gradio link |
| `--port PORT` | Custom port (default: 7860; the compose setup above maps 7861) |
| `--no-half` | Disable FP16 (for GPUs without FP16 support) |
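In the Docker setup above, these flags are combined in the `CLI_ARGS` environment variable. Rough starting points by VRAM tier (tune for your GPU):

```bash
# 4 GB VRAM (slow, but works)
CLI_ARGS=--listen --api --lowvram --xformers

# 8 GB VRAM
CLI_ARGS=--listen --api --medvram --xformers

# 12+ GB VRAM (SDXL-capable)
CLI_ARGS=--listen --api --xformers
```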
Model Types
| Type | Directory | Purpose |
|---|---|---|
| Checkpoints | models/Stable-diffusion/ | Base image generation models |
| VAE | models/VAE/ | Variational autoencoders for color/detail |
| LoRA | models/Lora/ | Fine-tuned adapters for specific styles |
| Embeddings | embeddings/ | Textual inversions for concepts/styles |
| ControlNet | models/ControlNet/ | Pose/depth/edge-guided generation |
Advanced Configuration
ControlNet
ControlNet allows pose-guided, edge-guided, and depth-guided image generation:
- Install the ControlNet extension from the Extensions tab
- Download ControlNet models to `models/ControlNet/`
- Enable ControlNet in the generation parameters
API Usage
Generate images via the REST API:
```bash
curl -X POST http://localhost:7861/sdapi/v1/txt2img \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a cat in space, digital art",
    "negative_prompt": "low quality, blurry",
    "steps": 20,
    "width": 512,
    "height": 512,
    "cfg_scale": 7
  }'
```
The response contains base64-encoded images.
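One way to decode those images in Python (the `images` key matches the txt2img response schema; the function name and file naming are my own):

```python
import base64

def save_images(response_json, prefix="output"):
    """Write each base64-encoded image from a txt2img response to disk as PNG."""
    paths = []
    for i, b64 in enumerate(response_json["images"]):
        path = f"{prefix}_{i}.png"
        with open(path, "wb") as f:
            f.write(base64.b64decode(b64))
        paths.append(path)
    return paths
```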
SDXL Support
SDXL models require 12+ GB VRAM. Use --xformers and consider --medvram if you’re at the VRAM limit.
Reverse Proxy
Configure your reverse proxy to forward to port 7861. See Reverse Proxy Setup.
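As a concrete example, a minimal nginx server block might look like this (the hostname is an assumption and TLS is omitted; the WebSocket upgrade headers and long read timeout matter because Gradio streams generation progress over a long-lived connection):

```nginx
server {
    listen 80;
    server_name sd.example.com;  # assumption: your chosen hostname

    location / {
        proxy_pass http://127.0.0.1:7861;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        # WebSocket upgrade for Gradio's live progress updates
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_read_timeout 300s;  # allow long generations
    }
}
```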
Backup
Back up these volumes:
- `sd_models/` — Downloaded models (large, can be re-downloaded)
- `sd_outputs/` — Generated images (irreplaceable)
- `sd_extensions/` — Installed extensions (can be re-downloaded)
Priority: sd_outputs/ is irreplaceable. Models and extensions can be re-downloaded. See Backup Strategy.
Troubleshooting
Out of VRAM
Symptom: `RuntimeError: CUDA out of memory`
Fix: Add --medvram or --lowvram to CLI_ARGS. Reduce resolution (start with 512x512). Use --xformers if not already enabled. Generate one image at a time (batch size 1).
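The same VRAM-saving knobs apply when generating through the API: keep the resolution and batch size small in the request payload. A sketch of a conservative txt2img payload (the field names match the endpoint shown earlier; the values are just cautious defaults, not tuned settings):

```python
# Conservative txt2img payload for low-VRAM GPUs
low_vram_payload = {
    "prompt": "a cat in space, digital art",
    "steps": 20,        # 20 steps is usually sufficient
    "width": 512,       # stay at 512x512 on 4-8 GB cards
    "height": 512,
    "batch_size": 1,    # generate one image at a time
    "n_iter": 1,        # one batch per request
}
```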
Slow Generation
Symptom: Images take 60+ seconds.
Fix: Enable --xformers. Use a smaller model (SD 1.5 instead of SDXL). Reduce steps (20 is usually sufficient). Ensure GPU is being used (check with nvidia-smi).
Extension Installation Fails
Symptom: Extension install from URL fails.
Fix: Check that the container has internet access. Install extensions manually by cloning into the extensions/ directory. Restart the container after manual installation.
Black Images Generated
Symptom: Output images are completely black.
Fix: This usually indicates an incompatible model format or a safety checker triggering. Try a different model. Add --no-half-vae to CLI_ARGS. Disable the safety checker in settings.
Resource Requirements
- VRAM: 4 GB minimum (SD 1.5 with `--lowvram`), 8 GB recommended, 12+ GB for SDXL
- RAM: 8-16 GB
- CPU: Low-medium (GPU does the work)
- Disk: 4-7 GB per model, plus generated images
Verdict
Stable Diffusion WebUI (AUTOMATIC1111) is the most feature-rich image generation interface. The extension ecosystem, model compatibility, and community are unmatched. The trade-off is a more complex setup compared to cloud alternatives, and it requires a decent NVIDIA GPU.
Choose Stable Diffusion WebUI for a feature-rich, traditional image generation workflow. Choose ComfyUI for node-based workflows with more control over the generation pipeline.
Frequently Asked Questions
Do I need an NVIDIA GPU to run Stable Diffusion?
An NVIDIA GPU with 8+ GB VRAM is strongly recommended. AMD GPUs work via ROCm but with limited support. CPU-only generation is possible but extremely slow (minutes per image vs seconds on GPU). A GPU is effectively required for practical use.
How does Stable Diffusion WebUI compare to ComfyUI?
Stable Diffusion WebUI (AUTOMATIC1111) has a traditional UI with tabs and settings — easier for beginners. ComfyUI uses a node-based workflow graph that gives more control over the generation pipeline. Most beginners start with WebUI; power users often migrate to ComfyUI for complex workflows.
What’s the difference between SD 1.5, SDXL, and SD 3?
SD 1.5 generates 512x512 images and needs 4-8 GB VRAM. SDXL generates 1024x1024 images with better quality but needs 12+ GB VRAM. SD 3 and newer models offer improved text rendering and composition but require even more VRAM. Start with SD 1.5 on limited hardware.
Can I use Stable Diffusion without Docker?
Yes. The recommended approach is actually the source installation: clone the GitHub repo and run ./webui.sh. This handles all Python dependencies, venv creation, and PyTorch/CUDA setup automatically. Docker is an alternative, but the source install is more commonly used.
How much disk space do models need?
SD 1.5 checkpoints are ~2-4 GB each. SDXL models are ~6-7 GB. LoRA adapters are 50-300 MB. A typical setup with 3-5 models plus LoRAs uses 15-30 GB. Generated images are small (200 KB-2 MB each) and accumulate over time.
Is Stable Diffusion WebUI still actively maintained?
Development has slowed compared to 2023-2024, but the project still receives updates. The community has largely shifted attention to ComfyUI and newer interfaces like Forge (a WebUI fork optimized for lower VRAM usage). AUTOMATIC1111’s WebUI remains the most feature-complete option with the largest extension ecosystem.