ComfyUI Beginner's Guide
ComfyUI is the most powerful open-source backend for AI image and video generation. It supports every major model architecture -- Stable Diffusion 1.5, SDXL, FLUX, Wan 2.1, HunyuanVideo -- and offers fine-grained control over every step of the generation process. The catch: its default interface is a node-based graph editor that overwhelms most beginners. This guide explains what ComfyUI is, how it works, and how to generate your first image without touching a single node.
What Is ComfyUI?
ComfyUI is a modular inference engine for AI image and video generation. Think of it as the engine under the hood. You give it a model, a prompt, and a set of parameters. It runs the diffusion process and produces the output.
What makes ComfyUI different from alternatives like Automatic1111 or Fooocus:
- Node-based architecture -- Every operation (loading a model, encoding a prompt, sampling, decoding) is a separate node. Nodes connect together to form a workflow.
- Universal model support -- One backend for SD 1.5, SDXL, FLUX, video models, and more. No need to switch tools when you switch architectures.
- 600+ built-in nodes -- From basic generation to advanced operations like ControlNet, IP-Adapter, regional prompting, and upscaling.
- Lightweight and fast -- Written in Python with efficient VRAM management. It loads and unloads models as needed.
The node editor is powerful but not beginner-friendly. For most people, the learning curve is the only real barrier to using ComfyUI.
The Easy Path: Skip the Node Editor
Locally Uncensored solves the complexity problem by providing a conventional UI on top of ComfyUI. It builds the workflows automatically based on which model you select. You interact with a prompt box, a parameter panel, and a generate button -- the same interface pattern you know from Midjourney or DALL-E, but running entirely on your machine.
Here is how the abstraction works:
- You select a model (e.g., Juggernaut XL, FLUX schnell, Wan 2.1)
- The app classifies the model type (SDXL, FLUX, Wan, HunyuanVideo)
- It queries ComfyUI's `/object_info` endpoint to discover available nodes
- It builds the correct workflow graph automatically, selecting the right loader, sampler, VAE, and text encoders
- It submits the workflow to ComfyUI and polls for results
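The submit-and-poll loop above can be sketched against ComfyUI's HTTP API using only the Python standard library. The endpoints (`/object_info`, `/prompt`, `/history/<id>`) are ComfyUI's own; the base address assumes the default port, and the `client_id` value is just an illustrative label:

```python
"""Sketch of the ComfyUI API calls described above. Assumes a server
on the default port; the workflow dict is whatever graph the frontend
builds for your selected model."""
import json
import time
import urllib.request

BASE = "http://127.0.0.1:8188"

def get_object_info():
    """GET /object_info lists every node type the server offers."""
    with urllib.request.urlopen(f"{BASE}/object_info") as resp:
        return json.load(resp)

def make_prompt_payload(workflow, client_id="locally-uncensored"):
    """Wrap a workflow graph in the payload shape /prompt expects."""
    return {"prompt": workflow, "client_id": client_id}

def submit(workflow):
    """POST /prompt queues the workflow and returns a prompt_id."""
    data = json.dumps(make_prompt_payload(workflow)).encode()
    req = urllib.request.Request(
        f"{BASE}/prompt", data=data,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["prompt_id"]

def poll(prompt_id, interval=1.0):
    """GET /history/<id> until the finished workflow shows up."""
    while True:
        with urllib.request.urlopen(f"{BASE}/history/{prompt_id}") as resp:
            history = json.load(resp)
        if prompt_id in history:
            return history[prompt_id]
        time.sleep(interval)
```

A frontend app does essentially this on every click of the generate button, with a websocket for live progress instead of simple polling.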
You get the full power of ComfyUI without needing to understand the node graph. If you later want to create custom workflows, that option is available through the Workflow Finder.
Installing ComfyUI
Windows (Portable Release)
The easiest method. Download the latest portable release from the ComfyUI GitHub repository. Extract the archive to any location. The portable release bundles Python, PyTorch, and all dependencies. No separate installation needed.
Manual Installation (Any OS)
If you prefer a manual setup or are on Linux/macOS:
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt
python main.py
Ensure you have Python 3.10+ installed, plus PyTorch with CUDA support if you are on an NVIDIA GPU (Apple Silicon Macs use PyTorch's MPS backend instead). ComfyUI starts a local web server on port 8188 by default.
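To confirm the server came up, you can probe it from Python. This is a minimal sketch: the default address and the `/system_stats` endpoint are ComfyUI's, but adjust the URL if you launched the server with `--listen` or `--port`:

```python
"""Quick reachability check for a local ComfyUI server (a sketch)."""
import urllib.error
import urllib.request

def comfyui_is_up(base="http://127.0.0.1:8188", timeout=2.0):
    """Return True if the server answers its /system_stats endpoint."""
    try:
        with urllib.request.urlopen(f"{base}/system_stats",
                                    timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        return False
```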
Automatic Detection
When you launch Locally Uncensored, it scans common paths on your system to find ComfyUI. If it is already installed, the app connects to it automatically. If not found, you can specify the path manually in Settings, or let the app guide you through setup.
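The detection logic above can be illustrated with a hypothetical sketch: probe a few candidate directories for ComfyUI's `main.py`. The candidate paths listed here are placeholders, not the app's actual scan list:

```python
"""Hypothetical sketch of ComfyUI auto-detection: check candidate
directories for main.py. The COMMON_PATHS entries are illustrative."""
from pathlib import Path

def find_comfyui(candidates):
    """Return the first candidate directory that looks like ComfyUI."""
    for path in candidates:
        if (Path(path) / "main.py").is_file():
            return Path(path)
    return None

COMMON_PATHS = [
    Path.home() / "ComfyUI",
    Path.home() / "Documents" / "ComfyUI",
    Path("C:/ComfyUI"),
]
```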
Your First Image in 5 Minutes
Follow these steps to go from zero to your first generated image:
Minute 1-2: Install
git clone https://github.com/PurpleDoubleD/locally-uncensored.git
cd locally-uncensored
npm install
npm run dev
Minute 2-3: Download a Model
Open the Model Manager tab. Under Image models, click Install All on the Juggernaut XL V9 bundle. This is a single file (about 7 GB) that includes everything needed for high-quality SDXL generation. It works on GPUs with 6+ GB VRAM.
Wait for the download to complete. The progress bar shows download speed and remaining time.
Minute 3-4: Generate
Switch to the Create tab. Your newly downloaded model appears in the model dropdown on the right panel. Select it.
Type a prompt in the text box at the bottom:
portrait of an astronaut in a lush garden, golden hour lighting, detailed, sharp focus
Click Generate. The app builds the workflow, sends it to ComfyUI, and shows a progress indicator. On a modern GPU, a 1024x1024 SDXL image takes about 10-15 seconds.
Minute 4-5: Iterate
Your image appears in the gallery above the prompt box. Try changing the prompt, adjusting the seed (for variation), or tweaking the step count in the right panel. Each generation adds to your gallery, so you can compare results side by side.
Understanding Workflows (Conceptually)
Even if you never edit a workflow manually, understanding the concept helps you make better decisions about parameters. Every AI image generation follows this pipeline:
| Step | Node Type | What It Does |
|---|---|---|
| 1 | Model Loader | Loads the diffusion model weights into GPU memory |
| 2 | Text Encoder (CLIP) | Converts your text prompt into numerical embeddings the model understands |
| 3 | Latent Image | Creates the initial noise pattern (controlled by the seed) |
| 4 | Sampler | Iteratively denoises the latent image using the model and prompt embeddings |
| 5 | VAE Decoder | Converts the final latent representation into a visible pixel image |
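The five steps in the table correspond to a graph like the following, expressed in ComfyUI's API (JSON) format. The node class names and input keys are ComfyUI's; the node IDs, prompt text, and checkpoint filename are placeholders. Links are `[source_node_id, output_index]` pairs:

```python
"""A minimal SDXL text-to-image graph in ComfyUI's API format.
CheckpointLoaderSimple outputs MODEL (0), CLIP (1), VAE (2)."""
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",   # step 1: model loader
          "inputs": {"ckpt_name": "juggernautXL_v9.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",           # step 2: positive prompt
          "inputs": {"text": "portrait of an astronaut in a lush garden",
                     "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",           # negative prompt
          "inputs": {"text": "", "clip": ["1", 1]}},
    "4": {"class_type": "EmptyLatentImage",         # step 3: latent noise
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "5": {"class_type": "KSampler",                 # step 4: sampler
          "inputs": {"model": ["1", 0], "positive": ["2", 0],
                     "negative": ["3", 0], "latent_image": ["4", 0],
                     "seed": 42, "steps": 25, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",                # step 5: VAE decode
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",                # write the result to disk
          "inputs": {"images": ["6", 0], "filename_prefix": "first_image"}},
}
```

Note that the negative prompt is simply a second text encoder node feeding the sampler; the UI's "negative prompt" field maps to it directly.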
Parameters you control in the UI map directly to these nodes:
- Steps -- How many denoising iterations the sampler runs. More steps generally mean more detail, with diminishing returns past 25-30.
- CFG Scale -- How strongly the model follows your prompt vs exploring freely. Low values (1-3) produce natural but loose results. High values (7-12) produce tight prompt adherence but can look artificial.
- Seed -- The random starting point. Same seed + same parameters = same image. Change the seed to explore variations.
- Sampler -- The algorithm used for denoising. `euler` is the default and works well for most cases. `dpmpp_2m` is a common alternative with slightly different characteristics.
- Resolution -- The output image size. Stick to the model's training resolution (1024x1024 for SDXL and FLUX) for best results.
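These UI parameters map onto named inputs of the KSampler node in the workflow graph. Here is a sketch under the assumption that the graph is a plain dict in ComfyUI's API format; the input key names (`seed`, `steps`, `cfg`, `sampler_name`) are ComfyUI's own:

```python
"""Sketch: patching UI parameters into a workflow's KSampler node(s)."""

def apply_ui_params(workflow, seed, steps, cfg, sampler_name="euler"):
    """Update every KSampler node in the graph with the UI's settings."""
    for node in workflow.values():
        if node.get("class_type") == "KSampler":
            node["inputs"].update(seed=seed, steps=steps, cfg=cfg,
                                  sampler_name=sampler_name)
    return workflow

# Minimal example graph containing only the sampler node.
graph = {"5": {"class_type": "KSampler",
               "inputs": {"seed": 0, "steps": 20, "cfg": 8.0,
                          "sampler_name": "euler"}}}
graph = apply_ui_params(graph, seed=1234, steps=25, cfg=7.0)
```

Resubmitting the same graph with only the seed changed is exactly what the "explore variations" workflow looks like at the API level.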
When You Might Want Custom Workflows
The automatic workflow builder covers standard text-to-image and text-to-video generation. For advanced use cases, you can import custom ComfyUI workflows:
- ControlNet -- Guide generation with reference images (poses, edges, depth maps)
- IP-Adapter -- Use a reference image to influence the style or subject
- Inpainting -- Edit specific regions of an existing image
- Upscaling -- Increase resolution of generated images with dedicated upscaling models
- LoRA -- Apply fine-tuned style or subject modifications on top of base models
Locally Uncensored includes a Workflow Finder that lets you search and import workflows from CivitAI. The app parses the workflow, identifies which parameters can be adjusted, and presents them in the UI. This gives you the flexibility of custom workflows without manually editing the node graph.
ComfyUI vs Other Backends
| Feature | ComfyUI | Automatic1111 | Fooocus |
|---|---|---|---|
| Model Support | SD 1.5, SDXL, FLUX, Video | SD 1.5, SDXL | SDXL only |
| Interface | Node graph (or frontend apps) | Web UI | Simplified Web UI |
| VRAM Efficiency | Excellent | Good | Good |
| Custom Workflows | Full node editor | Extensions/scripts | Limited |
| Video Generation | Yes | No | No |
| Active Development | Very active | Slowing | Moderate |
ComfyUI has become the standard backend for a reason. It supports more model architectures than any alternative, receives updates faster when new models release, and its modular design means new capabilities are added as nodes without breaking existing workflows.
Next Steps
Once you are comfortable generating images, explore these topics:
- Run FLUX locally for state-of-the-art image quality
- Generate AI videos with Wan 2.1 and HunyuanVideo
- Best uncensored models in 2026 for a complete overview of available models