ComfyUI Beginner's Guide
ComfyUI is the most powerful open-source backend for AI image and video generation. It supports every major model architecture -- Stable Diffusion 1.5, SDXL, FLUX, Wan 2.1, HunyuanVideo -- and offers fine-grained control over every step of the generation process. The catch: its default interface is a node-based graph editor that overwhelms most beginners. This guide explains what ComfyUI is, how it works, and how to generate your first image without touching a single node.
What Is ComfyUI?
ComfyUI is a modular inference engine for AI image and video generation. Think of it as the engine under the hood. You give it a model, a prompt, and a set of parameters. It runs the diffusion process and produces the output.
What makes ComfyUI different from alternatives like Automatic1111 or Fooocus:
- Node-based architecture -- Every operation (loading a model, encoding a prompt, sampling, decoding) is a separate node. Nodes connect together to form a workflow.
- Universal model support -- One backend for SD 1.5, SDXL, FLUX, video models, and more. No need to switch tools when you switch architectures.
- 600+ built-in nodes -- From basic generation to advanced operations like ControlNet, IP-Adapter, regional prompting, and upscaling.
- Lightweight and fast -- Written in Python with efficient VRAM management. It loads and unloads models as needed.
The node editor is powerful but not beginner-friendly. For most people, the learning curve is the only real barrier to using ComfyUI.
The Easy Path: Skip the Node Editor
Locally Uncensored solves the complexity problem by providing a conventional UI on top of ComfyUI. It builds the workflows automatically based on which model you select. You interact with a prompt box, a parameter panel, and a generate button -- the same interface pattern you know from Midjourney or DALL-E, but running entirely on your machine.
Here is how the abstraction works:
- You select a model (e.g., Juggernaut XL, FLUX schnell, Wan 2.1)
- The app classifies the model type (SDXL, FLUX, Wan, HunyuanVideo)
- It queries ComfyUI's `/object_info` endpoint to discover available nodes
- It builds the correct workflow graph automatically, selecting the right loader, sampler, VAE, and text encoders
- It submits the workflow to ComfyUI and polls for results
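The submit-and-poll loop above can be sketched against ComfyUI's HTTP API using only the Python standard library. The endpoints (`/object_info`, `/prompt`, `/history/<id>`) are ComfyUI's own; the base address assumes the default port, and the `client_id` value is just an illustrative label:

```python
"""Sketch of the ComfyUI API calls described above. Assumes a server
on the default port; the workflow dict is whatever graph the frontend
builds for your selected model."""
import json
import time
import urllib.request

BASE = "http://127.0.0.1:8188"

def get_object_info():
    """GET /object_info lists every node type the server offers."""
    with urllib.request.urlopen(f"{BASE}/object_info") as resp:
        return json.load(resp)

def make_prompt_payload(workflow, client_id="locally-uncensored"):
    """Wrap a workflow graph in the payload shape /prompt expects."""
    return {"prompt": workflow, "client_id": client_id}

def submit(workflow):
    """POST /prompt queues the workflow and returns a prompt_id."""
    data = json.dumps(make_prompt_payload(workflow)).encode()
    req = urllib.request.Request(
        f"{BASE}/prompt", data=data,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["prompt_id"]

def poll(prompt_id, interval=1.0):
    """GET /history/<id> until the finished workflow shows up."""
    while True:
        with urllib.request.urlopen(f"{BASE}/history/{prompt_id}") as resp:
            history = json.load(resp)
        if prompt_id in history:
            return history[prompt_id]
        time.sleep(interval)
```

A frontend app does essentially this on every click of the generate button, with a websocket for live progress instead of simple polling.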
You get the full power of ComfyUI without needing to understand the node graph. If you later want to create custom workflows, that option is available through the Workflow Finder.
Installing ComfyUI
Windows (Portable Release)
The easiest method. Download the latest portable release from the ComfyUI GitHub repository. Extract the archive to any location. The portable release bundles Python, PyTorch, and all dependencies. No separate installation needed.
Manual Installation (Any OS)
If you prefer a manual setup or are on Linux/macOS:
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt
python main.py
Ensure you have Python 3.10+ installed, plus PyTorch with CUDA support if you are on an NVIDIA GPU (Apple Silicon Macs use PyTorch's MPS backend instead). ComfyUI starts a local web server on port 8188 by default.
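To confirm the server came up, you can probe it from Python. This is a minimal sketch: the default address and the `/system_stats` endpoint are ComfyUI's, but adjust the URL if you launched the server with `--listen` or `--port`:

```python
"""Quick reachability check for a local ComfyUI server (a sketch)."""
import urllib.error
import urllib.request

def comfyui_is_up(base="http://127.0.0.1:8188", timeout=2.0):
    """Return True if the server answers its /system_stats endpoint."""
    try:
        with urllib.request.urlopen(f"{base}/system_stats",
                                    timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        return False
```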
Automatic Detection
When you launch Locally Uncensored, it scans common paths on your system to find ComfyUI. If it is already installed, the app connects to it automatically. If not found, you can specify the path manually in Settings, or let the app guide you through setup.
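The detection logic above can be illustrated with a hypothetical sketch: probe a few candidate directories for ComfyUI's `main.py`. The candidate paths listed here are placeholders, not the app's actual scan list:

```python
"""Hypothetical sketch of ComfyUI auto-detection: check candidate
directories for main.py. The COMMON_PATHS entries are illustrative."""
from pathlib import Path

def find_comfyui(candidates):
    """Return the first candidate directory that looks like ComfyUI."""
    for path in candidates:
        if (Path(path) / "main.py").is_file():
            return Path(path)
    return None

COMMON_PATHS = [
    Path.home() / "ComfyUI",
    Path.home() / "Documents" / "ComfyUI",
    Path("C:/ComfyUI"),
]
```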
Your First Image in 5 Minutes
Follow these steps to go from zero to your first generated image:
Minute 1-2: Install
git clone https://github.com/PurpleDoubleD/locally-uncensored.git
cd locally-uncensored
npm install
npm run dev
Minute 2-3: Download a Model
Open the Model Manager tab. Under Image models, click Install All on the Juggernaut XL V9 bundle. This is a single file (about 7 GB) that includes everything needed for high-quality SDXL generation. It works on GPUs with 6+ GB VRAM.
Wait for the download to complete. The progress bar shows download speed and remaining time.
Minute 3-4: Generate
Switch to the Create tab. Your newly downloaded model appears in the model dropdown on the right panel. Select it.
Type a prompt in the text box at the bottom:
portrait of an astronaut in a lush garden, golden hour lighting, detailed, sharp focus
Click Generate. The app builds the workflow, sends it to ComfyUI, and shows a progress indicator. On a modern GPU, a 1024x1024 SDXL image takes about 10-15 seconds.
Minute 4-5: Iterate
Your image appears in the gallery above the prompt box. Try changing the prompt, adjusting the seed (for variation), or tweaking the step count in the right panel. Each generation adds to your gallery, so you can compare results side by side.
Understanding Workflows (Conceptually)
Even if you never edit a workflow manually, understanding the concept helps you make better decisions about parameters. Every AI image generation follows this pipeline:
| Step | Node Type | What It Does |
|---|---|---|
| 1 | Model Loader | Loads the diffusion model weights into GPU memory |
| 2 | Text Encoder (CLIP) | Converts your text prompt into numerical embeddings the model understands |
| 3 | Latent Image | Creates the initial noise pattern (controlled by the seed) |
| 4 | Sampler | Iteratively denoises the latent image using the model and prompt embeddings |
| 5 | VAE Decoder | Converts the final latent representation into a visible pixel image |
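The five steps in the table correspond to a graph like the following, expressed in ComfyUI's API (JSON) format. The node class names and input keys are ComfyUI's; the node IDs, prompt text, and checkpoint filename are placeholders. Links are `[source_node_id, output_index]` pairs:

```python
"""A minimal SDXL text-to-image graph in ComfyUI's API format.
CheckpointLoaderSimple outputs MODEL (0), CLIP (1), VAE (2)."""
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",   # step 1: model loader
          "inputs": {"ckpt_name": "juggernautXL_v9.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",           # step 2: positive prompt
          "inputs": {"text": "portrait of an astronaut in a lush garden",
                     "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",           # negative prompt
          "inputs": {"text": "", "clip": ["1", 1]}},
    "4": {"class_type": "EmptyLatentImage",         # step 3: latent noise
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "5": {"class_type": "KSampler",                 # step 4: sampler
          "inputs": {"model": ["1", 0], "positive": ["2", 0],
                     "negative": ["3", 0], "latent_image": ["4", 0],
                     "seed": 42, "steps": 25, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",                # step 5: VAE decode
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",                # write the result to disk
          "inputs": {"images": ["6", 0], "filename_prefix": "first_image"}},
}
```

Note that the negative prompt is simply a second text encoder node feeding the sampler; the UI's "negative prompt" field maps to it directly.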
Parameters you control in the UI map directly to these nodes:
- Steps -- How many denoising iterations the sampler runs. More steps generally mean more detail, with diminishing returns past 25-30.
- CFG Scale -- How strongly the model follows your prompt vs exploring freely. Low values (1-3) produce natural but loose results. High values (7-12) produce tight prompt adherence but can look artificial.
- Seed -- The random starting point. Same seed + same parameters = same image. Change the seed to explore variations.
- Sampler -- The algorithm used for denoising. `euler` is the default and works well for most cases. `dpmpp_2m` is a common alternative with slightly different characteristics.
- Resolution -- The output image size. Stick to the model's training resolution (1024x1024 for SDXL and FLUX) for best results.
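These UI parameters map onto named inputs of the KSampler node in the workflow graph. Here is a sketch under the assumption that the graph is a plain dict in ComfyUI's API format; the input key names (`seed`, `steps`, `cfg`, `sampler_name`) are ComfyUI's own:

```python
"""Sketch: patching UI parameters into a workflow's KSampler node(s)."""

def apply_ui_params(workflow, seed, steps, cfg, sampler_name="euler"):
    """Update every KSampler node in the graph with the UI's settings."""
    for node in workflow.values():
        if node.get("class_type") == "KSampler":
            node["inputs"].update(seed=seed, steps=steps, cfg=cfg,
                                  sampler_name=sampler_name)
    return workflow

# Minimal example graph containing only the sampler node.
graph = {"5": {"class_type": "KSampler",
               "inputs": {"seed": 0, "steps": 20, "cfg": 8.0,
                          "sampler_name": "euler"}}}
graph = apply_ui_params(graph, seed=1234, steps=25, cfg=7.0)
```

Resubmitting the same graph with only the seed changed is exactly what the "explore variations" workflow looks like at the API level.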
When You Might Want Custom Workflows
The automatic workflow builder covers standard text-to-image and text-to-video generation. For advanced use cases, you can import custom ComfyUI workflows:
- ControlNet -- Guide generation with reference images (poses, edges, depth maps)
- IP-Adapter -- Use a reference image to influence the style or subject
- Inpainting -- Edit specific regions of an existing image
- Upscaling -- Increase resolution of generated images with dedicated upscaling models
- LoRA -- Apply fine-tuned style or subject modifications on top of base models
Locally Uncensored includes a Workflow Finder that lets you search and import workflows from CivitAI. The app parses the workflow, identifies which parameters can be adjusted, and presents them in the UI. This gives you the flexibility of custom workflows without manually editing the node graph.
ComfyUI vs Other Backends
| Feature | ComfyUI | Automatic1111 | Fooocus |
|---|---|---|---|
| Model Support | SD 1.5, SDXL, FLUX, Video | SD 1.5, SDXL | SDXL only |
| Interface | Node graph (or frontend apps) | Web UI | Simplified Web UI |
| VRAM Efficiency | Excellent | Good | Good |
| Custom Workflows | Full node editor | Extensions/scripts | Limited |
| Video Generation | Yes | No | No |
| Active Development | Very active | Slowing | Moderate |
ComfyUI has become the standard backend for a reason. It supports more model architectures than any alternative, receives updates faster when new models release, and its modular design means new capabilities are added as nodes without breaking existing workflows.
Next Steps
Once you are comfortable generating images, explore these topics:
- Run FLUX locally for state-of-the-art image quality
- Generate AI videos with Wan 2.1 and HunyuanVideo
- Best uncensored models in 2026 for a complete overview of available models