What is Flux.1?

Flux.1 is an image generation model from Black Forest Labs, a company founded by members of the original Stable Diffusion team. Rather than being just a "higher-performance version," it marked a major architectural turning point.

  • The core of the image generation model was switched from the traditional UNet to a Transformer (DiT) architecture.
  • A T5-based LLM was adopted as a text encoder.

This combination enables efficient training on large-scale datasets, and by leveraging the LLM's sentence-level comprehension directly, Flux.1 became the branching point for the family of image generation models that is mainstream today.

There are three variations of Flux.1.

  • Flux.1 pro
    • A version available only via API; model weights are not public.
  • Flux.1 dev
    • A research/verification model distilled from pro. This is the version most commonly used in local environments.
  • Flux.1 schnell
    • A model further distilled from dev, released under the relatively permissive Apache-2.0 license.

Model Download

Here, we use the fp8 versions of dev / schnell.

📂ComfyUI/
└── 📂models/
    ├── 📂diffusion_models/
    │   ├── flux1-dev-fp8.safetensors
    │   └── flux1-schnell-fp8.safetensors
    ├── 📂clip/
    │   ├── clip_l.safetensors
    │   └── t5xxl_fp8_e4m3fn_scaled.safetensors
    └── 📂vae/
        └── ae.safetensors

text2image - Flux.1 [dev]

flux1-dev.json

Flux.1 dev / schnell are models distilled with CFG fixed at 1.0. They are therefore not designed for adjusting the CFG scale or the negative prompt as in traditional Stable Diffusion, and the negative prompt has no effect at all.

I leave the negative prompt empty, but in other workflows a ConditioningZeroOut node is sometimes inserted in place of the negative CLIP Text Encode node.

In either case, the negative-side conditioning is multiplied by 0, so whatever you write there has no effect on the output.
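Why the negative prompt is inert can be checked against the standard classifier-free guidance (CFG) formula itself. Below is a minimal sketch; `cfg_mix` is an illustrative name for this explanation, not a ComfyUI API:

```python
import numpy as np

# Standard CFG mixing: uncond + scale * (cond - uncond).
# cond / uncond stand in for the positive / negative conditioning outputs.
def cfg_mix(cond, uncond, cfg_scale):
    return uncond + cfg_scale * (cond - uncond)

cond = np.array([1.0, 2.0, 3.0])
uncond = np.array([9.0, 9.0, 9.0])  # whatever the negative prompt produced

# With cfg_scale fixed at 1.0, as in Flux.1 dev / schnell,
# the uncond term cancels out exactly:
out = cfg_mix(cond, uncond, 1.0)
# out == cond, so the negative conditioning never reaches the result.
```

This is why zeroing the negative conditioning (ConditioningZeroOut) and leaving the prompt empty are interchangeable here.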


text2image - Flux.1 [schnell]

This is a further distilled version of Flux.1 [dev] that can generate images in 4-6 steps.

flux1-schnell.json
  • Set steps to 4-6.

LoRA - Flux.1 [dev]

Let's use LoRA to improve the quality of portrait images.

flux1-dev_lora.json
  • 🟪 As noted in the LoRA section, models from Flux onward no longer train the text encoder, so use the LoraLoaderModelOnly node, which applies the LoRA only to the model weights, instead of the Load LoRA node.
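Conceptually, a model-only LoRA merge just adds a low-rank update to the diffusion model's weight matrices and leaves the text encoder untouched. A minimal sketch with hypothetical names (this is the general LoRA math, not ComfyUI's internal code):

```python
import numpy as np

# LoRA stores a low-rank delta B @ A for a weight matrix W.
# A model-only loader applies this delta to the diffusion model weights
# and never touches the text encoder.
def merge_lora(W, A, B, alpha):
    # Merged weight: W + alpha * (B @ A)
    return W + alpha * (B @ A)

rank, d_in, d_out = 4, 16, 16
W = np.zeros((d_out, d_in))          # stand-in for one model weight matrix
A = np.random.randn(rank, d_in)      # LoRA "down" projection
B = np.random.randn(d_out, rank)     # LoRA "up" projection

W_merged = merge_lora(W, A, B, alpha=0.8)
```

The `alpha` here plays the role of the strength slider on the loader node.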

ControlNet - Flux.1 [dev]

Several ControlNet models for Flux.1 have been released; here we introduce a Union-type model as an example.

Model Download

📂ComfyUI/
└── 📂models/
    └── 📂controlnet/
        └── FLUX.1-dev-ControlNet-Union-Pro-2.0-fp8.safetensors

workflow

ControlNet-Union bundles multiple common ControlNet types into a single model.

FLUX.1-dev-ControlNet-Union-Pro_depth.json
  • 🟩 It is simply an image2image workflow using Flux with a ControlNet inserted.

    • Although it is image2image, with denoise at 1.0 the behavior is almost the same as text2image.
    • I often use this form because it produces an image of the same size as the input with fewer nodes.
  • 🟩 Set the type of ControlNet you want to use in SetUnionControlNetType.

    • In most cases, auto is fine.

GGUF (Lightweighting Flux.1)

Finally, let's touch briefly on the GGUF version of Flux.1.

GGUF was originally a quantized-weight format for making LLMs lightweight, but applying it to Flux.1 lets you run the model at reasonable speed while reducing VRAM usage.
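The core idea behind GGUF quantization types such as Q8_0 is block quantization: weights are split into fixed-size blocks, each stored as small integers plus one scale per block. A rough self-contained sketch of that idea (simplified; the real format packs additional metadata):

```python
import numpy as np

BLOCK = 32  # weights are quantized in fixed-size blocks

def quantize_q8(x):
    # Each block keeps one float scale plus int8 values,
    # cutting memory roughly in half compared to fp16.
    x = x.reshape(-1, BLOCK)
    scale = np.abs(x).max(axis=1, keepdims=True) / 127.0
    scale[scale == 0] = 1.0  # avoid division by zero in all-zero blocks
    q = np.round(x / scale).astype(np.int8)
    return q, scale

def dequantize_q8(q, scale):
    # Reconstruct approximate weights at load/compute time.
    return (q.astype(np.float32) * scale).reshape(-1)

w = np.random.randn(64).astype(np.float32)
q, s = quantize_q8(w)
w_hat = dequantize_q8(q, s)
# w_hat approximates w; the error per element is at most half a
# quantization step for its block.
```

Lower-bit variants (Q4, Q5, ...) push the same trade-off further: smaller files and less VRAM in exchange for more quantization error.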

Custom Nodes

Model Download

There are several variations depending on the trade-off between quality and model size. Choose one according to your PC specs and use case.

📂ComfyUI/
└── 📂models/
    └── 📂unet/
        └── flux1-dev.gguf

workflow

FLUX.1-dev-gguf.json
  • 🟪 Replace the Load Diffusion Model node with the Unet Loader (GGUF) node.

  • The other parts (CLIP / T5 / VAE) remain the same.

    • You can also swap T5 for a GGUF version, but in my experience the gain is not that significant.

GGUF versions are available for many current models. Since there are almost no downsides to using GGUF, try it actively whenever VRAM is tight.