What is Qwen-Image-Edit?

Qwen-Image-Edit is an instruction-based image editing model based on Qwen-Image.

Roughly speaking, you can think of it as the Qwen-Image version of Flux.1 Kontext.

While Flux.1 Kontext was limited to VAE-based editing, Qwen-Image-Edit uses an MLLM to actually "see" the reference image, allowing for more flexible editing.

Later, Qwen-Image-Edit-2509, a version that supports multiple reference images, was released.

Previously, it was only possible to "edit a single image", but with Qwen-Image-Edit-2509, you can do things like:

  • "Change the clothes of the person in Image 1 to those in Image 2"
  • "Generate an image where Image 1 and Image 2 are standing on the same stage"

Because its training method differs, 2509 is not fully backward-compatible with the original version, but if you are unsure which to use, 2509 is a safe default.


Qwen-Image-Edit (Original)

For what it can do, see the official GitHub or Flux.1 Kontext / What it can do.

Model Download

📂ComfyUI/
└── 📂models/
    ├── 📂diffusion_models/
    │   └── qwen_image_edit_fp8_e4m3fn.safetensors
    ├── 📂text_encoders/
    │   ├── qwen_2.5_vl_7b_fp8_scaled.safetensors
    │   ├── Qwen2.5-VL-7B-Instruct.gguf               ← Only when using gguf
    │   └── Qwen2.5-VL-7B-Instruct-mmproj-BF16.gguf    ← Only when using gguf
    ├── 📂vae/
    │   └── qwen_image_vae.safetensors
    └── 📂unet/
        └── qwen-image-edit.gguf                       ← Only when using gguf

workflow

Qwen-Image-Edit.json

🟩 A few supplementary notes on the behavior of the TextEncodeQwenImageEdit node.

Internally, it does roughly the following:

    1. Resize the input image to approximately 1 megapixel
    2. Generate a latent from that image
    3. Pass the text and the image together to Qwen2.5-VL

Because this resizing happens automatically, unexpected results may occur if the requested output size deviates significantly from 1 megapixel.

Therefore, this workflow preprocesses the image size in advance:

  • Resize to 1 megapixel with the ImageScaleToTotalPixels node
  • Crop each dimension to a multiple of 8 with the Resize Image v2 node
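As a rough illustration, the two preprocessing steps above amount to the following size calculation (a minimal sketch in plain Python; the actual nodes operate on image tensors, and `preprocess_size` is an illustrative name, not a ComfyUI function):

```python
import math

def preprocess_size(width, height, total_pixels=1024 * 1024, multiple=8):
    """Scale toward ~1 megapixel, then round each dimension down to a
    multiple of 8 (i.e. crop at most 7 pixels per side)."""
    scale = math.sqrt(total_pixels / (width * height))
    new_w = round(width * scale)
    new_h = round(height * scale)
    # The VAE downsamples by 8x, so each dimension must divide evenly by 8.
    return new_w - new_w % multiple, new_h - new_h % multiple

print(preprocess_size(1920, 1080))  # 16:9 input -> (1360, 768)
```

The result stays close to 1 megapixel while keeping dimensions the VAE can handle cleanly.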

No matter what you do, Qwen-Image-Edit cannot keep the edited image pixel-perfectly aligned with the input image. Several workarounds have been proposed, but it is better to accept that the model's design is simply not suited to that use case.


Qwen-Image-Edit-2509

Qwen-Image-Edit-2509 extends the original model. The biggest difference is that it accepts multiple reference images.

Model Download

📂ComfyUI/
└── 📂models/
    ├── 📂diffusion_models/
    │   └── qwen_image_edit_2509_fp8_e4m3fn.safetensors
    └── 📂unet/
        └── qwen-image-edit-2509.gguf      ← Only when using gguf

workflow (Single Image)

Qwen-Image-Edit-2509.json
  • The basic flow is the same as the original version, but replace the TextEncodeQwenImageEdit node with the TextEncodeQwenImageEditPlus node.

workflow (Multiple Images)

Qwen-Image-Edit-2509_multi-ref.json
  • 🟩 Because the model actually looks at the images, even somewhat vague instructions work, but you can also refer to a specific image explicitly, e.g. "the XX in image1" or "the XX in image2".

In the earlier workflows, because we wanted the input image and the edited image to end up the same size whenever possible, we resized the image first and fed it into latent_image.

On the other hand, if you just want to generate a new image using the reference images as hints, it is fine to use the EmptySD3LatentImage node, as in text2image.
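Either way, the latent handed to the KSampler has the same shape: the pixel dimensions divided by the VAE's 8x downsampling factor. A minimal sketch of that shape calculation (the 16-channel count is an assumption about Qwen-Image's latent space, and `latent_shape` is an illustrative helper, not a ComfyUI API):

```python
def latent_shape(width, height, batch=1, channels=16, downscale=8):
    # Spatial dims shrink by the VAE's 8x factor; the channel count (16)
    # is an assumption about Qwen-Image's VAE, not a confirmed spec.
    return (batch, channels, height // downscale, width // downscale)

# Same shape whether it comes from EmptySD3LatentImage (zeros, text2image-style)
# or from VAE-encoding a preprocessed input image.
print(latent_shape(1328, 1328))  # -> (1, 16, 166, 166)
```

The only practical difference is the latent's contents: zeros for a fresh generation versus the encoded input image when you want the output to track it.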


Qwen-Image-Edit-2511

Qwen-Image-Edit-2511 is a new model that improves upon 2509.

There is no drastic change like the jump from the original to 2509, but it brings steady improvements, such as better character consistency and built-in integration of popular LoRAs like the Relighting LoRA.

Model Download

📂ComfyUI/
└── 📂models/
    ├── 📂diffusion_models/
    │   └── qwen_image_edit_2511_fp8mixed.safetensors
    └── 📂unet/
        └── qwen-image-edit-2511-XXXX.gguf      ← Only when using gguf

workflow

Qwen-Image-Edit-2511.json

It works with exactly the same workflow as 2509.


Lightning

Qwen-Image-Edit-Lightning is a set of distilled LoRAs that lets Qwen-Image-Edit run in 4 or 8 steps.

Because it dramatically reduces the step count with almost no quality loss, it is used in many workflows.

Model Download

📂ComfyUI/
└── 📂models/
    └── 📂loras/
        ├── Qwen-Image-Edit-2509-Lightning-4steps-V1.0-bf16.safetensors
        ├── Qwen-Image-Edit-2509-Lightning-8steps-V1.0-bf16.safetensors
        └── Qwen-Image-Edit-2511-Lightning-4steps-V1.0-bf16.safetensors

Qwen-Image-Edit-2509

Qwen-Image-Edit_lightning_8steps.json
  • Load Lightning LoRA with the LoraLoaderModelOnly node.
  • Set steps in KSampler to 4 or 8, and CFG to 1.0.

Qwen-Image-Edit-2511

Qwen-Image-Edit-2511_lightning_4steps.json
  • Load Lightning LoRA with the LoraLoaderModelOnly node.