Qwen-Image-Layered

What is Qwen-Image-Layered?

It is a diffusion model that decomposes an input image into an arbitrary number of layers.

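Image editing is popular these days, but an edit sometimes changes parts of the image that have nothing to do with the instruction. So why not split the image into layers, as designers have always done, and edit only the target layer? That is the motivation behind this task.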

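It is also notable as the first general-purpose method that handles transparent (RGBA) images.
Earlier methods needed post-processing, or special handling only at decode time. This one takes the more straightforward approach of treating the data as RGBA images throughout.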

Downloading the models

diffusion_models
- qwen_image_layered_fp8mixed.safetensors (20.5 GB)
text_encoders
- qwen_2.5_vl_7b_fp8_scaled.safetensors (9.38 GB)
vae
- qwen_image_layered_vae.safetensors (254 MB)
gguf (optional)
- QuantStack/Qwen-Image-Layered-GGUF

📂ComfyUI/
└── 📂models/
    ├── 📂diffusion_models/
    │   └── qwen_image_layered_fp8mixed.safetensors
    ├── 📂text_encoders/
    │   └── qwen_2.5_vl_7b_fp8_scaled.safetensors
    ├── 📂unet/
    │   └── Qwen_Image_Layered-XXXX.gguf          ← only when using gguf
    └── 📂vae/
        └── qwen_image_layered_vae.safetensors
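
Before loading the workflow, it can help to confirm that the three required files are where the tree above expects them. The following is a minimal sketch using only the Python standard library. The function name `find_missing_models` and the `"ComfyUI"` root path are placeholders chosen for this example, not part of ComfyUI.

```python
from pathlib import Path

# Expected layout, copied from the tree above (folder -> file name).
EXPECTED_FILES = {
    "diffusion_models": "qwen_image_layered_fp8mixed.safetensors",
    "text_encoders": "qwen_2.5_vl_7b_fp8_scaled.safetensors",
    "vae": "qwen_image_layered_vae.safetensors",
}


def find_missing_models(comfyui_root):
    """Return the expected model paths that do not exist under comfyui_root."""
    models_dir = Path(comfyui_root) / "models"
    missing = []
    for folder, filename in EXPECTED_FILES.items():
        path = models_dir / folder / filename
        if not path.is_file():
            missing.append(path)
    return missing


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; point it at your own install directory.
    for path in find_missing_models("ComfyUI"):
        print(f"missing: {path}")
```

Note that the UNETLoader node in the workflow below stores the model name with a `Qwen-Image\` subfolder prefix, so if your file sits directly in `diffusion_models/` you will likely need to reselect it in the node.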

workflow

Qwen-Image-Layered.json

{
  "id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
  "revision": 0,
  "last_node_id": 87,
  "last_link_id": 148,
  "nodes": [
    {
      "id": 38,
      "type": "CLIPLoader",
      "pos": [
        56.288665771484375,
        312.74468994140625
      ],
      "size": [
        301.3524169921875,
        106
      ],
      "flags": {},
      "order": 0,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "CLIP",
          "type": "CLIP",
          "slot_index": 0,
          "links": [
            74,
            75
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPLoader"
      },
      "widgets_values": [
        "qwen_2.5_vl_7b_fp8_scaled.safetensors",
        "qwen_image",
        "default"
      ],
      "color": "#432",
      "bgcolor": "#653"
    },
    {
      "id": 57,
      "type": "ReferenceLatent",
      "pos": [
        864.2781462760086,
        186
      ],
      "size": [
        204.134765625,
        46
      ],
      "flags": {},
      "order": 12,
      "mode": 0,
      "inputs": [
        {
          "name": "conditioning",
          "type": "CONDITIONING",
          "link": 103
        },
        {
          "name": "latent",
          "shape": 7,
          "type": "LATENT",
          "link": 110
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "links": [
            104
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.6.0",
        "Node name for S&R": "ReferenceLatent"
      },
      "widgets_values": []
    },
    {
      "id": 58,
      "type": "ReferenceLatent",
      "pos": [
        864.2781462760086,
        405.392333984375
      ],
      "size": [
        204.134765625,
        46
      ],
      "flags": {},
      "order": 11,
      "mode": 0,
      "inputs": [
        {
          "name": "conditioning",
          "type": "CONDITIONING",
          "link": 102
        },
        {
          "name": "latent",
          "shape": 7,
          "type": "LATENT",
          "link": 109
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "links": [
            105
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.6.0",
        "Node name for S&R": "ReferenceLatent"
      },
      "widgets_values": []
    },
    {
      "id": 54,
      "type": "ModelSamplingAuraFlow",
      "pos": [
        838.0823302359695,
        42.94671378647985
      ],
      "size": [
        230.33058166503906,
        58
      ],
      "flags": {},
      "order": 7,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 99
        }
      ],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "links": [
            100
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.49",
        "Node name for S&R": "ModelSamplingAuraFlow"
      },
      "widgets_values": [
        1
      ]
    },
    {
      "id": 7,
      "type": "CLIPTextEncode",
      "pos": [
        415.9506530761719,
        405.392333984375
      ],
      "size": [
        418.3189392089844,
        107.08506774902344
      ],
      "flags": {},
      "order": 6,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 75
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            102
          ]
        }
      ],
      "title": "CLIP Text Encode (Negative Prompt)",
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "text, worst quality, blurry, ugly"
      ]
    },
    {
      "id": 64,
      "type": "ImageScaleToTotalPixels",
      "pos": [
        249.72535062227473,
        718.9234534762987
      ],
      "size": [
        229.5555480957031,
        106
      ],
      "flags": {},
      "order": 8,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 115
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            113,
            114
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.6.0",
        "Node name for S&R": "ImageScaleToTotalPixels"
      },
      "widgets_values": [
        "nearest-exact",
        0.5,
        1
      ]
    },
    {
      "id": 6,
      "type": "CLIPTextEncode",
      "pos": [
        415,
        186
      ],
      "size": [
        419.26959228515625,
        156.00363159179688
      ],
      "flags": {},
      "order": 5,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 74
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            103
          ]
        }
      ],
      "title": "CLIP Text Encode (Positive Prompt)",
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "Intimate macro of a 33-year-old Brazilian dancer's feet en pointe, focus on toes and ballet shoe, studio lighting from above, shot on Sony FE 90mm f/2.8 macro, realistic worn shoe fabric texture, individual toe details visible through shoe, strained tendons, slight blood spot on shoe tip, dusty studio floor texture, ankle ribbons tied tight uphill"
      ]
    },
    {
      "id": 37,
      "type": "UNETLoader",
      "pos": [
        497.22367921939565,
        42.94671378647985
      ],
      "size": [
        305.3782043457031,
        82
      ],
      "flags": {},
      "order": 1,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            99
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "UNETLoader"
      },
      "widgets_values": [
        "Qwen-Image\\qwen_image_layered_fp8mixed.safetensors",
        "fp8_e4m3fn"
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 39,
      "type": "VAELoader",
      "pos": [
        223.02005587937379,
        578.5647381339587
      ],
      "size": [
        256.26084283860405,
        58
      ],
      "flags": {},
      "order": 2,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "VAE",
          "type": "VAE",
          "slot_index": 0,
          "links": [
            116,
            122
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAELoader"
      },
      "widgets_values": [
        "qwen_image_layered_vae.safetensors"
      ],
      "color": "#322",
      "bgcolor": "#533"
    },
    {
      "id": 61,
      "type": "LoadImage",
      "pos": [
        -134.3561028852609,
        718.9234534762987
      ],
      "size": [
        353.5766357421875,
        459.44451904296864
      ],
      "flags": {},
      "order": 3,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            115
          ]
        },
        {
          "name": "MASK",
          "type": "MASK",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.6.0",
        "Node name for S&R": "LoadImage"
      },
      "widgets_values": [
        "pasted/image (113).png",
        "image"
      ]
    },
    {
      "id": 60,
      "type": "VAEEncode",
      "pos": [
        512.2301683876235,
        581.0691055180919
      ],
      "size": [
        171.72218557769065,
        46
      ],
      "flags": {},
      "order": 9,
      "mode": 0,
      "inputs": [
        {
          "name": "pixels",
          "type": "IMAGE",
          "link": 113
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 116
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "links": [
            109,
            110
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.6.0",
        "Node name for S&R": "VAEEncode"
      },
      "widgets_values": []
    },
    {
      "id": 63,
      "type": "GetImageSize",
      "pos": [
        512.2301683876235,
        718.9234534762987
      ],
      "size": [
        210,
        136
      ],
      "flags": {},
      "order": 10,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 114
        }
      ],
      "outputs": [
        {
          "name": "width",
          "type": "INT",
          "links": [
            117
          ]
        },
        {
          "name": "height",
          "type": "INT",
          "links": [
            118
          ]
        },
        {
          "name": "batch_size",
          "type": "INT",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.6.0",
        "Node name for S&R": "GetImageSize"
      },
      "widgets_values": []
    },
    {
      "id": 66,
      "type": "VAEDecode",
      "pos": [
        1696.6426615505557,
        173.13380452764375
      ],
      "size": [
        166.0271370269786,
        46
      ],
      "flags": {},
      "order": 16,
      "mode": 0,
      "inputs": [
        {
          "name": "samples",
          "type": "LATENT",
          "link": 121
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 122
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "slot_index": 0,
          "links": [
            120,
            128,
            129
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAEDecode"
      },
      "widgets_values": []
    },
    {
      "id": 77,
      "type": "ImageFromBatch",
      "pos": [
        1562.2305676328322,
        947.806885766381
      ],
      "size": [
        210,
        82
      ],
      "flags": {},
      "order": 19,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 129
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            131
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.6.0",
        "Node name for S&R": "ImageFromBatch"
      },
      "widgets_values": [
        2,
        1
      ]
    },
    {
      "id": 3,
      "type": "KSampler",
      "pos": [
        1104.4448189452391,
        173.13380452764375
      ],
      "size": [
        315,
        262
      ],
      "flags": {},
      "order": 14,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 100
        },
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 104
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "link": 105
        },
        {
          "name": "latent_image",
          "type": "LATENT",
          "link": 108
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            119
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "KSampler"
      },
      "widgets_values": [
        1234,
        "fixed",
        20,
        2.5,
        "euler",
        "simple",
        1
      ]
    },
    {
      "id": 76,
      "type": "ImageFromBatch",
      "pos": [
        1562.2305676328322,
        797.301847591665
      ],
      "size": [
        210,
        82
      ],
      "flags": {},
      "order": 18,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 128
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            136
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.6.0",
        "Node name for S&R": "ImageFromBatch"
      },
      "widgets_values": [
        1,
        1
      ]
    },
    {
      "id": 67,
      "type": "SaveImage",
      "pos": [
        1939.850523648282,
        171.13269321533235
      ],
      "size": [
        428.5909735732416,
        468.94454416638166
      ],
      "flags": {
        "collapsed": false
      },
      "order": 17,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 120
        }
      ],
      "outputs": [],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.76"
      },
      "widgets_values": [
        "ComfyUI"
      ]
    },
    {
      "id": 65,
      "type": "LatentCutToBatch",
      "pos": [
        1453.0437402478974,
        173.13380452764375
      ],
      "size": [
        210,
        82
      ],
      "flags": {},
      "order": 15,
      "mode": 0,
      "inputs": [
        {
          "name": "samples",
          "type": "LATENT",
          "link": 119
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "links": [
            121
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.6.0",
        "Node name for S&R": "LatentCutToBatch"
      },
      "widgets_values": [
        "t",
        1
      ],
      "color": "#332922",
      "bgcolor": "#593930"
    },
    {
      "id": 55,
      "type": "MarkdownNote",
      "pos": [
        12.546970997699502,
        -11.88447421897053
      ],
      "size": [
        345.70001220703125,
        225.77000427246094
      ],
      "flags": {},
      "order": 4,
      "mode": 0,
      "inputs": [],
      "outputs": [],
      "properties": {},
      "widgets_values": [
        "## models\n\n- [qwen_image_layered_fp8mixed.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-Layered_ComfyUI/blob/main/split_files/diffusion_models/qwen_image_layered_fp8mixed.safetensors)\n- [qwen_2.5_vl_7b_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/blob/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors)\n- [qwen_image_layered_vae.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-Layered_ComfyUI/blob/main/split_files/vae/qwen_image_layered_vae.safetensors)\n\n\n```\n📂ComfyUI/\n└── 📂models/\n    ├── 📂diffusion_models/\n    │   └── qwen_image_layered_fp8mixed.safetensors\n    ├── 📂text_encoders/\n    │   └── qwen_2.5_vl_7b_fp8_scaled.safetensors\n    └── 📂vae/\n         └── qwen_image_layered_vae.safetensors\n```"
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 80,
      "type": "PreviewImage",
      "pos": [
        2460.612938515339,
        795.0211535441636
      ],
      "size": [
        353.88890380859357,
        371.8889038085938
      ],
      "flags": {},
      "order": 24,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 135
        }
      ],
      "outputs": [],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.6.0",
        "Node name for S&R": "PreviewImage"
      },
      "widgets_values": []
    },
    {
      "id": 59,
      "type": "EmptyQwenImageLayeredLatentImage",
      "pos": [
        755.2170791040447,
        693.0025348793122
      ],
      "size": [
        305.1563720703124,
        130
      ],
      "flags": {},
      "order": 13,
      "mode": 0,
      "inputs": [
        {
          "name": "width",
          "type": "INT",
          "widget": {
            "name": "width"
          },
          "link": 117
        },
        {
          "name": "height",
          "type": "INT",
          "widget": {
            "name": "height"
          },
          "link": 118
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "links": [
            108
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.6.0",
        "Node name for S&R": "EmptyQwenImageLayeredLatentImage"
      },
      "widgets_values": [
        640,
        640,
        2,
        1
      ],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 81,
      "type": "SplitImageWithAlpha",
      "pos": [
        1792.8667692821584,
        797.301847591665
      ],
      "size": [
        213.68285814424544,
        46
      ],
      "flags": {},
      "order": 20,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 136
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            144
          ]
        },
        {
          "name": "MASK",
          "type": "MASK",
          "links": []
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.6.0",
        "Node name for S&R": "SplitImageWithAlpha"
      },
      "widgets_values": []
    },
    {
      "id": 87,
      "type": "InvertMask",
      "pos": [
        2032.7321075290527,
        965.5261917188795
      ],
      "size": [
        140,
        26
      ],
      "flags": {},
      "order": 22,
      "mode": 0,
      "inputs": [
        {
          "name": "mask",
          "type": "MASK",
          "link": 147
        }
      ],
      "outputs": [
        {
          "name": "MASK",
          "type": "MASK",
          "links": [
            148
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.6.0",
        "Node name for S&R": "InvertMask"
      }
    },
    {
      "id": 79,
      "type": "SplitImageWithAlpha",
      "pos": [
        1792.8667692821584,
        947.806885766381
      ],
      "size": [
        213.68285814424544,
        46
      ],
      "flags": {},
      "order": 21,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 131
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            143
          ]
        },
        {
          "name": "MASK",
          "type": "MASK",
          "links": [
            147
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.6.0",
        "Node name for S&R": "SplitImageWithAlpha"
      },
      "widgets_values": []
    },
    {
      "id": 74,
      "type": "ImageCompositeMasked",
      "pos": [
        2200.3106540392373,
        795.0211535441636
      ],
      "size": [
        228.33342285156277,
        146
      ],
      "flags": {},
      "order": 23,
      "mode": 0,
      "inputs": [
        {
          "name": "destination",
          "type": "IMAGE",
          "link": 144
        },
        {
          "name": "source",
          "type": "IMAGE",
          "link": 143
        },
        {
          "name": "mask",
          "shape": 7,
          "type": "MASK",
          "link": 148
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            135
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.6.0",
        "Node name for S&R": "ImageCompositeMasked"
      },
      "widgets_values": [
        0,
        0,
        false
      ]
    }
  ],
  "links": [
    [
      74,
      38,
      0,
      6,
      0,
      "CLIP"
    ],
    [
      75,
      38,
      0,
      7,
      0,
      "CLIP"
    ],
    [
      99,
      37,
      0,
      54,
      0,
      "MODEL"
    ],
    [
      100,
      54,
      0,
      3,
      0,
      "MODEL"
    ],
    [
      102,
      7,
      0,
      58,
      0,
      "CONDITIONING"
    ],
    [
      103,
      6,
      0,
      57,
      0,
      "CONDITIONING"
    ],
    [
      104,
      57,
      0,
      3,
      1,
      "CONDITIONING"
    ],
    [
      105,
      58,
      0,
      3,
      2,
      "CONDITIONING"
    ],
    [
      108,
      59,
      0,
      3,
      3,
      "LATENT"
    ],
    [
      109,
      60,
      0,
      58,
      1,
      "LATENT"
    ],
    [
      110,
      60,
      0,
      57,
      1,
      "LATENT"
    ],
    [
      113,
      64,
      0,
      60,
      0,
      "IMAGE"
    ],
    [
      114,
      64,
      0,
      63,
      0,
      "IMAGE"
    ],
    [
      115,
      61,
      0,
      64,
      0,
      "IMAGE"
    ],
    [
      116,
      39,
      0,
      60,
      1,
      "VAE"
    ],
    [
      117,
      63,
      0,
      59,
      0,
      "INT"
    ],
    [
      118,
      63,
      1,
      59,
      1,
      "INT"
    ],
    [
      119,
      3,
      0,
      65,
      0,
      "LATENT"
    ],
    [
      120,
      66,
      0,
      67,
      0,
      "IMAGE"
    ],
    [
      121,
      65,
      0,
      66,
      0,
      "LATENT"
    ],
    [
      122,
      39,
      0,
      66,
      1,
      "VAE"
    ],
    [
      128,
      66,
      0,
      76,
      0,
      "IMAGE"
    ],
    [
      129,
      66,
      0,
      77,
      0,
      "IMAGE"
    ],
    [
      131,
      77,
      0,
      79,
      0,
      "IMAGE"
    ],
    [
      135,
      74,
      0,
      80,
      0,
      "IMAGE"
    ],
    [
      136,
      76,
      0,
      81,
      0,
      "IMAGE"
    ],
    [
      143,
      79,
      0,
      74,
      1,
      "IMAGE"
    ],
    [
      144,
      81,
      0,
      74,
      0,
      "IMAGE"
    ],
    [
      147,
      79,
      1,
      87,
      0,
      "MASK"
    ],
    [
      148,
      87,
      0,
      74,
      2,
      "MASK"
    ]
  ],
  "groups": [
    {
      "id": 1,
      "title": "Image Composite",
      "bounding": [
        1552.2305676328322,
        723.701847591665,
        1277.625191808733,
        456.85072813132865
      ],
      "color": "#3f789e",
      "font_size": 24,
      "flags": {}
    }
  ],
  "config": {},
  "extra": {
    "ds": {
      "scale": 0.6209213230591553,
      "offset": [
        306.76748492398815,
        311.2185045299079
      ]
    },
    "frontendVersion": "1.36.12",
    "VHS_latentpreview": false,
    "VHS_latentpreviewrate": 0,
    "VHS_MetadataImage": true,
    "VHS_KeepIntermediate": true
  },
  "version": 0.4
}
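
The JSON above is long, so here is a small sketch for reading its wiring as text. It assumes the layout that can be read off the file itself: each entry in `links` is `[link_id, source_node_id, source_slot, destination_node_id, destination_slot, type]`. The helper names `describe_links` and `load_workflow` are made up for this example.

```python
import json


def describe_links(workflow):
    """Return one 'source -> destination' line per link in a workflow dict."""
    # Map node id -> node type so links can be shown with readable names.
    node_types = {node["id"]: node["type"] for node in workflow["nodes"]}
    lines = []
    for link_id, src, src_slot, dst, dst_slot, data_type in workflow["links"]:
        lines.append(
            f"{node_types[src]}#{src}[{src_slot}] -> "
            f"{node_types[dst]}#{dst}[{dst_slot}] ({data_type})"
        )
    return lines


def load_workflow(path):
    """Read a workflow JSON file such as Qwen-Image-Layered.json."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Calling `describe_links(load_workflow("Qwen-Image-Layered.json"))` and printing the result gives one line per connection. For link 74 in the file above, that line is `CLIPLoader#38[0] -> CLIPTextEncode#6[0] (CLIP)`.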

Resizing the input image
- You can go up to 1024px, but processing tends to get heavier as the number of layers grows, so it is set to 0.5 megapixels here.
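
To get a feel for what a 0.5 megapixel budget means for a given input, the resize can be sketched as below. This is an illustration of the idea, not ComfyUI's code. It assumes that 1 megapixel is counted as 1024 × 1024 pixels and that the aspect ratio is preserved. The function name `scale_to_total_pixels` is hypothetical.

```python
import math


def scale_to_total_pixels(width, height, megapixels):
    """Return (new_width, new_height) with about `megapixels` total pixels.

    Assumption: 1 megapixel is counted as 1024 * 1024 pixels here.
    The aspect ratio is kept; both sides are rounded to whole pixels.
    """
    target_pixels = megapixels * 1024 * 1024
    scale = math.sqrt(target_pixels / (width * height))
    return round(width * scale), round(height * scale)
```

Under these assumptions, a 1024 × 1024 input becomes 724 × 724 at 0.5 megapixels, and a 2048 × 1024 input becomes 1024 × 512.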
🟩Empty Qwen Image Layered Latent
- layers: the number of layers to split the image into
- Raising this value also increases memory use and processing time.
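🟫LatentCutToBatch
- It is hard to see what this node is doing, but you can think of it as a "reshaping" step that exists for implementation reasons.
- As its name suggests, the model outputs multiple images as "layers", but the current VAE Decode does not handle the concept of layers well, so this node converts them into a plain batch of N images.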
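
The reshaping can be pictured at the level of tensor shapes. The sketch below is shape arithmetic only, not ComfyUI's implementation. The `layers + 1` frame count follows this article (2 layers give 3 images), and the "t" axis matches the node's widget value in the workflow. The 16 latent channels, the 8× spatial downscale, and both function names are assumptions made for illustration.

```python
def layered_latent_shape(width, height, layers, channels=16, downscale=8):
    """Shape (batch, channels, t, h, w) of an empty layered latent.

    `layers + 1` follows the article (2 layers -> 3 images).
    channels=16 and downscale=8 are assumptions about the VAE.
    """
    return (1, channels, layers + 1, height // downscale, width // downscale)


def cut_t_to_batch_shape(shape, slice_size=1):
    """Shape after cutting the t axis into slices and stacking them on batch."""
    batch, channels, t, h, w = shape
    if t % slice_size != 0:
        raise ValueError("t must be divisible by slice_size")
    return (batch * (t // slice_size), channels, slice_size, h, w)
```

With the workflow's defaults (640 × 640, 2 layers), this sketch gives `(1, 16, 3, 80, 80)` before the cut and `(3, 16, 1, 80, 80)` after it, which is the "batch of N images" form that VAE Decode can process.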
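🟦Recompositing the images (optional)
- When you split into 2 layers, a total of 3 RGBA images is output (the original image plus the decomposition results).
- Stacking the second and later images one after another with ImageCompositeMasked restores the original single image.
  - However, this node only handles RGB images, so each layer has to be converted into an RGB image plus a mask.
  - cf. Masks and alpha channels
- It is tedious, but node-based UIs and layer systems are a poor match, and not only in ComfyUI 😥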
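
The composite group in the workflow (SplitImageWithAlpha → InvertMask → ImageCompositeMasked) can be sketched in plain Python on lists of pixels. This is a simplified model rather than the nodes' actual code. It assumes that the MASK output of SplitImageWithAlpha is `1 - alpha`, which is consistent with the workflow inverting it before compositing, and that the composite is a linear blend. All function names below are hypothetical.

```python
def split_image_with_alpha(rgba_pixels):
    """Split RGBA pixels into RGB pixels and a mask, where mask = 1 - alpha."""
    rgb = [(r, g, b) for r, g, b, _ in rgba_pixels]
    mask = [1.0 - a for _, _, _, a in rgba_pixels]
    return rgb, mask


def invert_mask(mask):
    """Invert a mask so that 1 - alpha becomes alpha again."""
    return [1.0 - m for m in mask]


def composite_masked(destination, source, mask):
    """Blend source over destination: out = dest * (1 - m) + src * m."""
    return [
        tuple(d * (1.0 - m) + s * m for d, s in zip(dst_px, src_px))
        for dst_px, src_px, m in zip(destination, source, mask)
    ]


def stack_layers(layers_rgba):
    """Composite RGBA layers bottom-to-top into one list of RGB pixels."""
    result, _ = split_image_with_alpha(layers_rgba[0])
    for layer in layers_rgba[1:]:
        rgb, mask = split_image_with_alpha(layer)
        result = composite_masked(result, rgb, invert_mask(mask))
    return result
```

Each extra layer adds one more split, invert, and composite step, which is why the node graph grows with the layer count.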

References

Qwen Image Edit 2511 & Qwen Image Layered in ComfyUI
