Wan2.1

Wan2.1とは？

Wan2.1 は、アリババが開発したオープンソースの動画生成モデルです。

オープンソース界隈で、本格的に動画生成が行われるようになった火付け役ともいえる、印象的なモデルです。

text2video、image2video、FLF2V の 3 つのモードに対応しています。
また、1 フレームだけ生成することで、画像生成モデルとしても使用できます。

細かい話にはなりますが、これは「動画生成が画像生成を内包している」というわけではなく、最初から画像生成もできるように設計された動画生成モデルです。

1.3B と 14B の 2 つのモデルサイズが用意されていますが、1.3B はさすがに性能が足りずほとんど使われていないため、ここでは 14B のみ使っていきます。

推奨設定

推奨解像度
- 480p（854×480）〜 720p（1280×720）
最大フレーム数
- 81 フレーム
FPS
- 16fps 付近で出力されることが多い

16fps だとスローモーションのような動画になることが多いので、24fps で保存したり、コマ落としで調整したりしてください。

モデルのダウンロード

diffusion models
text encoder
- umt5_xxl_fp8_e4m3fn_scaled.safetensors
VAE
- wan_2.1_vae.safetensors
gguf (任意)

📂ComfyUI/
└── 📂models/
    ├── 📂diffusion_models/
    │   ├── wan2.1_t2v_14B_fp8_e4m3fn.safetensors
    │   ├── wan2.1_i2v_720p_14B_fp8_e4m3fn.safetensors
    │   └── wan2.1_flf2v_720p_14B_fp8_e4m3fn.safetensors
    ├── 📂text_encoders/
    │   └── umt5_xxl_fp8_e4m3fn_scaled.safetensors
    ├── 📂unet/
    │   ├── wan2.1-t2v-14b-XXXX.gguf          ← gguf を使う場合のみ
    │   ├── wan2.1-i2v-14b-720p-XXXX.gguf     ← gguf を使う場合のみ
    │   └── wan2.1-flf2v-14b-720p-XXXX.gguf   ← gguf を使う場合のみ
    └── 📂vae/
        └── wan_2.1_vae.safetensors

fp16 / bf16 版を使いたい場合は、上の fp8 のファイル名を読み替えてください。基本的な配置パスは同じです。

text2video

Wan2.1 の基本となる text2video の workflow です。

Wan2.1_text2video_14B.json

{
  "id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
  "revision": 0,
  "last_node_id": 51,
  "last_link_id": 96,
  "nodes": [
    {
      "id": 8,
      "type": "VAEDecode",
      "pos": [
        1252.432861328125,
        188.1918182373047
      ],
      "size": [
        157.56002807617188,
        46
      ],
      "flags": {},
      "order": 8,
      "mode": 0,
      "inputs": [
        {
          "name": "samples",
          "type": "LATENT",
          "link": 35
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 76
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "slot_index": 0,
          "links": [
            96
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAEDecode"
      },
      "widgets_values": []
    },
    {
      "id": 3,
      "type": "KSampler",
      "pos": [
        898.7548217773438,
        188.1918182373047
      ],
      "size": [
        315,
        262
      ],
      "flags": {},
      "order": 7,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 95
        },
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 46
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "link": 52
        },
        {
          "name": "latent_image",
          "type": "LATENT",
          "link": 91
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            35
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "KSampler"
      },
      "widgets_values": [
        1234,
        "fixed",
        30,
        6,
        "uni_pc",
        "simple",
        1
      ]
    },
    {
      "id": 49,
      "type": "VHS_VideoCombine",
      "pos": [
        1448.6710205078125,
        188.1918182373047
      ],
      "size": [
        372.2688903808594,
        547.3974851212412
      ],
      "flags": {},
      "order": 9,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 96
        },
        {
          "name": "audio",
          "shape": 7,
          "type": "AUDIO",
          "link": null
        },
        {
          "name": "meta_batch",
          "shape": 7,
          "type": "VHS_BatchManager",
          "link": null
        },
        {
          "name": "vae",
          "shape": 7,
          "type": "VAE",
          "link": null
        }
      ],
      "outputs": [
        {
          "name": "Filenames",
          "type": "VHS_FILENAMES",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfyui-videohelpersuite",
        "ver": "a7ce59e381934733bfae03b1be029756d6ce936d",
        "Node name for S&R": "VHS_VideoCombine"
      },
      "widgets_values": {
        "frame_rate": 16,
        "loop_count": 0,
        "filename_prefix": "Wan2.1",
        "format": "video/h264-mp4",
        "pix_fmt": "yuv420p",
        "crf": 19,
        "save_metadata": true,
        "trim_to_audio": false,
        "pingpong": false,
        "save_output": true,
        "videopreview": {
          "hidden": false,
          "paused": false,
          "params": {
            "filename": "Wan2.1_00014.mp4",
            "subfolder": "",
            "type": "output",
            "format": "video/h264-mp4",
            "frame_rate": 16,
            "workflow": "Wan2.1_00014.png",
            "fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\Wan2.1_00014.mp4"
          }
        }
      }
    },
    {
      "id": 6,
      "type": "CLIPTextEncode",
      "pos": [
        415,
        186
      ],
      "size": [
        419.26959228515625,
        148.8194122314453
      ],
      "flags": {},
      "order": 5,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 74
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            46
          ]
        }
      ],
      "title": "CLIP Text Encode (Positive Prompt)",
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "Filming a dilapidated tram traveling at dusk, from inside the carriage."
      ]
    },
    {
      "id": 37,
      "type": "UNETLoader",
      "pos": [
        292.3735046386719,
        50.834712982177734
      ],
      "size": [
        305.3782043457031,
        82
      ],
      "flags": {},
      "order": 0,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            94
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "UNETLoader"
      },
      "widgets_values": [
        "Wan2.1\\wan2.1_t2v_14B_fp8_e4m3fn.safetensors",
        "default"
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 38,
      "type": "CLIPLoader",
      "pos": [
        56.288665771484375,
        312.74468994140625
      ],
      "size": [
        301.3524169921875,
        106
      ],
      "flags": {},
      "order": 1,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "CLIP",
          "type": "CLIP",
          "slot_index": 0,
          "links": [
            74,
            75
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPLoader"
      },
      "widgets_values": [
        "umt5_xxl_fp8_e4m3fn_scaled.safetensors",
        "wan",
        "default"
      ],
      "color": "#432",
      "bgcolor": "#653"
    },
    {
      "id": 39,
      "type": "VAELoader",
      "pos": [
        942.8928474299169,
        72.63327169166966
      ],
      "size": [
        270.8619743474269,
        58.62092132305915
      ],
      "flags": {},
      "order": 2,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "VAE",
          "type": "VAE",
          "slot_index": 0,
          "links": [
            76
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAELoader"
      },
      "widgets_values": [
        "wan_2.1_vae.safetensors"
      ],
      "color": "#322",
      "bgcolor": "#533"
    },
    {
      "id": 7,
      "type": "CLIPTextEncode",
      "pos": [
        415,
        389
      ],
      "size": [
        419.3189392089844,
        138.8924560546875
      ],
      "flags": {},
      "order": 6,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 75
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            52
          ]
        }
      ],
      "title": "CLIP Text Encode (Negative Prompt)",
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "色调艳丽，过曝，静态，细节模糊不清，字幕，风格，作品，画作，画面，静止，整体发灰，最差质量，低质量，JPEG压缩残留，丑陋的，残缺的，多余的手指，画得不好的手部，画得不好的脸部，畸形的，毁容的，形态畸形的肢体，手指融合，静止不动的画面，杂乱的背景，三条腿，背景人很多，倒着走 "
      ]
    },
    {
      "id": 40,
      "type": "EmptyHunyuanLatentVideo",
      "pos": [
        543.8643937544389,
        595.3501586914062
      ],
      "size": [
        290.4545454545455,
        130
      ],
      "flags": {},
      "order": 3,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            91
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "EmptyHunyuanLatentVideo"
      },
      "widgets_values": [
        848,
        480,
        81,
        1
      ]
    },
    {
      "id": 48,
      "type": "ModelSamplingSD3",
      "pos": [
        624.2695922851562,
        50.834712982177734
      ],
      "size": [
        210,
        58
      ],
      "flags": {},
      "order": 4,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 94
        }
      ],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            95
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "ModelSamplingSD3"
      },
      "widgets_values": [
        8
      ],
      "color": "#2a363b",
      "bgcolor": "#3f5159"
    }
  ],
  "links": [
    [
      35,
      3,
      0,
      8,
      0,
      "LATENT"
    ],
    [
      46,
      6,
      0,
      3,
      1,
      "CONDITIONING"
    ],
    [
      52,
      7,
      0,
      3,
      2,
      "CONDITIONING"
    ],
    [
      74,
      38,
      0,
      6,
      0,
      "CLIP"
    ],
    [
      75,
      38,
      0,
      7,
      0,
      "CLIP"
    ],
    [
      76,
      39,
      0,
      8,
      1,
      "VAE"
    ],
    [
      91,
      40,
      0,
      3,
      3,
      "LATENT"
    ],
    [
      94,
      37,
      0,
      48,
      0,
      "MODEL"
    ],
    [
      95,
      48,
      0,
      3,
      0,
      "MODEL"
    ],
    [
      96,
      8,
      0,
      49,
      0,
      "IMAGE"
    ]
  ],
  "groups": [],
  "config": {},
  "extra": {
    "ds": {
      "scale": 0.9090909090909091,
      "offset": [
        44.811334228515626,
        50.26528701782227
      ]
    },
    "frontendVersion": "1.35.0",
    "VHS_latentpreview": false,
    "VHS_latentpreviewrate": 0,
    "VHS_MetadataImage": true,
    "VHS_KeepIntermediate": true
  },
  "version": 0.4
}

🟦 ModelSamplingSD3 ノードの Shift は、動きの大きさに効くパラメータです。
- 上げるほどカメラワークや被写体の変化が大きくなりますが、大きすぎれば崩れる原因になります。ひとまず 8 のままで問題ないでしょう。
- cf. Wan2.1 parameter sweep

品質が上がるかもしれない技術

体感できるほどではないですが、ほぼノーデメリットで品質を上げる技術がコアノードとして実装されているので、使っておきましょう。

Wan2.1_text2video_14B_imp.json

{
  "id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
  "revision": 0,
  "last_node_id": 53,
  "last_link_id": 99,
  "nodes": [
    {
      "id": 3,
      "type": "KSampler",
      "pos": [
        898.7548217773438,
        188.1918182373047
      ],
      "size": [
        315,
        262
      ],
      "flags": {},
      "order": 9,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 99
        },
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 46
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "link": 52
        },
        {
          "name": "latent_image",
          "type": "LATENT",
          "link": 91
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            35
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "KSampler"
      },
      "widgets_values": [
        1234,
        "fixed",
        30,
        6,
        "uni_pc",
        "simple",
        1
      ]
    },
    {
      "id": 49,
      "type": "VHS_VideoCombine",
      "pos": [
        1448.6710205078125,
        188.1918182373047
      ],
      "size": [
        372.2688903808594,
        547.3974851212412
      ],
      "flags": {},
      "order": 11,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 96
        },
        {
          "name": "audio",
          "shape": 7,
          "type": "AUDIO",
          "link": null
        },
        {
          "name": "meta_batch",
          "shape": 7,
          "type": "VHS_BatchManager",
          "link": null
        },
        {
          "name": "vae",
          "shape": 7,
          "type": "VAE",
          "link": null
        }
      ],
      "outputs": [
        {
          "name": "Filenames",
          "type": "VHS_FILENAMES",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfyui-videohelpersuite",
        "ver": "a7ce59e381934733bfae03b1be029756d6ce936d",
        "Node name for S&R": "VHS_VideoCombine"
      },
      "widgets_values": {
        "frame_rate": 16,
        "loop_count": 0,
        "filename_prefix": "Wan2.1",
        "format": "video/h264-mp4",
        "pix_fmt": "yuv420p",
        "crf": 19,
        "save_metadata": true,
        "trim_to_audio": false,
        "pingpong": false,
        "save_output": true,
        "videopreview": {
          "hidden": false,
          "paused": false,
          "params": {
            "filename": "Wan2.1_00015.mp4",
            "subfolder": "",
            "type": "output",
            "format": "video/h264-mp4",
            "frame_rate": 16,
            "workflow": "Wan2.1_00015.png",
            "fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\Wan2.1_00015.mp4"
          }
        }
      }
    },
    {
      "id": 37,
      "type": "UNETLoader",
      "pos": [
        -180.6951953613281,
        -34.42733247236781
      ],
      "size": [
        305.3782043457031,
        82
      ],
      "flags": {},
      "order": 0,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            94
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "UNETLoader"
      },
      "widgets_values": [
        "Wan2.1\\wan2.1_t2v_14B_fp8_e4m3fn.safetensors",
        "default"
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 38,
      "type": "CLIPLoader",
      "pos": [
        56.288665771484375,
        312.74468994140625
      ],
      "size": [
        301.3524169921875,
        106
      ],
      "flags": {},
      "order": 1,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "CLIP",
          "type": "CLIP",
          "slot_index": 0,
          "links": [
            74,
            75
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPLoader"
      },
      "widgets_values": [
        "umt5_xxl_fp8_e4m3fn_scaled.safetensors",
        "wan",
        "default"
      ],
      "color": "#432",
      "bgcolor": "#653"
    },
    {
      "id": 39,
      "type": "VAELoader",
      "pos": [
        942.8928474299169,
        72.63327169166966
      ],
      "size": [
        270.8619743474269,
        58.62092132305915
      ],
      "flags": {},
      "order": 2,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "VAE",
          "type": "VAE",
          "slot_index": 0,
          "links": [
            76
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAELoader"
      },
      "widgets_values": [
        "wan_2.1_vae.safetensors"
      ],
      "color": "#322",
      "bgcolor": "#533"
    },
    {
      "id": 7,
      "type": "CLIPTextEncode",
      "pos": [
        415,
        389
      ],
      "size": [
        419.3189392089844,
        138.8924560546875
      ],
      "flags": {},
      "order": 6,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 75
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            52
          ]
        }
      ],
      "title": "CLIP Text Encode (Negative Prompt)",
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "色调艳丽，过曝，静态，细节模糊不清，字幕，风格，作品，画作，画面，静止，整体发灰，最差质量，低质量，JPEG压缩残留，丑陋的，残缺的，多余的手指，画得不好的手部，画得不好的脸部，畸形的，毁容的，形态畸形的肢体，手指融合，静止不动的画面，杂乱的背景，三条腿，背景人很多，倒着走 "
      ]
    },
    {
      "id": 40,
      "type": "EmptyHunyuanLatentVideo",
      "pos": [
        543.8643937544389,
        595.3501586914062
      ],
      "size": [
        290.4545454545455,
        130
      ],
      "flags": {},
      "order": 3,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            91
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "EmptyHunyuanLatentVideo"
      },
      "widgets_values": [
        848,
        480,
        81,
        1
      ]
    },
    {
      "id": 48,
      "type": "ModelSamplingSD3",
      "pos": [
        148.4901115071613,
        -34.42733247236781
      ],
      "size": [
        210,
        58
      ],
      "flags": {},
      "order": 4,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 94
        }
      ],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            98
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "ModelSamplingSD3"
      },
      "widgets_values": [
        8
      ],
      "color": "#2a363b",
      "bgcolor": "#3f5159"
    },
    {
      "id": 53,
      "type": "CFGZeroStar",
      "pos": [
        652.9691603027337,
        -34.42733247236781
      ],
      "size": [
        167.09765625,
        26
      ],
      "flags": {},
      "order": 8,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 97
        }
      ],
      "outputs": [
        {
          "name": "patched_model",
          "type": "MODEL",
          "links": [
            99
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CFGZeroStar"
      },
      "widgets_values": [],
      "color": "#223",
      "bgcolor": "#335"
    },
    {
      "id": 52,
      "type": "UNetTemporalAttentionMultiply",
      "pos": [
        382.29721402994755,
        -34.42733247236781
      ],
      "size": [
        246.86484375,
        150
      ],
      "flags": {
        "collapsed": false
      },
      "order": 7,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 98
        }
      ],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "links": [
            97
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.27",
        "Node name for S&R": "UNetTemporalAttentionMultiply",
        "enableTabs": false,
        "tabWidth": 65,
        "tabXOffset": 10,
        "hasSecondTab": false,
        "secondTabText": "Send Back",
        "secondTabOffset": 80,
        "secondTabWidth": 65
      },
      "widgets_values": [
        1,
        1,
        1.2,
        1.3
      ],
      "color": "#223",
      "bgcolor": "#335"
    },
    {
      "id": 6,
      "type": "CLIPTextEncode",
      "pos": [
        415,
        186
      ],
      "size": [
        419.26959228515625,
        148.8194122314453
      ],
      "flags": {},
      "order": 5,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 74
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            46
          ]
        }
      ],
      "title": "CLIP Text Encode (Positive Prompt)",
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "A high-quality, ultra-detailed image of a kingfisher diving into a crystal-clear pool of water, captured at the precise moment its beak touches the surface. The bird’s vibrant blue and orange feathers are sharply defined, with water droplets suspended in the air around it. The background features a softly blurred natural riverside setting, with lush green foliage and gentle sunlight filtering through the trees, creating a serene and dynamic scene."
      ]
    },
    {
      "id": 8,
      "type": "VAEDecode",
      "pos": [
        1252.432861328125,
        188.1918182373047
      ],
      "size": [
        157.56002807617188,
        46
      ],
      "flags": {},
      "order": 10,
      "mode": 0,
      "inputs": [
        {
          "name": "samples",
          "type": "LATENT",
          "link": 35
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 76
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "slot_index": 0,
          "links": [
            96
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAEDecode"
      },
      "widgets_values": []
    }
  ],
  "links": [
    [
      35,
      3,
      0,
      8,
      0,
      "LATENT"
    ],
    [
      46,
      6,
      0,
      3,
      1,
      "CONDITIONING"
    ],
    [
      52,
      7,
      0,
      3,
      2,
      "CONDITIONING"
    ],
    [
      74,
      38,
      0,
      6,
      0,
      "CLIP"
    ],
    [
      75,
      38,
      0,
      7,
      0,
      "CLIP"
    ],
    [
      76,
      39,
      0,
      8,
      1,
      "VAE"
    ],
    [
      91,
      40,
      0,
      3,
      3,
      "LATENT"
    ],
    [
      94,
      37,
      0,
      48,
      0,
      "MODEL"
    ],
    [
      96,
      8,
      0,
      49,
      0,
      "IMAGE"
    ],
    [
      97,
      52,
      0,
      53,
      0,
      "MODEL"
    ],
    [
      98,
      48,
      0,
      52,
      0,
      "MODEL"
    ],
    [
      99,
      53,
      0,
      3,
      0,
      "MODEL"
    ]
  ],
  "groups": [],
  "config": {},
  "extra": {
    "ds": {
      "scale": 0.6830134553650712,
      "offset": [
        277.7669953613281,
        135.89143247236782
      ]
    },
    "frontendVersion": "1.35.0",
    "VHS_latentpreview": false,
    "VHS_latentpreviewrate": 0,
    "VHS_MetadataImage": true,
    "VHS_KeepIntermediate": true
  },
  "version": 0.4
}

🟦 UNetTemporalAttentionMultiply
- フレーム間の一貫性を補強し、チラつきを抑制します。
🟦 CFG-Zero
- サンプリング序盤の CFG を弱めることで、過剰な補正による破綻を防ぎます。

image2video

画像を与えると、その画像から続きを生成します。

Wan2.1_image2video_14B.json

{
  "id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
  "revision": 0,
  "last_node_id": 59,
  "last_link_id": 114,
  "nodes": [
    {
      "id": 38,
      "type": "CLIPLoader",
      "pos": [
        56.288665771484375,
        312.74468994140625
      ],
      "size": [
        301.3524169921875,
        106
      ],
      "flags": {},
      "order": 0,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "CLIP",
          "type": "CLIP",
          "slot_index": 0,
          "links": [
            74,
            75
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPLoader"
      },
      "widgets_values": [
        "umt5_xxl_fp8_e4m3fn_scaled.safetensors",
        "wan",
        "default"
      ],
      "color": "#432",
      "bgcolor": "#653"
    },
    {
      "id": 48,
      "type": "ModelSamplingSD3",
      "pos": [
        492.2313707107671,
        -31.185647273269385
      ],
      "size": [
        210,
        58
      ],
      "flags": {},
      "order": 7,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 94
        }
      ],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            98
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "ModelSamplingSD3"
      },
      "widgets_values": [
        8
      ],
      "color": "#2a363b",
      "bgcolor": "#3f5159"
    },
    {
      "id": 8,
      "type": "VAEDecode",
      "pos": [
        1568.4238098490541,
        236.16638207772687
      ],
      "size": [
        157.56002807617188,
        46
      ],
      "flags": {},
      "order": 15,
      "mode": 0,
      "inputs": [
        {
          "name": "samples",
          "type": "LATENT",
          "link": 35
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 76
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "slot_index": 0,
          "links": [
            96
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAEDecode"
      },
      "widgets_values": []
    },
    {
      "id": 7,
      "type": "CLIPTextEncode",
      "pos": [
        416.14803513844254,
        389
      ],
      "size": [
        419.3189392089844,
        138.8924560546875
      ],
      "flags": {},
      "order": 6,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 75
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            103
          ]
        }
      ],
      "title": "CLIP Text Encode (Negative Prompt)",
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "色调艳丽，过曝，静态，细节模糊不清，字幕，风格，作品，画作，画面，静止，整体发灰，最差质量，低质量，JPEG压缩残留，丑陋的，残缺的，多余的手指，画得不好的手部，画得不好的脸部，畸形的，毁容的，形态畸形的肢体，手指融合，静止不动的画面，杂乱的背景，三条腿，背景人很多，倒着走 "
      ]
    },
    {
      "id": 39,
      "type": "VAELoader",
      "pos": [
        564.605,
        590.4560000000004
      ],
      "size": [
        270.8619743474269,
        58.62092132305915
      ],
      "flags": {},
      "order": 1,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "VAE",
          "type": "VAE",
          "slot_index": 0,
          "links": [
            76,
            108
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAELoader"
      },
      "widgets_values": [
        "wan_2.1_vae.safetensors"
      ],
      "color": "#322",
      "bgcolor": "#533"
    },
    {
      "id": 52,
      "type": "UNetTemporalAttentionMultiply",
      "pos": [
        726.0384732335532,
        -31.185647273269385
      ],
      "size": [
        246.86484375,
        150
      ],
      "flags": {
        "collapsed": false
      },
      "order": 9,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 98
        }
      ],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "links": [
            97
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.27",
        "Node name for S&R": "UNetTemporalAttentionMultiply",
        "enableTabs": false,
        "tabWidth": 65,
        "tabXOffset": 10,
        "hasSecondTab": false,
        "secondTabText": "Send Back",
        "secondTabOffset": 80,
        "secondTabWidth": 65
      },
      "widgets_values": [
        1,
        1,
        1.2,
        1.3
      ],
      "color": "#223",
      "bgcolor": "#335"
    },
    {
      "id": 49,
      "type": "VHS_VideoCombine",
      "pos": [
        1758.4238098490541,
        236.16638207772687
      ],
      "size": [
        372.2688903808594,
        817.6918538411458
      ],
      "flags": {},
      "order": 16,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 96
        },
        {
          "name": "audio",
          "shape": 7,
          "type": "AUDIO",
          "link": null
        },
        {
          "name": "meta_batch",
          "shape": 7,
          "type": "VHS_BatchManager",
          "link": null
        },
        {
          "name": "vae",
          "shape": 7,
          "type": "VAE",
          "link": null
        }
      ],
      "outputs": [
        {
          "name": "Filenames",
          "type": "VHS_FILENAMES",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfyui-videohelpersuite",
        "ver": "a7ce59e381934733bfae03b1be029756d6ce936d",
        "Node name for S&R": "VHS_VideoCombine"
      },
      "widgets_values": {
        "frame_rate": 16,
        "loop_count": 0,
        "filename_prefix": "Wan2.1",
        "format": "video/h264-mp4",
        "pix_fmt": "yuv420p",
        "crf": 19,
        "save_metadata": true,
        "trim_to_audio": false,
        "pingpong": false,
        "save_output": true,
        "videopreview": {
          "hidden": false,
          "paused": false,
          "params": {
            "filename": "Wan2.1_00017.mp4",
            "subfolder": "",
            "type": "output",
            "format": "video/h264-mp4",
            "frame_rate": 16,
            "workflow": "Wan2.1_00017.png",
            "fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\Wan2.1_00017.mp4"
          }
        }
      }
    },
    {
      "id": 58,
      "type": "ImageScaleToTotalPixels",
      "pos": [
        314.27713654778063,
        862.9226351439696
      ],
      "size": [
        210,
        82
      ],
      "flags": {},
      "order": 8,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 109
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            110,
            111,
            114
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.76",
        "Node name for S&R": "ImageScaleToTotalPixels"
      },
      "widgets_values": [
        "nearest-exact",
        0.5
      ]
    },
    {
      "id": 55,
      "type": "CLIPVisionEncode",
      "pos": [
        563.7908268864894,
        711.0523027813421
      ],
      "size": [
        271.6761474609375,
        78
      ],
      "flags": {},
      "order": 10,
      "mode": 0,
      "inputs": [
        {
          "name": "clip_vision",
          "type": "CLIP_VISION",
          "link": 101
        },
        {
          "name": "image",
          "type": "IMAGE",
          "link": 110
        }
      ],
      "outputs": [
        {
          "name": "CLIP_VISION_OUTPUT",
          "type": "CLIP_VISION_OUTPUT",
          "links": [
            100
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPVisionEncode"
      },
      "widgets_values": [
        "none"
      ],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 6,
      "type": "CLIPTextEncode",
      "pos": [
        416.19738206227066,
        186
      ],
      "size": [
        419.26959228515625,
        148.8194122314453
      ],
      "flags": {},
      "order": 5,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 74
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            102
          ]
        }
      ],
      "title": "CLIP Text Encode (Positive Prompt)",
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "A gentle breeze blows through her hair, and the leaves on the trees rustle softly."
      ]
    },
    {
      "id": 54,
      "type": "WanImageToVideo",
      "pos": [
        905.4842129180748,
        253.24900340945018
      ],
      "size": [
        270,
        210
      ],
      "flags": {},
      "order": 13,
      "mode": 0,
      "inputs": [
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 102
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "link": 103
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 108
        },
        {
          "name": "clip_vision_output",
          "shape": 7,
          "type": "CLIP_VISION_OUTPUT",
          "link": 100
        },
        {
          "name": "start_image",
          "shape": 7,
          "type": "IMAGE",
          "link": 114
        },
        {
          "name": "width",
          "type": "INT",
          "widget": {
            "name": "width"
          },
          "link": 112
        },
        {
          "name": "height",
          "type": "INT",
          "widget": {
            "name": "height"
          },
          "link": 113
        }
      ],
      "outputs": [
        {
          "name": "positive",
          "type": "CONDITIONING",
          "links": [
            104
          ]
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "links": [
            105
          ]
        },
        {
          "name": "latent",
          "type": "LATENT",
          "links": [
            106
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "WanImageToVideo"
      },
      "widgets_values": [
        832,
        480,
        81,
        1
      ],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 56,
      "type": "CLIPVisionLoader",
      "pos": [
        247.40101168979618,
        711.0523027813421
      ],
      "size": [
        270,
        58
      ],
      "flags": {},
      "order": 2,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "CLIP_VISION",
          "type": "CLIP_VISION",
          "links": [
            101
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPVisionLoader"
      },
      "widgets_values": [
        "clip_vision_h.safetensors"
      ],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 59,
      "type": "GetImageSize",
      "pos": [
        625.4669743474269,
        862.9226351439696
      ],
      "size": [
        210,
        136
      ],
      "flags": {},
      "order": 11,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 111
        }
      ],
      "outputs": [
        {
          "name": "width",
          "type": "INT",
          "links": [
            112
          ]
        },
        {
          "name": "height",
          "type": "INT",
          "links": [
            113
          ]
        },
        {
          "name": "batch_size",
          "type": "INT",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.76",
        "Node name for S&R": "GetImageSize"
      },
      "widgets_values": []
    },
    {
      "id": 3,
      "type": "KSampler",
      "pos": [
        1222.7670924117206,
        232.75131480090158
      ],
      "size": [
        315,
        262
      ],
      "flags": {},
      "order": 14,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 99
        },
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 104
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "link": 105
        },
        {
          "name": "latent_image",
          "type": "LATENT",
          "link": 106
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            35
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "KSampler"
      },
      "widgets_values": [
        1234,
        "fixed",
        30,
        6,
        "uni_pc",
        "simple",
        1
      ]
    },
    {
      "id": 53,
      "type": "CFGZeroStar",
      "pos": [
        996.7104195063404,
        -31.185647273269385
      ],
      "size": [
        167.09765625,
        26
      ],
      "flags": {},
      "order": 12,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 97
        }
      ],
      "outputs": [
        {
          "name": "patched_model",
          "type": "MODEL",
          "links": [
            99
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CFGZeroStar"
      },
      "widgets_values": [],
      "color": "#223",
      "bgcolor": "#335"
    },
    {
      "id": 37,
      "type": "UNETLoader",
      "pos": [
        163.04606384227836,
        -31.185647273269385
      ],
      "size": [
        305.3782043457031,
        82
      ],
      "flags": {},
      "order": 3,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            94
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "UNETLoader"
      },
      "widgets_values": [
        "Wan2.1\\wan2.1_i2v_720p_14B_fp8_e4m3fn.safetensors",
        "default"
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 57,
      "type": "LoadImage",
      "pos": [
        -28.495148953139406,
        863.9226351439696
      ],
      "size": [
        306.4238752929689,
        517.0728697460936
      ],
      "flags": {},
      "order": 4,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            109
          ]
        },
        {
          "name": "MASK",
          "type": "MASK",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "LoadImage"
      },
      "widgets_values": [
        "pasted/image (108).png",
        "image"
      ]
    }
  ],
  "links": [
    [
      35,
      3,
      0,
      8,
      0,
      "LATENT"
    ],
    [
      74,
      38,
      0,
      6,
      0,
      "CLIP"
    ],
    [
      75,
      38,
      0,
      7,
      0,
      "CLIP"
    ],
    [
      76,
      39,
      0,
      8,
      1,
      "VAE"
    ],
    [
      94,
      37,
      0,
      48,
      0,
      "MODEL"
    ],
    [
      96,
      8,
      0,
      49,
      0,
      "IMAGE"
    ],
    [
      97,
      52,
      0,
      53,
      0,
      "MODEL"
    ],
    [
      98,
      48,
      0,
      52,
      0,
      "MODEL"
    ],
    [
      99,
      53,
      0,
      3,
      0,
      "MODEL"
    ],
    [
      100,
      55,
      0,
      54,
      3,
      "CLIP_VISION_OUTPUT"
    ],
    [
      101,
      56,
      0,
      55,
      0,
      "CLIP_VISION"
    ],
    [
      102,
      6,
      0,
      54,
      0,
      "CONDITIONING"
    ],
    [
      103,
      7,
      0,
      54,
      1,
      "CONDITIONING"
    ],
    [
      104,
      54,
      0,
      3,
      1,
      "CONDITIONING"
    ],
    [
      105,
      54,
      1,
      3,
      2,
      "CONDITIONING"
    ],
    [
      106,
      54,
      2,
      3,
      3,
      "LATENT"
    ],
    [
      108,
      39,
      0,
      54,
      2,
      "VAE"
    ],
    [
      109,
      57,
      0,
      58,
      0,
      "IMAGE"
    ],
    [
      110,
      58,
      0,
      55,
      1,
      "IMAGE"
    ],
    [
      111,
      58,
      0,
      59,
      0,
      "IMAGE"
    ],
    [
      112,
      59,
      0,
      54,
      5,
      "INT"
    ],
    [
      113,
      59,
      1,
      54,
      6,
      "INT"
    ],
    [
      114,
      58,
      0,
      54,
      4,
      "IMAGE"
    ]
  ],
  "groups": [],
  "config": {},
  "extra": {
    "ds": {
      "scale": 0.6209213230591555,
      "offset": [
        128.4951489531394,
        131.18564727326938
      ]
    },
    "frontendVersion": "1.35.0",
    "VHS_latentpreview": false,
    "VHS_latentpreviewrate": 0,
    "VHS_MetadataImage": true,
    "VHS_KeepIntermediate": true
  },
  "version": 0.4
}

🟩 適度にリサイズした画像を CLIP Vision Encode と WanImageToVideo の両方に入力します。

FLF2V（First–Last Frame to Video）

2 枚の画像を与えて、その間が自然に埋まるように動画を生成します。

Wan2.1_FLF2V_14B.json

{
  "id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
  "revision": 0,
  "last_node_id": 81,
  "last_link_id": 157,
  "nodes": [
    {
      "id": 38,
      "type": "CLIPLoader",
      "pos": [
        56.288665771484375,
        312.74468994140625
      ],
      "size": [
        301.3524169921875,
        106
      ],
      "flags": {},
      "order": 0,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "CLIP",
          "type": "CLIP",
          "slot_index": 0,
          "links": [
            74,
            75
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPLoader"
      },
      "widgets_values": [
        "umt5_xxl_fp8_e4m3fn_scaled.safetensors",
        "wan",
        "default"
      ],
      "color": "#432",
      "bgcolor": "#653"
    },
    {
      "id": 48,
      "type": "ModelSamplingSD3",
      "pos": [
        492.2313707107671,
        -31.185647273269385
      ],
      "size": [
        210,
        58
      ],
      "flags": {},
      "order": 9,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 94
        }
      ],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            98
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "ModelSamplingSD3"
      },
      "widgets_values": [
        8
      ],
      "color": "#2a363b",
      "bgcolor": "#3f5159"
    },
    {
      "id": 8,
      "type": "VAEDecode",
      "pos": [
        1568.4238098490541,
        236.16638207772687
      ],
      "size": [
        157.56002807617188,
        46
      ],
      "flags": {},
      "order": 20,
      "mode": 0,
      "inputs": [
        {
          "name": "samples",
          "type": "LATENT",
          "link": 35
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 76
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "slot_index": 0,
          "links": [
            96
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAEDecode"
      },
      "widgets_values": []
    },
    {
      "id": 52,
      "type": "UNetTemporalAttentionMultiply",
      "pos": [
        726.0384732335532,
        -31.185647273269385
      ],
      "size": [
        246.86484375,
        150
      ],
      "flags": {
        "collapsed": false
      },
      "order": 11,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 98
        }
      ],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "links": [
            97
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.27",
        "Node name for S&R": "UNetTemporalAttentionMultiply",
        "enableTabs": false,
        "tabWidth": 65,
        "tabXOffset": 10,
        "hasSecondTab": false,
        "secondTabText": "Send Back",
        "secondTabOffset": 80,
        "secondTabWidth": 65
      },
      "widgets_values": [
        1,
        1,
        1.2,
        1.3
      ],
      "color": "#223",
      "bgcolor": "#335"
    },
    {
      "id": 49,
      "type": "VHS_VideoCombine",
      "pos": [
        1758.4238098490541,
        236.16638207772687
      ],
      "size": [
        372.2688903808594,
        855.2672021484375
      ],
      "flags": {},
      "order": 21,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 96
        },
        {
          "name": "audio",
          "shape": 7,
          "type": "AUDIO",
          "link": null
        },
        {
          "name": "meta_batch",
          "shape": 7,
          "type": "VHS_BatchManager",
          "link": null
        },
        {
          "name": "vae",
          "shape": 7,
          "type": "VAE",
          "link": null
        }
      ],
      "outputs": [
        {
          "name": "Filenames",
          "type": "VHS_FILENAMES",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfyui-videohelpersuite",
        "ver": "a7ce59e381934733bfae03b1be029756d6ce936d",
        "Node name for S&R": "VHS_VideoCombine"
      },
      "widgets_values": {
        "frame_rate": 16,
        "loop_count": 0,
        "filename_prefix": "Wan2.1",
        "format": "video/h264-mp4",
        "pix_fmt": "yuv420p",
        "crf": 19,
        "save_metadata": true,
        "trim_to_audio": false,
        "pingpong": false,
        "save_output": true,
        "videopreview": {
          "hidden": false,
          "paused": false,
          "params": {
            "filename": "Wan2.1_00021.mp4",
            "subfolder": "",
            "type": "output",
            "format": "video/h264-mp4",
            "frame_rate": 16,
            "workflow": "Wan2.1_00021.png",
            "fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\Wan2.1_00021.mp4"
          }
        }
      }
    },
    {
      "id": 53,
      "type": "CFGZeroStar",
      "pos": [
        996.7104195063404,
        -31.185647273269385
      ],
      "size": [
        167.09765625,
        26
      ],
      "flags": {},
      "order": 15,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 97
        }
      ],
      "outputs": [
        {
          "name": "patched_model",
          "type": "MODEL",
          "links": [
            99
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CFGZeroStar"
      },
      "widgets_values": [],
      "color": "#223",
      "bgcolor": "#335"
    },
    {
      "id": 72,
      "type": "LoadImage",
      "pos": [
        -512.3427957333753,
        667.5799129210463
      ],
      "size": [
        288.85467529296875,
        379.00823974609375
      ],
      "flags": {},
      "order": 1,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            125
          ]
        },
        {
          "name": "MASK",
          "type": "MASK",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "LoadImage"
      },
      "widgets_values": [
        "MCNK7B5FR2J9K.png",
        "image"
      ]
    },
    {
      "id": 74,
      "type": "CLIPVisionEncode",
      "pos": [
        567.9705870670186,
        860.5992386490295
      ],
      "size": [
        271.6761474609375,
        78
      ],
      "flags": {},
      "order": 17,
      "mode": 0,
      "inputs": [
        {
          "name": "clip_vision",
          "type": "CLIP_VISION",
          "link": 129
        },
        {
          "name": "image",
          "type": "IMAGE",
          "link": 130
        }
      ],
      "outputs": [
        {
          "name": "CLIP_VISION_OUTPUT",
          "type": "CLIP_VISION_OUTPUT",
          "links": [
            135
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPVisionEncode"
      },
      "widgets_values": [
        "none"
      ],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 7,
      "type": "CLIPTextEncode",
      "pos": [
        420.32779531897177,
        389
      ],
      "size": [
        419.3189392089844,
        138.8924560546875
      ],
      "flags": {},
      "order": 7,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 75
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            146
          ]
        }
      ],
      "title": "CLIP Text Encode (Negative Prompt)",
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "色调艳丽，过曝，静态，细节模糊不清，字幕，风格，作品，画作，画面，静止，整体发灰，最差质量，低质量，JPEG压缩残留，丑陋的，残缺的，多余的手指，画得不好的手部，画得不好的脸部，畸形的，毁容的，形态畸形的肢体，手指融合，静止不动的画面，杂乱的背景，三条腿，背景人很多，倒着走 "
      ]
    },
    {
      "id": 39,
      "type": "VAELoader",
      "pos": [
        568.7847601805292,
        590.4560000000004
      ],
      "size": [
        270.8619743474269,
        58.62092132305915
      ],
      "flags": {},
      "order": 2,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "VAE",
          "type": "VAE",
          "slot_index": 0,
          "links": [
            76,
            147
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAELoader"
      },
      "widgets_values": [
        "wan_2.1_vae.safetensors"
      ],
      "color": "#322",
      "bgcolor": "#533"
    },
    {
      "id": 73,
      "type": "CLIPVisionEncode",
      "pos": [
        567.9705870670186,
        717.3984329849668
      ],
      "size": [
        271.6761474609375,
        78
      ],
      "flags": {},
      "order": 16,
      "mode": 0,
      "inputs": [
        {
          "name": "clip_vision",
          "type": "CLIP_VISION",
          "link": 127
        },
        {
          "name": "image",
          "type": "IMAGE",
          "link": 128
        }
      ],
      "outputs": [
        {
          "name": "CLIP_VISION_OUTPUT",
          "type": "CLIP_VISION_OUTPUT",
          "links": [
            134
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPVisionEncode"
      },
      "widgets_values": [
        "none"
      ],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 69,
      "type": "CLIPVisionLoader",
      "pos": [
        245.92978242158162,
        717.3984329849668
      ],
      "size": [
        270,
        58
      ],
      "flags": {},
      "order": 3,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "CLIP_VISION",
          "type": "CLIP_VISION",
          "links": [
            127,
            129
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPVisionLoader"
      },
      "widgets_values": [
        "clip_vision_h.safetensors"
      ],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 75,
      "type": "ImageFromBatch",
      "pos": [
        302.1028159498623,
        837.4995072037173
      ],
      "size": [
        210,
        82
      ],
      "flags": {},
      "order": 12,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 151
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            128,
            136
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "ImageFromBatch"
      },
      "widgets_values": [
        0,
        1
      ]
    },
    {
      "id": 79,
      "type": "ImageScaleToTotalPixels",
      "pos": [
        26.740406834706505,
        1086.6724332331928
      ],
      "size": [
        210,
        82
      ],
      "flags": {},
      "order": 10,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 154
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            151,
            152,
            153
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.76",
        "Node name for S&R": "ImageScaleToTotalPixels"
      },
      "widgets_values": [
        "nearest-exact",
        0.5
      ]
    },
    {
      "id": 71,
      "type": "ImageBatch",
      "pos": [
        -172.33973843620936,
        1086.6724332331928
      ],
      "size": [
        173.05785123966945,
        46
      ],
      "flags": {},
      "order": 8,
      "mode": 0,
      "inputs": [
        {
          "name": "image1",
          "type": "IMAGE",
          "link": 125
        },
        {
          "name": "image2",
          "type": "IMAGE",
          "link": 126
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            154
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "ImageBatch"
      },
      "widgets_values": []
    },
    {
      "id": 70,
      "type": "LoadImage",
      "pos": [
        -515.3028177060315,
        1106.1858699522957
      ],
      "size": [
        288.85467529296875,
        379.00823974609375
      ],
      "flags": {},
      "order": 4,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            126
          ]
        },
        {
          "name": "MASK",
          "type": "MASK",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "LoadImage"
      },
      "widgets_values": [
        "ComfyUI_00507_.png",
        "image"
      ]
    },
    {
      "id": 6,
      "type": "CLIPTextEncode",
      "pos": [
        420.3771422427999,
        186
      ],
      "size": [
        419.26959228515625,
        148.8194122314453
      ],
      "flags": {},
      "order": 6,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 74
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            145
          ]
        }
      ],
      "title": "CLIP Text Encode (Positive Prompt)",
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "A full-body shot of a young woman with short black hair, wearing a grey sweatshirt and long black skirt. She starts in a standing position and smoothly transitions into a crouching pose, sitting down on the floor while looking at the camera. High quality, 3D render style, clean white background."
      ]
    },
    {
      "id": 78,
      "type": "WanFirstLastFrameToVideo",
      "pos": [
        907.4719485040978,
        252.88411872755603
      ],
      "size": [
        270.3999938964844,
        250
      ],
      "flags": {},
      "order": 18,
      "mode": 0,
      "inputs": [
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 145
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "link": 146
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 147
        },
        {
          "name": "clip_vision_start_image",
          "shape": 7,
          "type": "CLIP_VISION_OUTPUT",
          "link": 134
        },
        {
          "name": "clip_vision_end_image",
          "shape": 7,
          "type": "CLIP_VISION_OUTPUT",
          "link": 135
        },
        {
          "name": "start_image",
          "shape": 7,
          "type": "IMAGE",
          "link": 136
        },
        {
          "name": "end_image",
          "shape": 7,
          "type": "IMAGE",
          "link": 157
        },
        {
          "name": "width",
          "type": "INT",
          "widget": {
            "name": "width"
          },
          "link": 155
        },
        {
          "name": "height",
          "type": "INT",
          "widget": {
            "name": "height"
          },
          "link": 156
        }
      ],
      "outputs": [
        {
          "name": "positive",
          "type": "CONDITIONING",
          "links": [
            148
          ]
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "links": [
            149
          ]
        },
        {
          "name": "latent",
          "type": "LATENT",
          "links": [
            150
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.29",
        "Node name for S&R": "WanFirstLastFrameToVideo"
      },
      "widgets_values": [
        720,
        1280,
        81,
        1
      ],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 76,
      "type": "ImageFromBatch",
      "pos": [
        303.8139975904874,
        976.6982376724662
      ],
      "size": [
        210,
        82
      ],
      "flags": {},
      "order": 13,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 152
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            130,
            157
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "ImageFromBatch"
      },
      "widgets_values": [
        1,
        1
      ]
    },
    {
      "id": 37,
      "type": "UNETLoader",
      "pos": [
        163.04606384227836,
        -31.185647273269385
      ],
      "size": [
        305.3782043457031,
        82
      ],
      "flags": {},
      "order": 5,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            94
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "UNETLoader"
      },
      "widgets_values": [
        "Wan2.1\\wan2.1_flf2v_720p_14B_fp8_e4m3fn.safetensors",
        "default"
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 3,
      "type": "KSampler",
      "pos": [
        1222.7670924117206,
        232.75131480090158
      ],
      "size": [
        315,
        262
      ],
      "flags": {},
      "order": 19,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 99
        },
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 148
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "link": 149
        },
        {
          "name": "latent_image",
          "type": "LATENT",
          "link": 150
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            35
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "KSampler"
      },
      "widgets_values": [
        1234,
        "fixed",
        30,
        3,
        "uni_pc",
        "simple",
        1
      ]
    },
    {
      "id": 80,
      "type": "GetImageSize",
      "pos": [
        629.6467345279561,
        1086.6724332331928
      ],
      "size": [
        210,
        136
      ],
      "flags": {},
      "order": 14,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 153
        }
      ],
      "outputs": [
        {
          "name": "width",
          "type": "INT",
          "links": [
            155
          ]
        },
        {
          "name": "height",
          "type": "INT",
          "links": [
            156
          ]
        },
        {
          "name": "batch_size",
          "type": "INT",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.76",
        "Node name for S&R": "GetImageSize"
      },
      "widgets_values": [
        "width: 606, height: 865\n batch size: 2"
      ]
    }
  ],
  "links": [
    [
      35,
      3,
      0,
      8,
      0,
      "LATENT"
    ],
    [
      74,
      38,
      0,
      6,
      0,
      "CLIP"
    ],
    [
      75,
      38,
      0,
      7,
      0,
      "CLIP"
    ],
    [
      76,
      39,
      0,
      8,
      1,
      "VAE"
    ],
    [
      94,
      37,
      0,
      48,
      0,
      "MODEL"
    ],
    [
      96,
      8,
      0,
      49,
      0,
      "IMAGE"
    ],
    [
      97,
      52,
      0,
      53,
      0,
      "MODEL"
    ],
    [
      98,
      48,
      0,
      52,
      0,
      "MODEL"
    ],
    [
      99,
      53,
      0,
      3,
      0,
      "MODEL"
    ],
    [
      125,
      72,
      0,
      71,
      0,
      "IMAGE"
    ],
    [
      126,
      70,
      0,
      71,
      1,
      "IMAGE"
    ],
    [
      127,
      69,
      0,
      73,
      0,
      "CLIP_VISION"
    ],
    [
      128,
      75,
      0,
      73,
      1,
      "IMAGE"
    ],
    [
      129,
      69,
      0,
      74,
      0,
      "CLIP_VISION"
    ],
    [
      130,
      76,
      0,
      74,
      1,
      "IMAGE"
    ],
    [
      134,
      73,
      0,
      78,
      3,
      "CLIP_VISION_OUTPUT"
    ],
    [
      135,
      74,
      0,
      78,
      4,
      "CLIP_VISION_OUTPUT"
    ],
    [
      136,
      75,
      0,
      78,
      5,
      "IMAGE"
    ],
    [
      145,
      6,
      0,
      78,
      0,
      "CONDITIONING"
    ],
    [
      146,
      7,
      0,
      78,
      1,
      "CONDITIONING"
    ],
    [
      147,
      39,
      0,
      78,
      2,
      "VAE"
    ],
    [
      148,
      78,
      0,
      3,
      1,
      "CONDITIONING"
    ],
    [
      149,
      78,
      1,
      3,
      2,
      "CONDITIONING"
    ],
    [
      150,
      78,
      2,
      3,
      3,
      "LATENT"
    ],
    [
      151,
      79,
      0,
      75,
      0,
      "IMAGE"
    ],
    [
      152,
      79,
      0,
      76,
      0,
      "IMAGE"
    ],
    [
      153,
      79,
      0,
      80,
      0,
      "IMAGE"
    ],
    [
      154,
      71,
      0,
      79,
      0,
      "IMAGE"
    ],
    [
      155,
      80,
      0,
      78,
      7,
      "INT"
    ],
    [
      156,
      80,
      1,
      78,
      8,
      "INT"
    ],
    [
      157,
      76,
      0,
      78,
      6,
      "IMAGE"
    ]
  ],
  "groups": [],
  "config": {},
  "extra": {
    "ds": {
      "scale": 0.6830134553650711,
      "offset": [
        433.7544177060316,
        129.72154727326938
      ]
    },
    "frontendVersion": "1.35.0",
    "VHS_latentpreview": false,
    "VHS_latentpreviewrate": 0,
    "VHS_MetadataImage": true,
    "VHS_KeepIntermediate": true
  },
  "version": 0.4
}

🟩 2 枚の画像をバッチにして WanFirstLastFrameToVideo ノードに入力します。

Self Forcing（高速生成）

本来はリアルタイム動画生成のための技術ですが、ComfyUI では単なる数ステップ生成による高速化手法として使います。

モデルのダウンロード

loras
- Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank64.safetensors
- Wan21_I2V_14B_lightx2v_cfg_step_distill_lora_rank64.safetensors

📂ComfyUI/
└── 📂models/
    └── 📂loras/
        ├── Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank64.safetensors
        └── Wan21_I2V_14B_lightx2v_cfg_step_distill_lora_rank64.safetensors

workflow

Wan2.1_text2video_14B_Self-Forcing.json

{
  "id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
  "revision": 0,
  "last_node_id": 52,
  "last_link_id": 98,
  "nodes": [
    {
      "id": 8,
      "type": "VAEDecode",
      "pos": [
        1252.432861328125,
        188.1918182373047
      ],
      "size": [
        157.56002807617188,
        46
      ],
      "flags": {},
      "order": 9,
      "mode": 0,
      "inputs": [
        {
          "name": "samples",
          "type": "LATENT",
          "link": 35
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 76
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "slot_index": 0,
          "links": [
            96
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAEDecode"
      },
      "widgets_values": []
    },
    {
      "id": 49,
      "type": "VHS_VideoCombine",
      "pos": [
        1448.6710205078125,
        188.1918182373047
      ],
      "size": [
        372.2688903808594,
        547.3974851212412
      ],
      "flags": {},
      "order": 10,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 96
        },
        {
          "name": "audio",
          "shape": 7,
          "type": "AUDIO",
          "link": null
        },
        {
          "name": "meta_batch",
          "shape": 7,
          "type": "VHS_BatchManager",
          "link": null
        },
        {
          "name": "vae",
          "shape": 7,
          "type": "VAE",
          "link": null
        }
      ],
      "outputs": [
        {
          "name": "Filenames",
          "type": "VHS_FILENAMES",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfyui-videohelpersuite",
        "ver": "a7ce59e381934733bfae03b1be029756d6ce936d",
        "Node name for S&R": "VHS_VideoCombine"
      },
      "widgets_values": {
        "frame_rate": 16,
        "loop_count": 0,
        "filename_prefix": "Wan2.1",
        "format": "video/h264-mp4",
        "pix_fmt": "yuv420p",
        "crf": 19,
        "save_metadata": true,
        "trim_to_audio": false,
        "pingpong": false,
        "save_output": true,
        "videopreview": {
          "hidden": false,
          "paused": false,
          "params": {
            "filename": "Wan2.1_00022.mp4",
            "subfolder": "",
            "type": "output",
            "format": "video/h264-mp4",
            "frame_rate": 16,
            "workflow": "Wan2.1_00022.png",
            "fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\Wan2.1_00022.mp4"
          }
        }
      }
    },
    {
      "id": 38,
      "type": "CLIPLoader",
      "pos": [
        56.288665771484375,
        312.74468994140625
      ],
      "size": [
        301.3524169921875,
        106
      ],
      "flags": {},
      "order": 0,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "CLIP",
          "type": "CLIP",
          "slot_index": 0,
          "links": [
            74,
            75
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPLoader"
      },
      "widgets_values": [
        "umt5_xxl_fp8_e4m3fn_scaled.safetensors",
        "wan",
        "default"
      ],
      "color": "#432",
      "bgcolor": "#653"
    },
    {
      "id": 39,
      "type": "VAELoader",
      "pos": [
        942.8928474299169,
        72.63327169166966
      ],
      "size": [
        270.8619743474269,
        58.62092132305915
      ],
      "flags": {},
      "order": 1,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "VAE",
          "type": "VAE",
          "slot_index": 0,
          "links": [
            76
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAELoader"
      },
      "widgets_values": [
        "wan_2.1_vae.safetensors"
      ],
      "color": "#322",
      "bgcolor": "#533"
    },
    {
      "id": 48,
      "type": "ModelSamplingSD3",
      "pos": [
        624.2695922851562,
        50.834712982177734
      ],
      "size": [
        210,
        58
      ],
      "flags": {},
      "order": 7,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 98
        }
      ],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            95
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "ModelSamplingSD3"
      },
      "widgets_values": [
        8
      ],
      "color": "#2a363b",
      "bgcolor": "#3f5159"
    },
    {
      "id": 7,
      "type": "CLIPTextEncode",
      "pos": [
        415,
        389
      ],
      "size": [
        419.3189392089844,
        138.8924560546875
      ],
      "flags": {
        "collapsed": true
      },
      "order": 5,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 75
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            52
          ]
        }
      ],
      "title": "CLIP Text Encode (Negative Prompt)",
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        ""
      ]
    },
    {
      "id": 52,
      "type": "LoraLoaderModelOnly",
      "pos": [
        302.6253941125505,
        50.834712982177734
      ],
      "size": [
        294.4378358179616,
        82
      ],
      "flags": {},
      "order": 6,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 97
        }
      ],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "links": [
            98
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.76",
        "Node name for S&R": "LoraLoaderModelOnly"
      },
      "widgets_values": [
        "Wan2.1\\Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank64.safetensors",
        1
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 37,
      "type": "UNETLoader",
      "pos": [
        -29.959172587796715,
        50.834712982177734
      ],
      "size": [
        305.3782043457031,
        82
      ],
      "flags": {},
      "order": 2,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            97
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "UNETLoader"
      },
      "widgets_values": [
        "Wan2.1\\wan2.1_t2v_14B_fp8_e4m3fn.safetensors",
        "default"
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 6,
      "type": "CLIPTextEncode",
      "pos": [
        415,
        186
      ],
      "size": [
        419.26959228515625,
        148.8194122314453
      ],
      "flags": {},
      "order": 4,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 74
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            46
          ]
        }
      ],
      "title": "CLIP Text Encode (Positive Prompt)",
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "An origami fox running in the forest. The fox is made of polygons. speed and passion. realistic."
      ]
    },
    {
      "id": 3,
      "type": "KSampler",
      "pos": [
        898.7548217773438,
        188.1918182373047
      ],
      "size": [
        315,
        262
      ],
      "flags": {},
      "order": 8,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 95
        },
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 46
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "link": 52
        },
        {
          "name": "latent_image",
          "type": "LATENT",
          "link": 91
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            35
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "KSampler"
      },
      "widgets_values": [
        1234,
        "fixed",
        6,
        1,
        "euler",
        "simple",
        1
      ]
    },
    {
      "id": 40,
      "type": "EmptyHunyuanLatentVideo",
      "pos": [
        543.8150468306108,
        516.1501586914057
      ],
      "size": [
        290.4545454545455,
        130
      ],
      "flags": {},
      "order": 3,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            91
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "EmptyHunyuanLatentVideo"
      },
      "widgets_values": [
        848,
        480,
        81,
        1
      ]
    }
  ],
  "links": [
    [
      35,
      3,
      0,
      8,
      0,
      "LATENT"
    ],
    [
      46,
      6,
      0,
      3,
      1,
      "CONDITIONING"
    ],
    [
      52,
      7,
      0,
      3,
      2,
      "CONDITIONING"
    ],
    [
      74,
      38,
      0,
      6,
      0,
      "CLIP"
    ],
    [
      75,
      38,
      0,
      7,
      0,
      "CLIP"
    ],
    [
      76,
      39,
      0,
      8,
      1,
      "VAE"
    ],
    [
      91,
      40,
      0,
      3,
      3,
      "LATENT"
    ],
    [
      95,
      48,
      0,
      3,
      0,
      "MODEL"
    ],
    [
      96,
      8,
      0,
      49,
      0,
      "IMAGE"
    ],
    [
      97,
      37,
      0,
      52,
      0,
      "MODEL"
    ],
    [
      98,
      52,
      0,
      48,
      0,
      "MODEL"
    ]
  ],
  "groups": [],
  "config": {},
  "extra": {
    "ds": {
      "scale": 0.8264462809917354,
      "offset": [
        253.37917258779672,
        144.75528701782227
      ]
    },
    "frontendVersion": "1.35.0",
    "VHS_latentpreview": false,
    "VHS_latentpreviewrate": 0,
    "VHS_MetadataImage": true,
    "VHS_KeepIntermediate": true
  },
  "version": 0.4
}

LoraLoaderModelOnly ノードで LoRA を読み込みます。
KSampler の steps を 4 ～ 8、CFG を 1.0 に設定します。

Self Forcing は「とにかく高速に回したいとき」の選択肢です。
許容できないほどではないですが、劣化は大きいです。

画像生成

text2video の workflow で、1 フレームで動画生成するだけです。

Wan2.1_text2image_14B.json

{
  "id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
  "revision": 0,
  "last_node_id": 53,
  "last_link_id": 100,
  "nodes": [
    {
      "id": 38,
      "type": "CLIPLoader",
      "pos": [
        56.288665771484375,
        312.74468994140625
      ],
      "size": [
        301.3524169921875,
        106
      ],
      "flags": {},
      "order": 0,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "CLIP",
          "type": "CLIP",
          "slot_index": 0,
          "links": [
            74,
            75
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPLoader"
      },
      "widgets_values": [
        "umt5_xxl_fp8_e4m3fn_scaled.safetensors",
        "wan",
        "default"
      ],
      "color": "#432",
      "bgcolor": "#653"
    },
    {
      "id": 39,
      "type": "VAELoader",
      "pos": [
        942.8928474299169,
        72.63327169166966
      ],
      "size": [
        270.8619743474269,
        58.62092132305915
      ],
      "flags": {},
      "order": 1,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "VAE",
          "type": "VAE",
          "slot_index": 0,
          "links": [
            76
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAELoader"
      },
      "widgets_values": [
        "wan_2.1_vae.safetensors"
      ],
      "color": "#322",
      "bgcolor": "#533"
    },
    {
      "id": 48,
      "type": "ModelSamplingSD3",
      "pos": [
        624.2695922851562,
        49.18182042019427
      ],
      "size": [
        210,
        58
      ],
      "flags": {},
      "order": 6,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 99
        }
      ],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            95
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "ModelSamplingSD3"
      },
      "widgets_values": [
        8
      ],
      "color": "#2a363b",
      "bgcolor": "#3f5159"
    },
    {
      "id": 6,
      "type": "CLIPTextEncode",
      "pos": [
        415.0493469238281,
        186
      ],
      "size": [
        419.26959228515625,
        148.8194122314453
      ],
      "flags": {},
      "order": 4,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 74
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            46
          ]
        }
      ],
      "title": "CLIP Text Encode (Positive Prompt)",
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "An origami fox running in the forest. The fox is made of polygons. speed and passion. realistic."
      ]
    },
    {
      "id": 37,
      "type": "UNETLoader",
      "pos": [
        293.1813232799717,
        49.18182042019427
      ],
      "size": [
        305.3782043457031,
        82
      ],
      "flags": {},
      "order": 2,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            99
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "UNETLoader"
      },
      "widgets_values": [
        "Wan2.1\\wan2.1_t2v_14B_fp8_e4m3fn.safetensors",
        "default"
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 3,
      "type": "KSampler",
      "pos": [
        898.7548217773438,
        188.1918182373047
      ],
      "size": [
        315,
        262
      ],
      "flags": {},
      "order": 7,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 95
        },
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 46
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "link": 52
        },
        {
          "name": "latent_image",
          "type": "LATENT",
          "link": 91
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            35
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "KSampler"
      },
      "widgets_values": [
        1234,
        "fixed",
        20,
        6,
        "euler",
        "simple",
        1
      ]
    },
    {
      "id": 8,
      "type": "VAEDecode",
      "pos": [
        1252.432861328125,
        188.1918182373047
      ],
      "size": [
        157.56002807617188,
        46
      ],
      "flags": {},
      "order": 8,
      "mode": 0,
      "inputs": [
        {
          "name": "samples",
          "type": "LATENT",
          "link": 35
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 76
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "slot_index": 0,
          "links": [
            100
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAEDecode"
      },
      "widgets_values": []
    },
    {
      "id": 7,
      "type": "CLIPTextEncode",
      "pos": [
        415,
        389
      ],
      "size": [
        419.3189392089844,
        138.8924560546875
      ],
      "flags": {
        "collapsed": false
      },
      "order": 5,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 75
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            52
          ]
        }
      ],
      "title": "CLIP Text Encode (Negative Prompt)",
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "色调艳丽，过曝，静态，细节模糊不清，字幕，风格，作品，画作，画面，静止，整体发灰，最差质量，低质量，JPEG压缩残留，丑陋的，残缺的，多余的手指，画得不好的手部，画得不好的脸部，畸形的，毁容的，形态畸形的肢体，手指融合，静止不动的画面，杂乱的背景，三条腿，背景人很多，倒着走 "
      ]
    },
    {
      "id": 40,
      "type": "EmptyHunyuanLatentVideo",
      "pos": [
        543.8643937544389,
        598.4301586914057
      ],
      "size": [
        290.4545454545455,
        130
      ],
      "flags": {},
      "order": 3,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            91
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "EmptyHunyuanLatentVideo"
      },
      "widgets_values": [
        1280,
        720,
        1,
        1
      ]
    },
    {
      "id": 53,
      "type": "SaveImage",
      "pos": [
        1446.1040505526994,
        188.1918182373047
      ],
      "size": [
        531.0700000000002,
        389.10000000000014
      ],
      "flags": {},
      "order": 9,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 100
        }
      ],
      "outputs": [],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.76"
      },
      "widgets_values": [
        "ComfyUI"
      ]
    }
  ],
  "links": [
    [
      35,
      3,
      0,
      8,
      0,
      "LATENT"
    ],
    [
      46,
      6,
      0,
      3,
      1,
      "CONDITIONING"
    ],
    [
      52,
      7,
      0,
      3,
      2,
      "CONDITIONING"
    ],
    [
      74,
      38,
      0,
      6,
      0,
      "CLIP"
    ],
    [
      75,
      38,
      0,
      7,
      0,
      "CLIP"
    ],
    [
      76,
      39,
      0,
      8,
      1,
      "VAE"
    ],
    [
      91,
      40,
      0,
      3,
      3,
      "LATENT"
    ],
    [
      95,
      48,
      0,
      3,
      0,
      "MODEL"
    ],
    [
      99,
      37,
      0,
      48,
      0,
      "MODEL"
    ],
    [
      100,
      8,
      0,
      53,
      0,
      "IMAGE"
    ]
  ],
  "groups": [],
  "config": {},
  "extra": {
    "ds": {
      "scale": 0.7513148009015777,
      "offset": [
        176.81133422851562,
        251.79917957980578
      ]
    },
    "frontendVersion": "1.35.0",
    "VHS_latentpreview": false,
    "VHS_latentpreviewrate": 0,
    "VHS_MetadataImage": true,
    "VHS_KeepIntermediate": true
  },
  "version": 0.4
}

Wan2.1

Wan2.1とは？

推奨設定

モデルのダウンロード

text2video

品質が上がるかもしれない技術

image2video

FLF2V（First–Last Frame to Video）

Self Forcing（高速生成）

モデルのダウンロード

workflow

画像生成

jsonコピーボタンとは？

修正・誤字報告

記事リクエスト

感想・その他

ありがとうございます