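What is Wan2.2?
Wan2.2 is a family of video generation models and the direct successor to Wan2.1. It consists of two main models.
- 14B: a two-stage setup that switches between two models, high_noise and low_noise
- 5B: a TI2V model that handles text2video and image2video in a single model, paired with the Wan2.2 VAE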
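Wan2.2 14B model
Wan2.2-A14B is a two-stage pipeline: the high_noise model handles the early part of sampling and the low_noise model handles the later part.
Splitting the work across two models doubles the total model size to improve quality while keeping VRAM usage at the Wan2.1 level.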
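Recommended settings
- Recommended resolution
  - 480p (854×480) to 720p (1280×720)
- Maximum frame count
  - 81 frames
- FPS
  - Output usually comes out at around 16 fps
  - However, 16 fps often makes the video look like slow motion, so save at 24 fps or adjust by dropping frames.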
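The effect of the frame rate on clip length is simple arithmetic: 81 frames last 81 / 16 = 5.0625 seconds at 16 fps and 81 / 24 = 3.375 seconds at 24 fps. The sketch below illustrates both adjustments mentioned above. The function names are hypothetical helpers, not part of ComfyUI, and frame dropping is shown as keeping every n-th frame, which is only one possible way to do it.

```python
def duration_seconds(frame_count, fps):
    """Playback length of a clip with frame_count frames shown at fps."""
    return frame_count / fps


def drop_frames(frames, keep_every):
    """Keep every keep_every-th frame, starting with the first one."""
    if keep_every < 1:
        raise ValueError("keep_every must be 1 or greater")
    return frames[::keep_every]


if __name__ == "__main__":
    frames = list(range(81))  # stand-in for 81 decoded frames
    print(duration_seconds(len(frames), 16))  # same frames, saved at 16 fps
    print(duration_seconds(len(frames), 24))  # same frames, saved at 24 fps
    thinned = drop_frames(frames, 2)          # half the frames, kept at 16 fps
    print(len(thinned), duration_seconds(len(thinned), 16))
```

Saving at 24 fps keeps every generated frame and only plays them faster, while dropping frames speeds up motion at the cost of smoothness.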
Downloading the models
- diffusion_models
- text_encoders
  - umt5_xxl_fp8_e4m3fn_scaled.safetensors (shared with Wan2.1)
- vae
  - wan_2.1_vae.safetensors (shared with Wan2.1)
- gguf (optional)
📂ComfyUI/
└── 📂models/
    ├── 📂diffusion_models/
    │   ├── wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors
    │   ├── wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors
    │   ├── wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors
    │   └── wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors
    ├── 📂text_encoders/
    │   └── umt5_xxl_fp8_e4m3fn_scaled.safetensors
    ├── 📂unet/
    │   ├── wan2.2_t2v_high_noise_14B-XXXX.gguf ← only when using gguf
    │   ├── wan2.2_t2v_low_noise_14B-XXXX.gguf ← only when using gguf
    │   ├── wan2.2_i2v_high_noise_14B-XXXX.gguf ← only when using gguf
    │   └── wan2.2_i2v_low_noise_14B-XXXX.gguf ← only when using gguf
    └── 📂vae/
        └── wan_2.1_vae.safetensors
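Before loading a workflow, it can help to check that the files are where the layout above expects them. The following is a minimal sketch for the 14B text2video files only. `missing_model_files` is a hypothetical helper, and `"ComfyUI"` is a placeholder for your own install directory.

```python
from pathlib import Path

# Folders and file names are the ones listed above for 14B text2video.
EXPECTED_T2V_FILES = {
    "diffusion_models": [
        "wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors",
        "wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors",
    ],
    "text_encoders": ["umt5_xxl_fp8_e4m3fn_scaled.safetensors"],
    "vae": ["wan_2.1_vae.safetensors"],
}


def missing_model_files(comfyui_root, expected=EXPECTED_T2V_FILES):
    """Return the expected paths that do not exist under <root>/models."""
    models_dir = Path(comfyui_root) / "models"
    missing = []
    for subfolder, names in expected.items():
        for name in names:
            path = models_dir / subfolder / name
            if not path.is_file():
                missing.append(path)
    return missing


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; point it at your own install directory.
    for path in missing_model_files("ComfyUI"):
        print(f"missing: {path}")
```

For image2video, swap the two `t2v` names for their `i2v` counterparts. If you keep the models in a subfolder, adjust the paths accordingly.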
text2video (14B)
Use KSampler Advanced to process the early steps with the high_noise model and the remaining steps with the low_noise model.
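The hand-off between the two samplers can be sketched as a pair of step ranges. The snippet below mirrors the KSampler Advanced settings used in this article's workflow (20 steps, switch at step 4, leftover noise passed from the first sampler to the second). `SamplerStage`, `split_stages`, and `steps_handled` are hypothetical names for illustration, not ComfyUI APIs.

```python
from dataclasses import dataclass

# The workflow in this article sets end_at_step to 9999 on the second
# sampler; the sketch treats any value past the step count as "run to the end".
RUN_TO_END = 9999


@dataclass
class SamplerStage:
    """The KSampler Advanced settings that matter for the hand-off."""
    model: str
    add_noise: str
    start_at_step: int
    end_at_step: int
    return_with_leftover_noise: str


def split_stages(total_steps, switch_step):
    """Build the high_noise and low_noise stages for one sampling run."""
    if not 0 < switch_step < total_steps:
        raise ValueError("switch_step must lie strictly between 0 and total_steps")
    # First sampler: adds the initial noise and stops early, keeping the
    # remaining noise in the latent for the next sampler.
    high = SamplerStage("high_noise", "enable", 0, switch_step, "enable")
    # Second sampler: adds no new noise and finishes the remaining steps.
    low = SamplerStage("low_noise", "disable", switch_step, RUN_TO_END, "disable")
    return high, low


def steps_handled(stage, total_steps):
    """Step indices this stage actually processes."""
    return range(stage.start_at_step, min(stage.end_at_step, total_steps))


if __name__ == "__main__":
    for stage in split_stages(20, 4):
        print(stage.model, list(steps_handled(stage, 20)))
```

Both samplers must use the same total step count so that the second one continues the same schedule instead of starting a new one.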

{
"id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
"revision": 0,
"last_node_id": 159,
"last_link_id": 327,
"nodes": [
{
"id": 38,
"type": "CLIPLoader",
"pos": [
56.288665771484375,
312.74468994140625
],
"size": [
301.3524169921875,
106
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"slot_index": 0,
"links": [
74,
75
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPLoader"
},
"widgets_values": [
"umt5_xxl_fp8_e4m3fn_scaled.safetensors",
"wan",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 8,
"type": "VAEDecode",
"pos": [
1613.413330078125,
176.01361083984375
],
"size": [
157.56002807617188,
46
],
"flags": {},
"order": 13,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 317
},
{
"name": "vae",
"type": "VAE",
"link": 76
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"slot_index": 0,
"links": [
256
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAEDecode"
},
"widgets_values": []
},
{
"id": 7,
"type": "CLIPTextEncode",
"pos": [
417.8738708496094,
389
],
"size": [
419.3189392089844,
138.8924560546875
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 75
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
319,
321
]
}
],
"title": "CLIP Text Encode (Negative Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走 "
]
},
{
"id": 48,
"type": "ModelSamplingSD3",
"pos": [
627.1928100585938,
49.284149169921875
],
"size": [
210,
58
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 134
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
322
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 157,
"type": "ModelSamplingSD3",
"pos": [
1005.2630004882812,
-109.75807189941406
],
"size": [
210,
58
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 323
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
324
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 73,
"type": "UNETLoader",
"pos": [
329.4610595703125,
46.62214660644531
],
"size": [
270,
82
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
134
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Wan2.2\\wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors",
"fp8_e4m3fn"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 156,
"type": "UNETLoader",
"pos": [
705.2611694335938,
-109.75807189941406
],
"size": [
270,
82
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
323
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Wan2.2\\wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors",
"fp8_e4m3fn"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 151,
"type": "MarkdownNote",
"pos": [
-113.67849731445312,
-40.50404357910156
],
"size": [
405.56439208984375,
254.14065551757812
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n- [wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors)\n- [wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors)\n\n- [umt5_xxl.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/text_encoders)\n- [wan_2.1_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/vae/wan_2.1_vae.safetensors)\n\n\n```\n📂ComfyUI/\n└──📂models/\n ├── 📂diffusion_models/\n │ ├── wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors\n │ └── wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors\n ├── 📂text_encoders/\n │ └── umt5_xxl (fp16 or fp8).safetensors\n └── 📂vae/\n └── wan_2.1_vae.safetensors\n\n```"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 113,
"type": "VHS_VideoCombine",
"pos": [
1805.07373046875,
176.01361083984375
],
"size": [
372.2688903808594,
334
],
"flags": {},
"order": 14,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 256
},
{
"name": "audio",
"shape": 7,
"type": "AUDIO",
"link": null
},
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "Filenames",
"type": "VHS_FILENAMES",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "a7ce59e381934733bfae03b1be029756d6ce936d",
"Node name for S&R": "VHS_VideoCombine"
},
"widgets_values": {
"frame_rate": 24,
"loop_count": 0,
"filename_prefix": "Wan2.2",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "Wan2.2_00090.mp4",
"subfolder": "",
"type": "output",
"format": "video/h264-mp4",
"frame_rate": 24,
"workflow": "Wan2.2_00090.png",
"fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\Wan2.2_00090.mp4"
}
}
}
},
{
"id": 153,
"type": "EmptyHunyuanLatentVideo",
"pos": [
567.0985107421875,
585.5717163085938
],
"size": [
270.0943298339844,
130
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
327
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "EmptyHunyuanLatentVideo"
},
"widgets_values": [
848,
480,
81,
1
]
},
{
"id": 158,
"type": "PrimitiveNode",
"pos": [
627.1928100585938,
772.84326171875
],
"size": [
210,
82
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "INT",
"type": "INT",
"widget": {
"name": "start_at_step"
},
"links": [
325,
326
]
}
],
"title": "start_at_step",
"properties": {
"Run widget replace on values": false
},
"widgets_values": [
4,
"fixed"
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 6,
"type": "CLIPTextEncode",
"pos": [
417.9232177734375,
186
],
"size": [
419.26959228515625,
148.8194122314453
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 74
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
318,
320
]
}
],
"title": "CLIP Text Encode (Positive Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"A high-quality, ultra-detailed image of a kingfisher diving into a crystal-clear pool of water, captured at the precise moment its beak touches the surface. The bird’s vibrant blue and orange feathers are sharply defined, with water droplets suspended in the air around it. The background features a softly blurred natural riverside setting, with lush green foliage and gentle sunlight filtering through the trees, creating a serene and dynamic scene."
]
},
{
"id": 154,
"type": "KSamplerAdvanced",
"pos": [
925.6195068359375,
171.68209838867188
],
"size": [
304.748046875,
334
],
"flags": {},
"order": 11,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 322
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 318
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 319
},
{
"name": "latent_image",
"type": "LATENT",
"link": 327
},
{
"name": "end_at_step",
"type": "INT",
"widget": {
"name": "end_at_step"
},
"link": 326
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
316
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "KSamplerAdvanced"
},
"widgets_values": [
"enable",
12345,
"fixed",
20,
4,
"euler",
"simple",
0,
4,
"enable"
]
},
{
"id": 155,
"type": "KSamplerAdvanced",
"pos": [
1274.5648193359375,
176.01361083984375
],
"size": [
304.748046875,
334
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 324
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 320
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 321
},
{
"name": "latent_image",
"type": "LATENT",
"link": 316
},
{
"name": "start_at_step",
"type": "INT",
"widget": {
"name": "start_at_step"
},
"link": 325
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
317
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "KSamplerAdvanced"
},
"widgets_values": [
"disable",
0,
"fixed",
20,
3,
"euler",
"simple",
4,
9999,
"disable"
]
},
{
"id": 39,
"type": "VAELoader",
"pos": [
1314.0928344726562,
49.20225813953267
],
"size": [
265.22003173828125,
58
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"slot_index": 0,
"links": [
76
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"wan_2.1_vae.safetensors"
],
"color": "#322",
"bgcolor": "#533"
}
],
"links": [
[
74,
38,
0,
6,
0,
"CLIP"
],
[
75,
38,
0,
7,
0,
"CLIP"
],
[
76,
39,
0,
8,
1,
"VAE"
],
[
134,
73,
0,
48,
0,
"MODEL"
],
[
256,
8,
0,
113,
0,
"IMAGE"
],
[
316,
154,
0,
155,
3,
"LATENT"
],
[
317,
155,
0,
8,
0,
"LATENT"
],
[
318,
6,
0,
154,
1,
"CONDITIONING"
],
[
319,
7,
0,
154,
2,
"CONDITIONING"
],
[
320,
6,
0,
155,
1,
"CONDITIONING"
],
[
321,
7,
0,
155,
2,
"CONDITIONING"
],
[
322,
48,
0,
154,
0,
"MODEL"
],
[
323,
156,
0,
157,
0,
"MODEL"
],
[
324,
157,
0,
155,
0,
"MODEL"
],
[
325,
158,
0,
155,
4,
"INT"
],
[
326,
158,
0,
154,
4,
"INT"
],
[
327,
153,
0,
154,
3,
"LATENT"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 0.7513148009015777,
"offset": [
213.67849731445312,
209.75807189941406
]
},
"frontendVersion": "1.35.0",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
- 🟩 Specifies at which step, out of 20 total, sampling switches from high_noise to low_noise.
  - The officially recommended switching point is where the ratio of the "non-noise part" to the "noise part" is 1:1.
  - This can be calculated, but it is difficult because the sampler, scheduler, sigma_shift, and step count all interact.
  - A perfect match is also not guaranteed to give the best result.
  - This workflow switches at step 4; use that as a baseline and experiment from there.
- 🟨🟥 The text encoder and VAE are the same as in Wan2.1.
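To get a feel for how shift and step count move the noise level, the rough sketch below estimates the noise level at each step. It rests on several assumptions and is not ComfyUI's scheduler code: the base noise level falls linearly from 1 to 0, the shift follows the commonly used flow-matching form `shift * s / (1 + (shift - 1) * s)`, and a latent at noise level sigma is a blend of `(1 - sigma)` signal and `sigma` noise, so that a 1:1 ratio corresponds to sigma = 0.5. All function names are hypothetical.

```python
def shifted_sigma(sigma, shift):
    """Apply the assumed flow-matching shift to a noise level in [0, 1]."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def sigma_schedule(steps, shift):
    """Noise level before each step plus the final 0.0 (assumed linear base)."""
    return [shifted_sigma(1.0 - i / steps, shift) for i in range(steps + 1)]


def first_step_at_or_below(sigmas, threshold):
    """Index of the first step whose noise level is at or below threshold."""
    for index, sigma in enumerate(sigmas):
        if sigma <= threshold:
            return index
    return len(sigmas) - 1


if __name__ == "__main__":
    sigmas = sigma_schedule(steps=20, shift=8.0)
    for index, sigma in enumerate(sigmas):
        print(f"step {index:2d}: sigma = {sigma:.3f}")
    print("first step at or below 0.5:", first_step_at_or_below(sigmas, 0.5))
```

Treat the result as a rough guide only. A different sampler or scheduler changes the curve, the estimate need not agree with the step 4 used in this workflow, and an exact match is not necessarily the best setting anyway.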
image2video (14B)

{
"id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
"revision": 0,
"last_node_id": 164,
"last_link_id": 351,
"nodes": [
{
"id": 38,
"type": "CLIPLoader",
"pos": [
35.79126739501953,
314.20880126953125
],
"size": [
301.3524169921875,
106
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"slot_index": 0,
"links": [
74,
75
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPLoader"
},
"widgets_values": [
"umt5_xxl_fp8_e4m3fn_scaled.safetensors",
"wan",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 8,
"type": "VAEDecode",
"pos": [
1846.9432373046875,
185.6936798095703
],
"size": [
157.56002807617188,
46
],
"flags": {},
"order": 16,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 317
},
{
"name": "vae",
"type": "VAE",
"link": 76
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"slot_index": 0,
"links": [
256
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAEDecode"
},
"widgets_values": []
},
{
"id": 160,
"type": "LoadImage",
"pos": [
46.66357421875,
723.9078369140625
],
"size": [
334.93658447265625,
388.3801574707031
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
330
]
},
{
"name": "MASK",
"type": "MASK",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"Clipboard - 2025-05-13 21.27.11.png",
"image"
]
},
{
"id": 73,
"type": "UNETLoader",
"pos": [
624.2608642578125,
44.42214584350586
],
"size": [
270,
82
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
134
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Wan2.2\\wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors",
"fp8_e4m3fn"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 7,
"type": "CLIPTextEncode",
"pos": [
397.4257507324219,
390.464111328125
],
"size": [
419.3189392089844,
138.8924560546875
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 75
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
341
]
}
],
"title": "CLIP Text Encode (Negative Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走 "
]
},
{
"id": 48,
"type": "ModelSamplingSD3",
"pos": [
921.9925537109375,
47.08414840698242
],
"size": [
210,
58
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 134
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
322
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 154,
"type": "KSamplerAdvanced",
"pos": [
1170.0396728515625,
186.20217895507812
],
"size": [
304.748046875,
334
],
"flags": {},
"order": 14,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 322
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 342
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 343
},
{
"name": "latent_image",
"type": "LATENT",
"link": 346
},
{
"name": "end_at_step",
"type": "INT",
"widget": {
"name": "end_at_step"
},
"link": 326
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
316
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "KSamplerAdvanced"
},
"widgets_values": [
"enable",
1234,
"fixed",
20,
4,
"euler",
"simple",
0,
4,
"enable"
]
},
{
"id": 162,
"type": "GetImageSize",
"pos": [
676.7446899414062,
723.2024536132812
],
"size": [
140,
124
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 348
}
],
"outputs": [
{
"name": "width",
"type": "INT",
"links": [
350
]
},
{
"name": "height",
"type": "INT",
"links": [
351
]
},
{
"name": "batch_size",
"type": "INT",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "GetImageSize"
},
"widgets_values": [
"width: 724, height: 724\n batch size: 1"
]
},
{
"id": 6,
"type": "CLIPTextEncode",
"pos": [
397.47509765625,
187.46409606933594
],
"size": [
419.26959228515625,
148.8194122314453
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 74
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
340
]
}
],
"title": "CLIP Text Encode (Positive Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"Photorealistic style. A young girl with blonde braided hair, wearing a white dress, stands in a lush green garden filled with warm afternoon sunlight. She carefully holds a glass jar with both hands, watching a small goldfish swim gracefully inside with a gentle smile. The camera starts with a medium shot focused on the girl and the jar, then slowly and smoothly zooms in for a close-up of the goldfish. The overall atmosphere is soft, calm, and heartwarming."
]
},
{
"id": 39,
"type": "VAELoader",
"pos": [
551.524658203125,
597.2347412109375
],
"size": [
265.22003173828125,
58
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"slot_index": 0,
"links": [
76,
347
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"wan_2.1_vae.safetensors"
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 161,
"type": "ImageScaleToTotalPixels",
"pos": [
405.4178161621094,
723.9078369140625
],
"size": [
239.38699340820312,
82
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 330
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
348,
349
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "ImageScaleToTotalPixels"
},
"widgets_values": [
"nearest-exact",
0.5
]
},
{
"id": 151,
"type": "MarkdownNote",
"pos": [
-113.67849731445312,
-40.50404357910156
],
"size": [
405.56439208984375,
254.14065551757812
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n- [wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors)\n- [wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors)\n\n- [umt5_xxl.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/text_encoders)\n- [wan_2.1_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/vae/wan_2.1_vae.safetensors)\n\n\n```\n📂ComfyUI/\n└──📂models/\n ├── 📂diffusion_models/\n │ ├── wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors\n │ └── wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors\n ├── 📂text_encoders/\n │ └── umt5_xxl (fp16 or fp8).safetensors\n └── 📂vae/\n └── wan_2.1_vae.safetensors\n\n```"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 157,
"type": "ModelSamplingSD3",
"pos": [
1264.7877197265625,
-104.88626098632812
],
"size": [
210,
58
],
"flags": {},
"order": 11,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 323
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
324
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 156,
"type": "UNETLoader",
"pos": [
959.7449340820312,
-104.88626098632812
],
"size": [
270,
82
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
323
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Wan2.2\\wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors",
"fp8_e4m3fn"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 113,
"type": "VHS_VideoCombine",
"pos": [
2038.6036376953125,
185.6936798095703
],
"size": [
372.2688903808594,
700.2689208984375
],
"flags": {},
"order": 17,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 256
},
{
"name": "audio",
"shape": 7,
"type": "AUDIO",
"link": null
},
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "Filenames",
"type": "VHS_FILENAMES",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "a7ce59e381934733bfae03b1be029756d6ce936d",
"Node name for S&R": "VHS_VideoCombine"
},
"widgets_values": {
"frame_rate": 24,
"loop_count": 0,
"filename_prefix": "Wan2.2",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "Wan2.2_00091.mp4",
"subfolder": "",
"type": "output",
"format": "video/h264-mp4",
"frame_rate": 24,
"workflow": "Wan2.2_00091.png",
"fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\Wan2.2_00091.mp4"
}
}
}
},
{
"id": 158,
"type": "PrimitiveNode",
"pos": [
920.8279418945312,
588.53564453125
],
"size": [
210,
82
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "INT",
"type": "INT",
"widget": {
"name": "start_at_step"
},
"links": [
325,
326
]
}
],
"title": "start_at_step",
"properties": {
"Run widget replace on values": false
},
"widgets_values": [
4,
"fixed"
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 164,
"type": "WanImageToVideo",
"pos": [
870.6785278320312,
204.1481475830078
],
"size": [
270,
210
],
"flags": {},
"order": 13,
"mode": 0,
"inputs": [
{
"name": "positive",
"type": "CONDITIONING",
"link": 340
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 341
},
{
"name": "vae",
"type": "VAE",
"link": 347
},
{
"name": "clip_vision_output",
"shape": 7,
"type": "CLIP_VISION_OUTPUT",
"link": null
},
{
"name": "start_image",
"shape": 7,
"type": "IMAGE",
"link": 349
},
{
"name": "width",
"type": "INT",
"widget": {
"name": "width"
},
"link": 350
},
{
"name": "height",
"type": "INT",
"widget": {
"name": "height"
},
"link": 351
}
],
"outputs": [
{
"name": "positive",
"type": "CONDITIONING",
"links": [
342,
344
]
},
{
"name": "negative",
"type": "CONDITIONING",
"links": [
343,
345
]
},
{
"name": "latent",
"type": "LATENT",
"links": [
346
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "WanImageToVideo"
},
"widgets_values": [
832,
480,
81,
1
],
"color": "#2a363b",
"bgcolor": "#3f5159"
},
{
"id": 155,
"type": "KSamplerAdvanced",
"pos": [
1510.51513671875,
185.6936798095703
],
"size": [
304.748046875,
334
],
"flags": {},
"order": 15,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 324
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 344
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 345
},
{
"name": "latent_image",
"type": "LATENT",
"link": 316
},
{
"name": "start_at_step",
"type": "INT",
"widget": {
"name": "start_at_step"
},
"link": 325
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
317
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "KSamplerAdvanced"
},
"widgets_values": [
"disable",
0,
"fixed",
20,
3,
"euler",
"simple",
4,
9999,
"disable"
]
}
],
"links": [
[
74,
38,
0,
6,
0,
"CLIP"
],
[
75,
38,
0,
7,
0,
"CLIP"
],
[
76,
39,
0,
8,
1,
"VAE"
],
[
134,
73,
0,
48,
0,
"MODEL"
],
[
256,
8,
0,
113,
0,
"IMAGE"
],
[
316,
154,
0,
155,
3,
"LATENT"
],
[
317,
155,
0,
8,
0,
"LATENT"
],
[
322,
48,
0,
154,
0,
"MODEL"
],
[
323,
156,
0,
157,
0,
"MODEL"
],
[
324,
157,
0,
155,
0,
"MODEL"
],
[
325,
158,
0,
155,
4,
"INT"
],
[
326,
158,
0,
154,
4,
"INT"
],
[
330,
160,
0,
161,
0,
"IMAGE"
],
[
340,
6,
0,
164,
0,
"CONDITIONING"
],
[
341,
7,
0,
164,
1,
"CONDITIONING"
],
[
342,
164,
0,
154,
1,
"CONDITIONING"
],
[
343,
164,
1,
154,
2,
"CONDITIONING"
],
[
344,
164,
0,
155,
1,
"CONDITIONING"
],
[
345,
164,
1,
155,
2,
"CONDITIONING"
],
[
346,
164,
2,
154,
3,
"LATENT"
],
[
347,
39,
0,
164,
2,
"VAE"
],
[
348,
161,
0,
162,
0,
"IMAGE"
],
[
349,
161,
0,
164,
4,
"IMAGE"
],
[
350,
162,
0,
164,
5,
"INT"
],
[
351,
162,
1,
164,
6,
"INT"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 0.6209213230591554,
"offset": [
213.67849731445312,
204.88626098632812
]
},
"frontendVersion": "1.28.0",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
- 🟦 The start image is fed into a WanImageToVideo-type node.
  - Unlike Wan2.1, Wan2.2 does not use clip_vision.
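In the workflow above, the start image passes through ImageScaleToTotalPixels (set to 0.5) and GetImageSize before reaching WanImageToVideo, so the video size follows the scaled image. The sketch below approximates that scaling. It assumes one megapixel means 1024 × 1024 pixels, which is inferred from the 724 × 724 size shown in the workflow rather than taken from ComfyUI's source. `scale_to_total_pixels` is a hypothetical helper.

```python
import math


def scale_to_total_pixels(width, height, megapixels):
    """Scale (width, height) to roughly megapixels, keeping the aspect ratio."""
    total_pixels = megapixels * 1024 * 1024  # assumption: 1 MP = 1024 * 1024
    scale = math.sqrt(total_pixels / (width * height))
    return round(width * scale), round(height * scale)


if __name__ == "__main__":
    print(scale_to_total_pixels(1024, 1024, 0.5))
    print(scale_to_total_pixels(1920, 1080, 0.5))
```

Raise the megapixel value to get closer to 720p, keeping in mind that larger frames need more VRAM and time.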
FLF2V (14B / First-Last Frame to Video)
Wan2.1 had a dedicated model for FLF2V, but the Wan2.2 image2video models also support FLF2V.
In ComfyUI, feeding two images, Start and End, into the WanFirstLastFrameToVideo node is enough to generate a video that interpolates between them.
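A quick way to confirm that both images are wired up is to inspect the exported workflow JSON. The sketch below assumes the file has the shape used in this article: a `nodes` list whose entries carry a `type` and an `inputs` list with `name` and `link` fields. `connected_inputs` is a hypothetical helper, not a ComfyUI API.

```python
def connected_inputs(workflow, node_type):
    """Map each input name of the first node of node_type to whether it is linked."""
    for node in workflow.get("nodes", []):
        if node.get("type") == node_type:
            return {
                entry["name"]: entry.get("link") is not None
                for entry in node.get("inputs", [])
            }
    raise KeyError(f"no node of type {node_type!r} in workflow")


if __name__ == "__main__":
    # Tiny stand-in for a workflow loaded with json.load(); link ids are arbitrary.
    workflow = {
        "nodes": [
            {
                "type": "WanFirstLastFrameToVideo",
                "inputs": [
                    {"name": "clip_vision_start_image", "link": None},
                    {"name": "start_image", "link": 1},
                    {"name": "end_image", "link": 2},
                ],
            }
        ]
    }
    status = connected_inputs(workflow, "WanFirstLastFrameToVideo")
    print(status["start_image"] and status["end_image"])
```

The clip_vision inputs are expected to stay unconnected, since Wan2.2 does not use them.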

{
"id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
"revision": 0,
"last_node_id": 171,
"last_link_id": 372,
"nodes": [
{
"id": 38,
"type": "CLIPLoader",
"pos": [
35.79126739501953,
314.20880126953125
],
"size": [
301.3524169921875,
106
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"slot_index": 0,
"links": [
74,
75
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPLoader"
},
"widgets_values": [
"umt5_xxl_fp8_e4m3fn_scaled.safetensors",
"wan",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 8,
"type": "VAEDecode",
"pos": [
1846.9432373046875,
185.6936798095703
],
"size": [
157.56002807617188,
46
],
"flags": {},
"order": 20,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 317
},
{
"name": "vae",
"type": "VAE",
"link": 76
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"slot_index": 0,
"links": [
256
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAEDecode"
},
"widgets_values": []
},
{
"id": 73,
"type": "UNETLoader",
"pos": [
624.2608642578125,
44.42214584350586
],
"size": [
270,
82
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
134
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Wan2.2\\wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors",
"fp8_e4m3fn"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 48,
"type": "ModelSamplingSD3",
"pos": [
921.9925537109375,
47.08414840698242
],
"size": [
210,
58
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 134
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
322
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 151,
"type": "MarkdownNote",
"pos": [
-113.67849731445312,
-40.50404357910156
],
"size": [
405.56439208984375,
254.14065551757812
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n- [wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors)\n- [wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors)\n\n- [umt5_xxl.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/text_encoders)\n- [wan_2.1_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/vae/wan_2.1_vae.safetensors)\n\n\n```\n📂ComfyUI/\n└──📂models/\n ├── 📂diffusion_models/\n │ ├── wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors\n │ └── wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors\n ├── 📂text_encoders/\n │ └── umt5_xxl (fp16 or fp8).safetensors\n └── 📂vae/\n └── wan_2.1_vae.safetensors\n\n```"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 157,
"type": "ModelSamplingSD3",
"pos": [
1264.7877197265625,
-104.88626098632812
],
"size": [
210,
58
],
"flags": {},
"order": 11,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 323
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
324
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 156,
"type": "UNETLoader",
"pos": [
959.7449340820312,
-104.88626098632812
],
"size": [
270,
82
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
323
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Wan2.2\\wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors",
"fp8_e4m3fn"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 113,
"type": "VHS_VideoCombine",
"pos": [
2038.6036376953125,
185.6936798095703
],
"size": [
372.2688903808594,
855.2672119140625
],
"flags": {},
"order": 21,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 256
},
{
"name": "audio",
"shape": 7,
"type": "AUDIO",
"link": null
},
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "Filenames",
"type": "VHS_FILENAMES",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "a7ce59e381934733bfae03b1be029756d6ce936d",
"Node name for S&R": "VHS_VideoCombine"
},
"widgets_values": {
"frame_rate": 24,
"loop_count": 0,
"filename_prefix": "Wan2.2",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "Wan2.2_00093.mp4",
"subfolder": "",
"type": "output",
"format": "video/h264-mp4",
"frame_rate": 24,
"workflow": "Wan2.2_00093.png",
"fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\Wan2.2_00093.mp4"
}
}
}
},
{
"id": 7,
"type": "CLIPTextEncode",
"pos": [
397.4257507324219,
390.464111328125
],
"size": [
419.3189392089844,
138.8924560546875
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 75
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
353
]
}
],
"title": "CLIP Text Encode (Negative Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走 "
]
},
{
"id": 39,
"type": "VAELoader",
"pos": [
551.524658203125,
597.2347412109375
],
"size": [
265.22003173828125,
58
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"slot_index": 0,
"links": [
76,
354
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"wan_2.1_vae.safetensors"
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 166,
"type": "WanFirstLastFrameToVideo",
"pos": [
866.294921875,
205.22122192382812
],
"size": [
275.09765625,
250
],
"flags": {},
"order": 17,
"mode": 0,
"inputs": [
{
"name": "positive",
"type": "CONDITIONING",
"link": 352
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 353
},
{
"name": "vae",
"type": "VAE",
"link": 354
},
{
"name": "clip_vision_start_image",
"shape": 7,
"type": "CLIP_VISION_OUTPUT",
"link": null
},
{
"name": "clip_vision_end_image",
"shape": 7,
"type": "CLIP_VISION_OUTPUT",
"link": null
},
{
"name": "start_image",
"shape": 7,
"type": "IMAGE",
"link": 370
},
{
"name": "end_image",
"shape": 7,
"type": "IMAGE",
"link": 371
},
{
"name": "width",
"type": "INT",
"widget": {
"name": "width"
},
"link": 355
},
{
"name": "height",
"type": "INT",
"widget": {
"name": "height"
},
"link": 356
}
],
"outputs": [
{
"name": "positive",
"type": "CONDITIONING",
"links": [
357,
362
]
},
{
"name": "negative",
"type": "CONDITIONING",
"links": [
358,
363
]
},
{
"name": "latent",
"type": "LATENT",
"links": [
359
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.48",
"Node name for S&R": "WanFirstLastFrameToVideo"
},
"widgets_values": [
832,
480,
81,
1
],
"color": "#2a363b",
"bgcolor": "#3f5159"
},
{
"id": 168,
"type": "LoadImage",
"pos": [
-248.64210510253906,
963.0194091796875
],
"size": [
334.93658447265625,
388.3801574707031
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
366
]
},
{
"name": "MASK",
"type": "MASK",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"ComfyUI_00507_.png",
"image"
]
},
{
"id": 167,
"type": "ImageBatch",
"pos": [
161.45394897460938,
801.67626953125
],
"size": [
140,
46
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "image1",
"type": "IMAGE",
"link": 365
},
{
"name": "image2",
"type": "IMAGE",
"link": 366
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
367
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.48",
"Node name for S&R": "ImageBatch"
},
"widgets_values": [],
"color": "#2a363b",
"bgcolor": "#3f5159"
},
{
"id": 170,
"type": "ImageFromBatch",
"pos": [
604.3246459960938,
867.176513671875
],
"size": [
210,
82
],
"flags": {},
"order": 15,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 369
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
371
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.48",
"Node name for S&R": "ImageFromBatch"
},
"widgets_values": [
1,
1
],
"color": "#2a363b",
"bgcolor": "#3f5159"
},
{
"id": 169,
"type": "ImageFromBatch",
"pos": [
604.3246459960938,
729.67626953125
],
"size": [
210,
82
],
"flags": {},
"order": 14,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 368
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
370,
372
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.48",
"Node name for S&R": "ImageFromBatch"
},
"widgets_values": [
0,
1
],
"color": "#2a363b",
"bgcolor": "#3f5159"
},
{
"id": 160,
"type": "LoadImage",
"pos": [
-249.5445556640625,
511.5525817871094
],
"size": [
334.93658447265625,
388.3801574707031
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
365
]
},
{
"name": "MASK",
"type": "MASK",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"MCNK7B5FR2J9K.png",
"image"
]
},
{
"id": 162,
"type": "GetImageSize",
"pos": [
604.3246459960938,
1025.370361328125
],
"size": [
210,
136
],
"flags": {},
"order": 16,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 372
}
],
"outputs": [
{
"name": "width",
"type": "INT",
"links": [
355
]
},
{
"name": "height",
"type": "INT",
"links": [
356
]
},
{
"name": "batch_size",
"type": "INT",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "GetImageSize"
},
"widgets_values": [
"width: 606, height: 865\n batch size: 1"
]
},
{
"id": 161,
"type": "ImageScaleToTotalPixels",
"pos": [
330.69580078125,
801.67626953125
],
"size": [
239.38699340820312,
82
],
"flags": {},
"order": 13,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 367
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
368,
369
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "ImageScaleToTotalPixels"
},
"widgets_values": [
"nearest-exact",
0.5
],
"color": "#2a363b",
"bgcolor": "#3f5159"
},
{
"id": 6,
"type": "CLIPTextEncode",
"pos": [
397.47509765625,
187.46409606933594
],
"size": [
419.26959228515625,
148.8194122314453
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 74
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
352
]
}
],
"title": "CLIP Text Encode (Positive Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"A girl is about to sit as the camera moves upward"
]
},
{
"id": 158,
"type": "PrimitiveNode",
"pos": [
920.8279418945312,
588.53564453125
],
"size": [
210,
82
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "INT",
"type": "INT",
"widget": {
"name": "start_at_step"
},
"links": [
325,
326
]
}
],
"title": "start_at_step",
"properties": {
"Run widget replace on values": false
},
"widgets_values": [
4,
"fixed"
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 155,
"type": "KSamplerAdvanced",
"pos": [
1510.51513671875,
185.6936798095703
],
"size": [
304.748046875,
334
],
"flags": {},
"order": 19,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 324
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 362
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 363
},
{
"name": "latent_image",
"type": "LATENT",
"link": 316
},
{
"name": "start_at_step",
"type": "INT",
"widget": {
"name": "start_at_step"
},
"link": 325
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
317
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "KSamplerAdvanced"
},
"widgets_values": [
"disable",
0,
"fixed",
20,
3,
"euler",
"simple",
4,
9999,
"disable"
]
},
{
"id": 154,
"type": "KSamplerAdvanced",
"pos": [
1170.0396728515625,
186.20217895507812
],
"size": [
304.748046875,
334
],
"flags": {},
"order": 18,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 322
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 357
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 358
},
{
"name": "latent_image",
"type": "LATENT",
"link": 359
},
{
"name": "end_at_step",
"type": "INT",
"widget": {
"name": "end_at_step"
},
"link": 326
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
316
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "KSamplerAdvanced"
},
"widgets_values": [
"enable",
123,
"fixed",
20,
4,
"euler",
"simple",
0,
4,
"enable"
]
}
],
"links": [
[
74,
38,
0,
6,
0,
"CLIP"
],
[
75,
38,
0,
7,
0,
"CLIP"
],
[
76,
39,
0,
8,
1,
"VAE"
],
[
134,
73,
0,
48,
0,
"MODEL"
],
[
256,
8,
0,
113,
0,
"IMAGE"
],
[
316,
154,
0,
155,
3,
"LATENT"
],
[
317,
155,
0,
8,
0,
"LATENT"
],
[
322,
48,
0,
154,
0,
"MODEL"
],
[
323,
156,
0,
157,
0,
"MODEL"
],
[
324,
157,
0,
155,
0,
"MODEL"
],
[
325,
158,
0,
155,
4,
"INT"
],
[
326,
158,
0,
154,
4,
"INT"
],
[
352,
6,
0,
166,
0,
"CONDITIONING"
],
[
353,
7,
0,
166,
1,
"CONDITIONING"
],
[
354,
39,
0,
166,
2,
"VAE"
],
[
355,
162,
0,
166,
7,
"INT"
],
[
356,
162,
1,
166,
8,
"INT"
],
[
357,
166,
0,
154,
1,
"CONDITIONING"
],
[
358,
166,
1,
154,
2,
"CONDITIONING"
],
[
359,
166,
2,
154,
3,
"LATENT"
],
[
362,
166,
0,
155,
1,
"CONDITIONING"
],
[
363,
166,
1,
155,
2,
"CONDITIONING"
],
[
365,
160,
0,
167,
0,
"IMAGE"
],
[
366,
168,
0,
167,
1,
"IMAGE"
],
[
367,
167,
0,
161,
0,
"IMAGE"
],
[
368,
161,
0,
169,
0,
"IMAGE"
],
[
369,
161,
0,
170,
0,
"IMAGE"
],
[
370,
169,
0,
166,
5,
"IMAGE"
],
[
371,
170,
0,
166,
6,
"IMAGE"
],
[
372,
169,
0,
162,
0,
"IMAGE"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 0.424097618372485,
"offset": [
349.5445556640625,
204.88626098632812
]
},
"frontendVersion": "1.28.0",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
- 🟦 As with Wan2.1, feed the start and end images into the WanFirstLastFrameToVideo node.
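The workflow above batches the start and end images and runs them through ImageScaleToTotalPixels (0.5 megapixels) so that both frames share one size, which GetImageSize then passes on as width and height. Below is a minimal sketch of that size calculation. It assumes the node scales by the square root of the pixel ratio and counts one megapixel as 1024×1024 pixels; that is my reading of its behavior, not something this article states. The function name and the 1212×1730 input are hypothetical.

```python
import math


def scale_to_total_pixels(width: int, height: int, megapixels: float) -> tuple[int, int]:
    """Return (width, height) scaled to roughly `megapixels`, keeping the aspect ratio."""
    # Assumption: one megapixel is counted as 1024 * 1024 pixels.
    total = int(megapixels * 1024 * 1024)
    # Scaling both sides by sqrt(ratio) scales the pixel count by ratio.
    scale_by = math.sqrt(total / (width * height))
    return round(width * scale_by), round(height * scale_by)


# Hypothetical 1212x1730 portrait input, scaled to 0.5 megapixels.
new_w, new_h = scale_to_total_pixels(1212, 1730, 0.5)
print(new_w, new_h)
```

With this hypothetical input the result is 606×865, the same size displayed in the GetImageSize node of the workflow.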
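Wan2.2 5B model (TI2V-5B)
Wan2.2-TI2V-5B is a TI2V model that handles both text2video and image2video with a single model. By combining a higher-compression VAE with patchification, it can generate roughly 5 seconds of 720p, 24fps video at a lower compute cost than 14B.
Rather than a 1.3B-style scaled-down version of 14B, it is better understood as a separate line whose design differs at a more fundamental level.
The design is interesting, but unfortunately it still cannot match 14B in output quality, and in practice it sees very little use.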
Recommended settings
- Recommended resolution
- 720p (1280×720; the workflows below use 1280×704)
- Maximum frame count
- 121 frames
- FPS
- 24fps
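As a quick sanity check on these numbers, the sketch below converts a frame count to a duration and snaps a desired duration to a usable frame count. It assumes frame counts follow a 4n+1 pattern. The values 81 and 121, and the 53 and 65 used in the workflows below, all fit that pattern, but the article itself does not state the rule. The function names are hypothetical.

```python
def duration_seconds(frames: int, fps: float) -> float:
    """Playback length of a clip with `frames` frames at `fps`."""
    return frames / fps


def snap_frames(seconds: float, fps: float, max_frames: int = 121) -> int:
    """Nearest 4n+1 frame count for the requested duration, capped at max_frames."""
    wanted = seconds * fps
    # Assumption: valid frame counts have the form 4n + 1.
    n = max(0, round((wanted - 1) / 4))
    return min(4 * n + 1, max_frames)


print(round(duration_seconds(121, 24), 2))  # 5.04
print(snap_frames(5, 24))                   # 121
print(snap_frames(2, 24))                   # 49
```

At 24fps, the 121-frame maximum corresponds to the "about 5 seconds" mentioned above.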
Model downloads
- diffusion_models
- vae
- gguf (optional)
Example file layout:
📂ComfyUI/
└── 📂models/
├── 📂diffusion_models/
│ └── wan2.2_ti2v_5B_fp16.safetensors
├── 📂unet/
│ └── Wan2.2-TI2V-5B-XXXX.gguf ← only when using gguf
└── 📂vae/
└── wan2.2_vae.safetensors
text2video (5B)

{
"id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
"revision": 0,
"last_node_id": 165,
"last_link_id": 341,
"nodes": [
{
"id": 38,
"type": "CLIPLoader",
"pos": [
56.288665771484375,
312.74468994140625
],
"size": [
301.3524169921875,
106
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"slot_index": 0,
"links": [
74,
75
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPLoader"
},
"widgets_values": [
"umt5_xxl_fp8_e4m3fn_scaled.safetensors",
"wan",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 48,
"type": "ModelSamplingSD3",
"pos": [
627.1928100585938,
49.284149169921875
],
"size": [
210,
58
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 134
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
329
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 6,
"type": "CLIPTextEncode",
"pos": [
417.9232177734375,
186
],
"size": [
419.26959228515625,
148.8194122314453
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 74
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
330
]
}
],
"title": "CLIP Text Encode (Positive Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"A high-quality, ultra-detailed image of a kingfisher diving into a crystal-clear pool of water, captured at the precise moment its beak touches the surface. The bird’s vibrant blue and orange feathers are sharply defined, with water droplets suspended in the air around it. The background features a softly blurred natural riverside setting, with lush green foliage and gentle sunlight filtering through the trees, creating a serene and dynamic scene."
]
},
{
"id": 7,
"type": "CLIPTextEncode",
"pos": [
417.8738708496094,
389
],
"size": [
419.3189392089844,
138.8924560546875
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 75
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
331
]
}
],
"title": "CLIP Text Encode (Negative Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走 "
]
},
{
"id": 73,
"type": "UNETLoader",
"pos": [
329.4610595703125,
46.62214660644531
],
"size": [
270,
82
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
134
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Wan2.2\\wan2.2_ti2v_5B_fp16.safetensors",
"default"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 151,
"type": "MarkdownNote",
"pos": [
-87.05850982666016,
-15.094049453735352
],
"size": [
366.41766357421875,
229.31654357910156
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n- [wan2.2_ti2v_5B_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_ti2v_5B_fp16.safetensors)\n- [umt5_xxl.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/text_encoders)\n- [wan2.2_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/vae/wan2.2_vae.safetensors)\n\n\n```\n📂ComfyUI/\n└──📂models/\n ├── 📂diffusion_models/\n │ └── wan2.2_ti2v_5B_fp16.safetensors\n ├── 📂text_encoders/\n │ └── umt5_xxl (fp16 or fp8).safetensors\n └── 📂vae/\n └── wan2.2_vae.safetensors\n\n```"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 161,
"type": "Wan22ImageToVideoLatent",
"pos": [
563.1520385742188,
593.4163818359375
],
"size": [
271.9126892089844,
150
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "vae",
"type": "VAE",
"link": 336
},
{
"name": "start_image",
"shape": 7,
"type": "IMAGE",
"link": null
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
335
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "Wan22ImageToVideoLatent"
},
"widgets_values": [
1280,
704,
53,
1
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 160,
"type": "KSampler",
"pos": [
877.5444946289062,
186
],
"size": [
270,
262
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 329
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 330
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 331
},
{
"name": "latent_image",
"type": "LATENT",
"link": 335
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
332
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "KSampler"
},
"widgets_values": [
123456,
"fixed",
20,
4,
"uni_pc",
"simple",
1
]
},
{
"id": 39,
"type": "VAELoader",
"pos": [
277.54449462890625,
596.2767333984375
],
"size": [
265.22003173828125,
58
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"slot_index": 0,
"links": [
76,
336
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"wan2.2_vae.safetensors"
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 8,
"type": "VAEDecode",
"pos": [
1179.2911376953125,
186
],
"size": [
157.56002807617188,
46
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 332
},
{
"name": "vae",
"type": "VAE",
"link": 76
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"slot_index": 0,
"links": [
341
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAEDecode"
},
"widgets_values": []
},
{
"id": 113,
"type": "VHS_VideoCombine",
"pos": [
1368.597900390625,
186
],
"size": [
372.2688903808594,
541.7478637695312
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 341
},
{
"name": "audio",
"shape": 7,
"type": "AUDIO",
"link": null
},
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "Filenames",
"type": "VHS_FILENAMES",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "a7ce59e381934733bfae03b1be029756d6ce936d",
"Node name for S&R": "VHS_VideoCombine"
},
"widgets_values": {
"frame_rate": 24,
"loop_count": 0,
"filename_prefix": "Wan2.2",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "Wan2.2_00006.mp4",
"subfolder": "",
"type": "output",
"format": "video/h264-mp4",
"frame_rate": 24,
"workflow": "Wan2.2_00006.png",
"fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\Wan2.2_00006.mp4"
}
}
}
}
],
"links": [
[
74,
38,
0,
6,
0,
"CLIP"
],
[
75,
38,
0,
7,
0,
"CLIP"
],
[
76,
39,
0,
8,
1,
"VAE"
],
[
134,
73,
0,
48,
0,
"MODEL"
],
[
329,
48,
0,
160,
0,
"MODEL"
],
[
330,
6,
0,
160,
1,
"CONDITIONING"
],
[
331,
7,
0,
160,
2,
"CONDITIONING"
],
[
332,
160,
0,
8,
0,
"LATENT"
],
[
335,
161,
0,
160,
3,
"LATENT"
],
[
336,
39,
0,
161,
0,
"VAE"
],
[
341,
8,
0,
113,
0,
"IMAGE"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 0.7513148009015777,
"offset": [
187.05850982666016,
115.09404945373535
]
},
"frontendVersion": "1.25.1",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
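In 5B text2video, the video is generated internally by way of a "first-frame latent".
- 🟥 Use wan2.2_vae as the VAE. Its compression structure differs from the Wan2.1 VAE, so forgetting to swap it badly degrades both image quality and motion.
- 🟩 Even for text2video, insert the TI2V latent node (e.g. Wan22ImageToVideoLatent), leaving start_image unconnected. 5B is designed around a "1-frame latent → video" pipeline, so a setup that skips this node is not an intended configuration.
Thinking of it as "text2video on the surface, but a special case of image2video underneath" makes it easier to organize alongside other TI2V models.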
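To make the VAE mismatch concrete, the sketch below compares the latent shape each VAE would imply for the same video. The compression factors and channel counts are assumptions on my part (Wan2.1 VAE: 8× spatial, 4× temporal, 16 channels; Wan2.2 VAE: 16× spatial, 4× temporal, 48 channels), not figures from this article, so check them against the model documentation before relying on them. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VaeSpec:
    channels: int  # latent channels
    spatial: int   # spatial downscale factor (per side)
    temporal: int  # temporal downscale factor


# Assumed values; not taken from the article.
WAN21_VAE = VaeSpec(channels=16, spatial=8, temporal=4)
WAN22_VAE = VaeSpec(channels=48, spatial=16, temporal=4)


def latent_shape(width: int, height: int, frames: int, vae: VaeSpec) -> tuple[int, int, int, int]:
    """Return (channels, latent_frames, latent_height, latent_width)."""
    # The first frame is kept on its own; the rest are grouped by the temporal factor.
    latent_frames = (frames - 1) // vae.temporal + 1
    return (vae.channels, latent_frames, height // vae.spatial, width // vae.spatial)


print(latent_shape(1280, 704, 53, WAN22_VAE))
print(latent_shape(1280, 704, 53, WAN21_VAE))
```

Under these assumptions, the 1280×704, 53-frame settings from the workflow above give (48, 14, 44, 80) with the Wan2.2 VAE and (16, 14, 88, 160) with the Wan2.1 VAE, which illustrates why the two are not interchangeable.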
image2video (5B)

{
"id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
"revision": 0,
"last_node_id": 168,
"last_link_id": 347,
"nodes": [
{
"id": 38,
"type": "CLIPLoader",
"pos": [
56.288665771484375,
312.74468994140625
],
"size": [
301.3524169921875,
106
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"slot_index": 0,
"links": [
74,
75
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPLoader"
},
"widgets_values": [
"umt5_xxl_fp8_e4m3fn_scaled.safetensors",
"wan",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 48,
"type": "ModelSamplingSD3",
"pos": [
627.1928100585938,
49.284149169921875
],
"size": [
210,
58
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 134
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
329
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 7,
"type": "CLIPTextEncode",
"pos": [
417.8738708496094,
389
],
"size": [
419.3189392089844,
138.8924560546875
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 75
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
331
]
}
],
"title": "CLIP Text Encode (Negative Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走 "
]
},
{
"id": 73,
"type": "UNETLoader",
"pos": [
329.4610595703125,
46.62214660644531
],
"size": [
270,
82
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
134
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Wan2.2\\wan2.2_ti2v_5B_fp16.safetensors",
"default"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 151,
"type": "MarkdownNote",
"pos": [
-87.05850982666016,
-15.094049453735352
],
"size": [
366.41766357421875,
229.31654357910156
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n- [wan2.2_ti2v_5B_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_ti2v_5B_fp16.safetensors)\n- [umt5_xxl.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/text_encoders)\n- [wan2.2_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/vae/wan2.2_vae.safetensors)\n\n\n```\n📂ComfyUI/\n└──📂models/\n ├── 📂diffusion_models/\n │ └── wan2.2_ti2v_5B_fp16.safetensors\n ├── 📂text_encoders/\n │ └── umt5_xxl (fp16 or fp8).safetensors\n └── 📂vae/\n └── wan2.2_vae.safetensors\n\n```"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 39,
"type": "VAELoader",
"pos": [
277.54449462890625,
596.2767333984375
],
"size": [
265.22003173828125,
58
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"slot_index": 0,
"links": [
76,
336
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"wan2.2_vae.safetensors"
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 8,
"type": "VAEDecode",
"pos": [
1179.2911376953125,
186
],
"size": [
157.56002807617188,
46
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 332
},
{
"name": "vae",
"type": "VAE",
"link": 76
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"slot_index": 0,
"links": [
341
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAEDecode"
},
"widgets_values": []
},
{
"id": 113,
"type": "VHS_VideoCombine",
"pos": [
1368.597900390625,
186
],
"size": [
372.2688903808594,
886.1885986328125
],
"flags": {},
"order": 13,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 341
},
{
"name": "audio",
"shape": 7,
"type": "AUDIO",
"link": null
},
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "Filenames",
"type": "VHS_FILENAMES",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "a7ce59e381934733bfae03b1be029756d6ce936d",
"Node name for S&R": "VHS_VideoCombine"
},
"widgets_values": {
"frame_rate": 24,
"loop_count": 0,
"filename_prefix": "Wan2.2",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "Wan2.2_00012.mp4",
"subfolder": "",
"type": "output",
"format": "video/h264-mp4",
"frame_rate": 24,
"workflow": "Wan2.2_00012.png",
"fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\Wan2.2_00012.mp4"
}
}
}
},
{
"id": 167,
"type": "ImageScaleToTotalPixels",
"pos": [
93.42740631103516,
723.7991943359375
],
"size": [
270,
82
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 343
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
344,
345
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "ImageScaleToTotalPixels"
},
"widgets_values": [
"nearest-exact",
0.5
]
},
{
"id": 168,
"type": "GetImageSize",
"pos": [
393.28973388671875,
771.4497680664062
],
"size": [
140,
124
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 345
}
],
"outputs": [
{
"name": "width",
"type": "INT",
"links": [
346
]
},
{
"name": "height",
"type": "INT",
"links": [
347
]
},
{
"name": "batch_size",
"type": "INT",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "GetImageSize"
},
"widgets_values": []
},
{
"id": 161,
"type": "Wan22ImageToVideoLatent",
"pos": [
563.1520385742188,
593.4163818359375
],
"size": [
271.9126892089844,
150
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "vae",
"type": "VAE",
"link": 336
},
{
"name": "start_image",
"shape": 7,
"type": "IMAGE",
"link": 344
},
{
"name": "width",
"type": "INT",
"widget": {
"name": "width"
},
"link": 346
},
{
"name": "height",
"type": "INT",
"widget": {
"name": "height"
},
"link": 347
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
335
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "Wan22ImageToVideoLatent"
},
"widgets_values": [
1280,
704,
65,
1
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 166,
"type": "LoadImage",
"pos": [
-210.51498413085938,
721.4742431640625
],
"size": [
275.080078125,
478
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
343
]
},
{
"name": "MASK",
"type": "MASK",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"cloud.png",
"image"
]
},
{
"id": 6,
"type": "CLIPTextEncode",
"pos": [
417.9232177734375,
186
],
"size": [
419.26959228515625,
148.8194122314453
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 74
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
330
]
}
],
"title": "CLIP Text Encode (Positive Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"Anime girl with light blue hair gradually dissolving into white clouds at the tips. She is looking upwards against a bright blue sky with soft white clouds"
]
},
{
"id": 160,
"type": "KSampler",
"pos": [
877.5444946289062,
186
],
"size": [
270,
262
],
"flags": {},
"order": 11,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 329
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 330
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 331
},
{
"name": "latent_image",
"type": "LATENT",
"link": 335
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
332
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "KSampler"
},
"widgets_values": [
1234,
"fixed",
20,
4,
"uni_pc",
"simple",
1
]
}
],
"links": [
[
74,
38,
0,
6,
0,
"CLIP"
],
[
75,
38,
0,
7,
0,
"CLIP"
],
[
76,
39,
0,
8,
1,
"VAE"
],
[
134,
73,
0,
48,
0,
"MODEL"
],
[
329,
48,
0,
160,
0,
"MODEL"
],
[
330,
6,
0,
160,
1,
"CONDITIONING"
],
[
331,
7,
0,
160,
2,
"CONDITIONING"
],
[
332,
160,
0,
8,
0,
"LATENT"
],
[
335,
161,
0,
160,
3,
"LATENT"
],
[
336,
39,
0,
161,
0,
"VAE"
],
[
341,
8,
0,
113,
0,
"IMAGE"
],
[
343,
166,
0,
167,
0,
"IMAGE"
],
[
344,
167,
0,
161,
1,
"IMAGE"
],
[
345,
167,
0,
168,
0,
"IMAGE"
],
[
346,
168,
0,
161,
2,
"INT"
],
[
347,
168,
1,
161,
3,
"INT"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 0.6830134553650707,
"offset": [
310.5149841308594,
115.09404945373535
]
},
"frontendVersion": "1.25.1",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
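image2video uses the same TI2V model as text2video. Only the inputs increase; everything from the KSampler onward is nearly identical.
- 🟦 Feed the start image into the TI2V latent node to create the compressed latent.
- 🟩 Because one model covers both text2video and image2video, it is easy to settle your workflow on 5B first and then add 14B as needed.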
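Since the two 5B workflows differ mainly in whether start_image is connected, a small script can tell them apart. This sketch assumes the workflow JSON layout shown above: a top-level nodes list whose entries carry type and inputs, with link set to null when an input is unconnected. The function name is hypothetical.

```python
import json


def ti2v_mode(workflow: dict) -> str:
    """Classify a 5B workflow as text2video or image2video via the start_image link."""
    for node in workflow.get("nodes", []):
        if node.get("type") != "Wan22ImageToVideoLatent":
            continue
        for inp in node.get("inputs", []):
            if inp.get("name") == "start_image":
                # A null link means no image is wired in.
                return "image2video" if inp.get("link") is not None else "text2video"
        return "text2video"  # node present, but no start_image input listed
    raise ValueError("no Wan22ImageToVideoLatent node found")


# Minimal stand-in for a saved workflow file.
sample = json.loads(
    '{"nodes": [{"type": "Wan22ImageToVideoLatent",'
    ' "inputs": [{"name": "start_image", "link": null}]}]}'
)
print(ti2v_mode(sample))  # text2video
```

To run it on a real file, load the file with json.load and pass the resulting dict in.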