What is Wan 2.2?
Wan 2.2 is a family of video generation models and the direct successor to Wan 2.1. It consists of two models:
- 14B: A two-stage architecture that switches between high_noise and low_noise models.
- 5B: A TI2V model that handles text2video and image2video with a single model plus the Wan 2.2 VAE.
Wan 2.2 14B Model
Wan 2.2-A14B uses a two-stage pipeline: the high_noise model handles the first half of sampling and the low_noise model handles the second half.
Splitting the work across two models doubles the total parameter count to improve quality, while VRAM usage stays the same as Wan 2.1, since only one model is active at a time.
Recommended Settings
- Recommended Resolution
  - 480p (854×480)
  - 720p (1280×720)
- Maximum Number of Frames
  - 81 frames
- FPS
  - Output is typically around 16 fps.
  - However, since 16 fps often results in slow-motion-looking video, adjust by saving at 24 fps or by dropping frames.
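The retiming arithmetic is simple: keeping all 81 frames and writing the file at 24 fps plays the clip 1.5× faster than at 16 fps, with no frames dropped. A minimal sketch:

```python
def duration_seconds(num_frames: int, fps: float) -> float:
    # Playback duration for a fixed set of frames at a given frame rate.
    return num_frames / fps

native = duration_seconds(81, 16)   # 5.0625 s as generated
retimed = duration_seconds(81, 24)  # 3.375 s when saved at 24 fps
speedup = native / retimed          # 1.5x faster playback
```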
Model Download
- diffusion_models
- text_encoders
  - umt5_xxl_fp8_e4m3fn_scaled.safetensors (same as Wan 2.1)
- vae
  - wan_2.1_vae.safetensors (same as Wan 2.1)
- gguf (optional)
📂ComfyUI/
└── 📂models/
├── 📂diffusion_models/
│ ├── wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors
│ ├── wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors
│ ├── wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors
│ └── wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors
├── 📂text_encoders/
│ └── umt5_xxl_fp8_e4m3fn_scaled.safetensors
├── 📂unet/
│ ├── wan2.2_t2v_high_noise_14B-XXXX.gguf ← Only when using gguf
│ ├── wan2.2_t2v_low_noise_14B-XXXX.gguf ← Only when using gguf
│ ├── wan2.2_i2v_high_noise_14B-XXXX.gguf ← Only when using gguf
│ └── wan2.2_i2v_low_noise_14B-XXXX.gguf ← Only when using gguf
└── 📂vae/
└── wan_2.1_vae.safetensors
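A quick sanity check for the layout above (the base path is an assumption — point it at your own install, and skip the unet/ gguf files if you are not using them):

```python
from pathlib import Path

COMFYUI = Path("ComfyUI")  # assumption: adjust to your install location

expected = [
    "models/diffusion_models/wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors",
    "models/diffusion_models/wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors",
    "models/diffusion_models/wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors",
    "models/diffusion_models/wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors",
    "models/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors",
    "models/vae/wan_2.1_vae.safetensors",
]

# Report any files that are not where the workflows expect them.
missing = [p for p in expected if not (COMFYUI / p).is_file()]
for p in missing:
    print("missing:", p)
```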
text2video (14B)
Use two KSampler (Advanced) nodes: process the first half of the steps with the high_noise model and the second half with the low_noise model.
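The handoff between the two samplers can be summarized as follows (values read from the workflow JSON below; the dicts are only an illustration, not a ComfyUI API):

```python
TOTAL_STEPS = 20
SWITCH_STEP = 4  # shared by both samplers via the start_at_step primitive node

high_noise_stage = {
    "add_noise": "enable",                   # this stage creates the initial noise
    "cfg": 4,
    "steps": TOTAL_STEPS,
    "start_at_step": 0,
    "end_at_step": SWITCH_STEP,
    "return_with_leftover_noise": "enable",  # hand off a still-noisy latent
}

low_noise_stage = {
    "add_noise": "disable",                  # continue from the handed-off latent
    "cfg": 3,
    "steps": TOTAL_STEPS,
    "start_at_step": SWITCH_STEP,
    "end_at_step": 9999,                     # i.e. run through the final step
    "return_with_leftover_noise": "disable",
}

# The two stages partition the 20 steps with no gap or overlap.
assert high_noise_stage["end_at_step"] == low_noise_stage["start_at_step"]
```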

{
"id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
"revision": 0,
"last_node_id": 159,
"last_link_id": 327,
"nodes": [
{
"id": 38,
"type": "CLIPLoader",
"pos": [
56.288665771484375,
312.74468994140625
],
"size": [
301.3524169921875,
106
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"slot_index": 0,
"links": [
74,
75
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPLoader"
},
"widgets_values": [
"umt5_xxl_fp8_e4m3fn_scaled.safetensors",
"wan",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 8,
"type": "VAEDecode",
"pos": [
1613.413330078125,
176.01361083984375
],
"size": [
157.56002807617188,
46
],
"flags": {},
"order": 13,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 317
},
{
"name": "vae",
"type": "VAE",
"link": 76
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"slot_index": 0,
"links": [
256
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAEDecode"
},
"widgets_values": []
},
{
"id": 7,
"type": "CLIPTextEncode",
"pos": [
417.8738708496094,
389
],
"size": [
419.3189392089844,
138.8924560546875
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 75
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
319,
321
]
}
],
"title": "CLIP Text Encode (Negative Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走 "
]
},
{
"id": 48,
"type": "ModelSamplingSD3",
"pos": [
627.1928100585938,
49.284149169921875
],
"size": [
210,
58
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 134
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
322
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 157,
"type": "ModelSamplingSD3",
"pos": [
1005.2630004882812,
-109.75807189941406
],
"size": [
210,
58
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 323
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
324
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 73,
"type": "UNETLoader",
"pos": [
329.4610595703125,
46.62214660644531
],
"size": [
270,
82
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
134
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Wan2.2\\wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors",
"fp8_e4m3fn"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 156,
"type": "UNETLoader",
"pos": [
705.2611694335938,
-109.75807189941406
],
"size": [
270,
82
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
323
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Wan2.2\\wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors",
"fp8_e4m3fn"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 151,
"type": "MarkdownNote",
"pos": [
-113.67849731445312,
-40.50404357910156
],
"size": [
405.56439208984375,
254.14065551757812
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n- [wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors)\n- [wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors)\n\n- [umt5_xxl.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/text_encoders)\n- [wan_2.1_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/vae/wan_2.1_vae.safetensors)\n\n\n```\n📂ComfyUI/\n└──📂models/\n ├── 📂diffusion_models/\n │ ├── wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors\n │ └── wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors\n ├── 📂text_encoders/\n │ └── umt5_xxl (fp16 or fp8).safetensors\n └── 📂vae/\n └── wan_2.1_vae.safetensors\n\n```"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 113,
"type": "VHS_VideoCombine",
"pos": [
1805.07373046875,
176.01361083984375
],
"size": [
372.2688903808594,
334
],
"flags": {},
"order": 14,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 256
},
{
"name": "audio",
"shape": 7,
"type": "AUDIO",
"link": null
},
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "Filenames",
"type": "VHS_FILENAMES",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "a7ce59e381934733bfae03b1be029756d6ce936d",
"Node name for S&R": "VHS_VideoCombine"
},
"widgets_values": {
"frame_rate": 24,
"loop_count": 0,
"filename_prefix": "Wan2.2",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "Wan2.2_00090.mp4",
"subfolder": "",
"type": "output",
"format": "video/h264-mp4",
"frame_rate": 24,
"workflow": "Wan2.2_00090.png",
"fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\Wan2.2_00090.mp4"
}
}
}
},
{
"id": 153,
"type": "EmptyHunyuanLatentVideo",
"pos": [
567.0985107421875,
585.5717163085938
],
"size": [
270.0943298339844,
130
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
327
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "EmptyHunyuanLatentVideo"
},
"widgets_values": [
848,
480,
81,
1
]
},
{
"id": 158,
"type": "PrimitiveNode",
"pos": [
627.1928100585938,
772.84326171875
],
"size": [
210,
82
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "INT",
"type": "INT",
"widget": {
"name": "start_at_step"
},
"links": [
325,
326
]
}
],
"title": "start_at_step",
"properties": {
"Run widget replace on values": false
},
"widgets_values": [
4,
"fixed"
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 6,
"type": "CLIPTextEncode",
"pos": [
417.9232177734375,
186
],
"size": [
419.26959228515625,
148.8194122314453
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 74
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
318,
320
]
}
],
"title": "CLIP Text Encode (Positive Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"A high-quality, ultra-detailed image of a kingfisher diving into a crystal-clear pool of water, captured at the precise moment its beak touches the surface. The bird’s vibrant blue and orange feathers are sharply defined, with water droplets suspended in the air around it. The background features a softly blurred natural riverside setting, with lush green foliage and gentle sunlight filtering through the trees, creating a serene and dynamic scene."
]
},
{
"id": 154,
"type": "KSamplerAdvanced",
"pos": [
925.6195068359375,
171.68209838867188
],
"size": [
304.748046875,
334
],
"flags": {},
"order": 11,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 322
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 318
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 319
},
{
"name": "latent_image",
"type": "LATENT",
"link": 327
},
{
"name": "end_at_step",
"type": "INT",
"widget": {
"name": "end_at_step"
},
"link": 326
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
316
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "KSamplerAdvanced"
},
"widgets_values": [
"enable",
12345,
"fixed",
20,
4,
"euler",
"simple",
0,
4,
"enable"
]
},
{
"id": 155,
"type": "KSamplerAdvanced",
"pos": [
1274.5648193359375,
176.01361083984375
],
"size": [
304.748046875,
334
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 324
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 320
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 321
},
{
"name": "latent_image",
"type": "LATENT",
"link": 316
},
{
"name": "start_at_step",
"type": "INT",
"widget": {
"name": "start_at_step"
},
"link": 325
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
317
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "KSamplerAdvanced"
},
"widgets_values": [
"disable",
0,
"fixed",
20,
3,
"euler",
"simple",
4,
9999,
"disable"
]
},
{
"id": 39,
"type": "VAELoader",
"pos": [
1314.0928344726562,
49.20225813953267
],
"size": [
265.22003173828125,
58
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"slot_index": 0,
"links": [
76
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"wan_2.1_vae.safetensors"
],
"color": "#322",
"bgcolor": "#533"
}
],
"links": [
[
74,
38,
0,
6,
0,
"CLIP"
],
[
75,
38,
0,
7,
0,
"CLIP"
],
[
76,
39,
0,
8,
1,
"VAE"
],
[
134,
73,
0,
48,
0,
"MODEL"
],
[
256,
8,
0,
113,
0,
"IMAGE"
],
[
316,
154,
0,
155,
3,
"LATENT"
],
[
317,
155,
0,
8,
0,
"LATENT"
],
[
318,
6,
0,
154,
1,
"CONDITIONING"
],
[
319,
7,
0,
154,
2,
"CONDITIONING"
],
[
320,
6,
0,
155,
1,
"CONDITIONING"
],
[
321,
7,
0,
155,
2,
"CONDITIONING"
],
[
322,
48,
0,
154,
0,
"MODEL"
],
[
323,
156,
0,
157,
0,
"MODEL"
],
[
324,
157,
0,
155,
0,
"MODEL"
],
[
325,
158,
0,
155,
4,
"INT"
],
[
326,
158,
0,
154,
4,
"INT"
],
[
327,
153,
0,
154,
3,
"LATENT"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 0.7513148009015777,
"offset": [
213.67849731445312,
209.75807189941406
]
},
"frontendVersion": "1.35.0",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
- 🟩 Specify at which step out of the total 20 steps to switch from high_noise -> low_noise.
  - For the timing of this switch, a 1:1 ratio between the "denoised" portion and the "still-noisy" portion is recommended.
  - Although it is possible to calculate this, it is difficult because the Sampler, Scheduler, sigma_shift, and number of steps are all intertwined.
  - Also, matching this ratio perfectly does not necessarily mean it is optimal.
  - This workflow switches at step 4; treat that as a baseline and experiment.
- 🟨🟥 The text encoder and VAE are the same as Wan 2.1.
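To get a feel for why the crossover is hard to compute, here is a minimal sketch of how sigma_shift reshapes the schedule. Both the shift formula and the evenly spaced raw sigmas are assumptions for illustration; ComfyUI's actual schedulers differ in detail:

```python
def shifted_sigma(sigma: float, shift: float) -> float:
    # Flow time-shift remap, assumed from common ModelSamplingSD3
    # descriptions: sigma' = shift * sigma / (1 + (shift - 1) * sigma)
    return shift * sigma / (1 + (shift - 1) * sigma)

def schedule(steps: int, shift: float) -> list[float]:
    # Evenly spaced raw sigmas from 1.0 down to 0.0, then remapped by the
    # shift -- a rough stand-in for the "simple" scheduler.
    return [shifted_sigma(1 - i / steps, shift) for i in range(steps + 1)]

sched = schedule(20, 8)      # 20 steps, sigma_shift = 8 as in this workflow
sigma_at_switch = sched[4]   # noise level where this workflow hands off
# With shift = 8 the schedule stays near 1.0 for many steps (sched[4] is
# about 0.97), which is why the "ideal" switch step cannot be read off
# the step count alone.
```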
image2video (14B)

{
"id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
"revision": 0,
"last_node_id": 164,
"last_link_id": 351,
"nodes": [
{
"id": 38,
"type": "CLIPLoader",
"pos": [
35.79126739501953,
314.20880126953125
],
"size": [
301.3524169921875,
106
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"slot_index": 0,
"links": [
74,
75
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPLoader"
},
"widgets_values": [
"umt5_xxl_fp8_e4m3fn_scaled.safetensors",
"wan",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 8,
"type": "VAEDecode",
"pos": [
1846.9432373046875,
185.6936798095703
],
"size": [
157.56002807617188,
46
],
"flags": {},
"order": 16,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 317
},
{
"name": "vae",
"type": "VAE",
"link": 76
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"slot_index": 0,
"links": [
256
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAEDecode"
},
"widgets_values": []
},
{
"id": 160,
"type": "LoadImage",
"pos": [
46.66357421875,
723.9078369140625
],
"size": [
334.93658447265625,
388.3801574707031
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
330
]
},
{
"name": "MASK",
"type": "MASK",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"Clipboard - 2025-05-13 21.27.11.png",
"image"
]
},
{
"id": 73,
"type": "UNETLoader",
"pos": [
624.2608642578125,
44.42214584350586
],
"size": [
270,
82
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
134
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Wan2.2\\wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors",
"fp8_e4m3fn"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 7,
"type": "CLIPTextEncode",
"pos": [
397.4257507324219,
390.464111328125
],
"size": [
419.3189392089844,
138.8924560546875
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 75
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
341
]
}
],
"title": "CLIP Text Encode (Negative Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走 "
]
},
{
"id": 48,
"type": "ModelSamplingSD3",
"pos": [
921.9925537109375,
47.08414840698242
],
"size": [
210,
58
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 134
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
322
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 154,
"type": "KSamplerAdvanced",
"pos": [
1170.0396728515625,
186.20217895507812
],
"size": [
304.748046875,
334
],
"flags": {},
"order": 14,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 322
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 342
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 343
},
{
"name": "latent_image",
"type": "LATENT",
"link": 346
},
{
"name": "end_at_step",
"type": "INT",
"widget": {
"name": "end_at_step"
},
"link": 326
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
316
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "KSamplerAdvanced"
},
"widgets_values": [
"enable",
1234,
"fixed",
20,
4,
"euler",
"simple",
0,
4,
"enable"
]
},
{
"id": 162,
"type": "GetImageSize",
"pos": [
676.7446899414062,
723.2024536132812
],
"size": [
140,
124
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 348
}
],
"outputs": [
{
"name": "width",
"type": "INT",
"links": [
350
]
},
{
"name": "height",
"type": "INT",
"links": [
351
]
},
{
"name": "batch_size",
"type": "INT",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "GetImageSize"
},
"widgets_values": [
"width: 724, height: 724\n batch size: 1"
]
},
{
"id": 6,
"type": "CLIPTextEncode",
"pos": [
397.47509765625,
187.46409606933594
],
"size": [
419.26959228515625,
148.8194122314453
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 74
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
340
]
}
],
"title": "CLIP Text Encode (Positive Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"Photorealistic style. A young girl with blonde braided hair, wearing a white dress, stands in a lush green garden filled with warm afternoon sunlight. She carefully holds a glass jar with both hands, watching a small goldfish swim gracefully inside with a gentle smile. The camera starts with a medium shot focused on the girl and the jar, then slowly and smoothly zooms in for a close-up of the goldfish. The overall atmosphere is soft, calm, and heartwarming."
]
},
{
"id": 39,
"type": "VAELoader",
"pos": [
551.524658203125,
597.2347412109375
],
"size": [
265.22003173828125,
58
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"slot_index": 0,
"links": [
76,
347
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"wan_2.1_vae.safetensors"
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 161,
"type": "ImageScaleToTotalPixels",
"pos": [
405.4178161621094,
723.9078369140625
],
"size": [
239.38699340820312,
82
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 330
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
348,
349
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "ImageScaleToTotalPixels"
},
"widgets_values": [
"nearest-exact",
0.5
]
},
{
"id": 151,
"type": "MarkdownNote",
"pos": [
-113.67849731445312,
-40.50404357910156
],
"size": [
405.56439208984375,
254.14065551757812
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n- [wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors)\n- [wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors)\n\n- [umt5_xxl.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/text_encoders)\n- [wan_2.1_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/vae/wan_2.1_vae.safetensors)\n\n\n```\n📂ComfyUI/\n└──📂models/\n ├── 📂diffusion_models/\n │ ├── wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors\n │ └── wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors\n ├── 📂text_encoders/\n │ └── umt5_xxl (fp16 or fp8).safetensors\n └── 📂vae/\n └── wan_2.1_vae.safetensors\n\n```"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 157,
"type": "ModelSamplingSD3",
"pos": [
1264.7877197265625,
-104.88626098632812
],
"size": [
210,
58
],
"flags": {},
"order": 11,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 323
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
324
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 156,
"type": "UNETLoader",
"pos": [
959.7449340820312,
-104.88626098632812
],
"size": [
270,
82
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
323
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Wan2.2\\wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors",
"fp8_e4m3fn"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 113,
"type": "VHS_VideoCombine",
"pos": [
2038.6036376953125,
185.6936798095703
],
"size": [
372.2688903808594,
700.2689208984375
],
"flags": {},
"order": 17,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 256
},
{
"name": "audio",
"shape": 7,
"type": "AUDIO",
"link": null
},
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "Filenames",
"type": "VHS_FILENAMES",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "a7ce59e381934733bfae03b1be029756d6ce936d",
"Node name for S&R": "VHS_VideoCombine"
},
"widgets_values": {
"frame_rate": 24,
"loop_count": 0,
"filename_prefix": "Wan2.2",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "Wan2.2_00091.mp4",
"subfolder": "",
"type": "output",
"format": "video/h264-mp4",
"frame_rate": 24,
"workflow": "Wan2.2_00091.png",
"fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\Wan2.2_00091.mp4"
}
}
}
},
{
"id": 158,
"type": "PrimitiveNode",
"pos": [
920.8279418945312,
588.53564453125
],
"size": [
210,
82
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "INT",
"type": "INT",
"widget": {
"name": "start_at_step"
},
"links": [
325,
326
]
}
],
"title": "start_at_step",
"properties": {
"Run widget replace on values": false
},
"widgets_values": [
4,
"fixed"
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 164,
"type": "WanImageToVideo",
"pos": [
870.6785278320312,
204.1481475830078
],
"size": [
270,
210
],
"flags": {},
"order": 13,
"mode": 0,
"inputs": [
{
"name": "positive",
"type": "CONDITIONING",
"link": 340
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 341
},
{
"name": "vae",
"type": "VAE",
"link": 347
},
{
"name": "clip_vision_output",
"shape": 7,
"type": "CLIP_VISION_OUTPUT",
"link": null
},
{
"name": "start_image",
"shape": 7,
"type": "IMAGE",
"link": 349
},
{
"name": "width",
"type": "INT",
"widget": {
"name": "width"
},
"link": 350
},
{
"name": "height",
"type": "INT",
"widget": {
"name": "height"
},
"link": 351
}
],
"outputs": [
{
"name": "positive",
"type": "CONDITIONING",
"links": [
342,
344
]
},
{
"name": "negative",
"type": "CONDITIONING",
"links": [
343,
345
]
},
{
"name": "latent",
"type": "LATENT",
"links": [
346
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "WanImageToVideo"
},
"widgets_values": [
832,
480,
81,
1
],
"color": "#2a363b",
"bgcolor": "#3f5159"
},
{
"id": 155,
"type": "KSamplerAdvanced",
"pos": [
1510.51513671875,
185.6936798095703
],
"size": [
304.748046875,
334
],
"flags": {},
"order": 15,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 324
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 344
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 345
},
{
"name": "latent_image",
"type": "LATENT",
"link": 316
},
{
"name": "start_at_step",
"type": "INT",
"widget": {
"name": "start_at_step"
},
"link": 325
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
317
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "KSamplerAdvanced"
},
"widgets_values": [
"disable",
0,
"fixed",
20,
3,
"euler",
"simple",
4,
9999,
"disable"
]
}
],
"links": [
[
74,
38,
0,
6,
0,
"CLIP"
],
[
75,
38,
0,
7,
0,
"CLIP"
],
[
76,
39,
0,
8,
1,
"VAE"
],
[
134,
73,
0,
48,
0,
"MODEL"
],
[
256,
8,
0,
113,
0,
"IMAGE"
],
[
316,
154,
0,
155,
3,
"LATENT"
],
[
317,
155,
0,
8,
0,
"LATENT"
],
[
322,
48,
0,
154,
0,
"MODEL"
],
[
323,
156,
0,
157,
0,
"MODEL"
],
[
324,
157,
0,
155,
0,
"MODEL"
],
[
325,
158,
0,
155,
4,
"INT"
],
[
326,
158,
0,
154,
4,
"INT"
],
[
330,
160,
0,
161,
0,
"IMAGE"
],
[
340,
6,
0,
164,
0,
"CONDITIONING"
],
[
341,
7,
0,
164,
1,
"CONDITIONING"
],
[
342,
164,
0,
154,
1,
"CONDITIONING"
],
[
343,
164,
1,
154,
2,
"CONDITIONING"
],
[
344,
164,
0,
155,
1,
"CONDITIONING"
],
[
345,
164,
1,
155,
2,
"CONDITIONING"
],
[
346,
164,
2,
154,
3,
"LATENT"
],
[
347,
39,
0,
164,
2,
"VAE"
],
[
348,
161,
0,
162,
0,
"IMAGE"
],
[
349,
161,
0,
164,
4,
"IMAGE"
],
[
350,
162,
0,
164,
5,
"INT"
],
[
351,
162,
1,
164,
6,
"INT"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 0.6209213230591554,
"offset": [
213.67849731445312,
204.88626098632812
]
},
"frontendVersion": "1.28.0",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
- 🟦 Input the start image into the WanImageToVideo node.
- Unlike Wan 2.1, Wan 2.2 does not use clip_vision.
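In the workflow above, the input image is first resized with ImageScaleToTotalPixels (0.5 megapixels), and the resulting width/height are fed into WanImageToVideo via GetImageSize. The resize target can be sketched as follows; the exact rounding the node applies is an assumption:

```python
import math

def scale_to_total_pixels(width: int, height: int, megapixels: float) -> tuple[int, int]:
    # Scale to roughly `megapixels` total pixels while preserving aspect ratio.
    target = megapixels * 1024 * 1024
    scale = math.sqrt(target / (width * height))
    return round(width * scale), round(height * scale)

# A square 1024x1024 input at 0.5 MP lands near 724x724 -- matching the
# "width: 724, height: 724" readout shown by GetImageSize in this workflow.
w, h = scale_to_total_pixels(1024, 1024, 0.5)
```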
FLF2V (14B / First–Last Frame to Video)
Wan 2.1 had a dedicated model for FLF2V, but Wan 2.2's image2video model also supports FLF2V.
In ComfyUI, you can generate a video that interpolates between two images simply by inputting the Start / End images into the WanFirstLastFrameToVideo node.

{
"id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
"revision": 0,
"last_node_id": 171,
"last_link_id": 372,
"nodes": [
{
"id": 38,
"type": "CLIPLoader",
"pos": [
35.79126739501953,
314.20880126953125
],
"size": [
301.3524169921875,
106
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"slot_index": 0,
"links": [
74,
75
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPLoader"
},
"widgets_values": [
"umt5_xxl_fp8_e4m3fn_scaled.safetensors",
"wan",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 8,
"type": "VAEDecode",
"pos": [
1846.9432373046875,
185.6936798095703
],
"size": [
157.56002807617188,
46
],
"flags": {},
"order": 20,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 317
},
{
"name": "vae",
"type": "VAE",
"link": 76
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"slot_index": 0,
"links": [
256
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAEDecode"
},
"widgets_values": []
},
{
"id": 73,
"type": "UNETLoader",
"pos": [
624.2608642578125,
44.42214584350586
],
"size": [
270,
82
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
134
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Wan2.2\\wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors",
"fp8_e4m3fn"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 48,
"type": "ModelSamplingSD3",
"pos": [
921.9925537109375,
47.08414840698242
],
"size": [
210,
58
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 134
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
322
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 151,
"type": "MarkdownNote",
"pos": [
-113.67849731445312,
-40.50404357910156
],
"size": [
405.56439208984375,
254.14065551757812
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n- [wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors)\n- [wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors)\n\n- [umt5_xxl.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/text_encoders)\n- [wan_2.1_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/vae/wan_2.1_vae.safetensors)\n\n\n```\n📂ComfyUI/\n└──📂models/\n ├── 📂diffusion_models/\n │ ├── wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors\n │ └── wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors\n ├── 📂text_encoders/\n │ └── umt5_xxl (fp16 or fp8).safetensors\n └── 📂vae/\n └── wan_2.1_vae.safetensors\n\n```"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 157,
"type": "ModelSamplingSD3",
"pos": [
1264.7877197265625,
-104.88626098632812
],
"size": [
210,
58
],
"flags": {},
"order": 11,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 323
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
324
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 156,
"type": "UNETLoader",
"pos": [
959.7449340820312,
-104.88626098632812
],
"size": [
270,
82
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
323
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Wan2.2\\wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors",
"fp8_e4m3fn"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 113,
"type": "VHS_VideoCombine",
"pos": [
2038.6036376953125,
185.6936798095703
],
"size": [
372.2688903808594,
855.2672119140625
],
"flags": {},
"order": 21,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 256
},
{
"name": "audio",
"shape": 7,
"type": "AUDIO",
"link": null
},
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "Filenames",
"type": "VHS_FILENAMES",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "a7ce59e381934733bfae03b1be029756d6ce936d",
"Node name for S&R": "VHS_VideoCombine"
},
"widgets_values": {
"frame_rate": 24,
"loop_count": 0,
"filename_prefix": "Wan2.2",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "Wan2.2_00093.mp4",
"subfolder": "",
"type": "output",
"format": "video/h264-mp4",
"frame_rate": 24,
"workflow": "Wan2.2_00093.png",
"fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\Wan2.2_00093.mp4"
}
}
}
},
{
"id": 7,
"type": "CLIPTextEncode",
"pos": [
397.4257507324219,
390.464111328125
],
"size": [
419.3189392089844,
138.8924560546875
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 75
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
353
]
}
],
"title": "CLIP Text Encode (Negative Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走 "
]
},
{
"id": 39,
"type": "VAELoader",
"pos": [
551.524658203125,
597.2347412109375
],
"size": [
265.22003173828125,
58
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"slot_index": 0,
"links": [
76,
354
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"wan_2.1_vae.safetensors"
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 166,
"type": "WanFirstLastFrameToVideo",
"pos": [
866.294921875,
205.22122192382812
],
"size": [
275.09765625,
250
],
"flags": {},
"order": 17,
"mode": 0,
"inputs": [
{
"name": "positive",
"type": "CONDITIONING",
"link": 352
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 353
},
{
"name": "vae",
"type": "VAE",
"link": 354
},
{
"name": "clip_vision_start_image",
"shape": 7,
"type": "CLIP_VISION_OUTPUT",
"link": null
},
{
"name": "clip_vision_end_image",
"shape": 7,
"type": "CLIP_VISION_OUTPUT",
"link": null
},
{
"name": "start_image",
"shape": 7,
"type": "IMAGE",
"link": 370
},
{
"name": "end_image",
"shape": 7,
"type": "IMAGE",
"link": 371
},
{
"name": "width",
"type": "INT",
"widget": {
"name": "width"
},
"link": 355
},
{
"name": "height",
"type": "INT",
"widget": {
"name": "height"
},
"link": 356
}
],
"outputs": [
{
"name": "positive",
"type": "CONDITIONING",
"links": [
357,
362
]
},
{
"name": "negative",
"type": "CONDITIONING",
"links": [
358,
363
]
},
{
"name": "latent",
"type": "LATENT",
"links": [
359
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.48",
"Node name for S&R": "WanFirstLastFrameToVideo"
},
"widgets_values": [
832,
480,
81,
1
],
"color": "#2a363b",
"bgcolor": "#3f5159"
},
{
"id": 168,
"type": "LoadImage",
"pos": [
-248.64210510253906,
963.0194091796875
],
"size": [
334.93658447265625,
388.3801574707031
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
366
]
},
{
"name": "MASK",
"type": "MASK",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"ComfyUI_00507_.png",
"image"
]
},
{
"id": 167,
"type": "ImageBatch",
"pos": [
161.45394897460938,
801.67626953125
],
"size": [
140,
46
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "image1",
"type": "IMAGE",
"link": 365
},
{
"name": "image2",
"type": "IMAGE",
"link": 366
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
367
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.48",
"Node name for S&R": "ImageBatch"
},
"widgets_values": [],
"color": "#2a363b",
"bgcolor": "#3f5159"
},
{
"id": 170,
"type": "ImageFromBatch",
"pos": [
604.3246459960938,
867.176513671875
],
"size": [
210,
82
],
"flags": {},
"order": 15,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 369
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
371
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.48",
"Node name for S&R": "ImageFromBatch"
},
"widgets_values": [
1,
1
],
"color": "#2a363b",
"bgcolor": "#3f5159"
},
{
"id": 169,
"type": "ImageFromBatch",
"pos": [
604.3246459960938,
729.67626953125
],
"size": [
210,
82
],
"flags": {},
"order": 14,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 368
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
370,
372
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.48",
"Node name for S&R": "ImageFromBatch"
},
"widgets_values": [
0,
1
],
"color": "#2a363b",
"bgcolor": "#3f5159"
},
{
"id": 160,
"type": "LoadImage",
"pos": [
-249.5445556640625,
511.5525817871094
],
"size": [
334.93658447265625,
388.3801574707031
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
365
]
},
{
"name": "MASK",
"type": "MASK",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"MCNK7B5FR2J9K.png",
"image"
]
},
{
"id": 162,
"type": "GetImageSize",
"pos": [
604.3246459960938,
1025.370361328125
],
"size": [
210,
136
],
"flags": {},
"order": 16,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 372
}
],
"outputs": [
{
"name": "width",
"type": "INT",
"links": [
355
]
},
{
"name": "height",
"type": "INT",
"links": [
356
]
},
{
"name": "batch_size",
"type": "INT",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "GetImageSize"
},
"widgets_values": [
"width: 606, height: 865\n batch size: 1"
]
},
{
"id": 161,
"type": "ImageScaleToTotalPixels",
"pos": [
330.69580078125,
801.67626953125
],
"size": [
239.38699340820312,
82
],
"flags": {},
"order": 13,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 367
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
368,
369
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "ImageScaleToTotalPixels"
},
"widgets_values": [
"nearest-exact",
0.5
],
"color": "#2a363b",
"bgcolor": "#3f5159"
},
{
"id": 6,
"type": "CLIPTextEncode",
"pos": [
397.47509765625,
187.46409606933594
],
"size": [
419.26959228515625,
148.8194122314453
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 74
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
352
]
}
],
"title": "CLIP Text Encode (Positive Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"A girl is about to sit as the camera moves upward"
]
},
{
"id": 158,
"type": "PrimitiveNode",
"pos": [
920.8279418945312,
588.53564453125
],
"size": [
210,
82
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "INT",
"type": "INT",
"widget": {
"name": "start_at_step"
},
"links": [
325,
326
]
}
],
"title": "start_at_step",
"properties": {
"Run widget replace on values": false
},
"widgets_values": [
4,
"fixed"
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 155,
"type": "KSamplerAdvanced",
"pos": [
1510.51513671875,
185.6936798095703
],
"size": [
304.748046875,
334
],
"flags": {},
"order": 19,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 324
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 362
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 363
},
{
"name": "latent_image",
"type": "LATENT",
"link": 316
},
{
"name": "start_at_step",
"type": "INT",
"widget": {
"name": "start_at_step"
},
"link": 325
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
317
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "KSamplerAdvanced"
},
"widgets_values": [
"disable",
0,
"fixed",
20,
3,
"euler",
"simple",
4,
9999,
"disable"
]
},
{
"id": 154,
"type": "KSamplerAdvanced",
"pos": [
1170.0396728515625,
186.20217895507812
],
"size": [
304.748046875,
334
],
"flags": {},
"order": 18,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 322
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 357
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 358
},
{
"name": "latent_image",
"type": "LATENT",
"link": 359
},
{
"name": "end_at_step",
"type": "INT",
"widget": {
"name": "end_at_step"
},
"link": 326
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
316
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "KSamplerAdvanced"
},
"widgets_values": [
"enable",
123,
"fixed",
20,
4,
"euler",
"simple",
0,
4,
"enable"
]
}
],
"links": [
[
74,
38,
0,
6,
0,
"CLIP"
],
[
75,
38,
0,
7,
0,
"CLIP"
],
[
76,
39,
0,
8,
1,
"VAE"
],
[
134,
73,
0,
48,
0,
"MODEL"
],
[
256,
8,
0,
113,
0,
"IMAGE"
],
[
316,
154,
0,
155,
3,
"LATENT"
],
[
317,
155,
0,
8,
0,
"LATENT"
],
[
322,
48,
0,
154,
0,
"MODEL"
],
[
323,
156,
0,
157,
0,
"MODEL"
],
[
324,
157,
0,
155,
0,
"MODEL"
],
[
325,
158,
0,
155,
4,
"INT"
],
[
326,
158,
0,
154,
4,
"INT"
],
[
352,
6,
0,
166,
0,
"CONDITIONING"
],
[
353,
7,
0,
166,
1,
"CONDITIONING"
],
[
354,
39,
0,
166,
2,
"VAE"
],
[
355,
162,
0,
166,
7,
"INT"
],
[
356,
162,
1,
166,
8,
"INT"
],
[
357,
166,
0,
154,
1,
"CONDITIONING"
],
[
358,
166,
1,
154,
2,
"CONDITIONING"
],
[
359,
166,
2,
154,
3,
"LATENT"
],
[
362,
166,
0,
155,
1,
"CONDITIONING"
],
[
363,
166,
1,
155,
2,
"CONDITIONING"
],
[
365,
160,
0,
167,
0,
"IMAGE"
],
[
366,
168,
0,
167,
1,
"IMAGE"
],
[
367,
167,
0,
161,
0,
"IMAGE"
],
[
368,
161,
0,
169,
0,
"IMAGE"
],
[
369,
161,
0,
170,
0,
"IMAGE"
],
[
370,
169,
0,
166,
5,
"IMAGE"
],
[
371,
170,
0,
166,
6,
"IMAGE"
],
[
372,
169,
0,
162,
0,
"IMAGE"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 0.424097618372485,
"offset": [
349.5445556640625,
204.88626098632812
]
},
"frontendVersion": "1.28.0",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
- 🟦 As with Wan 2.1, input the Start / End images into the WanFirstLastFrameToVideo node.
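In the workflow above, the two KSampler (Advanced) nodes share a single start_at_step primitive (set to 4 of 20 steps): the high_noise model denoises the first steps and the low_noise model takes over from step 4. A minimal sketch of that hand-off logic (illustrative only, not ComfyUI code):

```python
def split_schedule(total_steps: int, boundary: int):
    """Divide a sampling schedule between two models, mirroring the two
    KSampler (Advanced) nodes: the high-noise model runs steps
    [0, boundary) and the low-noise model runs [boundary, total_steps)."""
    high_noise_steps = list(range(0, boundary))
    low_noise_steps = list(range(boundary, total_steps))
    return high_noise_steps, low_noise_steps

# Values from the workflow: 20 steps total, hand-off at start_at_step = 4.
high, low = split_schedule(20, 4)
print(len(high), len(low))  # 4 16
```

Moving the boundary earlier gives the low_noise model more of the refinement work; the workflow's 4/16 split leaves most steps to the low_noise stage.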
Wan 2.2 5B Model (TI2V-5B)
Wan 2.2-TI2V-5B is a TI2V model that handles both text2video and image2video with a single model. By combining a higher-compression VAE with patchification, it can generate 720p, 24fps, ~5-second videos at a much lighter computational cost than the 14B.
Rather than a scaled-down 14B (in the way Wan 2.1's 1.3B was), it is better understood as a separate line with a fundamentally different design.
The design is interesting, but it cannot match the 14B in output quality, and in practice it sees little use.
Recommended Settings
- Recommended Resolution
- 720p (1280×720)
- Maximum Number of Frames
- 121 frames
- FPS
- 24fps
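A quick sanity check on these numbers (the 4x temporal compression factor is an assumption based on how the Wan VAEs are commonly described, not something stated above):

```python
FPS = 24
frames = 121  # recommended maximum frame count for the 5B model

# 121 frames at 24 fps is roughly a 5-second clip.
seconds = frames / FPS
print(f"{seconds:.2f}s")  # 5.04s

# Assuming 4x temporal compression in the VAE, frame counts of the
# form 4n + 1 (like 121) map cleanly onto whole latent frames.
latent_frames = (frames - 1) // 4 + 1
print(latent_frames)  # 31
```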
Model Download
- diffusion_models
- vae
- gguf (Optional)
Placement example:
📂ComfyUI/
└── 📂models/
├── 📂diffusion_models/
│ └── wan2.2_ti2v_5B_fp16.safetensors
├── 📂unet/
│ └── Wan2.2-TI2V-5B-XXXX.gguf ← Only when using gguf
└── 📂vae/
└── wan2.2_vae.safetensors
text2video (5B)

{
"id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
"revision": 0,
"last_node_id": 165,
"last_link_id": 341,
"nodes": [
{
"id": 38,
"type": "CLIPLoader",
"pos": [
56.288665771484375,
312.74468994140625
],
"size": [
301.3524169921875,
106
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"slot_index": 0,
"links": [
74,
75
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPLoader"
},
"widgets_values": [
"umt5_xxl_fp8_e4m3fn_scaled.safetensors",
"wan",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 48,
"type": "ModelSamplingSD3",
"pos": [
627.1928100585938,
49.284149169921875
],
"size": [
210,
58
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 134
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
329
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 6,
"type": "CLIPTextEncode",
"pos": [
417.9232177734375,
186
],
"size": [
419.26959228515625,
148.8194122314453
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 74
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
330
]
}
],
"title": "CLIP Text Encode (Positive Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"A high-quality, ultra-detailed image of a kingfisher diving into a crystal-clear pool of water, captured at the precise moment its beak touches the surface. The bird’s vibrant blue and orange feathers are sharply defined, with water droplets suspended in the air around it. The background features a softly blurred natural riverside setting, with lush green foliage and gentle sunlight filtering through the trees, creating a serene and dynamic scene."
]
},
{
"id": 7,
"type": "CLIPTextEncode",
"pos": [
417.8738708496094,
389
],
"size": [
419.3189392089844,
138.8924560546875
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 75
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
331
]
}
],
"title": "CLIP Text Encode (Negative Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走 "
]
},
{
"id": 73,
"type": "UNETLoader",
"pos": [
329.4610595703125,
46.62214660644531
],
"size": [
270,
82
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
134
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Wan2.2\\wan2.2_ti2v_5B_fp16.safetensors",
"default"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 151,
"type": "MarkdownNote",
"pos": [
-87.05850982666016,
-15.094049453735352
],
"size": [
366.41766357421875,
229.31654357910156
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n- [wan2.2_ti2v_5B_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_ti2v_5B_fp16.safetensors)\n- [umt5_xxl.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/text_encoders)\n- [wan2.2_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/vae/wan2.2_vae.safetensors)\n\n\n```\n📂ComfyUI/\n└──📂models/\n ├── 📂diffusion_models/\n │ └── wan2.2_ti2v_5B_fp16.safetensors\n ├── 📂text_encoders/\n │ └── umt5_xxl (fp16 or fp8).safetensors\n └── 📂vae/\n └── wan2.2_vae.safetensors\n\n```"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 161,
"type": "Wan22ImageToVideoLatent",
"pos": [
563.1520385742188,
593.4163818359375
],
"size": [
271.9126892089844,
150
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "vae",
"type": "VAE",
"link": 336
},
{
"name": "start_image",
"shape": 7,
"type": "IMAGE",
"link": null
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
335
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "Wan22ImageToVideoLatent"
},
"widgets_values": [
1280,
704,
53,
1
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 160,
"type": "KSampler",
"pos": [
877.5444946289062,
186
],
"size": [
270,
262
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 329
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 330
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 331
},
{
"name": "latent_image",
"type": "LATENT",
"link": 335
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
332
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "KSampler"
},
"widgets_values": [
123456,
"fixed",
20,
4,
"uni_pc",
"simple",
1
]
},
{
"id": 39,
"type": "VAELoader",
"pos": [
277.54449462890625,
596.2767333984375
],
"size": [
265.22003173828125,
58
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"slot_index": 0,
"links": [
76,
336
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"wan2.2_vae.safetensors"
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 8,
"type": "VAEDecode",
"pos": [
1179.2911376953125,
186
],
"size": [
157.56002807617188,
46
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 332
},
{
"name": "vae",
"type": "VAE",
"link": 76
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"slot_index": 0,
"links": [
341
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAEDecode"
},
"widgets_values": []
},
{
"id": 113,
"type": "VHS_VideoCombine",
"pos": [
1368.597900390625,
186
],
"size": [
372.2688903808594,
541.7478637695312
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 341
},
{
"name": "audio",
"shape": 7,
"type": "AUDIO",
"link": null
},
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "Filenames",
"type": "VHS_FILENAMES",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "a7ce59e381934733bfae03b1be029756d6ce936d",
"Node name for S&R": "VHS_VideoCombine"
},
"widgets_values": {
"frame_rate": 24,
"loop_count": 0,
"filename_prefix": "Wan2.2",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "Wan2.2_00006.mp4",
"subfolder": "",
"type": "output",
"format": "video/h264-mp4",
"frame_rate": 24,
"workflow": "Wan2.2_00006.png",
"fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\Wan2.2_00006.mp4"
}
}
}
}
],
"links": [
[
74,
38,
0,
6,
0,
"CLIP"
],
[
75,
38,
0,
7,
0,
"CLIP"
],
[
76,
39,
0,
8,
1,
"VAE"
],
[
134,
73,
0,
48,
0,
"MODEL"
],
[
329,
48,
0,
160,
0,
"MODEL"
],
[
330,
6,
0,
160,
1,
"CONDITIONING"
],
[
331,
7,
0,
160,
2,
"CONDITIONING"
],
[
332,
160,
0,
8,
0,
"LATENT"
],
[
335,
161,
0,
160,
3,
"LATENT"
],
[
336,
39,
0,
161,
0,
"VAE"
],
[
341,
8,
0,
113,
0,
"IMAGE"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 0.7513148009015777,
"offset": [
187.05850982666016,
115.09404945373535
]
},
"frontendVersion": "1.25.1",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
In 5B text2video, the video is generated internally from a "first-frame latent".
- 🟥 Use wan2.2_vae for the VAE. Its compression structure differs from the Wan 2.1 VAE, so failing to swap it in causes severe degradation in both image quality and motion.
- 🟩 Even for text2video, insert the TI2V latent node (e.g., Wan22ImageToVideoLatent). Because the 5B is built around a "first-frame latent -> video" pipeline, a configuration that skips this step is not supported.
If you think of it as "text2video, but essentially a special case of image2video", it becomes easier to organize alongside the other TI2V models.
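For intuition, the latent the sampler actually works in is far smaller than the output video. A rough shape calculation, where the 16x16 spatial / 4x temporal compression and 48 latent channels are assumptions based on how the Wan 2.2 VAE is commonly described:

```python
def ti2v_latent_shape(width: int, height: int, length: int,
                      spatial: int = 16, temporal: int = 4,
                      channels: int = 48) -> tuple:
    """Approximate latent shape for a Wan22ImageToVideoLatent of the
    given pixel size and frame count (compression factors are assumed)."""
    t = (length - 1) // temporal + 1
    return (channels, t, height // spatial, width // spatial)

# Widget values from the text2video workflow above: 1280x704, 53 frames.
print(ti2v_latent_shape(1280, 704, 53))  # (48, 14, 44, 80)
```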
image2video (5B)

{
"id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
"revision": 0,
"last_node_id": 168,
"last_link_id": 347,
"nodes": [
{
"id": 38,
"type": "CLIPLoader",
"pos": [
56.288665771484375,
312.74468994140625
],
"size": [
301.3524169921875,
106
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"slot_index": 0,
"links": [
74,
75
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPLoader"
},
"widgets_values": [
"umt5_xxl_fp8_e4m3fn_scaled.safetensors",
"wan",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 48,
"type": "ModelSamplingSD3",
"pos": [
627.1928100585938,
49.284149169921875
],
"size": [
210,
58
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 134
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
329
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 7,
"type": "CLIPTextEncode",
"pos": [
417.8738708496094,
389
],
"size": [
419.3189392089844,
138.8924560546875
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 75
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
331
]
}
],
"title": "CLIP Text Encode (Negative Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走 "
]
},
{
"id": 73,
"type": "UNETLoader",
"pos": [
329.4610595703125,
46.62214660644531
],
"size": [
270,
82
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
134
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Wan2.2\\wan2.2_ti2v_5B_fp16.safetensors",
"default"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 151,
"type": "MarkdownNote",
"pos": [
-87.05850982666016,
-15.094049453735352
],
"size": [
366.41766357421875,
229.31654357910156
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n- [wan2.2_ti2v_5B_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_ti2v_5B_fp16.safetensors)\n- [umt5_xxl.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/text_encoders)\n- [wan2.2_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/vae/wan2.2_vae.safetensors)\n\n\n```\n📂ComfyUI/\n└──📂models/\n ├── 📂diffusion_models/\n │ └── wan2.2_ti2v_5B_fp16.safetensors\n ├── 📂text_encoders/\n │ └── umt5_xxl (fp16 or fp8).safetensors\n └── 📂vae/\n └── wan2.2_vae.safetensors\n\n```"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 39,
"type": "VAELoader",
"pos": [
277.54449462890625,
596.2767333984375
],
"size": [
265.22003173828125,
58
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"slot_index": 0,
"links": [
76,
336
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"wan2.2_vae.safetensors"
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 8,
"type": "VAEDecode",
"pos": [
1179.2911376953125,
186
],
"size": [
157.56002807617188,
46
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 332
},
{
"name": "vae",
"type": "VAE",
"link": 76
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"slot_index": 0,
"links": [
341
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "VAEDecode"
},
"widgets_values": []
},
{
"id": 113,
"type": "VHS_VideoCombine",
"pos": [
1368.597900390625,
186
],
"size": [
372.2688903808594,
886.1885986328125
],
"flags": {},
"order": 13,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 341
},
{
"name": "audio",
"shape": 7,
"type": "AUDIO",
"link": null
},
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "Filenames",
"type": "VHS_FILENAMES",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "a7ce59e381934733bfae03b1be029756d6ce936d",
"Node name for S&R": "VHS_VideoCombine"
},
"widgets_values": {
"frame_rate": 24,
"loop_count": 0,
"filename_prefix": "Wan2.2",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "Wan2.2_00012.mp4",
"subfolder": "",
"type": "output",
"format": "video/h264-mp4",
"frame_rate": 24,
"workflow": "Wan2.2_00012.png",
"fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\Wan2.2_00012.mp4"
}
}
}
},
{
"id": 167,
"type": "ImageScaleToTotalPixels",
"pos": [
93.42740631103516,
723.7991943359375
],
"size": [
270,
82
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 343
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
344,
345
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "ImageScaleToTotalPixels"
},
"widgets_values": [
"nearest-exact",
0.5
]
},
{
"id": 168,
"type": "GetImageSize",
"pos": [
393.28973388671875,
771.4497680664062
],
"size": [
140,
124
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 345
}
],
"outputs": [
{
"name": "width",
"type": "INT",
"links": [
346
]
},
{
"name": "height",
"type": "INT",
"links": [
347
]
},
{
"name": "batch_size",
"type": "INT",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "GetImageSize"
},
"widgets_values": []
},
{
"id": 161,
"type": "Wan22ImageToVideoLatent",
"pos": [
563.1520385742188,
593.4163818359375
],
"size": [
271.9126892089844,
150
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "vae",
"type": "VAE",
"link": 336
},
{
"name": "start_image",
"shape": 7,
"type": "IMAGE",
"link": 344
},
{
"name": "width",
"type": "INT",
"widget": {
"name": "width"
},
"link": 346
},
{
"name": "height",
"type": "INT",
"widget": {
"name": "height"
},
"link": 347
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
335
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "Wan22ImageToVideoLatent"
},
"widgets_values": [
1280,
704,
65,
1
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 166,
"type": "LoadImage",
"pos": [
-210.51498413085938,
721.4742431640625
],
"size": [
275.080078125,
478
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
343
]
},
{
"name": "MASK",
"type": "MASK",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"cloud.png",
"image"
]
},
{
"id": 6,
"type": "CLIPTextEncode",
"pos": [
417.9232177734375,
186
],
"size": [
419.26959228515625,
148.8194122314453
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 74
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
330
]
}
],
"title": "CLIP Text Encode (Positive Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.33",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"Anime girl with light blue hair gradually dissolving into white clouds at the tips. She is looking upwards against a bright blue sky with soft white clouds"
]
},
{
"id": 160,
"type": "KSampler",
"pos": [
877.5444946289062,
186
],
"size": [
270,
262
],
"flags": {},
"order": 11,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 329
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 330
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 331
},
{
"name": "latent_image",
"type": "LATENT",
"link": 335
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
332
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.46",
"Node name for S&R": "KSampler"
},
"widgets_values": [
1234,
"fixed",
20,
4,
"uni_pc",
"simple",
1
]
}
],
"links": [
[
74,
38,
0,
6,
0,
"CLIP"
],
[
75,
38,
0,
7,
0,
"CLIP"
],
[
76,
39,
0,
8,
1,
"VAE"
],
[
134,
73,
0,
48,
0,
"MODEL"
],
[
329,
48,
0,
160,
0,
"MODEL"
],
[
330,
6,
0,
160,
1,
"CONDITIONING"
],
[
331,
7,
0,
160,
2,
"CONDITIONING"
],
[
332,
160,
0,
8,
0,
"LATENT"
],
[
335,
161,
0,
160,
3,
"LATENT"
],
[
336,
39,
0,
161,
0,
"VAE"
],
[
341,
8,
0,
113,
0,
"IMAGE"
],
[
343,
166,
0,
167,
0,
"IMAGE"
],
[
344,
167,
0,
161,
1,
"IMAGE"
],
[
345,
167,
0,
168,
0,
"IMAGE"
],
[
346,
168,
0,
161,
2,
"INT"
],
[
347,
168,
1,
161,
3,
"INT"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 0.6830134553650707,
"offset": [
310.5149841308594,
115.09404945373535
]
},
"frontendVersion": "1.25.1",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
image2video uses the same TI2V model as text2video. Only the image input is added; everything from the KSampler onward is nearly identical.
- 🟦 Feed the start image into the TI2V latent node to create the compressed latent.
- 🟩 Because text2video and image2video run on the same model, it is easy to settle your workflow on the 5B first and then add the 14B as needed.
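The ImageScaleToTotalPixels node in the i2v workflow above resizes the input to about 0.5 megapixels before encoding. As I understand the node, it preserves aspect ratio while targeting megapixels × 1024 × 1024 total pixels; a sketch of that calculation (the rounding here is an approximation, not the node's exact behavior):

```python
import math

def scale_to_total_pixels(width: int, height: int,
                          megapixels: float = 0.5) -> tuple:
    """Rescale dimensions, keeping aspect ratio, so the pixel count is
    roughly megapixels * 1024 * 1024 (mirrors ImageScaleToTotalPixels)."""
    target = megapixels * 1024 * 1024
    scale = math.sqrt(target / (width * height))
    return round(width * scale), round(height * scale)

# e.g. a 1920x1080 input shrinks to roughly half a megapixel.
print(scale_to_total_pixels(1920, 1080))
```

The GetImageSize node then reads the resized width/height back so Wan22ImageToVideoLatent generates a latent matching the actual input dimensions.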