What is LTX-2?
LTX-2 is an audio-visual diffusion model released by Lightricks that can generate both audio and video simultaneously.
Recommended Settings
- Resolution
  - 640×640 (1:1)
  - 768×512 (3:2)
  - 704×512 (4:3)
  - Upscaled 2x in post-processing, so actual output will be 1280×1280, etc.
  - Must be a multiple of 32
- FPS
  - 24 / 25 / 30
- Frames
  - Max: 257 frames (approx. 10 sec at 25fps)
  - Recommended: 121–161 (balance of quality and memory)
  - Must be 8n+1
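The constraints above can be checked before queuing a generation. The following is a minimal sketch, not part of ComfyUI or LTX-2: the function names `check_ltx2_settings` and `clip_seconds` are hypothetical, and the limits are simply the ones listed in this section.

```python
def check_ltx2_settings(width: int, height: int, frames: int, fps: int) -> list[str]:
    """Return a list of problems; an empty list means the settings pass.

    Hypothetical helper: the limits are copied from the Recommended Settings
    list, not read from any LTX-2 or ComfyUI API.
    """
    problems = []
    if width % 32 or height % 32:
        problems.append("width and height must be multiples of 32")
    if frames % 8 != 1:
        problems.append("frame count must have the form 8n+1")
    if frames > 257:
        problems.append("frame count exceeds the 257-frame maximum")
    if fps not in (24, 25, 30):
        problems.append("fps is outside the recommended 24 / 25 / 30")
    return problems


def clip_seconds(frames: int, fps: int) -> float:
    """Duration of the clip in seconds."""
    return frames / fps


if __name__ == "__main__":
    print(check_ltx2_settings(640, 640, 121, 25))
    print(clip_seconds(257, 25))
```

For example, 257 frames at 25 fps gives 257 / 25 = 10.28 seconds, which matches the "approx. 10 sec" figure above.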
Model Download
- checkpoints (VAE included)
- latent_upscale_models
- loras
- text_encoders

📂ComfyUI/
└── 📂models/
    ├── 📂checkpoints/
    │   └── ltx-2-19b-dev-fp8.safetensors
    ├── 📂latent_upscale_models/
    │   └── ltx-2-spatial-upscaler-x2-1.0.safetensors
    ├── 📂loras/
    │   └── ltx-2-19b-distilled-lora-384.safetensors
    └── 📂text_encoders/
        └── gemma_3_12B_it_fp8_scaled.safetensors
Basic Process Flow

The workflow may look complicated because it has more nodes than a Wan workflow, but this is all it does:
- text2video + audio
  - First, generate the base video (and audio).
- Hires.fix (2nd stage)
  - Upscale the generated video by 2x and refine it with video2video.
  - You can skip this and decode directly, but Hires.fix is recommended for quality.
- Decode
  - Decode video and audio separately for output.
text2video

Follow the basic flow explained above to build the workflow.
- Steps 1, 2, and 3 are the 1st stage.
- Steps 4 and 5 are Hires.fix.
- Step 6 is Decode.
1. Set Video Resolution, Length, FPS
Decide the parameters for the video and audio you want to generate here.
- Enter resolution, frame count, and FPS in EmptyLTXVLatentVideo / LTXV Empty Latent Audio.
  - Follow the Recommended Settings.
- 🚨Resolution will be doubled in post-processing.
  - In other words, set the resolution here to half the value of the video you want to create.
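As a small worked example of the halving rule, the sketch below derives the 1st-stage size from the final size you want. The function name `first_stage_size` is hypothetical. The multiple-of-64 check follows the reasoning given in the image2video section: a final size that is a multiple of 64 stays a multiple of 32 once halved.

```python
def first_stage_size(final_width: int, final_height: int) -> tuple[int, int]:
    """Return the (width, height) to enter in the 1st stage.

    Hypothetical helper: the 2nd stage doubles the resolution, so the
    1st stage runs at half of the final size.
    """
    if final_width % 64 or final_height % 64:
        # Halving a multiple of 64 keeps the result a multiple of 32.
        raise ValueError("final width and height should be multiples of 64")
    return final_width // 2, final_height // 2


if __name__ == "__main__":
    print(first_stage_size(1280, 1280))
    print(first_stage_size(1536, 1024))
```

A 1280×1280 target therefore means entering 640×640 here, and a 1536×1024 target means entering 768×512.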
2. Prompt
The LTX series is sensitive to prompt quality: unless you put some care into the prompt, you won't get a very good video.
- That said, the format isn't so strict that you need an LLM's help to write it.
- Try describing the video you want to generate as if you were writing a novel.
- cf. Prompting Guide for LTX-2
3. Sampling (1st Stage)
It doesn't look like the familiar KSampler so it might seem a bit complicated, but the basics are just "decide steps and CFG and sample".
- In this workflow, the 1st stage is run with 20 steps / CFG 4.0.
- It uses a dedicated scheduler called LTXVScheduler.
  - It behaves similarly to linear_quadratic, but you don't need to worry about it too much.
- Since LTX-2 handles video and audio simultaneously, combine video latent and audio latent into one with 🟫 LTXVConcatAVLatent.
4. Latent Upscale (x2)
Upscale the resolution of the video latent by 2x.
- Use a dedicated model (ltx-2-spatial-upscaler-x2).
5. Sampling (2nd Stage / video2video)
Refine the upscaled latent in a small number of steps.
- Here we use distilled-lora, which allows generation in 4~8 steps.
  - Think of it as something like Lightning / Turbo in other models.
  - This workflow runs in 3 steps.
  - Accordingly, CFG is changed to 1.0.
- Because it uses Manual Sigma, it's a bit hard to understand, but in terms of the Simple scheduler, it behaves somewhat close to denoise = 0.47.
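To make the Manual Sigma setting a little less opaque, here is a minimal sketch that parses the sigma string stored in the ManualSigmas node of the workflow JSON in this document and counts the steps it describes. A list of N+1 sigma values describes N sampling steps. The helper `parse_sigmas` is hypothetical, and its strictly-decreasing check is only a sanity check of this sketch, not something the node is known to enforce.

```python
def parse_sigmas(text: str) -> list[float]:
    """Parse a comma-separated sigma string into a list of floats."""
    sigmas = [float(part) for part in text.split(",")]
    if len(sigmas) < 2:
        raise ValueError("at least two sigma values are needed for one step")
    # Sanity check of this sketch only: each sigma is lower than the previous one.
    if any(a <= b for a, b in zip(sigmas, sigmas[1:])):
        raise ValueError("sigmas must be strictly decreasing")
    return sigmas


# Value taken from the ManualSigmas node in the workflow JSON.
sigmas = parse_sigmas("0.909375, 0.725, 0.421875, 0.0")
steps = len(sigmas) - 1  # N+1 sigma values describe N steps

if __name__ == "__main__":
    print(steps, sigmas)
```

Four values give the 3 steps mentioned above. The schedule starts below 1.0, which fits the video2video role of this stage, although the mapping to denoise = 0.47 is not derived here.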
6. Decode
Finally, decode and export the video and audio separately.
- Split the combined latent into video and audio latents and decode each with the appropriate VAE.
- (Tiled VAE is used because VRAM is tight.)
text2video 8 steps
Above, we used distilled-lora only for Hires.fix, but let's apply it to the 1st stage as well and generate quickly in 8 steps.

To apply distilled-lora, change some sampling settings.
- CFG : 1.0
- scheduler : Simple
- steps : 8
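The two 1st-stage configurations can be written side by side as data. This is a rough sketch: the class `SamplingSettings` and its method are hypothetical, and the evaluation count rests on the assumption that classifier-free guidance costs one positive and one negative model pass per step when CFG is above 1.0, and a single pass when CFG is 1.0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingSettings:
    """Hypothetical container for the 1st-stage sampling settings."""
    steps: int
    cfg: float
    scheduler: str

    def model_evaluations(self) -> int:
        # Assumption: CFG above 1.0 needs a positive and a negative pass per step.
        passes_per_step = 2 if self.cfg > 1.0 else 1
        return self.steps * passes_per_step


# Values taken from the text: 20 steps / CFG 4.0 versus the distilled-lora settings.
STANDARD = SamplingSettings(steps=20, cfg=4.0, scheduler="LTXVScheduler")
DISTILLED = SamplingSettings(steps=8, cfg=1.0, scheduler="simple")

if __name__ == "__main__":
    print(STANDARD.model_evaluations(), DISTILLED.model_evaluations())
```

Under that assumption the distilled setup needs far fewer model passes, which is where most of the speedup would come from.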
20 steps / 8 steps distilled-lora Comparison
In my tests, applying distilled-lora produced more stable generations.
Therefore, for speed and stability, all subsequent workflows apply distilled-lora from the 1st stage.
image2video
single-frame I2V

The basic idea is "fix the 1st frame to the input image and generate the rest".
For example, if creating a 121-frame video, the flow is roughly like this:
(1) Create empty slots for 121 frames (8n+1)
[ 🌫️ 🌫️ 🌫️ 🌫️ 🌫️ ... 🌫️ ]
(2) Overwrite only the 1st frame with the input image
[ 🖼️ 🌫️ 🌫️ 🌫️ 🌫️ ... 🌫️ ]
(3) Generate the remaining 120 frames
[ 🖼️ ✨ ✨ ✨ ✨ ... ✨ ]
Imagine the frames (✨) filling up consecutively starting from 🖼️.
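The three steps above can be sketched with a plain list of labels. This is only a conceptual illustration in frame space: the real nodes operate on the video latent rather than on individual frames, and the function name `single_frame_i2v_plan` is hypothetical.

```python
def single_frame_i2v_plan(length: int) -> list[str]:
    """Label each frame slot of a single-frame image2video run."""
    if length % 8 != 1:
        raise ValueError("frame count must have the form 8n+1")
    frames = ["noise"] * length  # (1) empty slots for every frame
    frames[0] = "image"          # (2) overwrite only the 1st frame
    # (3) every remaining slot is filled in by generation
    return [slot if slot == "image" else "generated" for slot in frames]


if __name__ == "__main__":
    plan = single_frame_i2v_plan(121)
    print(plan[0], plan.count("generated"))
```

With 121 frames, one slot holds the input image and the remaining 120 are generated.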
1. Resize Input Image (Create 2 versions)
- First, create a full-resolution version matching the final output resolution.
  - Resize to arbitrary size (here 1MP).
  - Width and height must be multiples of 64.
    - Since the 1st stage runs at 1/2 resolution, make it a multiple of 64 so it remains a multiple of 32 when halved.
- Next, for the 1st stage (half resolution), create a version with width/height halved from the above image.
  - Input this half-resolution width/height into EmptyLTXVLatentVideo.
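The sizing arithmetic for the two image versions can be sketched as follows. This is an illustration, not the resize node's actual algorithm: the function name `fit_to_megapixels` is hypothetical, 1MP is assumed to mean 1024×1024 pixels, and rounding each side to the nearest multiple of 64 is an assumed strategy that can shift the aspect ratio slightly.

```python
import math


def fit_to_megapixels(width: int, height: int, megapixels: float = 1.0,
                      multiple: int = 64) -> tuple[tuple[int, int], tuple[int, int]]:
    """Return ((full_w, full_h), (half_w, half_h)) for an input image size."""
    # Assumption: one megapixel is counted as 1024 * 1024 pixels.
    target_pixels = megapixels * 1024 * 1024
    scale = math.sqrt(target_pixels / (width * height))
    # Round each side to the nearest multiple of 64 (at least one multiple).
    full_w = max(multiple, round(width * scale / multiple) * multiple)
    full_h = max(multiple, round(height * scale / multiple) * multiple)
    # The 1st stage runs at half resolution; a multiple of 64 halves to a multiple of 32.
    return (full_w, full_h), (full_w // 2, full_h // 2)


if __name__ == "__main__":
    full, half = fit_to_megapixels(1920, 1080)
    print(full, half)
```

For a 1920×1080 input this gives a 1344×768 full-resolution version and a 672×384 version for the 1st stage, both of which satisfy the multiple-of-64 and multiple-of-32 rules.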
2. Image Preprocessing
A trait carried over from LTX-Video: because video is slightly compressed and degraded compared to still images, using an image that is too clean may result in a video that doesn't move at all.
- To avoid this, intentionally degrade it to look like video compression with LTXVPreprocess.
3. LTXVImgToVideoInplace (Insert into 1st Stage)
This is the core of image2video.
- Insert the image as the 1st frame into the video latent of the 1st stage (half resolution).
4. Do the same for Upscale side (2nd Stage)
Insert the image into the 2nd stage as well.
- Make sure to connect this node after the spatial upscaler node.
- Set strength to 1.0.
  - If you reduce this, the inserted image itself will be treated as if it were going through image2image.
  - That's fine if you want it to blend in as a whole, but if you want the 1st frame to match the input image exactly, set it to 1.0.
Output Example

A known issue is that the video often hardly moves or just zooms out.
Using appropriate prompts helps to some extent, but a LoRA has been introduced to address this.
- link + workflow : LTX-2 Image2Video Adapter LoRa
multi-frame I2V
The previous image2video workflow can take not only a single image but also an image batch (= video) as input.
By applying this, you can create a workflow that uses the end of an arbitrary video as a "connector" and extends it further.
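The "connector" idea can be sketched with plain lists. This is a conceptual illustration only: the function name `split_for_extension` is hypothetical, and it assumes the connector frames occupy the start of the new clip in the same way the single image occupies the 1st frame in single-frame I2V.

```python
def split_for_extension(frames: list, keep_last: int, length: int) -> tuple[list, int]:
    """Return (connector_frames, number_of_newly_generated_frames).

    frames    : frames of the existing video
    keep_last : how many trailing frames to reuse as the connector
    length    : total frame count of the new clip (8n+1)
    """
    if not 1 <= keep_last <= len(frames):
        raise ValueError("keep_last must be between 1 and the number of frames")
    if length % 8 != 1:
        raise ValueError("frame count must have the form 8n+1")
    if keep_last >= length:
        raise ValueError("the new clip must be longer than the connector")
    connector = frames[-keep_last:]  # the end of the existing video
    # Assumption: the connector fills the first slots, the rest is generated.
    return connector, length - keep_last


if __name__ == "__main__":
    existing = list(range(100))  # stand-in for 100 decoded frames
    connector, new_frames = split_for_extension(existing, keep_last=25, length=121)
    print(len(connector), new_frames)
```

Under that assumption, reusing the last 25 frames in a 121-frame clip leaves 96 frames of new footage.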

{
"id": "7f5e0c56-93b4-4937-b7f2-efd0f1853e33",
"revision": 0,
"last_node_id": 196,
"last_link_id": 429,
"nodes": [
{
"id": 144,
"type": "Reroute",
"pos": [
3664.65243485745,
3746.68342826367
],
"size": [
75,
26
],
"flags": {},
"order": 28,
"mode": 0,
"inputs": [
{
"name": "",
"type": "*",
"link": 324
}
],
"outputs": [
{
"name": "",
"type": "VAE",
"links": [
325
]
}
],
"properties": {
"showOutputText": false,
"horizontal": false
}
},
{
"id": 100,
"type": "ManualSigmas",
"pos": [
2882.609869170561,
4335.240528755964
],
"size": [
270,
58
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "SIGMAS",
"type": "SIGMAS",
"links": [
275
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.5.1",
"Node name for S&R": "ManualSigmas",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
"0.909375, 0.725, 0.421875, 0.0"
]
},
{
"id": 131,
"type": "PrimitiveInt",
"pos": [
-10.399004680767268,
4926.614637224299
],
"size": [
270,
82
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "INT",
"type": "INT",
"links": [
306,
315
]
}
],
"title": "INT: Frame Rate",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "PrimitiveInt",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
24,
"fixed"
]
},
{
"id": 128,
"type": "LTXVAudioVAEDecode",
"pos": [
3792.0481481830034,
4299.797177158513
],
"size": [
257.2388542190106,
46
],
"flags": {},
"order": 48,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 297
},
{
"label": "Audio VAE",
"name": "audio_vae",
"type": "VAE",
"link": 326
}
],
"outputs": [
{
"name": "Audio",
"type": "AUDIO",
"links": []
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "LTXVAudioVAEDecode",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": []
},
{
"id": 99,
"type": "LTXAVTextEncoderLoader",
"pos": [
37.989254913013944,
4138.954135935162
],
"size": [
325.4143077141439,
106
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"links": [
288,
289
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "LTXAVTextEncoderLoader",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65,
"models": [
{
"name": "ltx-2-19b-dev-fp8.safetensors",
"url": "https://huggingface.co/Lightricks/LTX-2/resolve/main/ltx-2-19b-dev-fp8.safetensors",
"directory": "checkpoints"
},
{
"name": "gemma_3_12B_it.safetensors",
"url": "https://huggingface.co/Comfy-Org/ltx-2/resolve/main/split_files/text_encoders/gemma_3_12B_it.safetensors",
"directory": "text_encoders"
}
]
},
"widgets_values": [
"gemma_3_12B_it_fp8_scaled.safetensors",
"LTX-2\\ltx-2-19b-dev-fp8.safetensors",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 137,
"type": "KSamplerSelect",
"pos": [
1328.7113717033576,
4286.285225429741
],
"size": [
270,
68.88020833333334
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "SAMPLER",
"type": "SAMPLER",
"links": [
261
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "KSamplerSelect",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
"euler"
]
},
{
"id": 116,
"type": "LTXVSeparateAVLatent",
"pos": [
1970.5497343481866,
4187.958891681814
],
"size": [
240,
46
],
"flags": {},
"order": 41,
"mode": 0,
"inputs": [
{
"name": "av_latent",
"type": "LATENT",
"link": 271
}
],
"outputs": [
{
"name": "video_latent",
"type": "LATENT",
"links": [
359
]
},
{
"name": "audio_latent",
"type": "LATENT",
"links": [
265
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.5.1",
"Node name for S&R": "LTXVSeparateAVLatent",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [],
"color": "#332922",
"bgcolor": "#593930"
},
{
"id": 101,
"type": "LatentUpscaleModelLoader",
"pos": [
1970.5497343481866,
4482.2684207951115
],
"size": [
279.7901046187276,
58
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "LATENT_UPSCALE_MODEL",
"type": "LATENT_UPSCALE_MODEL",
"links": [
360
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "LatentUpscaleModelLoader",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65,
"models": [
{
"name": "ltx-2-spatial-upscaler-x2-1.0.safetensors",
"url": "https://huggingface.co/Lightricks/LTX-2/resolve/main/ltx-2-spatial-upscaler-x2-1.0.safetensors",
"directory": "latent_upscale_models"
}
]
},
"widgets_values": [
"ltx-2-spatial-upscaler-x2-1.0.safetensors"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 117,
"type": "LTXVConcatAVLatent",
"pos": [
2882.609869170561,
4456.286246568414
],
"size": [
270,
46
],
"flags": {},
"order": 44,
"mode": 0,
"inputs": [
{
"name": "video_latent",
"type": "LATENT",
"link": 388
},
{
"name": "audio_latent",
"type": "LATENT",
"link": 265
}
],
"outputs": [
{
"name": "latent",
"type": "LATENT",
"links": [
276
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.5.1",
"Node name for S&R": "LTXVConcatAVLatent",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [],
"color": "#332922",
"bgcolor": "#593930"
},
{
"id": 110,
"type": "CLIPTextEncode",
"pos": [
429.8854122365001,
4225.4796800153135
],
"size": [
403.50317378836485,
117.09155367536096
],
"flags": {},
"order": 14,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 288
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
287
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "CLIPTextEncode",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
""
]
},
{
"id": 129,
"type": "CFGGuider",
"pos": [
1328.7113717033576,
4113.19520467289
],
"size": [
270,
106.66666666666667
],
"flags": {},
"order": 26,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 364
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 254
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 255
}
],
"outputs": [
{
"name": "GUIDER",
"type": "GUIDER",
"links": [
260
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.64",
"Node name for S&R": "CFGGuider",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
1
]
},
{
"id": 113,
"type": "SamplerCustomAdvanced",
"pos": [
1682.2951507877015,
4188.309385581295
],
"size": [
242.12760404770165,
106
],
"flags": {},
"order": 40,
"mode": 0,
"inputs": [
{
"name": "noise",
"type": "NOISE",
"link": 259
},
{
"name": "guider",
"type": "GUIDER",
"link": 260
},
{
"name": "sampler",
"type": "SAMPLER",
"link": 261
},
{
"name": "sigmas",
"type": "SIGMAS",
"link": 367
},
{
"name": "latent_image",
"type": "LATENT",
"link": 263
}
],
"outputs": [
{
"name": "output",
"type": "LATENT",
"links": [
271
]
},
{
"name": "denoised_output",
"type": "LATENT",
"links": []
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.60",
"Node name for S&R": "SamplerCustomAdvanced",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": []
},
{
"id": 109,
"type": "LTXVConcatAVLatent",
"pos": [
1328.7113717033576,
4594.012141943443
],
"size": [
270,
46
],
"flags": {},
"order": 39,
"mode": 0,
"inputs": [
{
"name": "video_latent",
"type": "LATENT",
"link": 384
},
{
"name": "audio_latent",
"type": "LATENT",
"link": 413
}
],
"outputs": [
{
"name": "latent",
"type": "LATENT",
"links": [
263
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "LTXVConcatAVLatent",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [],
"color": "#332922",
"bgcolor": "#593930"
},
{
"id": 164,
"type": "BasicScheduler",
"pos": [
1328.7113717033576,
4421.5887878532585
],
"size": [
270,
106
],
"flags": {},
"order": 23,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 368
}
],
"outputs": [
{
"name": "SIGMAS",
"type": "SIGMAS",
"links": [
367
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.9.1",
"Node name for S&R": "BasicScheduler"
},
"widgets_values": [
"simple",
8,
1
]
},
{
"id": 169,
"type": "ResizeImageMaskNode",
"pos": [
-3.6442069279673888,
4451.504639125714
],
"size": [
270,
106
],
"flags": {},
"order": 31,
"mode": 0,
"inputs": [
{
"name": "input",
"type": "IMAGE,MASK",
"link": 415
}
],
"outputs": [
{
"name": "resized",
"type": "IMAGE",
"links": [
371
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "ResizeImageMaskNode"
},
"widgets_values": [
"scale by multiplier",
0.5,
"area"
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 112,
"type": "PrimitiveInt",
"pos": [
-10.399004680767268,
4748.438305444043
],
"size": [
270,
82
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "INT",
"type": "INT",
"links": [
282,
292
]
}
],
"title": "INT: Length",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "PrimitiveInt",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
121,
"fixed"
]
},
{
"id": 141,
"type": "SimpleMath+",
"pos": [
296.7380095268876,
4814.283172945113
],
"size": [
210,
98
],
"flags": {},
"order": 13,
"mode": 0,
"inputs": [
{
"name": "a",
"shape": 7,
"type": "*",
"link": 315
},
{
"name": "b",
"shape": 7,
"type": "*",
"link": null
},
{
"name": "c",
"shape": 7,
"type": "*",
"link": null
}
],
"outputs": [
{
"name": "INT",
"type": "INT",
"links": null
},
{
"name": "FLOAT",
"type": "FLOAT",
"links": [
316
]
}
],
"properties": {
"cnr_id": "comfyui_essentials",
"ver": "9d9f4bedfc9f0321c19faf71855e228c93bd0dc9",
"Node name for S&R": "SimpleMath+"
},
"widgets_values": [
"a"
]
},
{
"id": 154,
"type": "Reroute",
"pos": [
883.103846226827,
3746.68342826367
],
"size": [
75,
26
],
"flags": {},
"order": 17,
"mode": 0,
"inputs": [
{
"name": "",
"type": "*",
"link": 342
}
],
"outputs": [
{
"name": "",
"type": "VAE",
"links": [
343,
385
]
}
],
"properties": {
"showOutputText": false,
"horizontal": false
}
},
{
"id": 161,
"type": "LTXVLatentUpsampler",
"pos": [
2293.8710660266784,
4465.248811234943
],
"size": [
223.3783852709311,
66
],
"flags": {},
"order": 42,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 359
},
{
"name": "upscale_model",
"type": "LATENT_UPSCALE_MODEL",
"link": 360
},
{
"name": "vae",
"type": "VAE",
"link": 363
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
386
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.9.1",
"Node name for S&R": "LTXVLatentUpsampler"
},
"widgets_values": []
},
{
"id": 143,
"type": "Reroute",
"pos": [
2135.5497343481848,
3746.3329343641885
],
"size": [
75,
26
],
"flags": {},
"order": 24,
"mode": 0,
"inputs": [
{
"name": "",
"type": "*",
"link": 343
}
],
"outputs": [
{
"name": "",
"type": "VAE",
"links": [
324,
363,
387
]
}
],
"properties": {
"showOutputText": false,
"horizontal": false
}
},
{
"id": 173,
"type": "LTXVImgToVideoInplace",
"pos": [
2560.0346079247215,
4425.721369465231
],
"size": [
279.7901046187276,
122
],
"flags": {},
"order": 43,
"mode": 0,
"inputs": [
{
"name": "vae",
"type": "VAE",
"link": 387
},
{
"name": "image",
"type": "IMAGE",
"link": 376
},
{
"name": "latent",
"type": "LATENT",
"link": 386
}
],
"outputs": [
{
"name": "latent",
"type": "LATENT",
"links": [
388
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "LTXVImgToVideoInplace"
},
"widgets_values": [
1,
false
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 138,
"type": "KSamplerSelect",
"pos": [
2882.609869170561,
4213.725371725666
],
"size": [
270,
58
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "SAMPLER",
"type": "SAMPLER",
"links": [
274
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.75",
"Node name for S&R": "KSamplerSelect",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
"euler"
]
},
{
"id": 170,
"type": "LTXVPreprocess",
"pos": [
561.5017024022684,
4653.923090349445
],
"size": [
270,
58
],
"flags": {
"collapsed": false
},
"order": 35,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 373
}
],
"outputs": [
{
"name": "output_image",
"type": "IMAGE",
"links": [
374,
377
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.60",
"Node name for S&R": "LTXVPreprocess"
},
"widgets_values": [
33
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 174,
"type": "Reroute",
"pos": [
2443.4014126180414,
4654.383490442259
],
"size": [
75,
26
],
"flags": {},
"order": 37,
"mode": 0,
"inputs": [
{
"name": "",
"type": "*",
"link": 377
}
],
"outputs": [
{
"name": "",
"type": "IMAGE",
"links": [
376
]
}
],
"properties": {
"showOutputText": false,
"horizontal": false
}
},
{
"id": 171,
"type": "LTXVImgToVideoInplace",
"pos": [
943.5810456173188,
4486.879108422938
],
"size": [
270,
122
],
"flags": {},
"order": 38,
"mode": 0,
"inputs": [
{
"name": "vae",
"type": "VAE",
"link": 385
},
{
"name": "image",
"type": "IMAGE",
"link": 374
},
{
"name": "latent",
"type": "LATENT",
"link": 381
}
],
"outputs": [
{
"name": "latent",
"type": "LATENT",
"links": [
384
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "LTXVImgToVideoInplace"
},
"widgets_values": [
1,
false
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 119,
"type": "SamplerCustomAdvanced",
"pos": [
3217.8124312949008,
4176.1423033096735
],
"size": [
237.86096495408756,
106
],
"flags": {},
"order": 45,
"mode": 0,
"inputs": [
{
"name": "noise",
"type": "NOISE",
"link": 272
},
{
"name": "guider",
"type": "GUIDER",
"link": 273
},
{
"name": "sampler",
"type": "SAMPLER",
"link": 274
},
{
"name": "sigmas",
"type": "SIGMAS",
"link": 275
},
{
"name": "latent_image",
"type": "LATENT",
"link": 276
}
],
"outputs": [
{
"name": "output",
"type": "LATENT",
"links": []
},
{
"name": "denoised_output",
"type": "LATENT",
"links": [
299
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.75",
"Node name for S&R": "SamplerCustomAdvanced",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": []
},
{
"id": 151,
"type": "MarkdownNote",
"pos": [
57.44161655463637,
3561.647385437717
],
"size": [
399.0254035325611,
339.2647673465967
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n - checkpoints\n - [ltx-2-19b-dev-fp8.safetensors](https://huggingface.co/Lightricks/LTX-2/blob/main/ltx-2-19b-dev-fp8.safetensors)\n - latent_upscale_models\n - [ltx-2-spatial-upscaler-x2-1.0.safetensors](https://huggingface.co/Lightricks/LTX-2/blob/main/ltx-2-spatial-upscaler-x2-1.0.safetensors)\n - loras\n - [ltx-2-19b-distilled-lora-384.safetensors](https://huggingface.co/Lightricks/LTX-2/blob/main/ltx-2-19b-distilled-lora-384.safetensors)\n - text_encoders\n - [gemma_3_12B_it_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/ltx-2/blob/main/split_files/text_encoders/gemma_3_12B_it_fp8_scaled.safetensors)\n\n```text\n📂ComfyUI/\n└── 📂models/\n ├── 📂checkpoints/\n │ └── ltx-2-19b-dev-fp8.safetensors\n ├── 📂latent_upscale_models/\n │ └── ltx-2-spatial-upscaler-x2-1.0.safetensors\n ├── 📂loras/\n │ └── ltx-2-19b-distilled-lora-384.safetensors\n └── 📂text_encoders/\n └── gemma_3_12B_it_fp8_scaled.safetensors\n"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 125,
"type": "LTXVSeparateAVLatent",
"pos": [
3501.110323035425,
4199.1048351621475
],
"size": [
237.68443744811694,
46
],
"flags": {},
"order": 46,
"mode": 0,
"inputs": [
{
"name": "av_latent",
"type": "LATENT",
"link": 299
}
],
"outputs": [
{
"name": "video_latent",
"type": "LATENT",
"links": [
302
]
},
{
"name": "audio_latent",
"type": "LATENT",
"links": [
297
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.5.1",
"Node name for S&R": "LTXVSeparateAVLatent",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [],
"color": "#332922",
"bgcolor": "#593930"
},
{
"id": 160,
"type": "Reroute",
"pos": [
2764.6860939776006,
3633.4523874609713
],
"size": [
75,
26
],
"flags": {},
"order": 22,
"mode": 0,
"inputs": [
{
"name": "",
"type": "*",
"link": 365
}
],
"outputs": [
{
"name": "",
"type": "MODEL",
"links": [
366
]
}
],
"properties": {
"showOutputText": false,
"horizontal": false
}
},
{
"id": 134,
"type": "LoraLoaderModelOnly",
"pos": [
884.245410818498,
3633.802881360453
],
"size": [
350.9069033720766,
82
],
"flags": {},
"order": 16,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 331
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
364,
365,
368
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.75",
"Node name for S&R": "LoraLoaderModelOnly",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65,
"models": [
{
"name": "ltx-2-19b-distilled-lora-384.safetensors",
"url": "https://huggingface.co/Lightricks/LTX-2/resolve/main/ltx-2-19b-distilled-lora-384.safetensors",
"directory": "loras"
}
]
},
"widgets_values": [
"LTX-2\\ltx-2-19b-distilled-lora-384.safetensors",
0.7
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 133,
"type": "CheckpointLoaderSimple",
"pos": [
482.4816826527883,
3633.802881360453
],
"size": [
350.9069033720766,
98
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
331
]
},
{
"name": "CLIP",
"type": "CLIP",
"links": []
},
{
"name": "VAE",
"type": "VAE",
"links": [
342
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "CheckpointLoaderSimple",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65,
"models": [
{
"name": "ltx-2-19b-dev-fp8.safetensors",
"url": "https://huggingface.co/Lightricks/LTX-2/resolve/main/ltx-2-19b-dev-fp8.safetensors",
"directory": "checkpoints"
}
]
},
"widgets_values": [
"LTX-2\\ltx-2-19b-dev-fp8.safetensors"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 145,
"type": "Reroute",
"pos": [
3664.65243485745,
3810.115867718167
],
"size": [
75,
26
],
"flags": {},
"order": 19,
"mode": 0,
"inputs": [
{
"name": "",
"type": "*",
"link": 327
}
],
"outputs": [
{
"name": "",
"type": "VAE",
"links": [
326
]
}
],
"properties": {
"showOutputText": false,
"horizontal": false
}
},
{
"id": 103,
"type": "CFGGuider",
"pos": [
2882.609869170561,
4052.210214695361
],
"size": [
270,
98
],
"flags": {},
"order": 27,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 366
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 349
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 350
}
],
"outputs": [
{
"name": "GUIDER",
"type": "GUIDER",
"links": [
273
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.71",
"Node name for S&R": "CFGGuider",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
1
]
},
{
"id": 107,
"type": "LTXVConditioning",
"pos": [
943.3923572550591,
4079.8906519129855
],
"size": [
270,
86.66666666666667
],
"flags": {},
"order": 21,
"mode": 0,
"inputs": [
{
"name": "positive",
"type": "CONDITIONING",
"link": 286
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 287
},
{
"name": "frame_rate",
"type": "FLOAT",
"widget": {
"name": "frame_rate"
},
"link": 316
}
],
"outputs": [
{
"name": "positive",
"type": "CONDITIONING",
"links": [
254,
349
]
},
{
"name": "negative",
"type": "CONDITIONING",
"links": [
255,
350
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "LTXVConditioning",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
25
]
},
{
"id": 124,
"type": "LTXVAudioVAELoader",
"pos": [
482.4816826527883,
3810.115867718167
],
"size": [
350.9069033720766,
58
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "Audio VAE",
"type": "VAE",
"links": [
281,
327
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.68",
"Node name for S&R": "LTXVAudioVAELoader",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65,
"models": [
{
"name": "ltx-2-19b-dev-fp8.safetensors",
"url": "https://huggingface.co/Lightricks/LTX-2/resolve/main/ltx-2-19b-dev-fp8.safetensors",
"directory": "checkpoints"
}
]
},
"widgets_values": [
"LTX-2\\ltx-2-19b-dev-fp8.safetensors"
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 121,
"type": "CLIPTextEncode",
"pos": [
429.8854122365001,
3982.090817803126
],
"size": [
403.50317378836485,
178.09168459401417
],
"flags": {},
"order": 15,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 289
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
286
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "CLIPTextEncode",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
"A low-angle, vertical smartphone-style shot of an orange tabby cat walking straight toward the camera on a rough gravel path beside a modern black vertical-slat fence and light concrete wall. Natural daylight, soft shadows, realistic colors, slight handheld micro-movement, shallow depth of field focused on the cat’s face as it approaches with tail raised. As the cat reaches near the camera, a woman enters from screen right, steps into frame, crouches beside the cat, and gently picks it up by supporting its chest and hindquarters. She lifts the cat to a comfortable hold against her torso, and the cat remains calm, looking around while being held. Continuous shot, steady pacing, no cuts.\n"
]
},
{
"id": 106,
"type": "LTXVEmptyLatentAudio",
"pos": [
561.5017024022684,
4881.993543004677
],
"size": [
270,
120
],
"flags": {},
"order": 18,
"mode": 0,
"inputs": [
{
"name": "audio_vae",
"type": "VAE",
"link": 281
},
{
"name": "frames_number",
"type": "INT",
"widget": {
"name": "frames_number"
},
"link": 282
},
{
"name": "frame_rate",
"type": "INT",
"widget": {
"name": "frame_rate"
},
"link": 306
}
],
"outputs": [
{
"name": "Latent",
"type": "LATENT",
"links": [
413
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.68",
"Node name for S&R": "LTXVEmptyLatentAudio",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
97,
25,
1
]
},
{
"id": 115,
"type": "RandomNoise",
"pos": [
1328.7113717033576,
3964.7718505827065
],
"size": [
270,
82
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "NOISE",
"type": "NOISE",
"links": [
259
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "RandomNoise",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
12345,
"fixed"
]
},
{
"id": 114,
"type": "RandomNoise",
"pos": [
2882.609869170561,
3906.695057665063
],
"size": [
270,
82
],
"flags": {},
"order": 11,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "NOISE",
"type": "NOISE",
"links": [
272
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.75",
"Node name for S&R": "RandomNoise",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
12345,
"fixed"
]
},
{
"id": 168,
"type": "GetImageSize",
"pos": [
296.7380095268876,
4451.504639125714
],
"size": [
210,
136
],
"flags": {},
"order": 34,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 371
}
],
"outputs": [
{
"name": "width",
"type": "INT",
"links": [
382
]
},
{
"name": "height",
"type": "INT",
"links": [
383
]
},
{
"name": "batch_size",
"type": "INT",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "GetImageSize"
},
"widgets_values": [
"width: 448, height: 832\n batch size: 25"
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 108,
"type": "EmptyLTXVLatentVideo",
"pos": [
561.5017024022684,
4429.181100445509
],
"size": [
270,
146.66666666666669
],
"flags": {},
"order": 36,
"mode": 0,
"inputs": [
{
"name": "width",
"type": "INT",
"widget": {
"name": "width"
},
"link": 382
},
{
"name": "height",
"type": "INT",
"widget": {
"name": "height"
},
"link": 383
},
{
"name": "length",
"type": "INT",
"widget": {
"name": "length"
},
"link": 292
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
381
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.60",
"Node name for S&R": "EmptyLTXVLatentVideo",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
704,
512,
97,
1
]
},
{
"id": 180,
"type": "GetImageRangeFromBatch",
"pos": [
-348.192985902192,
4451.504639125714
],
"size": [
313.4466174618275,
102
],
"flags": {},
"order": 29,
"mode": 0,
"inputs": [
{
"name": "images",
"shape": 7,
"type": "IMAGE",
"link": 417
},
{
"name": "masks",
"shape": 7,
"type": "MASK",
"link": null
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
415,
416
]
},
{
"name": "MASK",
"type": "MASK",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-kjnodes",
"ver": "4dfb85dcc52e4315c33170d97bb987baa46d128b",
"Node name for S&R": "GetImageRangeFromBatch"
},
"widgets_values": [
-1,
25
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 165,
"type": "ResizeImageMaskNode",
"pos": [
-927,
4451.504639125714
],
"size": [
258.3013455365069,
106
],
"flags": {},
"order": 20,
"mode": 0,
"inputs": [
{
"name": "input",
"type": "IMAGE,MASK",
"link": 418
}
],
"outputs": [
{
"name": "resized",
"type": "IMAGE",
"links": [
378
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "ResizeImageMaskNode"
},
"widgets_values": [
"scale total pixels",
1.5,
"area"
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 172,
"type": "Reroute",
"pos": [
-3.6442069279675025,
4652.781525757774
],
"size": [
75,
26
],
"flags": {},
"order": 32,
"mode": 0,
"inputs": [
{
"name": "",
"type": "*",
"link": 416
}
],
"outputs": [
{
"name": "",
"type": "IMAGE",
"links": [
373
]
}
],
"properties": {
"showOutputText": false,
"horizontal": false
}
},
{
"id": 127,
"type": "VAEDecodeTiled",
"pos": [
3792.0481481830034,
4074.5664863706947
],
"size": [
257.2388542190106,
150
],
"flags": {},
"order": 47,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 302
},
{
"name": "vae",
"type": "VAE",
"link": 325
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
313,
421
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "VAEDecodeTiled",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
512,
64,
4096,
8
]
},
{
"id": 177,
"type": "VHS_LoadVideo",
"pos": [
-1264.4156462722526,
4201.0535315294655
],
"size": [
296.6746972124074,
818.3484766220628
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
418
]
},
{
"name": "frame_count",
"type": "INT",
"links": null
},
{
"name": "audio",
"type": "AUDIO",
"links": []
},
{
"name": "video_info",
"type": "VHS_VIDEOINFO",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "8923bd836bdab8b7bbdf4ed104b7d045e70c66e2",
"Node name for S&R": "VHS_LoadVideo"
},
"widgets_values": {
"video": "12503985_2160_3840_30fps.mp4",
"force_rate": 24,
"custom_width": 0,
"custom_height": 0,
"frame_load_cap": 0,
"skip_first_frames": 0,
"select_every_nth": 1,
"format": "None",
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "12503985_2160_3840_30fps.mp4",
"type": "input",
"format": "video/mp4",
"force_rate": 24,
"custom_width": 0,
"custom_height": 0,
"frame_load_cap": 0,
"skip_first_frames": 0,
"select_every_nth": 1
}
}
},
"color": "#232",
"bgcolor": "#353"
},
{
"id": 190,
"type": "Reroute",
"pos": [
-348.192985902192,
5085.861751952041
],
"size": [
75,
26
],
"flags": {},
"order": 30,
"mode": 0,
"inputs": [
{
"name": "",
"type": "*",
"link": 420
}
],
"outputs": [
{
"name": "",
"type": "IMAGE",
"links": [
427
]
}
],
"properties": {
"showOutputText": false,
"horizontal": false
}
},
{
"id": 192,
"type": "BatchImagesNode",
"pos": [
5136.616886225551,
4074.5664863706947
],
"size": [
169.23046875,
66
],
"flags": {},
"order": 51,
"mode": 0,
"inputs": [
{
"label": "image0",
"name": "images.image0",
"type": "IMAGE",
"link": 428
},
{
"label": "image1",
"name": "images.image1",
"type": "IMAGE",
"link": 425
},
{
"label": "image2",
"name": "images.image2",
"shape": 7,
"type": "IMAGE",
"link": null
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
426
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.9.2",
"Node name for S&R": "BatchImagesNode"
}
},
{
"id": 193,
"type": "VHS_VideoCombine",
"pos": [
5347.391694466165,
3661.81066002572
],
"size": [
554.3878616024967,
1338.8997713113658
],
"flags": {},
"order": 52,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 426
},
{
"name": "audio",
"shape": 7,
"type": "AUDIO",
"link": null
},
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "Filenames",
"type": "VHS_FILENAMES",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "8923bd836bdab8b7bbdf4ed104b7d045e70c66e2",
"Node name for S&R": "VHS_VideoCombine"
},
"widgets_values": {
"frame_rate": 24,
"loop_count": 0,
"filename_prefix": "LTX-2",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "LTX-2_00425.mp4",
"subfolder": "",
"type": "output",
"format": "video/h264-mp4",
"frame_rate": 24,
"workflow": "LTX-2_00425.png",
"fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\LTX-2_00425.mp4"
}
}
}
},
{
"id": 194,
"type": "Reroute",
"pos": [
5037.219836784776,
5085.861751952041
],
"size": [
75,
26
],
"flags": {},
"order": 33,
"mode": 0,
"inputs": [
{
"name": "",
"type": "*",
"link": 427
}
],
"outputs": [
{
"name": "",
"type": "IMAGE",
"links": [
428
]
}
],
"properties": {
"showOutputText": false,
"horizontal": false
}
},
{
"id": 191,
"type": "ImageFromBatch",
"pos": [
4902.219836784776,
4074.5664863706947
],
"size": [
210,
82
],
"flags": {},
"order": 50,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 421
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
425
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.9.2",
"Node name for S&R": "ImageFromBatch"
},
"widgets_values": [
25,
4096
]
},
{
"id": 140,
"type": "VHS_VideoCombine",
"pos": [
4150.269910132865,
3651.3945769533184
],
"size": [
559.3465392884473,
1349.6435729642594
],
"flags": {},
"order": 49,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 313
},
{
"name": "audio",
"shape": 7,
"type": "AUDIO",
"link": null
},
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "Filenames",
"type": "VHS_FILENAMES",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "8923bd836bdab8b7bbdf4ed104b7d045e70c66e2",
"Node name for S&R": "VHS_VideoCombine"
},
"widgets_values": {
"frame_rate": 24,
"loop_count": 0,
"filename_prefix": "LTX-2",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "LTX-2_00424.mp4",
"subfolder": "",
"type": "output",
"format": "video/h264-mp4",
"frame_rate": 24,
"workflow": "LTX-2_00424.png",
"fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\LTX-2_00424.mp4"
}
}
}
},
{
"id": 175,
"type": "ResizeImageMaskNode",
"pos": [
-637.596492951096,
4451.504639125714
],
"size": [
258.3013455365069,
106
],
"flags": {},
"order": 25,
"mode": 0,
"inputs": [
{
"name": "input",
"type": "IMAGE,MASK",
"link": 378
}
],
"outputs": [
{
"name": "resized",
"type": "IMAGE",
"links": [
417,
420
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.9.1",
"Node name for S&R": "ResizeImageMaskNode"
},
"widgets_values": [
"scale to multiple",
64,
"area"
],
"color": "#232",
"bgcolor": "#353"
}
],
"links": [
[
254,
107,
0,
129,
1,
"CONDITIONING"
],
[
255,
107,
1,
129,
2,
"CONDITIONING"
],
[
259,
115,
0,
113,
0,
"NOISE"
],
[
260,
129,
0,
113,
1,
"GUIDER"
],
[
261,
137,
0,
113,
2,
"SAMPLER"
],
[
263,
109,
0,
113,
4,
"LATENT"
],
[
265,
116,
1,
117,
1,
"LATENT"
],
[
271,
113,
0,
116,
0,
"LATENT"
],
[
272,
114,
0,
119,
0,
"NOISE"
],
[
273,
103,
0,
119,
1,
"GUIDER"
],
[
274,
138,
0,
119,
2,
"SAMPLER"
],
[
275,
100,
0,
119,
3,
"SIGMAS"
],
[
276,
117,
0,
119,
4,
"LATENT"
],
[
281,
124,
0,
106,
0,
"VAE"
],
[
282,
112,
0,
106,
1,
"INT"
],
[
286,
121,
0,
107,
0,
"CONDITIONING"
],
[
287,
110,
0,
107,
1,
"CONDITIONING"
],
[
288,
99,
0,
110,
0,
"CLIP"
],
[
289,
99,
0,
121,
0,
"CLIP"
],
[
292,
112,
0,
108,
2,
"INT"
],
[
297,
125,
1,
128,
0,
"LATENT"
],
[
299,
119,
1,
125,
0,
"LATENT"
],
[
302,
125,
0,
127,
0,
"LATENT"
],
[
306,
131,
0,
106,
2,
"INT"
],
[
313,
127,
0,
140,
0,
"IMAGE"
],
[
315,
131,
0,
141,
0,
"INT"
],
[
316,
141,
1,
107,
2,
"FLOAT"
],
[
324,
143,
0,
144,
0,
"VAE"
],
[
325,
144,
0,
127,
1,
"VAE"
],
[
326,
145,
0,
128,
1,
"VAE"
],
[
327,
124,
0,
145,
0,
"VAE"
],
[
331,
133,
0,
134,
0,
"MODEL"
],
[
342,
133,
2,
154,
0,
"VAE"
],
[
343,
154,
0,
143,
0,
"VAE"
],
[
349,
107,
0,
103,
1,
"CONDITIONING"
],
[
350,
107,
1,
103,
2,
"CONDITIONING"
],
[
359,
116,
0,
161,
0,
"LATENT"
],
[
360,
101,
0,
161,
1,
"LATENT_UPSCALE_MODEL"
],
[
363,
143,
0,
161,
2,
"VAE"
],
[
364,
134,
0,
129,
0,
"MODEL"
],
[
365,
134,
0,
160,
0,
"MODEL"
],
[
366,
160,
0,
103,
0,
"MODEL"
],
[
367,
164,
0,
113,
3,
"SIGMAS"
],
[
368,
134,
0,
164,
0,
"MODEL"
],
[
371,
169,
0,
168,
0,
"IMAGE"
],
[
373,
172,
0,
170,
0,
"IMAGE"
],
[
374,
170,
0,
171,
1,
"IMAGE"
],
[
376,
174,
0,
173,
1,
"IMAGE"
],
[
377,
170,
0,
174,
0,
"IMAGE"
],
[
378,
165,
0,
175,
0,
"IMAGE"
],
[
381,
108,
0,
171,
2,
"LATENT"
],
[
382,
168,
0,
108,
0,
"INT"
],
[
383,
168,
1,
108,
1,
"INT"
],
[
384,
171,
0,
109,
0,
"LATENT"
],
[
385,
154,
0,
171,
0,
"VAE"
],
[
386,
161,
0,
173,
2,
"LATENT"
],
[
387,
143,
0,
173,
0,
"VAE"
],
[
388,
173,
0,
117,
0,
"LATENT"
],
[
413,
106,
0,
109,
1,
"LATENT"
],
[
415,
180,
0,
169,
0,
"IMAGE"
],
[
416,
180,
0,
172,
0,
"IMAGE"
],
[
417,
175,
0,
180,
0,
"IMAGE"
],
[
418,
177,
0,
165,
0,
"IMAGE"
],
[
420,
175,
0,
190,
0,
"IMAGE"
],
[
421,
127,
0,
191,
0,
"IMAGE"
],
[
425,
191,
0,
192,
1,
"IMAGE"
],
[
426,
192,
0,
193,
0,
"IMAGE"
],
[
427,
190,
0,
194,
0,
"IMAGE"
],
[
428,
194,
0,
192,
0,
"IMAGE"
]
],
"groups": [
{
"id": 15,
"title": "Upscale",
"bounding": [
1958.6615745559297,
3478.328645861585,
1509.2741025731757,
1566.0504149089452
],
"color": "#8AA",
"font_size": 24,
"flags": {}
},
{
"id": 16,
"title": "Decode",
"bounding": [
3484.579993342886,
3478.9993909455034,
1373.2340850148612,
1565.6946883230557
],
"color": "#8A8",
"font_size": 24,
"flags": {}
},
{
"id": 18,
"title": "Extension",
"bounding": [
-1377.030123851723,
3478.606859313507,
3317.65555186646,
1566.5752780297244
],
"color": "#3f789e",
"font_size": 24,
"flags": {}
},
{
"id": 19,
"title": "Concat",
"bounding": [
4870.845772129356,
3478.930607094412,
1073.8305882917366,
1564.2842204876033
],
"color": "#88A",
"font_size": 24,
"flags": {}
}
],
"config": {},
"extra": {
"ds": {
"scale": 0.3186308177103573,
"offset": [
1364.4156462722526,
-3461.647385437717
]
},
"frontendVersion": "1.38.2",
"workflowRendererVersion": "LG",
"prompt": {
"1": {
"inputs": {
"ckpt_name": "ltx-av-step-1751000_vocoder_24K.safetensors"
},
"class_type": "CheckpointLoaderSimple",
"_meta": {
"title": "Load Checkpoint"
}
},
"2": {
"inputs": {
"gemma_path": "gemma-3-12b-it-qat-q4_0-unquantized_readout_proj/model/model.safetensors",
"ltxv_path": "ltx-av-step-1751000_vocoder_24K.safetensors",
"max_length": 1024
},
"class_type": "LTXVGemmaCLIPModelLoader",
"_meta": {
"title": "🅛🅣🅧 Gemma 3 Model Loader"
}
},
"3": {
"inputs": {
"text": "A medium close-up shot features a Caucasian man with a closely shaven head and face, wearing a black baseball cap with \"PNTR\" in white letters on the front, and a dark grey t-shirt with \"JUST DO IT\" visible across his chest. A small black microphone is clipped to his shirt collar. He is positioned slightly to the left of the frame, looking intently downwards and to his right, his eyes focused off-camera. His facial expression is one of deep concentration, with his brow slightly furrowed. As he looks down, a quick sniff sound is heard, and then he speaks with a deep male voice and a slightly frustrated tone, saying, \"I think it's so bad.\" The camera remains static throughout, maintaining a shallow depth of field, which keeps the man in sharp focus while the background is softly blurred, revealing a light-colored wall with white wooden shelving or trim, and a partially open white wooden door on the right. After a brief pause, another short, audible sniff is heard. The man then continues to speak, his voice maintaining the same quality, as he states, \"So bad. So bad.\" He elaborates further, emphasizing his point with a final statement, \"This got to be, it's got to be the worst tool I've ever seen.\"",
"clip": [
"2",
0
]
},
"class_type": "CLIPTextEncode",
"_meta": {
"title": "CLIP Text Encode (Prompt)"
}
},
"4": {
"inputs": {
"text": "blurry, out of focus, overexposed, underexposed, low contrast, washed out colors, excessive noise, grainy texture, poor lighting, flickering, motion blur, distorted proportions, unnatural skin tones, deformed facial features, asymmetrical face, missing facial features, extra limbs, disfigured hands, wrong hand count, artifacts around text, unreadable text on shirt or hat, incorrect lettering on cap (“PNTR”), incorrect t-shirt slogan (“JUST DO IT”), missing microphone, misplaced microphone, inconsistent perspective, camera shake, incorrect depth of field, background too sharp, background clutter, distracting reflections, harsh shadows, inconsistent lighting direction, color banding, cartoonish rendering, 3D CGI look, unrealistic materials, uncanny valley effect, incorrect ethnicity, wrong gender, exaggerated expressions, smiling, laughing, exaggerated sadness, wrong gaze direction, eyes looking at camera, mismatched lip sync, silent or muted audio, distorted voice, robotic voice, echo, background noise, off-sync audio, missing sniff sounds, incorrect dialogue, added dialogue, repetitive speech, jittery movement, awkward pauses, incorrect timing, unnatural transitions, inconsistent framing, tilted camera, missing door or shelves, missing shallow depth of field, flat lighting, inconsistent tone, cinematic oversaturation, stylized filters, or AI artifacts.",
"clip": [
"2",
0
]
},
"class_type": "CLIPTextEncode",
"_meta": {
"title": "CLIP Text Encode (Prompt)"
}
},
"8": {
"inputs": {
"sampler_name": "euler"
},
"class_type": "KSamplerSelect",
"_meta": {
"title": "KSamplerSelect"
}
},
"9": {
"inputs": {
"steps": 20,
"max_shift": 2.05,
"base_shift": 0.95,
"stretch": true,
"terminal": 0.1,
"latent": [
"28",
0
]
},
"class_type": "LTXVScheduler",
"_meta": {
"title": "LTXVScheduler"
}
},
"11": {
"inputs": {
"noise_seed": 10
},
"class_type": "RandomNoise",
"_meta": {
"title": "RandomNoise"
}
},
"12": {
"inputs": {
"samples": [
"29",
0
],
"vae": [
"1",
2
]
},
"class_type": "VAEDecode",
"_meta": {
"title": "VAE Decode"
}
},
"13": {
"inputs": {
"ckpt_name": "ltx-av-step-1751000_vocoder_24K.safetensors"
},
"class_type": "LTXVAudioVAELoader",
"_meta": {
"title": "🅛🅣🅧 LTXV Audio VAE Loader"
}
},
"14": {
"inputs": {
"samples": [
"29",
1
],
"audio_vae": [
"13",
0
]
},
"class_type": "LTXVAudioVAEDecode",
"_meta": {
"title": "🅛🅣🅧 LTXV Audio VAE Decode"
}
},
"15": {
"inputs": {
"frame_rate": [
"23",
0
],
"loop_count": 0,
"filename_prefix": "AnimateDiff",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"images": [
"12",
0
],
"audio": [
"14",
0
]
},
"class_type": "VHS_VideoCombine",
"_meta": {
"title": "Video Combine 🎥🅥🅗🅢"
}
},
"17": {
"inputs": {
"skip_blocks": "29",
"model": [
"28",
1
],
"positive": [
"22",
0
],
"negative": [
"22",
1
],
"parameters": [
"18",
0
]
},
"class_type": "MultimodalGuider",
"_meta": {
"title": "🅛🅣🅧 Multimodal Guider"
}
},
"18": {
"inputs": {
"modality": "VIDEO",
"cfg": 3,
"stg": 0,
"rescale": 0,
"modality_scale": 3,
"parameters": [
"19",
0
]
},
"class_type": "GuiderParameters",
"_meta": {
"title": "🅛🅣🅧 Guider Parameters"
}
},
"19": {
"inputs": {
"modality": "AUDIO",
"cfg": 7,
"stg": 0,
"rescale": 0,
"modality_scale": 3
},
"class_type": "GuiderParameters",
"_meta": {
"title": "🅛🅣🅧 Guider Parameters"
}
},
"21": {
"inputs": {
"audioUI": "",
"audio": [
"14",
0
]
},
"class_type": "PreviewAudio",
"_meta": {
"title": "PreviewAudio"
}
},
"22": {
"inputs": {
"frame_rate": [
"23",
0
],
"positive": [
"3",
0
],
"negative": [
"4",
0
]
},
"class_type": "LTXVConditioning",
"_meta": {
"title": "LTXVConditioning"
}
},
"23": {
"inputs": {
"value": 25
},
"class_type": "FloatConstant",
"_meta": {
"title": "Float Constant"
}
},
"26": {
"inputs": {
"frames_number": [
"27",
0
],
"frame_rate": [
"42",
0
],
"batch_size": 1
},
"class_type": "LTXVEmptyLatentAudio",
"_meta": {
"title": "🅛🅣🅧 LTXV Empty Latent Audio"
}
},
"27": {
"inputs": {
"value": 105
},
"class_type": "INTConstant",
"_meta": {
"title": "INT Constant"
}
},
"28": {
"inputs": {
"video_latent": [
"43",
0
],
"audio_latent": [
"26",
0
],
"model": [
"44",
0
]
},
"class_type": "LTXVConcatAVLatent",
"_meta": {
"title": "🅛🅣🅧 LTXV Concat AV Latent"
}
},
"29": {
"inputs": {
"av_latent": [
"41",
0
],
"model": [
"28",
1
]
},
"class_type": "LTXVSeparateAVLatent",
"_meta": {
"title": "🅛🅣🅧 LTXV Separate AV Latent"
}
},
"41": {
"inputs": {
"noise": [
"11",
0
],
"guider": [
"17",
0
],
"sampler": [
"8",
0
],
"sigmas": [
"9",
0
],
"latent_image": [
"28",
0
]
},
"class_type": "SamplerCustomAdvanced",
"_meta": {
"title": "SamplerCustomAdvanced"
}
},
"42": {
"inputs": {
"a": [
"23",
0
]
},
"class_type": "CM_FloatToInt",
"_meta": {
"title": "FloatToInt"
}
},
"43": {
"inputs": {
"width": 768,
"height": 512,
"length": [
"27",
0
],
"batch_size": 1
},
"class_type": "EmptyLTXVLatentVideo",
"_meta": {
"title": "EmptyLTXVLatentVideo"
}
},
"44": {
"inputs": {
"torch_compile": true,
"disable_backup": false,
"model": [
"1",
0
]
},
"class_type": "LTXVSequenceParallelMultiGPUPatcher",
"_meta": {
"title": "LTXVSequenceParallelMultiGPUPatcher"
}
},
"45": {
"inputs": {
"frame_idx": 0,
"strength": 1
},
"class_type": "LTXVAddGuide",
"_meta": {
"title": "LTXVAddGuide"
}
}
},
"comfy_fork_version": "feature/av_inference@a6994ed1",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
It takes the last few frames of the input video and generates the continuation.
(1) Input video (= image batch)
[ 🖼️ 🖼️ 🖼️ 🖼️ ... 🖼️ 🖼️ 🖼️ ]
(2) Take N frames from the end (N = 8n+1)
[ 🖼️ 🖼️ 🖼️ 🖼️... 🖼️ 🖼️ 🖼️ ]
└─── N ───┘
(3) Create a 121-frame slot and overwrite the beginning with the N frames
[ 🖼️ 🖼️ 🖼️ 🌫️ 🌫️ 🌫️ ... 🌫️ ]
└── N ──┘
(4) Generate the remaining 121 - N frames to make the continuation
[ 🖼️ 🖼️ 🖼️ ✨ ✨ ✨ ... ✨ ]
(5) Delete the first N frames (as they duplicate the end of the original video)
[ ✨ ✨ ✨ ... ✨ ]
(6) Concatenate original video + continuation
[ 🖼️ 🖼️ 🖼️ ... 🖼️] + [ ✨ ✨ ✨ ... ✨ ]
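The six steps above can be sketched in plain Python, with list items standing in for frames. This is only an illustration of the bookkeeping, not the workflow itself: `extend_video` and `fake_generate` are hypothetical names, and `fake_generate` merely labels the new frames where the real workflow would run the LTX-2 sampler.

```python
def extend_video(original, n, generate, slot_len=121):
    """Return original + continuation, following steps (1)-(6).

    `generate` is a hypothetical stand-in for the LTX-2 sampling stage: it
    receives the slot and n, and must return a list of the same length
    whose first n entries are unchanged.
    """
    if n < 1 or (n - 1) % 8 != 0:
        raise ValueError("n must have the form 8k+1")
    if n > len(original) or n >= slot_len:
        raise ValueError("n must fit inside both the input video and the slot")

    tail = original[-n:]                       # (2) last N frames
    slot = tail + [None] * (slot_len - n)      # (3) N known frames + empty rest
    filled = generate(slot, n)                 # (4) fill the empty part
    continuation = filled[n:]                  # (5) drop the duplicated N frames
    return original + continuation             # (6) concatenate


def fake_generate(slot, n):
    """Placeholder generator: labels each new frame instead of sampling."""
    return slot[:n] + [f"new{i}" for i in range(len(slot) - n)]


original = [f"src{i}" for i in range(40)]
extended = extend_video(original, n=25, generate=fake_generate)
print(len(extended))  # 40 original + (121 - 25) generated frames
```

The original frames are never modified; only the generated batch is trimmed, so the seam falls exactly at the last frame of the input video.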
1. Get End Image Batch
Get the image batch that serves as the connector from the end of the input video.
- Enter an arbitrary number in `num_frames` of `Get Image or Mask Range From Batch` (must be 8n+1).
- Increasing N makes it easier to inherit the movement and atmosphere of the original video.
- However, since the generated section becomes 121 - N frames, increasing N makes the "continuation" shorter.
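The trade-off between N and the length of the continuation is simple arithmetic. The sketch below assumes the 121-frame slot used in this example and a playback rate of 24 fps (the rate set in this workflow's video nodes); the helper names are hypothetical.

```python
def snap_to_8n_plus_1(requested):
    """Round a requested frame count down to the nearest value of the form 8n+1."""
    if requested < 1:
        raise ValueError("need at least one frame")
    return ((requested - 1) // 8) * 8 + 1


def continuation_frames(n, slot_len=121):
    """Frames left for new content after the first n slots are taken."""
    if n >= slot_len:
        raise ValueError("n must be smaller than the slot length")
    return slot_len - n


# Assumed frame rate for the duration column: 24 fps.
for n in (9, 25, 49, 97):
    new = continuation_frames(n)
    print(f"N={n:3d} -> {new:3d} new frames ({new / 24:.2f} s at 24 fps)")
```

`snap_to_8n_plus_1` is handy when you think in rough terms such as "about 30 frames of context": it returns the largest valid value that does not exceed the request.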
2. Concatenate Generated Video and Original Video
The generation result includes the "connector (N frames from end of original video)" at the beginning, but since this part duplicates the original video, delete it before concatenation.
- Delete the first N frames of the generated video (25 frames in this example)
- Concatenate to the end of the original video
Output Example
audio2video
Since LTX-2 is a model that handles "video + audio" simultaneously, you can configure it to take audio as input and create a video driven by the sound.

- Trim audio to appropriate length with `Trim Audio Duration`.
- Encode audio and connect to `LTXVConcatAVLatent`.
- Connect to the second stage `LTXVConcatAVLatent` as well.
- Use the input audio as is for the output video (do not use generated audio).
🚨If the audio is shorter than the generated video, the audio conditioning will not work, and a video unrelated to the sound will be generated. Pad the audio, with silence if necessary, so that it is not shorter than the video being generated.
I see workflows using Set Latent Noise Mask here, but the result is the same whether it's there or not.
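The audio-length requirement noted above can be checked, and fixed by padding with silence, before the audio is fed into the workflow. The following is a minimal sketch under stated assumptions: audio is a mono list of float samples, and `pad_audio_for_video` and its `extra_seconds` margin are hypothetical helpers, not part of ComfyUI or LTX-2.

```python
import math


def pad_audio_for_video(samples, sample_rate, num_frames, fps, extra_seconds=0.0):
    """Return a copy of `samples` padded with silence to cover the whole video.

    The video lasts num_frames / fps seconds; `extra_seconds` adds an
    optional safety margin on top of that.
    """
    needed = math.ceil(num_frames * sample_rate / fps) + math.ceil(
        extra_seconds * sample_rate
    )
    missing = needed - len(samples)
    if missing <= 0:
        return list(samples)                   # already long enough
    return list(samples) + [0.0] * missing     # append silence at the end


one_second = [0.1] * 44100                     # 1 s of audio at 44.1 kHz
padded = pad_audio_for_video(one_second, 44100, num_frames=121, fps=24)
print(len(padded) / 44100)                     # duration in seconds after padding
```

Audio that is already long enough is returned unchanged, so the helper can be applied unconditionally.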
Output Example
audio-image2video
You can combine the above two. If you combine a face image with spoken audio, you can do something like a talking head. Let's try it.

- Just combine the audio2video / image2video workflows.
Output Example

Actually, because the video didn't follow the dialogue very well, I put the dialogue in the prompt. There might be a better workflow.
video2audio
This is the reverse of audio2video: you input a video and generate sound (sound effects or ambient sound) that matches it.
This task is unstable, and the workflow probably needs improvement.

{
"id": "7f5e0c56-93b4-4937-b7f2-efd0f1853e33",
"revision": 0,
"last_node_id": 183,
"last_link_id": 423,
"nodes": [
{
"id": 128,
"type": "LTXVAudioVAEDecode",
"pos": [
2255.521599031807,
4301.225228382185
],
"size": [
257.2388542190106,
46
],
"flags": {},
"order": 26,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 297
},
{
"label": "Audio VAE",
"name": "audio_vae",
"type": "VAE",
"link": 326
}
],
"outputs": [
{
"name": "Audio",
"type": "AUDIO",
"links": [
314
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "LTXVAudioVAEDecode",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": []
},
{
"id": 99,
"type": "LTXAVTextEncoderLoader",
"pos": [
37.989254913013944,
4138.954135935162
],
"size": [
325.4143077141439,
106
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"links": [
288,
289
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "LTXAVTextEncoderLoader",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65,
"models": [
{
"name": "ltx-2-19b-dev-fp8.safetensors",
"url": "https://huggingface.co/Lightricks/LTX-2/resolve/main/ltx-2-19b-dev-fp8.safetensors",
"directory": "checkpoints"
},
{
"name": "gemma_3_12B_it.safetensors",
"url": "https://huggingface.co/Comfy-Org/ltx-2/resolve/main/split_files/text_encoders/gemma_3_12B_it.safetensors",
"directory": "text_encoders"
}
]
},
"widgets_values": [
"gemma_3_12B_it_fp8_scaled.safetensors",
"LTX-2\\ltx-2-19b-dev-fp8.safetensors",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 137,
"type": "KSamplerSelect",
"pos": [
1328.7113717033576,
4286.285225429741
],
"size": [
270,
68.88020833333334
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "SAMPLER",
"type": "SAMPLER",
"links": [
261
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "KSamplerSelect",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
"euler"
]
},
{
"id": 129,
"type": "CFGGuider",
"pos": [
1328.7113717033576,
4113.19520467289
],
"size": [
270,
106.66666666666667
],
"flags": {},
"order": 21,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 364
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 254
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 255
}
],
"outputs": [
{
"name": "GUIDER",
"type": "GUIDER",
"links": [
260
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.64",
"Node name for S&R": "CFGGuider",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
1
]
},
{
"id": 109,
"type": "LTXVConcatAVLatent",
"pos": [
1328.7113717033576,
4594.012141943443
],
"size": [
270,
46
],
"flags": {},
"order": 23,
"mode": 0,
"inputs": [
{
"name": "video_latent",
"type": "LATENT",
"link": 423
},
{
"name": "audio_latent",
"type": "LATENT",
"link": 415
}
],
"outputs": [
{
"name": "latent",
"type": "LATENT",
"links": [
263
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "LTXVConcatAVLatent",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [],
"color": "#332922",
"bgcolor": "#593930"
},
{
"id": 164,
"type": "BasicScheduler",
"pos": [
1328.7113717033576,
4421.5887878532585
],
"size": [
270,
106
],
"flags": {},
"order": 15,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 368
}
],
"outputs": [
{
"name": "SIGMAS",
"type": "SIGMAS",
"links": [
367
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.9.1",
"Node name for S&R": "BasicScheduler"
},
"widgets_values": [
"simple",
8,
1
]
},
{
"id": 112,
"type": "PrimitiveInt",
"pos": [
363.8058684689672,
4660.537831885363
],
"size": [
270,
82
],
"flags": {},
"order": 13,
"mode": 0,
"inputs": [
{
"name": "value",
"type": "INT",
"widget": {
"name": "value"
},
"link": 402
}
],
"outputs": [
{
"name": "INT",
"type": "INT",
"links": [
282
]
}
],
"title": "INT: Length",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "PrimitiveInt",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
121,
"fixed"
]
},
{
"id": 151,
"type": "MarkdownNote",
"pos": [
57.44161655463637,
3561.647385437717
],
"size": [
399.0254035325611,
339.2647673465967
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n - checkpoints\n - [ltx-2-19b-dev-fp8.safetensors](https://huggingface.co/Lightricks/LTX-2/blob/main/ltx-2-19b-dev-fp8.safetensors)\n - latent_upscale_models\n - [ltx-2-spatial-upscaler-x2-1.0.safetensors](https://huggingface.co/Lightricks/LTX-2/blob/main/ltx-2-spatial-upscaler-x2-1.0.safetensors)\n - loras\n - [ltx-2-19b-distilled-lora-384.safetensors](https://huggingface.co/Lightricks/LTX-2/blob/main/ltx-2-19b-distilled-lora-384.safetensors)\n - text_encoders\n - [gemma_3_12B_it_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/ltx-2/blob/main/split_files/text_encoders/gemma_3_12B_it_fp8_scaled.safetensors)\n\n```text\n📂ComfyUI/\n└── 📂models/\n ├── 📂checkpoints/\n │ └── ltx-2-19b-dev-fp8.safetensors\n ├── 📂latent_upscale_models/\n │ └── ltx-2-spatial-upscaler-x2-1.0.safetensors\n ├── 📂loras/\n │ └── ltx-2-19b-distilled-lora-384.safetensors\n └── 📂text_encoders/\n └── gemma_3_12B_it_fp8_scaled.safetensors\n"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 134,
"type": "LoraLoaderModelOnly",
"pos": [
884.245410818498,
3633.802881360453
],
"size": [
350.9069033720766,
82
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 331
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
364,
368
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.75",
"Node name for S&R": "LoraLoaderModelOnly",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65,
"models": [
{
"name": "ltx-2-19b-distilled-lora-384.safetensors",
"url": "https://huggingface.co/Lightricks/LTX-2/resolve/main/ltx-2-19b-distilled-lora-384.safetensors",
"directory": "loras"
}
]
},
"widgets_values": [
"LTX-2\\ltx-2-19b-distilled-lora-384.safetensors",
0.7
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 145,
"type": "Reroute",
"pos": [
2128.1258857062535,
3811.5439189418425
],
"size": [
75,
26
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "",
"type": "*",
"link": 327
}
],
"outputs": [
{
"name": "",
"type": "VAE",
"links": [
326
]
}
],
"properties": {
"showOutputText": false,
"horizontal": false
}
},
{
"id": 140,
"type": "VHS_VideoCombine",
"pos": [
2576.016197925671,
3910.239460332994
],
"size": [
712.6131392034486,
737.5948908019398
],
"flags": {},
"order": 27,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 405
},
{
"name": "audio",
"shape": 7,
"type": "AUDIO",
"link": 314
},
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "Filenames",
"type": "VHS_FILENAMES",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "8923bd836bdab8b7bbdf4ed104b7d045e70c66e2",
"Node name for S&R": "VHS_VideoCombine"
},
"widgets_values": {
"frame_rate": 24,
"loop_count": 0,
"filename_prefix": "LTX-2",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "LTX-2_00482-audio.mp4",
"subfolder": "",
"type": "output",
"format": "video/h264-mp4",
"frame_rate": 24,
"workflow": "LTX-2_00482.png",
"fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\LTX-2_00482-audio.mp4"
}
}
}
},
{
"id": 113,
"type": "SamplerCustomAdvanced",
"pos": [
1682.2951507877015,
4188.309385581295
],
"size": [
242.12760404770165,
106
],
"flags": {},
"order": 24,
"mode": 0,
"inputs": [
{
"name": "noise",
"type": "NOISE",
"link": 259
},
{
"name": "guider",
"type": "GUIDER",
"link": 260
},
{
"name": "sampler",
"type": "SAMPLER",
"link": 261
},
{
"name": "sigmas",
"type": "SIGMAS",
"link": 367
},
{
"name": "latent_image",
"type": "LATENT",
"link": 263
}
],
"outputs": [
{
"name": "output",
"type": "LATENT",
"links": []
},
{
"name": "denoised_output",
"type": "LATENT",
"links": [
396
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.60",
"Node name for S&R": "SamplerCustomAdvanced",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": []
},
{
"id": 125,
"type": "LTXVSeparateAVLatent",
"pos": [
1966.1814188052304,
4206.92346606983
],
"size": [
237.68443744811694,
46
],
"flags": {},
"order": 25,
"mode": 0,
"inputs": [
{
"name": "av_latent",
"type": "LATENT",
"link": 396
}
],
"outputs": [
{
"name": "video_latent",
"type": "LATENT",
"links": []
},
{
"name": "audio_latent",
"type": "LATENT",
"links": [
297
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.5.1",
"Node name for S&R": "LTXVSeparateAVLatent",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [],
"color": "#332922",
"bgcolor": "#593930"
},
{
"id": 179,
"type": "Reroute",
"pos": [
116.13034082549694,
4969.413326198261
],
"size": [
75,
26
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "",
"type": "*",
"link": 403
}
],
"outputs": [
{
"name": "",
"type": "IMAGE",
"links": [
404
]
}
],
"properties": {
"showOutputText": false,
"horizontal": false
}
},
{
"id": 180,
"type": "Reroute",
"pos": [
2437.7604532508176,
4969.413326198261
],
"size": [
75,
26
],
"flags": {},
"order": 17,
"mode": 0,
"inputs": [
{
"name": "",
"type": "*",
"link": 404
}
],
"outputs": [
{
"name": "",
"type": "IMAGE",
"links": [
405
]
}
],
"properties": {
"showOutputText": false,
"horizontal": false
}
},
{
"id": 107,
"type": "LTXVConditioning",
"pos": [
943.3923572550591,
4133.37447838507
],
"size": [
270,
86.66666666666667
],
"flags": {},
"order": 18,
"mode": 0,
"inputs": [
{
"name": "positive",
"type": "CONDITIONING",
"link": 286
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 287
},
{
"name": "frame_rate",
"type": "FLOAT",
"widget": {
"name": "frame_rate"
},
"link": 420
}
],
"outputs": [
{
"name": "positive",
"type": "CONDITIONING",
"links": [
254
]
},
{
"name": "negative",
"type": "CONDITIONING",
"links": [
255
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "LTXVConditioning",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
25
]
},
{
"id": 124,
"type": "LTXVAudioVAELoader",
"pos": [
482.4816826527883,
3810.115867718167
],
"size": [
350.9069033720766,
58
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "Audio VAE",
"type": "VAE",
"links": [
281,
327
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.68",
"Node name for S&R": "LTXVAudioVAELoader",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65,
"models": [
{
"name": "ltx-2-19b-dev-fp8.safetensors",
"url": "https://huggingface.co/Lightricks/LTX-2/resolve/main/ltx-2-19b-dev-fp8.safetensors",
"directory": "checkpoints"
}
]
},
"widgets_values": [
"LTX-2\\ltx-2-19b-dev-fp8.safetensors"
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 133,
"type": "CheckpointLoaderSimple",
"pos": [
482.4816826527883,
3633.802881360453
],
"size": [
350.9069033720766,
98
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
331
]
},
{
"name": "CLIP",
"type": "CLIP",
"links": []
},
{
"name": "VAE",
"type": "VAE",
"links": [
408
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "CheckpointLoaderSimple",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65,
"models": [
{
"name": "ltx-2-19b-dev-fp8.safetensors",
"url": "https://huggingface.co/Lightricks/LTX-2/resolve/main/ltx-2-19b-dev-fp8.safetensors",
"directory": "checkpoints"
}
]
},
"widgets_values": [
"LTX-2\\ltx-2-19b-dev-fp8.safetensors"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 183,
"type": "VHS_VideoInfoLoaded",
"pos": [
363.8058684689672,
4822.512598169186
],
"size": [
270,
106
],
"flags": {},
"order": 14,
"mode": 0,
"inputs": [
{
"name": "video_info",
"type": "VHS_VIDEOINFO",
"link": 419
}
],
"outputs": [
{
"name": "fps🟦",
"type": "FLOAT",
"links": [
420,
421
]
},
{
"name": "frame_count🟦",
"type": "INT",
"links": null
},
{
"name": "duration🟦",
"type": "FLOAT",
"links": null
},
{
"name": "width🟦",
"type": "INT",
"links": null
},
{
"name": "height🟦",
"type": "INT",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "993082e4f2473bf4acaf06f51e33877a7eb38960",
"Node name for S&R": "VHS_VideoInfoLoaded"
},
"widgets_values": {}
},
{
"id": 141,
"type": "SimpleMath+",
"pos": [
683.5991128620132,
4822.512598169186
],
"size": [
210,
98
],
"flags": {},
"order": 19,
"mode": 0,
"inputs": [
{
"name": "a",
"shape": 7,
"type": "*",
"link": 421
},
{
"name": "b",
"shape": 7,
"type": "*",
"link": null
},
{
"name": "c",
"shape": 7,
"type": "*",
"link": null
}
],
"outputs": [
{
"name": "INT",
"type": "INT",
"links": [
422
]
},
{
"name": "FLOAT",
"type": "FLOAT",
"links": []
}
],
"properties": {
"cnr_id": "comfyui_essentials",
"ver": "9d9f4bedfc9f0321c19faf71855e228c93bd0dc9",
"Node name for S&R": "SimpleMath+"
},
"widgets_values": [
"a"
]
},
{
"id": 106,
"type": "LTXVEmptyLatentAudio",
"pos": [
943.3923572550591,
4794.093069445997
],
"size": [
270,
120
],
"flags": {},
"order": 22,
"mode": 0,
"inputs": [
{
"name": "audio_vae",
"type": "VAE",
"link": 281
},
{
"name": "frames_number",
"type": "INT",
"widget": {
"name": "frames_number"
},
"link": 282
},
{
"name": "frame_rate",
"type": "INT",
"widget": {
"name": "frame_rate"
},
"link": 422
}
],
"outputs": [
{
"name": "Latent",
"type": "LATENT",
"links": [
415
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.68",
"Node name for S&R": "LTXVEmptyLatentAudio",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
97,
25,
1
]
},
{
"id": 175,
"type": "ResizeImageMaskNode",
"pos": [
370.4134155453869,
4470.340454888286
],
"size": [
258.3013455365069,
106
],
"flags": {},
"order": 16,
"mode": 0,
"inputs": [
{
"name": "input",
"type": "IMAGE,MASK",
"link": 378
}
],
"outputs": [
{
"name": "resized",
"type": "IMAGE",
"links": [
407
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.9.1",
"Node name for S&R": "ResizeImageMaskNode"
},
"widgets_values": [
"scale to multiple",
64,
"area"
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 181,
"type": "VAEEncodeTiled",
"pos": [
943.3923572550591,
4470.340454888286
],
"size": [
270,
150
],
"flags": {},
"order": 20,
"mode": 0,
"inputs": [
{
"name": "pixels",
"type": "IMAGE",
"link": 407
},
{
"name": "vae",
"type": "VAE",
"link": 408
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
423
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.9.2",
"Node name for S&R": "VAEEncodeTiled"
},
"widgets_values": [
512,
64,
4096,
8
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 165,
"type": "ResizeImageMaskNode",
"pos": [
80.01734313522775,
4470.340454888286
],
"size": [
258.3013455365069,
106
],
"flags": {},
"order": 11,
"mode": 0,
"inputs": [
{
"name": "input",
"type": "IMAGE,MASK",
"link": 397
}
],
"outputs": [
{
"name": "resized",
"type": "IMAGE",
"links": [
378
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "ResizeImageMaskNode"
},
"widgets_values": [
"scale total pixels",
1,
"area"
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 115,
"type": "RandomNoise",
"pos": [
1328.7113717033576,
3964.7718505827065
],
"size": [
270,
82
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "NOISE",
"type": "NOISE",
"links": [
259
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "RandomNoise",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
1234,
"fixed"
]
},
{
"id": 177,
"type": "VHS_LoadVideo",
"pos": [
-363.4537496361186,
4470.340454888286
],
"size": [
399.96071998098455,
537.2585859181811
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
397,
403
]
},
{
"name": "frame_count",
"type": "INT",
"links": [
402
]
},
{
"name": "audio",
"type": "AUDIO",
"links": null
},
{
"name": "video_info",
"type": "VHS_VIDEOINFO",
"links": [
419
]
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "993082e4f2473bf4acaf06f51e33877a7eb38960",
"Node name for S&R": "VHS_LoadVideo"
},
"widgets_values": {
"video": "13028231_1920_1080_60fps.mp4",
"force_rate": 24,
"custom_width": 0,
"custom_height": 0,
"frame_load_cap": 121,
"skip_first_frames": 0,
"select_every_nth": 1,
"format": "None",
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "13028231_1920_1080_60fps.mp4",
"type": "input",
"format": "video/mp4",
"force_rate": 24,
"custom_width": 0,
"custom_height": 0,
"frame_load_cap": 121,
"skip_first_frames": 0,
"select_every_nth": 1
}
}
},
"color": "#232",
"bgcolor": "#353"
},
{
"id": 110,
"type": "CLIPTextEncode",
"pos": [
429.8854122365001,
4225.4796800153135
],
"size": [
403.50317378836485,
117.09155367536096
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 288
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
287
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "CLIPTextEncode",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
""
]
},
{
"id": 121,
"type": "CLIPTextEncode",
"pos": [
429.8854122365001,
3982.090817803126
],
"size": [
403.50317378836485,
178.09168459401417
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 289
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
286
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "CLIPTextEncode",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
"A rally car on a muddy dirt track."
]
}
],
"links": [
[
254,
107,
0,
129,
1,
"CONDITIONING"
],
[
255,
107,
1,
129,
2,
"CONDITIONING"
],
[
259,
115,
0,
113,
0,
"NOISE"
],
[
260,
129,
0,
113,
1,
"GUIDER"
],
[
261,
137,
0,
113,
2,
"SAMPLER"
],
[
263,
109,
0,
113,
4,
"LATENT"
],
[
281,
124,
0,
106,
0,
"VAE"
],
[
282,
112,
0,
106,
1,
"INT"
],
[
286,
121,
0,
107,
0,
"CONDITIONING"
],
[
287,
110,
0,
107,
1,
"CONDITIONING"
],
[
288,
99,
0,
110,
0,
"CLIP"
],
[
289,
99,
0,
121,
0,
"CLIP"
],
[
297,
125,
1,
128,
0,
"LATENT"
],
[
314,
128,
0,
140,
1,
"AUDIO"
],
[
326,
145,
0,
128,
1,
"VAE"
],
[
327,
124,
0,
145,
0,
"VAE"
],
[
331,
133,
0,
134,
0,
"MODEL"
],
[
364,
134,
0,
129,
0,
"MODEL"
],
[
367,
164,
0,
113,
3,
"SIGMAS"
],
[
368,
134,
0,
164,
0,
"MODEL"
],
[
378,
165,
0,
175,
0,
"IMAGE"
],
[
396,
113,
1,
125,
0,
"LATENT"
],
[
397,
177,
0,
165,
0,
"IMAGE"
],
[
402,
177,
1,
112,
0,
"INT"
],
[
403,
177,
0,
179,
0,
"IMAGE"
],
[
404,
179,
0,
180,
0,
"IMAGE"
],
[
405,
180,
0,
140,
0,
"IMAGE"
],
[
407,
175,
0,
181,
0,
"IMAGE"
],
[
408,
133,
2,
181,
1,
"VAE"
],
[
415,
106,
0,
109,
1,
"LATENT"
],
[
419,
177,
3,
183,
0,
"VHS_VIDEOINFO"
],
[
420,
183,
0,
107,
2,
"FLOAT"
],
[
421,
183,
0,
141,
0,
"FLOAT"
],
[
422,
141,
0,
106,
2,
"INT"
],
[
423,
181,
0,
109,
0,
"LATENT"
]
],
"groups": [
{
"id": 16,
"title": "Decode",
"bounding": [
1948.0534441916896,
3480.427442169179,
1373.2340850148612,
1565.6946883230557
],
"color": "#8A8",
"font_size": 24,
"flags": {}
},
{
"id": 17,
"title": "video2audio",
"bounding": [
-445.60695699450537,
3480.6238029444626,
2383.948470874875,
1565.7003022187969
],
"color": "#3f789e",
"font_size": 24,
"flags": {}
}
],
"config": {},
"extra": {
"ds": {
"scale": 0.8759907300000329,
"offset": [
568.361252481493,
-3106.1051527020004
]
},
"frontendVersion": "1.38.2",
"workflowRendererVersion": "LG",
"prompt": {
"1": {
"inputs": {
"ckpt_name": "ltx-av-step-1751000_vocoder_24K.safetensors"
},
"class_type": "CheckpointLoaderSimple",
"_meta": {
"title": "Load Checkpoint"
}
},
"2": {
"inputs": {
"gemma_path": "gemma-3-12b-it-qat-q4_0-unquantized_readout_proj/model/model.safetensors",
"ltxv_path": "ltx-av-step-1751000_vocoder_24K.safetensors",
"max_length": 1024
},
"class_type": "LTXVGemmaCLIPModelLoader",
"_meta": {
"title": "🅛🅣🅧 Gemma 3 Model Loader"
}
},
"3": {
"inputs": {
"text": "A medium close-up shot features a Caucasian man with a closely shaven head and face, wearing a black baseball cap with \"PNTR\" in white letters on the front, and a dark grey t-shirt with \"JUST DO IT\" visible across his chest. A small black microphone is clipped to his shirt collar. He is positioned slightly to the left of the frame, looking intently downwards and to his right, his eyes focused off-camera. His facial expression is one of deep concentration, with his brow slightly furrowed. As he looks down, a quick sniff sound is heard, and then he speaks with a deep male voice and a slightly frustrated tone, saying, \"I think it's so bad.\" The camera remains static throughout, maintaining a shallow depth of field, which keeps the man in sharp focus while the background is softly blurred, revealing a light-colored wall with white wooden shelving or trim, and a partially open white wooden door on the right. After a brief pause, another short, audible sniff is heard. The man then continues to speak, his voice maintaining the same quality, as he states, \"So bad. So bad.\" He elaborates further, emphasizing his point with a final statement, \"This got to be, it's got to be the worst tool I've ever seen.\"",
"clip": [
"2",
0
]
},
"class_type": "CLIPTextEncode",
"_meta": {
"title": "CLIP Text Encode (Prompt)"
}
},
"4": {
"inputs": {
"text": "blurry, out of focus, overexposed, underexposed, low contrast, washed out colors, excessive noise, grainy texture, poor lighting, flickering, motion blur, distorted proportions, unnatural skin tones, deformed facial features, asymmetrical face, missing facial features, extra limbs, disfigured hands, wrong hand count, artifacts around text, unreadable text on shirt or hat, incorrect lettering on cap (“PNTR”), incorrect t-shirt slogan (“JUST DO IT”), missing microphone, misplaced microphone, inconsistent perspective, camera shake, incorrect depth of field, background too sharp, background clutter, distracting reflections, harsh shadows, inconsistent lighting direction, color banding, cartoonish rendering, 3D CGI look, unrealistic materials, uncanny valley effect, incorrect ethnicity, wrong gender, exaggerated expressions, smiling, laughing, exaggerated sadness, wrong gaze direction, eyes looking at camera, mismatched lip sync, silent or muted audio, distorted voice, robotic voice, echo, background noise, off-sync audio, missing sniff sounds, incorrect dialogue, added dialogue, repetitive speech, jittery movement, awkward pauses, incorrect timing, unnatural transitions, inconsistent framing, tilted camera, missing door or shelves, missing shallow depth of field, flat lighting, inconsistent tone, cinematic oversaturation, stylized filters, or AI artifacts.",
"clip": [
"2",
0
]
},
"class_type": "CLIPTextEncode",
"_meta": {
"title": "CLIP Text Encode (Prompt)"
}
},
"8": {
"inputs": {
"sampler_name": "euler"
},
"class_type": "KSamplerSelect",
"_meta": {
"title": "KSamplerSelect"
}
},
"9": {
"inputs": {
"steps": 20,
"max_shift": 2.05,
"base_shift": 0.95,
"stretch": true,
"terminal": 0.1,
"latent": [
"28",
0
]
},
"class_type": "LTXVScheduler",
"_meta": {
"title": "LTXVScheduler"
}
},
"11": {
"inputs": {
"noise_seed": 10
},
"class_type": "RandomNoise",
"_meta": {
"title": "RandomNoise"
}
},
"12": {
"inputs": {
"samples": [
"29",
0
],
"vae": [
"1",
2
]
},
"class_type": "VAEDecode",
"_meta": {
"title": "VAE Decode"
}
},
"13": {
"inputs": {
"ckpt_name": "ltx-av-step-1751000_vocoder_24K.safetensors"
},
"class_type": "LTXVAudioVAELoader",
"_meta": {
"title": "🅛🅣🅧 LTXV Audio VAE Loader"
}
},
"14": {
"inputs": {
"samples": [
"29",
1
],
"audio_vae": [
"13",
0
]
},
"class_type": "LTXVAudioVAEDecode",
"_meta": {
"title": "🅛🅣🅧 LTXV Audio VAE Decode"
}
},
"15": {
"inputs": {
"frame_rate": [
"23",
0
],
"loop_count": 0,
"filename_prefix": "AnimateDiff",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"images": [
"12",
0
],
"audio": [
"14",
0
]
},
"class_type": "VHS_VideoCombine",
"_meta": {
"title": "Video Combine 🎥🅥🅗🅢"
}
},
"17": {
"inputs": {
"skip_blocks": "29",
"model": [
"28",
1
],
"positive": [
"22",
0
],
"negative": [
"22",
1
],
"parameters": [
"18",
0
]
},
"class_type": "MultimodalGuider",
"_meta": {
"title": "🅛🅣🅧 Multimodal Guider"
}
},
"18": {
"inputs": {
"modality": "VIDEO",
"cfg": 3,
"stg": 0,
"rescale": 0,
"modality_scale": 3,
"parameters": [
"19",
0
]
},
"class_type": "GuiderParameters",
"_meta": {
"title": "🅛🅣🅧 Guider Parameters"
}
},
"19": {
"inputs": {
"modality": "AUDIO",
"cfg": 7,
"stg": 0,
"rescale": 0,
"modality_scale": 3
},
"class_type": "GuiderParameters",
"_meta": {
"title": "🅛🅣🅧 Guider Parameters"
}
},
"21": {
"inputs": {
"audioUI": "",
"audio": [
"14",
0
]
},
"class_type": "PreviewAudio",
"_meta": {
"title": "PreviewAudio"
}
},
"22": {
"inputs": {
"frame_rate": [
"23",
0
],
"positive": [
"3",
0
],
"negative": [
"4",
0
]
},
"class_type": "LTXVConditioning",
"_meta": {
"title": "LTXVConditioning"
}
},
"23": {
"inputs": {
"value": 25
},
"class_type": "FloatConstant",
"_meta": {
"title": "Float Constant"
}
},
"26": {
"inputs": {
"frames_number": [
"27",
0
],
"frame_rate": [
"42",
0
],
"batch_size": 1
},
"class_type": "LTXVEmptyLatentAudio",
"_meta": {
"title": "🅛🅣🅧 LTXV Empty Latent Audio"
}
},
"27": {
"inputs": {
"value": 105
},
"class_type": "INTConstant",
"_meta": {
"title": "INT Constant"
}
},
"28": {
"inputs": {
"video_latent": [
"43",
0
],
"audio_latent": [
"26",
0
],
"model": [
"44",
0
]
},
"class_type": "LTXVConcatAVLatent",
"_meta": {
"title": "🅛🅣🅧 LTXV Concat AV Latent"
}
},
"29": {
"inputs": {
"av_latent": [
"41",
0
],
"model": [
"28",
1
]
},
"class_type": "LTXVSeparateAVLatent",
"_meta": {
"title": "🅛🅣🅧 LTXV Separate AV Latent"
}
},
"41": {
"inputs": {
"noise": [
"11",
0
],
"guider": [
"17",
0
],
"sampler": [
"8",
0
],
"sigmas": [
"9",
0
],
"latent_image": [
"28",
0
]
},
"class_type": "SamplerCustomAdvanced",
"_meta": {
"title": "SamplerCustomAdvanced"
}
},
"42": {
"inputs": {
"a": [
"23",
0
]
},
"class_type": "CM_FloatToInt",
"_meta": {
"title": "FloatToInt"
}
},
"43": {
"inputs": {
"width": 768,
"height": 512,
"length": [
"27",
0
],
"batch_size": 1
},
"class_type": "EmptyLTXVLatentVideo",
"_meta": {
"title": "EmptyLTXVLatentVideo"
}
},
"44": {
"inputs": {
"torch_compile": true,
"disable_backup": false,
"model": [
"1",
0
]
},
"class_type": "LTXVSequenceParallelMultiGPUPatcher",
"_meta": {
"title": "LTXVSequenceParallelMultiGPUPatcher"
}
},
"45": {
"inputs": {
"frame_idx": 0,
"strength": 1
},
"class_type": "LTXVAddGuide",
"_meta": {
"title": "LTXVAddGuide"
}
}
},
"comfy_fork_version": "feature/av_inference@a6994ed1",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
Output Example
Caution: Sound may be loud.
Temporal inpainting
Temporal inpainting regenerates only part of a video along the time axis (a chosen range of frames) and leaves the remaining frames untouched. Think of it like VACE Extension.
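To make the idea concrete, here is a minimal conceptual sketch of a per-frame mask, assuming that "part of the video" means a contiguous range of frames. The function names `is_valid_frame_count` and `temporal_mask` are hypothetical and are not part of ComfyUI or LTX-2; the workflow below does this through nodes, not through this code. The only rule taken from this article is that the frame count must be 8n+1.

```python
def is_valid_frame_count(n: int) -> bool:
    """Check the 8n+1 rule from the Recommended Settings."""
    return n >= 1 and (n - 1) % 8 == 0


def temporal_mask(total_frames: int, start: int, end: int) -> list[int]:
    """Return one flag per frame: 1 = regenerate, 0 = keep.

    The range is half-open, so frames start .. end-1 are regenerated.
    This is an illustration only (hypothetical helper).
    """
    if not is_valid_frame_count(total_frames):
        raise ValueError("frame count must be of the form 8n+1")
    if not 0 <= start < end <= total_frames:
        raise ValueError("need 0 <= start < end <= total_frames")
    return [1 if start <= i < end else 0 for i in range(total_frames)]


if __name__ == "__main__":
    # Regenerate frames 40..79 of a 121-frame clip, keep everything else.
    mask = temporal_mask(121, 40, 80)
    print(sum(mask), "of", len(mask), "frames would be regenerated")
```

Frames flagged 0 act as the fixed context on both sides, and only the flagged range is generated again.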

{
"id": "7f5e0c56-93b4-4937-b7f2-efd0f1853e33",
"revision": 0,
"last_node_id": 189,
"last_link_id": 424,
"nodes": [
{
"id": 128,
"type": "LTXVAudioVAEDecode",
"pos": [
2255.549804173873,
4304.017466043185
],
"size": [
257.2388542190106,
46
],
"flags": {},
"order": 28,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 297
},
{
"label": "Audio VAE",
"name": "audio_vae",
"type": "VAE",
"link": 326
}
],
"outputs": [
{
"name": "Audio",
"type": "AUDIO",
"links": [
314
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "LTXVAudioVAEDecode",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": []
},
{
"id": 127,
"type": "VAEDecodeTiled",
"pos": [
2255.549804173873,
4078.7867752553584
],
"size": [
257.2388542190106,
150
],
"flags": {},
"order": 27,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 302
},
{
"name": "vae",
"type": "VAE",
"link": 325
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
313
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "VAEDecodeTiled",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
512,
64,
4096,
8
]
},
{
"id": 99,
"type": "LTXAVTextEncoderLoader",
"pos": [
37.989254913013944,
4138.954135935162
],
"size": [
325.4143077141439,
106
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"links": [
288,
289
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "LTXAVTextEncoderLoader",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65,
"models": [
{
"name": "ltx-2-19b-dev-fp8.safetensors",
"url": "https://huggingface.co/Lightricks/LTX-2/resolve/main/ltx-2-19b-dev-fp8.safetensors",
"directory": "checkpoints"
},
{
"name": "gemma_3_12B_it.safetensors",
"url": "https://huggingface.co/Comfy-Org/ltx-2/resolve/main/split_files/text_encoders/gemma_3_12B_it.safetensors",
"directory": "text_encoders"
}
]
},
"widgets_values": [
"gemma_3_12B_it_fp8_scaled.safetensors",
"LTX-2\\ltx-2-19b-dev-fp8.safetensors",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 137,
"type": "KSamplerSelect",
"pos": [
1328.7113717033576,
4286.285225429741
],
"size": [
270,
68.88020833333334
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "SAMPLER",
"type": "SAMPLER",
"links": [
261
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "KSamplerSelect",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
"euler"
]
},
{
"id": 129,
"type": "CFGGuider",
"pos": [
1328.7113717033576,
4113.19520467289
],
"size": [
270,
106.66666666666667
],
"flags": {},
"order": 21,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 364
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 254
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 255
}
],
"outputs": [
{
"name": "GUIDER",
"type": "GUIDER",
"links": [
260
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.64",
"Node name for S&R": "CFGGuider",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
1
]
},
{
"id": 109,
"type": "LTXVConcatAVLatent",
"pos": [
1328.7113717033576,
4594.012141943443
],
"size": [
270,
46
],
"flags": {},
"order": 24,
"mode": 0,
"inputs": [
{
"name": "video_latent",
"type": "LATENT",
"link": 391
},
{
"name": "audio_latent",
"type": "LATENT",
"link": 392
}
],
"outputs": [
{
"name": "latent",
"type": "LATENT",
"links": [
263
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "LTXVConcatAVLatent",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [],
"color": "#332922",
"bgcolor": "#593930"
},
{
"id": 164,
"type": "BasicScheduler",
"pos": [
1328.7113717033576,
4421.5887878532585
],
"size": [
270,
106
],
"flags": {},
"order": 15,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 368
}
],
"outputs": [
{
"name": "SIGMAS",
"type": "SIGMAS",
"links": [
367
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.9.1",
"Node name for S&R": "BasicScheduler"
},
"widgets_values": [
"simple",
8,
1
]
},
{
"id": 151,
"type": "MarkdownNote",
"pos": [
57.44161655463637,
3561.647385437717
],
"size": [
399.0254035325611,
339.2647673465967
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n - checkpoints\n - [ltx-2-19b-dev-fp8.safetensors](https://huggingface.co/Lightricks/LTX-2/blob/main/ltx-2-19b-dev-fp8.safetensors)\n - latent_upscale_models\n - [ltx-2-spatial-upscaler-x2-1.0.safetensors](https://huggingface.co/Lightricks/LTX-2/blob/main/ltx-2-spatial-upscaler-x2-1.0.safetensors)\n - loras\n - [ltx-2-19b-distilled-lora-384.safetensors](https://huggingface.co/Lightricks/LTX-2/blob/main/ltx-2-19b-distilled-lora-384.safetensors)\n - text_encoders\n - [gemma_3_12B_it_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/ltx-2/blob/main/split_files/text_encoders/gemma_3_12B_it_fp8_scaled.safetensors)\n\n```text\n📂ComfyUI/\n└── 📂models/\n ├── 📂checkpoints/\n │ └── ltx-2-19b-dev-fp8.safetensors\n ├── 📂latent_upscale_models/\n │ └── ltx-2-spatial-upscaler-x2-1.0.safetensors\n ├── 📂loras/\n │ └── ltx-2-19b-distilled-lora-384.safetensors\n └── 📂text_encoders/\n └── gemma_3_12B_it_fp8_scaled.safetensors\n"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 134,
"type": "LoraLoaderModelOnly",
"pos": [
884.245410818498,
3633.802881360453
],
"size": [
350.9069033720766,
82
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 331
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
364,
368
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.75",
"Node name for S&R": "LoraLoaderModelOnly",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65,
"models": [
{
"name": "ltx-2-19b-distilled-lora-384.safetensors",
"url": "https://huggingface.co/Lightricks/LTX-2/resolve/main/ltx-2-19b-distilled-lora-384.safetensors",
"directory": "loras"
}
]
},
"widgets_values": [
"LTX-2\\ltx-2-19b-distilled-lora-384.safetensors",
0.7
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 133,
"type": "CheckpointLoaderSimple",
"pos": [
482.4816826527883,
3633.802881360453
],
"size": [
350.9069033720766,
98
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
331
]
},
{
"name": "CLIP",
"type": "CLIP",
"links": []
},
{
"name": "VAE",
"type": "VAE",
"links": [
342
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "CheckpointLoaderSimple",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65,
"models": [
{
"name": "ltx-2-19b-dev-fp8.safetensors",
"url": "https://huggingface.co/Lightricks/LTX-2/resolve/main/ltx-2-19b-dev-fp8.safetensors",
"directory": "checkpoints"
}
]
},
"widgets_values": [
"LTX-2\\ltx-2-19b-dev-fp8.safetensors"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 113,
"type": "SamplerCustomAdvanced",
"pos": [
1682.2951507877015,
4188.309385581295
],
"size": [
242.12760404770165,
106
],
"flags": {},
"order": 25,
"mode": 0,
"inputs": [
{
"name": "noise",
"type": "NOISE",
"link": 259
},
{
"name": "guider",
"type": "GUIDER",
"link": 260
},
{
"name": "sampler",
"type": "SAMPLER",
"link": 261
},
{
"name": "sigmas",
"type": "SIGMAS",
"link": 367
},
{
"name": "latent_image",
"type": "LATENT",
"link": 263
}
],
"outputs": [
{
"name": "output",
"type": "LATENT",
"links": []
},
{
"name": "denoised_output",
"type": "LATENT",
"links": [
407
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.60",
"Node name for S&R": "SamplerCustomAdvanced",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": []
},
{
"id": 121,
"type": "CLIPTextEncode",
"pos": [
429.8854122365001,
3982.090817803126
],
"size": [
403.50317378836485,
178.09168459401417
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 289
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
286
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "CLIPTextEncode",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
"A medium close-up studio news shot with a static camera and clean broadcast lighting. An anchor sits centered, looking into the lens and speaking calmly in a professional tone. A beautiful white cat enters the foreground from the side and walks across the frame, briefly passing in front of the anchor. The moment she notices the cat, she immediately turns her eyes toward it and says, “What a happy little surprise,” without waiting for it to pass. She keeps a warm smile as the cat continues across, then returns her gaze to the camera. **Audio:** clear English speech with subtle studio room tone, no music.\n"
]
},
{
"id": 165,
"type": "ResizeImageMaskNode",
"pos": [
-285.73515797085037,
4594.012141943443
],
"size": [
258.3013455365069,
106
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "input",
"type": "IMAGE,MASK",
"link": 389
}
],
"outputs": [
{
"name": "resized",
"type": "IMAGE",
"links": [
378
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.7.0",
"Node name for S&R": "ResizeImageMaskNode"
},
"widgets_values": [
"scale total pixels",
1.5,
"area"
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 125,
"type": "LTXVSeparateAVLatent",
"pos": [
1966.0643834999416,
4208.9895014940075
],
"size": [
237.68443744811694,
46
],
"flags": {},
"order": 26,
"mode": 0,
"inputs": [
{
"name": "av_latent",
"type": "LATENT",
"link": 407
}
],
"outputs": [
{
"name": "video_latent",
"type": "LATENT",
"links": [
302
]
},
{
"name": "audio_latent",
"type": "LATENT",
"links": [
297
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.5.1",
"Node name for S&R": "LTXVSeparateAVLatent",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [],
"color": "#332922",
"bgcolor": "#593930"
},
{
"id": 145,
"type": "Reroute",
"pos": [
2128.1540908483194,
3810.115867718167
],
"size": [
75,
26
],
"flags": {},
"order": 11,
"mode": 0,
"inputs": [
{
"name": "",
"type": "*",
"link": 327
}
],
"outputs": [
{
"name": "",
"type": "VAE",
"links": [
326
]
}
],
"properties": {
"showOutputText": false,
"horizontal": false
}
},
{
"id": 124,
"type": "LTXVAudioVAELoader",
"pos": [
482.4816826527883,
3810.115867718167
],
"size": [
350.9069033720766,
58
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "Audio VAE",
"type": "VAE",
"links": [
327,
394
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.68",
"Node name for S&R": "LTXVAudioVAELoader",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65,
"models": [
{
"name": "ltx-2-19b-dev-fp8.safetensors",
"url": "https://huggingface.co/Lightricks/LTX-2/resolve/main/ltx-2-19b-dev-fp8.safetensors",
"directory": "checkpoints"
}
]
},
"widgets_values": [
"LTX-2\\ltx-2-19b-dev-fp8.safetensors"
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 144,
"type": "Reroute",
"pos": [
2128.1540908483194,
3746.68342826367
],
"size": [
75,
26
],
"flags": {},
"order": 16,
"mode": 0,
"inputs": [
{
"name": "",
"type": "*",
"link": 411
}
],
"outputs": [
{
"name": "",
"type": "VAE",
"links": [
325
]
}
],
"properties": {
"showOutputText": false,
"horizontal": false
}
},
{
"id": 154,
"type": "Reroute",
"pos": [
883.103846226827,
3746.68342826367
],
"size": [
75,
26
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "",
"type": "*",
"link": 342
}
],
"outputs": [
{
"name": "",
"type": "VAE",
"links": [
411,
419
]
}
],
"properties": {
"showOutputText": false,
"horizontal": false
}
},
{
"id": 140,
"type": "VHS_VideoCombine",
"pos": [
2576.044403067737,
3913.0316979939857
],
"size": [
712.6131392034486,
720.9455364941646
],
"flags": {},
"order": 29,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 313
},
{
"name": "audio",
"shape": 7,
"type": "AUDIO",
"link": 314
},
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "Filenames",
"type": "VHS_FILENAMES",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "8923bd836bdab8b7bbdf4ed104b7d045e70c66e2",
"Node name for S&R": "VHS_VideoCombine"
},
"widgets_values": {
"frame_rate": 24,
"loop_count": 0,
"filename_prefix": "LTX-2",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "LTX-2_00471-audio.mp4",
"subfolder": "",
"type": "output",
"format": "video/h264-mp4",
"frame_rate": 24,
"workflow": "LTX-2_00471.png",
"fullpath": "D:\\AI\\ComfyUI_windows_portable\\ComfyUI\\output\\LTX-2_00471-audio.mp4"
}
}
}
},
{
"id": 175,
"type": "ResizeImageMaskNode",
"pos": [
6.062648386224155,
4594.012141943443
],
"size": [
258.3013455365069,
106
],
"flags": {},
"order": 17,
"mode": 0,
"inputs": [
{
"name": "input",
"type": "IMAGE,MASK",
"link": 378
}
],
"outputs": [
{
"name": "resized",
"type": "IMAGE",
"links": [
401
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.9.1",
"Node name for S&R": "ResizeImageMaskNode"
},
"widgets_values": [
"scale to multiple",
64,
"area"
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 170,
"type": "LTXVPreprocess",
"pos": [
297.8604547432987,
4594.012141943443
],
"size": [
270,
58
],
"flags": {
"collapsed": false
},
"order": 20,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 401
}
],
"outputs": [
{
"name": "output_image",
"type": "IMAGE",
"links": [
418
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.60",
"Node name for S&R": "LTXVPreprocess"
},
"widgets_values": [
33
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 184,
"type": "Reroute",
"pos": [
-285.73515797085037,
4812.210609376599
],
"size": [
75,
26
],
"flags": {},
"order": 13,
"mode": 0,
"inputs": [
{
"name": "",
"type": "*",
"link": 414
}
],
"outputs": [
{
"name": "",
"type": "AUDIO",
"links": [
415
]
}
],
"properties": {
"showOutputText": false,
"horizontal": false
}
},
{
"id": 178,
"type": "LTXVAudioVideoMask",
"pos": [
946,
4594.012141943443
],
"size": [
270,
198
],
"flags": {},
"order": 23,
"mode": 0,
"inputs": [
{
"name": "video_latent",
"shape": 7,
"type": "LATENT",
"link": 420
},
{
"name": "audio_latent",
"shape": 7,
"type": "LATENT",
"link": 395
},
{
"name": "video_fps",
"type": "FLOAT",
"widget": {
"name": "video_fps"
},
"link": 424
}
],
"outputs": [
{
"name": "video_latent",
"type": "LATENT",
"links": [
391
]
},
{
"name": "audio_latent",
"type": "LATENT",
"links": [
392
]
}
],
"properties": {
"cnr_id": "comfyui-kjnodes",
"ver": "02657c3ae1a140bc4d6b6225845a4474b8632ef9",
"Node name for S&R": "LTXVAudioVideoMask"
},
"widgets_values": [
24,
0.5,
4,
0.5,
4,
"pad"
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 107,
"type": "LTXVConditioning",
"pos": [
946,
4132.560597654743
],
"size": [
270,
86.66666666666667
],
"flags": {},
"order": 19,
"mode": 0,
"inputs": [
{
"name": "positive",
"type": "CONDITIONING",
"link": 286
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 287
},
{
"name": "frame_rate",
"type": "FLOAT",
"widget": {
"name": "frame_rate"
},
"link": 423
}
],
"outputs": [
{
"name": "positive",
"type": "CONDITIONING",
"links": [
254
]
},
{
"name": "negative",
"type": "CONDITIONING",
"links": [
255
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "LTXVConditioning",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
24
]
},
{
"id": 110,
"type": "CLIPTextEncode",
"pos": [
429.8854122365001,
4225.4796800153135
],
"size": [
403.50317378836485,
117.09155367536096
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 288
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
287
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "CLIPTextEncode",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
""
]
},
{
"id": 115,
"type": "RandomNoise",
"pos": [
1328.7113717033576,
3964.7718505827065
],
"size": [
270,
82
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "NOISE",
"type": "NOISE",
"links": [
259
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.56",
"Node name for S&R": "RandomNoise",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
"hasSecondTab": false,
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
12345,
"fixed"
]
},
{
"id": 189,
"type": "VHS_VideoInfoLoaded",
"pos": [
577.1493282123649,
4410.0513711077965
],
"size": [
256.2392578125,
106
],
"flags": {},
"order": 14,
"mode": 0,
"inputs": [
{
"name": "video_info",
"type": "VHS_VIDEOINFO",
"link": 422
}
],
"outputs": [
{
"name": "fps🟦",
"type": "FLOAT",
"links": [
423,
424
]
},
{
"name": "frame_count🟦",
"type": "INT",
"links": null
},
{
"name": "duration🟦",
"type": "FLOAT",
"links": null
},
{
"name": "width🟦",
"type": "INT",
"links": null
},
{
"name": "height🟦",
"type": "INT",
"links": null
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "993082e4f2473bf4acaf06f51e33877a7eb38960",
"Node name for S&R": "VHS_VideoInfoLoaded"
},
"widgets_values": {}
},
{
"id": 177,
"type": "VHS_LoadVideo",
"pos": [
-757.0255252733103,
4455.591440363906
],
"size": [
434.96358104799754,
557.7611487930216
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [
{
"name": "meta_batch",
"shape": 7,
"type": "VHS_BatchManager",
"link": null
},
{
"name": "vae",
"shape": 7,
"type": "VAE",
"link": null
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
389
]
},
{
"name": "frame_count",
"type": "INT",
"links": null
},
{
"name": "audio",
"type": "AUDIO",
"links": [
414
]
},
{
"name": "video_info",
"type": "VHS_VIDEOINFO",
"links": [
422
]
}
],
"properties": {
"cnr_id": "comfyui-videohelpersuite",
"ver": "993082e4f2473bf4acaf06f51e33877a7eb38960",
"Node name for S&R": "VHS_LoadVideo"
},
"widgets_values": {
"video": "Interview_with_Dr._Eugene_Parker.mp4",
"force_rate": 24,
"custom_width": 0,
"custom_height": 0,
"frame_load_cap": 121,
"skip_first_frames": 0,
"select_every_nth": 1,
"format": "None",
"videopreview": {
"hidden": false,
"paused": false,
"params": {
"filename": "Interview_with_Dr._Eugene_Parker.mp4",
"type": "input",
"format": "video/mp4",
"force_rate": 24,
"custom_width": 0,
"custom_height": 0,
"frame_load_cap": 121,
"skip_first_frames": 0,
"select_every_nth": 1
}
}
},
"color": "#232",
"bgcolor": "#353"
},
{
"id": 179,
"type": "LTXVAudioVAEEncode",
"pos": [
598.5265901299719,
4812.210609376599
],
"size": [
234.8619958948931,
46
],
"flags": {},
"order": 18,
"mode": 0,
"inputs": [
{
"name": "audio",
"type": "AUDIO",
"link": 415
},
{
"label": "Audio VAE",
"name": "audio_vae",
"type": "VAE",
"link": 394
}
],
"outputs": [
{
"name": "Audio Latent",
"type": "LATENT",
"links": [
395
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.9.2",
"Node name for S&R": "LTXVAudioVAEEncode"
},
"widgets_values": [],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 187,
"type": "VAEEncodeTiled",
"pos": [
598.5265901299719,
4594.012141943443
],
"size": [
234.8619958948931,
151.71534874781446
],
"flags": {},
"order": 22,
"mode": 0,
"inputs": [
{
"name": "pixels",
"type": "IMAGE",
"link": 418
},
{
"name": "vae",
"type": "VAE",
"link": 419
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
420
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.9.2",
"Node name for S&R": "VAEEncodeTiled"
},
"widgets_values": [
512,
64,
4096,
8
],
"color": "#232",
"bgcolor": "#353"
}
],
"links": [
[
254,
107,
0,
129,
1,
"CONDITIONING"
],
[
255,
107,
1,
129,
2,
"CONDITIONING"
],
[
259,
115,
0,
113,
0,
"NOISE"
],
[
260,
129,
0,
113,
1,
"GUIDER"
],
[
261,
137,
0,
113,
2,
"SAMPLER"
],
[
263,
109,
0,
113,
4,
"LATENT"
],
[
286,
121,
0,
107,
0,
"CONDITIONING"
],
[
287,
110,
0,
107,
1,
"CONDITIONING"
],
[
288,
99,
0,
110,
0,
"CLIP"
],
[
289,
99,
0,
121,
0,
"CLIP"
],
[
297,
125,
1,
128,
0,
"LATENT"
],
[
302,
125,
0,
127,
0,
"LATENT"
],
[
313,
127,
0,
140,
0,
"IMAGE"
],
[
314,
128,
0,
140,
1,
"AUDIO"
],
[
325,
144,
0,
127,
1,
"VAE"
],
[
326,
145,
0,
128,
1,
"VAE"
],
[
327,
124,
0,
145,
0,
"VAE"
],
[
331,
133,
0,
134,
0,
"MODEL"
],
[
342,
133,
2,
154,
0,
"VAE"
],
[
364,
134,
0,
129,
0,
"MODEL"
],
[
367,
164,
0,
113,
3,
"SIGMAS"
],
[
368,
134,
0,
164,
0,
"MODEL"
],
[
378,
165,
0,
175,
0,
"IMAGE"
],
[
389,
177,
0,
165,
0,
"IMAGE"
],
[
391,
178,
0,
109,
0,
"LATENT"
],
[
392,
178,
1,
109,
1,
"LATENT"
],
[
394,
124,
0,
179,
1,
"VAE"
],
[
395,
179,
0,
178,
1,
"LATENT"
],
[
401,
175,
0,
170,
0,
"IMAGE"
],
[
407,
113,
1,
125,
0,
"LATENT"
],
[
411,
154,
0,
144,
0,
"VAE"
],
[
414,
177,
2,
184,
0,
"AUDIO"
],
[
415,
184,
0,
179,
0,
"AUDIO"
],
[
418,
170,
0,
187,
0,
"IMAGE"
],
[
419,
154,
0,
187,
1,
"VAE"
],
[
420,
187,
0,
178,
0,
"LATENT"
],
[
422,
177,
3,
189,
0,
"VHS_VIDEOINFO"
],
[
423,
189,
0,
107,
2,
"FLOAT"
],
[
424,
189,
0,
178,
2,
"FLOAT"
]
],
"groups": [
{
"id": 16,
"title": "Decode",
"bounding": [
1948.0816493337654,
3483.2196798301707,
1393.0463630521247,
1560.0340374552657
],
"color": "#8A8",
"font_size": 24,
"flags": {}
},
{
"id": 17,
"title": "Temporal inpainting",
"bounding": [
-775.4316450251598,
3483.5437530790978,
2708.1041475876236,
1560.6180572938065
],
"color": "#3f789e",
"font_size": 24,
"flags": {}
}
],
"config": {},
"extra": {
"ds": {
"scale": 0.44952175459436755,
"offset": [
1522.1768709958478,
-2818.7419041206153
]
},
"frontendVersion": "1.38.2",
"workflowRendererVersion": "LG",
"prompt": {
"1": {
"inputs": {
"ckpt_name": "ltx-av-step-1751000_vocoder_24K.safetensors"
},
"class_type": "CheckpointLoaderSimple",
"_meta": {
"title": "Load Checkpoint"
}
},
"2": {
"inputs": {
"gemma_path": "gemma-3-12b-it-qat-q4_0-unquantized_readout_proj/model/model.safetensors",
"ltxv_path": "ltx-av-step-1751000_vocoder_24K.safetensors",
"max_length": 1024
},
"class_type": "LTXVGemmaCLIPModelLoader",
"_meta": {
"title": "🅛🅣🅧 Gemma 3 Model Loader"
}
},
"3": {
"inputs": {
"text": "A medium close-up shot features a Caucasian man with a closely shaven head and face, wearing a black baseball cap with \"PNTR\" in white letters on the front, and a dark grey t-shirt with \"JUST DO IT\" visible across his chest. A small black microphone is clipped to his shirt collar. He is positioned slightly to the left of the frame, looking intently downwards and to his right, his eyes focused off-camera. His facial expression is one of deep concentration, with his brow slightly furrowed. As he looks down, a quick sniff sound is heard, and then he speaks with a deep male voice and a slightly frustrated tone, saying, \"I think it's so bad.\" The camera remains static throughout, maintaining a shallow depth of field, which keeps the man in sharp focus while the background is softly blurred, revealing a light-colored wall with white wooden shelving or trim, and a partially open white wooden door on the right. After a brief pause, another short, audible sniff is heard. The man then continues to speak, his voice maintaining the same quality, as he states, \"So bad. So bad.\" He elaborates further, emphasizing his point with a final statement, \"This got to be, it's got to be the worst tool I've ever seen.\"",
"clip": [
"2",
0
]
},
"class_type": "CLIPTextEncode",
"_meta": {
"title": "CLIP Text Encode (Prompt)"
}
},
"4": {
"inputs": {
"text": "blurry, out of focus, overexposed, underexposed, low contrast, washed out colors, excessive noise, grainy texture, poor lighting, flickering, motion blur, distorted proportions, unnatural skin tones, deformed facial features, asymmetrical face, missing facial features, extra limbs, disfigured hands, wrong hand count, artifacts around text, unreadable text on shirt or hat, incorrect lettering on cap (“PNTR”), incorrect t-shirt slogan (“JUST DO IT”), missing microphone, misplaced microphone, inconsistent perspective, camera shake, incorrect depth of field, background too sharp, background clutter, distracting reflections, harsh shadows, inconsistent lighting direction, color banding, cartoonish rendering, 3D CGI look, unrealistic materials, uncanny valley effect, incorrect ethnicity, wrong gender, exaggerated expressions, smiling, laughing, exaggerated sadness, wrong gaze direction, eyes looking at camera, mismatched lip sync, silent or muted audio, distorted voice, robotic voice, echo, background noise, off-sync audio, missing sniff sounds, incorrect dialogue, added dialogue, repetitive speech, jittery movement, awkward pauses, incorrect timing, unnatural transitions, inconsistent framing, tilted camera, missing door or shelves, missing shallow depth of field, flat lighting, inconsistent tone, cinematic oversaturation, stylized filters, or AI artifacts.",
"clip": [
"2",
0
]
},
"class_type": "CLIPTextEncode",
"_meta": {
"title": "CLIP Text Encode (Prompt)"
}
},
"8": {
"inputs": {
"sampler_name": "euler"
},
"class_type": "KSamplerSelect",
"_meta": {
"title": "KSamplerSelect"
}
},
"9": {
"inputs": {
"steps": 20,
"max_shift": 2.05,
"base_shift": 0.95,
"stretch": true,
"terminal": 0.1,
"latent": [
"28",
0
]
},
"class_type": "LTXVScheduler",
"_meta": {
"title": "LTXVScheduler"
}
},
"11": {
"inputs": {
"noise_seed": 10
},
"class_type": "RandomNoise",
"_meta": {
"title": "RandomNoise"
}
},
"12": {
"inputs": {
"samples": [
"29",
0
],
"vae": [
"1",
2
]
},
"class_type": "VAEDecode",
"_meta": {
"title": "VAE Decode"
}
},
"13": {
"inputs": {
"ckpt_name": "ltx-av-step-1751000_vocoder_24K.safetensors"
},
"class_type": "LTXVAudioVAELoader",
"_meta": {
"title": "🅛🅣🅧 LTXV Audio VAE Loader"
}
},
"14": {
"inputs": {
"samples": [
"29",
1
],
"audio_vae": [
"13",
0
]
},
"class_type": "LTXVAudioVAEDecode",
"_meta": {
"title": "🅛🅣🅧 LTXV Audio VAE Decode"
}
},
"15": {
"inputs": {
"frame_rate": [
"23",
0
],
"loop_count": 0,
"filename_prefix": "AnimateDiff",
"format": "video/h264-mp4",
"pix_fmt": "yuv420p",
"crf": 19,
"save_metadata": true,
"trim_to_audio": false,
"pingpong": false,
"save_output": true,
"images": [
"12",
0
],
"audio": [
"14",
0
]
},
"class_type": "VHS_VideoCombine",
"_meta": {
"title": "Video Combine 🎥🅥🅗🅢"
}
},
"17": {
"inputs": {
"skip_blocks": "29",
"model": [
"28",
1
],
"positive": [
"22",
0
],
"negative": [
"22",
1
],
"parameters": [
"18",
0
]
},
"class_type": "MultimodalGuider",
"_meta": {
"title": "🅛🅣🅧 Multimodal Guider"
}
},
"18": {
"inputs": {
"modality": "VIDEO",
"cfg": 3,
"stg": 0,
"rescale": 0,
"modality_scale": 3,
"parameters": [
"19",
0
]
},
"class_type": "GuiderParameters",
"_meta": {
"title": "🅛🅣🅧 Guider Parameters"
}
},
"19": {
"inputs": {
"modality": "AUDIO",
"cfg": 7,
"stg": 0,
"rescale": 0,
"modality_scale": 3
},
"class_type": "GuiderParameters",
"_meta": {
"title": "🅛🅣🅧 Guider Parameters"
}
},
"21": {
"inputs": {
"audioUI": "",
"audio": [
"14",
0
]
},
"class_type": "PreviewAudio",
"_meta": {
"title": "PreviewAudio"
}
},
"22": {
"inputs": {
"frame_rate": [
"23",
0
],
"positive": [
"3",
0
],
"negative": [
"4",
0
]
},
"class_type": "LTXVConditioning",
"_meta": {
"title": "LTXVConditioning"
}
},
"23": {
"inputs": {
"value": 25
},
"class_type": "FloatConstant",
"_meta": {
"title": "Float Constant"
}
},
"26": {
"inputs": {
"frames_number": [
"27",
0
],
"frame_rate": [
"42",
0
],
"batch_size": 1
},
"class_type": "LTXVEmptyLatentAudio",
"_meta": {
"title": "🅛🅣🅧 LTXV Empty Latent Audio"
}
},
"27": {
"inputs": {
"value": 105
},
"class_type": "INTConstant",
"_meta": {
"title": "INT Constant"
}
},
"28": {
"inputs": {
"video_latent": [
"43",
0
],
"audio_latent": [
"26",
0
],
"model": [
"44",
0
]
},
"class_type": "LTXVConcatAVLatent",
"_meta": {
"title": "🅛🅣🅧 LTXV Concat AV Latent"
}
},
"29": {
"inputs": {
"av_latent": [
"41",
0
],
"model": [
"28",
1
]
},
"class_type": "LTXVSeparateAVLatent",
"_meta": {
"title": "🅛🅣🅧 LTXV Separate AV Latent"
}
},
"41": {
"inputs": {
"noise": [
"11",
0
],
"guider": [
"17",
0
],
"sampler": [
"8",
0
],
"sigmas": [
"9",
0
],
"latent_image": [
"28",
0
]
},
"class_type": "SamplerCustomAdvanced",
"_meta": {
"title": "SamplerCustomAdvanced"
}
},
"42": {
"inputs": {
"a": [
"23",
0
]
},
"class_type": "CM_FloatToInt",
"_meta": {
"title": "FloatToInt"
}
},
"43": {
"inputs": {
"width": 768,
"height": 512,
"length": [
"27",
0
],
"batch_size": 1
},
"class_type": "EmptyLTXVLatentVideo",
"_meta": {
"title": "EmptyLTXVLatentVideo"
}
},
"44": {
"inputs": {
"torch_compile": true,
"disable_backup": false,
"model": [
"1",
0
]
},
"class_type": "LTXVSequenceParallelMultiGPUPatcher",
"_meta": {
"title": "LTXVSequenceParallelMultiGPUPatcher"
}
},
"45": {
"inputs": {
"frame_idx": 0,
"strength": 1
},
"class_type": "LTXVAddGuide",
"_meta": {
"title": "LTXVAddGuide"
}
}
},
"comfy_fork_version": "feature/av_inference@a6994ed1",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
Temporal inpainting is basically video2video: mask only the time range of the video that you want to remake, and regenerate only that section.
(1) Input video (= existing video latent)
[ 🖼️ 🖼️ 🖼️ 🖼️ 🖼️ 🖼️ 🖼️ 🖼️ 🖼️ 🖼️ ]
(2) Specify section to remake (start_time ~ end_time)
e.g.: 2.0s ~ 4.0s
[ 🖼️ 🖼️ | 🖼️ 🖼️ 🖼️ | 🖼️ 🖼️ 🖼️ ]
^ ^
start_time end_time
(3) Mask only the specified section
[ 0 0 | 1 1 1 | 0 0 0 ]
└─── Mask ───┘
(4) Regenerate only the masked section
[ 🖼️ 🖼️ | ✨ ✨ ✨ | 🖼️ 🖼️ 🖼️ ]
└─ inpaint ─┘
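The masking step in the diagram above can be sketched in a few lines of Python. This is a minimal illustration, not the node's implementation: the function name `temporal_mask` is hypothetical, it works on individual frames rather than latents, and the rounding at the boundaries is an assumption.

```python
def temporal_mask(num_frames: int, fps: float,
                  start_time: float, end_time: float) -> list[int]:
    """Return a per-frame mask: 1 = regenerate, 0 = keep as-is.

    Hypothetical helper for illustration only. The real node works on
    latents, and its boundary rounding may differ.
    """
    # Convert the time range (seconds) into frame indices, clamped to the clip.
    start = max(0, round(start_time * fps))
    end = min(num_frames, round(end_time * fps))
    return [1 if start <= i < end else 0 for i in range(num_frames)]


# An 8-frame clip at 1 fps with the 2.0s to 5.0s range marked for regeneration.
print(temporal_mask(num_frames=8, fps=1.0, start_time=2.0, end_time=5.0))
# -> [0, 0, 1, 1, 1, 0, 0, 0]
```

Frames marked 0 are carried over from the input, and frames marked 1 are the ones the sampler is allowed to rewrite.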
Structurally, it is difficult to assemble a two-stage workflow (low resolution -> Hires.fix), so we generate at 1.5MP from the beginning.
1. LTXVAudioVideoMask
Specify the time range you want to inpaint.
- video_fps: Basically set to the same fps as the input video
- video_start_time: Inpainting start (seconds)
- video_end_time: Inpainting end (seconds)
- audio_start_time / audio_end_time: Basically the same as the video values, but by shifting them you can "edit video only while keeping sound" or "edit sound only while keeping video"
Can also Extend
If you set end_time beyond the length of the input video, the portion past the end of the input is newly generated, which extends the video.
e.g.: If the input is 2 seconds long
- Remake 2.0s → 5.0s (= newly generate everything after the 2-second mark to extend)
- start_time = 2.0 / end_time = 5.0
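To get a feel for how many frames an extension produces, the arithmetic can be sketched as below. It assumes the "8n+1" frame-count rule from the Recommended Settings also applies to the extended result. The helper name `extended_frame_count` is hypothetical, and whether the node snaps the length by itself is not stated here.

```python
import math


def extended_frame_count(end_time: float, fps: float) -> int:
    """Smallest frame count of the form 8n+1 that covers end_time seconds.

    Hypothetical helper: it only illustrates the arithmetic and is not
    taken from the node's source.
    """
    needed = math.ceil(end_time * fps)      # frames needed to reach end_time
    n = math.ceil(max(needed - 1, 0) / 8)   # round up to the next 8n+1
    return 8 * n + 1


fps = 24
input_frames = 2 * fps                      # the 2-second input video
total = extended_frame_count(end_time=5.0, fps=fps)
print(total, total - input_frames)          # total frames, newly generated frames
```

At 24 fps this gives 121 frames in total, of which 73 are newly generated after the 48 input frames.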
Output Example
IC-LoRA
IC-LoRA creates video from control signals such as pose, depth map, edges, etc.
Model Download
- loras
📂ComfyUI/
└── 📂models/
└── 📂loras/
├── ltx-2-19b-ic-lora-canny-control.safetensors
├── ltx-2-19b-ic-lora-depth-control.safetensors
├── ltx-2-19b-ic-lora-detailer.safetensors
└── ltx-2-19b-ic-lora-pose-control.safetensors
IC-LoRA (Pose)
Add control video based on text2video.

1. Resize Control Video
Align to the same ratio and resolution as the video to be generated.
- Resize to arbitrary size (here 1.5MP).
- Width and height must be multiples of 64.
- Enter half of the resized width and height into EmptyLTXVLatentVideo.
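The sizing rules above can be worked through as follows. This is a rough sketch under stated assumptions: the helper name `control_video_size` is hypothetical, 1MP is taken as 1,000,000 pixels, and each side is rounded to the nearest multiple of 64, which may differ from what the resize node does.

```python
import math


def control_video_size(src_w: int, src_h: int,
                       target_mp: float = 1.5, multiple: int = 64) -> tuple[int, int]:
    """Scale to roughly target_mp megapixels, keeping the aspect ratio,
    then snap each side to the nearest multiple of `multiple`.

    Hypothetical helper; 1 MP = 1,000,000 pixels is an assumption.
    """
    scale = math.sqrt(target_mp * 1_000_000 / (src_w * src_h))
    w = max(multiple, round(src_w * scale / multiple) * multiple)
    h = max(multiple, round(src_h * scale / multiple) * multiple)
    return w, h


w, h = control_video_size(1920, 1080)   # size for the control video
latent_w, latent_h = w // 2, h // 2     # values for EmptyLTXVLatentVideo
print((w, h), (latent_w, latent_h))
# -> (1664, 896) (832, 448)
```

Snapping to multiples of 64 means the halved values are multiples of 32, which matches the resolution rule in the Recommended Settings.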
2. Generate Pose Image
Create stick figure images from video.
- Extract pose with OpenPose or DWPose.
3. LTXVAddGuide
Put the control signal (pose video) into conditioning.
- Input the pose video created earlier into LTXVAddGuide.
4. Apply IC-LoRA
Apply IC-LoRA (Pose this time) and sample.
- IC-LoRA is designed assuming strength = 1.0.
- In this workflow, IC-LoRA is applied only to the 1st stage sampling.
- Letting the 2nd stage focus on refinement results in a cleaner video.
5. LTXVCropGuides
This is easiest to see if you decode once right after the 1st stage: the generated video has the pose video created earlier mixed into it.
- Focus on the latter half: Before LTXVCropGuides.mp4
This is exactly how IC-LoRA works, but since it is unnecessary for the output, remove it before entering the 2nd stage.
LTXVCropGuides is a node for removing control images from the latent / conditioning.
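Conceptually, the cropping step can be pictured like this. The sketch assumes the control frames sit at the end of the sequence, which is consistent with the "latter half" note above. It is not the node's actual code: the function name `crop_guides` is hypothetical, and the real node also updates the conditioning.

```python
def crop_guides(frames: list, num_guide_frames: int) -> list:
    """Drop the trailing control (guide) frames from a frame sequence.

    Hypothetical illustration: assumes the guides were appended at the end.
    """
    if num_guide_frames <= 0:
        return list(frames)
    return list(frames[:-num_guide_frames])


# "gen" = generated frames, "pose" = control frames mixed into the result.
sequence = ["gen0", "gen1", "gen2", "pose0", "pose1"]
print(crop_guides(sequence, num_guide_frames=2))
# -> ['gen0', 'gen1', 'gen2']
```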
The same workflow works for Canny or Depth: just swap the pose image and the IC-LoRA accordingly. Using only one control type at a time is recommended. (Applying Pose and Depth at the same time is not recommended.)
Output Example
IC-LoRA (Pose) + image2video
You cannot stack multiple IC-LoRAs, but you can combine with image2video or audio2video.

This simply combines the IC-LoRA (Pose) workflow above with image2video.
- Note that LTXVAddGuide is connected after LTXVImgToVideoInplace.
- Control won't work properly if the order is reversed.
- This is strictly image2video, not reference2video like VACE.
- Since the input image is "an image fixed as the 1st frame", if it deviates significantly from the 1st frame of the pose video, you won't get the expected video.
- Create an "image close to the 1st frame of pose" with ControlNet or Qwen-Image-Edit etc. in advance.
Output Example
IC-LoRA (Detailer)
IC-LoRA (Detailer) restores details and textures of low-resolution videos.
Install Custom Nodes
-
You can run it with just core nodes, but custom nodes are required to handle large resolutions / long duration videos.

Basically, this is video2video with IC-LoRA (Detailer) applied.
- 🟦 First, resize the input video to the desired final size.
- Use 🅛🅣🅧 LTXV Looping Sampler instead of SamplerCustomAdvanced.
- This works like Ultimate SD Upscale: it processes the video in tiles across time and space, which saves VRAM.
- In this workflow, only the time direction is tiled.
- It does not use the distilled LoRA, but still generates in 3 steps.
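The idea of tiling only in the time direction can be sketched as overlapping frame windows. This is a generic illustration of temporal tiling, not the Looping Sampler's implementation: the function name `temporal_tiles` and the parameters `tile_size` and `overlap` are hypothetical.

```python
def temporal_tiles(num_frames: int, tile_size: int, overlap: int) -> list[tuple[int, int]]:
    """Split a clip into overlapping [start, end) frame windows.

    Hypothetical sketch: each window would be sampled separately, and the
    overlapping frames blended so the seams are not visible.
    """
    if tile_size <= overlap:
        raise ValueError("tile_size must be larger than overlap")
    if num_frames <= 0:
        return []
    tiles = []
    start = 0
    while True:
        end = min(start + tile_size, num_frames)
        tiles.append((start, end))
        if end >= num_frames:
            break
        start = end - overlap   # step back so neighbouring tiles share frames
    return tiles


print(temporal_tiles(num_frames=257, tile_size=97, overlap=24))
# -> [(0, 97), (73, 170), (146, 243), (219, 257)]
```

Only one window has to fit in memory at a time, which is where the VRAM saving comes from.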
Output Example