Hires.fix

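What is Hires.fix?

The name sounds fancy, but what it does is not that complicated.

First, generate an image with text2image, then resize that image by 1.5x to 2x.
Next, feed the enlarged image into image2image and have the model redraw it.

Hires.fix is simply these steps bundled into one procedure.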
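The three steps can be sketched as plain size arithmetic. This is only an illustration: `snap_to_multiple` and `hires_plan` are hypothetical helpers, and snapping to multiples of 8 is an assumption based on the SD 1.x VAE mapping each 8 × 8 pixel block to one latent cell.

```python
def snap_to_multiple(value: float, multiple: int = 8) -> int:
    """Round to the nearest multiple of `multiple` (assumed: 8 px per latent cell)."""
    return max(multiple, int(round(value / multiple)) * multiple)


def hires_plan(width: int, height: int, scale: float) -> dict:
    """Return the image size used at each of the three Hires.fix stages."""
    if scale <= 1.0:
        raise ValueError("scale must be greater than 1.0")
    enlarged = (snap_to_multiple(width * scale), snap_to_multiple(height * scale))
    return {
        "text2image": (width, height),  # first pass at the model's native size
        "resize": enlarged,             # plain image resize, no model involved
        "image2image": enlarged,        # second pass redraws at the new size
    }


if __name__ == "__main__":
    for scale in (1.5, 2.0):
        print(scale, hires_plan(512, 512, scale))
```

With a 512 × 512px first pass, the 1.5x to 2x range mentioned above corresponds to a second pass at 768 × 768px to 1024 × 1024px.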

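Why was this technique created?

The recommended resolution for Stable Diffusion 1.5 was 512 × 512px, and it could not reliably generate larger images.

There are two main reasons for this.

Computational cost

As resolution goes up, the VRAM and compute time required grow steeply.
When image generation first appeared, it was far less optimized than it is today, so generating a large image in one shot was a very heavy operation.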
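A rough way to see how steeply the cost grows is to count latent positions. The sketch below assumes the SD 1.x latent is 1/8 of the pixel size per side and that naive self-attention scales with the square of the number of positions. It ignores every optimization and all other layers, so treat the numbers as relative intuition, not measurements. The function names are hypothetical.

```python
def latent_tokens(width: int, height: int, downscale: int = 8) -> int:
    """Number of latent positions for an image (assumed 8x downscale per side)."""
    return (width // downscale) * (height // downscale)


def relative_attention_cost(width: int, height: int, base: int = 512) -> float:
    """Naive self-attention cost relative to a base x base image.

    Assumes cost grows with the square of the token count.
    """
    ratio = latent_tokens(width, height) / latent_tokens(base, base)
    return ratio ** 2


if __name__ == "__main__":
    for side in (512, 768, 1024):
        tokens = latent_tokens(side, side)
        cost = relative_attention_cost(side, side)
        print(f"{side}px: {tokens} tokens, {cost:.2f}x attention cost")
```

Under these assumptions, doubling each side from 512px to 1024px gives 4 times as many latent positions and about 16 times the naive attention cost.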

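Training image size

The more fundamental issue is what image size the model was trained on.

Stable Diffusion 1.5 was trained almost entirely on 512 × 512px images.
In other words, it is good at drawing pictures around that size, but it never practiced other resolutions.

Imagine asking a manga artist to suddenly draw a picture covering an entire gymnasium wall.
Because they normally draw at manuscript-paper size, they would probably keep that sense of scale and fill the wall with small panels and characters.

They have never practiced drawing one giant picture across a whole wall, so the idea does not even occur to them.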

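The birth of Hires.fix

So the idea is to first have the model draw at around 512 × 512px, where it is strongest, then enlarge the result, and finally use the enlarged image as a rough draft for a second pass.

That is how this two-stage approach came about.
This trick of passing through the model's preferred resolution before lifting the image to high resolution is the idea behind Hires.fix.

The basic method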

SD1.5_Hires.fix.json

{
  "id": "8b9f7796-0873-4025-be3c-0f997f67f866",
  "revision": 0,
  "last_node_id": 17,
  "last_link_id": 37,
  "nodes": [
    {
      "id": 5,
      "type": "EmptyLatentImage",
      "pos": [
        582.1350317382813,
        606.5799999999999
      ],
      "size": [
        244.81999999999994,
        106
      ],
      "flags": {},
      "order": 0,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            2
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "EmptyLatentImage"
      },
      "widgets_values": [
        512,
        512,
        1
      ]
    },
    {
      "id": 15,
      "type": "VAEDecode",
      "pos": [
        2192.0144598529414,
        190.6545154746329
      ],
      "size": [
        192,
        46
      ],
      "flags": {},
      "order": 11,
      "mode": 0,
      "inputs": [
        {
          "name": "samples",
          "type": "LATENT",
          "link": 29
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 34
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "slot_index": 0,
          "links": [
            30
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAEDecode"
      }
    },
    {
      "id": 4,
      "type": "CheckpointLoaderSimple",
      "pos": [
        35.04463803391465,
        305.99511645379476
      ],
      "size": [
        315,
        98
      ],
      "flags": {},
      "order": 1,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            16,
            31
          ]
        },
        {
          "name": "CLIP",
          "type": "CLIP",
          "slot_index": 1,
          "links": [
            17,
            18
          ]
        },
        {
          "name": "VAE",
          "type": "VAE",
          "slot_index": 2,
          "links": []
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CheckpointLoaderSimple"
      },
      "widgets_values": [
        "v1-5-pruned-emaonly-fp16.safetensors"
      ]
    },
    {
      "id": 16,
      "type": "SaveImage",
      "pos": [
        2413.5562680422718,
        190.54464832962913
      ],
      "size": [
        440.8026035004723,
        492.16667321788407
      ],
      "flags": {},
      "order": 12,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 30
        }
      ],
      "outputs": [],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33"
      },
      "widgets_values": [
        "ComfyUI"
      ]
    },
    {
      "id": 7,
      "type": "CLIPTextEncode",
      "pos": [
        416.1970166015625,
        392.37848510742185
      ],
      "size": [
        410.75801513671877,
        158.82607910156253
      ],
      "flags": {},
      "order": 4,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 18
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            6,
            22
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "worst quality, text, watermark"
      ]
    },
    {
      "id": 10,
      "type": "VAELoader",
      "pos": [
        896.9256198347109,
        68.77178286934158
      ],
      "size": [
        281.0743801652891,
        58
      ],
      "flags": {},
      "order": 2,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "VAE",
          "type": "VAE",
          "links": [
            10,
            33,
            34
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.76",
        "Node name for S&R": "VAELoader"
      },
      "widgets_values": [
        "vae-ft-mse-840000-ema-pruned.safetensors"
      ]
    },
    {
      "id": 17,
      "type": "PreviewImage",
      "pos": [
        1423.6732583128128,
        328.6264740212463
      ],
      "size": [
        245.18407212622105,
        286.5709992486851
      ],
      "flags": {},
      "order": 8,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 37
        }
      ],
      "outputs": [],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.76",
        "Node name for S&R": "PreviewImage"
      },
      "widgets_values": []
    },
    {
      "id": 3,
      "type": "KSampler",
      "pos": [
        863,
        186
      ],
      "size": [
        315,
        262
      ],
      "flags": {},
      "order": 5,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 16
        },
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 4
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "link": 6
        },
        {
          "name": "latent_image",
          "type": "LATENT",
          "link": 2
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            7
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "KSampler"
      },
      "widgets_values": [
        10000,
        "fixed",
        20,
        8,
        "euler",
        "normal",
        1
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 12,
      "type": "ImageScaleBy",
      "pos": [
        1424.2369484504152,
        186
      ],
      "size": [
        210,
        82
      ],
      "flags": {},
      "order": 7,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 19
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            24
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.76",
        "Node name for S&R": "ImageScaleBy"
      },
      "widgets_values": [
        "nearest-exact",
        1.5
      ],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 6,
      "type": "CLIPTextEncode",
      "pos": [
        415,
        186
      ],
      "size": [
        411.95503173828126,
        151.0030493164063
      ],
      "flags": {},
      "order": 3,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 17
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            4,
            32
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "high quality,high detailed, RAW photo of a white fluffy puppy,rimlight,on the desk,blurry background,house plant"
      ]
    },
    {
      "id": 14,
      "type": "VAEEncode",
      "pos": [
        1661.3554226756228,
        186
      ],
      "size": [
        164.5454545454545,
        46
      ],
      "flags": {},
      "order": 9,
      "mode": 0,
      "inputs": [
        {
          "name": "pixels",
          "type": "IMAGE",
          "link": 24
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 33
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "links": [
            26
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.76",
        "Node name for S&R": "VAEEncode"
      }
    },
    {
      "id": 8,
      "type": "VAEDecode",
      "pos": [
        1205.1184742252076,
        186
      ],
      "size": [
        179.27272727272702,
        46
      ],
      "flags": {},
      "order": 6,
      "mode": 0,
      "inputs": [
        {
          "name": "samples",
          "type": "LATENT",
          "link": 7
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 10
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "slot_index": 0,
          "links": [
            19,
            37
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAEDecode"
      },
      "widgets_values": []
    },
    {
      "id": 13,
      "type": "KSampler",
      "pos": [
        1846.4738969008304,
        187
      ],
      "size": [
        315,
        262
      ],
      "flags": {},
      "order": 10,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 31
        },
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 32
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "link": 22
        },
        {
          "name": "latent_image",
          "type": "LATENT",
          "link": 26
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            29
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "KSampler"
      },
      "widgets_values": [
        10000,
        "fixed",
        20,
        8,
        "euler",
        "normal",
        0.6
      ],
      "color": "#432",
      "bgcolor": "#653"
    }
  ],
  "links": [
    [
      2,
      5,
      0,
      3,
      3,
      "LATENT"
    ],
    [
      4,
      6,
      0,
      3,
      1,
      "CONDITIONING"
    ],
    [
      6,
      7,
      0,
      3,
      2,
      "CONDITIONING"
    ],
    [
      7,
      3,
      0,
      8,
      0,
      "LATENT"
    ],
    [
      10,
      10,
      0,
      8,
      1,
      "VAE"
    ],
    [
      16,
      4,
      0,
      3,
      0,
      "MODEL"
    ],
    [
      17,
      4,
      1,
      6,
      0,
      "CLIP"
    ],
    [
      18,
      4,
      1,
      7,
      0,
      "CLIP"
    ],
    [
      19,
      8,
      0,
      12,
      0,
      "IMAGE"
    ],
    [
      22,
      7,
      0,
      13,
      2,
      "CONDITIONING"
    ],
    [
      24,
      12,
      0,
      14,
      0,
      "IMAGE"
    ],
    [
      26,
      14,
      0,
      13,
      3,
      "LATENT"
    ],
    [
      29,
      13,
      0,
      15,
      0,
      "LATENT"
    ],
    [
      30,
      15,
      0,
      16,
      0,
      "IMAGE"
    ],
    [
      31,
      4,
      0,
      13,
      0,
      "MODEL"
    ],
    [
      32,
      6,
      0,
      13,
      1,
      "CONDITIONING"
    ],
    [
      33,
      10,
      0,
      14,
      1,
      "VAE"
    ],
    [
      34,
      10,
      0,
      15,
      1,
      "VAE"
    ],
    [
      37,
      8,
      0,
      17,
      0,
      "IMAGE"
    ]
  ],
  "groups": [],
  "config": {},
  "extra": {
    "ds": {
      "scale": 0.6830134553650705,
      "offset": [
        64.95536196608535,
        32.692317130658424
      ]
    },
    "frontendVersion": "1.34.6",
    "VHS_latentpreview": false,
    "VHS_latentpreviewrate": 0,
    "VHS_MetadataImage": true,
    "VHS_KeepIntermediate": true,
    "linkExtensions": [
      {
        "id": 31,
        "parentId": 2
      },
      {
        "id": 33,
        "parentId": 5
      },
      {
        "id": 34,
        "parentId": 6
      }
    ],
    "reroutes": [
      {
        "id": 1,
        "pos": [
          441.0480618422728,
          19.664788129629585
        ],
        "linkIds": [
          31
        ]
      },
      {
        "id": 2,
        "parentId": 1,
        "pos": [
          1771.4727618422721,
          24.549888129629558
        ],
        "linkIds": [
          31
        ]
      },
      {
        "id": 5,
        "pos": [
          1624.7392361419663,
          86.84280922035722
        ],
        "linkIds": [
          33,
          34
        ]
      },
      {
        "id": 6,
        "parentId": 5,
        "pos": [
          2154.447837765052,
          96.76734244275261
        ],
        "linkIds": [
          34
        ]
      }
    ]
  },
  "version": 0.4
}

🟪 text2image
🟦 Enlarge the decoded image by 1.5x with the Upscale Image By node
🟨 Feed the enlarged image into image2image
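To check the order of the stages without opening the graph editor, the links in the workflow JSON can be followed in a few lines of Python. This is a minimal sketch: the data below is a trimmed excerpt of the node ids, node types, and LATENT/IMAGE links from the workflow above, the link layout `[link_id, from_node, from_slot, to_node, to_slot, type]` is inferred from that JSON, and `trace_chain` is a hypothetical helper.

```python
# Trimmed excerpt of SD1.5_Hires.fix.json: only the main LATENT/IMAGE path.
WORKFLOW = {
    "nodes": [
        {"id": 5, "type": "EmptyLatentImage"},
        {"id": 3, "type": "KSampler"},
        {"id": 8, "type": "VAEDecode"},
        {"id": 12, "type": "ImageScaleBy"},
        {"id": 14, "type": "VAEEncode"},
        {"id": 13, "type": "KSampler"},
        {"id": 15, "type": "VAEDecode"},
        {"id": 16, "type": "SaveImage"},
    ],
    "links": [
        [2, 5, 0, 3, 3, "LATENT"],
        [7, 3, 0, 8, 0, "LATENT"],
        [19, 8, 0, 12, 0, "IMAGE"],
        [24, 12, 0, 14, 0, "IMAGE"],
        [26, 14, 0, 13, 3, "LATENT"],
        [29, 13, 0, 15, 0, "LATENT"],
        [30, 15, 0, 16, 0, "IMAGE"],
    ],
}


def trace_chain(workflow: dict, start_id: int) -> list:
    """Follow LATENT/IMAGE links from one node and return the node types visited."""
    types = {node["id"]: node["type"] for node in workflow["nodes"]}
    next_node = {}
    for _link_id, src, _src_slot, dst, _dst_slot, kind in workflow["links"]:
        if kind in ("LATENT", "IMAGE"):
            next_node.setdefault(src, dst)  # keep the first outgoing link only

    chain, seen, node = [], set(), start_id
    while node is not None and node not in seen:
        seen.add(node)
        chain.append(types[node])
        node = next_node.get(node)
    return chain


if __name__ == "__main__":
    print(" -> ".join(trace_chain(WORKFLOW, start_id=5)))
```

The chain shows the two KSampler passes with the decode, Upscale Image By (`ImageScaleBy`), and encode steps between them.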

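Upscaling while staying in latent space

In the previous workflow, the text2image result was decoded to a pixel image, enlarged, converted back to a latent, and then passed to image2image.
This raises the question: can't we enlarge the latent directly instead of going back to a pixel image?

However, simply resizing the latent causes unacceptable degradation.
For that reason the approach was impractical for a long time, until custom nodes appeared that upscale latents with reduced degradation.

Goktug/ComfyUI_NNLatentUpscale (forked from Ttl)
- Upscales the latent using a neural network.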

SD1.5_Hires.fix_NNLatentUpscale.json

{
  "id": "8b9f7796-0873-4025-be3c-0f997f67f866",
  "revision": 0,
  "last_node_id": 18,
  "last_link_id": 40,
  "nodes": [
    {
      "id": 5,
      "type": "EmptyLatentImage",
      "pos": [
        582.1350317382813,
        606.5799999999999
      ],
      "size": [
        244.81999999999994,
        106
      ],
      "flags": {},
      "order": 0,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            2
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "EmptyLatentImage"
      },
      "widgets_values": [
        512,
        512,
        1
      ]
    },
    {
      "id": 15,
      "type": "VAEDecode",
      "pos": [
        1797.4334510317049,
        183.00700000000006
      ],
      "size": [
        192,
        46
      ],
      "flags": {},
      "order": 8,
      "mode": 0,
      "inputs": [
        {
          "name": "samples",
          "type": "LATENT",
          "link": 29
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 34
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "slot_index": 0,
          "links": [
            30
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "VAEDecode"
      },
      "widgets_values": []
    },
    {
      "id": 4,
      "type": "CheckpointLoaderSimple",
      "pos": [
        35.04463803391465,
        305.99511645379476
      ],
      "size": [
        315,
        98
      ],
      "flags": {},
      "order": 1,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            16,
            31
          ]
        },
        {
          "name": "CLIP",
          "type": "CLIP",
          "slot_index": 1,
          "links": [
            17,
            18
          ]
        },
        {
          "name": "VAE",
          "type": "VAE",
          "slot_index": 2,
          "links": []
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CheckpointLoaderSimple"
      },
      "widgets_values": [
        "v1-5-pruned-emaonly-fp16.safetensors"
      ]
    },
    {
      "id": 16,
      "type": "SaveImage",
      "pos": [
        2020.9112680422732,
        183.00700000000006
      ],
      "size": [
        440.8026035004723,
        492.16667321788407
      ],
      "flags": {},
      "order": 9,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 30
        }
      ],
      "outputs": [],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33"
      },
      "widgets_values": [
        "ComfyUI"
      ]
    },
    {
      "id": 7,
      "type": "CLIPTextEncode",
      "pos": [
        416.1970166015625,
        392.37848510742185
      ],
      "size": [
        410.75801513671877,
        158.82607910156253
      ],
      "flags": {},
      "order": 4,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 18
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            6,
            22
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "worst quality, text, watermark"
      ]
    },
    {
      "id": 6,
      "type": "CLIPTextEncode",
      "pos": [
        415,
        183.00700000000006
      ],
      "size": [
        411.95503173828126,
        151.0030493164063
      ],
      "flags": {},
      "order": 3,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 17
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            4,
            32
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "high quality,high detailed, RAW photo of a white fluffy puppy,rimlight,on the desk,blurry background,house plant"
      ]
    },
    {
      "id": 3,
      "type": "KSampler",
      "pos": [
        863,
        183.00700000000006
      ],
      "size": [
        315,
        262
      ],
      "flags": {},
      "order": 5,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 16
        },
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 4
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "link": 6
        },
        {
          "name": "latent_image",
          "type": "LATENT",
          "link": 2
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            38
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "KSampler"
      },
      "widgets_values": [
        10000,
        "fixed",
        20,
        8,
        "euler",
        "normal",
        1
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 13,
      "type": "KSampler",
      "pos": [
        1450.9556340211366,
        183.00700000000006
      ],
      "size": [
        315,
        262
      ],
      "flags": {},
      "order": 7,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 31
        },
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 32
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "link": 22
        },
        {
          "name": "latent_image",
          "type": "LATENT",
          "link": 39
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            29
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.33",
        "Node name for S&R": "KSampler"
      },
      "widgets_values": [
        10000,
        "fixed",
        20,
        8,
        "euler",
        "normal",
        0.6
      ],
      "color": "#432",
      "bgcolor": "#653"
    },
    {
      "id": 18,
      "type": "NNLatentUpscale",
      "pos": [
        1209.4778170105683,
        183.00700000000006
      ],
      "size": [
        210,
        82
      ],
      "flags": {},
      "order": 6,
      "mode": 0,
      "inputs": [
        {
          "name": "latent",
          "type": "LATENT",
          "link": 38
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "links": [
            39
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfyui_nnlatentupscale",
        "ver": "7657841c7113345ef407c498985c141ffff38eba",
        "Node name for S&R": "NNLatentUpscale"
      },
      "widgets_values": [
        "SD 1.x",
        1.5
      ],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 10,
      "type": "VAELoader",
      "pos": [
        1484.8812538558475,
        66.90901890016424
      ],
      "size": [
        281.0743801652891,
        58
      ],
      "flags": {},
      "order": 2,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "VAE",
          "type": "VAE",
          "links": [
            34
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.76",
        "Node name for S&R": "VAELoader"
      },
      "widgets_values": [
        "vae-ft-mse-840000-ema-pruned.safetensors"
      ]
    }
  ],
  "links": [
    [
      2,
      5,
      0,
      3,
      3,
      "LATENT"
    ],
    [
      4,
      6,
      0,
      3,
      1,
      "CONDITIONING"
    ],
    [
      6,
      7,
      0,
      3,
      2,
      "CONDITIONING"
    ],
    [
      16,
      4,
      0,
      3,
      0,
      "MODEL"
    ],
    [
      17,
      4,
      1,
      6,
      0,
      "CLIP"
    ],
    [
      18,
      4,
      1,
      7,
      0,
      "CLIP"
    ],
    [
      22,
      7,
      0,
      13,
      2,
      "CONDITIONING"
    ],
    [
      29,
      13,
      0,
      15,
      0,
      "LATENT"
    ],
    [
      30,
      15,
      0,
      16,
      0,
      "IMAGE"
    ],
    [
      31,
      4,
      0,
      13,
      0,
      "MODEL"
    ],
    [
      32,
      6,
      0,
      13,
      1,
      "CONDITIONING"
    ],
    [
      34,
      10,
      0,
      15,
      1,
      "VAE"
    ],
    [
      38,
      3,
      0,
      18,
      0,
      "LATENT"
    ],
    [
      39,
      18,
      0,
      13,
      3,
      "LATENT"
    ]
  ],
  "groups": [],
  "config": {},
  "extra": {
    "ds": {
      "scale": 0.8264462809917354,
      "offset": [
        64.95536196608535,
        33.09098109983576
      ]
    },
    "frontendVersion": "1.34.6",
    "VHS_latentpreview": false,
    "VHS_latentpreviewrate": 0,
    "VHS_MetadataImage": true,
    "VHS_KeepIntermediate": true,
    "reroutes": [
      {
        "id": 1,
        "pos": [
          410.0480618422728,
          106.66478812962959
        ],
        "linkIds": [
          31
        ]
      },
      {
        "id": 2,
        "parentId": 1,
        "pos": [
          1411.8277618422733,
          107.55688812962956
        ],
        "linkIds": [
          31
        ]
      }
    ],
    "linkExtensions": [
      {
        "id": 31,
        "parentId": 2
      }
    ]
  },
  "version": 0.4
}

🟩 Enlarge the latent from text2image directly with the NNLatentUpscale node
🟨 Pass the enlarged latent straight into image2image
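The shape arithmetic behind this path can be sketched as follows. It assumes the usual SD 1.x latent layout of 4 channels at 1/8 of the pixel size per side. How the NNLatentUpscale node itself rounds sizes is not covered here, so the rounding below is an assumption and the helper names are hypothetical.

```python
def latent_shape(width: int, height: int, channels: int = 4, downscale: int = 8) -> tuple:
    """(channels, latent_height, latent_width) for a given pixel size."""
    return (channels, height // downscale, width // downscale)


def upscale_latent_shape(shape: tuple, scale: float) -> tuple:
    """Scale only the spatial dimensions; the channel count stays the same."""
    channels, height, width = shape
    return (channels, round(height * scale), round(width * scale))


def decoded_size(shape: tuple, downscale: int = 8) -> tuple:
    """(width, height) in pixels after the VAE decodes a latent of this shape."""
    _channels, height, width = shape
    return (width * downscale, height * downscale)


if __name__ == "__main__":
    first_pass = latent_shape(512, 512)
    second_pass = upscale_latent_shape(first_pass, 1.5)
    print(first_pass, "->", second_pass, "->", decoded_size(second_pass))
```

With the 1.5x setting used in this workflow, the 64 × 64 latent from the first pass becomes 96 × 96, which decodes to the same 768 × 768px as the pixel-space workflow, without the intermediate decode and encode.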

This is only my impression, but I think the method that decodes to a pixel image first gives better quality.
