Ideogram 4.0

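What is Ideogram 4.0?

Ideogram 4.0 is a 9.3B-parameter DiT-based model.

Its defining feature is that it takes captions in JSON form, which lets you specify the elements of an image in considerable detail.

A model called FIBO took a similar approach, but Ideogram 4.0 is particularly strong at coordinate specification via bounding boxes (BBOX) and at color specification, which makes it well suited to DTP-style design tasks such as posters, logos, UI, and packaging.

In exchange for that control, the model only delivers its full performance when the prompt is written in the prescribed format, so it may not be the most casual model to use.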

Downloading the models

diffusion_models
- ideogram4_fp8_scaled.safetensors (9.28 GB)
- ideogram4_nvfp4_mixed.safetensors (5.49 GB)
- ideogram4_unconditional_fp8_scaled.safetensors (9.28 GB)
- ideogram4_unconditional_nvfp4_mixed.safetensors (5.49 GB)
text_encoders
- qwen3vl_8b_fp8_scaled.safetensors (10.6 GB)
vae
- flux2-vae.safetensors (336 MB)

📂ComfyUI/
└── 📂models/
    ├── 📂diffusion_models/
    │   ├── ideogram4_fp8_scaled.safetensors
    │   ├── ideogram4_nvfp4_mixed.safetensors
    │   ├── ideogram4_unconditional_fp8_scaled.safetensors
    │   └── ideogram4_unconditional_nvfp4_mixed.safetensors
    ├── 📂text_encoders/
    │   └── qwen3vl_8b_fp8_scaled.safetensors
    └── 📂vae/
        └── flux2-vae.safetensors

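The details come later, but because the workflow loads two diffusion models, it is quite heavy.

ComfyUI juggles memory internally, so running short of VRAM does not make generation impossible, but it does make it very slow.

Using the nvfp4 files is one way to lighten the load, at the cost of quality.

The unconditional model has less effect on quality, so using fp8 for the regular model and nvfp4 for the unconditional model may be a good compromise.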
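As a rough aid when weighing these combinations, the file sizes from the download list can simply be added up. This is a minimal sketch: it only sums download sizes, which hints at but does not equal actual VRAM use, and the helper name `total_download_gb` is made up for illustration.

```python
# File sizes in GB, copied from the download list in this article.
SIZES_GB = {
    "ideogram4_fp8_scaled.safetensors": 9.28,
    "ideogram4_nvfp4_mixed.safetensors": 5.49,
    "ideogram4_unconditional_fp8_scaled.safetensors": 9.28,
    "ideogram4_unconditional_nvfp4_mixed.safetensors": 5.49,
    "qwen3vl_8b_fp8_scaled.safetensors": 10.6,
    "flux2-vae.safetensors": 0.336,  # listed as 336 MB
}

# Every combination needs the text encoder and the VAE.
SHARED = ["qwen3vl_8b_fp8_scaled.safetensors", "flux2-vae.safetensors"]


def total_download_gb(conditional: str, unconditional: str) -> float:
    """Sum one conditional model, one unconditional model, and the shared files."""
    names = [conditional, unconditional, *SHARED]
    return round(sum(SIZES_GB[name] for name in names), 3)


if __name__ == "__main__":
    combos = {
        "fp8 + fp8": ("ideogram4_fp8_scaled.safetensors",
                      "ideogram4_unconditional_fp8_scaled.safetensors"),
        "fp8 + nvfp4": ("ideogram4_fp8_scaled.safetensors",
                        "ideogram4_unconditional_nvfp4_mixed.safetensors"),
        "nvfp4 + nvfp4": ("ideogram4_nvfp4_mixed.safetensors",
                          "ideogram4_unconditional_nvfp4_mixed.safetensors"),
    }
    for label, (cond, uncond) in combos.items():
        print(f"{label}: {total_download_gb(cond, uncond)} GB on disk")
```

The mixed fp8 + nvfp4 setup lands between the other two, which is consistent with treating it as the middle ground between quality and weight.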

Prompt

Plain natural-language prompts do generate images, but quality falls off noticeably unless the prompt follows the prescribed JSON schema.

The basic form looks like this.

{
  "high_level_description": "A one- or two-sentence description of the whole image.",
  "style_description": {
    "aesthetics": "Mood and aesthetic qualities.",
    "lighting": "Lighting.",
    "medium": "illustration / photograph / graphic_design, etc.",
    "art_style": "The art style, for non-photographic images.",
    "color_palette": ["#FFFFFF", "#000000"]
  },
  "compositional_deconstruction": {
    "background": "Description of the background and environment.",
    "elements": [
      {
        "type": "obj",
        "bbox": [100, 200, 800, 700],
        "desc": "Description of the object, person, or element.",
        "color_palette": ["#FFFFFF", "#000000"]
      },
      {
        "type": "text",
        "bbox": [820, 200, 920, 800],
        "text": "HELLO",
        "desc": "Description of how the text looks.",
        "color_palette": ["#000000"]
      }
    ]
  }
}
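Because the prompt is just JSON text, it can also be assembled in code instead of typed by hand. The following is a minimal sketch in Python. The helper names (`make_obj`, `make_text`, `build_caption`, `to_prompt`) are hypothetical, and the field names are taken from the template in this section.

```python
import json


def make_obj(bbox, desc, colors):
    """Build an 'obj' element with the fields used in the template."""
    return {"type": "obj", "bbox": list(bbox), "desc": desc,
            "color_palette": list(colors)}


def make_text(bbox, text, desc, colors):
    """Build a 'text' element; 'text' holds the literal string to render."""
    return {"type": "text", "bbox": list(bbox), "text": text, "desc": desc,
            "color_palette": list(colors)}


def build_caption(description, style, background, elements):
    """Assemble the three top-level sections of the caption."""
    return {
        "high_level_description": description,
        "style_description": style,
        "compositional_deconstruction": {
            "background": background,
            "elements": list(elements),
        },
    }


def to_prompt(caption):
    """Serialize the caption; ensure_ascii=False keeps non-ASCII text readable."""
    return json.dumps(caption, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    caption = build_caption(
        "A minimal poster with a single word.",
        {"aesthetics": "clean", "lighting": "flat", "medium": "graphic_design",
         "color_palette": ["#FFFFFF", "#000000"]},
        "Plain white background.",
        [make_text([820, 200, 920, 800], "HELLO", "Bold black sans-serif.", ["#000000"])],
    )
    print(to_prompt(caption))
```

The printed string is what goes into the text field of the prompt node, in the same way the workflows in this article carry their JSON captions.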

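The structure itself is simple (an overall description, the style, the background, then a description of each element), but nobody wants to write this by hand every time.

Coordinates are the most tedious part. You have to specify with a BBOX where in the image each element goes, and picturing that in your head is close to impossible.

So here are a few ways to build prompts.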

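Let an LLM do it

The easiest approach is to give an LLM the official prompt guide together with a description of the image you want, and have it convert that into a JSON caption.

You can also hand it a reference image or a rough sketch of your own.

Local models small enough to run inside ComfyUI are not capable enough for this, so it is better to simply rely on ChatGPT, Gemini, or similar.

Sample chat with ChatGPT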
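LLM output is worth a quick sanity check before pasting it into ComfyUI. The sketch below only verifies the basic shape seen in this article's template. Which keys are truly mandatory, and whether element types other than `obj` and `text` exist, are assumptions here, so treat the official prompt guide as the authority. `check_caption` is a hypothetical helper name.

```python
import json

# Top-level sections that appear in the template shown in this article.
REQUIRED_TOP = ("high_level_description", "style_description",
                "compositional_deconstruction")


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_caption(raw: str) -> list:
    """Return a list of problems; an empty list means the basic shape looks fine."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top level must be a JSON object"]

    problems = [f"missing key: {key}" for key in REQUIRED_TOP if key not in data]

    comp = data.get("compositional_deconstruction")
    elements = comp.get("elements", []) if isinstance(comp, dict) else []
    if not isinstance(elements, list):
        problems.append("elements must be a list")
        elements = []

    for i, element in enumerate(elements):
        if not isinstance(element, dict):
            problems.append(f"elements[{i}] is not an object")
            continue
        kind = element.get("type")
        if kind not in ("obj", "text"):  # only these two appear in this article
            problems.append(f"elements[{i}]: unexpected type {kind!r}")
        bbox = element.get("bbox")
        if not (isinstance(bbox, list) and len(bbox) == 4
                and all(_is_number(v) for v in bbox)):
            problems.append(f"elements[{i}]: bbox must be a list of 4 numbers")
        if kind == "text" and "text" not in element:
            problems.append(f"elements[{i}]: text element has no 'text' field")
    return problems
```

If the list comes back non-empty, feeding the messages back to the LLM and asking for a corrected caption is usually quicker than fixing the JSON by hand.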

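Use a dedicated prompt builder

Another option is a dedicated prompt builder that lets you create the prompt visually.

For example, the Ideogram 4 Prompt Builder KJ node in ComfyUI-KJNodes is one of the commonly used ones.

Set the size of the image to generate, then fill in the background, style, and so on.
Dragging in the region field creates a BBOX, to which you assign a prompt for what should be drawn there, along with its color codes.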
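Conceptually, turning a dragged rectangle into a BBOX is a coordinate conversion. The sketch below is not the KJNodes implementation. It also assumes that the four numbers are `[y_min, x_min, y_max, x_max]` on a 0 to 1000 grid, which is only inferred from the example captions in this article (the element described as sitting in the lower right corner has the bbox `[955, 790, 995, 985]`), so confirm the order and scale against the official prompt guide.

```python
def pixel_rect_to_bbox(left, top, right, bottom, width, height, scale=1000):
    """Convert a pixel rectangle to a bbox on a 0..scale grid.

    ASSUMPTION: the output order is [y_min, x_min, y_max, x_max]. This is
    inferred from the example captions, not taken from a specification.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image size must be positive")
    if not (0 <= left < right <= width and 0 <= top < bottom <= height):
        raise ValueError("rectangle must lie inside the image")

    def normalize(value, size):
        # Map a pixel coordinate to the normalized grid.
        return round(value * scale / size)

    return [
        normalize(top, height),
        normalize(left, width),
        normalize(bottom, height),
        normalize(right, width),
    ]


if __name__ == "__main__":
    # The top-left quarter of a 1024x1536 canvas.
    print(pixel_rect_to_bbox(0, 0, 512, 768, 1024, 1536))
```

Normalizing to a fixed grid keeps the same caption usable at different output resolutions, provided the aspect ratio stays the same.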

text2image

Ideogram_4.0_text2image.json

{
  "id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
  "revision": 0,
  "last_node_id": 109,
  "last_link_id": 172,
  "nodes": [
    {
      "id": 76,
      "type": "KSamplerSelect",
      "pos": [
        560.1252292712549,
        324.6528974302894
      ],
      "size": [
        270,
        68.88020833333334
      ],
      "flags": {},
      "order": 0,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "SAMPLER",
          "type": "SAMPLER",
          "links": [
            132
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "KSamplerSelect",
        "cnr_id": "comfy-core",
        "ver": "0.3.56",
        "enableTabs": false,
        "tabWidth": 65,
        "tabXOffset": 10,
        "hasSecondTab": false,
        "secondTabText": "Send Back",
        "secondTabOffset": 80,
        "secondTabWidth": 65
      },
      "widgets_values": [
        "euler"
      ]
    },
    {
      "id": 8,
      "type": "VAEDecode",
      "pos": [
        1180.6146240234375,
        195.84114925861235
      ],
      "size": [
        157.56002807617188,
        46
      ],
      "flags": {},
      "order": 15,
      "mode": 0,
      "inputs": [
        {
          "name": "samples",
          "type": "LATENT",
          "link": 164
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 76
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "slot_index": 0,
          "links": [
            101
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "VAEDecode",
        "cnr_id": "comfy-core",
        "ver": "0.3.33"
      },
      "widgets_values": []
    },
    {
      "id": 92,
      "type": "EmptyFlux2LatentImage",
      "pos": [
        560.1252292712549,
        721.3543590454891
      ],
      "size": [
        270,
        106
      ],
      "flags": {},
      "order": 9,
      "mode": 0,
      "inputs": [
        {
          "name": "width",
          "type": "INT",
          "widget": {
            "name": "width"
          },
          "link": 153
        },
        {
          "name": "height",
          "type": "INT",
          "widget": {
            "name": "height"
          },
          "link": 154
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "links": [
            152
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "EmptyFlux2LatentImage"
      },
      "widgets_values": [
        1024,
        1024,
        1
      ]
    },
    {
      "id": 88,
      "type": "DualModelGuider",
      "pos": [
        560.1252292712549,
        119.74227078935617
      ],
      "size": [
        270,
        118
      ],
      "flags": {},
      "order": 13,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 172
        },
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 146
        },
        {
          "name": "model_negative",
          "shape": 7,
          "type": "MODEL",
          "link": 149
        },
        {
          "name": "negative",
          "shape": 7,
          "type": "CONDITIONING",
          "link": 145
        }
      ],
      "outputs": [
        {
          "name": "GUIDER",
          "type": "GUIDER",
          "links": [
            144
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "DualModelGuider"
      },
      "widgets_values": [
        7
      ]
    },
    {
      "id": 91,
      "type": "CLIPLoader",
      "pos": [
        -510.80318076960083,
        311.42496280403503
      ],
      "size": [
        289.8073985431536,
        106
      ],
      "flags": {},
      "order": 1,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "CLIP",
          "type": "CLIP",
          "links": [
            150
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "CLIPLoader"
      },
      "widgets_values": [
        "qwen3vl_8b_fp8_scaled.safetensors",
        "ideogram4",
        "default"
      ],
      "color": "#432",
      "bgcolor": "#653"
    },
    {
      "id": 39,
      "type": "VAELoader",
      "pos": [
        904.0583918587276,
        72.88171068869869
      ],
      "size": [
        242.12760404770165,
        58
      ],
      "flags": {},
      "order": 2,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "VAE",
          "type": "VAE",
          "slot_index": 0,
          "links": [
            76
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "VAELoader",
        "cnr_id": "comfy-core",
        "ver": "0.3.33"
      },
      "widgets_values": [
        "flux2-vae.safetensors"
      ],
      "color": "#322",
      "bgcolor": "#533"
    },
    {
      "id": 90,
      "type": "UNETLoader",
      "pos": [
        -96.592885685151,
        131.5898766922886
      ],
      "size": [
        305.3782043457031,
        82
      ],
      "flags": {},
      "order": 3,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            149
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "UNETLoader",
        "cnr_id": "comfy-core",
        "ver": "0.3.33"
      },
      "widgets_values": [
        "ideogram4_unconditional_nvfp4_mixed.safetensors",
        "default"
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 87,
      "type": "ConditioningZeroOut",
      "pos": [
        278.5119793477361,
        427.0196777116257
      ],
      "size": [
        211.88658923633488,
        26
      ],
      "flags": {},
      "order": 12,
      "mode": 0,
      "inputs": [
        {
          "name": "conditioning",
          "type": "CONDITIONING",
          "link": 142
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "links": [
            145
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "ConditioningZeroOut"
      },
      "widgets_values": []
    },
    {
      "id": 79,
      "type": "SamplerCustomAdvanced",
      "pos": [
        904.0583918587276,
        195.84114925861235
      ],
      "size": [
        242.12760404770165,
        106
      ],
      "flags": {},
      "order": 14,
      "mode": 0,
      "inputs": [
        {
          "name": "noise",
          "type": "NOISE",
          "link": 130
        },
        {
          "name": "guider",
          "type": "GUIDER",
          "link": 144
        },
        {
          "name": "sampler",
          "type": "SAMPLER",
          "link": 132
        },
        {
          "name": "sigmas",
          "type": "SIGMAS",
          "link": 167
        },
        {
          "name": "latent_image",
          "type": "LATENT",
          "link": 152
        }
      ],
      "outputs": [
        {
          "name": "output",
          "type": "LATENT",
          "links": [
            164
          ]
        },
        {
          "name": "denoised_output",
          "type": "LATENT",
          "links": []
        }
      ],
      "properties": {
        "Node name for S&R": "SamplerCustomAdvanced",
        "cnr_id": "comfy-core",
        "ver": "0.3.60",
        "enableTabs": false,
        "tabWidth": 65,
        "tabXOffset": 10,
        "hasSecondTab": false,
        "secondTabText": "Send Back",
        "secondTabOffset": 80,
        "secondTabWidth": 65
      },
      "widgets_values": []
    },
    {
      "id": 95,
      "type": "Ideogram4Scheduler",
      "pos": [
        560.1252292712549,
        480.44373240455593
      ],
      "size": [
        270,
        154
      ],
      "flags": {},
      "order": 10,
      "mode": 0,
      "inputs": [
        {
          "name": "width",
          "type": "INT",
          "widget": {
            "name": "width"
          },
          "link": 156
        },
        {
          "name": "height",
          "type": "INT",
          "widget": {
            "name": "height"
          },
          "link": 157
        }
      ],
      "outputs": [
        {
          "name": "SIGMAS",
          "type": "SIGMAS",
          "links": [
            167
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "Ideogram4Scheduler"
      },
      "widgets_values": [
        20,
        1024,
        1024,
        0,
        1.75
      ]
    },
    {
      "id": 56,
      "type": "SaveImage",
      "pos": [
        1371.5615738427737,
        195.84114925861235
      ],
      "size": [
        436.7195313170437,
        711.2421298391242
      ],
      "flags": {},
      "order": 16,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 101
        }
      ],
      "outputs": [],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.75"
      },
      "widgets_values": [
        "ComfyUI"
      ]
    },
    {
      "id": 94,
      "type": "ResolutionSelector",
      "pos": [
        249.45527396590353,
        701.3543590454891
      ],
      "size": [
        270,
        126
      ],
      "flags": {},
      "order": 4,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "width",
          "type": "INT",
          "links": [
            153,
            156
          ]
        },
        {
          "name": "height",
          "type": "INT",
          "links": [
            154,
            157
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "ResolutionSelector"
      },
      "widgets_values": [
        "2:3 (Portrait Photo)",
        1,
        16
      ]
    },
    {
      "id": 83,
      "type": "CLIPTextEncode",
      "pos": [
        -194.71785512781273,
        311.42496280403503
      ],
      "size": [
        408.34315901785703,
        324.50164397511764
      ],
      "flags": {},
      "order": 8,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 150
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "links": [
            142,
            146
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "CLIPTextEncode",
        "cnr_id": "comfy-core",
        "ver": "0.3.56",
        "enableTabs": false,
        "tabWidth": 65,
        "tabXOffset": 10,
        "hasSecondTab": false,
        "secondTabText": "Send Back",
        "secondTabOffset": 80,
        "secondTabWidth": 65
      },
      "widgets_values": [
        "{\n  \"high_level_description\": \"A cinematic Leica-style twilight photograph shows a tall modern office tower from a dramatic low angle, rising against a deep evening sky. Warm illuminated rooms on the front facade spell \\\"Comfy,\\\" while a small handwritten \\\"Ideogram 4.0\\\" signature appears at the lower right.\",\n  \"style_description\": {\n    \"aesthetics\": \"cinematic, atmospheric, elegant, slightly dreamy, high-end urban photography with strong vertical composition and restrained visual clutter\",\n    \"lighting\": \"blue-hour ambient light with soft haze, gentle street glow near the lower frame, and warm yellow interior window lights standing out against the dark facade\",\n    \"photo\": \"shot like a Leica photograph with a low-angle perspective, subtle filmic contrast, crisp architectural lines, natural depth, and a refined editorial cityscape look\",\n    \"medium\": \"photograph\",\n    \"color_palette\": [\"#1E2148\", \"#4A3F7E\", \"#F3E34B\", \"#C8CEDF\", \"#6E314D\"]\n  },\n  \"compositional_deconstruction\": {\n    \"background\": \"A dusky urban evening sky fills most of the frame with deep navy and violet tones, fading slightly brighter near the horizon. The atmosphere is lightly hazy, with a soft bloom of city light near the lower left. Minimal surrounding street-level structures appear as subdued silhouettes near the bottom edges, keeping the tower dominant in the portrait-oriented composition.\",\n    \"elements\": [\n      {\n        \"type\": \"obj\",\n        \"bbox\": [120, 330, 945, 845],\n        \"desc\": \"A tall dark-glass office tower viewed from below, centered slightly right of frame. The building has sharp modern edges, horizontal floor bands, a subtly reflective facade, and a tapering sense of height emphasized by the perspective. 
The front-facing plane is the main visual surface, while the right side recedes into shadow.\",\n        \"color_palette\": [\"#161A33\", \"#252C54\", \"#BEC6DC\"]\n      },\n      {\n        \"type\": \"text\",\n        \"bbox\": [170, 455, 785, 615],\n        \"text\": \"Comfy\",\n        \"desc\": \"The word is formed by warm glowing room windows arranged vertically on the front facade of the tower. Each letter is clearly legible through clusters of illuminated office rooms, appearing as bright yellow typographic shapes embedded within the architecture.\",\n        \"color_palette\": [\"#F3E34B\"]\n      },\n      {\n        \"type\": \"obj\",\n        \"bbox\": [40, 85, 90, 130],\n        \"desc\": \"A small crescent moon in the upper left portion of the sky, softly glowing and isolated against the dark twilight background.\",\n        \"color_palette\": [\"#F3E34B\", \"#F8F0A8\"]\n      },\n      {\n        \"type\": \"text\",\n        \"bbox\": [955, 790, 995, 985],\n        \"text\": \"Ideogram 4.0\",\n        \"desc\": \"A small handwritten signature placed at the lower right corner, rendered in a light ink-like white script with a casual, unobtrusive appearance.\",\n        \"color_palette\": [\"#F3F3F0\"]\n      }\n    ]\n  }\n}"
      ]
    },
    {
      "id": 78,
      "type": "RandomNoise",
      "pos": [
        560.1252292712549,
        -49.16835585157703
      ],
      "size": [
        270,
        82
      ],
      "flags": {},
      "order": 5,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "NOISE",
          "type": "NOISE",
          "links": [
            130
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "RandomNoise",
        "cnr_id": "comfy-core",
        "ver": "0.3.56",
        "enableTabs": false,
        "tabWidth": 65,
        "tabXOffset": 10,
        "hasSecondTab": false,
        "secondTabText": "Send Back",
        "secondTabOffset": 80,
        "secondTabWidth": 65
      },
      "widgets_values": [
        9999,
        "fixed"
      ]
    },
    {
      "id": 37,
      "type": "UNETLoader",
      "pos": [
        -96.592885685151,
        -49.16835585157703
      ],
      "size": [
        305.3782043457031,
        82
      ],
      "flags": {},
      "order": 6,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            148
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "UNETLoader",
        "cnr_id": "comfy-core",
        "ver": "0.3.33"
      },
      "widgets_values": [
        "ideogram4_fp8_scaled.safetensors",
        "default"
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 89,
      "type": "CFGOverride",
      "pos": [
        249.45527396590353,
        -49.16835585157703
      ],
      "size": [
        270,
        106
      ],
      "flags": {},
      "order": 11,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 148
        }
      ],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "links": [
            172
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "CFGOverride"
      },
      "widgets_values": [
        3,
        0.7,
        1
      ]
    },
    {
      "id": 71,
      "type": "MarkdownNote",
      "pos": [
        -560.0606701629572,
        -141.99890306621
      ],
      "size": [
        402.9868769880169,
        355.5887797584986
      ],
      "flags": {},
      "order": 7,
      "mode": 0,
      "inputs": [],
      "outputs": [],
      "properties": {},
      "widgets_values": [
        "## models\n\n- diffusion_models\n  - [ideogram4_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Ideogram-4/blob/main/diffusion_models/ideogram4_fp8_scaled.safetensors) (9.28 GB)\n  - [ideogram4_nvfp4_mixed.safetensors](https://huggingface.co/Comfy-Org/Ideogram-4/blob/main/diffusion_models/ideogram4_nvfp4_mixed.safetensors) (5.49 GB)\n  - [ideogram4_unconditional_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Ideogram-4/blob/main/diffusion_models/ideogram4_unconditional_fp8_scaled.safetensors) (9.28 GB)\n  - [ideogram4_unconditional_nvfp4_mixed.safetensors](https://huggingface.co/Comfy-Org/Ideogram-4/blob/main/diffusion_models/ideogram4_unconditional_nvfp4_mixed.safetensors) (5.49 GB)\n- text_encoders\n  - [qwen3vl_8b_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Ideogram-4/blob/main/text_encoders/qwen3vl_8b_fp8_scaled.safetensors) (10.6 GB)\n- vae\n  - [flux2-vae.safetensors](https://huggingface.co/Comfy-Org/Ideogram-4/blob/main/vae/flux2-vae.safetensors) (336 MB)\n\n```text\n📂ComfyUI/\n└── 📂models/\n    ├── 📂diffusion_models/\n    │   ├── ideogram4_fp8_scaled.safetensors\n    │   ├── ideogram4_nvfp4_mixed.safetensors\n    │   ├── ideogram4_unconditional_fp8_scaled.safetensors\n    │   └── ideogram4_unconditional_nvfp4_mixed.safetensors\n    ├── 📂text_encoders/\n    │   └── qwen3vl_8b_fp8_scaled.safetensors\n    └── 📂vae/\n         └── flux2-vae.safetensors\n```"
      ],
      "color": "#323",
      "bgcolor": "#535"
    }
  ],
  "links": [
    [
      76,
      39,
      0,
      8,
      1,
      "VAE"
    ],
    [
      101,
      8,
      0,
      56,
      0,
      "IMAGE"
    ],
    [
      130,
      78,
      0,
      79,
      0,
      "NOISE"
    ],
    [
      132,
      76,
      0,
      79,
      2,
      "SAMPLER"
    ],
    [
      142,
      83,
      0,
      87,
      0,
      "CONDITIONING"
    ],
    [
      144,
      88,
      0,
      79,
      1,
      "GUIDER"
    ],
    [
      145,
      87,
      0,
      88,
      3,
      "CONDITIONING"
    ],
    [
      146,
      83,
      0,
      88,
      1,
      "CONDITIONING"
    ],
    [
      148,
      37,
      0,
      89,
      0,
      "MODEL"
    ],
    [
      149,
      90,
      0,
      88,
      2,
      "MODEL"
    ],
    [
      150,
      91,
      0,
      83,
      0,
      "CLIP"
    ],
    [
      152,
      92,
      0,
      79,
      4,
      "LATENT"
    ],
    [
      153,
      94,
      0,
      92,
      0,
      "INT"
    ],
    [
      154,
      94,
      1,
      92,
      1,
      "INT"
    ],
    [
      156,
      94,
      0,
      95,
      0,
      "INT"
    ],
    [
      157,
      94,
      1,
      95,
      1,
      "INT"
    ],
    [
      164,
      79,
      0,
      8,
      0,
      "LATENT"
    ],
    [
      167,
      95,
      0,
      79,
      3,
      "SIGMAS"
    ],
    [
      172,
      89,
      0,
      88,
      0,
      "MODEL"
    ]
  ],
  "groups": [],
  "config": {},
  "extra": {
    "ds": {
      "scale": 0.6209213230591553,
      "offset": [
        821.830919056688,
        383.46442973242
      ]
    },
    "frontendVersion": "1.45.15",
    "VHS_latentpreview": false,
    "VHS_latentpreviewrate": 0,
    "VHS_MetadataImage": true,
    "VHS_KeepIntermediate": true
  },
  "version": 0.4
}
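A workflow file this long is easier to read through a short script than by scrolling. The following is a minimal sketch that lists each node with its widget values and shows what feeds into a given node. It assumes the link arrays are laid out as `[link_id, from_node, from_slot, to_node, to_slot, type]`, which is how they appear in this file, and the function names are hypothetical.

```python
import json
import sys


def summarize_nodes(workflow):
    """Return (id, type, widgets_values) for each node, sorted by its 'order' field."""
    nodes = sorted(workflow.get("nodes", []), key=lambda node: node.get("order", 0))
    return [(node["id"], node["type"], node.get("widgets_values", []))
            for node in nodes]


def links_into(workflow, node_id):
    """Return (source_node_id, target_slot, type) for every link ending at node_id."""
    # Assumed layout: [link_id, from_node, from_slot, to_node, to_slot, type].
    return [(link[1], link[4], link[5])
            for link in workflow.get("links", [])
            if link[3] == node_id]


if __name__ == "__main__":
    # Usage: python inspect_workflow.py Ideogram_4.0_text2image.json
    with open(sys.argv[1], encoding="utf-8") as handle:
        workflow = json.load(handle)
    for node_id, node_type, widgets in summarize_nodes(workflow):
        # Truncate long values such as the JSON caption.
        print(node_id, node_type, str(widgets)[:80])
```

Calling `links_into(workflow, 88)` on this workflow, for instance, shows that the DualModelGuider node receives two MODEL inputs and two CONDITIONING inputs.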

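Beyond the prompt, a few parts of this workflow differ from a typical one, so let's look at just those.

Load Diffusion Model

Ideogram 4.0 loads two diffusion models to support a slightly unusual form of CFG.

In ordinary CFG, the result with the prompt is compared against the result without it, and generation is pushed toward the prompt.
Ideogram 4.0, by contrast, does not feed an empty prompt to the unconditional side. Instead, an image-only input with no text tokens is passed through a dedicated unconditional model.
The difference may seem subtle, but it appears to be a way of handling the positive prompt more delicately.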
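The idea can be sketched with toy functions. The combination step below is the standard classifier-free guidance formula. Whether the DualModelGuider node combines the two predictions in exactly this way is an assumption, and the "models" here are stand-in callables working on plain lists, not real diffusion models.

```python
def cfg_combine(cond_pred, uncond_pred, scale):
    """Standard CFG, element-wise: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for c, u in zip(cond_pred, uncond_pred)]


def dual_model_guidance(cond_model, uncond_model, latent, prompt_embedding, scale):
    """One guided prediction using two separate models.

    cond_model sees the latent and the prompt.
    uncond_model sees only the latent (image-only, no text tokens).
    """
    cond_pred = cond_model(latent, prompt_embedding)
    uncond_pred = uncond_model(latent)
    return cfg_combine(cond_pred, uncond_pred, scale)


if __name__ == "__main__":
    # Toy stand-ins: the conditional model shifts every value by the "prompt".
    toy_cond = lambda latent, prompt: [value + prompt for value in latent]
    toy_uncond = lambda latent: list(latent)
    print(dual_model_guidance(toy_cond, toy_uncond, [1.0, 2.0], 1.0, scale=7.0))
```

With a scale of 1 the result equals the conditional prediction, and larger scales push further away from the unconditional one. The difference from ordinary CFG is only where `uncond_pred` comes from.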

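CFG

This is a long-standing minor technique: the CFG value is changed between the earlier and later parts of sampling.

In this workflow, the earlier part uses CFG 7 and the later part uses CFG 3.
Lowering CFG partway through turns out to be more stable than applying a high CFG from start to finish.
The CFG Override node is used for this.
It overrides the CFG value only within a specified step range.
In this workflow, cfg becomes 3 from the 70% point of sampling onward.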
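The resulting schedule can be written out as a small function. This is a sketch of the behavior described here, not the node's code. It assumes progress is measured as step index divided by total steps, whereas the actual node may measure its range differently (for example by sigma or timestep percentage). The default values mirror this workflow: 7 on the guider, and 3 between 0.7 and 1 on CFG Override.

```python
def cfg_at(progress, base_cfg=7.0, override_cfg=3.0, start=0.7, end=1.0):
    """Return the CFG value at a given progress through sampling (0.0 to 1.0)."""
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be between 0 and 1")
    # Inside the override range the lower value replaces the base CFG.
    return override_cfg if start <= progress <= end else base_cfg


def cfg_schedule(steps, **kwargs):
    """Per-step CFG values, ASSUMING progress = step_index / steps."""
    return [cfg_at(index / steps, **kwargs) for index in range(steps)]


if __name__ == "__main__":
    for step, cfg in enumerate(cfg_schedule(20)):
        print(step, cfg)
```

Under that assumption, a 20-step run uses CFG 7 for the first 14 steps and CFG 3 for the remaining 6.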

Ideogram 4 TurboTime LoRA

This is a LoRA published by Ostris that enables generation in 2 to 8 steps.

Downloading the model

loras
- ideogram_4_turbotime_v1.safetensors (847 MB)

📂ComfyUI/
└── 📂models/
    └── 📂loras/
        └── ideogram_4_turbotime_v1.safetensors

text2image (8 step)

Ideogram_4.0_text2image_turbotime.json

{
  "id": "d8034549-7e0a-40f1-8c2e-de3ffc6f1cae",
  "revision": 0,
  "last_node_id": 112,
  "last_link_id": 181,
  "nodes": [
    {
      "id": 76,
      "type": "KSamplerSelect",
      "pos": [
        560.1252292712549,
        324.6528974302894
      ],
      "size": [
        270,
        68.88020833333334
      ],
      "flags": {},
      "order": 0,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "SAMPLER",
          "type": "SAMPLER",
          "links": [
            132
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "KSamplerSelect",
        "cnr_id": "comfy-core",
        "ver": "0.3.56",
        "enableTabs": false,
        "tabWidth": 65,
        "tabXOffset": 10,
        "hasSecondTab": false,
        "secondTabText": "Send Back",
        "secondTabOffset": 80,
        "secondTabWidth": 65
      },
      "widgets_values": [
        "euler"
      ]
    },
    {
      "id": 8,
      "type": "VAEDecode",
      "pos": [
        1180.6146240234375,
        195.84114925861235
      ],
      "size": [
        157.56002807617188,
        46
      ],
      "flags": {},
      "order": 14,
      "mode": 0,
      "inputs": [
        {
          "name": "samples",
          "type": "LATENT",
          "link": 164
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 76
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "slot_index": 0,
          "links": [
            101
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "VAEDecode",
        "cnr_id": "comfy-core",
        "ver": "0.3.33"
      },
      "widgets_values": []
    },
    {
      "id": 92,
      "type": "EmptyFlux2LatentImage",
      "pos": [
        560.1252292712549,
        721.3543590454891
      ],
      "size": [
        270,
        106
      ],
      "flags": {},
      "order": 8,
      "mode": 0,
      "inputs": [
        {
          "name": "width",
          "type": "INT",
          "widget": {
            "name": "width"
          },
          "link": 153
        },
        {
          "name": "height",
          "type": "INT",
          "widget": {
            "name": "height"
          },
          "link": 154
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "links": [
            152
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "EmptyFlux2LatentImage"
      },
      "widgets_values": [
        1024,
        1024,
        1
      ]
    },
    {
      "id": 91,
      "type": "CLIPLoader",
      "pos": [
        -510.80318076960083,
        311.42496280403503
      ],
      "size": [
        289.8073985431536,
        106
      ],
      "flags": {},
      "order": 1,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "CLIP",
          "type": "CLIP",
          "links": [
            150
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "CLIPLoader"
      },
      "widgets_values": [
        "qwen3vl_8b_fp8_scaled.safetensors",
        "ideogram4",
        "default"
      ],
      "color": "#432",
      "bgcolor": "#653"
    },
    {
      "id": 39,
      "type": "VAELoader",
      "pos": [
        904.0583918587276,
        72.88171068869869
      ],
      "size": [
        242.12760404770165,
        58
      ],
      "flags": {},
      "order": 2,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "VAE",
          "type": "VAE",
          "slot_index": 0,
          "links": [
            76
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "VAELoader",
        "cnr_id": "comfy-core",
        "ver": "0.3.33"
      },
      "widgets_values": [
        "flux2-vae.safetensors"
      ],
      "color": "#322",
      "bgcolor": "#533"
    },
    {
      "id": 79,
      "type": "SamplerCustomAdvanced",
      "pos": [
        904.0583918587276,
        195.84114925861235
      ],
      "size": [
        242.12760404770165,
        106
      ],
      "flags": {},
      "order": 13,
      "mode": 0,
      "inputs": [
        {
          "name": "noise",
          "type": "NOISE",
          "link": 130
        },
        {
          "name": "guider",
          "type": "GUIDER",
          "link": 173
        },
        {
          "name": "sampler",
          "type": "SAMPLER",
          "link": 132
        },
        {
          "name": "sigmas",
          "type": "SIGMAS",
          "link": 181
        },
        {
          "name": "latent_image",
          "type": "LATENT",
          "link": 152
        }
      ],
      "outputs": [
        {
          "name": "output",
          "type": "LATENT",
          "links": [
            164
          ]
        },
        {
          "name": "denoised_output",
          "type": "LATENT",
          "links": []
        }
      ],
      "properties": {
        "Node name for S&R": "SamplerCustomAdvanced",
        "cnr_id": "comfy-core",
        "ver": "0.3.60",
        "enableTabs": false,
        "tabWidth": 65,
        "tabXOffset": 10,
        "hasSecondTab": false,
        "secondTabText": "Send Back",
        "secondTabOffset": 80,
        "secondTabWidth": 65
      },
      "widgets_values": []
    },
    {
      "id": 56,
      "type": "SaveImage",
      "pos": [
        1371.5615738427737,
        195.84114925861235
      ],
      "size": [
        436.7195313170437,
        711.2421298391242
      ],
      "flags": {},
      "order": 15,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 101
        }
      ],
      "outputs": [],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.75"
      },
      "widgets_values": [
        "ComfyUI"
      ]
    },
    {
      "id": 94,
      "type": "ResolutionSelector",
      "pos": [
        249.45527396590353,
        701.3543590454891
      ],
      "size": [
        270,
        126
      ],
      "flags": {},
      "order": 3,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "width",
          "type": "INT",
          "links": [
            153,
            156
          ]
        },
        {
          "name": "height",
          "type": "INT",
          "links": [
            154,
            157
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "ResolutionSelector"
      },
      "widgets_values": [
        "2:3 (Portrait Photo)",
        1,
        16
      ]
    },
    {
      "id": 110,
      "type": "CFGGuider",
      "pos": [
        560.1252292712549,
        131.1414897935515
      ],
      "size": [
        270,
        98
      ],
      "flags": {},
      "order": 12,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 178
        },
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 175
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "link": 176
        }
      ],
      "outputs": [
        {
          "name": "GUIDER",
          "type": "GUIDER",
          "links": [
            173
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "CFGGuider"
      },
      "widgets_values": [
        1
      ]
    },
    {
      "id": 83,
      "type": "CLIPTextEncode",
      "pos": [
        -194.71785512781273,
        311.42496280403503
      ],
      "size": [
        408.34315901785703,
        324.50164397511764
      ],
      "flags": {},
      "order": 7,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 150
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "links": [
            142,
            175
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "CLIPTextEncode",
        "cnr_id": "comfy-core",
        "ver": "0.3.56",
        "enableTabs": false,
        "tabWidth": 65,
        "tabXOffset": 10,
        "hasSecondTab": false,
        "secondTabText": "Send Back",
        "secondTabOffset": 80,
        "secondTabWidth": 65
      },
      "widgets_values": [
        "{\n  \"high_level_description\": \"A clean vertical anime-style illustration of a teenage girl kneeling while holding a penguin stretched across her arms. The scene uses a plain white background, soft linework, and a playful fashion twist with sunglasses, a yellow rain outfit, and blue rain boots.\",\n  \"style_description\": {\n    \"aesthetics\": \"Soft Japanese anime illustration with delicate line art, clean negative space, gentle shading, and a calm slice-of-life mood.\",\n    \"lighting\": \"Soft diffused studio lighting with subtle gray contact shadows beneath the figure.\",\n    \"medium\": \"digital illustration\",\n    \"art_style\": \"contemporary anime character illustration\",\n    \"color_palette\": [\n      \"#FFFFFF\",\n      \"#F4D34F\",\n      \"#1F1F1F\",\n      \"#3F6FB6\",\n      \"#F2CDB3\",\n      \"#EAEAEA\"\n    ]\n  },\n  \"compositional_deconstruction\": {\n    \"background\": \"A plain white seamless backdrop with generous negative space around the kneeling subject. A soft gray shadow near the floor lightly grounds the figure and the penguin within the vertical frame.\",\n    \"elements\": [\n      {\n        \"type\": \"obj\",\n        \"bbox\": [120, 140, 940, 730],\n        \"desc\": \"A teenage girl with light skin and straight dark brown shoulder-length hair with bangs, kneeling on one knee in a casual pose. She wears dark sunglasses, a bright yellow hooded raincoat, matching yellow waterproof rain pants with no blue trousers visible, and glossy blue rain boots. Her pose mirrors the reference composition, with one arm supporting the penguin near her shoulder and the other hand holding its lower body. 
Her expression is mostly hidden by the sunglasses, giving her a cool and composed attitude.\",\n        \"color_palette\": [\n          \"#F2CDB3\",\n          \"#2F2525\",\n          \"#1F1F1F\",\n          \"#F4D34F\",\n          \"#3F6FB6\",\n          \"#F7F7F7\"\n        ]\n      },\n      {\n        \"type\": \"obj\",\n        \"bbox\": [185, 235, 515, 905],\n        \"desc\": \"A penguin held horizontally across the girl's arms in a playful stretched pose, echoing the original composition. The penguin has a black back, white belly, small pale beak, flipper-like wings extended outward, and dangling feet, appearing calm and slightly floppy while being supported.\",\n        \"color_palette\": [\n          \"#1E1E1E\",\n          \"#FFFFFF\",\n          \"#F1D9A6\",\n          \"#D9D9D9\",\n          \"#8A8A8A\"\n        ]\n      }\n    ]\n  }\n}"
      ]
    },
    {
      "id": 87,
      "type": "ConditioningZeroOut",
      "pos": [
        249.45527396590353,
        427.0196777116257
      ],
      "size": [
        270,
        26
      ],
      "flags": {},
      "order": 11,
      "mode": 0,
      "inputs": [
        {
          "name": "conditioning",
          "type": "CONDITIONING",
          "link": 142
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "links": [
            176
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "ConditioningZeroOut"
      },
      "widgets_values": []
    },
    {
      "id": 37,
      "type": "UNETLoader",
      "pos": [
        -96.592885685151,
        -49.16835585157703
      ],
      "size": [
        305.3782043457031,
        82
      ],
      "flags": {},
      "order": 4,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "slot_index": 0,
          "links": [
            177
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "UNETLoader",
        "cnr_id": "comfy-core",
        "ver": "0.3.33"
      },
      "widgets_values": [
        "ideogram4_fp8_scaled.safetensors",
        "default"
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 78,
      "type": "RandomNoise",
      "pos": [
        560.1252292712549,
        -49.16835585157703
      ],
      "size": [
        270,
        82
      ],
      "flags": {},
      "order": 5,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "NOISE",
          "type": "NOISE",
          "links": [
            130
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "RandomNoise",
        "cnr_id": "comfy-core",
        "ver": "0.3.56",
        "enableTabs": false,
        "tabWidth": 65,
        "tabXOffset": 10,
        "hasSecondTab": false,
        "secondTabText": "Send Back",
        "secondTabOffset": 80,
        "secondTabWidth": 65
      },
      "widgets_values": [
        22222,
        "fixed"
      ]
    },
    {
      "id": 95,
      "type": "Ideogram4Scheduler",
      "pos": [
        560.1252292712549,
        480.44373240455593
      ],
      "size": [
        270,
        154
      ],
      "flags": {},
      "order": 9,
      "mode": 0,
      "inputs": [
        {
          "name": "width",
          "type": "INT",
          "widget": {
            "name": "width"
          },
          "link": 156
        },
        {
          "name": "height",
          "type": "INT",
          "widget": {
            "name": "height"
          },
          "link": 157
        }
      ],
      "outputs": [
        {
          "name": "SIGMAS",
          "type": "SIGMAS",
          "links": [
            181
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "Ideogram4Scheduler"
      },
      "widgets_values": [
        8,
        1024,
        1024,
        0.5,
        1.75
      ]
    },
    {
      "id": 71,
      "type": "MarkdownNote",
      "pos": [
        -569.8787976043636,
        -155.54437578349516
      ],
      "size": [
        424.61355549024836,
        362.0090833106222
      ],
      "flags": {},
      "order": 6,
      "mode": 0,
      "inputs": [],
      "outputs": [],
      "properties": {},
      "widgets_values": [
        "## models\n\n- diffusion_models\n  - [ideogram4_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Ideogram-4/blob/main/diffusion_models/ideogram4_fp8_scaled.safetensors) (9.28 GB)\n  - [ideogram4_nvfp4_mixed.safetensors](https://huggingface.co/Comfy-Org/Ideogram-4/blob/main/diffusion_models/ideogram4_nvfp4_mixed.safetensors) (5.49 GB)\n- loras\n  - [ideogram_4_turbotime_v1.safetensors](https://huggingface.co/ostris/ideogram_4_turbotime_lora/blob/main/ideogram_4_turbotime_v1.safetensors) (847 MB)\n- text_encoders\n  - [qwen3vl_8b_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Ideogram-4/blob/main/text_encoders/qwen3vl_8b_fp8_scaled.safetensors) (10.6 GB)\n- vae\n  - [flux2-vae.safetensors](https://huggingface.co/Comfy-Org/Ideogram-4/blob/main/vae/flux2-vae.safetensors) (336 MB)\n\n```text\n📂ComfyUI/\n└── 📂models/\n    ├── 📂diffusion_models/\n    │   ├── ideogram4_fp8_scaled.safetensors\n    │   └── ideogram4_nvfp4_mixed.safetensors\n    ├── 📂loras/\n    │   └── ideogram_4_turbotime_v1.safetensors\n    ├── 📂text_encoders/\n    │   └── qwen3vl_8b_fp8_scaled.safetensors\n    └── 📂vae/\n         └── flux2-vae.safetensors\n```"
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 111,
      "type": "LoraLoaderModelOnly",
      "pos": [
        247.84471481703875,
        -50.778860936690556
      ],
      "size": [
        270,
        82
      ],
      "flags": {},
      "order": 10,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 177
        }
      ],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "links": [
            178
          ]
        }
      ],
      "properties": {
        "Node name for S&R": "LoraLoaderModelOnly"
      },
      "widgets_values": [
        "ideogram_4_turbotime_v1.safetensors",
        0.6
      ],
      "color": "#323",
      "bgcolor": "#535"
    }
  ],
  "links": [
    [
      76,
      39,
      0,
      8,
      1,
      "VAE"
    ],
    [
      101,
      8,
      0,
      56,
      0,
      "IMAGE"
    ],
    [
      130,
      78,
      0,
      79,
      0,
      "NOISE"
    ],
    [
      132,
      76,
      0,
      79,
      2,
      "SAMPLER"
    ],
    [
      142,
      83,
      0,
      87,
      0,
      "CONDITIONING"
    ],
    [
      150,
      91,
      0,
      83,
      0,
      "CLIP"
    ],
    [
      152,
      92,
      0,
      79,
      4,
      "LATENT"
    ],
    [
      153,
      94,
      0,
      92,
      0,
      "INT"
    ],
    [
      154,
      94,
      1,
      92,
      1,
      "INT"
    ],
    [
      156,
      94,
      0,
      95,
      0,
      "INT"
    ],
    [
      157,
      94,
      1,
      95,
      1,
      "INT"
    ],
    [
      164,
      79,
      0,
      8,
      0,
      "LATENT"
    ],
    [
      173,
      110,
      0,
      79,
      1,
      "GUIDER"
    ],
    [
      175,
      83,
      0,
      110,
      1,
      "CONDITIONING"
    ],
    [
      176,
      87,
      0,
      110,
      2,
      "CONDITIONING"
    ],
    [
      177,
      37,
      0,
      111,
      0,
      "MODEL"
    ],
    [
      178,
      111,
      0,
      110,
      0,
      "MODEL"
    ],
    [
      181,
      95,
      0,
      79,
      3,
      "SIGMAS"
    ]
  ],
  "groups": [],
  "config": {},
  "extra": {
    "ds": {
      "scale": 0.5644739300537773,
      "offset": [
        890.2257921733324,
        385.0522096928824
      ]
    },
    "frontendVersion": "1.45.15",
    "VHS_latentpreview": false,
    "VHS_latentpreviewrate": 0,
    "VHS_MetadataImage": true,
    "VHS_KeepIntermediate": true
  },
  "version": 0.4
}
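
To see how the nodes in the workflow above are wired, you can walk its `links` array. The following is a minimal sketch, assuming each entry follows the `[link_id, source_node, source_slot, target_node, target_slot, type]` layout seen in the JSON above. `describe_links` is a hypothetical helper written for illustration, and the dictionary is a small excerpt of the workflow rather than the full file.

```python
def describe_links(workflow):
    """Describe each link as 'SourceType[slot] -> TargetType[slot] (TYPE)'."""
    # Map node id -> node type so links can be shown with readable names.
    node_types = {node["id"]: node["type"] for node in workflow["nodes"]}
    described = []
    for _link_id, src, src_slot, dst, dst_slot, kind in workflow["links"]:
        # Fall back to the raw id if a node is missing from the excerpt.
        src_name = node_types.get(src, f"node {src}")
        dst_name = node_types.get(dst, f"node {dst}")
        described.append(f"{src_name}[{src_slot}] -> {dst_name}[{dst_slot}] ({kind})")
    return described


# Excerpt of the workflow above: model loader -> LoRA -> guider.
workflow_excerpt = {
    "nodes": [
        {"id": 37, "type": "UNETLoader"},
        {"id": 111, "type": "LoraLoaderModelOnly"},
        {"id": 110, "type": "CFGGuider"},
    ],
    "links": [
        [177, 37, 0, 111, 0, "MODEL"],
        [178, 111, 0, 110, 0, "MODEL"],
    ],
}

for line in describe_links(workflow_excerpt):
    print(line)
```

Run against the excerpt, this shows that the diffusion model passes through the TurboTime LoRA before it reaches the guider, and that only one model loader feeds the chain.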

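Because this workflow generates at CFG 1.0, the unconditional model is not needed.
The LoRA is described as intended for 2 to 8 steps, but images visibly start to break down when the step count is too low, so for now 8 steps is the safer choice.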
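
Why CFG 1.0 makes the unconditional branch redundant can be seen from the usual classifier-free guidance formula. The following is a minimal sketch, assuming the standard combination `uncond + scale * (cond - uncond)`; Ideogram 4.0's two-model guidance may differ in its details, and `cfg_combine` and the sample values are hypothetical, chosen only for illustration.

```python
def cfg_combine(cond, uncond, scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


# Toy "predictions" standing in for the two model outputs.
cond = [0.75, -0.25, 0.5]
uncond = [0.125, 0.5, -0.25]

# At scale 1.0 the unconditional term cancels out, leaving only cond.
print(cfg_combine(cond, uncond, 1.0))

# At a higher scale the unconditional prediction does affect the result.
print(cfg_combine(cond, uncond, 3.0))
```

With the scale fixed at 1.0 the output equals the conditional prediction, so the unconditional prediction contributes nothing and a second model for it does not need to be loaded.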
