Flux.1 Kontext

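What is Flux.1 Kontext?

Flux.1 Kontext is an instruction-based image editing model built on Flux.1.

It is almost certainly this model that set off the current boom in AI image editing, a task now popularized by tools such as nano banana.

Like Flux.1, it comes in several variants. Kontext has three (pro, max, and dev), and only dev can be run locally.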

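What is instruction-based image editing?

On this site, a model that takes an image plus a text instruction and edits the image accordingly is called an instruction-based image editing model.

Suppose you want to turn the hair of a woman in a photo red.
Until now, you would mask the hair, add ControlNet Canny so the hairstyle does not change, and then inpaint with a prompt such as "a photo of a woman with red hair".

With instruction-based image editing this is simple. You hand the image to the model and tell it "make the woman's hair red", the way a producer would brief a designer.

Changing an expression, removing an unwanted object, converting the art style:

all of it can be done with a single model and a prompt.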

Downloading the models

Kontext uses the same basic set of files as regular Flux.1.

diffusion_models
- flux1-dev-kontext_fp8_scaled.safetensors
clip / T5 / VAE
- clip_l.safetensors
- t5xxl_fp8_e4m3fn_scaled.safetensors
- ae.safetensors
gguf (optional)
- flux1-kontext-dev.gguf

📂ComfyUI/
└── 📂models/
    ├── 📂diffusion_models/
    │   └── flux1-dev-kontext_fp8_scaled.safetensors
    ├── 📂clip/
    │   ├── clip_l.safetensors
    │   └── t5xxl_fp8_e4m3fn_scaled.safetensors
    ├── 📂vae/
    │   └── ae.safetensors
    └── 📂unet/
        └── flux1-kontext-dev.gguf      ← only when using gguf
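
The layout above can be checked with a short script before launching ComfyUI. This is a minimal sketch and not part of ComfyUI: the function name `missing_models` and the idea of passing the ComfyUI root folder as an argument are assumptions made for illustration, and the file list is simply copied from the tree above (the optional gguf file is left out).

```python
from pathlib import Path

# Files from the directory tree above, relative to the ComfyUI root.
# The optional gguf file is intentionally not listed.
EXPECTED_FILES = [
    "models/diffusion_models/flux1-dev-kontext_fp8_scaled.safetensors",
    "models/clip/clip_l.safetensors",
    "models/clip/t5xxl_fp8_e4m3fn_scaled.safetensors",
    "models/vae/ae.safetensors",
]


def missing_models(comfy_root):
    """Return the expected files that are not present under comfy_root."""
    root = Path(comfy_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# "ComfyUI" is a placeholder path; point it at your own install.
for rel in missing_models("ComfyUI"):
    print("missing:", rel)
```

An empty result means every required file is where the tree says it should be.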

workflow (basic)

The Kontext workflow itself is simple: it is the regular Flux.1 workflow with a ReferenceLatent node added.

Flux.1-Kontext.json

{
  "id": "18404b37-92b0-4d11-a39c-ae941838eb83",
  "revision": 0,
  "last_node_id": 77,
  "last_link_id": 131,
  "nodes": [
    {
      "id": 51,
      "type": "ReferenceLatent",
      "pos": [
        883.7505187988281,
        190
      ],
      "size": [
        204.134765625,
        46
      ],
      "flags": {
        "collapsed": false
      },
      "order": 9,
      "mode": 0,
      "inputs": [
        {
          "name": "conditioning",
          "type": "CONDITIONING",
          "link": 74
        },
        {
          "name": "latent",
          "shape": 7,
          "type": "LATENT",
          "link": 76
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "links": [
            114
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.41",
        "Node name for S&R": "ReferenceLatent"
      },
      "widgets_values": [],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 68,
      "type": "FluxGuidance",
      "pos": [
        1115.2528076171875,
        190
      ],
      "size": [
        211.3223114013672,
        58
      ],
      "flags": {},
      "order": 10,
      "mode": 0,
      "inputs": [
        {
          "name": "conditioning",
          "type": "CONDITIONING",
          "link": 114
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "links": [
            115
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.41",
        "Node name for S&R": "FluxGuidance"
      },
      "widgets_values": [
        3.5
      ]
    },
    {
      "id": 69,
      "type": "DualCLIPLoader",
      "pos": [
        174.92930603027344,
        261.95574951171875
      ],
      "size": [
        270,
        130
      ],
      "flags": {},
      "order": 0,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "CLIP",
          "type": "CLIP",
          "links": [
            117,
            118
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.41",
        "Node name for S&R": "DualCLIPLoader"
      },
      "widgets_values": [
        "clip_l.safetensors",
        "t5xxl_fp8_e4m3fn.safetensors",
        "flux",
        "default"
      ],
      "color": "#432",
      "bgcolor": "#653"
    },
    {
      "id": 33,
      "type": "CLIPTextEncode",
      "pos": [
        517.7193603515625,
        378
      ],
      "size": [
        336.888427734375,
        103.97698974609375
      ],
      "flags": {
        "collapsed": true
      },
      "order": 6,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 118
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            99
          ]
        }
      ],
      "title": "CLIP Text Encode (Negative Prompt)",
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.39",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        ""
      ]
    },
    {
      "id": 52,
      "type": "VAEEncode",
      "pos": [
        719.3842163085938,
        468.98004150390625
      ],
      "size": [
        140,
        46
      ],
      "flags": {},
      "order": 8,
      "mode": 0,
      "inputs": [
        {
          "name": "pixels",
          "type": "IMAGE",
          "link": 130
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 77
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "links": [
            76,
            116
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.41",
        "Node name for S&R": "VAEEncode"
      },
      "widgets_values": [],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 43,
      "type": "VAELoader",
      "pos": [
        462.1297302246094,
        613.5346069335938
      ],
      "size": [
        234.05543518066406,
        58
      ],
      "flags": {},
      "order": 1,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "VAE",
          "type": "VAE",
          "links": [
            62,
            77
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.39",
        "Node name for S&R": "VAELoader"
      },
      "widgets_values": [
        "ae.safetensors"
      ],
      "color": "#322",
      "bgcolor": "#533"
    },
    {
      "id": 6,
      "type": "CLIPTextEncode",
      "pos": [
        516.5379638671875,
        190
      ],
      "size": [
        339.84503173828125,
        123.01304626464844
      ],
      "flags": {},
      "order": 5,
      "mode": 0,
      "inputs": [
        {
          "name": "clip",
          "type": "CLIP",
          "link": 117
        }
      ],
      "outputs": [
        {
          "name": "CONDITIONING",
          "type": "CONDITIONING",
          "slot_index": 0,
          "links": [
            74
          ]
        }
      ],
      "title": "CLIP Text Encode (Positive Prompt)",
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.39",
        "Node name for S&R": "CLIPTextEncode"
      },
      "widgets_values": [
        "Change camera angle to a high-angle shot, looking down at the subject from above while keeping the subject's position, scale, and pose identical. Preserve the lighting and overall style."
      ]
    },
    {
      "id": 67,
      "type": "MarkdownNote",
      "pos": [
        76.41261291503906,
        -64.39566040039062
      ],
      "size": [
        368.5166931152344,
        248.5858612060547
      ],
      "flags": {},
      "order": 2,
      "mode": 0,
      "inputs": [],
      "outputs": [],
      "properties": {},
      "widgets_values": [
        "## models\n\n- [flux1-dev-kontext_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/flux1-kontext-dev_ComfyUI/tree/main/split_files/diffusion_models)\n- [clip_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/clip_l.safetensors)\n- [t5xxl_fp8_e4m3fn_scaled.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/t5xxl_fp8_e4m3fn_scaled.safetensors)\n- [ae.safetensors](https://huggingface.co/Comfy-Org/Omnigen2_ComfyUI_repackaged/tree/main/split_files/vae)\n\n```\n📂ComfyUI/\n└── 📂models/\n    ├── 📂clip/\n    │   ├── clip_l.safetensors\n    │   └── t5xxl_fp8_e4m3fn.safetensors\n    ├── 📂diffusion_models/\n    │   └── flux1-dev-kontext_fp8_scaled.safetensors\n    └── 📂vae/\n         └── ae.safetensors\n```"
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 53,
      "type": "LoadImage",
      "pos": [
        153.1424102783203,
        468.98004150390625
      ],
      "size": [
        277.51690673828125,
        455.66180419921875
      ],
      "flags": {},
      "order": 3,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            129
          ]
        },
        {
          "name": "MASK",
          "type": "MASK",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.41",
        "Node name for S&R": "LoadImage"
      },
      "widgets_values": [
        "pexels-photo-28266413.jpg",
        "image"
      ],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 76,
      "type": "FluxKontextImageScale",
      "pos": [
        481.14453125,
        468.98004150390625
      ],
      "size": [
        194.9458984375,
        26
      ],
      "flags": {},
      "order": 7,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 129
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            130
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.43",
        "Node name for S&R": "FluxKontextImageScale"
      },
      "widgets_values": [],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 31,
      "type": "KSampler",
      "pos": [
        1355.8184814453125,
        194.12423706054688
      ],
      "size": [
        315,
        262
      ],
      "flags": {},
      "order": 11,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 128
        },
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 115
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "link": 99
        },
        {
          "name": "latent_image",
          "type": "LATENT",
          "link": 116
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "slot_index": 0,
          "links": [
            52
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.39",
        "Node name for S&R": "KSampler"
      },
      "widgets_values": [
        1234,
        "fixed",
        20,
        1,
        "euler",
        "normal",
        1
      ]
    },
    {
      "id": 74,
      "type": "UNETLoader",
      "pos": [
        1056.5750732421875,
        38.088253021240234
      ],
      "size": [
        270,
        82
      ],
      "flags": {},
      "order": 4,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "links": [
            128
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.43",
        "Node name for S&R": "UNETLoader"
      },
      "widgets_values": [
        "Flux.1\\flux1-dev-kontext_fp8_scaled.safetensors",
        "fp8_e4m3fn"
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 8,
      "type": "VAEDecode",
      "pos": [
        1700.061767578125,
        194.12423706054688
      ],
      "size": [
        165.4577742454253,
        46
      ],
      "flags": {},
      "order": 12,
      "mode": 0,
      "inputs": [
        {
          "name": "samples",
          "type": "LATENT",
          "link": 52
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 62
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "slot_index": 0,
          "links": [
            131
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.39",
        "Node name for S&R": "VAEDecode"
      },
      "widgets_values": []
    },
    {
      "id": 77,
      "type": "SaveImage",
      "pos": [
        1896.4101908313817,
        194.12423706054688
      ],
      "size": [
        413,
        603.9366300000002
      ],
      "flags": {},
      "order": 13,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 131
        }
      ],
      "outputs": [],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.76"
      },
      "widgets_values": [
        "ComfyUI"
      ]
    }
  ],
  "links": [
    [
      52,
      31,
      0,
      8,
      0,
      "LATENT"
    ],
    [
      62,
      43,
      0,
      8,
      1,
      "VAE"
    ],
    [
      74,
      6,
      0,
      51,
      0,
      "CONDITIONING"
    ],
    [
      76,
      52,
      0,
      51,
      1,
      "LATENT"
    ],
    [
      77,
      43,
      0,
      52,
      1,
      "VAE"
    ],
    [
      99,
      33,
      0,
      31,
      2,
      "CONDITIONING"
    ],
    [
      114,
      51,
      0,
      68,
      0,
      "CONDITIONING"
    ],
    [
      115,
      68,
      0,
      31,
      1,
      "CONDITIONING"
    ],
    [
      116,
      52,
      0,
      31,
      3,
      "LATENT"
    ],
    [
      117,
      69,
      0,
      6,
      0,
      "CLIP"
    ],
    [
      118,
      69,
      0,
      33,
      0,
      "CLIP"
    ],
    [
      128,
      74,
      0,
      31,
      0,
      "MODEL"
    ],
    [
      129,
      53,
      0,
      76,
      0,
      "IMAGE"
    ],
    [
      130,
      76,
      0,
      52,
      0,
      "IMAGE"
    ],
    [
      131,
      8,
      0,
      77,
      0,
      "IMAGE"
    ]
  ],
  "groups": [],
  "config": {},
  "extra": {
    "ds": {
      "scale": 0.9090909090909091,
      "offset": [
        23.587387084960938,
        164.39566040039062
      ]
    },
    "frontendVersion": "1.35.0",
    "VHS_latentpreview": false,
    "VHS_latentpreviewrate": 0,
    "VHS_MetadataImage": true,
    "VHS_KeepIntermediate": true
  },
  "version": 0.4
}
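
To see where ReferenceLatent sits in the graph, the `links` array of the workflow can be read directly. The sketch below assumes, based on the JSON above, that each link is stored as `[link_id, source_node, source_slot, target_node, target_slot, type]`. The helper name `conditioning_chain` is made up for this example, and only the nodes and links on the positive-conditioning path are copied in.

```python
# Node ids and types copied from the workflow above (conditioning path only).
NODE_TYPES = {
    6: "CLIPTextEncode",
    51: "ReferenceLatent",
    68: "FluxGuidance",
    31: "KSampler",
}

# Assumed layout: [link_id, source_node, source_slot, target_node, target_slot, type]
LINKS = [
    [74, 6, 0, 51, 0, "CONDITIONING"],
    [114, 51, 0, 68, 0, "CONDITIONING"],
    [115, 68, 0, 31, 1, "CONDITIONING"],
]


def conditioning_chain(start_node, links, node_types):
    """Follow CONDITIONING links downstream from start_node and list node types."""
    next_node = {src: dst for _, src, _, dst, _, kind in links if kind == "CONDITIONING"}
    chain = [node_types[start_node]]
    current = start_node
    while current in next_node:
        current = next_node[current]
        chain.append(node_types[current])
    return chain


print(" -> ".join(conditioning_chain(6, LINKS, NODE_TYPES)))
# CLIPTextEncode -> ReferenceLatent -> FluxGuidance -> KSampler
```

ReferenceLatent sits between the positive prompt and FluxGuidance, and through link 76 it also receives the latent produced by VAEEncode.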

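🟪 Load flux1-dev-kontext_fp8_scaled.safetensors.
🟩 The FluxKontextImageScale node resizes the input image to a resolution suited to Kontext.
- Flux has a set of recommended resolutions, and the one whose aspect ratio is closest to the input image is selected automatically.
🟩 Convert the resized image to a latent and connect it to ReferenceLatent.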
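
The resolution choice described above amounts to a nearest-aspect-ratio lookup. The following is a rough sketch of that idea, not the node's actual code: the candidate list is a small, assumed sample of width/height pairs chosen for illustration, and `closest_resolution` is a hypothetical helper name.

```python
# Assumed sample of (width, height) candidates; the real node uses its own table.
CANDIDATE_RESOLUTIONS = [
    (672, 1568),
    (832, 1248),
    (1024, 1024),
    (1248, 832),
    (1568, 672),
]


def closest_resolution(width, height, candidates=CANDIDATE_RESOLUTIONS):
    """Pick the candidate whose aspect ratio is nearest to width / height."""
    aspect = width / height
    return min(candidates, key=lambda wh: abs(aspect - wh[0] / wh[1]))


print(closest_resolution(1920, 1080))  # (1248, 832)
print(closest_resolution(1080, 1920))  # (832, 1248)
```

Only the aspect ratio is compared, so a very large and a very small input with the same proportions map to the same target size.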

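Writing prompts

As a rule, follow the official prompting guide.

FLUX.1 Kontext Prompting Guide

That said, there is no special syntax. Write what you want in plain English, in the form "do △△ to ◯◯", and it will work most of the time.

If parts you did not want changed get changed anyway (for example, you only wanted a different hairstyle but the background changed too), state explicitly what must stay the same, like this: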

e.g. Keep the person's pose, position, and size the same.
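
When several edits need the same safeguards, the instruction and the "keep" clause can be assembled programmatically. This is only a convenience sketch: `build_prompt` and its parameters are invented for this example, and the wording of the keep clause simply mirrors the sentence above.

```python
def build_prompt(instruction, keep=(), subject="the person"):
    """Append a 'Keep ... the same.' sentence listing what must not change."""
    items = list(keep)
    if not items:
        return instruction
    if len(items) == 1:
        listed = items[0]
    elif len(items) == 2:
        listed = f"{items[0]} and {items[1]}"
    else:
        listed = ", ".join(items[:-1]) + f", and {items[-1]}"
    return f"{instruction} Keep {subject}'s {listed} the same."


print(build_prompt("Change the hair to a messy blonde bob.",
                   keep=["pose", "position", "size"]))
# Change the hair to a messy blonde bob. Keep the person's pose, position, and size the same.
```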

Even so, given where the model stands today, it quite often fails to follow the instruction.
Do not expect too much from it yet.
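
Because the prompt is just a string stored in the workflow, it can also be swapped without opening the editor, which helps when trying several phrasings. The sketch below assumes the workflow JSON shown earlier, where the positive prompt lives in `widgets_values[0]` of the `CLIPTextEncode` node titled "CLIP Text Encode (Positive Prompt)". `set_positive_prompt` is a hypothetical helper, and matching on the title is a convention of this particular file rather than something ComfyUI guarantees.

```python
import copy


def set_positive_prompt(workflow, text):
    """Return a copy of the workflow with the positive prompt replaced by text."""
    updated = copy.deepcopy(workflow)
    for node in updated["nodes"]:
        is_text_node = node.get("type") == "CLIPTextEncode"
        if is_text_node and "Positive" in node.get("title", ""):
            node["widgets_values"][0] = text
            return updated
    raise ValueError("no positive CLIPTextEncode node found")


# Trimmed-down stand-in for Flux.1-Kontext.json (only the two text nodes).
workflow = {
    "nodes": [
        {"id": 33, "type": "CLIPTextEncode",
         "title": "CLIP Text Encode (Negative Prompt)", "widgets_values": [""]},
        {"id": 6, "type": "CLIPTextEncode",
         "title": "CLIP Text Encode (Positive Prompt)", "widgets_values": ["old"]},
    ]
}

edited = set_positive_prompt(workflow, "Change the hair to a messy blonde bob.")
print(edited["nodes"][1]["widgets_values"][0])
```

The function works on a deep copy, so the original workflow dictionary is left unchanged and can be reused for the next variation.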

What it can do

Image editing

Change the hair to a messy blonde bob.

Style conversion

This character is made out of Lego blocks.

Object removal

Remove the woman

Text replacement

Replace [OPEN] with [FLUX]

Subject transfer

A photo of a girl who received a stuffed elephant as a Christmas present.

Specifying position with a guide

Add a sailing ship to the box position.

Refining a rough collage

This edit takes a collage you put together by hand and blends the pasted elements into the scene.

Transform the flat duck sticker into a realistic plush duck toy with the same blue hat and place it in the woman’s arms so she is naturally hugging it. Also turn the outlined pendant lamp into a realistic lamp, removing the white sticker edges and matching the scene’s lighting, color, and perspective.

There is also a LoRA that boosts this ability.

Place it Flux Kontext LoRA
