Generating masks with AI

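Masks are needed in many situations, inpainting among them, but painting one by hand or preparing a mask image every time is tedious. Worse, it cannot be automated.

That said, there are not many techniques that produce a clean mask from nothing more than "mask this part."

In practice you have to combine several AI techniques.

Object detection - finds where the target is in the image.
Segmentation - cuts out the target's shape as a mask.
Matting - handles the boundary between foreground and background in finer detail.

A typical flow is: use object detection to find the target, then pass the result to segmentation to turn it into a mask.

Let's look at what is available.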

Object detection

As the name suggests, object detection locates specific objects in an image and outputs a rectangular region called a BBOX (bounding box).
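To make the idea of a BBOX concrete, here is a minimal sketch that turns one box into a rectangular mask. It assumes the box is given as pixel corners `(x1, y1, x2, y2)` with the right and bottom edges exclusive; actual detectors and ComfyUI nodes may use other layouts. `bbox_to_mask` is a hypothetical helper written for this illustration.

```python
def bbox_to_mask(width, height, bbox):
    """Return a height x width grid of 0/1 values, with 1 inside the box.

    Assumption: bbox is (x1, y1, x2, y2) in pixels, x2 and y2 exclusive.
    """
    x1, y1, x2, y2 = bbox
    # Clamp the box to the image so out-of-range detections do not break indexing.
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    return [
        [1 if (x1 <= x < x2 and y1 <= y < y2) else 0 for x in range(width)]
        for y in range(height)
    ]


mask = bbox_to_mask(6, 4, (1, 1, 4, 3))
for row in mask:
    print("".join(str(v) for v in row))
```

A mask like this covers the whole rectangle, background included, which is why a detection result is usually handed to a segmentation model rather than used as the final mask.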

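YOLO family

An extremely fast detection technique designed to detect objects in real time.

A YOLO model only detects the classes it was trained on, and in practice you typically use one dedicated model per type of object (face-only, hand-only, and so on). If no model exists for your target you have to train one yourself, and the approach is a poor fit when you want to detect many different kinds of objects.

In exchange, inference is very light, which makes it a good choice when speed matters.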

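Grounding DINO and others

Detects the objects specified by a text prompt and outputs BBOXes.

Unlike YOLO, you can specify objects with any text you like, such as "white dog" or "red car", which makes it convenient to use. It can also detect multiple objects at once.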

VLM / MLLM

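A VLM / MLLM is an LLM that can also see images.

These models can do many things, such as caption generation, and some of them can also do object detection.

It is fairly old by now, but Florence-2 is a representative example.

They are slow, but their strong comprehension lets you specify the target with a complex sentence, such as "the woman wearing a blue hat on the right side of the frame."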

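Matting

Most of what is called "background removal" is matting.

It separates what is in front from the background, and it can handle fine boundaries such as hair as well as semi-transparent regions.

However, unlike segmentation, it does not extract a specific object that you designate.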
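The difference between a soft matte and a hard mask shows up when compositing. The sketch below uses the standard alpha-blend formula on single grayscale pixel values; the numbers and the helper names `composite` and `binarize` are made up for illustration and do not come from any particular matting model.

```python
def composite(fg, bg, alpha):
    """Blend one pixel value: alpha=1 keeps the foreground, alpha=0 the background."""
    return alpha * fg + (1.0 - alpha) * bg


def binarize(alpha, threshold=0.5):
    """Collapse a soft alpha value into a hard 0/1 mask value."""
    return 1.0 if alpha >= threshold else 0.0


fg, bg = 200.0, 40.0           # hypothetical foreground / background brightness
matte = [1.0, 0.75, 0.25, 0.0]  # soft alpha across a hair-like edge

soft = [composite(fg, bg, a) for a in matte]
hard = [composite(fg, bg, binarize(a)) for a in matte]

print(soft)  # gradual transition across the edge
print(hard)  # abrupt jump from foreground to background
```

With the soft matte, the edge pixels mix foreground and background in proportion to alpha, which is what keeps hair and semi-transparent regions looking natural. The binarized version can only choose one or the other.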

BiRefNet

Detailed usage is covered on the BiRefNet page.

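Segmentation

SAM (Segment Anything Model)

Currently the best-known segmentation model.

Because it understands the shapes of things, when you point at something in a photo, such as a car, with text, a point, or a box, it finds the outline and turns it into a mask (which prompt types are available depends on the version).

Details are covered on the page for SAM 3 / 3.1, the latest models at the time of writing.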

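Practical examples

Let's combine the techniques above to generate masks for arbitrary text prompts or categories.

The workflows below are configurations that were commonly used before SAM 3 appeared. If your goal is segmentation of a specified target, try SAM 3 / 3.1 first.

They are kept here as a reference for when you want to read through older workflows or reproduce the same setup in an existing environment.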
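When reading a workflow JSON by hand, the `links` array is the quickest way to see how nodes are wired. The sketch below lists the connections in a trimmed excerpt of the YOLO_face-SAM.json workflow from this page. The reading of each link as `[link id, source node id, source output slot, target node id, target input slot, type]` is inferred from the workflows shown here, not taken from a formal specification, and `describe_links` is a hypothetical helper.

```python
import json

# Trimmed excerpt: only the fields this sketch needs.
WORKFLOW_JSON = """
{
  "nodes": [
    {"id": 2, "type": "LoadImage"},
    {"id": 3, "type": "UltralyticsDetectorProvider"},
    {"id": 8, "type": "SAMLoader"},
    {"id": 1, "type": "ImpactSimpleDetectorSEGS"}
  ],
  "links": [
    [1, 2, 0, 1, 1, "IMAGE"],
    [2, 3, 0, 1, 0, "BBOX_DETECTOR"],
    [6, 8, 0, 1, 2, "SAM_MODEL"]
  ]
}
"""


def describe_links(workflow):
    """Return one readable line per link: source[slot] -> target[slot] (type)."""
    node_types = {node["id"]: node["type"] for node in workflow["nodes"]}
    lines = []
    for _link_id, src, src_slot, dst, dst_slot, data_type in workflow["links"]:
        lines.append(
            f"{node_types[src]}[{src_slot}] -> {node_types[dst]}[{dst_slot}] ({data_type})"
        )
    return lines


edges = describe_links(json.loads(WORKFLOW_JSON))
for edge in edges:
    print(edge)
```

To inspect a full workflow, replace the embedded string with the contents of the file you saved from ComfyUI.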

Required custom nodes

The following custom nodes may be needed to run the practical examples on this page.

1038lab/ComfyUI-RMBG
- Previously used widely for everything from matting to segmentation.
- ComfyUI core now includes BiRefNet-based background removal as well, so check the core side first.
ltdrdata/ComfyUI-Impact-Pack
ltdrdata/ComfyUI-Impact-Subpack
- Node packs most often used around Detailer. They are a little quirky if all you need is simple mask generation.
kijai/ComfyUI-Florence2
- Runs Florence2, an MLLM.
kijai/ComfyUI-segment-anything-2
- Used to run SAM 2 / 2.1 segmentation models.

YOLO × SAM

YOLO_face-SAM.json

{
  "id": "ffcc6c64-e535-4685-ab04-be903b4cdf3c",
  "revision": 0,
  "last_node_id": 8,
  "last_link_id": 6,
  "nodes": [
    {
      "id": 3,
      "type": "UltralyticsDetectorProvider",
      "pos": [
        -131.74129771892854,
        275.10463657117793
      ],
      "size": [
        225.47324988344883,
        100.20074983277442
      ],
      "flags": {},
      "order": 0,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "BBOX_DETECTOR",
          "type": "BBOX_DETECTOR",
          "links": [
            2
          ]
        },
        {
          "name": "SEGM_DETECTOR",
          "type": "SEGM_DETECTOR",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfyui-impact-subpack",
        "ver": "1.3.5",
        "Node name for S&R": "UltralyticsDetectorProvider"
      },
      "widgets_values": [
        "segm/person_yolov8m-seg.pt"
      ],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 5,
      "type": "SegsToCombinedMask",
      "pos": [
        424.4134665014664,
        275.10463657117793
      ],
      "size": [
        211.851171875,
        26
      ],
      "flags": {},
      "order": 4,
      "mode": 0,
      "inputs": [
        {
          "name": "segs",
          "type": "SEGS",
          "link": 3
        }
      ],
      "outputs": [
        {
          "name": "MASK",
          "type": "MASK",
          "links": [
            4
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfyui-impact-pack",
        "ver": "61bd8397a18e7e7668e6a24e95168967768c2bed",
        "Node name for S&R": "SegsToCombinedMask"
      },
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 6,
      "type": "MaskPreview",
      "pos": [
        679.5682861699395,
        275.10463657117793
      ],
      "size": [
        294.93629499045346,
        258
      ],
      "flags": {},
      "order": 6,
      "mode": 0,
      "inputs": [
        {
          "name": "mask",
          "type": "MASK",
          "link": 4
        }
      ],
      "outputs": [],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.71",
        "Node name for S&R": "MaskPreview"
      },
      "widgets_values": []
    },
    {
      "id": 7,
      "type": "SEGSPreview",
      "pos": [
        424.5080547233428,
        380.8224702427784
      ],
      "size": [
        210,
        314
      ],
      "flags": {},
      "order": 5,
      "mode": 0,
      "inputs": [
        {
          "name": "segs",
          "type": "SEGS",
          "link": 5
        },
        {
          "name": "fallback_image_opt",
          "shape": 7,
          "type": "IMAGE",
          "link": null
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "shape": 6,
          "type": "IMAGE",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfyui-impact-pack",
        "ver": "61bd8397a18e7e7668e6a24e95168967768c2bed",
        "Node name for S&R": "SEGSPreview"
      },
      "widgets_values": [
        true,
        0.2
      ]
    },
    {
      "id": 8,
      "type": "SAMLoader",
      "pos": [
        -116.2680478354797,
        435.37734731069196
      ],
      "size": [
        210,
        82
      ],
      "flags": {},
      "order": 1,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "SAM_MODEL",
          "type": "SAM_MODEL",
          "links": [
            6
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfyui-impact-pack",
        "ver": "61bd8397a18e7e7668e6a24e95168967768c2bed",
        "Node name for S&R": "SAMLoader"
      },
      "widgets_values": [
        "sam_vit_b_01ec64.pth",
        "AUTO"
      ],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 2,
      "type": "LoadImage",
      "pos": [
        -199.16827143603965,
        581.4934848883244
      ],
      "size": [
        288.15658006702404,
        326
      ],
      "flags": {},
      "order": 2,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            1
          ]
        },
        {
          "name": "MASK",
          "type": "MASK",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.71",
        "Node name for S&R": "LoadImage"
      },
      "widgets_values": [
        "1f421a11eb7f46ffcf970787036c5cc1.jpg",
        "image"
      ]
    },
    {
      "id": 1,
      "type": "ImpactSimpleDetectorSEGS",
      "pos": [
        137.03559995799336,
        275.10463657117793
      ],
      "size": [
        244.07421875,
        310
      ],
      "flags": {},
      "order": 3,
      "mode": 0,
      "inputs": [
        {
          "name": "bbox_detector",
          "type": "BBOX_DETECTOR",
          "link": 2
        },
        {
          "name": "image",
          "type": "IMAGE",
          "link": 1
        },
        {
          "name": "sam_model_opt",
          "shape": 7,
          "type": "SAM_MODEL",
          "link": 6
        },
        {
          "name": "segm_detector_opt",
          "shape": 7,
          "type": "SEGM_DETECTOR",
          "link": null
        }
      ],
      "outputs": [
        {
          "name": "SEGS",
          "type": "SEGS",
          "links": [
            3,
            5
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfyui-impact-pack",
        "ver": "61bd8397a18e7e7668e6a24e95168967768c2bed",
        "Node name for S&R": "ImpactSimpleDetectorSEGS"
      },
      "widgets_values": [
        0.5,
        0,
        3,
        10,
        0.5,
        0,
        0,
        0.7,
        0
      ],
      "color": "#232",
      "bgcolor": "#353"
    }
  ],
  "links": [
    [
      1,
      2,
      0,
      1,
      1,
      "IMAGE"
    ],
    [
      2,
      3,
      0,
      1,
      0,
      "BBOX_DETECTOR"
    ],
    [
      3,
      1,
      0,
      5,
      0,
      "SEGS"
    ],
    [
      4,
      5,
      0,
      6,
      0,
      "MASK"
    ],
    [
      5,
      1,
      0,
      7,
      0,
      "SEGS"
    ],
    [
      6,
      8,
      0,
      1,
      2,
      "SAM_MODEL"
    ]
  ],
  "groups": [],
  "config": {},
  "extra": {
    "ds": {
      "scale": 0.839054528882405,
      "offset": [
        431.4600310048111,
        -114.3219362287694
      ]
    },
    "frontendVersion": "1.33.8",
    "VHS_latentpreview": false,
    "VHS_latentpreviewrate": 0,
    "VHS_MetadataImage": true,
    "VHS_KeepIntermediate": true
  },
  "version": 0.4
}

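A combination of fast face detection (YOLO) and the original SAM. As saved, the UltralyticsDetectorProvider node in this workflow has segm/person_yolov8m-seg.pt selected, so switch it to a face model if faces are what you want to detect.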
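The workflow ends by merging every detected segment into a single mask (the SegsToCombinedMask node). As a rough illustration of what such a merge means, here is a pixel-wise union of several masks. This is a sketch of the general idea under the assumption that all masks have the same size and hold values in [0, 1]; it is not the Impact Pack implementation, and `combine_masks` is a hypothetical helper.

```python
def combine_masks(masks):
    """Pixel-wise maximum (union) of same-sized masks with values in [0, 1]."""
    if not masks:
        raise ValueError("need at least one mask")
    height, width = len(masks[0]), len(masks[0][0])
    return [
        [max(m[y][x] for m in masks) for x in range(width)]
        for y in range(height)
    ]


# Two hypothetical per-detection masks on a 4 x 2 image.
detection_a = [[1, 1, 0, 0],
               [1, 1, 0, 0]]
detection_b = [[0, 0, 0, 1],
               [0, 0, 0, 1]]

combined = combine_masks([detection_a, detection_b])
for row in combined:
    print(row)
```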

Grounding DINO × SAM

Grounding_DINO_HQ-SAM.json

{
  "id": "45213769-31e7-40a4-9027-26c67d437c51",
  "revision": 0,
  "last_node_id": 6,
  "last_link_id": 4,
  "nodes": [
    {
      "id": 4,
      "type": "LoadImage",
      "pos": [
        -84.57715485740746,
        436.65995789100543
      ],
      "size": [
        306.56906795083313,
        543.6425774433825
      ],
      "flags": {},
      "order": 0,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            1
          ]
        },
        {
          "name": "MASK",
          "type": "MASK",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.71",
        "Node name for S&R": "LoadImage"
      },
      "widgets_values": [
        "pexels-photo-14705585.jpg",
        "image"
      ]
    },
    {
      "id": 2,
      "type": "SegmentV2",
      "pos": [
        270.53229781565096,
        436.65995789100543
      ],
      "size": [
        340,
        332
      ],
      "flags": {},
      "order": 1,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 1
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            3
          ]
        },
        {
          "name": "MASK",
          "type": "MASK",
          "links": null
        },
        {
          "name": "MASK_IMAGE",
          "type": "IMAGE",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfyui-rmbg",
        "ver": "2.9.4",
        "Node name for S&R": "SegmentV2"
      },
      "widgets_values": [
        "horse",
        "sam_hq_vit_h (2.57GB)",
        "GroundingDINO_SwinT_OGC (694MB)",
        0.35,
        0,
        0,
        false,
        "Color",
        "#00ff00"
      ],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 5,
      "type": "PreviewImage",
      "pos": [
        659.0726825378763,
        436.65995789100543
      ],
      "size": [
        332.83609638042526,
        541.6899599010097
      ],
      "flags": {},
      "order": 2,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 3
        }
      ],
      "outputs": [],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.71",
        "Node name for S&R": "PreviewImage"
      },
      "widgets_values": []
    }
  ],
  "links": [
    [
      1,
      4,
      0,
      2,
      0,
      "IMAGE"
    ],
    [
      3,
      2,
      0,
      5,
      0,
      "IMAGE"
    ]
  ],
  "groups": [],
  "config": {},
  "extra": {
    "ds": {
      "scale": 0.7627768444385543,
      "offset": [
        184.57715485740746,
        -336.65995789100543
      ]
    },
    "frontendVersion": "1.33.8",
    "VHS_latentpreview": false,
    "VHS_latentpreviewrate": 0,
    "VHS_MetadataImage": true,
    "VHS_KeepIntermediate": true
  },
  "version": 0.4
}

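A combination of Grounding DINO and HQ-SAM, an improved version of SAM.

It lets you specify the target with text while still producing a high-precision mask, which makes it one of the most commonly used combinations.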

Florence2 × SAM2

Florence2_SAM2.1.json

{
  "id": "b13968f1-cfe5-4646-9f22-ac07831aae2b",
  "revision": 0,
  "last_node_id": 33,
  "last_link_id": 41,
  "nodes": [
    {
      "id": 27,
      "type": "DownloadAndLoadFlorence2Model",
      "pos": [
        797.5498046875,
        435.3081359863281
      ],
      "size": [
        270,
        130
      ],
      "flags": {},
      "order": 0,
      "mode": 0,
      "inputs": [
        {
          "name": "lora",
          "shape": 7,
          "type": "PEFTLORA",
          "link": null
        }
      ],
      "outputs": [
        {
          "name": "florence2_model",
          "type": "FL2MODEL",
          "links": [
            28
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfyui-florence2",
        "ver": "de485b65b3e1b9b887ab494afa236dff4bef9a7e",
        "Node name for S&R": "DownloadAndLoadFlorence2Model"
      },
      "widgets_values": [
        "microsoft/Florence-2-base",
        "fp16",
        "sdpa",
        true
      ],
      "color": "#232",
      "bgcolor": "#353"
    },
    {
      "id": 30,
      "type": "Florence2toCoordinates",
      "pos": [
        1548.1920166015625,
        275.46484375
      ],
      "size": [
        270,
        102
      ],
      "flags": {},
      "order": 4,
      "mode": 0,
      "inputs": [
        {
          "name": "data",
          "type": "JSON",
          "link": 36
        }
      ],
      "outputs": [
        {
          "name": "center_coordinates",
          "type": "STRING",
          "links": null
        },
        {
          "name": "bboxes",
          "type": "BBOX",
          "links": [
            37
          ]
        }
      ],
      "properties": {
        "cnr_id": "ComfyUI-segment-anything-2",
        "ver": "c59676b008a76237002926f684d0ca3a9b29ac54",
        "Node name for S&R": "Florence2toCoordinates"
      },
      "widgets_values": [
        "0",
        false
      ],
      "color": "#432",
      "bgcolor": "#653"
    },
    {
      "id": 16,
      "type": "LoadImage",
      "pos": [
        797.5498046875,
        -13.30300235748291
      ],
      "size": [
        270,
        392.65997314453125
      ],
      "flags": {},
      "order": 1,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            26,
            34,
            41
          ]
        },
        {
          "name": "MASK",
          "type": "MASK",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.39",
        "Node name for S&R": "LoadImage"
      },
      "widgets_values": [
        "Clipboard - 2025-05-13 21.27.11.png",
        "image"
      ]
    },
    {
      "id": 29,
      "type": "InvertMask",
      "pos": [
        2183.08349609375,
        215.1739044189453
      ],
      "size": [
        140,
        26
      ],
      "flags": {},
      "order": 6,
      "mode": 0,
      "inputs": [
        {
          "name": "mask",
          "type": "MASK",
          "link": 38
        }
      ],
      "outputs": [
        {
          "name": "MASK",
          "type": "MASK",
          "links": [
            35
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.39",
        "Node name for S&R": "InvertMask"
      },
      "widgets_values": []
    },
    {
      "id": 23,
      "type": "PreviewImage",
      "pos": [
        2585.65771484375,
        -6.269532203674316
      ],
      "size": [
        374.6875305175781,
        390.1878356933594
      ],
      "flags": {},
      "order": 8,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 32
        }
      ],
      "outputs": [],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.39",
        "Node name for S&R": "PreviewImage"
      },
      "widgets_values": []
    },
    {
      "id": 32,
      "type": "Sam2Segmentation",
      "pos": [
        1870.6756591796875,
        216.38262939453125
      ],
      "size": [
        272.087890625,
        182
      ],
      "flags": {},
      "order": 5,
      "mode": 0,
      "inputs": [
        {
          "name": "sam2_model",
          "type": "SAM2MODEL",
          "link": 40
        },
        {
          "name": "image",
          "type": "IMAGE",
          "link": 41
        },
        {
          "name": "coordinates_positive",
          "shape": 7,
          "type": "STRING",
          "link": null
        },
        {
          "name": "coordinates_negative",
          "shape": 7,
          "type": "STRING",
          "link": null
        },
        {
          "name": "bboxes",
          "shape": 7,
          "type": "BBOX",
          "link": 37
        },
        {
          "name": "mask",
          "shape": 7,
          "type": "MASK",
          "link": null
        }
      ],
      "outputs": [
        {
          "name": "mask",
          "type": "MASK",
          "links": [
            38
          ]
        }
      ],
      "properties": {
        "cnr_id": "ComfyUI-segment-anything-2",
        "ver": "c59676b008a76237002926f684d0ca3a9b29ac54",
        "Node name for S&R": "Sam2Segmentation"
      },
      "widgets_values": [
        true,
        false
      ],
      "color": "#432",
      "bgcolor": "#653"
    },
    {
      "id": 28,
      "type": "JoinImageWithAlpha",
      "pos": [
        2368.4716796875,
        -6.269532203674316
      ],
      "size": [
        176.86484375,
        46
      ],
      "flags": {},
      "order": 7,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 34
        },
        {
          "name": "alpha",
          "type": "MASK",
          "link": 35
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            32
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.39",
        "Node name for S&R": "JoinImageWithAlpha"
      },
      "widgets_values": []
    },
    {
      "id": 33,
      "type": "DownloadAndLoadSAM2Model",
      "pos": [
        1548.1920166015625,
        82.7560043334961
      ],
      "size": [
        270,
        130
      ],
      "flags": {},
      "order": 2,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "sam2_model",
          "type": "SAM2MODEL",
          "links": [
            40
          ]
        }
      ],
      "properties": {
        "cnr_id": "ComfyUI-segment-anything-2",
        "ver": "c59676b008a76237002926f684d0ca3a9b29ac54",
        "Node name for S&R": "DownloadAndLoadSAM2Model"
      },
      "widgets_values": [
        "sam2.1_hiera_base_plus.safetensors",
        "single_image",
        "cuda",
        "fp16"
      ],
      "color": "#432",
      "bgcolor": "#653"
    },
    {
      "id": 25,
      "type": "Florence2Run",
      "pos": [
        1107.8709716796875,
        74.4581298828125
      ],
      "size": [
        400,
        364
      ],
      "flags": {},
      "order": 3,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 26
        },
        {
          "name": "florence2_model",
          "type": "FL2MODEL",
          "link": 28
        }
      ],
      "outputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "links": []
        },
        {
          "name": "mask",
          "type": "MASK",
          "links": []
        },
        {
          "name": "caption",
          "type": "STRING",
          "links": null
        },
        {
          "name": "data",
          "type": "JSON",
          "links": [
            36
          ]
        }
      ],
      "properties": {
        "cnr_id": "comfyui-florence2",
        "ver": "de485b65b3e1b9b887ab494afa236dff4bef9a7e",
        "Node name for S&R": "Florence2Run"
      },
      "widgets_values": [
        "goldfish",
        "caption_to_phrase_grounding",
        true,
        false,
        1024,
        3,
        true,
        "",
        1234,
        "fixed"
      ],
      "color": "#232",
      "bgcolor": "#353"
    }
  ],
  "links": [
    [
      26,
      16,
      0,
      25,
      0,
      "IMAGE"
    ],
    [
      28,
      27,
      0,
      25,
      1,
      "FL2MODEL"
    ],
    [
      32,
      28,
      0,
      23,
      0,
      "IMAGE"
    ],
    [
      34,
      16,
      0,
      28,
      0,
      "IMAGE"
    ],
    [
      35,
      29,
      0,
      28,
      1,
      "MASK"
    ],
    [
      36,
      25,
      3,
      30,
      0,
      "JSON"
    ],
    [
      37,
      30,
      1,
      32,
      4,
      "BBOX"
    ],
    [
      38,
      32,
      0,
      29,
      0,
      "MASK"
    ],
    [
      40,
      33,
      0,
      32,
      0,
      "SAM2MODEL"
    ],
    [
      41,
      16,
      0,
      32,
      1,
      "IMAGE"
    ]
  ],
  "groups": [],
  "config": {},
  "extra": {
    "ds": {
      "scale": 0.620921323059155,
      "offset": [
        -697.5498046875,
        113.30300235748291
      ]
    },
    "reroutes": [
      {
        "id": 1,
        "pos": [
          1829.7442626953125,
          3.2779242992401123
        ],
        "linkIds": [
          34,
          41
        ]
      }
    ],
    "linkExtensions": [
      {
        "id": 34,
        "parentId": 1
      },
      {
        "id": 41,
        "parentId": 1
      }
    ],
    "frontendVersion": "1.33.8",
    "VHS_latentpreview": false,
    "VHS_latentpreviewrate": 0,
    "VHS_MetadataImage": true,
    "VHS_KeepIntermediate": true
  },
  "version": 0.4
}

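A combination of Florence2 and SAM2.1.

For easy targets such as people or animals any of these approaches will do, but when you want to specify the target with complex conditions, such as "a man wearing sunglasses" or "a cat lying under a tree", LLM-based models like this one show their strength.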

SAM 3 × BiRefNet

SAM3_BiRefNet.json

{
  "id": "5231bbde-3d9e-483d-9963-63165fedc646",
  "revision": 0,
  "last_node_id": 12,
  "last_link_id": 18,
  "nodes": [
    {
      "id": 2,
      "type": "PreviewImage",
      "pos": [
        1836.5379900055684,
        293.7408968602474
      ],
      "size": [
        554.9600255276209,
        422.8923553539689
      ],
      "flags": {},
      "order": 4,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 17
        }
      ],
      "outputs": [],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.71",
        "Node name for S&R": "PreviewImage"
      },
      "widgets_values": []
    },
    {
      "id": 1,
      "type": "LoadImage",
      "pos": [
        477.2842309638515,
        293.7408968602474
      ],
      "size": [
        526.1926943110356,
        491.5335516952887
      ],
      "flags": {},
      "order": 0,
      "mode": 0,
      "inputs": [],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            18
          ]
        },
        {
          "name": "MASK",
          "type": "MASK",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.71",
        "Node name for S&R": "LoadImage"
      },
      "widgets_values": [
        "pasted/image (35).png",
        "image"
      ]
    },
    {
      "id": 11,
      "type": "BiRefNetRMBG",
      "pos": [
        1445.5176350953413,
        293.7408968602474
      ],
      "size": [
        340,
        254
      ],
      "flags": {},
      "order": 3,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 16
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            17
          ]
        },
        {
          "name": "MASK",
          "type": "MASK",
          "links": null
        },
        {
          "name": "MASK_IMAGE",
          "type": "IMAGE",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfyui-rmbg",
        "ver": "2.9.4",
        "Node name for S&R": "BiRefNetRMBG"
      },
      "widgets_values": [
        "BiRefNet-general",
        0,
        0,
        false,
        false,
        "Alpha",
        "#222222"
      ],
      "color": "#323",
      "bgcolor": "#535"
    },
    {
      "id": 5,
      "type": "PreviewImage",
      "pos": [
        1448.15746204173,
        611.2211523676546
      ],
      "size": [
        332.392016078781,
        258
      ],
      "flags": {},
      "order": 2,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 4
        }
      ],
      "outputs": [],
      "properties": {
        "cnr_id": "comfy-core",
        "ver": "0.3.71",
        "Node name for S&R": "PreviewImage"
      },
      "widgets_values": []
    },
    {
      "id": 4,
      "type": "SAM3Segment",
      "pos": [
        1054.497280185114,
        293.7408968602474
      ],
      "size": [
        340,
        332
      ],
      "flags": {},
      "order": 1,
      "mode": 0,
      "inputs": [
        {
          "name": "image",
          "type": "IMAGE",
          "link": 18
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            4,
            16
          ]
        },
        {
          "name": "MASK",
          "type": "MASK",
          "links": []
        },
        {
          "name": "MASK_IMAGE",
          "type": "IMAGE",
          "links": null
        }
      ],
      "properties": {
        "cnr_id": "comfyui-rmbg",
        "ver": "2.9.4",
        "Node name for S&R": "SAM3Segment"
      },
      "widgets_values": [
        "the woman on the right",
        "sam3",
        "Auto",
        0.5,
        0,
        7,
        false,
        "Color",
        "#00ff00"
      ],
      "color": "#432",
      "bgcolor": "#653"
    }
  ],
  "links": [
    [
      4,
      4,
      0,
      5,
      0,
      "IMAGE"
    ],
    [
      16,
      4,
      0,
      11,
      0,
      "IMAGE"
    ],
    [
      17,
      11,
      0,
      2,
      0,
      "IMAGE"
    ],
    [
      18,
      1,
      0,
      4,
      0,
      "IMAGE"
    ]
  ],
  "groups": [],
  "config": {},
  "extra": {
    "ds": {
      "scale": 0.8390545288824087,
      "offset": [
        -377.2842309638515,
        -193.7408968602474
      ]
    },
    "frontendVersion": "1.33.8",
    "VHS_latentpreview": false,
    "VHS_latentpreviewrate": 0,
    "VHS_MetadataImage": true,
    "VHS_KeepIntermediate": true
  },
  "version": 0.4
}

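Segmentation is fundamentally about telling objects apart; it is not meant for fine-grained cutouts.

Matting, by contrast, can handle fine details such as hair and semi-transparent materials such as glass.

Combining the two gives you the strengths of both.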
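One simple way to picture the combination is to let the segmentation mask decide which object to keep and let the matte decide how soft its edges are. The sketch below multiplies the two per pixel. Note that this is a different, simplified technique from the workflow above, which chains the nodes by feeding the SAM3Segment image output into BiRefNetRMBG. The values and the helper `refine_with_matte` are hypothetical.

```python
def refine_with_matte(seg_mask, matte):
    """Keep the matte's soft alpha only where the segmentation selected the object."""
    return [
        [s * a for s, a in zip(seg_row, matte_row)]
        for seg_row, matte_row in zip(seg_mask, matte)
    ]


# Hard 0/1 mask: "which object" (for example, the woman on the right).
seg_mask = [[0, 1, 1],
            [0, 1, 1]]
# Soft alpha matte: fine edges, but it also covers foreground we did not ask for.
matte = [[0.9, 0.6, 1.0],
         [0.0, 0.3, 1.0]]

refined = refine_with_matte(seg_mask, matte)
for row in refined:
    print(row)
```

In practice the segmentation mask is usually expanded a little before multiplying so that it does not clip fine detail such as hair that the matte recovered.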
