What is Lumina-Image 2.0?
Lumina-Image 2.0 is a 2.6B-parameter image generation model that combines a Unified Next-DiT backbone with a Flux-style VAE.
It uses a Gemma 2B-family text encoder, yet the model itself is considerably smaller than SD3 or FLUX Pro; like AuraFlow, it is positioned as a relatively lightweight base model that is easy to use day to day.
Its prompt adherence is strong for its size, which made it one of the models watched as a next-generation base model candidate.
Downloading the models
- diffusion_models
- text_encoders
- vae
📂ComfyUI/
└── 📂models/
    ├── 📂diffusion_models/
    │   └── lumina_2_model_bf16.safetensors
    ├── 📂text_encoders/
    │   └── gemma_2_2b_fp16.safetensors
    └── 📂vae/
        └── ae.safetensors
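Once the three files are downloaded, you can verify the placement programmatically. A minimal sketch, assuming your ComfyUI install is at a path you pass in (the file names come from the tree above; the helper itself is hypothetical):

```python
from pathlib import Path

# Required files for the Lumina-Image 2.0 workflow, keyed by models/ subfolder
# (names taken from the directory tree above).
REQUIRED = {
    "diffusion_models": "lumina_2_model_bf16.safetensors",
    "text_encoders": "gemma_2_2b_fp16.safetensors",
    "vae": "ae.safetensors",
}

def missing_models(comfy_root: str) -> list[str]:
    """Return the relative paths of required model files not yet installed."""
    root = Path(comfy_root) / "models"
    return [
        str(Path(sub) / name)
        for sub, name in REQUIRED.items()
        if not (root / sub / name).exists()
    ]

if __name__ == "__main__":
    for rel in missing_models("ComfyUI"):  # adjust to your install path
        print(f"missing: {rel}")
```

Run it from the directory containing your ComfyUI folder; anything it lists still needs to be downloaded.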
text2image

{
"id": "18404b37-92b0-4d11-a39c-ae941838eb83",
"revision": 0,
"last_node_id": 47,
"last_link_id": 68,
"nodes": [
{
"id": 33,
"type": "CLIPTextEncode",
"pos": [
507,
378
],
"size": [
339.84503173828125,
102.47611236572266
],
"flags": {
"collapsed": false
},
"order": 6,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 64
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
55
]
}
],
"title": "CLIP Text Encode (Negative Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"worst quality"
]
},
{
"id": 27,
"type": "EmptySD3LatentImage",
"pos": [
579.1014404296875,
547
],
"size": [
267.74359130859375,
106
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"slot_index": 0,
"links": [
51
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "EmptySD3LatentImage"
},
"widgets_values": [
1024,
1024,
1
]
},
{
"id": 6,
"type": "CLIPTextEncode",
"pos": [
507,
190
],
"size": [
339.84503173828125,
123.01304626464844
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 63
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
67
]
}
],
"title": "CLIP Text Encode (Positive Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"A whimsical 3D illustration of flowers with bulbous red petals and smooth green stems. Soft, diffused lighting and a clean, off-white background"
]
},
{
"id": 31,
"type": "KSampler",
"pos": [
904.2318115234375,
210.53184509277344
],
"size": [
315,
262
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 66
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 67
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 55
},
{
"name": "latent_image",
"type": "LATENT",
"link": 51
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"slot_index": 0,
"links": [
52
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "KSampler"
},
"widgets_values": [
777,
"fixed",
25,
4,
"res_multistep",
"normal",
1
]
},
{
"id": 44,
"type": "CLIPLoader",
"pos": [
188.4966278076172,
274.4528503417969
],
"size": [
270,
106
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"links": [
63,
64
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.41",
"Node name for S&R": "CLIPLoader"
},
"widgets_values": [
"gemma_2_2b_fp16.safetensors",
"lumina2",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 41,
"type": "UNETLoader",
"pos": [
310.51028121398076,
36.66623591530498
],
"size": [
270,
82
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
65
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Lumina\\lumina_2_model_bf16.safetensors",
"default"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 46,
"type": "MarkdownNote",
"pos": [
-26.957896179097734,
-25.496790894387402
],
"size": [
309.9175109863281,
228.3336181640625
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n- [lumina_2_model_bf16.safetensors](https://huggingface.co/Comfy-Org/Lumina_Image_2.0_Repackaged/tree/main/split_files/diffusion_models)\n- [gemma_2_2b_fp16.safetensors](https://huggingface.co/Comfy-Org/Lumina_Image_2.0_Repackaged/tree/main/split_files/text_encoders)\n- [ae.safetensors](https://huggingface.co/Comfy-Org/Lumina_Image_2.0_Repackaged/tree/main/split_files/vae)\n```\n📂ComfyUI/\n└──📂models/\n ├── 📂diffusion_models/\n │ └── lumina_2_model_bf16.safetensors\n ├── 📂text_encoders/\n │ └── gemma_2_2b_fp16.safetensors\n └── 📂vae/\n └── ae.safetensors\n\n```"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 43,
"type": "VAELoader",
"pos": [
985.1763763427734,
88.72033833561756
],
"size": [
234.05543518066406,
58
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"links": [
62
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"ae.safetensors"
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 45,
"type": "ModelSamplingAuraFlow",
"pos": [
613.4235101781512,
36.89046000588115
],
"size": [
233.42152156013003,
58
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 65
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
66
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.41",
"Node name for S&R": "ModelSamplingAuraFlow"
},
"widgets_values": [
6.000000000000001
]
},
{
"id": 8,
"type": "VAEDecode",
"pos": [
1246.2337646484375,
211.26541137695312
],
"size": [
170,
46
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 52
},
{
"name": "vae",
"type": "VAE",
"link": 62
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"slot_index": 0,
"links": [
68
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "VAEDecode"
},
"widgets_values": []
},
{
"id": 47,
"type": "SaveImage",
"pos": [
1442.129204705959,
211.26541137695312
],
"size": [
393.70000000000005,
455.90000000000003
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 68
}
],
"outputs": [],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.76"
},
"widgets_values": [
"ComfyUI"
]
}
],
"links": [
[
51,
27,
0,
31,
3,
"LATENT"
],
[
52,
31,
0,
8,
0,
"LATENT"
],
[
55,
33,
0,
31,
2,
"CONDITIONING"
],
[
62,
43,
0,
8,
1,
"VAE"
],
[
63,
44,
0,
6,
0,
"CLIP"
],
[
64,
44,
0,
33,
0,
"CLIP"
],
[
65,
41,
0,
45,
0,
"MODEL"
],
[
66,
45,
0,
31,
0,
"MODEL"
],
[
67,
6,
0,
31,
1,
"CONDITIONING"
],
[
68,
8,
0,
47,
0,
"IMAGE"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 0.8264462809917354,
"offset": [
126.95789617909773,
126.7067908943874
]
},
"frontendVersion": "1.35.0",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
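Because the workflow is plain JSON, sampler settings can also be tweaked outside the ComfyUI UI. A minimal sketch, assuming the KSampler `widgets_values` layout shown in the export above (seed, seed mode, steps, cfg, sampler, scheduler, denoise); the file names in the comments are hypothetical:

```python
import json

def set_ksampler(workflow: dict, *, seed: int, steps: int, cfg: float) -> dict:
    """Overwrite seed/steps/cfg on every KSampler node in a ComfyUI workflow.

    widgets_values layout (as in the JSON above):
    [seed, seed_mode, steps, cfg, sampler_name, scheduler, denoise]
    """
    for node in workflow["nodes"]:
        if node.get("type") == "KSampler":
            w = node["widgets_values"]
            w[0], w[2], w[3] = seed, steps, cfg
    return workflow

# Example usage: load an exported workflow, change the seed, save it back.
# with open("lumina_t2i.json") as f:            # hypothetical file name
#     wf = json.load(f)
# with open("lumina_t2i_edited.json", "w") as f:
#     json.dump(set_ksampler(wf, seed=42, steps=30, cfg=4.0), f, indent=2)
```

The edited file can then be dragged back into ComfyUI like any other workflow JSON.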
Neta Lumina
Neta-Lumina is an anime-focused fine-tuned model based on Lumina-Image 2.0.
As befits an anime model, it supports Danbooru tags, and a distinguishing feature is that it accepts multilingual prompts in Chinese, English, and Japanese.
Downloading the model
- diffusion_models
📂ComfyUI/
└── 📂models/
    └── 📂diffusion_models/
        └── neta-lumina-v1.0.safetensors
text2image

{
"id": "18404b37-92b0-4d11-a39c-ae941838eb83",
"revision": 0,
"last_node_id": 47,
"last_link_id": 68,
"nodes": [
{
"id": 27,
"type": "EmptySD3LatentImage",
"pos": [
579.1014404296875,
547
],
"size": [
267.74359130859375,
106
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"slot_index": 0,
"links": [
51
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "EmptySD3LatentImage"
},
"widgets_values": [
1024,
1024,
1
]
},
{
"id": 44,
"type": "CLIPLoader",
"pos": [
188.4966278076172,
274.4528503417969
],
"size": [
270,
106
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"links": [
63,
64
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.41",
"Node name for S&R": "CLIPLoader"
},
"widgets_values": [
"gemma_2_2b_fp16.safetensors",
"lumina2",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 43,
"type": "VAELoader",
"pos": [
985.1763763427734,
88.72033833561756
],
"size": [
234.05543518066406,
58
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"links": [
62
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"ae.safetensors"
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 45,
"type": "ModelSamplingAuraFlow",
"pos": [
613.4235101781512,
36.89046000588115
],
"size": [
233.42152156013003,
58
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 65
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
66
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.41",
"Node name for S&R": "ModelSamplingAuraFlow"
},
"widgets_values": [
6.000000000000001
]
},
{
"id": 8,
"type": "VAEDecode",
"pos": [
1246.2337646484375,
211.26541137695312
],
"size": [
170,
46
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 52
},
{
"name": "vae",
"type": "VAE",
"link": 62
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"slot_index": 0,
"links": [
68
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "VAEDecode"
},
"widgets_values": []
},
{
"id": 6,
"type": "CLIPTextEncode",
"pos": [
507,
190
],
"size": [
339.84503173828125,
123.01304626464844
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 63
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
67
]
}
],
"title": "CLIP Text Encode (Positive Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"You are an assistant designed to generate anime images based on textual prompts. <Prompt Start>\n\n1girl, solo, red hair, long hair, wet hair, red spider lily, flower in mouth, casual clothes, white blouse, light cardigan, floating on water, water surface, gentle ripples, lying on back, upper body, top-down view, detailed eyes, cinematic anime style, high-end anime, refined lineart, subtle shading, soft glow, morning, sunrise, golden hour lighting, sparkling water, light particles, best quality,\nA cinematic, high-quality anime illustration of a red-haired young woman floating quietly on the surface of calm water at sunrise, viewed from above in a medium shot. She wears simple, modern clothing—a white blouse layered with a light cardigan—that clings slightly to her as the fabric is soaked, giving a natural sense of weight and texture without feeling eroticized. A vivid red spider lily rests gently between her lips, its petals contrasting against her pale skin and soft, wet hair that fans out around her in the water. Warm golden-hour sunlight streams in from one side, scattering fine sparkles across the water surface and creating delicate bokeh-like highlights around her face and shoulders. The shading and coloring are polished like a high-budget anime film, with refined linework, nuanced gradients, and carefully rendered reflections. Subtle ripples expand from her body, and the overall composition focuses on a serene, poetic mood with precise detail in her expression, hair, clothing folds, and the spider lily."
]
},
{
"id": 33,
"type": "CLIPTextEncode",
"pos": [
507,
378
],
"size": [
339.84503173828125,
102.47611236572266
],
"flags": {
"collapsed": false
},
"order": 6,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 64
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
55
]
}
],
"title": "CLIP Text Encode (Negative Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"You are an assistant designed to generate low-quality images based on textual prompts <Prompt Start>\nblurry, worst quality, low quality, deformed hands, bad anatomy,\nextra limbs, poorly drawn face, mutated, extra eyes, bad proportions"
]
},
{
"id": 31,
"type": "KSampler",
"pos": [
904.2318115234375,
210.53184509277344
],
"size": [
315,
262
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 66
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 67
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 55
},
{
"name": "latent_image",
"type": "LATENT",
"link": 51
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"slot_index": 0,
"links": [
52
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "KSampler"
},
"widgets_values": [
123456,
"fixed",
30,
5.5,
"res_multistep",
"linear_quadratic",
1
]
},
{
"id": 41,
"type": "UNETLoader",
"pos": [
310.51028121398076,
36.66623591530498
],
"size": [
270,
82
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
65
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Lumina\\neta-lumina-v1.0.safetensors",
"default"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 47,
"type": "SaveImage",
"pos": [
1442.129204705959,
211.26541137695312
],
"size": [
380.99299999999994,
449.66200000000003
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 68
}
],
"outputs": [],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.76"
},
"widgets_values": [
"ComfyUI"
]
},
{
"id": 46,
"type": "MarkdownNote",
"pos": [
-54.62550631501746,
-44.09894580183908
],
"size": [
309.9175109863281,
228.3336181640625
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n- [neta-lumina-v1.0.safetensors](https://huggingface.co/neta-art/Neta-Lumina/blob/main/Unet/neta-lumina-v1.0.safetensors)\n- [gemma_2_2b_fp16.safetensors](https://huggingface.co/Comfy-Org/Lumina_Image_2.0_Repackaged/tree/main/split_files/text_encoders)\n- [ae.safetensors](https://huggingface.co/Comfy-Org/Lumina_Image_2.0_Repackaged/tree/main/split_files/vae)\n```\n📂ComfyUI/\n└──📂models/\n ├── 📂diffusion_models/\n │ └── neta-lumina-v1.0.safetensors\n ├── 📂text_encoders/\n │ └── gemma_2_2b_fp16.safetensors\n └── 📂vae/\n └── ae.safetensors\n\n```"
],
"color": "#323",
"bgcolor": "#535"
}
],
"links": [
[
51,
27,
0,
31,
3,
"LATENT"
],
[
52,
31,
0,
8,
0,
"LATENT"
],
[
55,
33,
0,
31,
2,
"CONDITIONING"
],
[
62,
43,
0,
8,
1,
"VAE"
],
[
63,
44,
0,
6,
0,
"CLIP"
],
[
64,
44,
0,
33,
0,
"CLIP"
],
[
65,
41,
0,
45,
0,
"MODEL"
],
[
66,
45,
0,
31,
0,
"MODEL"
],
[
67,
6,
0,
31,
1,
"CONDITIONING"
],
[
68,
8,
0,
47,
0,
"IMAGE"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 0.6830134553650705,
"offset": [
154.62550631501745,
144.09894580183908
]
},
"frontendVersion": "1.35.0",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
- Following the official settings, the sampler is res_multistep with the linear_quadratic scheduler.
The prompting has a small quirk: a system prompt must be written before the text you actually want the model to render.
You are an assistant designed to generate anime images based on textual prompts. <Prompt Start>
1girl, portrait, ...
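That structure is easy to wrap in a small helper. A minimal sketch; the system line is quoted from the workflow above, while the helper name is made up:

```python
# System prefix required by Neta Lumina (quoted from the workflow above).
SYSTEM_PROMPT = (
    "You are an assistant designed to generate anime images "
    "based on textual prompts. <Prompt Start>"
)

def neta_prompt(user_prompt: str) -> str:
    """Prepend the Neta Lumina system prompt to the actual prompt text."""
    return f"{SYSTEM_PROMPT}\n\n{user_prompt}"

print(neta_prompt("1girl, portrait, red hair"))
```

Paste the resulting string into the positive prompt node; the negative prompt uses the analogous "low-quality images" system line, as in the workflow above.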
See the official Prompt Book for details.
NetaYume Lumina
There is also NetaYume Lumina, a model further fine-tuned on top of Neta Lumina.
While we're at it, let's cover this one as well.
Downloading the model
- diffusion_models
📂ComfyUI/
└── 📂models/
    └── 📂diffusion_models/
        └── NetaYumev4_unet.safetensors
text2image

{
"id": "18404b37-92b0-4d11-a39c-ae941838eb83",
"revision": 0,
"last_node_id": 47,
"last_link_id": 68,
"nodes": [
{
"id": 27,
"type": "EmptySD3LatentImage",
"pos": [
579.1014404296875,
547
],
"size": [
267.74359130859375,
106
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"slot_index": 0,
"links": [
51
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "EmptySD3LatentImage"
},
"widgets_values": [
1024,
1024,
1
]
},
{
"id": 44,
"type": "CLIPLoader",
"pos": [
188.4966278076172,
274.4528503417969
],
"size": [
270,
106
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"links": [
63,
64
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.41",
"Node name for S&R": "CLIPLoader"
},
"widgets_values": [
"gemma_2_2b_fp16.safetensors",
"lumina2",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 43,
"type": "VAELoader",
"pos": [
985.1763763427734,
88.72033833561756
],
"size": [
234.05543518066406,
58
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"links": [
62
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"ae.safetensors"
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 45,
"type": "ModelSamplingAuraFlow",
"pos": [
613.4235101781512,
36.89046000588115
],
"size": [
233.42152156013003,
58
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 65
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
66
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.41",
"Node name for S&R": "ModelSamplingAuraFlow"
},
"widgets_values": [
6.000000000000001
]
},
{
"id": 8,
"type": "VAEDecode",
"pos": [
1246.2337646484375,
211.26541137695312
],
"size": [
170,
46
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 52
},
{
"name": "vae",
"type": "VAE",
"link": 62
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"slot_index": 0,
"links": [
68
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "VAEDecode"
},
"widgets_values": []
},
{
"id": 33,
"type": "CLIPTextEncode",
"pos": [
507,
378
],
"size": [
339.84503173828125,
102.47611236572266
],
"flags": {
"collapsed": false
},
"order": 6,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 64
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
55
]
}
],
"title": "CLIP Text Encode (Negative Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"You are an assistant designed to generate low-quality images based on textual prompts <Prompt Start>\nblurry, worst quality, low quality, deformed hands, bad anatomy,\nextra limbs, poorly drawn face, mutated, extra eyes, bad proportions"
]
},
{
"id": 47,
"type": "SaveImage",
"pos": [
1442.129204705959,
211.26541137695312
],
"size": [
380.99299999999994,
449.66200000000003
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 68
}
],
"outputs": [],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.76"
},
"widgets_values": [
"ComfyUI"
]
},
{
"id": 41,
"type": "UNETLoader",
"pos": [
310.51028121398076,
36.66623591530498
],
"size": [
270,
82
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
65
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Lumina\\NetaYumev4_unet.safetensors",
"default"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 6,
"type": "CLIPTextEncode",
"pos": [
507,
190
],
"size": [
339.84503173828125,
123.01304626464844
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 63
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
67
]
}
],
"title": "CLIP Text Encode (Positive Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"You are an assistant designed to generate anime images based on textual prompts. <Prompt Start>\n\n1girl, solo, white hair, long hair, wet hair, white spider lily, flowers on water, simple white dress, long dress, floating on red water, crimson sea, side view, close-up, face focus, tilted angle, diagonal composition, detailed eyes, cinematic anime style, high-end anime, refined lineart, dramatic lighting, glowing reflections, best quality,\nA cinematic, high-quality anime illustration of a white-haired young woman floating in a calm crimson sea at twilight, shown in a close-up side view along the water’s surface. She wears a modest, simple long white dress that spreads softly in the red water, its wet fabric drifting and folding with gentle motion. Several delicate white spider lilies float on the surface around her, some catching on the hem of her dress and near her shoulder, their pale petals forming a striking contrast against the deep red sea. The composition uses a slightly tilted, diagonal angle so that her face and the waterline create a dynamic, film-like frame, with her serene expression and detailed eyes as the main focus. Dramatic but controlled lighting makes the red water glow with subtle highlights and reflections, while soft specular light traces the contours of her face, hair, and dress. The rendering style resembles a high-budget anime film, with refined linework, nuanced gradients, and carefully painted reflections, emphasizing the interplay of white and crimson and the quiet, otherworldly atmosphere of the scene."
]
},
{
"id": 31,
"type": "KSampler",
"pos": [
904.2318115234375,
210.53184509277344
],
"size": [
315,
262
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 66
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 67
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 55
},
{
"name": "latent_image",
"type": "LATENT",
"link": 51
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"slot_index": 0,
"links": [
52
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "KSampler"
},
"widgets_values": [
7777,
"fixed",
30,
5.5,
"res_multistep",
"linear_quadratic",
1
]
},
{
"id": 46,
"type": "MarkdownNote",
"pos": [
-54.62550631501746,
-44.09894580183908
],
"size": [
309.9175109863281,
228.3336181640625
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n- [NetaYumev4_unet.safetensors](https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0/blob/main/Unet/v4/NetaYumev4_unet.safetensors)\n- [gemma_2_2b_fp16.safetensors](https://huggingface.co/Comfy-Org/Lumina_Image_2.0_Repackaged/tree/main/split_files/text_encoders)\n- [ae.safetensors](https://huggingface.co/Comfy-Org/Lumina_Image_2.0_Repackaged/tree/main/split_files/vae)\n```\n📂ComfyUI/\n└──📂models/\n ├── 📂diffusion_models/\n │ └── NetaYumev4_unet.safetensors\n ├── 📂text_encoders/\n │ └── gemma_2_2b_fp16.safetensors\n └── 📂vae/\n └── ae.safetensors\n\n```"
],
"color": "#323",
"bgcolor": "#535"
}
],
"links": [
[
51,
27,
0,
31,
3,
"LATENT"
],
[
52,
31,
0,
8,
0,
"LATENT"
],
[
55,
33,
0,
31,
2,
"CONDITIONING"
],
[
62,
43,
0,
8,
1,
"VAE"
],
[
63,
44,
0,
6,
0,
"CLIP"
],
[
64,
44,
0,
33,
0,
"CLIP"
],
[
65,
41,
0,
45,
0,
"MODEL"
],
[
66,
45,
0,
31,
0,
"MODEL"
],
[
67,
6,
0,
31,
1,
"CONDITIONING"
],
[
68,
8,
0,
47,
0,
"IMAGE"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 0.8264462809917354,
"offset": [
154.62550631501745,
144.09894580183908
]
},
"frontendVersion": "1.35.0",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
NewBie image Exp0.1
NewBie-image (Exp0.1) is an anime-oriented text-to-image model built on NewBie's own architecture, which takes Next-DiT as its foundation and draws on findings from Lumina architecture research. It uses stronger text encoders and supports XML-style prompts (structured tags) for finer-grained control.
This model has only completed about 20% of its planned training so far; future updates may change the workflow.
Downloading the models
- diffusion_models
- text_encoders
- vae
📂ComfyUI/
└── 📂models/
    ├── 📂diffusion_models/
    │   └── NewBie-Image-Exp0.1-bf16.safetensors
    ├── 📂text_encoders/
    │   ├── gemma_3_4b_it_bf16.safetensors
    │   └── jina_clip_v2_bf16.safetensors
    └── 📂vae/
        └── ae.safetensors
text2image

{
"id": "18404b37-92b0-4d11-a39c-ae941838eb83",
"revision": 0,
"last_node_id": 49,
"last_link_id": 70,
"nodes": [
{
"id": 43,
"type": "VAELoader",
"pos": [
985.1763763427734,
88.72033833561756
],
"size": [
234.05543518066406,
58
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"links": [
62
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"ae.safetensors"
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 8,
"type": "VAEDecode",
"pos": [
1246.2337646484375,
211.26541137695312
],
"size": [
170,
46
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 52
},
{
"name": "vae",
"type": "VAE",
"link": 62
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"slot_index": 0,
"links": [
68
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "VAEDecode"
},
"widgets_values": []
},
{
"id": 41,
"type": "UNETLoader",
"pos": [
310.51028121398076,
36.66623591530498
],
"size": [
270,
82
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
65
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "UNETLoader"
},
"widgets_values": [
"Lumina\\NewBie-Image-Exp0.1-bf16.safetensors",
"default"
],
"color": "#323",
"bgcolor": "#535"
},
{
"id": 33,
"type": "CLIPTextEncode",
"pos": [
507,
378
],
"size": [
339.84503173828125,
102.47611236572266
],
"flags": {
"collapsed": false
},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 70
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
55
]
}
],
"title": "CLIP Text Encode (Negative Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"<e621_tags>furry</e621_tags>\n\n<danbooru_tags>\n furry, english_text, chinese_text, korean_text, speech_bubble, logo, signature, watermark, web_address,\n artist_name, character_name, copyright_name, twitter_username,\n dated, low_score, worst_quality, low_quality, bad_quality, lowres, blurry, blurred, pixelated,\n compression_artifacts, jpeg_artifacts,\n bad_anatomy, deformed_hands, deformed_fingers, fused_fingers, missing_fingers,\n extra_limbs, extra_arms, extra_legs, extra_fingers, extra_digits,\n wrong_hands, ugly_hands, bad_proportions, poorly_drawn_face, extra_eyes, mutated\n</danbooru_tags>\n\n<resolution>low_resolution</resolution>\n"
]
},
{
"id": 27,
"type": "EmptySD3LatentImage",
"pos": [
579.1014404296875,
547
],
"size": [
267.74359130859375,
106
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"slot_index": 0,
"links": [
51
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "EmptySD3LatentImage"
},
"widgets_values": [
1024,
1536,
1
]
},
{
"id": 45,
"type": "ModelSamplingAuraFlow",
"pos": [
613.4235101781512,
36.89046000588115
],
"size": [
233.42152156013003,
58
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 65
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
66
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.41",
"Node name for S&R": "ModelSamplingAuraFlow"
},
"widgets_values": [
6.000000000000001
]
},
{
"id": 49,
"type": "DualCLIPLoader",
"pos": [
177.29417778458955,
290.9165148987406
],
"size": [
279.4214876033057,
130
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"links": [
69,
70
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.5.1",
"Node name for S&R": "DualCLIPLoader"
},
"widgets_values": [
"gemma_3_4b_it_bf16.safetensors",
"jina_clip_v2_bf16.safetensors",
"newbie",
"default"
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 31,
"type": "KSampler",
"pos": [
904.2318115234375,
210.53184509277344
],
"size": [
315,
262
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 66
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 67
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 55
},
{
"name": "latent_image",
"type": "LATENT",
"link": 51
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"slot_index": 0,
"links": [
52
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "KSampler"
},
"widgets_values": [
7777,
"fixed",
30,
4.5,
"res_multistep",
"linear_quadratic",
1
]
},
{
"id": 47,
"type": "SaveImage",
"pos": [
1442.129204705959,
211.26541137695312
],
"size": [
353.4448074477016,
493.73916861441126
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 68
}
],
"outputs": [],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.76"
},
"widgets_values": [
"ComfyUI"
]
},
{
"id": 6,
"type": "CLIPTextEncode",
"pos": [
507,
190
],
"size": [
339.84503173828125,
123.01304626464844
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 69
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
67
]
}
],
"title": "CLIP Text Encode (Positive Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.39",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"<character_1>\n <n>character_1</n>\n <gender>1girl</gender>\n <appearance>\n solo, black_hair, long_hair, wet_hair, floating_hair,\n sharp_eyes, intense_gaze\n </appearance>\n <clothing>\n white_dress, wet_clothes\n </clothing>\n <expression>\n serious, stoic, closed_mouth\n </expression>\n <action>\n underwater, sinking, bubbles, bubble_trail, water_droplets\n </action>\n <position>\n portrait, upper_body, dynamic_angle, dutch_angle, diagonal_composition\n </position>\n</character_1>\n\n<general_tags>\n <style>\n anime_style, key_visual, official_art, illustration,\n refined_lineart, clean_lineart, high_contrast\n </style>\n <background>\n underwater, deep_blue_water, water_surface, waterline,\n caustics, light_rays, reflections\n </background>\n <atmosphere>\n cool, dramatic, cinematic, ethereal\n </atmosphere>\n <quality>\n masterpiece, best_quality, very_aesthetic, no_text\n </quality>\n <resolution>max_high_resolution</resolution>\n</general_tags>\n"
]
},
{
"id": 46,
"type": "MarkdownNote",
"pos": [
-77.16495034206478,
-42.42937518851968
],
"size": [
344.97886071896573,
240.85554679796087
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {},
"widgets_values": [
"## models\n* [NewBie-Image-Exp0.1-bf16.safetensors](https://huggingface.co/Comfy-Org/NewBie-image-Exp0.1_repackaged/blob/main/split_files/diffusion_models/NewBie-Image-Exp0.1-bf16.safetensors)\n* [gemma_3_4b_it_bf16.safetensors](https://huggingface.co/Comfy-Org/NewBie-image-Exp0.1_repackaged/blob/main/split_files/text_encoders/gemma_3_4b_it_bf16.safetensors)\n* [jina_clip_v2_bf16.safetensors](https://huggingface.co/Comfy-Org/NewBie-image-Exp0.1_repackaged/blob/main/split_files/text_encoders/jina_clip_v2_bf16.safetensors)\n* [ae.safetensors](https://huggingface.co/Comfy-Org/Lumina_Image_2.0_Repackaged/blob/main/split_files/vae/ae.safetensors)\n\n```\n📂ComfyUI/\n└──📂models/\n ├── 📂diffusion_models/\n │ └── NewBie-Image-Exp0.1-bf16.safetensors\n ├── 📂text_encoders/\n │ ├── gemma_3_4b_it_bf16.safetensors\n │ └── jina_clip_v2_bf16.safetensors\n └── 📂vae/\n └── ae.safetensors\n\n```"
],
"color": "#323",
"bgcolor": "#535"
}
],
"links": [
[
51,
27,
0,
31,
3,
"LATENT"
],
[
52,
31,
0,
8,
0,
"LATENT"
],
[
55,
33,
0,
31,
2,
"CONDITIONING"
],
[
62,
43,
0,
8,
1,
"VAE"
],
[
65,
41,
0,
45,
0,
"MODEL"
],
[
66,
45,
0,
31,
0,
"MODEL"
],
[
67,
6,
0,
31,
1,
"CONDITIONING"
],
[
68,
8,
0,
47,
0,
"IMAGE"
],
[
69,
49,
0,
6,
0,
"CLIP"
],
[
70,
49,
0,
33,
0,
"CLIP"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 0.9090909090909091,
"offset": [
308.41858706005223,
258.7552199824561
]
},
"frontendVersion": "1.36.7",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
XML-style prompts (structured with tags) are the recommended format.
<general_tags>
  <style>
    anime_style, key_visual, official_art, illustration,
    refined_lineart, clean_lineart, high_contrast
  </style>
  <background>
    underwater, deep_blue_water, water_surface, waterline,
    caustics, light_rays, reflections
  </background>
</general_tags>
That said, it also generates fine from natural-language prompts, so feel free to just try it casually first.
See the official prompt guide for details.
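The structured format also lends itself to programmatic assembly. A minimal sketch: the tag names follow the example above, while the helper itself is hypothetical:

```python
def general_tags(sections: dict[str, str]) -> str:
    """Render {tag: comma-separated values} into the <general_tags> block
    used by NewBie's XML-style prompts."""
    lines = ["<general_tags>"]
    for tag, values in sections.items():
        lines += [f"  <{tag}>", f"    {values}", f"  </{tag}>"]
    lines.append("</general_tags>")
    return "\n".join(lines)

print(general_tags({
    "style": "anime_style, key_visual, official_art",
    "background": "underwater, caustics, light_rays",
    "quality": "masterpiece, best_quality",
}))
```

The same pattern extends to the `<character_1>` block seen in the workflow's positive prompt; swapping section contents in a dict is easier than hand-editing nested tags.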