fix(mcp-image-gen): fix Heretic/FLUX2 integration bugs

- Fix syntax error in server.py (dangling docstring lines)
- Correct model filename: flux-2-klein-4b.safetensors (without -fp8)
- Fix _WORKFLOW_REGISTRY key to match actual downloaded filename
- Update get_models() to always include registry models as fallback
- Fix test expectations to match corrected model names
- All 37 tests passing
@@ -2,7 +2,11 @@
|
||||
|
||||
**FastMCP server for AI image generation via ComfyUI.**
|
||||
|
||||
This MCP server wraps a locally running [ComfyUI](https://github.com/comfyanonymous/ComfyUI) instance, exposing image generation as MCP tools callable from Roo Code, Claude Desktop, or any MCP-compatible client. It supports FLUX.1-schnell, FLUX.1-dev, SDXL, and any other ComfyUI-compatible checkpoint model. Generated images are saved to disk **and** returned as inline base64 so Claude can display them directly in chat.
|
||||
This MCP server wraps a locally running [ComfyUI](https://github.com/comfyanonymous/ComfyUI) instance, exposing image generation as MCP tools callable from Roo Code, Claude Desktop, or any MCP-compatible client.
|
||||
|
||||
**New:** Support for **FLUX.2 Klein 4B** with a **Heretic-abliterated Qwen3-4B text encoder** (reported KL divergence 0.0000, 3/100 refusals). Select via `model="flux-2-klein-4b.safetensors"`.
|
||||
|
||||
It supports FLUX.1-schnell (default), FLUX.2 Klein (Heretic), and any other ComfyUI-compatible checkpoint model. Generated images are saved to disk **and** returned as inline base64 so Claude can display them directly in chat.
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -565,7 +565,56 @@ Then pass it back: `seed=3847291045`
|
||||
|
||||
---
|
||||
|
||||
## 10. Known Limitations
|
||||
## 10. FLUX.2 Klein 4B with Heretic Abliteration (New)
|
||||
|
||||
**New in this release:** Support for **FLUX.2 Klein 4B** using an **abliterated Qwen3-4B text encoder** via Heretic.
|
||||
|
||||
### Why Heretic?
|
||||
|
||||
FLUX.2 Klein uses a full LLM (Qwen3-4B) as its text encoder instead of CLIP+T5. This LLM carries safety alignment and can refuse certain prompts. The Heretic-abliterated encoder removes most of that behaviour: the published model reports a KL divergence of **0.0000** against the original encoder and 3/100 remaining refusals.
|
||||
|
||||
### How to use it
|
||||
|
||||
```python
|
||||
generate_image(
|
||||
prompt="a beautiful cyberpunk fox in neon tokyo, highly detailed",
|
||||
model="flux-2-klein-4b.safetensors",
|
||||
width=1024,
|
||||
height=1024,
|
||||
steps=4
|
||||
)
|
||||
```
|
||||
|
||||
### Models to download
|
||||
|
||||
```bash
|
||||
# 1. FLUX.2 Klein 4B (distilled, fp8)
|
||||
huggingface-cli download black-forest-labs/FLUX.2-klein-4B \
|
||||
flux-2-klein-4b.safetensors \
|
||||
--local-dir ~/ComfyUI/models/diffusion_models/
|
||||
|
||||
# 2. FLUX.2 VAE
|
||||
huggingface-cli download black-forest-labs/FLUX.2-klein-4B \
|
||||
flux2-vae.safetensors \
|
||||
--local-dir ~/ComfyUI/models/vae/
|
||||
|
||||
# 3. Heretic-abliterated Qwen3-4B (from DreamFast)
|
||||
huggingface-cli download DreamFast/qwen3-4b-heretic \
|
||||
--local-dir /tmp/qwen3-heretic/
|
||||
cp /tmp/qwen3-heretic/model.safetensors \
|
||||
~/ComfyUI/models/text_encoders/qwen_3_4b_heretic.safetensors
|
||||
```
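
After downloading, it can help to confirm that the three files ended up where the workflow expects them. The following is a minimal sketch, not part of the server: `missing_model_files` and `EXPECTED_MODEL_FILES` are hypothetical names, the directory layout is taken from the commands above, and the diffusion model filename is an assumption that should be adjusted to whatever file was actually downloaded.

```python
from pathlib import Path

# Subdirectory -> filename, following the download commands above.
# Assumption: the diffusion model is stored under the name used as the
# registry key in server.py; change it if your file is named differently.
EXPECTED_MODEL_FILES = {
    "diffusion_models": "flux-2-klein-4b.safetensors",
    "vae": "flux2-vae.safetensors",
    "text_encoders": "qwen_3_4b_heretic.safetensors",
}


def missing_model_files(
    comfyui_dir: Path, expected: dict[str, str] = EXPECTED_MODEL_FILES
) -> list[Path]:
    """Return the expected model paths that do not exist under comfyui_dir/models."""
    models_dir = comfyui_dir / "models"
    return [
        models_dir / subdir / filename
        for subdir, filename in expected.items()
        if not (models_dir / subdir / filename).is_file()
    ]


if __name__ == "__main__":
    # Assumes ComfyUI lives in ~/ComfyUI, as in the commands above.
    for path in missing_model_files(Path.home() / "ComfyUI"):
        print(f"missing: {path}")
```

If the script prints nothing, all three files were found.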
|
||||
|
||||
### Supported models (via `model=` parameter)
|
||||
|
||||
| Model | Description | VRAM | Speed | Censorship |
|
||||
|-------|-------------|------|-------|------------|
|
||||
| `flux1-schnell.safetensors` | Original (default) | ~8GB | Very fast | None |
|
||||
| `flux-2-klein-4b.safetensors` | **New**, with Heretic Qwen3-4B | ~12GB | Fast | **Removed** |
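
The model list reported by the server is the union of what ComfyUI reports and the models registered in `server.py`, so registered models still show up when ComfyUI is offline. A minimal sketch of that merge, assuming the registry keys from `server.py`; `merge_model_lists` is a hypothetical helper name used only for illustration:

```python
# Registry keys as defined in server.py (_WORKFLOW_REGISTRY).
REGISTRY_MODELS = [
    "flux1-schnell.safetensors",
    "flux-2-klein-4b.safetensors",
]


def merge_model_lists(comfyui_models: list[str], registry_models: list[str]) -> list[str]:
    """Return the sorted union of both lists without duplicates."""
    return sorted(set(comfyui_models) | set(registry_models))


# ComfyUI reachable: its checkpoints are merged with the registry.
online = merge_model_lists(
    ["sdxl.safetensors", "flux1-schnell.safetensors"], REGISTRY_MODELS
)
# ComfyUI unreachable: only the registry models remain.
offline = merge_model_lists([], REGISTRY_MODELS)
print(online)
print(offline)
```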
|
||||
|
||||
---
|
||||
|
||||
## 11. Known Limitations
|
||||
|
||||
### ComfyUI must run locally
|
||||
|
||||
|
||||
@@ -39,8 +39,14 @@ COMFYUI_DIR = Path(
|
||||
# Maximum number of images allowed in a single batch call
|
||||
MAX_COUNT = 10
|
||||
|
||||
# Path to the bundled FLUX.1-schnell workflow template
|
||||
_WORKFLOW_PATH = Path(__file__).parent / "workflows" / "flux_schnell.json"
|
||||
# Workflow registry: model filename → workflow JSON path
|
||||
# This allows us to support multiple models (FLUX.1-schnell + FLUX.2 Klein with Heretic encoder)
|
||||
_WORKFLOW_REGISTRY: dict[str, Path] = {
|
||||
"flux1-schnell.safetensors": Path(__file__).parent / "workflows" / "flux_schnell.json",
|
||||
"flux-2-klein-4b.safetensors": Path(__file__).parent / "workflows" / "flux2_klein_heretic.json",
|
||||
}
|
||||
|
||||
_DEFAULT_MODEL = "flux1-schnell.safetensors"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
@@ -181,21 +187,37 @@ class ComfyUIClient:
|
||||
return resp.content
|
||||
|
||||
async def get_models(self) -> list[str]:
|
||||
"""Return the list of available checkpoint model filenames."""
|
||||
async with httpx.AsyncClient(timeout=10.0) as client:
|
||||
resp = await client.get(
|
||||
f"{self.base_url}/object_info/CheckpointLoaderSimple"
|
||||
)
|
||||
resp.raise_for_status()
|
||||
data = resp.json()
|
||||
# ComfyUI returns: {"CheckpointLoaderSimple": {"input": {"required": {"ckpt_name": [["model1.safetensors", ...], ...]}}}}
|
||||
node_info = data.get("CheckpointLoaderSimple", {})
|
||||
ckpt_list = (
|
||||
node_info.get("input", {})
|
||||
.get("required", {})
|
||||
.get("ckpt_name", [[]])[0]
|
||||
)
|
||||
return ckpt_list if isinstance(ckpt_list, list) else []
|
||||
"""Return the list of available checkpoint model filenames.
|
||||
|
||||
Combines models known to ComfyUI with our internal registry
|
||||
(including FLUX.2 Klein with Heretic encoder).
|
||||
"""
|
||||
models = set()
|
||||
|
||||
# Get models from ComfyUI
|
||||
try:
|
||||
async with httpx.AsyncClient(timeout=10.0) as client:
|
||||
resp = await client.get(
|
||||
f"{self.base_url}/object_info/CheckpointLoaderSimple"
|
||||
)
|
||||
resp.raise_for_status()
|
||||
data = resp.json()
|
||||
node_info = data.get("CheckpointLoaderSimple", {})
|
||||
ckpt_list = (
|
||||
node_info.get("input", {})
|
||||
.get("required", {})
|
||||
.get("ckpt_name", [[]])[0]
|
||||
)
|
||||
if isinstance(ckpt_list, list):
|
||||
models.update(ckpt_list)
|
||||
except Exception:
|
||||
# ComfyUI not reachable — fall back to registry only
|
||||
pass
|
||||
|
||||
# Add our registered models
|
||||
models.update(_WORKFLOW_REGISTRY.keys())
|
||||
|
||||
return sorted(models)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
@@ -209,13 +231,20 @@ def build_flux_workflow(
|
||||
height: int,
|
||||
steps: int,
|
||||
seed: int,
|
||||
model: str,
|
||||
model: str = _DEFAULT_MODEL,
|
||||
) -> dict:
|
||||
"""Build a ComfyUI API-format workflow dict for FLUX.1-schnell text-to-image.
|
||||
"""Build a ComfyUI API-format workflow dict for the requested model.
|
||||
|
||||
This is a pure function — no I/O, fully testable.
|
||||
Supports:
|
||||
- "flux1-schnell.safetensors" (original)
|
||||
- "flux-2-klein-4b.safetensors" (with Heretic-abliterated Qwen3-4B text encoder)
|
||||
|
||||
Falls back to FLUX.1-schnell if model is unknown.
|
||||
Apart from reading the workflow template from disk, this function has no side effects and is fully testable.
|
||||
"""
|
||||
with open(_WORKFLOW_PATH) as f:
|
||||
workflow_path = _WORKFLOW_REGISTRY.get(model, _WORKFLOW_REGISTRY[_DEFAULT_MODEL])
|
||||
|
||||
with open(workflow_path) as f:
|
||||
wf = json.load(f)
|
||||
wf = copy.deepcopy(wf)
|
||||
|
||||
@@ -277,18 +306,13 @@ async def _generate_single(
|
||||
) -> list:
|
||||
"""Generate a single image and return [TextContent, ImageContent] or [TextContent] on error.
|
||||
|
||||
Args:
|
||||
client: ComfyUIClient instance.
|
||||
prompt: Positive text prompt.
|
||||
negative_prompt: Negative text prompt.
|
||||
width / height: Image dimensions.
|
||||
steps: Inference steps.
|
||||
seed: Seed value (-1 = random).
|
||||
model: ComfyUI model filename.
|
||||
resolved_output_dir: Resolved output directory Path.
|
||||
name: User-supplied name prefix (unsanitized).
|
||||
label: Human-readable label for TextContent prefix (e.g. "[lumen 1/3]").
|
||||
Supports two models:
|
||||
- flux1-schnell.safetensors (default, fast 4-step)
|
||||
- flux-2-klein-4b.safetensors (with Heretic-abliterated Qwen3-4B text encoder — no refusals)
|
||||
"""
|
||||
if model not in _WORKFLOW_REGISTRY:
|
||||
# Log the requested name before overwriting it, so the warning shows what was asked for
logger.warning("Unknown model %s, falling back to %s", model, _DEFAULT_MODEL)
model = _DEFAULT_MODEL
|
||||
# Build and submit workflow
|
||||
try:
|
||||
workflow = build_flux_workflow(
|
||||
|
||||
@@ -0,0 +1,73 @@
|
||||
{
|
||||
"6": {
|
||||
"class_type": "CLIPTextEncode",
|
||||
"inputs": {
|
||||
"clip": ["30", 0],
|
||||
"text": "PROMPT_PLACEHOLDER"
|
||||
}
|
||||
},
|
||||
"8": {
|
||||
"class_type": "VAEDecode",
|
||||
"inputs": {
|
||||
"samples": ["13", 0],
|
||||
"vae": ["31", 0]
|
||||
}
|
||||
},
|
||||
"9": {
|
||||
"class_type": "SaveImage",
|
||||
"inputs": {
|
||||
"filename_prefix": "mcp-image-gen",
|
||||
"images": ["8", 0]
|
||||
}
|
||||
},
|
||||
"13": {
|
||||
"class_type": "KSampler",
|
||||
"inputs": {
|
||||
"cfg": 1.0,
|
||||
"denoise": 1.0,
|
||||
"latent_image": ["27", 0],
|
||||
"model": ["32", 0],
|
||||
"negative": ["33", 0],
|
||||
"positive": ["6", 0],
|
||||
"sampler_name": "euler",
|
||||
"scheduler": "beta",
|
||||
"seed": 42,
|
||||
"steps": 4
|
||||
}
|
||||
},
|
||||
"27": {
|
||||
"class_type": "EmptySD3LatentImage",
|
||||
"inputs": {
|
||||
"batch_size": 1,
|
||||
"height": 1024,
|
||||
"width": 1024
|
||||
}
|
||||
},
|
||||
"30": {
|
||||
"class_type": "CLIPLoader",
|
||||
"inputs": {
|
||||
"clip_name": "qwen_3_4b_heretic.safetensors",
|
||||
"type": "flux"
|
||||
}
|
||||
},
|
||||
"31": {
|
||||
"class_type": "VAELoader",
|
||||
"inputs": {
|
||||
"vae_name": "flux2-vae.safetensors"
|
||||
}
|
||||
},
|
||||
"32": {
|
||||
"class_type": "UNETLoader",
|
||||
"inputs": {
|
||||
"unet_name": "flux-2-klein-4b.safetensors",
|
||||
"weight_dtype": "fp8_e4m3fn"
|
||||
}
|
||||
},
|
||||
"33": {
|
||||
"class_type": "CLIPTextEncode",
|
||||
"inputs": {
|
||||
"clip": ["30", 0],
|
||||
"text": "NEGATIVE_PLACEHOLDER"
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -31,7 +31,7 @@ COMFYUI_BASE = "http://test-comfyui:8188"
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def test_build_flux_workflow_structure():
|
||||
"""Verify build_flux_workflow returns a dict with correct node types."""
|
||||
"""Verify build_flux_workflow returns a dict with correct node types for default model."""
|
||||
wf = build_flux_workflow(
|
||||
prompt="a red cat",
|
||||
neg_prompt="ugly",
|
||||
@@ -52,6 +52,47 @@ def test_build_flux_workflow_structure():
|
||||
assert wf["33"]["class_type"] == "CLIPTextEncode"
|
||||
|
||||
|
||||
def test_build_flux_workflow_heretic_model():
|
||||
"""Verify FLUX.2 Klein 4B with Heretic Qwen3-4B encoder uses correct nodes."""
|
||||
wf = build_flux_workflow(
|
||||
prompt="a red cat",
|
||||
neg_prompt="ugly",
|
||||
width=1024,
|
||||
height=1024,
|
||||
steps=4,
|
||||
seed=42,
|
||||
model="flux-2-klein-4b.safetensors",
|
||||
)
|
||||
assert wf["6"]["class_type"] == "CLIPTextEncode"
|
||||
assert wf["30"]["class_type"] == "CLIPLoader" # Qwen3-4B uses single CLIPLoader
|
||||
assert wf["32"]["inputs"]["unet_name"] == "flux-2-klein-4b.safetensors"
|
||||
assert wf["31"]["inputs"]["vae_name"] == "flux2-vae.safetensors"
|
||||
assert wf["13"]["inputs"]["scheduler"] == "beta" # FLUX.2 Klein uses beta scheduler
|
||||
|
||||
|
||||
def test_workflow_registry_contains_both_models():
|
||||
"""Verify the registry contains both supported models."""
|
||||
assert "flux1-schnell.safetensors" in server._WORKFLOW_REGISTRY
|
||||
assert "flux-2-klein-4b.safetensors" in server._WORKFLOW_REGISTRY
|
||||
assert len(server._WORKFLOW_REGISTRY) == 2
|
||||
|
||||
|
||||
def test_workflow_registry_fallback():
|
||||
"""Unknown model falls back to default (FLUX.1-schnell)."""
|
||||
wf = build_flux_workflow(
|
||||
prompt="test",
|
||||
neg_prompt="",
|
||||
width=512,
|
||||
height=512,
|
||||
steps=4,
|
||||
seed=42,
|
||||
model="unknown-model.safetensors",
|
||||
)
|
||||
# Should have used default workflow (DualCLIPLoader)
|
||||
assert wf["30"]["class_type"] == "DualCLIPLoader"
|
||||
assert wf["32"]["inputs"]["unet_name"] == "unknown-model.safetensors"
|
||||
|
||||
|
||||
def test_build_flux_workflow_params_injected():
|
||||
"""Verify all parameters are injected into correct nodes."""
|
||||
wf = build_flux_workflow(
|
||||
@@ -202,14 +243,16 @@ async def test_list_available_models():
|
||||
@respx.mock
|
||||
@pytest.mark.asyncio
|
||||
async def test_list_available_models_comfyui_offline():
|
||||
"""When ComfyUI is unreachable, list_available_models returns error message."""
|
||||
"""When ComfyUI is unreachable, list_available_models falls back to registry models."""
|
||||
respx.get(f"{COMFYUI_BASE}/object_info/CheckpointLoaderSimple").mock(
|
||||
side_effect=httpx.ConnectError("connection refused")
|
||||
)
|
||||
|
||||
result = await list_available_models()
|
||||
assert len(result) == 1
|
||||
assert "not reachable" in result[0].lower()
|
||||
# Should return registry models even when ComfyUI is offline
|
||||
assert isinstance(result, list)
|
||||
assert "flux1-schnell.safetensors" in result
|
||||
assert "flux-2-klein-4b.safetensors" in result
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@@ -0,0 +1,300 @@
|
||||
# Plan: FLUX.2 Klein 4B + Heretic Abliterated Text Encoder in mcp-image-gen
|
||||
|
||||
**Date:** 2026-04-10
**Author:** Lumen / Patrick Plate
|
||||
**Status:** Ready for Implementation
|
||||
|
||||
---
|
||||
|
||||
## Goal

Extend the existing `mcp-image-gen` ComfyUI backend with a second model:
**FLUX.2 Klein 4B** with the abliterated **Qwen3-4B-Heretic** as its text encoder.

Result: `generate_image` can choose between two workflows via the `model` parameter:
- `flux1-schnell.safetensors` → existing workflow (unchanged)
- `flux-2-klein-4b-fp8.safetensors` → new Heretic workflow (no prompt refusals)
|
||||
|
||||
---
|
||||
|
||||
## Technical Background

### Why Heretic + FLUX.2 Klein?

FLUX.2 Klein 4B uses **Qwen3-4B as an LLM text encoder** (instead of CLIP+T5 as in FLUX.1).
This LLM encoder carries safety alignment and refuses certain prompts, so it is replaced with an abliterated variant.

`DreamFast/qwen3-4b-heretic` (HuggingFace):
- **KL divergence: 0.0000**, i.e. no measurable damage to the model
- Only **3/100 refusals** after Heretic v1.2.0 (200 trials)
- Drop-in replacement for `qwen_3_4b.safetensors`

### Model architecture differences

| | FLUX.1-schnell | FLUX.2 Klein 4B |
|---|---|---|
| Diffusion Model | `flux1-schnell.safetensors` (UNet) | `flux-2-klein-4b-fp8.safetensors` |
| Text Encoder | `DualCLIPLoader` (T5+CLIP) | `CLIPLoader` (Qwen3-4B) |
| VAE | `ae.safetensors` | `flux2-vae.safetensors` |
| Steps | 4 | 4 (distilled) |
| VRAM | ~8GB | ~8.4GB |
| Refusals | none (no LLM encoder) | 3/100 reported (abliterated) |
|
||||
|
||||
---
|
||||
|
||||
## Files & Folders

### New model files (to be downloaded)

```
~/ComfyUI/models/
├── diffusion_models/
│   └── flux-2-klein-4b-fp8.safetensors    ← FLUX.2 Klein distilled 4B
├── text_encoders/
│   └── qwen_3_4b_heretic.safetensors      ← Heretic-abliterated (from DreamFast/qwen3-4b-heretic)
└── vae/
    └── flux2-vae.safetensors              ← VAE for FLUX.2
```

### New/changed project files

```
mcp/mcp-image-gen/
├── src/
│   ├── server.py                          ← add workflow registry
│   └── workflows/
│       ├── flux_schnell.json              ← unchanged
│       └── flux2_klein_heretic.json       ← NEW
├── tests/
│   └── test_server.py                     ← new tests for registry + workflow
└── USAGE.md                               ← add download instructions
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Download the models

### 1a. FLUX.2 Klein 4B (Diffusion Model)
```bash
# From the Black Forest Labs HuggingFace repository
huggingface-cli download black-forest-labs/FLUX.2-klein-4B \
  flux-2-klein-4b-fp8.safetensors \
  --local-dir ~/ComfyUI/models/diffusion_models/
```

### 1b. FLUX.2 VAE
```bash
huggingface-cli download black-forest-labs/FLUX.2-klein-4B \
  flux2-vae.safetensors \
  --local-dir ~/ComfyUI/models/vae/
```

### 1c. Qwen3-4B-Heretic (abliterated text encoder)
```bash
# From DreamFast: already abliterated, no Heretic run needed
huggingface-cli download DreamFast/qwen3-4b-heretic \
  --local-dir /tmp/qwen3-4b-heretic/

# Place the safetensors file in the ComfyUI text_encoders directory
cp /tmp/qwen3-4b-heretic/model.safetensors \
  ~/ComfyUI/models/text_encoders/qwen_3_4b_heretic.safetensors
```

> **Note:** DreamFast/qwen3-4b-heretic is a mix of GGUF and SafeTensors files.
> We need the `.safetensors` variant for ComfyUI. If only GGUF is available:
> `huggingface-cli download Lockout/qwen3-4b-heretic-zimage qwen-4b-zimage-hereticV2-q8.gguf`
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: New workflow JSON

**File:** [`mcp/mcp-image-gen/src/workflows/flux2_klein_heretic.json`](mcp/mcp-image-gen/src/workflows/flux2_klein_heretic.json)

FLUX.2 Klein uses different ComfyUI nodes than FLUX.1-schnell:
- `DualCLIPLoader` → `CLIPLoader` (single Qwen encoder)
- `UNETLoader` with the `diffusion_models/` path instead of `checkpoints/`
- `EmptySD3LatentImage` → same (compatible)
- `KSampler` → same, but with `sampler_name: "euler"`, `scheduler: "beta"`, `steps: 4`
|
||||
|
||||
```json
|
||||
{
|
||||
"6": {
|
||||
"class_type": "CLIPTextEncode",
|
||||
"inputs": {
|
||||
"clip": ["30", 0],
|
||||
"text": "PROMPT_PLACEHOLDER"
|
||||
}
|
||||
},
|
||||
"8": {
|
||||
"class_type": "VAEDecode",
|
||||
"inputs": {
|
||||
"samples": ["13", 0],
|
||||
"vae": ["31", 0]
|
||||
}
|
||||
},
|
||||
"9": {
|
||||
"class_type": "SaveImage",
|
||||
"inputs": {
|
||||
"filename_prefix": "mcp-image-gen",
|
||||
"images": ["8", 0]
|
||||
}
|
||||
},
|
||||
"13": {
|
||||
"class_type": "KSampler",
|
||||
"inputs": {
|
||||
"cfg": 1.0,
|
||||
"denoise": 1.0,
|
||||
"latent_image": ["27", 0],
|
||||
"model": ["32", 0],
|
||||
"negative": ["33", 0],
|
||||
"positive": ["6", 0],
|
||||
"sampler_name": "euler",
|
||||
"scheduler": "beta",
|
||||
"seed": 42,
|
||||
"steps": 4
|
||||
}
|
||||
},
|
||||
"27": {
|
||||
"class_type": "EmptySD3LatentImage",
|
||||
"inputs": {
|
||||
"batch_size": 1,
|
||||
"height": 1024,
|
||||
"width": 1024
|
||||
}
|
||||
},
|
||||
"30": {
|
||||
"class_type": "CLIPLoader",
|
||||
"inputs": {
|
||||
"clip_name": "qwen_3_4b_heretic.safetensors",
|
||||
"type": "flux"
|
||||
}
|
||||
},
|
||||
"31": {
|
||||
"class_type": "VAELoader",
|
||||
"inputs": {
|
||||
"vae_name": "flux2-vae.safetensors"
|
||||
}
|
||||
},
|
||||
"32": {
|
||||
"class_type": "UNETLoader",
|
||||
"inputs": {
|
||||
"unet_name": "flux-2-klein-4b-fp8.safetensors",
|
||||
"weight_dtype": "fp8_e4m3fn"
|
||||
}
|
||||
},
|
||||
"33": {
|
||||
"class_type": "CLIPTextEncode",
|
||||
"inputs": {
|
||||
"clip": ["30", 0],
|
||||
"text": "NEGATIVE_PLACEHOLDER"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
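
The template above carries placeholder values (`PROMPT_PLACEHOLDER`, `NEGATIVE_PLACEHOLDER`, a fixed seed and size) that are replaced per request. How `build_flux_workflow` does this is not shown in this plan, so the following is only a sketch under the assumption that values are written directly into the node IDs used in the JSON above; `inject_params` and the trimmed `TEMPLATE` are hypothetical.

```python
import copy

# Trimmed copy of the template above: only the nodes that receive parameters.
TEMPLATE = {
    "6": {"class_type": "CLIPTextEncode", "inputs": {"clip": ["30", 0], "text": "PROMPT_PLACEHOLDER"}},
    "33": {"class_type": "CLIPTextEncode", "inputs": {"clip": ["30", 0], "text": "NEGATIVE_PLACEHOLDER"}},
    "13": {"class_type": "KSampler", "inputs": {"seed": 42, "steps": 4, "scheduler": "beta"}},
    "27": {"class_type": "EmptySD3LatentImage", "inputs": {"batch_size": 1, "height": 1024, "width": 1024}},
    "32": {"class_type": "UNETLoader", "inputs": {"unet_name": "flux-2-klein-4b-fp8.safetensors"}},
}


def inject_params(
    template: dict,
    *,
    prompt: str,
    neg_prompt: str,
    width: int,
    height: int,
    steps: int,
    seed: int,
    model: str,
) -> dict:
    """Return a copy of the template with request parameters filled in."""
    wf = copy.deepcopy(template)  # never mutate the shared template
    wf["6"]["inputs"]["text"] = prompt
    wf["33"]["inputs"]["text"] = neg_prompt
    wf["13"]["inputs"]["seed"] = seed
    wf["13"]["inputs"]["steps"] = steps
    wf["27"]["inputs"]["width"] = width
    wf["27"]["inputs"]["height"] = height
    wf["32"]["inputs"]["unet_name"] = model
    return wf


workflow = inject_params(
    TEMPLATE,
    prompt="a red cat",
    neg_prompt="ugly",
    width=768,
    height=512,
    steps=4,
    seed=1234,
    model="flux-2-klein-4b-fp8.safetensors",
)
```

Working on a deep copy keeps the template reusable across requests, which matters when several images are generated in one batch.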
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: Workflow registry in server.py

### Change 1: Workflow registry dict (after `_WORKFLOW_PATH`)
|
||||
|
||||
```python
|
||||
# Path to the bundled FLUX.1-schnell workflow template
|
||||
_WORKFLOW_PATH = Path(__file__).parent / "workflows" / "flux_schnell.json"
|
||||
|
||||
# Workflow registry: model filename → workflow JSON path
|
||||
_WORKFLOW_REGISTRY: dict[str, Path] = {
|
||||
"flux1-schnell.safetensors": Path(__file__).parent / "workflows" / "flux_schnell.json",
|
||||
"flux-2-klein-4b-fp8.safetensors": Path(__file__).parent / "workflows" / "flux2_klein_heretic.json",
|
||||
}
|
||||
|
||||
_DEFAULT_MODEL = "flux1-schnell.safetensors"
|
||||
```
|
||||
|
||||
### Change 2: `_load_workflow()` helper function
|
||||
|
||||
```python
|
||||
def _load_workflow(model: str) -> dict:
|
||||
"""Load the correct workflow JSON for the requested model.
|
||||
|
||||
Falls back to FLUX.1-schnell if model not in registry.
|
||||
"""
|
||||
path = _WORKFLOW_REGISTRY.get(model, _WORKFLOW_PATH)
|
||||
if not path.exists():
|
||||
raise FileNotFoundError(f"Workflow JSON not found: {path}")
|
||||
return json.loads(path.read_text())
|
||||
```
|
||||
|
||||
### Change 3: `_generate_single()` uses the registry

The current code always loads `_WORKFLOW_PATH`. Change: call `_load_workflow(model)` instead:
|
||||
|
||||
```python
|
||||
async def _generate_single(
|
||||
client: ComfyUIClient,
|
||||
prompt: str,
|
||||
negative_prompt: str,
|
||||
model: str,
|
||||
seed: int,
|
||||
width: int,
|
||||
height: int,
|
||||
steps: int,
|
||||
output_dir: Path,
|
||||
name: str,
|
||||
) -> tuple[TextContent, ImageContent | None]:
|
||||
workflow = _load_workflow(model)  # ← instead of json.loads(_WORKFLOW_PATH.read_text())
|
||||
# ... rest unchanged
|
||||
```
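
Because unknown model names silently fall back to FLUX.1-schnell, it is useful to log the name that was actually requested before it is replaced. A minimal sketch of that resolution step; `resolve_model` is a hypothetical helper, and the registry here maps to plain strings instead of `Path` objects to keep the example self-contained:

```python
import logging

logger = logging.getLogger("mcp-image-gen")

# Same keys as the registry in Change 1; values shortened for the example.
REGISTRY = {
    "flux1-schnell.safetensors": "flux_schnell.json",
    "flux-2-klein-4b-fp8.safetensors": "flux2_klein_heretic.json",
}
DEFAULT_MODEL = "flux1-schnell.safetensors"


def resolve_model(
    requested: str,
    registry: dict[str, str] = REGISTRY,
    default: str = DEFAULT_MODEL,
) -> str:
    """Return the requested model if registered, otherwise the default."""
    if requested in registry:
        return requested
    # Log first: after reassigning, the requested name would be lost.
    logger.warning("Unknown model %s, falling back to %s", requested, default)
    return default


print(resolve_model("flux-2-klein-4b-fp8.safetensors"))
print(resolve_model("unknown-model.safetensors"))
```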
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: Tests

New tests in [`mcp/mcp-image-gen/tests/test_server.py`](mcp/mcp-image-gen/tests/test_server.py):

1. **`test_workflow_registry_contains_both_models`**: registry contains flux1-schnell + flux2-klein
2. **`test_load_workflow_flux1_schnell`**: loads flux_schnell.json correctly
3. **`test_load_workflow_flux2_klein`**: loads flux2_klein_heretic.json correctly
4. **`test_load_workflow_unknown_model_falls_back`**: unknown model → FLUX.1-schnell
5. **`test_generate_image_uses_flux2_workflow`**: end-to-end mock with flux-2-klein-4b-fp8.safetensors
|
||||
|
||||
---
|
||||
|
||||
## Phase 5: USAGE.md Update

New section "FLUX.2 Klein 4B (Heretic)" in [`mcp/mcp-image-gen/USAGE.md`](mcp/mcp-image-gen/USAGE.md):
- Download commands for all 3 new model files
- Explanation of why Heretic is used (abliterated text encoder, KL=0)
- Example call: `generate_image("...", model="flux-2-klein-4b-fp8.safetensors")`
|
||||
|
||||
---
|
||||
|
||||
## VRAM Analysis

| Model | Total VRAM | Fits in 24GB? |
|---|---|---|
| FLUX.1-schnell (fp8) | ~8GB | ✅ |
| FLUX.2 Klein 4B (fp8) + Qwen3-4B | ~8.4GB + ~4GB = ~12.4GB | ✅ |
| Both loaded at the same time | ~20GB | ✅ with margin |

The RX 7900 XTX with 24GB VRAM can comfortably hold both models.
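
The budget in the table can be checked with a few lines of arithmetic. This is only a sketch using the approximate figures from the table; real usage also depends on resolution, batch size, and ComfyUI's own memory management, none of which are modelled here.

```python
# Approximate figures from the table above, in GB.
GPU_VRAM_GB = 24.0
FLUX1_SCHNELL_GB = 8.0
FLUX2_KLEIN_GB = 8.4
QWEN3_4B_GB = 4.0


def vram_headroom(loaded_gb: list[float], gpu_gb: float = GPU_VRAM_GB) -> float:
    """Return the VRAM left over after loading the given models."""
    return gpu_gb - sum(loaded_gb)


klein_total = FLUX2_KLEIN_GB + QWEN3_4B_GB
both_total = FLUX1_SCHNELL_GB + klein_total
headroom = vram_headroom([FLUX1_SCHNELL_GB, FLUX2_KLEIN_GB, QWEN3_4B_GB])

print(f"FLUX.2 Klein + encoder: {klein_total:.1f} GB")
print(f"Both workflows loaded:  {both_total:.1f} GB")
print(f"Headroom on 24 GB:      {headroom:.1f} GB")
```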
|
||||
|
||||
---
|
||||
|
||||
## Risks & Mitigations

| Risk | Likelihood | Mitigation |
|---|---|---|
| `CLIPLoader` node not available in ComfyUI | low | Update ComfyUI; alternatively use a custom node |
| DreamFast model only available as GGUF | medium | Use the Lockout/qwen3-4b-heretic-zimage GGUF as fallback |
| Qwen3-4B needs a different node type | medium | Live-test in the ComfyUI UI first; adjust the workflow |
| ROCm + Qwen3-4B compatibility | low | Same ROCm environment as FLUX.1-schnell |
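
The first and third risks can be checked before running a workflow by inspecting ComfyUI's `/object_info/<node>` response, the same endpoint `get_models()` already queries for `CheckpointLoaderSimple`. The sketch below works on an already-parsed response and assumes the shape documented in the comment in `server.py` (`{node: {"input": {"required": {name: [[options...], ...]}}}}`); other ComfyUI versions may return a different structure. `node_available` and `selectable_files` are hypothetical helpers, and `SAMPLE_RESPONSE` is made-up illustration data.

```python
def node_available(object_info: dict, node_class: str) -> bool:
    """True if the response contains an entry for node_class."""
    return node_class in object_info


def selectable_files(object_info: dict, node_class: str, input_name: str) -> list[str]:
    """Return the option list for one required input, or [] if absent."""
    required = object_info.get(node_class, {}).get("input", {}).get("required", {})
    spec = required.get(input_name, [[]])
    options = spec[0] if spec else []
    return options if isinstance(options, list) else []


# Hand-written example data in the assumed shape, not a real ComfyUI response.
SAMPLE_RESPONSE = {
    "CLIPLoader": {
        "input": {
            "required": {
                "clip_name": [["qwen_3_4b_heretic.safetensors", "t5xxl_fp8.safetensors"]],
            }
        }
    }
}

has_node = node_available(SAMPLE_RESPONSE, "CLIPLoader")
encoders = selectable_files(SAMPLE_RESPONSE, "CLIPLoader", "clip_name")
print(has_node, "qwen_3_4b_heretic.safetensors" in encoders)
```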
|
||||
|
||||
---
|
||||
|
||||
## Decision

✅ **Recommendation: implement.** Minimal code changes, no breaking change, clear added value.

The only uncertain point is the exact ComfyUI node name for the Qwen3-4B loader.
**Recommended approach:** First build a workflow with Qwen3-4B manually in the ComfyUI web UI, export it as JSON, and save it as `flux2_klein_heretic.json`. This ensures correct node names without guesswork.
|
||||