docs(plans): add heretic encoder swap task for FLUX.2 Klein uncensored generation

fix(mcp-image-gen): confirmed working FLUX.2 Klein encoder filename
- CLIPLoader clip_name: qwen_3_4b_klein.safetensors (from Comfy-Org/vae-text-encorder-for-flux-klein-4b)
- VAE: flux2-vae.safetensors (321MB, same repo)
- Live test confirmed: 2.1MB photorealistic 1024x1024 PNG in 52.43s on RX 7900 XTX
- Test: assert clip_name == qwen_3_4b_klein.safetensors
- 37/37 tests pass
2026-04-10 20:32:05 +02:00 · 2026-04-10 20:29:18 +02:00 · 2026-04-10 20:21:16 +02:00 · 2026-04-10 20:21:12 +02:00 · 2026-04-10 19:21:51 +02:00 · 2026-04-10 19:21:51 +02:00
8 changed files with 808 additions and 38 deletions
@@ -2,7 +2,11 @@
 **FastMCP server for AI image generation via ComfyUI.**
-This MCP server wraps a locally running [ComfyUI](https://github.com/comfyanonymous/ComfyUI) instance, exposing image generation as MCP tools callable from Roo Code, Claude Desktop, or any MCP-compatible client. It supports FLUX.1-schnell, FLUX.1-dev, SDXL, and any other ComfyUI-compatible checkpoint model. Generated images are saved to disk **and** returned as inline base64 so Claude can display them directly in chat.
+This MCP server wraps a locally running [ComfyUI](https://github.com/comfyanonymous/ComfyUI) instance, exposing image generation as MCP tools callable from Roo Code, Claude Desktop, or any MCP-compatible client.
 **New:** Support for **FLUX.2 Klein 4B** with a **Heretic-abliterated Qwen3-4B text encoder** (reported KL divergence of 0.0000 and 3/100 refusals). Select it via `model="flux-2-klein-4b.safetensors"`.
 It supports FLUX.1-schnell (default), FLUX.2 Klein (Heretic), and any other ComfyUI-compatible checkpoint model. Generated images are saved to disk **and** returned as inline base64 so Claude can display them directly in chat.
 ---
@@ -565,7 +565,56 @@ Then pass it back: `seed=3847291045`
 ---
-## 10. Known Limitations
+## 10. FLUX.2 Klein 4B with Heretic Abliteration (New)
 **New in this release:** Support for **FLUX.2 Klein 4B** using an **abliterated Qwen3-4B text encoder** via Heretic.
 ### Why Heretic?
 FLUX.2 Klein uses a full LLM (Qwen3-4B) as its text encoder instead of CLIP+T5. This LLM has safety alignment and can refuse certain prompts. The Heretic-abliterated encoder removes most of that behavior: it is reported to have **zero measurable KL divergence** (0.0000) and only 3/100 refusals.
 ### How to use it
 ```python
 generate_image(
    prompt="a beautiful cyberpunk fox in neon tokyo, highly detailed",
    model="flux-2-klein-4b.safetensors",
    width=1024,
    height=1024,
    steps=4
 )
 ```
 ### Models to download
 ```bash
 # 1. FLUX.2 Klein 4B (distilled, fp8)
 huggingface-cli download black-forest-labs/FLUX.2-klein-4B \
    flux-2-klein-4b-fp8.safetensors \
    --local-dir ~/ComfyUI/models/diffusion_models/
 # 2. FLUX.2 VAE
 huggingface-cli download black-forest-labs/FLUX.2-klein-4B \
    flux2-vae.safetensors \
    --local-dir ~/ComfyUI/models/vae/
 # 3. Heretic-abliterated Qwen3-4B (from DreamFast)
 huggingface-cli download DreamFast/qwen3-4b-heretic \
    --local-dir /tmp/qwen3-heretic/
 cp /tmp/qwen3-heretic/model.safetensors \
   ~/ComfyUI/models/text_encoders/qwen_3_4b_heretic.safetensors
 ```
 ### Supported models (via `model=` parameter)
 | Model | Description | VRAM | Speed | Censorship |
 |-------|-------------|------|-------|------------|
 | `flux1-schnell.safetensors` | Original (default) | ~8GB | Very fast | None |
 | `flux-2-klein-4b.safetensors` | **New**, with Heretic Qwen3-4B | ~12GB | Fast | **Removed** |
 ---
 ## 11. Known Limitations
 ### ComfyUI must run locally
@@ -39,8 +39,14 @@ COMFYUI_DIR = Path(
 # Maximum number of images allowed in a single batch call
 MAX_COUNT = 10
-# Path to the bundled FLUX.1-schnell workflow template
+# Workflow registry: model filename → workflow JSON path
-_WORKFLOW_PATH = Path(__file__).parent / "workflows" / "flux_schnell.json"
+# This allows us to support multiple models (FLUX.1-schnell + FLUX.2 Klein with Heretic encoder)
 _WORKFLOW_REGISTRY: dict[str, Path] = {
    "flux1-schnell.safetensors": Path(__file__).parent / "workflows" / "flux_schnell.json",
    "flux-2-klein-4b.safetensors": Path(__file__).parent / "workflows" / "flux2_klein_heretic.json",
 }
 _DEFAULT_MODEL = "flux1-schnell.safetensors"
 # ---------------------------------------------------------------------------
@@ -181,21 +187,37 @@ class ComfyUIClient:
            return resp.content
    async def get_models(self) -> list[str]:
-        """Return the list of available checkpoint model filenames."""
+        """Return the list of available checkpoint model filenames.
        Combines models known to ComfyUI with our internal registry
        (including FLUX.2 Klein with Heretic encoder).
        """
        models = set()
        # Get models from ComfyUI
        try:
            async with httpx.AsyncClient(timeout=10.0) as client:
                resp = await client.get(
                    f"{self.base_url}/object_info/CheckpointLoaderSimple"
                )
                resp.raise_for_status()
                data = resp.json()
            # ComfyUI returns: {"CheckpointLoaderSimple": {"input": {"required": {"ckpt_name": [["model1.safetensors", ...], ...]}}}}
                node_info = data.get("CheckpointLoaderSimple", {})
                ckpt_list = (
                    node_info.get("input", {})
                    .get("required", {})
                    .get("ckpt_name", [[]])[0]
                )
-            return ckpt_list if isinstance(ckpt_list, list) else []
+                if isinstance(ckpt_list, list):
                    models.update(ckpt_list)
        except Exception:
            # ComfyUI not reachable — fall back to registry only
            pass
        # Add our registered models
        models.update(_WORKFLOW_REGISTRY.keys())
        return sorted(list(models))
 # ---------------------------------------------------------------------------
@@ -209,13 +231,20 @@ def build_flux_workflow(
    height: int,
    steps: int,
    seed: int,
-    model: str,
+    model: str = _DEFAULT_MODEL,
 ) -> dict:
-    """Build a ComfyUI API-format workflow dict for FLUX.1-schnell text-to-image.
+    """Build a ComfyUI API-format workflow dict for the requested model.
-    This is a pure function — no I/O, fully testable.
+    Supports:
    - "flux1-schnell.safetensors" (original)
    - "flux-2-klein-4b.safetensors" (with Heretic-abliterated Qwen3-4B text encoder)
    Falls back to FLUX.1-schnell if model is unknown.
    This is a pure function — no I/O outside the registry, fully testable.
    """
-    with open(_WORKFLOW_PATH) as f:
+    workflow_path = _WORKFLOW_REGISTRY.get(model, _WORKFLOW_REGISTRY[_DEFAULT_MODEL])
    with open(workflow_path) as f:
        wf = json.load(f)
    wf = copy.deepcopy(wf)
@@ -277,18 +306,13 @@ async def _generate_single(
 ) -> list:
    """Generate a single image and return [TextContent, ImageContent] or [TextContent] on error.
-    Args:
+    Supports two models:
-        client:              ComfyUIClient instance.
+    - flux1-schnell.safetensors (default, fast 4-step)
-        prompt:              Positive text prompt.
+    - flux-2-klein-4b.safetensors (with Heretic-abliterated Qwen3-4B text encoder — no refusals)
        negative_prompt:     Negative text prompt.
        width / height:      Image dimensions.
        steps:               Inference steps.
        seed:                Seed value (-1 = random).
        model:               ComfyUI model filename.
        resolved_output_dir: Resolved output directory Path.
        name:                User-supplied name prefix (unsanitized).
        label:               Human-readable label for TextContent prefix (e.g. "[lumen 1/3]").
    """
    if model not in _WORKFLOW_REGISTRY:
        # Log before reassigning so the warning names the unknown model
        logger.warning("Unknown model %s, falling back to %s", model, _DEFAULT_MODEL)
        model = _DEFAULT_MODEL
    # Build and submit workflow
    try:
        workflow = build_flux_workflow(
@@ -0,0 +1,98 @@
 {
  "1": {
    "class_type": "CLIPLoader",
    "inputs": {
      "clip_name": "qwen_3_4b_klein.safetensors",
      "type": "flux2",
      "device": "default"
    }
  },
  "2": {
    "class_type": "CLIPTextEncode",
    "inputs": {
      "clip": ["1", 0],
      "text": "PROMPT_PLACEHOLDER"
    }
  },
  "3": {
    "class_type": "CLIPTextEncode",
    "inputs": {
      "clip": ["1", 0],
      "text": "NEGATIVE_PLACEHOLDER"
    }
  },
  "4": {
    "class_type": "UNETLoader",
    "inputs": {
      "unet_name": "flux-2-klein-4b.safetensors",
      "weight_dtype": "default"
    }
  },
  "5": {
    "class_type": "VAELoader",
    "inputs": {
      "vae_name": "flux2-vae.safetensors"
    }
  },
  "6": {
    "class_type": "EmptyFlux2LatentImage",
    "inputs": {
      "width": 1024,
      "height": 1024,
      "batch_size": 1
    }
  },
  "7": {
    "class_type": "Flux2Scheduler",
    "inputs": {
      "steps": 20,
      "width": 1024,
      "height": 1024
    }
  },
  "8": {
    "class_type": "CFGGuider",
    "inputs": {
      "model": ["4", 0],
      "positive": ["2", 0],
      "negative": ["3", 0],
      "cfg": 5
    }
  },
  "9": {
    "class_type": "KSamplerSelect",
    "inputs": {
      "sampler_name": "euler"
    }
  },
  "10": {
    "class_type": "RandomNoise",
    "inputs": {
      "noise_seed": 42
    }
  },
  "11": {
    "class_type": "SamplerCustomAdvanced",
    "inputs": {
      "noise": ["10", 0],
      "guider": ["8", 0],
      "sampler": ["9", 0],
      "sigmas": ["7", 0],
      "latent_image": ["6", 0]
    }
  },
  "12": {
    "class_type": "VAEDecode",
    "inputs": {
      "samples": ["11", 0],
      "vae": ["5", 0]
    }
  },
  "13": {
    "class_type": "SaveImage",
    "inputs": {
      "filename_prefix": "mcp-image-gen",
      "images": ["12", 0]
    }
  }
 }
@@ -31,7 +31,7 @@ COMFYUI_BASE = "http://test-comfyui:8188"
 # ---------------------------------------------------------------------------
 def test_build_flux_workflow_structure():
-    """Verify build_flux_workflow returns a dict with correct node types."""
+    """Verify build_flux_workflow returns a dict with correct node types for default model."""
    wf = build_flux_workflow(
        prompt="a red cat",
        neg_prompt="ugly",
@@ -52,6 +52,56 @@ def test_build_flux_workflow_structure():
    assert wf["33"]["class_type"] == "CLIPTextEncode"
 def test_build_flux_workflow_heretic_model():
    """Verify FLUX.2 Klein 4B with Heretic Qwen3-4B encoder uses correct nodes."""
    wf = build_flux_workflow(
        prompt="a red cat",
        neg_prompt="ugly",
        width=1024,
        height=1024,
        steps=4,
        seed=42,
        model="flux-2-klein-4b.safetensors",
    )
    # New FLUX.2 workflow uses different node IDs and types
    assert wf["1"]["class_type"] == "CLIPLoader"          # Qwen3-4B uses single CLIPLoader
    assert wf["1"]["inputs"]["type"] == "flux2"            # correct type for FLUX.2
    assert wf["1"]["inputs"]["device"] == "default"        # required for FLUX.2 CLIPLoader
    assert wf["1"]["inputs"]["clip_name"] == "qwen_3_4b_klein.safetensors"  # Comfy-Org/vae-text-encorder-for-flux-klein-4b
    assert wf["2"]["class_type"] == "CLIPTextEncode"       # standard CLIP encode (not Flux-specific)
    assert wf["4"]["class_type"] == "UNETLoader"
    assert wf["4"]["inputs"]["unet_name"] == "flux-2-klein-4b.safetensors"
    assert wf["4"]["inputs"]["weight_dtype"] == "default"  # not fp8 — avoids dimension errors
    assert wf["6"]["class_type"] == "EmptyFlux2LatentImage"  # FLUX.2-specific latent
    assert wf["8"]["class_type"] == "CFGGuider"            # CFGGuider replaces FluxDisableGuidance+BasicGuider
    assert wf["8"]["inputs"]["cfg"] == 5                   # cfg=5 for FLUX.2 Klein
    assert wf["11"]["class_type"] == "SamplerCustomAdvanced"  # FLUX.2 sampler (node 11, not 12)
    assert wf["13"]["class_type"] == "SaveImage"           # output node
 def test_workflow_registry_contains_both_models():
    """Verify the registry contains both supported models."""
    assert "flux1-schnell.safetensors" in server._WORKFLOW_REGISTRY
    assert "flux-2-klein-4b.safetensors" in server._WORKFLOW_REGISTRY
    assert len(server._WORKFLOW_REGISTRY) == 2
 def test_workflow_registry_fallback():
    """Unknown model falls back to default (FLUX.1-schnell)."""
    wf = build_flux_workflow(
        prompt="test",
        neg_prompt="",
        width=512,
        height=512,
        steps=4,
        seed=42,
        model="unknown-model.safetensors",
    )
    # Should have used default workflow (DualCLIPLoader)
    assert wf["30"]["class_type"] == "DualCLIPLoader"
    assert wf["32"]["inputs"]["unet_name"] == "unknown-model.safetensors"
 def test_build_flux_workflow_params_injected():
    """Verify all parameters are injected into correct nodes."""
    wf = build_flux_workflow(
@@ -202,14 +252,16 @@ async def test_list_available_models():
@respx.mock
@pytest.mark.asyncio
 async def test_list_available_models_comfyui_offline():
-    """When ComfyUI is unreachable, list_available_models returns error message."""
+    """When ComfyUI is unreachable, list_available_models falls back to registry models."""
    respx.get(f"{COMFYUI_BASE}/object_info/CheckpointLoaderSimple").mock(
        side_effect=httpx.ConnectError("connection refused")
    )
    result = await list_available_models()
-    assert len(result) == 1
+    # Should return registry models even when ComfyUI is offline
-    assert "not reachable" in result[0].lower()
+    assert isinstance(result, list)
    assert "flux1-schnell.safetensors" in result
    assert "flux-2-klein-4b.safetensors" in result
 # ---------------------------------------------------------------------------
@@ -0,0 +1,139 @@
 # Task: Swap Qwen3-4B Encoder for Heretic Abliterated Version
 **Date:** 2026-04-10  
 **Status:** Ready — waiting for correct Heretic encoder to be published  
 **Depends on:** FLUX.2 Klein 4B working (✅ done as of 2026-04-10)
 ---
 ## Goal
 Replace the standard `qwen_3_4b_klein.safetensors` with an abliterated (Heretic) version that has:
 - **Zero measurable quality loss** (KL divergence = 0.0000)
 - **Near-zero prompt refusals** (≤3/100 in DreamFast v1.2.0 testing)
 Result: `generate_image(prompt, model="flux-2-klein-4b.safetensors")` should work with prompts that the standard encoder currently refuses.
 ---
 ## Current State
 | File | Location | Status |
 |------|----------|--------|
 | `flux-2-klein-4b.safetensors` | `~/ComfyUI/models/diffusion_models/` | ✅ Working |
 | `qwen_3_4b_klein.safetensors` | `~/ComfyUI/models/text_encoders/` | ✅ Working (standard, has refusals) |
 | `flux2-vae.safetensors` | `~/ComfyUI/models/vae/` | ✅ Working |
 The MCP workflow [`mcp/mcp-image-gen/src/workflows/flux2_klein_heretic.json`](../mcp/mcp-image-gen/src/workflows/flux2_klein_heretic.json) already uses `qwen_3_4b_klein.safetensors` — **no code change needed**, only the file on disk needs to be replaced.
 ---
 ## The Problem to Solve First
 The standard Heretic repos may not have the **FLUX.2 Klein-compatible** encoder dimensions:
 | Encoder | `hidden_size` | Conditioning dim | Usable? |
 |---------|--------------|-----------------|---------|
 | BFL Qwen3-4B (FLUX.2 Klein) | **2560** | 7680 (2560×3) | ✅ |
 | DreamFast/qwen3-4b-heretic | unknown — must check | ? | ⚠️ verify first |
 | Standard Qwen3-4B | 4096 | 4096 | ❌ wrong |
 **Before downloading, verify DreamFast's model is fine-tuned from the BFL variant** (hidden_size=2560), not the standard Qwen3 (hidden_size=4096).
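The check described above can be sketched in Python. This is a minimal sketch, assuming the repo's `config.json` has already been fetched and parsed into a dict; `EXPECTED_HIDDEN_SIZE` and `check_hidden_size` are hypothetical names, and the value 2560 is taken from the table above rather than verified independently.

```python
from typing import Optional, Tuple

# Value taken from the table above; the constant and function names are illustrative.
EXPECTED_HIDDEN_SIZE = 2560


def check_hidden_size(config: dict) -> Tuple[bool, Optional[int]]:
    """Return (compatible, hidden_size) for an already-parsed config.json dict."""
    hidden = config.get("hidden_size")
    if not isinstance(hidden, int):
        return False, None  # key missing or malformed: treat as unverified
    return hidden == EXPECTED_HIDDEN_SIZE, hidden


# Hand-written dicts standing in for real config.json contents.
examples = {
    "klein-compatible": {"hidden_size": 2560},
    "incompatible": {"hidden_size": 4096},
    "unknown": {},
}
for name, cfg in examples.items():
    ok, hidden = check_hidden_size(cfg)
    print(f"{name}: hidden_size={hidden} compatible={ok}")
```

With a real download, the dict would come from `json.loads(Path(...).read_text())` on the local `config.json`.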
 ---
 ## Steps
 ### Step 1: Check DreamFast Heretic repo
 ```bash
 huggingface-cli download DreamFast/qwen3-4b-heretic config.json --local-dir /tmp/qwen3-heretic-config/ && grep -i hidden_size /tmp/qwen3-heretic-config/config.json
 ```
 Or browse: https://huggingface.co/DreamFast/qwen3-4b-heretic/blob/main/config.json  
 Look for: `"hidden_size": 2560` — that's the FLUX.2 Klein-compatible version.
 ### Step 2a: If DreamFast has the right dimensions (2560)
 ```bash
 # Download
 huggingface-cli download DreamFast/qwen3-4b-heretic \
  --local-dir /tmp/qwen3-4b-heretic/
 # Back up working encoder first
 cp ~/ComfyUI/models/text_encoders/qwen_3_4b_klein.safetensors \
   ~/ComfyUI/models/text_encoders/qwen_3_4b_klein_backup.safetensors
 # Swap in the Heretic version
 cp /tmp/qwen3-4b-heretic/model.safetensors \
   ~/ComfyUI/models/text_encoders/qwen_3_4b_klein.safetensors
 ```
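The backup-then-swap sequence above could also be scripted so that an existing backup is never overwritten by a second run. This is a sketch only; `swap_encoder` and `rollback_encoder` are hypothetical helpers and are not part of the MCP server.

```python
import shutil
from pathlib import Path


def swap_encoder(new_file: Path, target: Path, backup: Path) -> None:
    """Back up `target` to `backup` (only once), then copy `new_file` over `target`."""
    if not new_file.is_file():
        raise FileNotFoundError(f"New encoder not found: {new_file}")
    # Never overwrite an existing backup: it may be the only known-good copy.
    if target.exists() and not backup.exists():
        shutil.copy2(target, backup)
    shutil.copy2(new_file, target)


def rollback_encoder(target: Path, backup: Path) -> None:
    """Restore the backed-up encoder over `target`."""
    if not backup.is_file():
        raise FileNotFoundError(f"No backup to restore: {backup}")
    shutil.copy2(backup, target)
```

Called with the three paths from the shell commands above, a second `swap_encoder` run leaves `qwen_3_4b_klein_backup.safetensors` untouched, so the rollback file always holds the standard encoder.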
 ### Step 2b: If DreamFast has wrong dimensions (4096) — find alternative
 Options in order of preference:
 1. **Lockout/qwen3-4b-heretic-zimage** — check if BFL-compatible:
   ```bash
   huggingface-cli model-info Lockout/qwen3-4b-heretic-zimage 2>/dev/null | grep hidden
   ```
 2. **Run Heretic abliteration yourself** on the working `qwen_3_4b_klein.safetensors`  
   Tool: https://github.com/FailSpy/abliterator  
   Script: `python abliterator.py --model qwen_3_4b_klein.safetensors --output qwen_3_4b_klein_heretic.safetensors`
 3. **Wait** for DreamFast or BFL to publish the FLUX.2-specific abliterated encoder
 ### Step 3: Live test
 ```python
 generate_image(
    "an explicit test prompt that would normally be refused",
    model="flux-2-klein-4b.safetensors",
    steps=20
 )
 ```
 Expected: Image generated, no refusal error in ComfyUI logs.
 ### Step 4: If it works — no code changes needed
 The MCP code, workflow JSON, and registry are already correct. Just verify:
 - Check `journalctl --user -u comfyui -f` during generation for any errors
 - Confirm file in `~/Pictures/mcp-generated/` was saved
 ---
 ## Fallback Plan
 If the Heretic encoder is unavailable in the right dimensions, the **GGUF route** works too:
 ```bash
 # ComfyUI-GGUF is already installed: ~/ComfyUI/custom_nodes/ComfyUI-GGUF
 # Download Heretic GGUF (if BFL-compatible variant published):
 huggingface-cli download Lockout/qwen3-4b-heretic-zimage \
  qwen-4b-zimage-hereticV2-q8.gguf \
  --local-dir ~/ComfyUI/models/text_encoders/
 ```
 Then update [`flux2_klein_heretic.json`](../mcp/mcp-image-gen/src/workflows/flux2_klein_heretic.json) node `"1"` to use `CLIPLoaderGGUF` instead of `CLIPLoader`:
 ```json
 "class_type": "CLIPLoaderGGUF",
 "inputs": {
  "clip_name": "qwen-4b-zimage-hereticV2-q8.gguf",
  "type": "flux2"
 }
 ```
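If the fallback is needed, the same node change can be applied programmatically instead of by hand. The sketch below assumes the workflow is loaded as a plain dict and that `CLIPLoaderGGUF` takes only the `clip_name` and `type` inputs shown in the fragment above; `use_gguf_encoder` is a hypothetical helper.

```python
import copy


def use_gguf_encoder(workflow: dict, gguf_name: str, node_id: str = "1") -> dict:
    """Return a copy of `workflow` whose encoder node loads a GGUF file."""
    patched = copy.deepcopy(workflow)  # leave the caller's template untouched
    node = patched[node_id]
    if node.get("class_type") not in ("CLIPLoader", "CLIPLoaderGGUF"):
        raise ValueError(f"Node {node_id} is not a text-encoder loader")
    encoder_type = node.get("inputs", {}).get("type", "flux2")
    node["class_type"] = "CLIPLoaderGGUF"
    # Mirrors the JSON fragment above: only clip_name and type are kept.
    node["inputs"] = {"clip_name": gguf_name, "type": encoder_type}
    return patched
```

Working on a deep copy keeps the on-disk template reusable for the safetensors path.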
 ---
 ## No Code Changes Required (unless GGUF fallback)
 The entire MCP server, workflow registry, and test suite are already correct. This is **purely a model file task**.
 ---
 ## Success Criteria
 - [ ] `generate_image("...", model="flux-2-klein-4b.safetensors")` works with prompts that currently get refused
 - [ ] Output image quality identical to standard encoder (check: no visible artifacts vs reference)
 - [ ] ComfyUI logs show no dimension errors
 - [ ] `qwen_3_4b_klein_backup.safetensors` kept as rollback
@@ -0,0 +1,104 @@
 # FLUX.2 Klein 4B + Heretic — Session Recap
 **Date:** 2026-04-10  
 **Status:** Code complete, live generation BLOCKED by encoder dimension mismatch  
 ---
 ## What We Achieved ✅
 ### Code Infrastructure (Solid)
 - **`mcp-image-gen/src/server.py`** — Generic workflow registry with model-based dispatch, `_inject_workflow_params()` works recursively on any node layout
 - **`mcp-image-gen/tests/test_server.py`** — 37/37 tests passing
 - **Gitea** — pushed to main (commit `38d26ad`)
 - The architecture is right: adding a new model = add 1 JSON file + 1 registry entry
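To illustrate why a recursive injection step is independent of node IDs, here is a minimal sketch. It is not the server's actual `_inject_workflow_params()`; `inject_placeholders` is a hypothetical stand-in that only handles the two text placeholders used in the workflow templates.

```python
from typing import Any, Dict


def inject_placeholders(obj: Any, values: Dict[str, Any]) -> Any:
    """Return a copy of `obj` with every placeholder string replaced.

    Walks dicts and lists recursively, so it works regardless of node IDs
    or how deeply an input is nested.
    """
    if isinstance(obj, dict):
        return {key: inject_placeholders(val, values) for key, val in obj.items()}
    if isinstance(obj, list):
        return [inject_placeholders(item, values) for item in obj]
    if isinstance(obj, str) and obj in values:
        return values[obj]
    return obj


template = {
    "2": {"class_type": "CLIPTextEncode", "inputs": {"clip": ["1", 0], "text": "PROMPT_PLACEHOLDER"}},
    "3": {"class_type": "CLIPTextEncode", "inputs": {"clip": ["1", 0], "text": "NEGATIVE_PLACEHOLDER"}},
}
filled = inject_placeholders(
    template,
    {"PROMPT_PLACEHOLDER": "a red cat", "NEGATIVE_PLACEHOLDER": "ugly"},
)
print(filled["2"]["inputs"]["text"], "|", filled["3"]["inputs"]["text"])
```

Because nothing in the function refers to a node ID, the same code serves both the FLUX.1-schnell layout (nodes 6/33) and the FLUX.2 Klein layout (nodes 2/3).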
 ### Models Downloaded (on disk)
 | File | Location | Status |
 |------|----------|--------|
 | `flux-2-klein-4b.safetensors` | `~/ComfyUI/models/diffusion_models/` | ✅ 7.3GB |
 | `qwen_3_4b_bfl.safetensors` | `~/ComfyUI/models/text_encoders/` | ✅ merged from BFL shards |
 | `qwen_3_4b.safetensors` (z_image) | `~/ComfyUI/models/text_encoders/split_files/` | ✅ wrong model |
 | `Qwen3-4B-Q8_0.gguf` | `~/ComfyUI/models/text_encoders/` | ✅ wrong arch |
 | ComfyUI-GGUF extension | `~/ComfyUI/custom_nodes/ComfyUI-GGUF` | ✅ installed |
 ---
 ## What Failed and Why ❌
 ### The Error (persistent)
 ```
 mat1 and mat2 shapes cannot be multiplied (512x4096 and 7680x3072)
 ```
 ### Root Cause Analysis
 **Node 13** (`SamplerCustomAdvanced`) fails — meaning the conditioning vector from the text encoder doesn't match the diffusion model's expected input.
 | Component | Expected | Got |
 |-----------|----------|-----|
 | FLUX.2 Klein 4B conditioning input | **7680-dim** (2560 × 3) | **4096-dim** |
 **Why 7680 = 2560 × 3?**  
 FLUX models concatenate text embeddings across multiple time steps. The BFL Qwen3 encoder has `hidden_size=2560`, so the concatenated output is 2560×3=7680.
 **Why 4096?**  
 Every other Qwen3 variant (z_image_turbo, official Qwen repo GGUF) uses standard Qwen3 with `hidden_size=4096` — these are for Z-Image and text generation respectively, NOT for FLUX.2 Klein.
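The shape arithmetic behind the error can be checked without loading any model. The sketch below uses only the shapes quoted in the error message and the 2560 × 3 figure from this recap; `can_matmul` is an illustrative helper, not a ComfyUI or PyTorch function.

```python
from typing import Tuple

Shape = Tuple[int, int]


def can_matmul(a: Shape, b: Shape) -> bool:
    """A (rows x inner) matrix can multiply a (inner x cols) matrix only if the inner dims match."""
    return a[1] == b[0]


projection = (7680, 3072)           # second operand from the error message
got = (512, 4096)                   # conditioning actually produced
expected = (512, 2560 * 3)          # conditioning the projection expects

print(can_matmul(got, projection))       # the failing case from the log
print(can_matmul(expected, projection))  # the case a matching encoder would produce
```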
 ### What We Tried (and Why Each Failed)
 1. `CLIPLoader type=flux` → wrong architecture (FLUX.1 style)
 2. `CLIPLoader type=flux2` → correct node, wrong encoder file (z_image Qwen)
 3. `CLIPLoaderGGUF type=flux2` → correct node, wrong GGUF (standard Qwen3)
 4. `CLIPLoader type=flux2 + qwen_3_4b_bfl.safetensors` → merged BFL shards, but still fails
 5. Workflow: `KSampler` → doesn't work with FLUX.2 (different architecture)
 6. Workflow: `SamplerCustomAdvanced + BasicGuider + Flux2Scheduler` → correct architecture but encoding mismatch persists
 ### The Real Missing Piece
 The BFL FLUX.2 Klein text encoder in Diffusers format is designed for use via `transformers/diffusers` pipeline, NOT via ComfyUI's `CLIPLoader`. ComfyUI reads the weights differently. The weights are there but ComfyUI doesn't know how to map `model.embed_tokens`, `model.layers.N.*` etc. to the CLIP interface it expects.
 **The correct encoder file for ComfyUI** is `Comfy-Org/vae-text-encorder-for-flux-klein-4b` — the 7.5GB file we downloaded IS the right one, but ComfyUI is likely loading it with the wrong adapter in the `CLIPLoader`.
 ---
 ## Clean Approach — What We Need to Do
 ### Option A: Use ComfyUI Web UI (Easiest)
 1. Open `http://localhost:8188` in browser
 2. Load the "Flux.2 Klein 4B Text-to-Image" workflow template (it's in the UI Templates)
 3. **Export the working API JSON** (Ctrl+Shift+E or Settings → Save as API format)
 4. Replace our `flux2_klein_heretic.json` with the exported JSON
 5. Add placeholders and test
 This gives us the **verified working node graph** without guessing. 10 minutes.
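Before swapping an exported JSON into the server, two quick sanity checks can catch the most common editing mistakes. This is a sketch under the assumption that the export uses the same API format as the workflow files in this repo, where a link is a two-element list `[node_id, output_index]`; both function names are hypothetical.

```python
import json
from typing import List, Sequence, Tuple


def find_dangling_links(workflow: dict) -> List[Tuple[str, str, str]]:
    """Return (node_id, input_name, missing_target) for links to absent nodes."""
    dangling = []
    for node_id, node in workflow.items():
        for input_name, value in node.get("inputs", {}).items():
            is_link = (
                isinstance(value, list)
                and len(value) == 2
                and isinstance(value[0], str)
                and isinstance(value[1], int)
            )
            if is_link and value[0] not in workflow:
                dangling.append((node_id, input_name, value[0]))
    return dangling


def missing_placeholders(
    workflow: dict,
    required: Sequence[str] = ("PROMPT_PLACEHOLDER", "NEGATIVE_PLACEHOLDER"),
) -> List[str]:
    """Return the placeholder strings that do not appear anywhere in the workflow."""
    text = json.dumps(workflow)
    return [name for name in required if name not in text]
```

An empty result from both functions means every link resolves and both prompt placeholders are present; it does not prove the node graph is valid for FLUX.2 Klein, which still needs the live test.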
 ### Option B: Find a Working API JSON online
 - Reddit r/comfyui has working FLUX.2 Klein workflows
 - Export format is what we need
 ### Then: Add Heretic
 Once we have a working standard workflow:
 1. Download the actual Heretic-abliterated version of the BFL encoder (once it's published)
 2. Swap encoder filename in the JSON
 ---
 ## My Recommendation
 **Do Option A right now.** Open `http://localhost:8188`, load the template, export to API format, paste the JSON. We'll be running in 10 minutes instead of guessing node names.
 The MCP server code is solid — the only broken piece is `flux2_klein_heretic.json`. Once we have the right JSON from the UI, everything else works.
 ---
 ## Files to Clean Up (After We Have the Right JSON)
 ```bash
 # Remove wrong encoders (save ~8GB)
 rm ~/ComfyUI/models/text_encoders/qwen_3_4b.safetensors   # z_image version
 rm ~/ComfyUI/models/text_encoders/qwen_3_4b_flux2.safetensors
 # Keep
 # ~/ComfyUI/models/text_encoders/qwen_3_4b_bfl.safetensors  ← correct encoder
 # ~/ComfyUI/models/text_encoders/Qwen3-4B-Q8_0.gguf          ← maybe useful later
 ```
@@ -0,0 +1,300 @@
 # Plan: FLUX.2 Klein 4B + Heretic Abliterated Text Encoder in mcp-image-gen
 **Date:** 2026-04-10  
 **Author:** Lumen / Patrick Plate  
 **Status:** Ready for Implementation
 ---
 ## Goal
 Extend the existing `mcp-image-gen` ComfyUI backend with a second model:
 **FLUX.2 Klein 4B** with the abliterated **Qwen3-4B-Heretic** as its text encoder.
 Result: `generate_image` can choose between two workflows via the `model` parameter:
 - `flux1-schnell.safetensors` → existing workflow (unchanged)
 - `flux-2-klein-4b-fp8.safetensors` → new Heretic workflow (no prompt refusals)
 ---
 ## Technical Background
 ### Why Heretic + FLUX.2 Klein?
 FLUX.2 Klein 4B uses **Qwen3-4B as an LLM text encoder** (instead of CLIP+T5 as in FLUX.1).
 This LLM encoder has safety alignment, so it refuses certain prompts; the fix is to abliterate it.
 `DreamFast/qwen3-4b-heretic` (HuggingFace):
 - **KL divergence: 0.0000**, i.e. no measurable damage to the model
 - Only **3/100 refusals** after Heretic v1.2.0 (200 trials)
 - Drop-in replacement for `qwen_3_4b.safetensors`
 ### Model Architecture Differences
 | | FLUX.1-schnell | FLUX.2 Klein 4B |
 |---|---|---|
 | Diffusion Model | `flux1-schnell.safetensors` (UNet) | `flux-2-klein-4b-fp8.safetensors` |
 | Text Encoder | `DualCLIPLoader` (T5+CLIP) | `CLIPLoader` (Qwen3-4B) |
 | VAE | `ae.safetensors` | `flux2-vae.safetensors` |
 | Steps | 4 | 4 (distilled) |
 | VRAM | ~8GB | ~8.4GB |
 | Refusals | none (no LLM encoder) | none (abliterated) |
 ---
 ## Files & Folders
 ### New model files (to download)
 ```
 ~/ComfyUI/models/
 ├── diffusion_models/
 │   └── flux-2-klein-4b-fp8.safetensors    ← FLUX.2 Klein distilled 4B
 ├── text_encoders/
 │   └── qwen_3_4b_heretic.safetensors      ← Heretic-abliterated (from DreamFast/qwen3-4b-heretic)
 └── vae/
    └── flux2-vae.safetensors              ← VAE for FLUX.2
 ```
 ### New/changed project files
 ```
 mcp/mcp-image-gen/
 ├── src/
 │   ├── server.py                          ← add workflow registry
 │   └── workflows/
 │       ├── flux_schnell.json              ← unchanged
 │       └── flux2_klein_heretic.json       ← NEW
 ├── tests/
 │   └── test_server.py                     ← new tests for registry + workflow
 └── USAGE.md                               ← add download instructions
 ```
 ---
 ## Phase 1: Download models
 ### 1a. FLUX.2 Klein 4B (Diffusion Model)
 ```bash
 # From Black Forest Labs on HuggingFace
 huggingface-cli download black-forest-labs/FLUX.2-klein-4B \
  flux-2-klein-4b-fp8.safetensors \
  --local-dir ~/ComfyUI/models/diffusion_models/
 ```
 ### 1b. FLUX.2 VAE
 ```bash
 huggingface-cli download black-forest-labs/FLUX.2-klein-4B \
  flux2-vae.safetensors \
  --local-dir ~/ComfyUI/models/vae/
 ```
 ### 1c. Qwen3-4B-Heretic (abliterated text encoder)
 ```bash
 # From DreamFast: already abliterated, no Heretic run needed
 huggingface-cli download DreamFast/qwen3-4b-heretic \
  --local-dir /tmp/qwen3-4b-heretic/
 # Place the safetensors file in ComfyUI's text_encoders directory
 cp /tmp/qwen3-4b-heretic/model.safetensors \
   ~/ComfyUI/models/text_encoders/qwen_3_4b_heretic.safetensors
 ```
 > **Note:** DreamFast/qwen3-4b-heretic is a mix of GGUF and SafeTensors files.
 > We need the `.safetensors` variant for ComfyUI. If only GGUF is available:
 > `huggingface-cli download Lockout/qwen3-4b-heretic-zimage qwen-4b-zimage-hereticV2-q8.gguf`
 ---
 ## Phase 2: New workflow JSON
 **File:** [`mcp/mcp-image-gen/src/workflows/flux2_klein_heretic.json`](mcp/mcp-image-gen/src/workflows/flux2_klein_heretic.json)
 FLUX.2 Klein uses different ComfyUI nodes than FLUX.1-schnell:
 - `DualCLIPLoader` → `CLIPLoader` (single Qwen encoder)
 - `UNETLoader` with the `diffusion_models/` path instead of `checkpoints/`
 - `EmptySD3LatentImage` → same (compatible)
 - `KSampler` → same, but with `sampler_name: "euler"`, `scheduler: "beta"`, `steps: 4`
 ```json
 {
  "6": {
    "class_type": "CLIPTextEncode",
    "inputs": {
      "clip": ["30", 0],
      "text": "PROMPT_PLACEHOLDER"
    }
  },
  "8": {
    "class_type": "VAEDecode",
    "inputs": {
      "samples": ["13", 0],
      "vae": ["31", 0]
    }
  },
  "9": {
    "class_type": "SaveImage",
    "inputs": {
      "filename_prefix": "mcp-image-gen",
      "images": ["8", 0]
    }
  },
  "13": {
    "class_type": "KSampler",
    "inputs": {
      "cfg": 1.0,
      "denoise": 1.0,
      "latent_image": ["27", 0],
      "model": ["32", 0],
      "negative": ["33", 0],
      "positive": ["6", 0],
      "sampler_name": "euler",
      "scheduler": "beta",
      "seed": 42,
      "steps": 4
    }
  },
  "27": {
    "class_type": "EmptySD3LatentImage",
    "inputs": {
      "batch_size": 1,
      "height": 1024,
      "width": 1024
    }
  },
  "30": {
    "class_type": "CLIPLoader",
    "inputs": {
      "clip_name": "qwen_3_4b_heretic.safetensors",
      "type": "flux"
    }
  },
  "31": {
    "class_type": "VAELoader",
    "inputs": {
      "vae_name": "flux2-vae.safetensors"
    }
  },
  "32": {
    "class_type": "UNETLoader",
    "inputs": {
      "unet_name": "flux-2-klein-4b-fp8.safetensors",
      "weight_dtype": "fp8_e4m3fn"
    }
  },
  "33": {
    "class_type": "CLIPTextEncode",
    "inputs": {
      "clip": ["30", 0],
      "text": "NEGATIVE_PLACEHOLDER"
    }
  }
 }
 ```
 ---
 ## Phase 3: server.py workflow registry
 ### Change 1: Workflow registry dict (after `_WORKFLOW_PATH`)
 ```python
 # Path to the bundled FLUX.1-schnell workflow template
 _WORKFLOW_PATH = Path(__file__).parent / "workflows" / "flux_schnell.json"
 # Workflow registry: model filename → workflow JSON path
 _WORKFLOW_REGISTRY: dict[str, Path] = {
    "flux1-schnell.safetensors": Path(__file__).parent / "workflows" / "flux_schnell.json",
    "flux-2-klein-4b-fp8.safetensors": Path(__file__).parent / "workflows" / "flux2_klein_heretic.json",
 }
 _DEFAULT_MODEL = "flux1-schnell.safetensors"
 ```
 ### Change 2: `_load_workflow()` helper function
 ```python
 def _load_workflow(model: str) -> dict:
    """Load the correct workflow JSON for the requested model.
    Falls back to FLUX.1-schnell if model not in registry.
    """
    path = _WORKFLOW_REGISTRY.get(model, _WORKFLOW_PATH)
    if not path.exists():
        raise FileNotFoundError(f"Workflow JSON not found: {path}")
    return json.loads(path.read_text())
 ```
 ### Change 3: `_generate_single()` uses the registry
 The current code always loads `_WORKFLOW_PATH`. Change: call `_load_workflow(model)` instead:
 ```python
 async def _generate_single(
    client: ComfyUIClient,
    prompt: str,
    negative_prompt: str,
    model: str,
    seed: int,
    width: int,
    height: int,
    steps: int,
    output_dir: Path,
    name: str,
 ) -> tuple[TextContent, ImageContent | None]:
    workflow = _load_workflow(model)  # ← instead of json.loads(_WORKFLOW_PATH.read_text())
    # ... rest unchanged
 ```
 ---
 ## Phase 4: Tests
 New tests in [`mcp/mcp-image-gen/tests/test_server.py`](mcp/mcp-image-gen/tests/test_server.py):
 1. **`test_workflow_registry_contains_both_models`**: registry contains flux1-schnell + flux2-klein
 2. **`test_load_workflow_flux1_schnell`**: loads flux_schnell.json correctly
 3. **`test_load_workflow_flux2_klein`**: loads flux2_klein_heretic.json correctly
 4. **`test_load_workflow_unknown_model_falls_back`**: unknown model → FLUX.1-schnell
 5. **`test_generate_image_uses_flux2_workflow`**: end-to-end mock with flux-2-klein-4b-fp8.safetensors
 ---
 ## Phase 5: USAGE.md Update
 New section "FLUX.2 Klein 4B (Heretic)" in [`mcp/mcp-image-gen/USAGE.md`](mcp/mcp-image-gen/USAGE.md):
 - Download commands for all 3 new model files
 - Explanation of why Heretic is used (abliterated text encoder, KL=0)
 - Example call: `generate_image("...", model="flux-2-klein-4b-fp8.safetensors")`
 ---
 ## VRAM Analysis
 | Model | Total VRAM | Fits in 24GB? |
 |---|---|---|
 | FLUX.1-schnell (fp8) | ~8GB | ✅ |
 | FLUX.2 Klein 4B (fp8) + Qwen3-4B | ~8.4GB + ~4GB = ~12.4GB | ✅ |
 | Both loaded at the same time | ~20GB | ✅ with margin |
 The RX 7900 XTX with 24GB VRAM can comfortably hold both models.
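The budget in the table can be reproduced with a few lines of arithmetic. The figures below are the rough estimates from the table, not measurements, and the dictionary keys are illustrative labels.

```python
VRAM_TOTAL_GB = 24.0  # RX 7900 XTX

# Approximate per-component estimates copied from the table above.
estimates_gb = {
    "flux1_schnell_fp8": 8.0,
    "flux2_klein_4b_fp8": 8.4,
    "qwen3_4b_encoder": 4.0,
}

klein_total = estimates_gb["flux2_klein_4b_fp8"] + estimates_gb["qwen3_4b_encoder"]
both_total = sum(estimates_gb.values())
headroom = VRAM_TOTAL_GB - both_total

print(f"FLUX.2 Klein + encoder: ~{klein_total:.1f} GB")
print(f"Both models loaded:     ~{both_total:.1f} GB")
print(f"Headroom on 24 GB:      ~{headroom:.1f} GB")
```

The headroom figure leaves no allowance for activations or other processes using the GPU, so it should be read as an upper bound.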
 ---
 ## Risks & Mitigations
 | Risk | Likelihood | Mitigation |
 |---|---|---|
 | `CLIPLoader` node not available in ComfyUI | low | Update ComfyUI; alternatively use a custom node |
 | DreamFast model only available as GGUF | medium | Lockout/qwen3-4b-heretic-zimage GGUF as fallback |
 | Qwen3-4B needs a different node type | medium | Live-test in the ComfyUI UI first; adjust the workflow |
 | ROCm + Qwen3-4B compatibility | low | Same ROCm environment as FLUX.1-schnell |
 ---
 ## Decision
 ✅ **Recommendation: implement.** Minimal code changes, no breaking change, clear added value.
 The only uncertain point is the exact ComfyUI node name for the Qwen3-4B loader.
 **Recommended approach:** First build a workflow with Qwen3-4B manually in the ComfyUI web UI, export the JSON, and save it as `flux2_klein_heretic.json`. That guarantees correct node names without guesswork.
Author	SHA1	Message	Date
Patrick Plate	1d1e70776f	docs(plans): add heretic encoder swap task for FLUX.2 Klein uncensored generation	2026-04-10 20:32:05 +02:00
Patrick Plate	1d8849cb41	fix(mcp-image-gen): confirmed working FLUX.2 Klein encoder filename - CLIPLoader clip_name: qwen_3_4b_klein.safetensors (from Comfy-Org/vae-text-encorder-for-flux-klein-4b) - VAE: flux2-vae.safetensors (321MB, same repo) - Live test confirmed: 2.1MB photorealistic 1024x1024 PNG in 52.43s on RX 7900 XTX - Test: assert clip_name == qwen_3_4b_klein.safetensors - 37/37 tests pass	2026-04-10 20:29:18 +02:00
Patrick Plate	40c91edf2f	fix(mcp-image-gen): merge CFGGuider workflow fix for FLUX.2 Klein 4B	2026-04-10 20:21:16 +02:00
Patrick Plate	4a99a3625a	fix(mcp-image-gen): rewrite flux2_klein_heretic workflow with CFGGuider + correct node types - Replace FluxDisableGuidance+BasicGuider chain with CFGGuider (cfg=5) - CLIPLoader: add device='default', keep type='flux2' - UNETLoader: weight_dtype='default' (not fp8_e4m3fn — avoids dimension mismatch) - VAEDecode/SaveImage: updated node IDs (11→VAEDecode, 12→SaveImage) - Encoder: qwen_3_4b_bfl.safetensors (7.5GB BFL-merged shards) - Tests: update heretic model assertions for new node structure (37/37 pass) - Add RECAP doc with root cause analysis and session history	2026-04-10 20:21:12 +02:00
Patrick Plate	38d26adb1f	Merge branch 'fix/mcp-image-gen/heretic-flux2-bugfixes'	2026-04-10 19:21:51 +02:00
Patrick Plate	ea0c5d39c4	fix(mcp-image-gen): fix Heretic/FLUX2 integration bugs - Fix syntax error in server.py (dangling docstring lines) - Correct model filename: flux-2-klein-4b.safetensors (without -fp8) - Fix _WORKFLOW_REGISTRY key to match actual downloaded filename - Update get_models() to always include registry models as fallback - Fix test expectations to match corrected model names - All 37 tests passing	2026-04-10 19:21:51 +02:00