Model Merging / Saving #13637

@MNeMoNiCuZ

Description

Custom Node Testing

Expected Behavior

For the Model Merge Simple and Model Save nodes to work with all officially supported model architectures.

Actual Behavior

Recently, the Model Merge and Model Save nodes have not been working as they should. At minimum, they fail for some models. Possibly some architectures are not supported, but the nodes don't indicate this, which makes creating custom model merges troublesome.

The following model used to work but no longer does:
Flux 2 Klein
After an update, I started receiving the following in my console:

Requested to load Flux2
Model Flux2 prepared for dynamic VRAM loading. 9124MB Staged. 112 patches attached.
Windows fatal exception: access violation

Stack (most recent call first):
  File "C:\AI\ComfyUI\venv\Lib\site-packages\safetensors\torch.py", line 460 in _tobytes
  File "C:\AI\ComfyUI\venv\Lib\site-packages\safetensors\torch.py", line 500 in _flatten
  File "C:\AI\ComfyUI\venv\Lib\site-packages\safetensors\torch.py", line 286 in save_file
  File "C:\AI\ComfyUI\comfy\utils.py", line 171 in save_torch_file
  File "C:\AI\ComfyUI\comfy\sd.py", line 1851 in save_checkpoint
  File "C:\AI\ComfyUI\comfy_extras\nodes_model_merging.py", line 227 in save_checkpoint
  File "C:\AI\ComfyUI\comfy_extras\nodes_model_merging.py", line 359 in save
  File "C:\AI\ComfyUI\execution.py", line 296 in process_inputs
  File "C:\AI\ComfyUI\execution.py", line 308 in _async_map_node_over_list
  File "C:\AI\ComfyUI\execution.py", line 334 in get_output_data
  File "C:\AI\ComfyUI\execution.py", line 534 in execute
  File "C:\AI\ComfyUI\execution.py", line 770 in execute_async
  File "C:\Python312\Lib\asyncio\events.py", line 88 in _run
  File "C:\Python312\Lib\asyncio\base_events.py", line 1987 in _run_once
  File "C:\Python312\Lib\asyncio\base_events.py", line 641 in run_forever
  File "C:\Python312\Lib\asyncio\windows_events.py", line 322 in run_forever
  File "C:\Python312\Lib\asyncio\base_events.py", line 674 in run_until_complete
  File "C:\Python312\Lib\asyncio\runners.py", line 118 in run
  File "C:\Python312\Lib\asyncio\runners.py", line 194 in run
  File "C:\AI\ComfyUI\execution.py", line 711 in execute
  File "C:\AI\ComfyUI\custom_nodes\ComfyUI-SaveImageWithMetaData\py\__init__.py", line 12 in run
  File "C:\AI\ComfyUI\custom_nodes\ComfyUI-SaveImageWithMetaData\py\__init__.py", line 12 in run
  File "C:\AI\ComfyUI\main.py", line 313 in prompt_worker
  File "C:\Python312\Lib\threading.py", line 1010 in run
  File "C:\Python312\Lib\threading.py", line 1073 in _bootstrap_inner
  File "C:\Python312\Lib\threading.py", line 1030 in _bootstrap

I also got this when using the Model Save node to merge down some LoRAs:

Model Flux2 prepared for dynamic VRAM loading. 5494MB Staged. 121 patches attached.
!!! Exception during processing !!! Error while serializing: I/O error: The supplied user buffer is not valid for the requested operation. (os error 1784)
Traceback (most recent call last):
  File "C:\AI\ComfyUI\execution.py", line 534, in execute
    output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI\execution.py", line 334, in get_output_data
    return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI\execution.py", line 308, in _async_map_node_over_list
    await process_inputs(input_dict, i)
  File "C:\AI\ComfyUI\execution.py", line 296, in process_inputs
    result = f(**inputs)
             ^^^^^^^^^^^
  File "C:\AI\ComfyUI\comfy_extras\nodes_model_merging.py", line 359, in save
    save_checkpoint(model, filename_prefix=filename_prefix, output_dir=self.output_dir, prompt=prompt, extra_pnginfo=extra_pnginfo)
  File "C:\AI\ComfyUI\comfy_extras\nodes_model_merging.py", line 227, in save_checkpoint
    comfy.sd.save_checkpoint(output_checkpoint, model, clip, vae, clip_vision, metadata=metadata, extra_keys=extra_keys)
  File "C:\AI\ComfyUI\comfy\sd.py", line 1868, in save_checkpoint
    comfy.utils.save_torch_file(sd, output_path, metadata=metadata)
  File "C:\AI\ComfyUI\comfy\utils.py", line 171, in save_torch_file
    safetensors.torch.save_file(sd, ckpt, metadata=metadata)
  File "C:\AI\ComfyUI\venv\Lib\site-packages\safetensors\torch.py", line 318, in save_file
    serialize_file(
safetensors_rust.SafetensorError: Error while serializing: I/O error: The supplied user buffer is not valid for the requested operation. (os error 1784)

I tried tinkering with it to add support myself, but I could not get it working.

The following models never worked for me:

Z-Image Base:
This one doesn't produce any error, but the merged output model generates only noise, even though the two input models work fine individually.

Ernie:
Model Save node logs the following:

Requested to load ErnieImage
Model ErnieImage prepared for dynamic VRAM loading. 4558MB Staged. 72 patches attached.
!!! Exception during processing !!! Error while serializing: I/O error: The supplied user buffer is not valid for the requested operation. (os error 1784)
Traceback (most recent call last):
  File "C:\AI\ComfyUI\execution.py", line 534, in execute
    output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI\execution.py", line 334, in get_output_data
    return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI\execution.py", line 308, in _async_map_node_over_list
    await process_inputs(input_dict, i)
  File "C:\AI\ComfyUI\execution.py", line 296, in process_inputs
    result = f(**inputs)
             ^^^^^^^^^^^
  File "C:\AI\ComfyUI\comfy_extras\nodes_model_merging.py", line 359, in save
    save_checkpoint(model, filename_prefix=filename_prefix, output_dir=self.output_dir, prompt=prompt, extra_pnginfo=extra_pnginfo)
  File "C:\AI\ComfyUI\comfy_extras\nodes_model_merging.py", line 227, in save_checkpoint
    comfy.sd.save_checkpoint(output_checkpoint, model, clip, vae, clip_vision, metadata=metadata, extra_keys=extra_keys)
  File "C:\AI\ComfyUI\comfy\sd.py", line 1868, in save_checkpoint
    comfy.utils.save_torch_file(sd, output_path, metadata=metadata)
  File "C:\AI\ComfyUI\comfy\utils.py", line 171, in save_torch_file
    safetensors.torch.save_file(sd, ckpt, metadata=metadata)
  File "C:\AI\ComfyUI\venv\Lib\site-packages\safetensors\torch.py", line 318, in save_file
    serialize_file(
safetensors_rust.SafetensorError: Error while serializing: I/O error: The supplied user buffer is not valid for the requested operation. (os error 1784)

Prompt executed in 6.86 seconds

And using the output model logs:

Requested to load ErnieImage
Model ErnieImage prepared for dynamic VRAM loading. 4558MB Staged. 72 patches attached.
100%|██████████████████████████████████████████████████████████████████████████████| 12/12 [00:02<00:00,  4.22it/s]
Requested to load AutoencoderKL
Model AutoencoderKL prepared for dynamic VRAM loading. 160MB Staged. 0 patches attached.
Prompt executed in 19.73 seconds
got prompt
Found quantization metadata version 1
Detected mixed precision quantization
Using mixed precision operations
model weight dtype torch.bfloat16, manual cast: torch.bfloat16
model_type FLOW
!!! Exception during processing !!! 'utf-32-be' codec can't decode bytes in position 16-18: truncated data
Traceback (most recent call last):
  File "C:\AI\ComfyUI\execution.py", line 534, in execute
    output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI\execution.py", line 334, in get_output_data
    return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI\execution.py", line 308, in _async_map_node_over_list
    await process_inputs(input_dict, i)
  File "C:\AI\ComfyUI\execution.py", line 296, in process_inputs
    result = f(**inputs)
             ^^^^^^^^^^^
  File "C:\AI\ComfyUI\nodes.py", line 973, in load_unet
    model = comfy.sd.load_diffusion_model(unet_path, model_options=model_options)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI\comfy\sd.py", line 1829, in load_diffusion_model
    model = load_diffusion_model_state_dict(sd, model_options=model_options, metadata=metadata, disable_dynamic=disable_dynamic)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI\comfy\sd.py", line 1821, in load_diffusion_model_state_dict
    model.load_model_weights(new_sd, "", assign=model_patcher.is_dynamic())
  File "C:\AI\ComfyUI\comfy\model_base.py", line 329, in load_model_weights
    m, u = self.diffusion_model.load_state_dict(to_load, strict=False, assign=assign)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 2615, in load_state_dict
    load(self, state_dict)
  File "C:\AI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 2603, in load
    load(child, child_state_dict, child_prefix)  # noqa: F821
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 2603, in load
    load(child, child_state_dict, child_prefix)  # noqa: F821
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 2603, in load
    load(child, child_state_dict, child_prefix)  # noqa: F821
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  [Previous line repeated 1 more time]
  File "C:\AI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 2586, in load
    module._load_from_state_dict(
  File "C:\AI\ComfyUI\comfy\ops.py", line 938, in _load_from_state_dict
    layer_conf = json.loads(layer_conf.numpy().tobytes())
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\json\__init__.py", line 341, in loads
    s = s.decode(detect_encoding(s), 'surrogatepass')
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\encodings\utf_32_be.py", line 11, in decode
    return codecs.utf_32_be_decode(input, errors, True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-32-be' codec can't decode bytes in position 16-18: truncated data
decoding with 'utf-32-be' codec failed

Prompt executed in 0.41 seconds
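The `utf-32-be` error on loading comes from `comfy/ops.py` passing raw bytes straight to `json.loads()`. When given `bytes`, `json.loads` guesses the encoding from the first few bytes via `json.detect_encoding`, and binary data that happens to begin with NUL bytes is misdetected as UTF-32-BE. A minimal stdlib reproduction (the payload is illustrative, not the actual tensor contents):

```python
import json

# Bytes whose first two bytes are NUL are guessed as UTF-32-BE by
# json.detect_encoding(); a length that is not a multiple of 4 then
# fails with "truncated data", matching the traceback above.
payload = b"\x00\x00\x00{" + b"\x00\x00"

print(json.detect_encoding(payload))  # 'utf-32-be'
try:
    json.loads(payload)
except UnicodeDecodeError as exc:
    print(exc.reason)  # 'truncated data'
```

One plausible mitigation would be decoding the embedded config bytes explicitly as UTF-8 before parsing; whether that matches the intended on-disk format here is an assumption.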
Warning: state dict on uninitialized op
Exception ignored in: <function ModelPatcher.__del__ at 0x0000029AF064D6C0>
Traceback (most recent call last):
  File "C:\AI\ComfyUI\comfy\model_patcher.py", line 1463, in __del__
    self.unpin_all_weights()
  File "C:\AI\ComfyUI\comfy\model_patcher.py", line 1509, in unpin_all_weights
    self.partially_unload_ram(1e32)
  File "C:\AI\ComfyUI\comfy\model_patcher.py", line 1671, in partially_unload_ram
    loading = self._load_list(for_dynamic=True, default_device=self.offload_device)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI\comfy\model_patcher.py", line 755, in _load_list
    module_offload_mem += check_module_offload_mem("{}.weight".format(n))
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI\comfy\model_patcher.py", line 749, in check_module_offload_mem
    weight, _, _ = get_key_weight(self.model, key)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI\comfy\model_patcher.py", line 157, in get_key_weight
    weight = getattr(op, op_keys[1])
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1965, in __getattr__
    raise AttributeError(
AttributeError: 'Linear' object has no attribute 'weight'

Steps to Reproduce

For Flux 2 Klein, I can successfully use flux-2-klein-9b-bf16, load a LoRA, and then save the model. But it doesn't work with the fp8 and nvfp4 variants. I suspect it's due to the mixed precision quantization. Is there any way to support this?

Debug Logs

See report above.

Other

No response
