On Windows I am getting this error just when trying to edit any photo: RuntimeError: "addmm_impl_cpu_" not implemented for 'Half' #84

Open
JmMndz opened this issue Dec 12, 2024 · 4 comments

Comments


JmMndz commented Dec 12, 2024

PLEASE READ BEFORE SUBMITTING AN ISSUE

MagicQuill is not commercial software but a research project. While we strive to improve and maintain it, support is provided on a best-effort basis. Please be patient and respectful in your communications.
To help us respond faster and better, please ensure the following:

  1. Search Existing Resources: Have you looked through the documentation (e.g., hardware requirements and setup steps) and searched online for potential solutions?
  2. Avoid Duplication: Check if a similar issue already exists.

If the issue persists, fill out the details below.


Checklist

  • I have searched the documentation and FAQs.
  • I have searched for similar issues but couldn’t find a solution.
  • I have provided clear and detailed information about the issue.

Issue/Feature Request Description

Type of Issue:

  • Bug
  • Feature Request
  • Question

Summary:


Steps to Reproduce (For Bugs Only)

Expected Behavior:

Actual Behavior:


Additional Context/Details


Environment

  • OS:
  • Version:
  • Any Relevant Dependencies:

Feature Request Specifics (If Applicable)

  • What problem does this solve?:
  • How will this feature improve the project?:
JmMndz changed the title to "On windows I am getting this error just when trying to edit any photo: RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'" on Dec 12, 2024

JmMndz commented Dec 12, 2024

D:\AI\MagicQuill\MagicQuill\pidi.py:334: UserWarning: The torch.cuda.*DtypeTensor constructors are no longer recommended. It's best to use methods such as torch.tensor(data, dtype=*, device='cuda') to create tensors. (Triggered internally at ..\torch\csrc\tensor\python_tensor.cpp:85.)
buffer = torch.cuda.FloatTensor(shape[0], shape[1], 5 * 5).fill_(0)
Base model type: SD1.5
BrushNet image.shape = torch.Size([1, 512, 767, 3]) mask.shape = torch.Size([1, 512, 767])
Requested to load AutoencoderKL
Loading 1 new model
loading in lowvram mode 64.0
BrushNet CL: image_latents shape = torch.Size([1, 4, 64, 95]) interpolated_mask shape = torch.Size([1, 1, 64, 95])
Requested to load ControlNet
Requested to load BaseModel
Loading 2 new models
loading in lowvram mode 64.0
loading in lowvram mode 64.0
0%| | 0/20 [00:00<?, ?it/s]BrushNet inference, step = 0: image batch = 1, got 1 latents, starting from 0
BrushNet inference: sample torch.Size([1, 4, 64, 95]) , CL torch.Size([1, 5, 64, 95]) dtype torch.float16
0%| | 0/20 [00:03<?, ?it/s]
Traceback (most recent call last):
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\gradio\queueing.py", line 624, in process_events
response = await route_utils.call_process_api(
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\gradio\route_utils.py", line 323, in call_process_api
output = await app.get_blocks().process_api(
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\gradio\blocks.py", line 2018, in process_api
result = await self.call_function(
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\gradio\blocks.py", line 1567, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\anyio\to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\anyio_backends_asyncio.py", line 2505, in run_sync_in_worker_thread
return await future
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\anyio_backends_asyncio.py", line 1005, in run
result = context.run(func, *args)
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\gradio\utils.py", line 846, in wrapper
response = f(*args, **kwargs)
File "D:\AI\MagicQuill\gradio_run.py", line 155, in generate_image_handler
res = generate(
File "D:\AI\MagicQuill\gradio_run.py", line 123, in generate
latent_samples, final_image, lineart_output, color_output = scribbleColorEditModel.process(
File "D:\AI\MagicQuill\MagicQuill\scribble_color_edit.py", line 110, in process
latent_samples = self.ksampler.sample(
File "D:\AI\MagicQuill\MagicQuill\comfyui_utils.py", line 154, in sample
return self.common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
File "D:\AI\MagicQuill\MagicQuill\comfyui_utils.py", line 146, in common_ksampler
samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
File "D:\AI\MagicQuill\MagicQuill\comfy\sample.py", line 43, in sample
samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
File "D:\AI\MagicQuill\MagicQuill\comfy\samplers.py", line 794, in sample
return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
File "D:\AI\MagicQuill\MagicQuill\model_patch.py", line 120, in modified_sample
return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
File "D:\AI\MagicQuill\MagicQuill\comfy\samplers.py", line 683, in sample
output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
File "D:\AI\MagicQuill\MagicQuill\comfy\samplers.py", line 662, in inner_sample
samples = sampler.sample(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
File "D:\AI\MagicQuill\MagicQuill\comfy\samplers.py", line 567, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "D:\AI\MagicQuill\MagicQuill\comfy\k_diffusion\sampling.py", line 159, in sample_euler_ancestral
denoised = model(x, sigmas[i] * s_in, **extra_args)
File "D:\AI\MagicQuill\MagicQuill\comfy\samplers.py", line 291, in call
out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
File "D:\AI\MagicQuill\MagicQuill\comfy\samplers.py", line 649, in call
return self.predict_noise(*args, **kwargs)
File "D:\AI\MagicQuill\MagicQuill\comfy\samplers.py", line 652, in predict_noise
return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
File "D:\AI\MagicQuill\MagicQuill\comfy\samplers.py", line 277, in sampling_function
out = calc_cond_batch(model, conds, x, timestep, model_options)
File "D:\AI\MagicQuill\MagicQuill\comfy\samplers.py", line 224, in calc_cond_batch
output = model_options['model_function_wrapper'](model.apply_model, {"input": input_x, "timestep": timestep_, "c": c, "cond_or_uncond": cond_or_uncond}).chunk(batch_chunks)
File "D:\AI\MagicQuill\MagicQuill\model_patch.py", line 50, in brushnet_model_function_wrapper
method(unet, xc, t, to, control)
File "D:\AI\MagicQuill\MagicQuill\brushnet_nodes.py", line 1022, in brushnet_forward
input_samples, mid_sample, output_samples = brushnet_inference(x, timesteps, transformer_options, debug)
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "D:\AI\MagicQuill\MagicQuill\brushnet_nodes.py", line 933, in brushnet_inference
return brushnet(x,
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\AI\MagicQuill\MagicQuill\brushnet\brushnet.py", line 785, in forward
emb = self.time_embedding(t_emb, timestep_cond)
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\diffusers\models\embeddings.py", line 807, in forward
sample = self.linear_1(sample)
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1527, in call_impl
return forward_call(*args, **kwargs)
File "C:\Users\Jaime\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: "addmm_impl_cpu
" not implemented for 'Half'


matok64 commented Dec 13, 2024

Same error for me.


JmMndz commented Dec 13, 2024

I think it happens when the CPU tries to perform a matrix multiplication (addmm is a fused addition plus matrix multiplication) on tensors in float16 (half-precision) format. This suggests that some parts of the model or data are being processed in float16 on the CPU, where that operation is not supported.
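
A minimal way to see the same failure outside MagicQuill (a sketch with made-up layer sizes; it only errors on PyTorch builds that lack CPU half-precision GEMM kernels, which appears to be the case here):

import torch

x = torch.randn(1, 320, dtype=torch.float16)   # half-precision input on the CPU
layer = torch.nn.Linear(320, 1280).half()      # half-precision weights on the CPU

try:
    y = layer(x)   # F.linear dispatches to addmm, which has no CPU Half kernel
except RuntimeError as e:
    print(e)       # "addmm_impl_cpu_" not implemented for 'Half'

# The usual workarounds: run the layer on the GPU, or cast to float32 on the CPU.
if torch.cuda.is_available():
    y = layer.cuda()(x.cuda())
else:
    y = layer.float()(x.float())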


JmMndz commented Dec 16, 2024

It is finally working for me. No more RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'

Summary of Changes:

Here's a breakdown of the modifications, categorized by file and the issues they addressed:

  1. gradio_run.py

Global device Variable:

Defined device globally to ensure consistent device usage throughout the code:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

ScribbleColorEditModel Initialization:

Passed the device to the constructor when creating the ScribbleColorEditModel instance:

scribbleColorEditModel = ScribbleColorEditModel(device)

prepare_images_and_masks:

Converted the base64 images to tensors using:

total_mask = create_alpha_mask(total_mask).to(device)
original_image_tensor = load_and_preprocess_image(original_image).to(device)
if add_color_image:
    add_color_image_tensor = load_and_preprocess_image(add_color_image).to(device)
else:
    add_color_image_tensor = original_image_tensor

add_edge_mask = create_alpha_mask(add_edge_image).to(device) if add_edge_image else torch.zeros_like(total_mask).to(device)
remove_edge_mask = create_alpha_mask(remove_edge_image).to(device) if remove_edge_image else torch.zeros_like(total_mask).to(device)

return add_color_image_tensor, original_image_tensor, total_mask, add_edge_mask, remove_edge_mask

generate_image_handler:

Removed the lines that were moving ms_data['total_mask'] and ms_data['original_image'] to the device, because that is already handled in the prepare_images_and_masks function.

  2. scribble_color_edit.py

ScribbleColorEditModel.__init__:

Modified the constructor to take the device as an argument.

Moved the model within the ModelPatcher to the device:

self.model.model = self.model.model.to(self.device)

Initialized the CLIPTextEncode object without passing the clip model initially:

self.clip_text_encoder = CLIPTextEncode() # Initialize here

Set the clip attribute of the clip_text_encoder after loading the CLIP model:

self.clip_text_encoder.clip = self.clip # Set clip after loading
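
Putting those pieces together, the constructor ends up looking roughly like this (a sketch; the checkpoint loading that populates self.model and self.clip is abbreviated, and CLIPTextEncode comes from comfyui_utils.py as described below):

class ScribbleColorEditModel:
    def __init__(self, device):
        self.device = device
        # ... checkpoint loading populates self.model (a ModelPatcher) and self.clip ...
        self.model.model = self.model.model.to(self.device)
        self.clip_text_encoder = CLIPTextEncode()   # initialized without a clip model
        self.clip_text_encoder.clip = self.clip     # clip attached after loading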

ScribbleColorEditModel.load_models:

Removed the dtype argument from the function definition.

Correctly extracted the brushnet model from the dictionary returned by self.brushnet_loader.brushnet_loading().

Moved edge_controlnet, color_controlnet, and brushnet to the device.
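
A sketch of what that looks like in load_models (the argument names and the surrounding loading calls are assumptions):

def load_models(self, ckpt_name, brushnet_name):   # dtype argument removed
    # ... checkpoint, ControlNet, and BrushNet loading as before ...
    # brushnet_loading returns a tuple whose first element is the info dict
    brushnet_dict = self.brushnet_loader.brushnet_loading(brushnet_name)[0]
    self.brushnet = brushnet_dict["brushnet"].to(self.device)
    self.edge_controlnet = self.edge_controlnet.to(self.device)
    self.color_controlnet = self.color_controlnet.to(self.device)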

ScribbleColorEditModel.process:

Moved the newly loaded models to the device if the ckpt_name changes.

Ensured that self.clip_text_encoder.clip is set to the new self.clip when a new checkpoint is loaded.

Correctly called the encode method of CLIPTextEncode:

positive = self.clip_text_encoder.encode(positive_prompt, self.device)[0]
negative = self.clip_text_encoder.encode(negative_prompt, self.device)[0]

Moved input tensors (image, colored_image, mask, add_mask, remove_mask) to the device with dtype=torch.float32 before using them.
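
Concretely, the moves at the top of process look something like this (variable names as listed above):

image = image.to(self.device, dtype=torch.float32)
colored_image = colored_image.to(self.device, dtype=torch.float32)
mask = mask.to(self.device, dtype=torch.float32)
add_mask = add_mask.to(self.device, dtype=torch.float32)
remove_mask = remove_mask.to(self.device, dtype=torch.float32)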

  3. comfyui_utils.py

CLIPTextEncode.encode:

Removed the clip argument from the method definition.

Used self.clip.tokenize(text) to tokenize the text, relying on the clip attribute set in ScribbleColorEditModel.

Correctly converted the tokenized output (which is a dictionary) into tensors and moved them to the device:

tokens = self.clip.tokenize(text)
if "l" in tokens:
    tokens["l"] = torch.tensor(tokens["l"], device=device)
if "s" in tokens:
    tokens["s"] = torch.tensor(tokens["s"], device=device)

  4. brushnet_nodes.py

BrushNetLoader.brushnet_loading:

Removed the dtype argument from the method definition.

Enforced torch_dtype = torch.float32.

Moved the loaded brushnet_model to the device:

brushnet_model = load_checkpoint_and_dispatch(
    # ... other arguments
).to(device)  # Move to the determined device (GPU or CPU)

Returned the dictionary containing model info in a tuple: ({"brushnet": brushnet_model, ...},)

BrushNet.model_update:

Passed is_SDXL and is_PP to check_compatibility function.

check_compatibility:

Modified to take is_SDXL and is_PP as arguments instead of a dictionary, to access the SDXL and PP values directly using the variables.
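
A sketch of the new call shape (the dict keys and the bodies of the checks are assumptions):

# in BrushNet.model_update:
is_SDXL = brushnet_dict["SDXL"]
is_PP = brushnet_dict["PP"]
check_compatibility(is_SDXL, is_PP, model)

def check_compatibility(is_SDXL, is_PP, model):
    # the flags now arrive as plain booleans instead of a dict lookup
    if is_SDXL:
        pass   # SDXL-specific compatibility checks
    if is_PP:
        pass   # PP-specific compatibility checks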

  5. comfy/sd1_clip.py

set_up_textual_embeddings:

Added a check isinstance(y, torch.Tensor) to ensure that y is a tensor before accessing its shape.

Added a check y.numel() > 0 to ensure that y is not an empty tensor.

Added a check len(y.shape) > 0 to ensure that y has at least one dimension.

Added an else branch to the shape check to log a more informative warning message if y is not a valid tensor or if it has an unexpected shape.

Added a check if not tokens_temp: to handle cases where the input prompt might result in an empty list of tokens. In such cases, a padding token is added to ensure that the list is not empty.
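
Taken together, the guards look roughly like this (a sketch; y, tokens_temp, and pad_token follow the description above, and the surrounding loop is abbreviated):

import logging
import torch

if isinstance(y, torch.Tensor) and y.numel() > 0 and len(y.shape) > 0:
    pass   # y is a usable embedding tensor; proceed as before
else:
    logging.warning(f"Ignoring embedding: got {type(y)} with shape {getattr(y, 'shape', None)}")

if not tokens_temp:
    tokens_temp.append(pad_token)   # never hand an empty token list downstream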
