error load model SDXL #552
Comments
Can you try running it with q8_0 quantization to see how it goes?
Hi, and thanks. With q8_0 quantization it actually works, but I forgot to mention that with some previous versions of sd.cpp I was able to load the SDXL models without quantization, using only --vae-on-cpu. An example with the old version: /build.cuda/bin/sd -m /media/dati003/MODEL_DIFFUSION/sdxlUnstableDiffusers_v11.safetensors --vae /media/dati003/MODEL_DIFFUSION/sdxl_vae.safetensors -p 'a lovely cat' -v
Hi guys, thanks for the great work.
I recently pulled the latest master from git to try out the new features like inpainting.
I noticed that when I compile with CUDA or Vulkan and try to load SDXL models, I get a segmentation fault. It seems to happen somewhere around model.cpp.
When I compile with the CPU backend, the error does not appear.
Verbose output with the CUDA backend:
Option:
n_threads: 6
mode: txt2img
model_path: /media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sd_xl_base_1.0.safetensors
wtype: unspecified
clip_l_path:
clip_g_path:
t5xxl_path:
diffusion_model_path:
vae_path: /media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sdxl_vae.safetensors
taesd_path:
esrgan_path:
controlnet_path:
embeddings_path:
stacked_id_embeddings_path:
input_id_images_path:
style ratio: 20.00
normalize input image : false
output_path: output.png
init_img:
mask_img:
control_image:
clip on cpu: false
controlnet cpu: false
vae decoder on cpu:true
diffusion flash attention:false
strength(control): 0.90
prompt: a lovely cat
negative_prompt:
min_cfg: 1.00
cfg_scale: 7.00
slg_scale: 0.00
guidance: 3.50
clip_skip: -1
width: 512
height: 512
sample_method: euler_a
schedule: default
sample_steps: 20
strength(img2img): 0.75
rng: cuda
seed: 42
batch_count: 1
vae_tiling: false
upscale_repeats: 1
System Info:
SSE3 = 1
AVX = 1
AVX2 = 1
AVX512 = 1
AVX512_VBMI = 0
AVX512_VNNI = 0
FMA = 1
NEON = 0
ARM_FMA = 0
F16C = 1
FP16_VA = 0
WASM_SIMD = 0
VSX = 0
[DEBUG] stable-diffusion.cpp:163 - Using CUDA backend
[INFO ] stable-diffusion.cpp:195 - loading model from '/media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sd_xl_base_1.0.safetensors'
[INFO ] model.cpp:888 - load /media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sd_xl_base_1.0.safetensors using safetensors format
[DEBUG] model.cpp:959 - init from '/media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sd_xl_base_1.0.safetensors'
[INFO ] stable-diffusion.cpp:230 - loading vae from '/media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sdxl_vae.safetensors'
[INFO ] model.cpp:888 - load /media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sdxl_vae.safetensors using safetensors format
[DEBUG] model.cpp:959 - init from '/media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sdxl_vae.safetensors'
[INFO ] stable-diffusion.cpp:242 - Version: SDXL
[INFO ] stable-diffusion.cpp:275 - Weight type: f16
[INFO ] stable-diffusion.cpp:276 - Conditioner weight type: f16
[INFO ] stable-diffusion.cpp:277 - Diffusion model weight type: f16
[INFO ] stable-diffusion.cpp:278 - VAE weight type: f32
[DEBUG] stable-diffusion.cpp:280 - ggml tensor size = 400 bytes
[DEBUG] clip.hpp:171 - vocab size: 49408
[DEBUG] clip.hpp:182 - trigger word img already in vocab
[DEBUG] ggml_extend.hpp:1107 - clip params backend buffer size = 469.44 MB(VRAM) (196 tensors)
[DEBUG] ggml_extend.hpp:1107 - clip params backend buffer size = 2649.92 MB(VRAM) (517 tensors)
ggml_backend_cuda_buffer_type_alloc_buffer: allocating 4900.07 MiB on device 0: cudaMalloc failed: out of memory
[ERROR] ggml_extend.hpp:1101 - unet alloc params backend buffer failed, num_tensors = 1680
[INFO ] stable-diffusion.cpp:354 - VAE Autoencoder: Using CPU backend
[DEBUG] ggml_extend.hpp:1107 - vae params backend buffer size = 94.47 MB(RAM) (140 tensors)
[DEBUG] stable-diffusion.cpp:417 - loading weights
[DEBUG] model.cpp:1698 - loading tensors from /media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sd_xl_base_1.0.safetensors
|=============> | 713/2641 - 11.36it/s
Segmentation fault (core dumped)
Verbose output with the Vulkan backend:
./build.vulkan/bin/sd -m /media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sd_xl_base_1.0.safetensors --vae /media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sdxl_vae.safetensors -p 'a lovely cat' --vae-on-cpu -v
Option:
n_threads: 6
mode: txt2img
model_path: /media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sd_xl_base_1.0.safetensors
wtype: unspecified
clip_l_path:
clip_g_path:
t5xxl_path:
diffusion_model_path:
vae_path: /media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sdxl_vae.safetensors
taesd_path:
esrgan_path:
controlnet_path:
embeddings_path:
stacked_id_embeddings_path:
input_id_images_path:
style ratio: 20.00
normalize input image : false
output_path: output.png
init_img:
mask_img:
control_image:
clip on cpu: false
controlnet cpu: false
vae decoder on cpu:true
diffusion flash attention:false
strength(control): 0.90
prompt: a lovely cat
negative_prompt:
min_cfg: 1.00
cfg_scale: 7.00
slg_scale: 0.00
guidance: 3.50
clip_skip: -1
width: 512
height: 512
sample_method: euler_a
schedule: default
sample_steps: 20
strength(img2img): 0.75
rng: cuda
seed: 42
batch_count: 1
vae_tiling: false
upscale_repeats: 1
System Info:
SSE3 = 1
AVX = 1
AVX2 = 1
AVX512 = 1
AVX512_VBMI = 0
AVX512_VNNI = 0
FMA = 1
NEON = 0
ARM_FMA = 0
F16C = 1
FP16_VA = 0
WASM_SIMD = 0
VSX = 0
[DEBUG] stable-diffusion.cpp:172 - Using Vulkan backend
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = NVIDIA GeForce GTX 1070 Ti (NVIDIA) | uma: 0 | fp16: 0 | warp size: 32
ggml_vulkan: Compiling shaders..............................Done!
[INFO ] stable-diffusion.cpp:195 - loading model from '/media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sd_xl_base_1.0.safetensors'
[INFO ] model.cpp:888 - load /media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sd_xl_base_1.0.safetensors using safetensors format
[DEBUG] model.cpp:959 - init from '/media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sd_xl_base_1.0.safetensors'
[INFO ] stable-diffusion.cpp:230 - loading vae from '/media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sdxl_vae.safetensors'
[INFO ] model.cpp:888 - load /media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sdxl_vae.safetensors using safetensors format
[DEBUG] model.cpp:959 - init from '/media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sdxl_vae.safetensors'
[INFO ] stable-diffusion.cpp:242 - Version: SDXL
[INFO ] stable-diffusion.cpp:275 - Weight type: f16
[INFO ] stable-diffusion.cpp:276 - Conditioner weight type: f16
[INFO ] stable-diffusion.cpp:277 - Diffusion model weight type: f16
[INFO ] stable-diffusion.cpp:278 - VAE weight type: f32
[DEBUG] stable-diffusion.cpp:280 - ggml tensor size = 400 bytes
[DEBUG] clip.hpp:171 - vocab size: 49408
[DEBUG] clip.hpp:182 - trigger word img already in vocab
[DEBUG] ggml_extend.hpp:1107 - clip params backend buffer size = 469.44 MB(VRAM) (196 tensors)
[DEBUG] ggml_extend.hpp:1107 - clip params backend buffer size = 2649.92 MB(VRAM) (517 tensors)
ggml_vulkan: Device memory allocation of size 847096320 failed.
ggml_vulkan: vk::Device::allocateMemory: ErrorOutOfDeviceMemory
[ERROR] ggml_extend.hpp:1101 - unet alloc params backend buffer failed, num_tensors = 1680
[INFO ] stable-diffusion.cpp:354 - VAE Autoencoder: Using CPU backend
[DEBUG] ggml_extend.hpp:1107 - vae params backend buffer size = 94.47 MB(RAM) (140 tensors)
[DEBUG] stable-diffusion.cpp:417 - loading weights
[DEBUG] model.cpp:1698 - loading tensors from /media/dati003/MODEL_DIFFUSION/MODEL_DIFFUSION_SD3/sd_xl_base_1.0.safetensors
|=============> | 713/2641 - 71.43it/s
Segmentation fault (core dumped)
I ran sd.cpp under gdb to try to trace the error, though I'm not sure I'm looking in the right place.
gdb backtrace (CUDA):
Thread 1 "sd" received signal SIGSEGV, Segmentation fault.
0x0000555555940af2 in ggml_fp16_to_fp32_row ()
(gdb) where
#0 0x0000555555940af2 in ggml_fp16_to_fp32_row ()
#1 0x00005555555eb689 in ModelLoader::load_tensors(std::function<bool (TensorStorage const&, ggml_tensor**)>, ggml_backend*) ()
#2 0x00005555555ec8b9 in ModelLoader::load_tensors(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ggml_tensor*, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ggml_tensor*> > >&, ggml_backend*, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >) ()
#3 0x00005555556bf935 in StableDiffusionGGML::load_from_file(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, ggml_type, schedule_t, bool, bool, bool, bool) ()
#4 0x0000555555627ccc in new_sd_ctx ()
#5 0x00005555555736bc in main ()
gdb backtrace (Vulkan):
Thread 1 "sd" received signal SIGSEGV, Segmentation fault.
0x00005555557a81d5 in ggml_backend_tensor_set ()
(gdb) where
#0 0x00005555557a81d5 in ggml_backend_tensor_set ()
#1 0x00005555555f4593 in ModelLoader::load_tensors(std::function<bool (TensorStorage const&, ggml_tensor**)>, ggml_backend*) ()
#2 0x00005555555f4d59 in ModelLoader::load_tensors(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ggml_tensor*, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ggml_tensor*> > >&, ggml_backend*, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >) ()
#3 0x00005555556c7d35 in StableDiffusionGGML::load_from_file(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, ggml_type, schedule_t, bool, bool, bool, bool) ()
#4 0x000055555563015c in new_sd_ctx ()
#5 0x00005555555991cc in main ()
In the CUDA case, following gdb's backtrace, the crash seems to be in convert_tensor in model.cpp:
https://github.com/leejet/stable-diffusion.cpp/blob/master/model.cpp#L735
while for Vulkan I end up here:
https://github.com/leejet/stable-diffusion.cpp/blob/master/model.cpp#L1822
I'm not sure how to debug this further. The verbose output makes me think it is a loading error:
CUDA:
ggml_backend_cuda_buffer_type_alloc_buffer: allocating 4900.07 MiB on device 0: cudaMalloc failed: out of memory
Vulkan:
ggml_vulkan: Device memory allocation of size 847096320 failed.
ggml_vulkan: vk::Device::allocateMemory: ErrorOutOfDeviceMemory
Any ideas on how to investigate this further?
Thanks, Dario