Unable to run Speech T5 on XPU #10025
The SpeechT5 model can be successfully loaded with BigDL using:

```python
from bigdl.llm.transformers import AutoModelForSpeechSeq2Seq
...
model = AutoModelForSpeechSeq2Seq.from_pretrained("microsoft/speecht5_tts", load_in_4bit=True)
model = model.to("xpu")
...
``` |
I updated my bigdl version, but now I am getting a segfault. Here is the backtrace:
According to the backtrace, it seems to be an issue with finding the GPU. |
Yes, the default BigDL-LLM has been upgraded to PyTorch 2.1/oneAPI 2024.0, so you will need to upgrade your oneAPI installation. Alternatively, you may continue to install the PyTorch 2.0 version of BigDL-LLM, which is compatible with oneAPI 2023.2 (see https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#linux). |
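The version pairing described above can be summarized in a small sketch. `required_oneapi` is a hypothetical helper (not part of BigDL-LLM), assuming only the pairing stated in this thread: PyTorch 2.1 builds target oneAPI 2024.0, PyTorch 2.0 builds target oneAPI 2023.2.

```python
def required_oneapi(torch_version: str) -> str:
    """Hypothetical helper: map the PyTorch version bundled with a
    BigDL-LLM build to the oneAPI release it expects, per the pairing
    described in this thread."""
    if torch_version.startswith("2.1"):
        return "2024.0"
    if torch_version.startswith("2.0"):
        return "2023.2"
    raise ValueError(f"untested PyTorch build for BigDL-LLM: {torch_version}")
```

In practice you would pass `torch.__version__` and compare the result against the oneAPI release actually installed on the machine.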
But with oneAPI 2023.2 this does not work; it segfaults as mentioned in the previous comment. With oneAPI 2024 I still did not get anything working, since I am getting an error message: |
Did you correctly configure the oneAPI env variables (refer to the instructions here)? Also pay attention to the runtime configuration instructions here, which may prevent many runtime issues. |
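For reference, the runtime configuration for Intel GPUs on Linux typically looks like the following. This is a sketch based on the commonly documented BigDL-LLM settings; the `setvars.sh` path depends on where oneAPI was installed:

```shell
# Load the oneAPI environment (default installation prefix assumed)
source /opt/intel/oneapi/setvars.sh

# Commonly recommended runtime settings for BigDL-LLM on Intel GPUs
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```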
Yes and yes, but the issue is still there. |
Could you provide the os, kernel and python version? |
To resolve this problem and use oneAPI 2024.0, it is recommended to create a new conda env:

```shell
conda create -n new-llm-env python=3.9
conda activate new-llm-env
pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
```

Or, if you would like to use BigDL-LLM with oneAPI 2024.0 in your old conda environment, you could:

```shell
pip uninstall bigdl-core-xe
pip uninstall bigdl-core-xe-21
pip uninstall bigdl-core-xe-esimd
pip uninstall bigdl-core-xe-esimd-21
pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
```

Note that |
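To check whether any of the old binary wheels listed in the uninstall steps are still present, a small sketch like the following can help. `stale_bigdl_packages` is a hypothetical helper, not part of BigDL; the package names are exactly the four from the `pip uninstall` commands above:

```python
from importlib import metadata

# The old bigdl-core wheels that the instructions above remove.
STALE = {
    "bigdl-core-xe",
    "bigdl-core-xe-21",
    "bigdl-core-xe-esimd",
    "bigdl-core-xe-esimd-21",
}

def stale_bigdl_packages(installed=None):
    """Hypothetical helper: return which of the old bigdl-core wheels
    are installed, i.e. what `pip uninstall` still needs to remove."""
    if installed is None:
        installed = [d.metadata["Name"] for d in metadata.distributions()]
    return sorted(set(installed) & STALE)
```

Running `stale_bigdl_packages()` in the old environment lists any leftovers; an empty list means the environment is clean.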
OS: Ubuntu 22.04 |
Hi @nedo99,

Env (PyTorch 2.1 with oneAPI 2024.0):

```shell
conda create -n speecht5-test python=3.9
conda activate speecht5-test
pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
pip install datasets soundfile
```

Runtime Configuration: following here

Code:

```python
import torch
from transformers import SpeechT5Processor, SpeechT5HifiGan, SpeechT5ForTextToSpeech
from datasets import load_dataset
import soundfile as sf
import time

processor = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")
model = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

from bigdl.llm import optimize_model
model = optimize_model(model, modules_to_not_convert=["speech_decoder_postnet.feat_out",
                                                      "speech_decoder_postnet.prob_out"])
model = model.to('xpu')
vocoder = vocoder.to('xpu')

text = "On a cold winter night, a lonely traveler found a shimmering stone in the snow, unaware that it would lead him to a world full of wonders."
inputs = processor(text=text, return_tensors="pt").to('xpu')

# load xvector containing speaker's voice characteristics from a dataset
embeddings_dataset = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embeddings = torch.tensor(embeddings_dataset[7306]["xvector"]).unsqueeze(0).to('xpu')

with torch.inference_mode():
    # warmup
    st = time.perf_counter()
    speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
    print(f'Warmup time: {time.perf_counter() - st}')

    st1 = time.perf_counter()
    speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
    torch.xpu.synchronize()
    st2 = time.perf_counter()
    print(f"Inference time: {st2 - st1}")

sf.write("speech_bigdl_llm.wav", speech.to('cpu').numpy(), samplerate=16000)
```

Please let us know for any further problems :) |
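As a quick sanity check on the generated waveform, the `samplerate=16000` passed to `sf.write` implies a simple duration formula. A minimal sketch in plain Python (no audio libraries; `audio_duration_seconds` is an illustrative helper, not part of any of the libraries above):

```python
def audio_duration_seconds(num_samples: int, samplerate: int = 16000) -> float:
    """Duration of a mono waveform: number of samples divided by
    samples-per-second. 16000 matches the SpeechT5 output rate used above."""
    return num_samples / samplerate

# e.g. a 48000-sample clip at 16 kHz lasts 3 seconds
```

With the SpeechT5 output above you would call `audio_duration_seconds(len(speech))` after moving `speech` to the CPU, and check the result is plausible for the input sentence.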
If you would also be interested in other TTS models we support, you can run Bark with BigDL-LLM optimization as follows :)

Env (PyTorch 2.1 with oneAPI 2024.0):

```shell
conda create -n bark-test python=3.9
conda activate bark-test
pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
pip install scipy
```

Runtime Configuration: following here

Code:

```python
from transformers import AutoProcessor, BarkModel
import torch
import time

processor = AutoProcessor.from_pretrained("suno/bark-small")
model = BarkModel.from_pretrained("suno/bark-small")

from bigdl.llm import optimize_model
model = optimize_model(model).to('xpu')

voice_preset = "v2/en_speaker_6"
text = "Hello, my name is Suno. And, uh — and I like pizza. [laughs] But I also have other interests such as playing tic tac toe."
inputs = processor(text, voice_preset=voice_preset).to('xpu')

# warmup
st = time.time()
with torch.inference_mode():
    model.generate(**inputs)
    torch.xpu.synchronize()
print(f"Warmup time: {time.time() - st}")

st = time.time()
with torch.inference_mode():
    audio_array = model.generate(**inputs)
    torch.xpu.synchronize()
print(f"Inference time: {time.time() - st}")

audio_array = audio_array.cpu().numpy().squeeze()

from scipy.io.wavfile import write as write_wav
sample_rate = model.generation_config.sample_rate
write_wav("output/bark_generation_bigdl_llm.wav", sample_rate, audio_array)
```
 |
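One small pitfall in the snippet above: `write_wav` targets `output/bark_generation_bigdl_llm.wav`, and scipy will raise `FileNotFoundError` if the `output/` directory does not already exist. A minimal standard-library guard to run before the write:

```python
import os

out_path = "output/bark_generation_bigdl_llm.wav"
# Create the parent directory first; exist_ok=True avoids an error on reruns.
os.makedirs(os.path.dirname(out_path), exist_ok=True)
```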
Speech T5 sample works. Bark does not work. It segfaults and has the same backtrace as posted in one of the previous comments. |
Hi @nedo99, could you let me know your test env for Bark?
What shows here does not seem to be a correct PyTorch 2.1 env to me :) You could try the steps here for a correct PyTorch 2.1 + oneAPI 2024.0 env for |
Here is the updated environment:
|
Hello,
I am trying to run Speech T5 on XPU but am unable to. It is this model: https://huggingface.co/microsoft/speecht5_tts and here is my code:
and I am getting the following error:
Is there support for text-to-speech by BigDL? Or am I missing something?
Regards,
Nedim