Add initial support for Metal Performance Shaders backend. #181
Conversation
It works generally well on Apple Silicon. However, there are PyTorch operators, such as `aten::_weight_norm_interface`, which are not currently implemented for the MPS device. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for these ops. Similarly, when the XTTS engine is selected, MPS does not compose nicely with deepspeed, because features such as redirects (i.e., `torch.distributed.elastic.multiprocessing.redirects`) are not implemented on the CPU backend for Windows and macOS at the time of writing. Run with the `--no-deepspeed` option at the command line where needed. Everything else stays the same. Committer: Alfonso De Gregorio <[email protected]>
This is really interesting, thanks for making this PR. I have only had a few minutes to play with it; I still need to figure out exactly which scenarios require `PYTORCH_ENABLE_MPS_FALLBACK=1` to work. Also, it should probably disable deepspeed if using MPS (maybe line 391 becomes …). Is there any way to set the environment variable from within the script? I'm definitely interested in figuring out how to merge this work though, thanks again!
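The "disable deepspeed if using MPS" suggestion above could look roughly like the following sketch. This is not epub2tts's actual code; the function and variable names are hypothetical, and the availability flags are passed in explicitly just to keep the example self-contained:

```python
# Hypothetical sketch of device selection with deepspeed gated off on MPS.
# The names here are illustrative, not taken from epub2tts.
def pick_device(mps_available: bool, cuda_available: bool):
    """Return (device, use_deepspeed) for the given hardware flags."""
    if cuda_available:
        return "cuda", True   # deepspeed is supported on CUDA
    if mps_available:
        return "mps", False   # deepspeed does not compose with MPS yet
    return "cpu", False

device, use_deepspeed = pick_device(mps_available=True, cuda_available=False)
print(device, use_deepspeed)  # → mps False
```

In real code the flags would come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.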
Thanks Christopher for epub2tts! Nowadays the quality of the TTS is so great that I am using epub2tts to go through the long tail of publications on my bookshelf for which an audiobook edition was never released. Sounds amazing, literally!

As for setting the environment proactively: absolutely, after importing the os module with `import os`, something like `os.environ['PYTORCH_ENABLE_MPS_FALLBACK'] = '1'` will do the job.

With regard to the composability of PyTorch and deepspeed, and the resulting conditional code branches in epub2tts: today, disabling deepspeed when running on the MPS backend is necessary; as you correctly said, this can be done in the code, saving the user from the complexities of invoking the command line in the correct way. Looking ahead, however, I expect to see further progress by deepspeed in this space, and as a result its composability with PyTorch will improve on backends that today are not fully supported. Hence it is probably worth monitoring the main epub2tts dependencies, so as to adjust how their code is called. A command line option, conversely, puts a greater burden on the users, but keeps the script ready for the day when PyTorch and deepspeed compose to a fuller extent even on MPS.

If there is anything I can do to help you with this, please don't hesitate to let me know. Best!
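The in-script approach described above can be sketched as follows. Note the ordering: the assignment has to run before `torch` is first imported, since (as the next comment suggests) PyTorch may read the variable only at import time. The `import torch` line is commented out here so the sketch runs without PyTorch installed:

```python
import os

# Set the fallback flag before PyTorch is imported; assigning it after
# `import torch` may be too late, as the variable appears to be read
# during PyTorch's initialization (assumption based on the discussion).
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

# import torch  # must come AFTER the assignment above

print(os.environ["PYTORCH_ENABLE_MPS_FALLBACK"])  # → 1
```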
From my brief testing, I think the environment variable would need to be set before Python loads PyTorch. It seems like setting it in the script doesn't work.
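If that is the case, the robust workaround is to export the variable in the shell that launches the script, so it is already in the environment when the Python process (and PyTorch) starts. A minimal sketch; the `echo` is only there to confirm the value, and the actual epub2tts invocation would follow from the same shell:

```shell
# Export in the parent shell so the variable exists before Python starts;
# setting it from inside an already-running script may be too late.
export PYTORCH_ENABLE_MPS_FALLBACK=1
echo "MPS fallback flag: $PYTORCH_ENABLE_MPS_FALLBACK"
```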
More importantly though, have you seen any advantage to using MPS? I ran some relatively short tests and it was not any faster (M1 MacBook Pro) than running without it. I am definitely interested in adding support for MPS when it works fully and offers speed improvements. But so far it doesn't seem like it does; if there is a use-case where it makes a difference (maybe it's faster over a longer time frame?), please let me know. Otherwise I think it makes sense to keep tracking torch/mps progress, and we add this as a feature when all needed operators are enabled.
This branch no longer needs a nightly build; it works with the PyTorch that I get with a normal `pip install`. So I think this should be merged, optionally with `PYTORCH_ENABLE_MPS_FALLBACK=1` set as an environment variable by default.
Would definitely like to see this merged. I would suggest adding something to the README.
This functionality was actually merged in October with 14a3373b |