feat: support macOS with Apple Silicon (#155)
* feat: macOS support (#143)

* Support for running on Apple Silicon Macs with MPS

* Minor typo fix: s/provicer/provider/

* Another typo fix: s/concact/concat/

* s/cudaexecutionprovider/CUDAExecutionProvider/

* Add requirements_apple.txt

* doc: macOS support

* chore: refine the structure and doc

* doc: update readme

* doc: update readme

* doc: update readme

* doc: update readme

---------

Co-authored-by: Jeethu Rao <[email protected]>
Co-authored-by: zzzweakman <[email protected]>
3 people authored Jul 17, 2024
1 parent 54e5098 commit 0f83984
Showing 13 changed files with 103 additions and 56 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -11,6 +11,7 @@ __pycache__/

pretrained_weights/*.md
pretrained_weights/docs
+pretrained_weights/liveportrait

# Ipython notebook
*.ipynb
@@ -19,3 +20,4 @@ pretrained_weights/docs
animations/*
tmp/*
.vscode/launch.json
+**/*.DS_Store
4 changes: 2 additions & 2 deletions inference.py
@@ -42,8 +42,8 @@ def main():
    fast_check_args(args)

    # specify configs for inference
-    inference_cfg = partial_fields(InferenceConfig, args.__dict__)  # use attribute of args to initial InferenceConfig
-    crop_cfg = partial_fields(CropConfig, args.__dict__)  # use attribute of args to initial CropConfig
+    inference_cfg = partial_fields(InferenceConfig, args.__dict__)
+    crop_cfg = partial_fields(CropConfig, args.__dict__)

    live_portrait_pipeline = LivePortraitPipeline(
        inference_cfg=inference_cfg,
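The two `partial_fields` calls above lost their trailing comments in this commit; the helper filters the argparse namespace down to the fields each config class accepts. Its implementation is not shown in this diff — a minimal sketch of what such a helper presumably looks like, assuming the configs are dataclasses:

```python
from dataclasses import fields

def partial_fields(target_class, kwargs: dict):
    # Hypothetical sketch: keep only the keys that are declared fields of the
    # target dataclass, then construct it; extra argparse attributes are dropped.
    names = {f.name for f in fields(target_class)}
    return target_class(**{k: v for k, v in kwargs.items() if k in names})
```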
27 changes: 21 additions & 6 deletions readme.md
@@ -35,6 +35,7 @@


## 🔥 Updates
+- **`2024/07/17`**: 🍎 We support macOS with Apple Silicon, modified from [jeethu](https://github.com/jeethu)'s PR [#143](https://github.com/KwaiVGI/LivePortrait/pull/143).
- **`2024/07/10`**: 💪 We support audio and video concatenating, driving video auto-cropping, and template making to protect privacy. See more [here](assets/docs/changelog/2024-07-10.md).
- **`2024/07/09`**: 🤗 We released the [HuggingFace Space](https://huggingface.co/spaces/KwaiVGI/liveportrait), thanks to the HF team and [Gradio](https://github.com/gradio-app/gradio)!
- **`2024/07/04`**: 😊 We released the initial version of the inference code and models. Continuous updates, stay tuned!
@@ -55,20 +56,25 @@ cd LivePortrait
# create env using conda
conda create -n LivePortrait python==3.9.18
conda activate LivePortrait
-# install dependencies with pip
+
+# install dependencies with pip (for Linux and Windows)
pip install -r requirements.txt
+# for macOS with Apple Silicon
+pip install -r requirements_macOS.txt
```

-**Note:** make sure your system has [FFmpeg](https://ffmpeg.org/) installed!
+**Note:** make sure your system has [FFmpeg](https://ffmpeg.org/download.html) installed, including both `ffmpeg` and `ffprobe`!
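A quick way to confirm both binaries are visible to Python (an editor's convenience check, not part of the repo):

```python
import shutil

# Both tools must be on PATH; the audio-probing code relies on ffprobe specifically.
for tool in ("ffmpeg", "ffprobe"):
    print(f"{tool}: {shutil.which(tool) or 'NOT FOUND'}")
```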

### 2. Download pretrained weights

The easiest way to download the pretrained weights is from HuggingFace:
```bash
# first, ensure git-lfs is installed, see: https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage
git lfs install
-# clone the weights
-git clone https://huggingface.co/KwaiVGI/liveportrait pretrained_weights
+# clone and move the weights
+git clone https://huggingface.co/KwaiVGI/liveportrait temp_pretrained_weights
+mv temp_pretrained_weights/* pretrained_weights/
+rm -rf temp_pretrained_weights
```

Alternatively, you can download all pretrained weights from [Google Drive](https://drive.google.com/drive/folders/1UtKgzKjFAOmZkhNK-OYT0caJ_w2XAnib) or [Baidu Yun](https://pan.baidu.com/s/1MGctWmNla_vZxDbEp2Dtzw?pwd=z5cn). Unzip and place them in `./pretrained_weights`.
@@ -96,7 +102,11 @@ pretrained_weights

#### Fast hands-on
```bash
+# For Linux and Windows
python inference.py
+
+# For macOS with Apple Silicon (Intel is not supported); this may be ~20x slower than an RTX 4090
+PYTORCH_ENABLE_MPS_FALLBACK=1 python inference.py
```
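`PYTORCH_ENABLE_MPS_FALLBACK=1` makes PyTorch run operators that are not yet implemented on MPS on the CPU instead of raising an error. To check whether your PyTorch build can see the MPS backend at all (standard PyTorch API):

```python
import torch

# Both should print True on an Apple Silicon Mac with a recent PyTorch build.
print("MPS built:    ", torch.backends.mps.is_built())
print("MPS available:", torch.backends.mps.is_available())
```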

If the script runs successfully, you will get an output mp4 file named `animations/s6--d0_concat.mp4`. This file includes the following results: driving video, input image, and generated result.
@@ -145,7 +155,11 @@ python inference.py -s assets/examples/source/s9.jpg -d assets/examples/driving/
We also provide a Gradio <a href='https://github.com/gradio-app/gradio'><img src='https://img.shields.io/github/stars/gradio-app/gradio'></a> interface for a better experience; just run:

```bash
+# For Linux and Windows:
python app.py
+
+# For macOS with Apple Silicon (Intel is not supported); this may be ~20x slower than an RTX 4090
+PYTORCH_ENABLE_MPS_FALLBACK=1 python app.py
```

You can specify the `--server_port`, `--share`, `--server_name` arguments to satisfy your needs!
@@ -155,14 +169,15 @@ You can specify the `--server_port`, `--share`, `--server_name` arguments to satisfy your needs!
# enable torch.compile for faster inference
python app.py --flag_do_torch_compile
```
-**Note**: This method has not been fully tested. e.g., on Windows.
+**Note**: This method is not supported on Windows and macOS.

**Or, try it out effortlessly on [HuggingFace](https://huggingface.co/spaces/KwaiVGI/LivePortrait) 🤗**

### 5. Inference speed evaluation 🚀🚀🚀
We have also provided a script to evaluate the inference speed of each module:

```bash
+# For NVIDIA GPU
python speed.py
```

@@ -184,9 +199,9 @@ Discover the invaluable resources contributed by our community to enhance your L

- [ComfyUI-LivePortraitKJ](https://github.com/kijai/ComfyUI-LivePortraitKJ) by [@kijai](https://github.com/kijai)
- [comfyui-liveportrait](https://github.com/shadowcz007/comfyui-liveportrait) by [@shadowcz007](https://github.com/shadowcz007)
-- [LivePortrait In ComfyUI](https://www.youtube.com/watch?v=aFcS31OWMjE) by [@Benji](https://www.youtube.com/@TheFutureThinker)
- [LivePortrait hands-on tutorial](https://www.youtube.com/watch?v=uyjSTAOY7yI) by [@AI Search](https://www.youtube.com/@theAIsearch)
- [ComfyUI tutorial](https://www.youtube.com/watch?v=8-IcDDmiUMM) by [@Sebastian Kamph](https://www.youtube.com/@sebastiankamph)
+- [LivePortrait In ComfyUI](https://www.youtube.com/watch?v=aFcS31OWMjE) by [@Benji](https://www.youtube.com/@TheFutureThinker)
- [Replicate Playground](https://replicate.com/fofr/live-portrait) and [cog-comfyui](https://github.com/fofr/cog-comfyui) by [@fofr](https://github.com/fofr)

And many more amazing contributions from our community!
22 changes: 1 addition & 21 deletions requirements.txt
@@ -1,22 +1,2 @@
---extra-index-url https://download.pytorch.org/whl/cu118
-torch==2.3.0
-torchvision==0.18.0
-torchaudio==2.3.0
-
-numpy==1.26.4
-pyyaml==6.0.1
-opencv-python==4.10.0.84
-scipy==1.13.1
-imageio==2.34.2
-lmdb==1.4.1
-tqdm==4.66.4
-rich==13.7.1
-ffmpeg-python==0.2.0
+-r requirements_base.txt
onnxruntime-gpu==1.18.0
-onnx==1.16.1
-scikit-image==0.24.0
-albumentations==1.4.10
-matplotlib==3.9.0
-imageio-ffmpeg==0.5.1
-tyro==0.8.5
-gradio==4.37.1
21 changes: 21 additions & 0 deletions requirements_base.txt
@@ -0,0 +1,21 @@
+--extra-index-url https://download.pytorch.org/whl/cu118
+torch==2.3.0
+torchvision==0.18.0
+torchaudio==2.3.0
+
+numpy==1.26.4
+pyyaml==6.0.1
+opencv-python==4.10.0.84
+scipy==1.13.1
+imageio==2.34.2
+lmdb==1.4.1
+tqdm==4.66.4
+rich==13.7.1
+ffmpeg-python==0.2.0
+onnx==1.16.1
+scikit-image==0.24.0
+albumentations==1.4.10
+matplotlib==3.9.0
+imageio-ffmpeg==0.5.1
+tyro==0.8.5
+gradio==4.37.1
2 changes: 2 additions & 0 deletions requirements_macOS.txt
@@ -0,0 +1,2 @@
+-r requirements_base.txt
+onnxruntime-silicon==1.16.3
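The macOS requirements swap `onnxruntime-gpu` for `onnxruntime-silicon`, which ships the `CoreMLExecutionProvider` used by the landmark runner. To verify which providers your install actually exposes (standard onnxruntime API):

```python
import onnxruntime as ort

# Expect CoreMLExecutionProvider here with onnxruntime-silicon on macOS,
# and CUDAExecutionProvider with onnxruntime-gpu on Linux/Windows.
print(ort.get_available_providers())
```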
6 changes: 3 additions & 3 deletions src/live_portrait_pipeline.py
@@ -216,14 +216,14 @@ def execute(self, args: ArgumentConfig):
        wfp_concat = None
        flag_has_audio = (not flag_load_from_template) and has_audio_stream(args.driving_info)

-        ######### build final concact result #########
+        ######### build final concat result #########
        # driving frame | source image | generation, or source image | generation
        frames_concatenated = concat_frames(driving_rgb_crop_256x256_lst, img_crop_256x256, I_p_lst)
        wfp_concat = osp.join(args.output_dir, f'{basename(args.source_image)}--{basename(args.driving_info)}_concat.mp4')
        images2video(frames_concatenated, wfp=wfp_concat, fps=output_fps)

        if flag_has_audio:
-            # final result with concact
+            # final result with concat
            wfp_concat_with_audio = osp.join(args.output_dir, f'{basename(args.source_image)}--{basename(args.driving_info)}_concat_with_audio.mp4')
            add_audio_to_video(wfp_concat, args.driving_info, wfp_concat_with_audio)
            os.replace(wfp_concat_with_audio, wfp_concat)
@@ -247,7 +247,7 @@ def execute(self, args: ArgumentConfig):
        if wfp_template not in (None, ''):
            log(f'Animated template: {wfp_template}, you can specify `-d` argument with this template path next time to avoid cropping video, motion making and protecting privacy.', style='bold green')
        log(f'Animated video: {wfp}')
-        log(f'Animated video with concact: {wfp_concat}')
+        log(f'Animated video with concat: {wfp_concat}')

        return wfp, wfp_concat

41 changes: 25 additions & 16 deletions src/live_portrait_wrapper.py
@@ -4,6 +4,7 @@
Wrapper for LivePortrait core functions
"""

+import contextlib
import os.path as osp
import numpy as np
import cv2
@@ -28,7 +29,10 @@ def __init__(self, inference_cfg: InferenceConfig):
        if inference_cfg.flag_force_cpu:
            self.device = 'cpu'
        else:
-            self.device = 'cuda:' + str(self.device_id)
+            if torch.backends.mps.is_available():
+                self.device = 'mps'
+            else:
+                self.device = 'cuda:' + str(self.device_id)

        model_config = yaml.load(open(inference_cfg.models_config, 'r'), Loader=yaml.SafeLoader)
        # init F
@@ -57,6 +61,14 @@ def __init__(self, inference_cfg: InferenceConfig):

        self.timer = Timer()

+    def inference_ctx(self):
+        if self.device == "mps":
+            ctx = contextlib.nullcontext()
+        else:
+            ctx = torch.autocast(device_type=self.device[:4], dtype=torch.float16,
+                                 enabled=self.inference_cfg.flag_use_half_precision)
+        return ctx
+
    def update_config(self, user_args):
        for k, v in user_args.items():
            if hasattr(self.inference_cfg, k):
@@ -105,9 +117,8 @@ def extract_feature_3d(self, x: torch.Tensor) -> torch.Tensor:
""" get the appearance feature of the image by F
x: Bx3xHxW, normalized to 0~1
"""
with torch.no_grad():
with torch.autocast(device_type=self.device[:4], dtype=torch.float16, enabled=self.inference_cfg.flag_use_half_precision):
feature_3d = self.appearance_feature_extractor(x)
with torch.no_grad(), self.inference_ctx():
feature_3d = self.appearance_feature_extractor(x)

return feature_3d.float()

@@ -117,9 +128,8 @@ def get_kp_info(self, x: torch.Tensor, **kwargs) -> dict:
        flag_refine_info: whether to transform the pose to degrees and the dimension of the reshape
        return: A dict containing keys: 'pitch', 'yaw', 'roll', 't', 'exp', 'scale', 'kp'
        """
-        with torch.no_grad():
-            with torch.autocast(device_type=self.device[:4], dtype=torch.float16, enabled=self.inference_cfg.flag_use_half_precision):
-                kp_info = self.motion_extractor(x)
+        with torch.no_grad(), self.inference_ctx():
+            kp_info = self.motion_extractor(x)

        if self.inference_cfg.flag_use_half_precision:
            # float the dict
@@ -264,15 +274,14 @@ def warp_decode(self, feature_3d: torch.Tensor, kp_source: torch.Tensor, kp_driv
        kp_driving: BxNx3
        """
        # Line 18 in Algorithm 1: D(W(f_s; x_s, x′_d,i))
-        with torch.no_grad():
-            with torch.autocast(device_type=self.device[:4], dtype=torch.float16, enabled=self.inference_cfg.flag_use_half_precision):
-                if self.compile:
-                    # Mark the beginning of a new CUDA Graph step
-                    torch.compiler.cudagraph_mark_step_begin()
-                # get decoder input
-                ret_dct = self.warping_module(feature_3d, kp_source=kp_source, kp_driving=kp_driving)
-                # decode
-                ret_dct['out'] = self.spade_generator(feature=ret_dct['out'])
+        with torch.no_grad(), self.inference_ctx():
+            if self.compile:
+                # Mark the beginning of a new CUDA Graph step
+                torch.compiler.cudagraph_mark_step_begin()
+            # get decoder input
+            ret_dct = self.warping_module(feature_3d, kp_source=kp_source, kp_driving=kp_driving)
+            # decode
+            ret_dct['out'] = self.spade_generator(feature=ret_dct['out'])

        # float the dict
        if self.inference_cfg.flag_use_half_precision:
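A note on the `inference_ctx` helper added above — the reason is not stated in the diff, but presumably `torch.autocast` does not accept `device_type="mps"` in this PyTorch version, so MPS runs in full precision under a no-op context while CUDA and CPU keep float16 autocast. A self-contained sketch of the same pattern:

```python
import contextlib
import torch

def autocast_or_noop(device: str, enabled: bool = True):
    # Assumption: autocast is unavailable for the MPS backend here, so Apple
    # Silicon gets a no-op context and every call site stays device-agnostic.
    if device.startswith("mps"):
        return contextlib.nullcontext()
    return torch.autocast(device_type=device.split(":")[0],
                          dtype=torch.float16, enabled=enabled)
```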
2 changes: 1 addition & 1 deletion src/modules/dense_motion.py
@@ -59,7 +59,7 @@ def create_heatmap_representations(self, feature, kp_driving, kp_source):
        heatmap = gaussian_driving - gaussian_source  # (bs, num_kp, d, h, w)

        # adding background feature
-        zeros = torch.zeros(heatmap.shape[0], 1, spatial_size[0], spatial_size[1], spatial_size[2]).type(heatmap.type()).to(heatmap.device)
+        zeros = torch.zeros(heatmap.shape[0], 1, spatial_size[0], spatial_size[1], spatial_size[2]).type(heatmap.dtype).to(heatmap.device)
        heatmap = torch.cat([zeros, heatmap], dim=1)
        heatmap = heatmap.unsqueeze(2)  # (bs, 1+num_kp, 1, d, h, w)
        return heatmap
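The one-line change above matters on Apple Silicon: `Tensor.type()` returns a legacy type string such as `'torch.cuda.FloatTensor'`, which has no MPS equivalent, while passing the `dtype` object works on every backend. A sketch of the more direct idiom (hypothetical shapes) that allocates with `dtype` and `device` up front:

```python
import torch

heatmap = torch.randn(2, 15, 16, 64, 64)  # hypothetical (bs, num_kp, d, h, w)
# Allocating with explicit dtype/device avoids legacy string tensor types
# entirely and works identically on CPU, CUDA, and MPS.
zeros = torch.zeros(heatmap.shape[0], 1, *heatmap.shape[2:],
                    dtype=heatmap.dtype, device=heatmap.device)
heatmap = torch.cat([zeros, heatmap], dim=1)
```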
15 changes: 11 additions & 4 deletions src/utils/cropper.py
@@ -6,6 +6,7 @@

import cv2; cv2.setNumThreads(0); cv2.ocl.setUseOpenCL(False)
import numpy as np
+import torch

from ..config.crop_config import CropConfig
from .crop import (
@@ -43,10 +44,16 @@ def __init__(self, **kwargs) -> None:
        flag_force_cpu = kwargs.get("flag_force_cpu", False)
        if flag_force_cpu:
            device = "cpu"
-            face_analysis_wrapper_provicer = ["CPUExecutionProvider"]
+            face_analysis_wrapper_provider = ["CPUExecutionProvider"]
        else:
-            device = "cuda"
-            face_analysis_wrapper_provicer = ["CUDAExecutionProvider"]
+            if torch.backends.mps.is_available():
+                # Shape inference currently fails with CoreMLExecutionProvider
+                # for the retinaface model
+                device = "mps"
+                face_analysis_wrapper_provider = ["CPUExecutionProvider"]
+            else:
+                device = "cuda"
+                face_analysis_wrapper_provider = ["CUDAExecutionProvider"]
        self.landmark_runner = LandmarkRunner(
            ckpt_path=make_abs_path(self.crop_cfg.landmark_ckpt_path),
            onnx_provider=device,
@@ -57,7 +64,7 @@ def __init__(self, **kwargs) -> None:
        self.face_analysis_wrapper = FaceAnalysisDIY(
            name="buffalo_l",
            root=make_abs_path(self.crop_cfg.insightface_root),
-            providers=face_analysis_wrapper_provicer,
+            providers=face_analysis_wrapper_provider,
        )
        self.face_analysis_wrapper.prepare(ctx_id=device_id, det_size=(512, 512))
        self.face_analysis_wrapper.warmup()
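ONNX Runtime tries the `providers` list in order and falls back to the next entry it can load, which is why the CoreML comment above matters: a provider that loads but then fails shape inference has to be avoided explicitly rather than relying on fallback. A minimal sketch of session creation (hypothetical model path):

```python
import onnxruntime

# Providers are tried left to right; CPUExecutionProvider is the safe last resort.
session = onnxruntime.InferenceSession(
    "model.onnx",  # hypothetical path
    providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],
)
print(session.get_providers())  # the providers that were actually registered
```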
2 changes: 1 addition & 1 deletion src/utils/dependencies/insightface/model_zoo/model_zoo.py
@@ -68,7 +68,7 @@ def find_onnx_file(dir_path):
    return paths[-1]

def get_default_providers():
-    return ['CUDAExecutionProvider', 'CPUExecutionProvider']
+    return ['CUDAExecutionProvider', 'CoreMLExecutionProvider', 'CPUExecutionProvider']

def get_default_provider_options():
    return None
6 changes: 6 additions & 0 deletions src/utils/landmark_runner.py
@@ -39,6 +39,12 @@ def __init__(self, **kwargs):
                ('CUDAExecutionProvider', {'device_id': device_id})
            ]
        )
+        elif onnx_provider.lower() == 'mps':
+            self.session = onnxruntime.InferenceSession(
+                ckpt_path, providers=[
+                    'CoreMLExecutionProvider'
+                ]
+            )
        else:
            opts = onnxruntime.SessionOptions()
            opts.intra_op_num_threads = 4  # default number of threads is 4
9 changes: 7 additions & 2 deletions src/utils/video.py
@@ -175,8 +175,13 @@ def has_audio_stream(video_path: str) -> bool:
        # Check if there is any output from ffprobe command
        return bool(result.stdout.strip())
    except Exception as e:
-        log(f"Error occurred while probing video: {video_path}, you may need to install ffprobe! Now set audio to false!", style="bold red")
-        return False
+        log(
+            f"Error occurred while probing video: {video_path}, "
+            "you may need to install ffprobe! (https://ffmpeg.org/download.html) "
+            "Now set audio to false!",
+            style="bold red"
+        )
+        return False
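For reference, a sketch of the kind of ffprobe invocation `has_audio_stream` wraps — an assumption, since the actual command sits outside this hunk; any output from the audio-stream query means the file has audio:

```python
import subprocess

def probe_has_audio(video_path: str) -> bool:
    # List audio streams only; empty stdout means there is no audio stream.
    cmd = ["ffprobe", "-v", "error", "-select_streams", "a",
           "-show_entries", "stream=codec_type", "-of", "csv=p=0", video_path]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return bool(result.stdout.strip())
```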


def add_audio_to_video(silent_video_path: str, audio_video_path: str, output_video_path: str):
