[CLI]add onnxruntime infer for cli #2222

Merged: 6 commits merged into PaddlePaddle:develop on Aug 5, 2022

yt605155624 (Collaborator) commented Aug 3, 2022

  1. Add a run_frontend function.

  2. Update the parameters of the get_sess function.

  3. Add a use_onnx option that controls whether inference runs through onnxruntime. Inference uses the CPU by default, because setup.py installs the CPU version of onnxruntime (the GPU version cannot be installed on Mac). cpu_threads defaults to 2.
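A minimal sketch of how `use_onnx` and `cpu_threads` could map onto an onnxruntime session, assuming CPU-only execution as described in item 3. `OnnxSessionConfig`, `resolve_session_config`, and `create_session` are hypothetical names used for illustration; this is not the `get_sess` implementation from this PR. Only `onnxruntime.SessionOptions`, its `intra_op_num_threads` attribute, and `onnxruntime.InferenceSession` are actual onnxruntime API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class OnnxSessionConfig:
    """Hypothetical container for the settings described in item 3."""
    providers: Tuple[str, ...]
    cpu_threads: int


def resolve_session_config(cpu_threads: int = 2) -> OnnxSessionConfig:
    """Return CPU-only settings; cpu_threads defaults to 2 as in the PR text."""
    if cpu_threads < 1:
        raise ValueError("cpu_threads must be a positive integer")
    return OnnxSessionConfig(
        providers=("CPUExecutionProvider",),
        cpu_threads=cpu_threads,
    )


def create_session(model_path: str, config: OnnxSessionConfig):
    """Build an onnxruntime session (illustrative, not the PR's get_sess)."""
    # Imported lazily so the config helpers work without onnxruntime installed.
    import onnxruntime as ort

    sess_options = ort.SessionOptions()
    # Limit the threads used to parallelize work inside a single operator.
    sess_options.intra_op_num_threads = config.cpu_threads
    return ort.InferenceSession(
        model_path,
        sess_options=sess_options,
        providers=list(config.providers),
    )
```

Keeping the settings in a small immutable object separates the decision (CPU provider, thread count) from session creation, so the defaults can be checked without loading a model.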

    CLI:

    paddlespeech tts --input "你好,欢迎使用百度飞桨深度学习框架!" --output default.wav --use_onnx True
    paddlespeech tts --am speedyspeech_csmsc --input "你好,欢迎使用百度飞桨深度学习框架!" --output ss.wav --use_onnx True
    paddlespeech tts --voc mb_melgan_csmsc --input "你好,欢迎使用百度飞桨深度学习框架!" --output mb.wav --use_onnx True
    paddlespeech tts --voc pwgan_csmsc --input "你好,欢迎使用百度飞桨深度学习框架!" --output pwgan.wav --use_onnx True
    paddlespeech tts --am fastspeech2_aishell3 --voc pwgan_aishell3 --input "你好,欢迎使用百度飞桨深度学习框架!" --spk_id 0 --output aishell3_fs2_pwgan.wav --use_onnx True
    paddlespeech tts --am fastspeech2_aishell3 --voc hifigan_aishell3 --input "你好,欢迎使用百度飞桨深度学习框架!" --spk_id 0 --output aishell3_fs2_hifigan.wav --use_onnx True
    paddlespeech tts --am fastspeech2_ljspeech --voc pwgan_ljspeech --lang en --input "Life was like a box of chocolates, you never know what you're gonna get." --output lj_fs2_pwgan.wav --use_onnx True
    paddlespeech tts --am fastspeech2_ljspeech --voc hifigan_ljspeech --lang en --input "Life was like a box of chocolates, you never know what you're gonna get." --output lj_fs2_hifigan.wav --use_onnx True
    paddlespeech tts --am fastspeech2_vctk --voc pwgan_vctk --input "Life was like a box of chocolates, you never know what you're gonna get." --lang en --spk_id 0 --output vctk_fs2_pwgan.wav --use_onnx True
    paddlespeech tts --am fastspeech2_vctk --voc hifigan_vctk --input "Life was like a box of chocolates, you never know what you're gonna get." --lang en --spk_id 0 --output vctk_fs2_hifigan.wav --use_onnx True
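The commands above differ only in which optional flags they pass. A small sketch of assembling such a command line programmatically follows; `build_tts_command` is a hypothetical helper, and it uses only the flags that appear in the commands above (`--am`, `--voc`, `--lang`, `--input`, `--spk_id`, `--output`, `--use_onnx`).

```python
import shlex
from typing import List, Optional


def build_tts_command(
    text: str,
    output: str,
    am: Optional[str] = None,
    voc: Optional[str] = None,
    lang: Optional[str] = None,
    spk_id: Optional[int] = None,
    use_onnx: bool = True,
) -> List[str]:
    """Build an argv list for `paddlespeech tts`; omitted options are left to the CLI defaults."""
    argv = ["paddlespeech", "tts"]
    if am is not None:
        argv += ["--am", am]
    if voc is not None:
        argv += ["--voc", voc]
    if lang is not None:
        argv += ["--lang", lang]
    argv += ["--input", text]
    if spk_id is not None:
        argv += ["--spk_id", str(spk_id)]
    argv += ["--output", output]
    if use_onnx:
        argv += ["--use_onnx", "True"]
    return argv


example = build_tts_command(
    "Life was like a box of chocolates, you never know what you're gonna get.",
    "lj_fs2_hifigan.wav",
    am="fastspeech2_ljspeech",
    voc="hifigan_ljspeech",
    lang="en",
)
# shlex.join quotes the arguments so the printed line is safe to paste into a shell.
print(shlex.join(example))
```

The returned list can be passed to `subprocess.run(argv, check=True)` directly, which avoids shell quoting issues with the Chinese and English input sentences.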

    Python API:

    from paddlespeech.cli.tts import TTSExecutor
    import time
    tts_executor = TTSExecutor()
    time_1 = time.time()
    wav_file = tts_executor(
        text='对数据集进行预处理',
        output='1.wav',
        am='fastspeech2_csmsc',
        voc='hifigan_csmsc',
        lang='zh',
        use_onnx=True,
        cpu_threads=2)
    time_2 = time.time()
    print("time of first time:", time_2-time_1)
    wav_file = tts_executor(
        text='对数据集进行预处理',
        output='2.wav',
        am='fastspeech2_csmsc',
        voc='hifigan_csmsc',
        lang='zh',
        use_onnx=True,
        cpu_threads=2)
    print("time of second time:", time.time()-time_2)
    time of first time: 14.543321371078491 (needs to download models for the first time)
    time of second time: 0.5376265048980713
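The gap between the two timings comes from one-time setup (model download and loading) on the first call. A minimal sketch of a reusable timing helper that makes this cold versus warm comparison explicit is shown here; `timed_call` and `FakeExecutor` are hypothetical, and `FakeExecutor` only simulates a slow first call so the example runs without PaddleSpeech installed.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


class FakeExecutor:
    """Stand-in for TTSExecutor: slow one-time setup, fast afterwards."""

    def __init__(self) -> None:
        self._loaded = False

    def __call__(self, text: str, output: str) -> str:
        if not self._loaded:
            time.sleep(0.05)  # simulates downloading and loading models once
            self._loaded = True
        return output


executor = FakeExecutor()
_, first = timed_call(executor, text="hello", output="1.wav")
_, second = timed_call(executor, text="hello", output="2.wav")
print(f"first call: {first:.3f}s, second call: {second:.3f}s")
```

With the real executor, `timed_call(tts_executor, text=..., output=..., use_onnx=True, cpu_threads=2)` would replace the manual `time.time()` bookkeeping.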
    

    Use specified model files:

    # use specified model files
    from paddlespeech.cli.tts import TTSExecutor
    import time
    tts_executor = TTSExecutor()
    time_3 = time.time()
    wav_file = tts_executor(
        text='对数据集进行预处理',
        output='3.wav',
        am='fastspeech2_csmsc',
        am_ckpt='./fastspeech2_csmsc_onnx_0.2.0/fastspeech2_csmsc.onnx',
        phones_dict='./fastspeech2_csmsc_onnx_0.2.0/phone_id_map.txt',
        voc='hifigan_csmsc',
        voc_ckpt='./hifigan_csmsc_onnx_0.2.0/hifigan_csmsc.onnx',
        lang='zh',
        use_onnx=True,
        cpu_threads=2)
    print("time of third time:", time.time()-time_3)
    time_4 = time.time()
    wav_file = tts_executor(
        text='对数据集进行预处理',
        output='4.wav',
        am='fastspeech2_csmsc',
        voc='hifigan_csmsc',
        lang='zh',
        use_onnx=True,
        cpu_threads=2)
    print("time of fourth time:", time.time()-time_4)
    time of third time: 8.955731391906738
    time of fourth time: 0.565178394317627
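When passing explicit model files, a missing or mistyped path is an easy mistake. The sketch below checks the paths before calling the executor; `missing_model_files` is a hypothetical helper, and the paths are the ones from the example above, assumed to be relative to the current working directory.

```python
from pathlib import Path
from typing import Dict, List


def missing_model_files(paths: Dict[str, str]) -> List[str]:
    """Return the argument names whose files do not exist."""
    return [name for name, p in paths.items() if not Path(p).is_file()]


paths = {
    "am_ckpt": "./fastspeech2_csmsc_onnx_0.2.0/fastspeech2_csmsc.onnx",
    "phones_dict": "./fastspeech2_csmsc_onnx_0.2.0/phone_id_map.txt",
    "voc_ckpt": "./hifigan_csmsc_onnx_0.2.0/hifigan_csmsc.onnx",
}

missing = missing_model_files(paths)
if missing:
    print("missing files for:", ", ".join(missing))
else:
    print("all model files found")
```

If the check passes, the same dictionary can be unpacked into the call, for example `tts_executor(text=..., output=..., am='fastspeech2_csmsc', voc='hifigan_csmsc', lang='zh', use_onnx=True, **paths)`.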
    

    Use specified model files for ljspeech:

    # NOTE: When using specified model files for ljspeech for the first time, you must set `fs` to `22050`,
    #       because the default value of `fs` in the CLI is 24000 but ljspeech's sampling rate is 22050.
    from paddlespeech.cli.tts import TTSExecutor
    import time
    tts_executor = TTSExecutor()
    time_3 = time.time()
    wav_file = tts_executor(
        text="Life was like a box of chocolates, you never know what you're gonna get.",
        output='lj_test1.wav',
        am='fastspeech2_ljspeech',
        am_ckpt='./fastspeech2_ljspeech_onnx_1.1.0/fastspeech2_ljspeech.onnx',
        phones_dict='./fastspeech2_ljspeech_onnx_1.1.0/phone_id_map.txt',
        voc='hifigan_ljspeech',
        voc_ckpt='./hifigan_ljspeech_onnx_1.1.0/hifigan_ljspeech.onnx',
        lang='en',
        use_onnx=True,
        cpu_threads=2,
        fs=22050)
    print("time of third time:", time.time()-time_3)
    time_4 = time.time()
    wav_file = tts_executor(
        text="Life was like a box of chocolates, you never know what you're gonna get.",
        output='lj_test2.wav',
        am='fastspeech2_ljspeech',
        voc='hifigan_ljspeech',
        lang='en',
        use_onnx=True,
        cpu_threads=2)
    print("time of fourth time:", time.time()-time_4)
    time of third time: 3.591158390045166
    time of fourth time: 1.7778213024139404
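To see why the `fs` note matters, here is a small worked calculation. It assumes that `fs` is the sample rate attached to the generated samples when the audio is saved or played; the two rates (24000 as the CLI default, 22050 for ljspeech) come from the note above, and the three-second clip length is an arbitrary illustration.

```python
DEFAULT_FS = 24000   # default `fs` in the CLI, per the note above
LJSPEECH_FS = 22050  # sampling rate of ljspeech, per the note above


def playback_duration(num_samples: int, fs: int) -> float:
    """Duration in seconds of num_samples played back at fs samples per second."""
    return num_samples / fs


# Three seconds of audio generated by a model trained at 22050 Hz.
num_samples = 3 * LJSPEECH_FS

correct = playback_duration(num_samples, LJSPEECH_FS)
wrong = playback_duration(num_samples, DEFAULT_FS)
speedup = DEFAULT_FS / LJSPEECH_FS

print(f"labelled 22050 Hz: {correct:.3f}s")
print(f"labelled 24000 Hz: {wrong:.3f}s (about {100 * (speedup - 1):.1f}% faster)")
```

Under that assumption, the same samples labelled with the wrong rate play back roughly 9% faster and at a correspondingly higher pitch, which is why `fs=22050` is passed explicitly in the ljspeech example.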
    

yt605155624 added this to the r1.1.0 milestone Aug 3, 2022
yt605155624 requested a review from lym0302 August 3, 2022 12:17
yt605155624 self-assigned this Aug 3, 2022
mergify bot added the Server label Aug 4, 2022
yt605155624 merged commit 2f9bdf2 into PaddlePaddle:develop Aug 5, 2022
yt605155624 deleted the add_onnx_cli branch September 8, 2022 11:35