oxygen-dioxide edited this page Feb 13, 2023 · 5 revisions

OpenUtau ENUNU X Phonemizer Guide

About

ENUNU X Phonemizer for OpenUtau is an ONNX-based implementation of the ENUNU phonemizer. It was developed to provide a usable multi-language phonemizer for upcoming AI renderers and their developers, and to remove the dependency on Python.

About the name: "X" means onnX, eXtensible and eXchange.

Planned features:

  • ENUNU timelag and duration inference in ONNX
  • Built-in timing corrector (functioning like https://github.com/UtaUtaUtau/enunu_timing_corrector)
  • Multi-syllable language support
  • Phoneme redirection and combination. Some singing synthesizers use a different phoneme dictionary than ENUNU's. Voicebank developers can define a dictionary that redirects the phonemes output by the ENUNU timing model to the destination renderer's format.

The C# code of this phonemizer is being developed in this branch: https://github.com/oxygen-dioxide/OpenUtau/tree/nnsvs-onnx

If you run into a bug, feel free to report it. Please include a full screenshot of your OpenUtau window, your ustx file and your OpenUtau log file.

Usage

ENUNU X Phonemizer has not yet been merged into the official OpenUtau releases. For now, you can use it in OpenUtau for Diffsinger.

ENUNU X EN is a phonemizer for English AI voicebanks, compatible with EN VCCV.

Known issues and limitations

  • Consonants at the tail of a long note may be incorrectly elongated, especially when the syllable contains slur notes.

Voicebank Development

If you want to make a voicebank that supports ENUNU X Phonemizer, you need to do two things:

  1. Convert your voicebank
  2. Write enunux.yaml

Convert Your Voicebank

The Python scripts here use the same dependencies as ENUNU. If you already have ENUNU or ENUNU Server for OpenUtau installed, all the dependencies are already in place.

Download export_onnx.py and export_onnx.bat. Put them in the folder where ENUNU is installed (the same folder as enunu.py).

Run export_onnx.bat. Enter the location of your voicebank (the folder containing enuconfig.yaml) and press Enter.

Write enunux.yaml

enunux.yaml is the configuration file required by ENUNU X Phonemizer, similar to arpasing.yaml. It should be located in the same folder as enuconfig.yaml. You can copy enunux.yaml from another voicebank that sings in the same language.
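
As a quick sanity check of the layout described above, the following minimal Python sketch reports which of the two configuration files are missing from a voicebank folder. The function name `missing_config_files` is hypothetical and not part of ENUNU or OpenUtau. It only checks that the files exist, not that their contents are valid.

```python
from pathlib import Path


def missing_config_files(voicebank_dir):
    """Return the names of expected config files that are absent.

    Hypothetical helper: it only checks that enuconfig.yaml and
    enunux.yaml sit side by side in the given folder.
    """
    folder = Path(voicebank_dir)
    required = ("enuconfig.yaml", "enunux.yaml")
    return [name for name in required if not (folder / name).is_file()]
```

An empty list means both files were found in the folder.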

enunux.yaml should contain:

  • symbols: phoneme definitions. Should contain all the phonemes supported by this voicebank.
  • entries: grapheme-to-phoneme dictionary. If your voicebank already has an ENUNU dictionary, you can leave this part empty.
  • redirections: the differences between ENUNU's phoneme definitions and the renderer's phoneme definitions. For example, the Diffsinger equivalents of ENUNU's pau and br are SP and AP. Many-to-one combination is supported.
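
To illustrate how the three parts relate, here is a minimal Python sketch that looks a word up in the entries and then applies the redirections, including a many-to-one combination. The data layout, the sample word, the `k` + `w` rule and the function names are all assumptions made for illustration. This is not the actual enunux.yaml schema, nor the phonemizer's C# implementation. Only the pau to SP and br to AP mappings come from the text above.

```python
# Hypothetical in-memory form of the three parts of enunux.yaml.
# The structure is an assumption for illustration, not the real schema.
SYMBOLS = {"pau", "br", "hh", "ah", "l", "ow", "k", "w", "aa"}
ENTRIES = {"hello": ["hh", "ah", "l", "ow"]}
# Each rule maps a run of ENUNU phonemes to one renderer phoneme.
REDIRECTIONS = [
    (("pau",), "SP"),    # from the guide: pau -> SP
    (("br",), "AP"),     # from the guide: br -> AP
    (("k", "w"), "kw"),  # hypothetical many-to-one combination
]


def undefined_phonemes():
    """Phonemes used in ENTRIES that are missing from SYMBOLS."""
    used = {p for phonemes in ENTRIES.values() for p in phonemes}
    return used - SYMBOLS


def redirect(phonemes):
    """Rewrite ENUNU phonemes into the renderer's phonemes.

    Longer rules are tried first so that a combination wins over a
    single-phoneme rule; unmatched phonemes pass through unchanged.
    """
    rules = sorted(REDIRECTIONS, key=lambda rule: len(rule[0]), reverse=True)
    result = []
    i = 0
    while i < len(phonemes):
        for source, target in rules:
            if tuple(phonemes[i:i + len(source)]) == source:
                result.append(target)
                i += len(source)
                break
        else:
            result.append(phonemes[i])
            i += 1
    return result


def phonemize(word):
    """Look a word up in ENTRIES, then apply the redirections."""
    return redirect(ENTRIES[word.lower()])
```

With these sample rules, `redirect(["pau", "k", "w", "aa", "br"])` yields `["SP", "kw", "aa", "AP"]`.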

Here are some examples:

Using this phonemizer with a non-ENUNU voicebank

To use this phonemizer with a non-ENUNU voicebank, you first need an ENUNU voicebank that has been converted using the method above.

Copy the entire folder of your ENUNU voicebank into the destination voicebank as a subfolder, and rename it to "enunux". You can delete all the .pth and .joblib files to reduce its size.
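
The copy step can also be scripted. The minimal sketch below uses only the Python standard library. The function name `embed_enunu_voicebank` is hypothetical, and the sketch assumes the ENUNU voicebank has already been converted with export_onnx.bat. It copies the folder to an "enunux" subfolder and skips the .pth and .joblib files during the copy instead of deleting them afterwards.

```python
import shutil
from pathlib import Path


def embed_enunu_voicebank(enunu_dir, target_voicebank_dir):
    """Copy an ENUNU voicebank into <target>/enunux.

    Hypothetical helper: .pth and .joblib files are left out of the
    copy to reduce size. Raises FileExistsError if the "enunux"
    subfolder already exists.
    """
    destination = Path(target_voicebank_dir) / "enunux"
    shutil.copytree(
        enunu_dir,
        destination,
        ignore=shutil.ignore_patterns("*.pth", "*.joblib"),
    )
    return destination
```

The original ENUNU voicebank folder is left untouched, so it can still be used with ENUNU itself.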