Feature Request: Change test for OpenAI-compatible endpoints #440
The voices endpoint appears to be non-standard; I don't see it in the OpenAI API documentation. If we switch to that, I'm afraid it will break functionality for existing users. But we can allow the user to specify the URL of the voices endpoint, which could even be a static file. This endpoint would be used for validation as well as to retrieve the voice list and the associated model. The output would need to be standardized, like: [{
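For illustration, a standardized voice-list payload could look something like the sketch below. The field names (`voiceName`, `modelId`, `lang`) are assumptions for illustration only; the thread never settles on an actual schema.

```typescript
// A minimal sketch of a standardized voice-list entry, assuming hypothetical
// field names. Each entry pairs a voice with the model it should be used with.
interface VoiceEntry {
  voiceName: string; // value passed as the "voice" parameter of the speech request
  modelId: string;   // model to request alongside this voice
  lang?: string;     // optional language tag for filtering in the voice picker
}

const exampleVoiceList: VoiceEntry[] = [
  { voiceName: "af_bella", modelId: "kokoro", lang: "en-US" },
  { voiceName: "bm_george", modelId: "kokoro", lang: "en-GB" },
];
```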
I see, that makes sense. For this to work, though, you'd also have to specify the model manually, so you might need to add a new entry for this specific use case. I made another issue on the other TTS project, but I'm not sure if that'll end up getting supported. To avoid breaking anything else, maybe adding the extra entry for a manual model could work as an optional parameter? The downside is that it would break the check, since you currently use the models endpoint to test the connection to the API. So on top of that, maybe adding another parameter that lets you skip the connection check would make this work too? Otherwise this model uses all of the same OpenAI endpoints for the actual audio streaming.
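As a rough illustration of this proposal, the optional parameters might look like the following sketch. The option names are hypothetical and are not what the extension actually uses.

```typescript
// Sketch of the optional settings proposed in this comment, assuming a
// hypothetical per-endpoint configuration object (names are illustrative only).
interface CustomOpenAiTtsConfig {
  apiUrl: string;                // base URL of the OpenAI-compatible server
  apiKey?: string;               // may be left blank for local servers
  model?: string;                // manually specified model, e.g. "kokoro"
  skipConnectionCheck?: boolean; // bypass the models-endpoint probe when true
}
```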
@ken107 The developer of kokoro-fastapi actually just added the models endpoint! However, I noticed a new issue: the voice list in Read Aloud doesn't seem to recognize the proper list. Is the voice list hardcoded for the OpenAI endpoint? That's what it seems like at the moment, because some of the voices work and others just outright break. But I do have something working now. Right now the endpoint outputs this:
The kokoro model is especially important for this; however, I'm not sure how Read Aloud polls for voices. Does it use the models list for that? If so, the developer seems receptive to modifying the output further.
Since there's no standard API for listing voices, a simple solution is to let the user edit the list of voices. Currently it's hardcoded to the OpenAI voice list. If a user is savvy enough to run Kokoro locally, they should be able to modify a simple JSON document, I presume.
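Concretely, the editable list could be as simple as an array of voice names. This is only a sketch; the storage format inside the extension isn't specified in this thread, and the names shown are just examples of Kokoro voices, which may differ by model version.

```typescript
// A sketch of the kind of user-editable voice list described above, replacing
// the default OpenAI voices with Kokoro voice names (example names only).
const customVoiceList: string[] = [
  "af_bella",
  "af_sky",
  "am_adam",
  "bf_emma",
  "bm_george",
];
```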
Yeah, I think that's fair enough. If implemented that way, where would that JSON document be accessible though?
Could you test this version to see if it works for you? You should be able to set the API URL (to localhost:8880 for kokoro-fastapi), leave the API key blank, and edit the voice list to match Kokoro's voice list.
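As a rough way to confirm the local server is reachable before pointing the extension at it, one could call the OpenAI-compatible speech endpoint directly. This is a sketch only; the request fields follow OpenAI's /v1/audio/speech schema, and the model and voice names are examples that may differ for a particular install.

```typescript
// Sanity check (sketch): ask the local kokoro-fastapi server for a short audio
// clip via its OpenAI-compatible speech endpoint at localhost:8880.
async function checkLocalTts(): Promise<void> {
  const response = await fetch("http://localhost:8880/v1/audio/speech", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "kokoro",          // example model name
      voice: "af_bella",        // example Kokoro voice
      input: "Testing the local OpenAI-compatible endpoint.",
      response_format: "mp3",
    }),
  });
  if (!response.ok) throw new Error(`TTS request failed: ${response.status}`);
  const audio = await response.arrayBuffer();
  console.log(`Received ${audio.byteLength} bytes of audio`);
}

checkLocalTts().catch(console.error);
```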
Tested on Chrome and it works great! As for Firefox, I couldn't get it to build for some reason. Probably just me not knowing how to do it right. But the foundation is totally there; it works great! (video attachment: 2025-02-13.08-09-19.mp4)
Great! Firefox is a different branch with separate code. Once this is checked in, I'll merge it to the FF branch. Thanks for testing. |
Some models like https://github.com/remsky/Kokoro-FastAPI don't support the models endpoint. They do, however, respond on the voices endpoint. If the test could be done directly against that, and the API allowed manually specifying the model, this would make it possible to use this extremely high-quality new TTS model.