Feature Request: Change test for openAI compatible endpoints #440

luckycold · 2025-02-09T23:22:38Z

Some servers, such as https://github.com/remsky/Kokoro-FastAPI, don't support the models endpoint. They do, however, respond on the voices endpoint. If the test could be run directly against that endpoint, and the API settings allowed manually specifying the model, this would make it possible to use this extremely high-quality new TTS model.

ken107 · 2025-02-10T01:05:44Z

The voices endpoint appears to be non-standard. I don't see it in the OpenAI API documentation. If we switch to that, I'm afraid it will break functionality for existing users.

But we can allow the user to specify the URL of the voices endpoint, which could even be a static file. This endpoint would be used for validation, as well as to retrieve the voice list and associated models. The output needs to be standardized, like:

[{
voice: "af_bella",
model: "kokoro"
}, ...]
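
To make the proposed format concrete, here is a minimal sketch of how such a voice list could be validated once fetched from a URL or static file. It is not read-aloud's actual code: the function name `parse_voice_list` is hypothetical, and the sketch assumes the list is served as strict JSON (quoted keys) with exactly the `voice` and `model` fields shown above.

```python
import json


def parse_voice_list(text):
    """Parse a JSON voice list and return (voice, model) pairs.

    Hypothetical helper: assumes the format proposed above, i.e. a JSON
    array of objects with string "voice" and "model" fields.
    """
    entries = json.loads(text)
    if not isinstance(entries, list):
        raise ValueError("voice list must be a JSON array")
    voices = []
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {i} is not an object")
        voice, model = entry.get("voice"), entry.get("model")
        if not isinstance(voice, str) or not isinstance(model, str):
            raise ValueError(f"entry {i} needs string 'voice' and 'model' fields")
        voices.append((voice, model))
    return voices


sample = '[{"voice": "af_bella", "model": "kokoro"}]'
print(parse_voice_list(sample))  # [('af_bella', 'kokoro')]
```

A successful parse could double as the validation step, since a malformed or unreachable list would fail before any audio request is made.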

luckycold · 2025-02-10T01:24:52Z

I see. That makes sense. For this to work, though, you'd also have to specify the model manually, so you might need to add a new setting for this specific use case. I opened another issue on the other TTS project, but I'm not sure whether it will end up being supported. To avoid breaking anything else, maybe the manual model could be added as an optional parameter? The downside is that this would break the check, since you currently use the models endpoint to test the connection to the API. So on top of that, maybe another parameter that lets you skip the connection check would make this work too? Otherwise, this server uses all of the same OpenAI endpoints for the actual audio streaming.

luckycold · 2025-02-11T17:32:47Z

@ken107 The developer of kokoro-fastapi actually just added the models endpoint! However, I noticed a new issue: the voice list in read-aloud doesn't seem to pick up the proper list. Is the voice list hardcoded for the OpenAI endpoint? That's what it seems like at the moment, because some of the voices work and others just outright break. But I do have something working now.

Right now the endpoint outputs this:

{
  "object": "list",
  "data": [
    {
      "id": "tts-1",
      "object": "model",
      "created": 1686935002,
      "owned_by": "kokoro"
    },
    {
      "id": "tts-1-hd",
      "object": "model",
      "created": 1686935002,
      "owned_by": "kokoro"
    },
    {
      "id": "kokoro",
      "object": "model",
      "created": 1686935002,
      "owned_by": "kokoro"
    }
  ]
}

The kokoro model is especially important for this. However, I'm not sure how read-aloud polls for voices. Does it use the models list for that? If so, the developer seems receptive to modifying the output further.
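
For reference, a small sketch of what a client can extract from the response above. It only illustrates the shape of the data; how read-aloud itself consumes this endpoint is not shown here, and the helper name `model_ids` is hypothetical. Note that the response carries model ids only and no voice names.

```python
import json


def model_ids(models_response_text):
    """Return the model ids from an OpenAI-style models list response."""
    body = json.loads(models_response_text)
    return [item["id"] for item in body.get("data", []) if "id" in item]


# Same payload as above, condensed to one entry per line.
response = """
{
  "object": "list",
  "data": [
    {"id": "tts-1", "object": "model", "created": 1686935002, "owned_by": "kokoro"},
    {"id": "tts-1-hd", "object": "model", "created": 1686935002, "owned_by": "kokoro"},
    {"id": "kokoro", "object": "model", "created": 1686935002, "owned_by": "kokoro"}
  ]
}
"""

ids = model_ids(response)
print(ids)              # ['tts-1', 'tts-1-hd', 'kokoro']
print("kokoro" in ids)  # True
```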

ken107 · 2025-02-12T01:17:49Z

Since there's no standard API for listing voices, a simple solution is to let the user edit the list of voices. Currently it's hardcoded to the OpenAI voice list. If a user is savvy enough to run Kokoro locally, they should be able to modify a simple JSON document, I presume.

luckycold · 2025-02-12T02:33:16Z

> Since there's no standard API for listing voices, a simple solution is to let the user edit the list of voices. Currently it's hardcoded to the OpenAI voice list. If a user is savvy enough to run Kokoro locally, they should be able to modify a simple JSON document, I presume.

Yeah, I think that's fair enough. If implemented that way, where would that JSON document be accessible though?

ken107 · 2025-02-13T11:10:21Z

Could you test this version and see if it works for you? You should be able to set the API URL (to localhost:8880 for kokoro-fastapi), leave the API key blank, and edit the voice list to match kokoro's voice list.
https://github.com/ken107/read-aloud/tree/kokoro
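
For anyone who wants to check the server outside the extension first, here is a rough sketch of the kind of request those settings imply. The names `build_speech_request` and its parameters are illustrative only, and the sketch assumes the server exposes the OpenAI-style `/v1/audio/speech` path and accepts requests without an API key; the voice `af_bella` is taken from the example earlier in the thread. The request is built but not sent.

```python
import json
import urllib.request


def build_speech_request(base_url, model, voice, text, api_key=""):
    """Build (but do not send) an OpenAI-style text-to-speech request.

    Assumption: the server accepts POST <base_url>/v1/audio/speech with a
    JSON body containing "model", "voice" and "input".
    """
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only attach credentials when a key was actually configured.
        headers["Authorization"] = f"Bearer {api_key}"
    payload = {"model": model, "voice": voice, "input": text}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_speech_request("http://localhost:8880", "kokoro", "af_bella", "Hello")
print(req.full_url)  # http://localhost:8880/v1/audio/speech
```

Passing `req` to `urllib.request.urlopen` would send it; the response body would then be the audio data, assuming the server behaves like the OpenAI endpoint.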

luckycold · 2025-02-13T14:12:54Z

Tested on Chrome and it works great! As for Firefox, I couldn't get it to build for some reason, probably just because I don't know how to do it right. But the foundation is totally there.

2025-02-13.08-09-19.mp4

ken107 · 2025-02-13T14:19:25Z

Great! Firefox is a different branch with separate code. Once this is checked in, I'll merge it to the FF branch. Thanks for testing.
