Skip to content

TigreGotico/ovos-voice-embeddings-plugin

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

VoiceEmbeddingsRecognitionPlugin

The VoiceEmbeddingsRecognitionPlugin is a plugin for recognizing and managing voice embeddings.

It uses Resemblyzer to extract speaker embeddings and integrates with ovos-chromadb-embeddings-plugin for storing and retrieving voice embeddings.

Features

  • Voice Embeddings Extraction: Converts audio data into voice embeddings using the VoiceEncoder from resemblyzer.
  • Voice Data Storage: Stores and retrieves voice embeddings using ChromaEmbeddingsDB.
  • Voice Data Management: Allows for adding, querying, and predicting voice embeddings associated with user IDs.
  • Supports Multiple Audio Formats: Can handle audio data in various formats, including wav and flac.

Usage

Here is a quick example of how to use the VoiceEmbeddingsRecognitionPlugin:

from ovos_voice_embeddings import VoiceEmbeddingsRecognitionPlugin
from resemblyzer import preprocess_wav
from speech_recognition import Recognizer, AudioFile
from ovos_chromadb_embeddings import ChromaEmbeddingsDB

db = ChromaEmbeddingsDB("./voice_db")
v = VoiceEmbeddingsRecognitionPlugin(db)

a = "/home/miro/PycharmProjects/ovos-user-id/2609-156975-0001.flac"
b = "/home/miro/PycharmProjects/ovos-user-id/qCCWXoCURKY.mp3"
b2 = "/home/miro/PycharmProjects/ovos-user-id/4glfwiMXgwQ.mp3"

with AudioFile(a) as source:
    audio = Recognizer().record(source)
v.add_voice("user", audio)

wav = preprocess_wav(b)
v.add_voice("donald", wav)

wav = preprocess_wav(b2)
print(v.predict(wav))
print(v.prompt(wav))

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages