related: #244
This PR aims to improve and fix the Discord bot's voice functionality and enhance the code structure for better usability.
Fix Regression in Discord Voice Channel
Fixed a regression that prevented users from speaking in the Discord voice channel.
Add shouldRespond Function
Added a shouldRespond function for the bot's voice feature to control when the bot should respond to user voice inputs.
Refactor Transcription Process
Refactored the transcription process by adding a debounce function, ensuring voice messages are processed only when silence is detected.
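As a rough illustration of the debounce idea, here is a minimal sketch, not the code in this PR. It assumes each incoming audio chunk restarts a timer, so transcription runs only once no chunk has arrived for a quiet period. The names `debounce`, `onAudioChunk`, and `processWhenSilent`, and the 1500 ms delay, are hypothetical.

```typescript
// Hypothetical sketch: names and the 1500 ms delay are illustrative, not taken from the PR.
type Debounced<A extends unknown[]> = ((...args: A) => void) & {
  flush: () => void;  // run the pending call immediately, if any
  cancel: () => void; // drop the pending call without running it
};

function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
): Debounced<A> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  let pendingArgs: A | undefined;

  const run = (): void => {
    timer = undefined;
    if (pendingArgs !== undefined) {
      const args = pendingArgs;
      pendingArgs = undefined;
      fn(...args);
    }
  };

  const debounced = ((...args: A) => {
    // Every call restarts the countdown, so fn only fires after a quiet period.
    pendingArgs = args;
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(run, waitMs);
  }) as Debounced<A>;

  debounced.flush = () => {
    if (timer !== undefined) clearTimeout(timer);
    run();
  };
  debounced.cancel = () => {
    if (timer !== undefined) clearTimeout(timer);
    timer = undefined;
    pendingArgs = undefined;
  };
  return debounced;
}

// Usage: buffer chunks while the user talks, process them once audio stops arriving.
const chunks: Uint8Array[] = [];
const processedBatchSizes: number[] = [];

const processWhenSilent = debounce(() => {
  processedBatchSizes.push(chunks.length); // stand-in for the real transcription call
  chunks.length = 0;
}, 1500);

function onAudioChunk(chunk: Uint8Array): void {
  chunks.push(chunk);
  processWhenSilent();
}
```

The `flush` and `cancel` helpers are optional extras; they are useful when the user leaves the channel and the pending audio should be processed or discarded right away.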
Enable Bot to Respond to Text Messages in Voice Channels
Updated the bot to handle text messages sent in voice channels.
Add Optional DISCORD_VOICE_CHANNEL_ID in .env
Introduced an optional environment variable, DISCORD_VOICE_CHANNEL_ID, in the .env file that lets users specify the voice channel the bot should join.
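One way the optional setting could be read is sketched below. Only the variable name DISCORD_VOICE_CHANNEL_ID comes from this PR; the helper `resolveVoiceChannelId` is hypothetical, and what the bot does when the value is unset is left to the caller because the description does not specify it.

```typescript
// Hypothetical helper; only the variable name DISCORD_VOICE_CHANNEL_ID comes from the PR.
type Env = Record<string, string | undefined>;

// Returns the configured channel ID, or undefined when the variable is missing or blank.
function resolveVoiceChannelId(env: Env): string | undefined {
  const raw = env["DISCORD_VOICE_CHANNEL_ID"]?.trim();
  return raw ? raw : undefined;
}

// In a Node.js process this would typically be called as resolveVoiceChannelId(process.env).
const configuredId = resolveVoiceChannelId({
  DISCORD_VOICE_CHANNEL_ID: " 123456789012345678 ", // made-up ID for illustration
});
const missingId = resolveVoiceChannelId({});
```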
Implement Audio Playback Interrupt Mechanism
Added a sliding-window buffer that monitors the user's audio volume while the agent is speaking. When the average volume in the window exceeds a defined threshold, the user is treated as actively speaking, and the agent's current audio playback is stopped to avoid overlap.
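The sliding-window check can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: audio frames are assumed to be 16-bit PCM, volume is taken as mean absolute amplitude, and the window must be full before it can trigger. `VolumeWindow`, `frameVolume`, and the window size and threshold values are hypothetical.

```typescript
// Hypothetical sketch: class name, window size, and threshold are illustrative.

// Mean absolute amplitude of one 16-bit PCM frame (assumed input format).
function frameVolume(pcm: Int16Array): number {
  if (pcm.length === 0) return 0;
  let total = 0;
  for (const sample of pcm) total += Math.abs(sample);
  return total / pcm.length;
}

class VolumeWindow {
  private readonly size: number;
  private readonly threshold: number;
  private readonly samples: number[] = [];
  private sum = 0;

  constructor(size: number, threshold: number) {
    this.size = size;
    this.threshold = threshold;
  }

  // Adds one volume reading, evicts the oldest once the window is full,
  // and reports whether the user currently counts as actively speaking.
  push(volume: number): boolean {
    this.samples.push(volume);
    this.sum += volume;
    if (this.samples.length > this.size) {
      this.sum -= this.samples.shift()!;
    }
    return this.isSpeaking();
  }

  // Requires a full window so a single loud frame cannot interrupt playback.
  isSpeaking(): boolean {
    return (
      this.samples.length === this.size &&
      this.sum / this.size > this.threshold
    );
  }

  reset(): void {
    this.samples.length = 0;
    this.sum = 0;
  }
}

// Usage: feed user frames while the agent talks; stop playback when it returns true.
const volumeWindow = new VolumeWindow(3, 1000);
const readings = [0, 3000, 3000, 0, 0].map((v) => volumeWindow.push(v));
```

In a bot, the `true` result would be the point at which the agent's playback is stopped (for example by stopping its audio player), followed by `reset()` so stale readings do not trigger a second interrupt.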