Example using received audio data? #100

Closed
asg0451 opened this issue Oct 12, 2021 · 5 comments · Fixed by #114
Labels: events (relates to driver event handling/generation), question (further information is requested)

Comments

@asg0451
Contributor

asg0451 commented Oct 12, 2021

I'm hacking on this example (https://github.com/serenity-rs/songbird/blob/current/examples/serenity/voice_receive/src/main.rs), and I can receive and buffer voice audio, but I can't figure out what format it's in or how to use it.

I've got some of these https://serenity-rs.github.io/songbird/current/songbird/events/context_data/struct.VoiceData.html#structfield.audio but what is this? How can I turn this into, say, a .wav file?

I've been banging my head on this one for a couple hours so any help would be greatly appreciated!

@FelixMcFelix added the events and question labels Oct 12, 2021
@FelixMcFelix
Member

This should be documented somewhere, you're right!

Each event contains up to 20 ms of mono 16-bit PCM audio from a single user, sampled at 48 kHz; each i16 is one sample.

If you want to make a wave file per user:

  • You need to emit a suitable WAV header.
  • The i16 data can go straight into the data subchunk; just be careful about endianness. However, you need to pad silent regions with zeroes yourself, because clients don't send packets when they aren't talking. Mostly. 🙂
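
For anyone landing here later, the two bullets above can be sketched roughly as follows. This is a minimal sketch, not songbird API: `write_wav` and `pad_silence` are hypothetical helper names, the channel count and sample rate are passed in as parameters rather than hard-coded, and it assumes you have already collected one user's samples into a single `Vec<i16>`. WAV (RIFF) stores multi-byte values little-endian, so each sample is written with `to_le_bytes`.

```rust
use std::io::{self, Write};

/// Hypothetical helper: append `missing_frames` frames of silence (zeroes)
/// to a user's buffer, where one frame holds `samples_per_frame` samples
/// (e.g. 960 for 20 ms of mono audio at 48 kHz).
fn pad_silence(buf: &mut Vec<i16>, missing_frames: usize, samples_per_frame: usize) {
    buf.resize(buf.len() + missing_frames * samples_per_frame, 0);
}

/// Hypothetical helper: write a canonical 44-byte PCM WAV header followed by
/// the samples. All header fields and samples are little-endian.
fn write_wav<W: Write>(
    out: &mut W,
    samples: &[i16],
    channels: u16,
    sample_rate: u32,
) -> io::Result<()> {
    let bits_per_sample: u16 = 16;
    let block_align: u16 = channels * (bits_per_sample / 8);
    let byte_rate: u32 = sample_rate * u32::from(block_align);
    // Assumes the recording fits in the 32-bit size field of a plain WAV file.
    let data_len: u32 = (samples.len() * 2) as u32;

    out.write_all(b"RIFF")?;
    out.write_all(&(36 + data_len).to_le_bytes())?; // size of everything after this field
    out.write_all(b"WAVE")?;

    out.write_all(b"fmt ")?;
    out.write_all(&16u32.to_le_bytes())?; // fmt chunk size for PCM
    out.write_all(&1u16.to_le_bytes())?; // audio format 1 = uncompressed PCM
    out.write_all(&channels.to_le_bytes())?;
    out.write_all(&sample_rate.to_le_bytes())?;
    out.write_all(&byte_rate.to_le_bytes())?;
    out.write_all(&block_align.to_le_bytes())?;
    out.write_all(&bits_per_sample.to_le_bytes())?;

    out.write_all(b"data")?;
    out.write_all(&data_len.to_le_bytes())?;
    for sample in samples {
        out.write_all(&sample.to_le_bytes())?;
    }
    Ok(())
}
```

With the format described above you would call something like `write_wav(&mut file, &samples, 1, 48_000)`, where `file` is any `std::io::Write` (for example a `BufWriter` around a `File`). How many silent frames to pad is up to you to work out from packet timing; that part is not shown here.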

If you want to combine them all:

  • As above, except you need to add together all packets which arrive in the same 20 ms window.
  • Mixing two audio packets is simply summing their samples pairwise.
  • Take care not to overflow/clip the additions; you might need to do volume reduction.
  • Some other users have tackled this for e.g., bridging Discord<->TS3.
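
The mixing step above might look something like this. Again a minimal sketch under assumptions: `mix_frames` is a hypothetical name, and it assumes you have already grouped the frames that belong to the same 20 ms window. It sums in `i32` so the additions cannot overflow `i16`, then clamps back into range, which hard-clips loud mixes rather than reducing volume.

```rust
/// Hypothetical helper: mix several frames from the same time window by
/// summing corresponding samples. Shorter frames are treated as silence
/// past their end.
fn mix_frames(frames: &[&[i16]]) -> Vec<i16> {
    let len = frames.iter().map(|f| f.len()).max().unwrap_or(0);

    // Accumulate in i32 so intermediate sums cannot wrap around.
    let mut acc = vec![0i32; len];
    for frame in frames {
        for (a, &s) in acc.iter_mut().zip(frame.iter()) {
            *a += i32::from(s);
        }
    }

    // Clamp back into the i16 range (hard clipping).
    acc.into_iter()
        .map(|v| v.clamp(i32::from(i16::MIN), i32::from(i16::MAX)) as i16)
        .collect()
}
```

If clipping is audible, one alternative is to scale the `i32` sums down (for example, divide by the number of active speakers) before converting back to `i16`; which trade-off sounds better depends on your use case.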

@asg0451
Contributor Author

asg0451 commented Oct 12, 2021

wow, thanks! this is incredibly helpful.
re your second bullet, can you elaborate on being careful around endianness? This PCM data is big-endian, right? i think i saw that implied somewhere in the discord docs, such as they are.

@FelixMcFelix
Member

FelixMcFelix commented Oct 12, 2021 via email

@asg0451
Contributor Author

asg0451 commented Oct 12, 2021

Ah gotcha. Thanks again, you have saved me hours of struggling with opus and discord documentation. If you want, I could do a quick PR to add basically what you said to the bit of documentation I linked initially.

@FelixMcFelix
Member

FelixMcFelix commented Oct 12, 2021 via email

FelixMcFelix added a commit to FelixMcFelix/songbird that referenced this issue Feb 14, 2022
FelixMcFelix added a commit that referenced this issue Feb 14, 2022