
Use ffmpeg demuxer instead of separate streams #108

Merged · 38 commits · Oct 28, 2024

Conversation

@longnguyen2004 (Collaborator) commented Oct 19, 2024

Tracking PR, there will be more changes as I continue working on this.

I've chosen to use libav.js as a demuxer only, so no audio decoding (as much as I'd like to have AAC decoding, there's no prebuilt up-to-date package for it, and we'd still need an external ffmpeg for hardware video encoding anyway).

Checklist:

  • Add ffmpeg demuxer (preferred container is mkv due to its versatility, which allows us to do H.264/H.265/VP8/VP9/AV1 video plus raw or Opus audio in a single stream)
  • Adapt VideoStream and AudioStream to use this new stream
  • Syncing of presentation timestamps between the AV streams

Testing

  • H.264
  • H.265
  • VP8

@longnguyen2004 longnguyen2004 marked this pull request as draft October 19, 2024 09:05
@longnguyen2004 (Collaborator, Author)

During the implementation of this, I find the streamOptions object to be a bit unwieldy, given how the user can set important stream properties (width, height, frame rate, codec) arbitrarily, which can cause the set values and the actual values to not match. It also makes the overall flow a bit confusing, since the same object controls both the MediaUdp settings and the encoder output, instead of MediaUdp automatically detecting the values from the encoder.

Since we now directly use the ffmpeg API, which gives us frame information, I'm thinking of providing another function, playStream(Client, Readable), that automatically sets up the stream properties and initializes the stream, and removes all MediaUdp stuff from the streamLivestreamVideo function. Does that sound good?
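A rough sketch of what such a function could look like. Note that everything here is a hypothetical stand-in, not the library's actual API: `demuxProbe`, the `StreamSettings` shape, and the placeholder values are made up for illustration, and the client parameter is omitted for brevity.

```typescript
import { Readable } from "node:stream";

// Hypothetical stand-in for the stream parameters MediaUdp needs.
interface StreamSettings {
  width: number;
  height: number;
  fps: number;
  codec: string;
}

// Hypothetical stand-in for probing the demuxer; in practice these values
// would come from libav.js stream/frame information.
async function demuxProbe(input: Readable): Promise<StreamSettings> {
  void input; // placeholder: a real probe would read the container header
  return { width: 1920, height: 800, fps: 24, codec: "h264" };
}

// The proposed flow: stream parameters are detected from the media itself
// instead of being supplied (and possibly mismatched) by the user.
async function playStream(input: Readable): Promise<StreamSettings> {
  const settings = await demuxProbe(input);
  // ...here the real function would configure MediaUdp from `settings`
  // and start forwarding demuxed packets to the voice connection.
  return settings;
}
```

The key design point is that the user never states the resolution, frame rate, or codec; those are read from the container.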

With raw audio formats, we can't maintain the strict 1 Opus frame per write() guarantee, so instead we force the user to provide Opus audio
Again, the situation with `streamOptions` isn't really what I want, but let's progress little by little for now
@BitcircuitEU

Using the updated version in my script gives the error "Unable to copy the codec: No suitable stream found"

@longnguyen2004 (Collaborator, Author)

This is not finished yet, please wait

Without an explicit input format, ffmpeg errors with "Invalid data found when processing input"
Lower values lead to lower latency, but might increase overhead of repeatedly calling the function. 16KiB of output data per call seems reasonable
@longnguyen2004 (Collaborator, Author)

This is usable now, but ending the stream will cause an infinite loop to happen. I don't know where it is yet, but I suspect it's the sync loops.

@dank074 (Owner) commented Oct 21, 2024

> During the implementation of this, I find the streamOptions object to be a bit unwieldy, given how the user can set important stream properties (width, height, frame rate, codec) arbitrarily, which can cause the set values and the actual values to not match. It also makes the overall flow a bit confusing, since the same object controls both the MediaUdp settings and the encoder output, instead of MediaUdp automatically detecting the values from the encoder.
>
> Since we now directly use the ffmpeg API, which gives us frame information, I'm thinking of providing another function, playStream(Client, Readable), that automatically sets up the stream properties and initializes the stream, and removes all MediaUdp stuff from the streamLivestreamVideo function. Does that sound good?

I think creating another function that does that for you sounds good.

When you say that MediaUdp should detect the values from the encoder, what would that look like? The user should still be able to change the parameters of an existing udp connection, kind of like when the user selects to change the screen/application in a Go Live stream which doesn't reset the connection but could result in different stream parameters.

@BitcircuitEU

>> During the implementation of this, I find the streamOptions object to be a bit unwieldy, given how the user can set important stream properties (width, height, frame rate, codec) arbitrarily, which can cause the set values and the actual values to not match. It also makes the overall flow a bit confusing, since the same object controls both the MediaUdp settings and the encoder output, instead of MediaUdp automatically detecting the values from the encoder.
>>
>> Since we now directly use the ffmpeg API, which gives us frame information, I'm thinking of providing another function, playStream(Client, Readable), that automatically sets up the stream properties and initializes the stream, and removes all MediaUdp stuff from the streamLivestreamVideo function. Does that sound good?
>
> I think creating another function that does that for you sounds good.
>
> When you say that MediaUdp should detect the values from the encoder, what would that look like? The user should still be able to change the parameters of an existing udp connection, kind of like when the user selects to change the screen/application in a Go Live stream, which doesn't reset the connection but could result in different stream parameters.

Well, currently if you have a movie that's 1920x800, which is a normal movie format, most people will set the format to 1920x1080, which will stretch the video. And it would be kind of annoying to change it for every movie file played.

@dank074 (Owner) commented Oct 21, 2024

> Well, currently if you have a movie that's 1920x800, which is a normal movie format, most people will set the format to 1920x1080, which will stretch the video. And it would be kind of annoying to change it for every movie file played.

Yeah, like I said, I think it is a good idea to provide an alternative function which will use the existing video codec properties, such as video resolution. But we should also leave the existing stream function that sets user-provided options since the user may want to edit these. Right now the Discord SFU doesn't seem to check for Nitro subscription, so any user is able to send whatever resolution they want, but this may not be true in the future.

@longnguyen2004 (Collaborator, Author) commented Oct 22, 2024

H265 is broken, not sure if it's due to my change or not
VP8 still works fine

@dank074 (Owner) commented Oct 22, 2024

> There seem to be some questions and confusion around my statement, so here are some graphs to hopefully demonstrate my point.
>
> Currently, the library operates like this:
>
> [image: diagram of the current flow]
>
> streamLivestreamVideo is this weird half-user-side, half-library-side function. The user can customize it, but it is still tightly coupled to MediaUdp, and it needs to set the encoder parameters to be the same as the streaming parameters, which might not always be the case.
>
> This is how I envision the library working:
>
> [image: diagram of the proposed flow]
>
> In this flow, the streamLivestreamVideo function is cleanly separated from library internals, and both sides don't leak implementation details to each other. This is possible now because the library takes in a regular MKV stream, and the demuxing is handled completely inside the library (see the demux function for more details), which means we can get important parameters (width, height, frame rate, ...) from the media stream itself, instead of having to trust what the user sets.

Ohh that makes sense. So in this scenario it's possible to change the video without having to start a new MediaUdp connection, right? demux will just update MediaUdp with the new stream params

@longnguyen2004 (Collaborator, Author)

Theoretically yes, though I don't know yet how the re-negotiation process works, let's ignore that for now.

This will come into play later on
It turns out, the `h264_mp4toannexb` filter was helping us a great deal here, by inserting the parameter sets before each I frame. With mkv and other containers, they're only specified once in the file, so we need to synthesize them ourselves
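A minimal sketch of that idea, assuming each packet arrives as a list of NAL units without start codes; the helper names are made up for illustration and are not the library's actual code:

```typescript
// The H.264 NAL unit type lives in the low 5 bits of the first header byte;
// type 5 is an IDR (key frame) slice.
function isIdrNal(nal: Uint8Array): boolean {
  return (nal[0] & 0x1f) === 5;
}

// mkv stores SPS/PPS once in the codec extradata, so before each keyframe
// we re-insert them, mimicking what h264_mp4toannexb did for mp4 input.
function prependParameterSets(
  packetNals: Uint8Array[],    // one packet, split into NAL units
  parameterSets: Uint8Array[]  // SPS/PPS parsed from container extradata
): Uint8Array[] {
  return packetNals.some(isIdrNal)
    ? [...parameterSets, ...packetNals]
    : packetNals;
}
```

Decoders joining mid-stream can then pick up from any keyframe, since the parameter sets always precede it.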
@longnguyen2004 (Collaborator, Author) commented Oct 23, 2024

Update after my latest additions:

  • H.264: Fully works (tested with NVENC, AMF, libx264)
  • H.265: ~~Sorta works (I frames decode fine, but other frames produce garbage output, not sure why)~~ Retested, working fine; it was just a flaky connection causing problems

Testing problematic streams from other issues:

A few notes about this implementation:

  • ALL non-livestream sources need to have -re in the ffmpeg arguments. Due to the decoupled design, the sleeping that is done in VideoStream/AudioStream no longer stalls the ffmpeg process, so it'll keep decoding and quickly fill up your RAM.
    (m3u8 streams should be treated as non-livestream, since frames are delivered in bursts rather than a steady stream of frames)
  • Due to the reason mentioned above, the sleeping in VideoStream/AudioStream is no longer needed and can be removed.
  • Input to the library is now a single Matroska stream, containing H.264/H.265/VP8/VP9/AV1 video and Opus audio. Raw audio is no longer accepted due to the complexity of frame splitting.
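For example, a command line matching these requirements might look like the following (the input file name and the choice of libx264 as encoder are illustrative; any of the supported video codecs would work):

```shell
ffmpeg -re -i input.mp4 \
  -c:v libx264 \
  -c:a libopus \
  -f matroska pipe:1
```

Here -re throttles reading to realtime (required for non-livestream sources, per the notes above), and the output is a single Matroska stream on stdout containing H.264 video and Opus audio.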

While spin waiting would produce the most accurate syncing, it's also very CPU intensive (briefly maxing out a single core of an i3-330M), and most of the time it's not really needed anyway. Using a timeout of 1 ms is better.

Ideally there would be a signaling mechanism between the two streams, instead of each stream polling the other, but I'm keeping this simple.
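The polling approach from the two notes above can be sketched like this; `getOtherPts` is a hypothetical accessor for the peer stream's current timestamp, not the library's actual API:

```typescript
const sleep = (ms: number) => new Promise<void>((res) => setTimeout(res, ms));

// Instead of spin-waiting on the other stream (accurate but CPU-hungry),
// poll its progress with a short 1 ms sleep between checks.
async function waitForOtherStream(
  getOtherPts: () => number, // hypothetical: peer stream's latest pts, in ms
  myPts: number,
  maxDriftMs: number
): Promise<void> {
  while (myPts - getOtherPts() > maxDriftMs) {
    await sleep(1);
  }
}
```

The 1 ms sleep bounds the sync error at roughly one timer tick while keeping the event loop free, which is the CPU/accuracy tradeoff described above.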
@longnguyen2004 longnguyen2004 marked this pull request as ready for review October 24, 2024 03:24
@longnguyen2004 (Collaborator, Author) left a comment

This change alone dropped the peak CPU usage on an i3-330m from 25% to 10%, a 2.5x reduction!

Avoiding `BigInt` if possible is always good
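As a sketch of why: timestamp math fits comfortably in a regular number, which is exact up to 2^53 - 1, so per-frame conversions don't need `BigInt`. The time base values below are illustrative:

```typescript
// Convert a pts expressed in a stream's time base to milliseconds using
// plain number arithmetic; exact as long as intermediates stay below 2^53.
function ptsToMs(pts: number, timeBaseNum: number, timeBaseDen: number): number {
  return (pts * timeBaseNum * 1000) / timeBaseDen;
}
```

With a 1/90000 time base (typical for video), a pts of 90000 converts to 1000 ms with no BigInt allocation or coercion on the hot path.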
@BitcircuitEU

Can confirm that several movie files are in sync after this update

@dank074 (Owner) left a comment

Looks good

@dank074 (Owner) commented Oct 25, 2024

> • ALL non-livestream sources need to have -re in the ffmpeg arguments. Due to the decoupled design, the sleeping that is done in VideoStream/AudioStream no longer stalls the ffmpeg process, so it'll keep decoding and quickly fill up your RAM.
>   (m3u8 streams should be treated as non-livestream, since frames are delivered in bursts rather than a steady stream of frames)
> • Due to the reason mentioned above, the sleeping in VideoStream/AudioStream is no longer needed and can be removed.

Should we make this a separate issue, to remove the sleeping in AudioStream & VideoStream?

Or should we just do it in this PR, since it fixes the initial issue with using the -re flag?

@longnguyen2004 (Collaborator, Author) commented Oct 26, 2024

It can be a separate issue, since it doesn't hurt to have the sleeping in here.

Anyway, please wait a few days before merging. I'm waiting for a new build of libav.js that includes Yahweasel/libav.js@13cafce, so I can remove the ts-expect-error in various places.

EDIT: Expect Sunday or Monday Yahweasel/libav.js#66 (comment)

@BitcircuitEU

> It can be a separate issue, since it doesn't hurt to have the sleeping in here.
>
> Anyway, please wait a few days before merging. I'm waiting for a new build of libav.js that includes Yahweasel/libav.js@13cafce, so I can remove the ts-expect-error in various places.
>
> EDIT: Expect Sunday or Monday Yahweasel/libav.js#66 (comment)

Seems to be committed?

@longnguyen2004 (Collaborator, Author)

Waiting for release, which would be 1-2 days from now

@longnguyen2004 (Collaborator, Author)

Ready to merge! As for the follow-up changes, I'll do them in a separate PR.


3 participants