-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create simpler facades for decoding steps #4
Comments
Useful feedback on this topic from @scottlamb in #3 (comment) |
Continuing from the comment you mentioned: I'm working on my RTSP crate (now in its own repository and MIT/Apache-licensed). I've realized that on the wire, almost all the VCL NALU are fragmented. Right now I accumulate the fragments via copying into a I'd also like to eventually decode the slice headers (to track reference frames). I think my ideal is something like this:
This would allow parsing the slice headers without allocation or copying for RBSP processing. Most of the NAL wouldn't be even examined because presumably the slice headers are a very short prefix of it. I might eventually send you a PR for this if you like this plan. I've got a bunch of other code to write before I get too serious about processing the slice headers though. |
Why? * performance improvement to RbspBitReader as shown below (although this is likely a small percentage of any useful program's time) * switch from a less popular crate to a more popular crate * be more consistent with adts-reader * step toward the lazy RbspBufDecoder interface I described in #4 ``` parse_nal/sps time: [511.27 ns 512.15 ns 513.01 ns] change: [-17.150% -17.002% -16.853%] (p = 0.00 < 0.05) Performance has improved. Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe ```
Actually, I have a much more radical idea. The only emulation prevention three byte-removing logic can be a The push parser case would give it a For most NAL types it'd make sense to only try parsing once in I think this would simplify the logic for the push parsing as well as sharing more of it with the already-buffered parsing. There'd be just one place push parsing buffers instead of three and counting ( Also, I imagine some of the time the caller wants to both parse and keep the NAL bytes to send onward (similar to my RTSP library). This might allow them to share the buffer (at the cost of some more API complexity). A code sketch: pub trait RbspHandler {
type Read: RbspBitRead;
type Ctx;
/// Processes a partially buffered NAL unit. `read` always reads from the beginning.
/// Returns false if no additional calls are desired for this NAL unit.
fn partial(&mut self, ctx: &mut Context<Self::Ctx>, header: NalHeader, read: Self::Read) -> bool {
true
}
/// Processes a fully buffered NAL unit. `read` always reads from the beginning.
/// Called iff every prior `partial` call for this NAL unit (zero or more) returned true.
fn complete(&mut self, ctx: &mut Context<Self::Ctx>, header: NalHeader, read: Self::Read) {}
}
impl<Read: RbspBitRead, Ctx> RbspHandler for PicParameterSetNalHandler<Ctx> {
type Read = Read;
type Ctx = Ctx;
fn complete(&mut self, ctx: &mut Context<Self::Ctx>, header: NalHeader, read: Self::Read) {
match PicParameterSet::read(ctx, read) {
Ok(pps) => ctx.put_pic_param_set(pps),
Err(e) => error!("pps: {:?}", e),
}
}
}
impl<Read: RbspBitRead, Ctx> RbspHandler for SliceNalHandler<Ctx> {
type Read = Read;
type Ctx = Ctx;
fn partial(&mut self, ctx: &mut Context<Self::Ctx>, header: NalHeader, read: Self::Read) -> bool {
match SliceHeader::read(ctx, read) {
Ok(header) => {
info!("TODO: expose to caller: {:#?}", header);
false
},
Err(SliceHeaderError::Incomplete) => true,
Err(e) => {
error!("slice_header() error: SliceHeaderError::{:?}", e);
false
},
}
}
fn complete(&mut self, ctx: &mut Context<Self::Ctx>, header: NalHeader, read: Self::Read) -> bool {
match SliceHeader::read(ctx, read) {
Ok(pps) => info!("TODO: expose to caller: {:#?}", header),
Err(e) => error!("slice_header() error: SliceHeaderError::{:?}", e),
}
}
} What do you think? |
On second thought, I don't think it's any more complex. Also, I think we could get rid of NAL/SEI switches, NAL handlers could be a lambda rather than a trait; a little lighter-weight but less explicit. I could go either way on that part. mod nal {
/// A single, partially- or completely-buffered NAL.
/// The push module uses a [ByteSliceNal], but it's possible to implement
/// this interface with any type that can produce a `std::io::BufRead`.
pub trait Nal {
type BufRead: std::io::BufRead;
/// Returns if the NAL is completely buffered.
pub fn complete(&self) -> bool;
/// Returns the NAL header or error if missing/corrupt.
pub fn header(&self) -> Result<NalHeader, NalHeaderError> { ... }
/// Returns the bytes in NAL form (including the header byte and
/// any emulation-prevention-three-bytes). If the NAL is incomplete,
/// reads may fail with `ErrorKind::WouldBlock`.
pub fn reader(&self) -> Self::BufRead;
/// Returns the bytes in RBSP form (skipping header byte and
/// emulation-prevention-three-bytes).
pub fn rbsp_bytes(&self) -> rbsp::ByteReader<Self::BufRead> { ... }
/// Returns a helper to access bits within the RBSP.
pub fn rbsp_bits(&self) -> rbsp::BitReader<rbsp::ByteReader<Self::BufRead>> { ... }
}
/// A NAL which stores its bytes contiguously.
pub struct ByteSliceNal<'a> { buf: &'a [u8], complete: bool }
impl<'a> Nal for ByteSliceNal<'a> { ... }
impl<'a> AsRef<u8> for ByteSliceNal<'a> { ... }
}
mod push {
/// The results of a NAL handler during push parsing.
pub enum HandlerResult {
/// If this NAL is incomplete, continue buffering it for a following call.
ReadMore,
/// Stop buffering this NAL; move on to the next one immediately.
/// This is no different than `ReadMore` when the NAL is complete.
Skip,
/// Abort the push call with the given error. Following pushes will also fail.
AbortPush(Box<dyn Error>),
}
/// Accumulator for NAL push parsers.
/// This is meant to be used by parsers for a specific format: Annex B, AVC, MPEG-TS, RTP, etc.
/// Accumulates NALs in an internal buffer and delegates to a caller-supplied handler.
pub struct Accumulator<'a> {
buf: Vec<u8>,
state: AccumulatorState,
handler: &'a mut FnMut(ByteSliceNal),
}
impl Accumulator {
pub fn new(nal_handler: &mut FnMut(ByteSliceNal) -> HandlerResult) -> Self;
pub fn start(&mut self);
// TODO: maybe should be push(..., end: bool) instead to avoid an extra !complete call to nal_handler.
pub fn push(&mut self, buf: &[u8]) -> Result<(), Error>;
pub fn end(&mut self);
}
/// Push parser for an Annex B form bytestream
/// (one with NALs separated by `00 00 01` or `00 00 00 01`).
pub struct AnnexBParser<'a> { accumulator: Accumulator<'a>, state: AnnexBParseState }
impl<'a> AnnexBParser<'a> {
pub fn new(nal_handler: &mut FnMut(nal: ByteSliceNal) -> HandlerResult) -> Self;
pub fn push(&mut self, buf: &[u8]) -> Result<(), Error>;
pub fn end_units(&mut self) -> Result<(), Error>;
}
}
/// RBSP stuff, independent of push or pull parsing.
mod rbsp {
/// Strips header byte and `emulation-prevention-three-byte` elements from the single-NAL input
/// to yield the RBSP.
pub struct ByteReader<N: std::io::BufRead> { ... }
impl<N: std::io::BufRead> std::io::Read for ByteReader<N> { ... }
impl<N: std::io::BufRead> std::io::BufRead for ByteReader<N> { ... }
/// Reads the RBSP as bits given a byte-level reader.
// (This only needs `BufRead` rather than `Read` for `has_more_rbsp_data`.)
pub struct BitReader<R: std::io::BufRead>(bitstream_io::Reader<R, bitstream_io::BigEndian>);
impl<R: std::io::BufRead> BitReader<R> {
...
}
}
// Example usage
fn my_annexb_reader() {
let mut ctx = Context::new();
let mut parser = h264_reader::push::AnnexBParser::new(|nal| {
match nal.header().unit_type() {
Ok(UnitType::PicParameterSet) if !nal.complete() => return NalResult::ReadMore,
Ok(UnitType::PicParameterSet) => {
match PicParameterSet::read(ctx, nal.rbsp_bits()) {
Ok(pps) => ctx.put_pic_param_set(pps),
Err(e) => error!("Bad pps: {:?}", e),
}
},
Ok(UnitType::SliceWithPartitioningIdr) => {
match SliceHeader::read(ctx, nal.rbsp_bits()) {
Ok(hdr) => info!("Slice header: {:#?}", hdr),
Err(SliceHeaderError::NeedMoreData) => return NalResult::ReadMore,
Err(e) => error!("Bad slice header: {:?}", e),
}
},
Ok(_) => {},
Err(e) => error!("bad NAL header: {:#?}", e),
}
NalResult::Skip
});
let f = File::open("foo.h264").unwrap();
let mut reader = BufReader::new(f);
loop {
let buf = reader.fill_buf().unwrap();
if buf.is_empty() {
break;
}
parser.push(buf);
read.consume(buf.len());
}
} |
Does the underlying use of My hope has always been to be able to extract metadata (sps + pps now; eventually All this said, if there is an irreconcilable choice between either API usability or avoiding |
@scottlamb I've added you to the project. Please continue to PR changes; however if I don't respond for a few days and you are blocked then please know that I trust your sense of taste and am ok with you merging in order to move forward. Don't feel that this places an obligation on you to pick up maintenance work if you don't have the time or inclination - I'm just aiming to make life easy. For my part I will try to get into the habit of PR'ing my own changes rather than committing direct to master. |
After some tweaking of the details (now in draft PR #26): it copies on I think this is strictly an improvement over what
Thanks for the vote of confidence! |
For dholroyd#4: make the bit reader type take a BufRead rather than a slice so we don't have to keep a buffered copy of the RBSP. While I'm at it, reduce "stuttering" by taking the module name out of the struct name. This is intrusive but mechanical.
For dholroyd#4: make the bit reader type take a BufRead rather than a slice so we don't have to keep a buffered copy of the RBSP. While I'm at it, reduce "stuttering" by taking the module name out of the struct name. This is intrusive but mechanical.
For dholroyd#4: make the bit reader type take a BufRead rather than a slice so we don't have to keep a buffered copy of the RBSP. While I'm at it, reduce "stuttering" by taking the module name out of the struct name. This is intrusive but mechanical.
For dholroyd#4: make the bit reader type take a BufRead rather than a slice so we don't have to keep a buffered copy of the RBSP. While I'm at it, reduce "stuttering" by taking the module name out of the struct name. This is intrusive but mechanical.
For dholroyd#4: make the bit reader type take a BufRead rather than a slice so we don't have to keep a buffered copy of the RBSP. While I'm at it, reduce "stuttering" by taking the module name out of the struct name. This is intrusive but mechanical.
Are you looking for me to review/merge your PRs as you'd do for mine? Or is that just to kick the CI into running before you commit? |
PR #26 is now ready to review with a significant API overhaul. |
@dholroyd Could I talk you into taking a look at the PR? This one is much more invasive than previous changes so I've held off on merging it to get a sense how comfortable you are with the new interface. |
Many pieces of functionality currently exposed by this create for decoding layers of the H264 spec exposes an API which...
...all with the goal of allow the implementation to avoid copying data.
This is useful for performance reasons in some workloads, but the API is really complicated and a bit hard to use.
It would be nice to provide alternative APIs that work much more simply when,
These simpler APIs would exist in addition to the performant-but-inconvenient interfaces. The implementations should share as much code as possible (very likely, the 'simple' API could be implemented as a higher-level abstraction on top of the complicated one).
The text was updated successfully, but these errors were encountered: