We're processing zip files, which contain a list of `CentralDirectoryRecord`s concatenated together. Note that each record has a variable length. We have a `std::fs::File` read stream, created as such:

```rust
let reader = file.bytes();
```

There are 0..x of these `CentralDirectoryRecord`s. We cannot take specific byte counts, as the record lengths are not known prior to parsing them. We also cannot store the entire directory in memory, as it is too large. Is there a function similar to `nom::bytes::streaming::take_till` in nom-derive that would allow us to repeatedly process a stream of unknown length until some condition is true?
If I'm correct, `file.bytes()` returns an iterator over bytes. Unfortunately, nom itself is not designed to work with iterators or readers, but mostly with slices or strings. A nom combinator will only return a parsed object, `Incomplete`, or an error.
In other words, nom and nom-derive can solve one part of the problem (parsing an item once you have the bytes), but not the logic to call the read() function.
One solution is to fill a buffer, then call the parser to get a result. If there aren't enough bytes, you'll get `Incomplete` (and need to refill or extend the buffer); otherwise you get an object. After using the object, you'll have to consume the parsed bytes (shifting the buffer).
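The fill/parse/consume loop can be sketched without any external crates. Here `parse_record` is a hypothetical stand-in for a nom parser of one record (a one-byte length prefix followed by a payload, purely for illustration), and the `Incomplete` variant mirrors nom's `Err(Err::Incomplete)`:

```rust
use std::io::Read;

// Outcome of one parse attempt, mirroring nom's Ok / Err(Incomplete) split.
enum ParseResult {
    // (bytes consumed, parsed payload)
    Done(usize, Vec<u8>),
    // Not enough bytes buffered yet; refill and retry.
    Incomplete,
}

// Hypothetical record format for illustration only: a 1-byte length
// prefix followed by that many payload bytes. A real zip
// CentralDirectoryRecord parser would go here instead.
fn parse_record(buf: &[u8]) -> ParseResult {
    if buf.is_empty() {
        return ParseResult::Incomplete;
    }
    let len = buf[0] as usize;
    if buf.len() < 1 + len {
        return ParseResult::Incomplete;
    }
    ParseResult::Done(1 + len, buf[1..1 + len].to_vec())
}

// Fill-parse-consume loop over any reader (e.g. a std::fs::File).
fn parse_stream<R: Read>(mut reader: R) -> std::io::Result<Vec<Vec<u8>>> {
    let mut buf: Vec<u8> = Vec::new();
    let mut records = Vec::new();
    let mut chunk = [0u8; 4]; // deliberately tiny, to exercise refills
    loop {
        match parse_record(&buf) {
            ParseResult::Done(consumed, payload) => {
                records.push(payload);
                buf.drain(..consumed); // consume: shift the buffer
            }
            ParseResult::Incomplete => {
                let n = reader.read(&mut chunk)?;
                if n == 0 {
                    break; // EOF: stop once no more bytes arrive
                }
                buf.extend_from_slice(&chunk[..n]);
            }
        }
    }
    Ok(records)
}

fn main() -> std::io::Result<()> {
    // Two records: [3, b'a', b'b', b'c'] and [2, b'x', b'y'].
    let data: &[u8] = &[3, b'a', b'b', b'c', 2, b'x', b'y'];
    let records = parse_stream(data)?;
    assert_eq!(records, vec![b"abc".to_vec(), b"xy".to_vec()]);
    println!("parsed {} records", records.len());
    Ok(())
}
```

On each `Incomplete` the loop reads another chunk from the reader and appends it to the buffer; on success it drains the consumed prefix, so memory stays bounded by the largest single record rather than the whole directory.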
For example, the pcap-parser crate works with streams (huge pcap files) by using a circular buffer and functions to control when to refill it. You can find an example in the crate documentation, and in the implementation of `next`. This may appear a bit complex, but it is the most efficient way to parse items (not calling `read` every few bytes), and it gives you fine control over the buffer and the structs.
One other solution is to use another derive crate that is `Read`-oriented. For example, binread works similarly to nom-derive, but with readers. This should be faster to implement, at the cost of being a bit less efficient (but maybe efficiency is not your hardest constraint?).
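A rough sketch of what the binread approach could look like, assuming the `binread` crate; the struct below is an illustrative subset, not a complete central-directory definition, and the field names are placeholders rather than a verified zip layout:

```rust
use binread::{BinRead, BinReaderExt};

// Illustrative subset of a zip CentralDirectoryRecord; real records
// carry more fixed fields plus extra-field and comment data.
#[derive(BinRead, Debug)]
#[br(little, magic = 0x02014b50u32)] // central directory signature "PK\x01\x02"
struct CentralDirectoryRecord {
    version_made_by: u16,
    version_needed: u16,
    flags: u16,
    compression: u16,
    file_name_len: u16,
    // Variable-length field: binread reads `file_name_len` bytes.
    #[br(count = file_name_len)]
    file_name: Vec<u8>,
}

// Read records until the magic stops matching or the input ends,
// covering the 0..x repetition without buffering the whole directory.
fn read_all<R: std::io::Read + std::io::Seek>(reader: &mut R) -> Vec<CentralDirectoryRecord> {
    let mut records = Vec::new();
    while let Ok(rec) = reader.read_le::<CentralDirectoryRecord>() {
        records.push(rec);
    }
    records
}
```

The appeal here is that the reader-based loop is trivial: each `read_le` call pulls exactly the bytes one record needs, so no manual buffer shifting is required.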