Handle non-MP3 sections of streams more gracefully. #3
Comments
Sounds good and the lookahead shouldn't be that difficult to fix actually, can you confirm that this would be a valid approach to figuring out whether we are dealing with a valid header?:
|
I've created another test release, could you replace the |
Testing it right now. |
Good news: Bad news: It's now out of bounds (16) in This is confusing to me. In my understanding, The 4 bytes Were you able to reproduce the errors with your build? |
That's useful thanks, I'll have another look! |
The plot thickens: |
Sorry I only just realised that that's the same data you were talking about. So in the test build you tried out those bitrateIndex == 0 headers were actually rejected, which means all those files now ("correctly", in terms of reading the header) run into the same type of decoding error. I'm trying to narrow it down based on their frame header parameters now. Are you sure the occurrences of |
Ok, here's what I've figured out:
So the root cause for both of those cases is that the current primitive header validation (which only looks at the 12bit sequence of 1s, layer, bitrateIndex and samplingFrequencyIndex separately) lets too many bogus frames through. The header filtering can still be improved by rejecting headers with invalid bitrate/mode combinations according to the standard, I'll have a look at those shortly. |
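One candidate for such a bitrate/mode filter, as a rough sketch only: for Layer II, MPEG-1 permits the lowest bitrates (32, 48, 56, 80 kbit/s) only in single-channel mode and the highest (224, 256, 320, 384 kbit/s) only in the two-channel modes. The class and method names below are made up for illustration and are not part of JavaMP3; the bit positions assume a standard 32-bit MPEG-1 frame header.

```java
final class Layer2ModeCheck {
    private static final int MODE_SINGLE_CHANNEL = 3; // mode bits 11

    // bitrateIndex: the 4-bit index from the header; mode: the 2-bit mode field.
    static boolean isAllowedLayer2Combination(int bitrateIndex, int mode) {
        boolean mono = mode == MODE_SINGLE_CHANNEL;
        switch (bitrateIndex) {
            case 1: case 2: case 3: case 5:      // 32, 48, 56, 80 kbit/s
                return mono;                     // single channel only
            case 11: case 12: case 13: case 14:  // 224, 256, 320, 384 kbit/s
                return !mono;                    // two-channel modes only
            case 15:                             // forbidden index
                return false;
            default:                             // free format and 64..192 kbit/s
                return true;
        }
    }

    // Convenience wrapper: applies the rule only if the header claims Layer II.
    static boolean passesLayer2Rule(int header) {
        int layerBits = (header >>> 17) & 0x3;   // 10 = Layer II
        if (layerBits != 2) {
            return true;                         // rule does not apply
        }
        int bitrateIndex = (header >>> 12) & 0xF;
        int mode = (header >>> 6) & 0x3;
        return isAllowedLayer2Combination(bitrateIndex, mode);
    }
}
```

This only narrows down Layer II candidates; headers claiming Layer I or Layer III pass through unchanged and still need the other checks.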
I admit that I haven't looked into free bitrate (or supporting more than Layer3) in my experiments. But if you're going to crash on them you might as well ignore them instead :) I looked more into the file and it appears that the entire thumbnail image is embedded inside the first 74 KB of the file, including tons of Adobe meta-info (for the image file): I doubt Photoshop wanted to embed a 4 frame Layer 1 MP3 with only 0xFF frame data. |
Indeed, but how exactly can we tell that we are looking at a non-audio part of the stream, and skip the appropriate number of bytes? This isn't an issue with ID3 tags generally by the way, the decoding errors only seem to occur with files that ffmpeg describes as consisting of more than one stream (although I'm not sure what the technical condition for that is):
|
I think it's quite safe to require a frame to be followed by another header, as described above. At least for the first frame (if you do it for every frame you will reject the last one if there's garbage afterwards). |
I just tried this approach -- maybe there's something wrong with the way in which I backtrack the input stream after encountering a bogus header, but the approach simply isn't working for any of your files. More revealing though is the fact that, using this strict two-directly-adjacent-frames constraint, the test cases for the Layer III files included in this repository break as well, which suggests that there is a general problem with how Layer III frames are decoded (or, to be more precise, there is a problem with where the input stream pointer is left after reading a Layer III frame -- could this be an issue with padding?) |
Ok, I've got good news and bad news. The good news is that the adjacent-header constraint is now implemented and working. The bad news is that noLove.mp3 contains something that looks like a valid frame header, directly followed by something that can be decoded without throwing any errors, directly followed by something that looks like another valid frame header (but whose supposed frame content then throws an exception upon decoding). Here's the current build for testing: https://github.com/kevinstadler/JavaMP3/releases/download/v1.0.4-pre/javamp3-1.0.4.jar While this is already an enormous step forward, I wonder if it's possible to also weed out those last few issues. Apart from the obvious solution of requiring two decodable frames to be right next to each other at the beginning of the file (which would require more complex input stream rollback work based on how the original decoder codebase was written), I wonder if there is another way to weed out bogus adjacent frame headers based on their content -- for example, are there any constraints/rules of thumb of different frames within the same file having the same MPEG layer and sample rate? That would make for a much more easily implementable filter; any ideas welcome. |
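A content-based filter along those lines could look roughly like the sketch below. It treats two headers as belonging to the same stream only if sync word, ID bit, layer and sampling frequency index match, while letting bitrate, padding and mode differ (as they can in VBR files). This is a rule of thumb rather than something taken from the JavaMP3 code, and the class and method names are hypothetical.

```java
final class HeaderConsistency {
    // Bits 31..17: sync word, ID bit and layer. Bits 11..10: sampling frequency index.
    private static final int SAME_STREAM_MASK = 0xFFFE0C00;

    // True if both headers agree on every field covered by the mask.
    static boolean looksLikeSameStream(int firstHeader, int secondHeader) {
        return (firstHeader & SAME_STREAM_MASK) == (secondHeader & SAME_STREAM_MASK);
    }
}
```

Used together with the adjacent-header constraint, the second header would have to be both plausible on its own and consistent with the first one before the first frame is accepted.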
(I came here from processing/processing-sound#32 - I hope I'm in the right fork...)
Most real-world MP3 files containing actual music will have ID3 tags and/or other garbage before the actual MP3 frames.
The current implementation in https://github.com/kevinstadler/JavaMP3/blob/master/src/main/java/fr/delthas/javamp3/Decoder.java#L441 assumes that the first occurrence of the sync word (i.e. 12 set bits, beginning on a byte boundary) marks the first MP3 header of the file. In practice this appears to be wrong more often than not.
The consequence of this is that you might find a `0b111111111111` pattern somewhere in the ID3 tags followed by bits that do not form a valid MP3 header. You might read `samplingFrequency` as `0b11` and then try to access `SAMPLING_FREQUENCY` out of bounds - similar for the other table indices. From the tens of files I tried, I estimate that 80% of my music library fails to be decoded by JavaMP3 for this reason.
In my experience so far, this is relatively simple to mitigate (at least conceptually) by staying suspicious whether we actually have an MP3 frame:
1. When encountering an invalid index into any of the tables while decoding a supposed header, it's not a valid header - keep looking. This requires at most 4 bytes of "lookahead".
2. While searching the first header, if you find a 4 byte sequence that looks like a valid header (i.e. passes 1.), additionally require that the frame described by it is followed immediately by another valid header. For all subsequent headers you should be able to skip this check as there should be no garbage between the frames.
With specifically crafted ID3 tags this might still be breakable but I have not found any problems with this approach on normal files.
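The two checks can be sketched as follows. This is a minimal illustration on an in-memory byte array, not the JavaMP3 implementation: the class and method names are invented, it only accepts MPEG-1 Layer III headers, and it rejects free-format frames (bitrate index 0) because their length cannot be derived from the header alone.

```java
final class Mp3HeaderProbe {
    // MPEG-1 Layer III bitrates in kbit/s; index 0 = free format, 15 = forbidden.
    private static final int[] BITRATE_KBPS =
        {0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, -1};
    // MPEG-1 sampling frequencies in Hz; index 3 is reserved.
    private static final int[] SAMPLE_RATE_HZ = {44100, 48000, 32000, -1};

    // Assembles 4 bytes starting at offset into a big-endian 32-bit header.
    static int readHeader(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 24) | ((data[offset + 1] & 0xFF) << 16)
             | ((data[offset + 2] & 0xFF) << 8) | (data[offset + 3] & 0xFF);
    }

    // Check 1: every table index in the header must be valid.
    static boolean isPlausibleHeader(int header) {
        if ((header >>> 20) != 0xFFF) return false;    // 12-bit sync word
        if (((header >>> 19) & 0x1) != 1) return false; // ID bit: MPEG-1
        if (((header >>> 17) & 0x3) != 1) return false; // layer bits 01 = Layer III
        int bitrateIndex = (header >>> 12) & 0xF;
        int sampleRateIndex = (header >>> 10) & 0x3;
        return bitrateIndex != 0 && bitrateIndex != 15 && sampleRateIndex != 3;
    }

    // Frame length in bytes (header included); only call for plausible headers.
    static int frameLength(int header) {
        int bitrate = BITRATE_KBPS[(header >>> 12) & 0xF] * 1000;
        int sampleRate = SAMPLE_RATE_HZ[(header >>> 10) & 0x3];
        int padding = (header >>> 9) & 0x1;
        return 144 * bitrate / sampleRate + padding;
    }

    // Check 2: the first accepted header must be followed directly by another one.
    // Returns the offset of the first such frame, or -1 if none is found.
    static int findFirstFrame(byte[] data) {
        for (int offset = 0; offset + 4 <= data.length; offset++) {
            int header = readHeader(data, offset);
            if (!isPlausibleHeader(header)) continue;
            int next = offset + frameLength(header);
            if (next + 4 <= data.length && isPlausibleHeader(readHeader(data, next))) {
                return offset;
            }
        }
        return -1;
    }
}
```

With a stream instead of an array, the same logic needs some way to rewind after a rejected candidate, for example a buffered stream with mark/reset covering at least one maximum-size frame plus 4 bytes.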
I understand that this is less trivial to implement when you are reading from a `Buffer` and can't easily look forward/backward. But even if you have to work around that, at least the performance/latency impact should be negligible because 2. is only relevant for finding the first header and 1. has an overhead of a single `int`.