Skip to content
This repository has been archived by the owner on Jan 9, 2024. It is now read-only.

Handle non-MP3 sections of streams more gracefully. #3

Open
MLanghof opened this issue May 11, 2019 · 13 comments
Open

Handle non-MP3 sections of streams more gracefully. #3

MLanghof opened this issue May 11, 2019 · 13 comments

Comments

@MLanghof
Copy link

(I came here from processing/processing-sound#32 - I hope I'm in the right fork...)

Most real-world MP3 files containing actual music will have ID3 tags and/or other garbage before the actual MP3 frames.

The current implementation in https://github.com/kevinstadler/JavaMP3/blob/master/src/main/java/fr/delthas/javamp3/Decoder.java#L441 assumes that the first occurrence of the sync word (i.e. 12 set bits, beginning on a byte boundary) marks the first MP3 header of the file. In practice this appears to be wrong more often than not.

The consequence of this is that you might find a 0b111111111111 pattern somewhere in the ID3 tags followed by bits that do not form a valid MP3 header. You might read samplingFrequency as 0b11 and then try to access SAMPLING_FREQUENCY out of bounds - similar for the other table indices. From the tens of files I tried, I estimate that 80% of my music library fails to be decoded by JavaMP3 for this reason.

In my experience so far, this is relatively simple to mitigate (at least conceptually) by staying suspicious whether we actually have an MP3 frame:

  1. When encountering an invalid index into any of the tables while decoding a supposed header, it's not a valid header - keep looking. This requires at most 4 bytes of "lookahead".

  2. While searching the first header, if you find a 4 byte sequence that looks like a valid header (i.e. passes 1.), additionally require that the frame described by it is followed immediately by another valid header. For all subsequent headers you should be able to skip this check as there should be no garbage between the frames.

With specifically crafted ID3 tags this might still be breakable but I have not found any problems with this approach on normal files.

I understand that this is less trivial to implement when you are reading from a Buffer and can't easily look forward/backward. But even if you have to work around that, at least the performance/latency impact should be negligible because 2. is only relevant for finding the first header and 1. has an overhead of a single int.

@kevinstadler
Copy link
Owner

Sounds good and the lookahead shouldn't be that difficult to fix actually, can you confirm that this would be a valid approach to figuring out whether we are dealing with a valid header?:

  1. bitrateIndex (4 bit) can't be 0 or 15
  2. samplingRateIndex (2 bit) can't be 3
  3. layerIndex (2 bit) can't be 0

@kevinstadler
Copy link
Owner

I've created another test release, could you replace the javamp3-1.0.3.jar that is in your Processing/libraries/sound/library with the following file: https://github.com/kevinstadler/JavaMP3/releases/download/v1.0.4/javamp3-1.0.4.jar (renaming the file to -1.0.3 should be no problem) and see if that fixes the issue of loading MP3s with ID3 tags?

@MLanghof
Copy link
Author

Testing it right now.

@MLanghof
Copy link
Author

MLanghof commented May 12, 2019

Good news: noLove.mp3 (from the sound issue) no longer dies due to samplingRateIndex = 3.

Bad news: It's now out of bounds (16) in samples_I, just like close.mp3. The london.mp3 error is also unchanged.

This is confusing to me. In my understanding, sample_I should only be called if there is a Layer1 frame in the file. But even if I disable all my additional checks, I cannot find a valid Layer1 frame in the file.

The 4 bytes FF FF 03 E8 appear 4 times before the actual audio in noLove.mp3, starting at locations 6513, 6541, 6569 and 6597. They decode to Layer1, but bitrateIndex = 0, so those should be rejected. Based on how fast JavaMP3 dies trying to decode that file, I suspect it's those invalid headers that kill it.

Were you able to reproduce the errors with your build?

@kevinstadler
Copy link
Owner

That's useful thanks, I'll have another look!

@kevinstadler
Copy link
Owner

The plot thickens: noLove.mp3 is not supported because it uses the "free bitrate" which is not implemented, I've added an informative error message for such cases. Having a look at the other ones now.

@kevinstadler
Copy link
Owner

Sorry I only just realised that that's the same data you were talking about. So in the test build you tried out those bitrateIndex == 0 headers were actually rejected, which means all those files now ("correctly", in terms of reading the header) run into the same type of decoding error. I'm trying to narrow it down based on their frame header parameters now.

Are you sure the occurrences of FF FF 03 E8 are actually a bogus header in this case? I was considering accepting them as a header and printing a warning message about free bitrate support before trying to decode the data (and most likely crashing), but if you think it's generally more likely they are bogus and should be skipped I might just strictly reject them as in the version you tested.

@kevinstadler
Copy link
Owner

Ok, here's what I've figured out:

  • as you already pointed out, noLove.mp3 and close.mp3 are both Layer III but are incorrectly detected as containing a Layer I frame header, which leads to the ArrayIndexOutOfBoundsException: 16 somewhere down the line
  • london.mp3, also Layer III, is incorrectly detected as having a Layer I frame at the beginning, but its decoding doesn't lead to an ArrayIndexOutOfBoundsException straightaway. Instead, decoding of the first frame ends at some bogus point in the file, from which the decoder tries to find the second valid frame header -- there is no frame header there so it skips through a majority of the file before finding a bogus 4 byte sequence that looks like a (Layer III) frame header, and an attempt to decode that bogus frame causes a NullPointerException

So the root cause for both of those cases is that the current primitive header validation (which only looks at the 12bit sequence of 1s, layer, bitrateIndex and samplingFrequencyIndex separately) lets too many bogus frames through. The header filtering can still be improved by rejecting headers with invalid bitrate/mode combinations according to the standard, I'll have a look at those shortly.

@MLanghof
Copy link
Author

MLanghof commented May 12, 2019

I admit that I haven't looked into free bitrate (or supporting more than Layer3) in my experiments. But if you're going to crash on them you might as well ignore them instead :)

I looked more into the file and it appears that the entire thumbnail image is embedded inside the first 74 KB of the file, including tons of adobe meta-info (for the image file):

image

image

image

I doubt Photoshop wanted to embed a 4 frame Layer 1 MP3 with only 0xFF frame data.

@kevinstadler
Copy link
Owner

Indeed, but how exactly can we tell that we are looking at a non-audio part of the stream, and skip the appropriate number of bytes? This isn't an issue with ID3 tags generally by the way, the decoding errors only seem to occur with files that ffmpeg describes as consisting of more than one stream (although I'm not sure what the technical condition for that is):

Input #0, mp3, from 'close.mp3':
  Metadata:
  [...]
  Duration: 00:04:37.35, start: 0.000000, bitrate: 352 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 320 kb/s
    Stream #0:1: Video: mjpeg, yuvj444p(pc, bt470bg/unknown/unknown), 1417x1417, 90k tbr, 90k tbn, 90k tbc
    Metadata:
      comment         : Cover (front)

@MLanghof
Copy link
Author

Indeed, but how exactly can we tell that we are looking at a non-audio part of the stream, and skip the appropriate number of bytes?

I think it's quite safe to require a frame to be followed by another header, as described above. At least for the first frame (if you do it for every frame you will reject the last one if there's garbage afterwards).

@kevinstadler
Copy link
Owner

I just tried this approach -- maybe there's something wrong with the way in which I backtrack the inputstream after encountering a bogus header, but the approach simply isn't working for any of your files.

More revealing though is the fact that, using this strict two-directly-adjacent-frames constraint, the test cases for the Layer III files included in this repository break as well, which sugggests that there is a general problem with how Layer III frames are decoded (or, to be more precise, there is a problem with where the inputstream pointer is left after reading a Layer III frame -- could this be an issue with padding?)

@kevinstadler
Copy link
Owner

Ok I've got good news and bad news, the good news is that the adjacent-header constraint is now implemented and working, the bad news is that noLove.mp3 contains something that looks like a valid frame header directly followed by something that can be decoded without throwing any errors directly followed by something that looks like another valid frame header (but whose supposed frame content then throws an exception upon decoding).

Here's the current build for testing: https://github.com/kevinstadler/JavaMP3/releases/download/v1.0.4-pre/javamp3-1.0.4.jar

While this is already an enormous step forward I wonder if it's possible to also weed out those last few issues. Apart from the obvious solution of requiring two decodable frames to be right next to each other at the beginning of the file (which would require more complex input stream rollback work based on how the original decoder codebase was written), I wonder if there is another way to weed out bogus adjacent frame headers based on their content -- for example, are there any constraints/rules of thumb of different frames within the same file having the same MPEG layer and sample rate? That would make for a much more easily implementable filter, any ideas welcome.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants