Chunk padding not recognized #90

Jazzdoodle · 2020-09-09T15:32:19Z

The current parser assumes that the data fields in a (sub)chunk fill up the entire chunk size. I have encountered WAV files with chunk padding, meaning the next chunk starts only after a few unused padding bytes. Before reading the next subchunk header, you should explicitly seek to the next subchunk start, as indicated by the actual chunk size in the chunk header, instead of relying on things lining up.

Jazzdoodle · 2020-09-09T15:43:07Z

I've fixed the issue, and while I was at it I've also removed the restriction on the order of the format and data chunks. I cannot be bothered to fork and make a pull request, so here's my new version of wavread in WAV.jl:

function wavread(io::IO; subrange=(:), format="double")
    chunk_size = read_header(io)
    samples = Array{Float64, 1}()
    nbits = 0
    sample_rate = Float32(0.0)
    opt = WAVChunk[]

    # Subtract the size of the format field from chunk_size; now it holds the size
    # of all the sub-chunks
    chunk_size -= 4
    # GitHub Issue #18: Check if there is enough data to read another chunk
    subchunk_header_size = 4 + sizeof(UInt32)
    fmt = WAVFormat()
    data_position = 0
    data_size = 0
    while chunk_size >= subchunk_header_size
        # Read subchunk ID and size
        subchunk_id = Vector{UInt8}(undef, 4)
        read!(io, subchunk_id)
        subchunk_size = read_le(io, UInt32)
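        # Remember where the next subchunk starts according to the declared size,
        # so that any bytes left unread below (e.g. padding) can be skipped
        nextchunk_start = position(io) + subchunk_size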
        if subchunk_size > chunk_size
            chunk_size = 0
            break
        end
        chunk_size -= subchunk_header_size + subchunk_size
        # check the subchunk ID
        if subchunk_id == b"fmt "
            fmt = read_format(io, subchunk_size)
            sample_rate = Float32(fmt.sample_rate)
            nbits = bits_per_sample(fmt)
            push!(opt, WAVChunk(fmt))
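        elseif subchunk_id == b"data"
            # Only record where the samples are; they are read after the loop,
            # so the "fmt " and "data" chunks may appear in either order
            data_position = position(io)
            data_size = subchunk_size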
        else
            subchunk_data = Vector{UInt8}(undef, subchunk_size)
            read!(io, subchunk_data)
            push!(opt, WAVChunk(Symbol(subchunk_id), subchunk_data))
        end
        # Jump to the next subchunk header instead of relying on things lining up
        seek(io, nextchunk_start)
    end
    if data_size > 0 && data_position > 0
        seek(io, data_position)
        if format == "size"
            return convert(Int, data_size / fmt.block_align), convert(Int, fmt.nchannels)
        end
        samples = read_data(io, data_size, fmt, format, make_range(subrange))
    end
    return samples, sample_rate, nbits, opt
end
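
To show the seek-based skipping in isolation, here is a minimal sketch that runs without WAV.jl. `demo_wav` and `list_subchunks` are hypothetical names made up for this illustration; they are not part of the package. The demo stream declares a 20-byte "fmt " chunk of which only 16 bytes are consumed, so a parser that expects the next header right after the fields it read would land in the padding. The extra pad byte after odd-sized chunks follows the RIFF word-alignment rule and is an addition that the function above does not make.

```julia
# Build a tiny in-memory RIFF/WAVE stream whose "fmt " chunk carries
# 4 unused padding bytes inside its declared size (hypothetical test data).
function demo_wav()
    body = IOBuffer()
    write(body, b"WAVE")
    write(body, b"fmt ", htol(UInt32(20)))
    write(body, zeros(UInt8, 16))    # placeholder format fields
    write(body, zeros(UInt8, 4))     # padding counted in the chunk size
    write(body, b"data", htol(UInt32(4)))
    write(body, UInt8[1, 2, 3, 4])   # placeholder sample bytes
    bytes = take!(body)
    io = IOBuffer()
    write(io, b"RIFF", htol(UInt32(length(bytes))), bytes)
    seekstart(io)
    return io
end

# Walk the subchunks and return (id, declared size) pairs.
function list_subchunks(io::IO)
    start = position(io)
    riff = read(io, 4)
    riff_size = Int(ltoh(read(io, UInt32)))   # bytes after the size field
    form = read(io, 4)
    (riff == b"RIFF" && form == b"WAVE") || error("not a RIFF/WAVE stream")
    stop = start + 8 + riff_size
    chunks = Tuple{String,Int}[]
    while position(io) + 8 <= stop && !eof(io)
        id = String(read(io, 4))
        len = Int(ltoh(read(io, UInt32)))
        # Next header position from the declared size; RIFF adds one pad
        # byte after a chunk of odd size.
        next_start = position(io) + len + isodd(len)
        if id == "fmt "
            read(io, min(len, 16))   # consume only the fields we understand
        end
        push!(chunks, (id, len))
        seek(io, next_start)         # skip anything left unread
    end
    return chunks
end

chunks = list_subchunks(demo_wav())
```

With the `seek` call removed, the second header would be read from the padding bytes and the "data" chunk would not be found.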

mgkuhn · 2020-10-03T17:07:47Z

A slightly fixed version of that fix is now PR #91.

dancasimiro · 2020-10-04T13:15:02Z

Fixed by #91

mgkuhn mentioned this issue Oct 3, 2020

wavread: always read next chunk, to handle chunk padding #91

Merged

dancasimiro closed this as completed Oct 4, 2020
