Dion Mendel edited this page Jun 25, 2023 · 15 revisions


How do I use string encodings with BinData?

BinData internally uses 8-bit binary strings to represent data. You do not need to worry about converting between encodings.

If you wish BinData to present string data in a specific encoding, you can override #snapshot as illustrated below:

require 'bindata'

# Tags the snapshot as UTF-8; the underlying bytes are unchanged.
class UTF8String < BinData::String
  def snapshot
    super.force_encoding('UTF-8')
  end
end

str = UTF8String.new("\xC3\x85\xC3\x84\xC3\x96")
str #=> "ÅÄÖ"
str.to_binary_s #=> "\xC3\x85\xC3\x84\xC3\x96"

How do I speed up initialization?

I'm doing this and it's slow.

999.times do |i|
  foo = Foo.new(:bar => "baz")
  ...
end

BinData is optimized to be declarative. For imperative use, the above naïve approach will be slow. Below are faster alternatives.

The fastest approach is to reuse objects by calling #clear instead of instantiating more objects.

foo = Foo.new(:bar => "baz")
999.times do
  foo.clear
  ...
end

If you can't reuse objects, then consider the prototype pattern.

prototype = Foo.new(:bar => "baz")
999.times do
  foo = prototype.new
  ...
end

The preferred approach is to be declarative.

class FooList < BinData::Array
  default_parameter :initial_length => 999

  foo :bar => "baz"
end

array = FooList.new
array.each { ... }

How do I model this complex nested format?

A common pattern in file formats and network protocols is type-length-value (TLV). The type field specifies how to interpret the value. This gives a way to dynamically structure the data format. An example is the TCP/IP protocol suite. An IP datagram can contain a nested TCP, UDP or other packet type as decided by the protocol field.

Modelling this structure can be difficult when the nesting is recursive, e.g. IP tunneling. Here is an example of the simplest possible recursive TLV structure, a list that can contain atoms or other lists.

How do I seek around in the data stream?

See Multi-pass I/O.

I use Windows and I'm getting weird results

Windows defaults to opening files in text mode, which translates certain bytes (such as line endings) when reading or writing. Ensure you open files in binary mode.
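
A minimal sketch of binary-mode I/O in plain Ruby, using only the standard library. The helper name round_trip and the file name sample.bin are hypothetical. The "b" in the mode string is what turns off the text-mode translation.

```ruby
require 'tmpdir'

# Writes bytes to a temporary file and reads them back, both in binary mode.
# round_trip is a hypothetical helper used only for illustration.
def round_trip(bytes)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "sample.bin")

    # "wb" writes the bytes as-is, with no newline translation.
    File.open(path, "wb") { |io| io.write(bytes) }

    # "rb" reads the bytes as-is and returns a binary (ASCII-8BIT) string.
    File.open(path, "rb") { |io| io.read }
  end
end

# Data containing a CR LF pair, which text mode could alter.
original = "\x00\r\n\xFF".b
puts round_trip(original) == original
```

The same mode strings apply when passing an IO to BinData, e.g. File.open(path, "rb") { |io| Foo.read(io) }.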