Add support for little-endian byte order representation. #75

Open: wants to merge 1 commit into base: master

sunsingerus

Although RFC4122 recommends network byte order for all fields, the PC industry (including the ACPI, UEFI, SMBIOS and Microsoft specifications) has consistently used little-endian byte encoding for the first three fields: time_low, time_mid, time_hi_and_version.
Example from SMBIOS spec, section 7.2.1 System - UUID
The UUID {00112233-4455-6677-8899-AABBCCDDEEFF} would thus be represented as: 33 22 11 00 55 44 77 66 88 99 AA BB CC DD EE FF
https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.4.0.pdf
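To make the byte layout concrete, here is a minimal Go sketch of the swap that the SMBIOS example describes. The helper name toSMBIOS is hypothetical and is not part of this PR or of the uuid package; it only reorders the first three fields and leaves the last eight bytes alone.

```go
package main

import "fmt"

// toSMBIOS returns u with time_low, time_mid and time_hi_and_version
// byte-swapped. Applying it twice restores the original order.
// Hypothetical helper for illustration only.
func toSMBIOS(u [16]byte) [16]byte {
	u[0], u[1], u[2], u[3] = u[3], u[2], u[1], u[0] // time_low (4 bytes)
	u[4], u[5] = u[5], u[4]                         // time_mid (2 bytes)
	u[6], u[7] = u[7], u[6]                         // time_hi_and_version (2 bytes)
	return u
}

func main() {
	// RFC4122 (network byte order) encoding of
	// {00112233-4455-6677-8899-AABBCCDDEEFF}.
	rfc := [16]byte{0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
		0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff}
	le := toSMBIOS(rfc)
	fmt.Printf("% X\n", le[:]) // 33 22 11 00 55 44 77 66 88 99 AA BB CC DD EE FF
}
```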

@pborman
Collaborator

pborman commented Feb 6, 2021

So this means the version cannot be determined since they moved the version field. Sounds like Microsoft. This isn't the first standard that they have messed up for everyone else.

I am not sure how the unmarshaling will get called in this change. I could see a function that swaps between a compliant UUID and a non-compliant one. To make this work with any of the encoding packages that use the Unmarshaler interface, I think you would need a whole new type that modifies both the marshal/unmarshal methods as well as parsing bytes.

This will deserve some more thought and discussion. It would also deserve some tests :-)

@sunsingerus
Author

sunsingerus commented Feb 7, 2021

So this means the version cannot be determined since they moved the version field.

Yes, as far as I understand the situation, there is no way to determine what byte order is used in the UUID binary representation. One "must know in advance" what byte order is used for encoding.
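A small sketch of why the version becomes ambiguous, assuming the layouts discussed above: in the RFC4122 layout the version is the high nibble of byte 6, while in the little-endian layout the swapped time_hi_and_version field puts it in byte 7. The function names here are made up for illustration.

```go
package main

import "fmt"

// versionRFC reads the version assuming RFC4122 (big-endian) layout:
// the high nibble of byte 6.
func versionRFC(b [16]byte) int { return int(b[6] >> 4) }

// versionLE reads the version assuming the little-endian layout, where
// the two bytes of time_hi_and_version are swapped, so the version
// nibble sits in byte 7.
func versionLE(b [16]byte) int { return int(b[7] >> 4) }

func main() {
	// "00112233-4455-4677-8899-aabbccddeeff" (a version 4 UUID)
	// stored in the little-endian layout.
	le := [16]byte{0x33, 0x22, 0x11, 0x00, 0x55, 0x44, 0x77, 0x46,
		0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff}
	fmt.Println(versionLE(le), versionRFC(le)) // 4 7
}
```

Reading the same bytes with the wrong assumption yields a different version number, and nothing in the bytes themselves says which reading is right.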

I am not sure how the unmarshaling will get called in this change.

Let me explain the situation as I see it.
A UUID with the string representation, say, "{00112233-4455-6677-8899-aabbccddeeff}" can be represented differently in wire/physical format:
As bytes 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff using big-endian encoding (as RFC4122 recommends).
As bytes 33 22 11 00 55 44 77 66 88 99 aa bb cc dd ee ff using little-endian encoding (as specified in the ACPI spec, the SMBIOS spec, or any other spec that uses little-endian encoding).
We must know in advance which byte order is used and call the appropriate FromBytes() function.

Let's look at an example. Imagine a device produced by a hardware vendor. Each vendor has its own UUID, embedded into the devices it produces. Let's say we'd like to print a "human-readable name" of the vendor.
We have a buf with raw data fetched from the device. Let's say the UUID is located in bytes buf[0x10:0x20].
So we'd like to be able to do:

var vendorX = uuid.MustParse("00112233-4455-6677-8899-aabbccddeeff")
vendorUUID, err = uuid.FromBytes(buf[0x10:0x20])
if vendorUUID == vendorX {
  fmt.Println("This device is produced by Vendor X")
}

In our case the UUID inside buf is represented in little-endian encoding, while uuid.FromBytes() expects big-endian encoding, so the result of uuid.FromBytes() would be a UUID with the string representation "{33221100-5544-7766-8899-aabbccddeeff}". The check would therefore fail because the bytes are in different physical positions.

Let's introduce another function, FromBytesLittleEndian() which will treat bytes as little-endian encoded.

var vendorX = uuid.MustParse("00112233-4455-6677-8899-aabbccddeeff")
vendorUUID, err = uuid.FromBytesLittleEndian(buf[0x10:0x20])
if vendorUUID == vendorX {
  fmt.Println("This device is produced by Vendor X")
}

and the whole piece of code will run correctly.
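For readers who want to see the idea end to end, here is a self-contained sketch. It is an assumption about how such a function could look, not the code from this PR: the local UUID type stands in for uuid.UUID so that no third-party import is needed, and fromBytesLittleEndian is an illustrative name.

```go
package main

import "fmt"

// UUID stands in for uuid.UUID (also a [16]byte) in this sketch.
type UUID [16]byte

// fromBytesLittleEndian copies 16 bytes whose first three fields are
// little-endian and returns them in RFC4122 (big-endian) order.
// Hypothetical implementation for illustration only.
func fromBytesLittleEndian(b []byte) (UUID, error) {
	var u UUID
	if len(b) != 16 {
		return u, fmt.Errorf("invalid UUID (got %d bytes)", len(b))
	}
	copy(u[:], b)
	u[0], u[1], u[2], u[3] = u[3], u[2], u[1], u[0]
	u[4], u[5] = u[5], u[4]
	u[6], u[7] = u[7], u[6]
	return u, nil
}

func main() {
	// Simulated device data with a little-endian UUID at buf[0x10:0x20].
	buf := make([]byte, 0x20)
	copy(buf[0x10:], []byte{0x33, 0x22, 0x11, 0x00, 0x55, 0x44, 0x77, 0x66,
		0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff})

	// Equivalent of uuid.MustParse("00112233-4455-6677-8899-aabbccddeeff").
	vendorX := UUID{0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
		0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff}

	vendorUUID, err := fromBytesLittleEndian(buf[0x10:0x20])
	if err != nil {
		fmt.Println(err)
		return
	}
	if vendorUUID == vendorX {
		fmt.Println("This device is produced by Vendor X")
	}
}
```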

I could see a function that swaps between a compliant UUID and a non-compliant one.

Yes, I've introduced a top-level function

func FromBytesLittleEndian(b []byte) (uuid UUID, err error)

which is inspired by

func FromBytes(b []byte) (uuid UUID, err error)

and appropriate marshaling functions:

func (uuid UUID) MarshalBinaryLittleEndian() ([]byte, error)
func (uuid *UUID) UnmarshalBinaryLittleEndian(data []byte) error

which are inspired by

func (uuid UUID) MarshalBinary() ([]byte, error)
func (uuid *UUID) UnmarshalBinary(data []byte) error

respectively. These new functions understand little-endian byte encoding and how to Marshal/Unmarshal binary data.

To make this work with any of the encoding packages that use the Unmarshaler interface, I think you would need a whole new type that modifies both the marshal/unmarshal methods as well as parsing bytes.

Could you please elaborate on this item a little? I am not sure what exactly is meant here. Is it really so complicated? And how do the already existing functions, such as FromBytes(), MarshalBinary() and UnmarshalBinary(), fit into this picture?

Also, I am not sure about a whole new type, because it is still the same UUID with the same logic/idea. The only difference is the byte order in the UUID's binary representation.
We also need to be able to compare UUIDs read from sources with different byte orders to each other directly, such as this:

uuidX, err = uuid.FromBytes(bufX[0x10:0x20])
uuidY, err = uuid.FromBytesLittleEndian(bufY[0x10:0x20])
if uuidX == uuidY {
  fmt.Println("UUIDs are equal")
}

This will deserve some more thought and discussion.

Yes, sure, that's why I am opening the discussion with an initial PR for little-endian encoding support.
As an alternative approach, I can suggest introducing an optional parameter to the FromBytes() function that specifies the encoding, something like:

func FromBytes(b []byte, order ...ByteOrder) (uuid UUID, err error)

Inspired by the encoding/binary package.
If no order is provided, assume big-endian encoding. Otherwise use the explicitly specified encoding.
What do you think?
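A rough sketch of how that optional parameter might behave, built on the real binary.ByteOrder interface from encoding/binary. Everything else is assumed for illustration: the local UUID type stands in for uuid.UUID, and ByteOrder, BigEndian, LittleEndian and fromBytes are hypothetical names that do not exist in the uuid package.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// UUID stands in for uuid.UUID in this sketch.
type UUID [16]byte

// ByteOrder and the two values below are hypothetical conveniences that
// simply re-export what encoding/binary already provides.
type ByteOrder = binary.ByteOrder

var (
	BigEndian    ByteOrder = binary.BigEndian
	LittleEndian ByteOrder = binary.LittleEndian
)

// fromBytes decodes b into a UUID held in RFC4122 order. With no order
// argument it assumes big-endian; otherwise the first three fields are
// read using the given byte order.
func fromBytes(b []byte, order ...ByteOrder) (UUID, error) {
	var u UUID
	if len(b) != 16 {
		return u, fmt.Errorf("invalid UUID (got %d bytes)", len(b))
	}
	bo := BigEndian
	if len(order) > 0 {
		bo = order[0]
	}
	binary.BigEndian.PutUint32(u[0:4], bo.Uint32(b[0:4])) // time_low
	binary.BigEndian.PutUint16(u[4:6], bo.Uint16(b[4:6])) // time_mid
	binary.BigEndian.PutUint16(u[6:8], bo.Uint16(b[6:8])) // time_hi_and_version
	copy(u[8:], b[8:])                                    // remaining bytes are order-independent
	return u, nil
}

func main() {
	be := []byte{0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
		0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff}
	le := []byte{0x33, 0x22, 0x11, 0x00, 0x55, 0x44, 0x77, 0x66,
		0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff}
	x, _ := fromBytes(be)
	y, _ := fromBytes(le, LittleEndian)
	fmt.Println(x == y) // true
}
```

Note that this only covers decoding from a byte slice; it does not change what the fixed-signature Marshal/Unmarshal methods do.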

It would also deserve some tests :-)

Yep :-)

@pborman
Collaborator

pborman commented Feb 8, 2021

These really are two different types because they have different binary representations and they are not comparable to each other without converting from one to the other. Also, encoders such as encoding/json use the interfaces described in the encoding package to know how to encode/decode values. Since these are interfaces, the methods must have those exact names. The only way to get JSON to unmarshal an SMBIOS UUID (as opposed to a regular UUID) is to have a new type. It does not need all the functions provided by the standard UUID package and can easily be written in terms of the regular uuid package:

package leuuid

import "github.com/google/uuid"

type LEUUID [16]byte

func FromUUID(u uuid.UUID) LEUUID {
        u[0], u[1], u[2], u[3] = u[3], u[2], u[1], u[0]
        u[4], u[5] = u[5], u[4]
        u[6], u[7] = u[7], u[6]
        return LEUUID(u)
}

func Parse(s string) (LEUUID, error) {
        u, err := uuid.Parse(s)
        return FromUUID(u), err
}

func (u LEUUID) UUID() uuid.UUID {
        return uuid.UUID(FromUUID(uuid.UUID(u)))
}

func (u LEUUID) MarshalText() ([]byte, error) {
        return u.UUID().MarshalText()
}

...

@pborman
Collaborator

pborman commented Feb 8, 2021

I wrote up a working example of what I suggested above. It is included below.

Would this concept work for you? In your structures that are serialized to the DSP0134 format you would use the dsp0134.UUID type rather than the standard uuid.UUID type.

// Package dsp0134 is a wrapper around github.com/google/uuid.  It supports the
// non-standard UUID format described in the DMTF System Management BIOS
// (SMBIOS) Reference Specification document DSP0134.
// https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.4.0.pdf
// The DSP0134 standard reorders the first 8 bytes of the UUID.  The standard
// RFC4122 encoding for the UUID "00112233-4455-6677-8899-AABBCCDDEEFF" is
//	00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF
// The encoding specified in DSP0134 is:
//	33 22 11 00 55 44 77 66 88 99 AA BB CC DD EE FF
package dsp0134

import "github.com/google/uuid"

// A UUID is a 128 bit (16 byte) Universal Unique IDentifier as defined by
// DSP0134.
//
// Most methods from github.com/google/uuid can be used by invoking the
// UUID method:
//
//	var u UUID
//	return u.UUID().Version()
type UUID [16]byte

// swapUUID returns u after converting it to/from RFC4122 from/to DSP0134
// ordering.
func swapUUID(u [16]byte) [16]byte {
        u[0], u[1], u[2], u[3] = u[3], u[2], u[1], u[0]
        u[4], u[5] = u[5], u[4]
        u[6], u[7] = u[7], u[6]
        return u
}

// swapInPlace is like swapUUID but swaps u in place.
func swapInPlace(u []byte) {
        u[0], u[1], u[2], u[3] = u[3], u[2], u[1], u[0]
        u[4], u[5] = u[5], u[4]
        u[6], u[7] = u[7], u[6]
}

// ToUUID returns u as a uuid.UUID.
func ToUUID(u UUID) uuid.UUID {
        return swapUUID(u)
}

// FromUUID returns u as a UUID.
func FromUUID(u uuid.UUID) UUID {
        return swapUUID(u)
}

// Parse is analogous to github.com/google/uuid.Parse.
func Parse(s string) (UUID, error) {
        u, err := uuid.Parse(s)
        return swapUUID(u), err
}

// FromBytes is analogous to github.com/google/uuid.FromBytes.
func FromBytes(b []byte) (UUID, error) {
        u, err := uuid.FromBytes(b)
        return swapUUID(u), err
}

// ParseBytes is analogous to github.com/google/uuid.ParseBytes.
func ParseBytes(b []byte) (UUID, error) {
        u, err := uuid.ParseBytes(b)
        return FromUUID(u), err
}

// UUID returns u as a github.com/google/uuid.UUID.
func (u UUID) UUID() uuid.UUID {
        return ToUUID(u)
}

// MarshalText implements encoding.TextMarshaler.
func (u UUID) MarshalText() ([]byte, error) {
        return uuid.UUID(swapUUID(u)).MarshalText()
}

// UnmarshalText implements encoding.TextUnmarshaler.
func (u *UUID) UnmarshalText(data []byte) error {
        if err := (*uuid.UUID)(u).UnmarshalText(data); err != nil {
                return err
        }
        swapInPlace(u[:])
        return nil
}

// MarshalBinary implements encoding.BinaryMarshaler
func (u UUID) MarshalBinary() ([]byte, error) {
        return uuid.UUID(u).MarshalBinary()
}

// UnmarshalBinary implements encoding.BinaryUnmarshaler
func (u *UUID) UnmarshalBinary(data []byte) error {
        return (*uuid.UUID)(u).UnmarshalBinary(data)
}
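To show why the exact method names matter, here is a hedged, standard-library-only sketch of a type in the same spirit as the package above being picked up by encoding/json. LEUUID, swap and device are stand-in names invented for this example; the text parsing is deliberately simplistic and is not how github.com/google/uuid parses strings.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"fmt"
	"strings"
)

// LEUUID holds a UUID in DSP0134 byte order (stand-in for dsp0134.UUID).
type LEUUID [16]byte

// swap converts between RFC4122 and DSP0134 ordering.
func swap(u [16]byte) [16]byte {
	u[0], u[1], u[2], u[3] = u[3], u[2], u[1], u[0]
	u[4], u[5] = u[5], u[4]
	u[6], u[7] = u[7], u[6]
	return u
}

// MarshalText implements encoding.TextMarshaler.
func (u LEUUID) MarshalText() ([]byte, error) {
	r := swap(u)
	s := hex.EncodeToString(r[:])
	return []byte(s[0:8] + "-" + s[8:12] + "-" + s[12:16] + "-" + s[16:20] + "-" + s[20:]), nil
}

// UnmarshalText implements encoding.TextUnmarshaler.
func (u *LEUUID) UnmarshalText(data []byte) error {
	raw, err := hex.DecodeString(strings.ReplaceAll(string(data), "-", ""))
	if err != nil {
		return err
	}
	if len(raw) != 16 {
		return fmt.Errorf("invalid UUID %q", data)
	}
	var r [16]byte
	copy(r[:], raw)
	*u = swap(r)
	return nil
}

// device is a hypothetical structure serialized with the new type.
type device struct {
	Vendor LEUUID `json:"vendor"`
}

func main() {
	var d device
	in := []byte(`{"vendor":"00112233-4455-6677-8899-aabbccddeeff"}`)
	if err := json.Unmarshal(in, &d); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("% x\n", d.Vendor[:]) // 33 22 11 00 55 44 77 66 88 99 aa bb cc dd ee ff
	out, err := json.Marshal(d)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(out)) // {"vendor":"00112233-4455-6677-8899-aabbccddeeff"}
}
```

encoding/json finds MarshalText and UnmarshalText by their interface names, which is why a method such as UnmarshalBinaryLittleEndian on the existing type would never be called by it.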

@pborman
Collaborator

pborman commented Feb 9, 2021

Hey, I made an example package: github.com/pborman/dsp0134
Will this package work for you? If so, I am thinking we can make it a sub-package of the uuid package.

@sunsingerus
Author

Thanks a lot for the detailed explanation. I understand now that the initial approach was incorrect. A little-endian UUID has to comply with the interfaces from the encoding package, so it has to be a separate type. The dsp0134 package looks good to me at first glance; however, I need to try it.

Will this package work for you?

Thank you very much for the package, I'll try tomorrow, sorry for the delay.

@pborman
Collaborator

pborman commented Feb 16, 2021

Did this work? I am wondering whether it is worth making into a supported package. I am thinking of a sub-package of this UUID package.
