[proposal] zero-allocation Set #4142

Closed
MrAlias wants to merge 15 commits

MrAlias
Contributor

@MrAlias MrAlias commented May 26, 2023

This proposes to separate the data a Set represents from the Set itself. Doing so allows for:

  • zero(-ish) allocations to the heap and little computation overhead during set construction
  • reduction in size of data when passing Sets (i.e. the metric pipeline)

It also means that while Sets remain comparable, their equivalence can no longer be tested with ==. The existing Distinct type or the Equals method needs to be used to test equivalence.

Design

  • Add a registry to globally hold all Set data.
    • Sets are constructed by sorting and de-duplicating their data then passing that data to the registry for a unique ID.
    • The data in the registry uses reference counting to know when it can be removed from the registry
  • Redefine a Set to hold a pointer to a unique ID of the data (note: this is not the same pointer held by other Sets for the same data)
    • When a Set is constructed, a finalizer is set on its ID. When the ID becomes unreachable, the reference it holds to the data is removed from the registry.
  • Redefine a Distinct to hold the unique ID of the Set data the Distinct was made from
  • Use pools for all set data and IDs to amortize heap allocations
  • A zero allocation implementation of the FNV-1a hash algorithm is added to compute unique IDs for the Set data
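
A minimal sketch of the registry idea from the list above, not the code in this PR. It assumes a simplified KeyValue with string values only, and the names registry, acquire, release, hashString, and hashKVs are hypothetical. It ignores hash collisions and locking, both of which a real registry would have to handle.

```go
package main

import "fmt"

// KeyValue is a simplified stand-in for attribute.KeyValue (assumption:
// string values only, to keep the sketch short).
type KeyValue struct{ Key, Value string }

// Standard 64-bit FNV-1a parameters.
const (
	fnvOffset64 uint64 = 14695981039346656037
	fnvPrime64  uint64 = 1099511628211
)

// hashString folds s into h with FNV-1a: xor each byte, then multiply.
// It reads the string in place, so it does not allocate.
func hashString(h uint64, s string) uint64 {
	for i := 0; i < len(s); i++ {
		h ^= uint64(s[i])
		h *= fnvPrime64
	}
	return h
}

// hashKVs computes an ID for data that is already sorted and de-duplicated.
// Keys and values are hashed back to back with no separator, which is a
// simplification made for this sketch.
func hashKVs(kvs []KeyValue) uint64 {
	h := fnvOffset64
	for _, kv := range kvs {
		h = hashString(h, kv.Key)
		h = hashString(h, kv.Value)
	}
	return h
}

// entry pairs registered data with the number of live references to it.
type entry struct {
	data []KeyValue
	refs int
}

// registry is a hypothetical, single-goroutine registry keyed by data ID.
type registry struct{ entries map[uint64]*entry }

func newRegistry() *registry {
	return &registry{entries: make(map[uint64]*entry)}
}

// acquire registers kvs, or bumps the reference count if the same data is
// already registered, and returns the ID of the data.
func (r *registry) acquire(kvs []KeyValue) uint64 {
	id := hashKVs(kvs)
	if e, ok := r.entries[id]; ok {
		e.refs++
		return id
	}
	r.entries[id] = &entry{data: kvs, refs: 1}
	return id
}

// release drops one reference and removes the data once none remain.
func (r *registry) release(id uint64) {
	e, ok := r.entries[id]
	if !ok {
		return
	}
	e.refs--
	if e.refs == 0 {
		delete(r.entries, id)
	}
}

func main() {
	r := newRegistry()
	kvs := []KeyValue{{"env", "prod"}, {"host", "a"}}
	id1 := r.acquire(kvs)
	id2 := r.acquire(kvs)
	fmt.Println(id1 == id2, len(r.entries))
	r.release(id1)
	r.release(id2)
	fmt.Println(len(r.entries))
}
```

Two acquisitions of the same data share one registry entry, and the entry is dropped only after both references are released.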

Zero Allocations

By owning all of the data and controlling its life-cycle, we are able to effectively use pools for any data that needs to be allocated to the heap. This means that all allocations can be amortized, and creating Sets will effectively require zero allocations to the heap.

Reduced size of Set

Defining a Set with a single *uint64 field means the size of the Set is now 8 bytes. Contrast this to the prior implementation (an interface{} (2 uintptr) referencing an array of N KeyValue) which was (2 * sizeOf(uintptr)) + (N * sizeOf(KeyValue)) (on a 32-bit system with no data this would be 8 bytes).

This reduction in Set data size means that whenever a Set is passed as an argument, the size copied on the stack will be much smaller, but also consistent. This means the Set will always fit in a small stack frame.
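
The sizes of the two layouts can be checked with unsafe.Sizeof. This is only a sketch: oldSet and newSet are hypothetical stand-ins for the prior and proposed Set definitions, not the real types. Note that unsafe.Sizeof measures the value that is copied when a Set is passed, not the array an interface value refers to.

```go
package main

import (
	"fmt"
	"unsafe"
)

// oldSet is a stand-in for the prior layout: an interface value (two
// words) that refers to an array of KeyValue stored elsewhere.
type oldSet struct{ equivalent any }

// newSet is a stand-in for the proposed layout: a single pointer.
type newSet struct{ id *uint64 }

// setSizes reports the size of each stand-in and of one machine word.
func setSizes() (oldSize, newSize, word uintptr) {
	return unsafe.Sizeof(oldSet{}), unsafe.Sizeof(newSet{}), unsafe.Sizeof(uintptr(0))
}

func main() {
	oldSize, newSize, word := setSizes()
	fmt.Printf("old: %d bytes, new: %d bytes, word: %d bytes\n", oldSize, newSize, word)
}
```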

Trade Offs

Equivalence testing

The Set is still comparable. However, Sets created with the same KeyValues will evaluate as unequal when used as map keys or compared with ==. To test equivalence the Equals method needs to be used, and a map should be keyed on the Distinct type instead.
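
A small sketch of that behavior change. The Set and Distinct types here are simplified stand-ins that only mirror the names used in the attribute package, and newSet taking a raw ID is an assumption made for the example.

```go
package main

import "fmt"

// Distinct holds the data ID by value, so it works as a map key.
type Distinct struct{ id uint64 }

// Set holds its own pointer to the data ID.
type Set struct{ id *uint64 }

// newSet gives every Set a separate pointer, even when the ID is the same.
func newSet(dataID uint64) Set {
	id := new(uint64)
	*id = dataID
	return Set{id: id}
}

// Equals compares the IDs the Sets point to, not the pointers themselves.
func (s Set) Equals(o Set) bool {
	if s.id == nil || o.id == nil {
		return s.id == o.id
	}
	return *s.id == *o.id
}

// Equivalent returns the map-key form of the Set.
func (s Set) Equivalent() Distinct {
	if s.id == nil {
		return Distinct{}
	}
	return Distinct{id: *s.id}
}

func main() {
	a, b := newSet(42), newSet(42)
	fmt.Println("a == b:", a == b)
	fmt.Println("a.Equals(b):", a.Equals(b))

	counts := map[Distinct]int{}
	counts[a.Equivalent()]++
	counts[b.Equivalent()]++
	fmt.Println("distinct keys:", len(counts))
}
```

The == operator compares the two pointers, which differ, while Equals and Distinct compare the IDs behind them.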

This incompatibility does not break the API, but it does alter released behavior. Even though the Set is defined with an Equals method and the Distinct type is explicitly declared to be used as a map key instead, this will likely break user (and our metric pipeline) code. Careful consideration of whether this is acceptable is needed.

It might be possible to resolve this, but I have not found a way. More investigation might be beneficial.

Testing

  • Moderate test coverage is included. It is not yet at a release-ready level of testing.
  • The BenchmarkNewSet benchmark is added to show the performance improvement of the changes.
$ go test -run='^$' -bench=BenchmarkNewSet -count=20 > out.txt
$ benchstat out.txt
goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/attribute
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
         │   out.txt    │
         │    sec/op    │
NewSet-8   2.327µ ± 30%

         │   out.txt   │
         │    B/op     │
NewSet-8   19.50 ± 18%

         │  out.txt   │
         │ allocs/op  │
NewSet-8   0.000 ± 0%

MrAlias added 14 commits May 26, 2023 14:22
This is a failed experiment that tried to hold set data in its own
registry and use a uint64 reference number to act as a fast map key and
underlie a set.

It failed because a Set is returned from all the NewSet* functions and
since it is not allocated to the heap there is no way to correctly set a
finalizer for the object.
This is still a failed solution. It converts a pointer to a uintptr via
unsafe.Pointer, but it also then converts it back. This is going to
cause issues because the original pointer address is not guaranteed to
remain the same and casting back could end up pointing at invalid
memory.
@MrAlias
Contributor Author

MrAlias commented Jun 1, 2023

An additional branch from this that would show the SDK changes needed to support this would be helpful.

@pellared
Member

pellared commented Jun 6, 2023

Even though the Set is defined with an Equals method and the Distinct type is explicitly declared to be used as a map key instead, this will likely break user (and our metric pipeline) code.

In my opinion, this is acceptable. This is not an ABI breaking change. Moreover, we always have the option to revert the change.

@MrAlias
Contributor Author

MrAlias commented Jun 7, 2023

Using a weak reference for the pointer held by the Set was explored. Similar to this comment and noted in this issue, there is no way to return the same strong pointer from the weak reference without abusing unsafe.Pointer. If Go ever introduces a moving garbage collector the implementation would break.

Member

@tigrannajaryan tigrannajaryan left a comment

Sorry, I only had a chance to do a cursory look, so mostly superficial comments.

var (
// keyValueType is used in computeDistinctReflect.
keyValueType = reflect.TypeOf(KeyValue{})
var slicePool = sync.Pool{New: func() any { return new([]KeyValue) }}
Member

I am curious how much this pool helps with performance. How much worse is it without this pool, just allocating slices as needed?

func getSlice(length, capacity int) *[]KeyValue {
v := slicePool.Get().(*[]KeyValue)
if cap(*v) < capacity {
*v = make([]KeyValue, length, capacity)
Member

Suggested change
*v = make([]KeyValue, length, capacity)
return make([]KeyValue, length, capacity)

Member

We can also return v to the pool, since we didn't use it.

return Set{id: id}
}

var idPool = sync.Pool{New: func() any { return new(uint64) }}
Member

Can you clarify why we need a pool of ids? Can we always allocate instead?

attribute.Float64Slice("[]float64", []float64{10.23, 941.1, 184e9, -2.3}),
attribute.StringSlice("[]string", []string{"", "one", "two"}),
}
// Pre-sort to remove from first iteration results.
Member

Doesn't this defeat the purpose of the benchmark? Aren't we interested in finding out the performance of a typical case where the attributes are not necessarily sorted? Or is being sorted more typical?

var sets = newSetRegistry(-1)

func newSet(data *[]KeyValue) Set {
id := getID()
Member

I am not quite sure, but is it necessary to allocate (or get from a pool) a pointer to an id? The id is already stored in the setData.key. Can Set.id point to setData.key instead?

//
// A pointer is used so the finalizer can handle reference-counting for
// the sets registry while still being optimized as a map key.
id *uint64
Member

Would it be possible to make this work by using a pointer to setData instead of a pointer to a separate uint64?

@MrAlias
Contributor Author

MrAlias commented Jan 23, 2024

The use of finalizers here does not seem like a production-ready solution. Closing.

@MrAlias MrAlias closed this Jan 23, 2024