-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add a generic set data structure #99
Conversation
All the ones I could find where clumsy to use, I was quite surprised. Hope that Go will soon gain a sets package in the stdlib so that we can get rid of this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pre-approving but the difference pre-sizing computation should be dropped, IMO. Simply set it to the size of s
.
|
||
// Difference returns a set containing the elements of s that are not in x. | ||
func (s *Set[T]) Difference(x *Set[T]) *Set[T] { | ||
result := New[T](max(0, s.Size()-x.Size())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Err... if x
and s
are disjoint the result will be equal to s
so the actual max size it can reach is the size of s
no matter what's the size of x
is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. We are allocating memory, and in this case we must allocate it before knowing how much we really need.
This errs on the side of allocating as little as possible, which is still better than not allocating anything. The map will then grow automatically with the inserts.
We could allocate the maximum size (s.Size()
) and then release what is not used, but Go doesn't currently have a way to shrink memory allocated to a map (except playing GC tricks), see golang/go#54766.
Last, we could just not preallocate anything. But it would have no positive impact compared to the current code, actually it would incur more allocations.
Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If x is larger than s, you're allocating 0.
If x is smaller than s, you're still likely to be under-allocating unless x is totally included in s which shouldn't be the general case.
So you're almost always under-allocating. Is it worth it? If you really want to avoid the growth, then pre-allocate s. For the current use case of having N input directories where N is small (<10), the waste is probably much better then the reallocation+move.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you're almost always under-allocating. Is it worth it?
That is the point. It is always worth it, in the sense that it cannot do harm and sometimes it does good.
For sure, for the sizes we are talking about for cogito usage, as you point out, the difference is unmeasurable.
Still, there is no harm, so I just want to be sure I understand your point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just don't think it's worth it.
I don't know how Go handles the growth but the usual way is to double the current allocation (when still under some limit) whenever you need more room and so you end up with over-allocation anyway in most cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@odormond I understand your position. We will have to agree to disagree :-)
All the ones I could find where clumsy to use, I was quite surprised.
Hope that Go will soon gain a sets package in the stdlib so that we can get rid
of this one.
I split this from the current branch I am working on because it is self-contained, but to
understand why I implemented only this partial API you have to wait to see the next PR.
I have to say that generics and type constraints are good stuff!
PCI-2561