Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add a generic set data structure #99

Merged
merged 1 commit into from
Sep 1, 2022
Merged

Conversation

marco-m-pix4d
Copy link
Contributor

All the ones I could find where clumsy to use, I was quite surprised.

Hope that Go will soon gain a sets package in the stdlib so that we can get rid
of this one.

I split this from the current branch I am working on because it is self-contained, but to
understand why I implemented only this partial API you have to wait to see the next PR.

I have to say that generics and type constraints are good stuff!

PCI-2561

All the ones I could find where clumsy to use, I was quite surprised.

Hope that Go will soon gain a sets package in the stdlib so that we can get rid
of this one.
Copy link
Contributor

@odormond odormond left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pre-approving but the difference pre-sizing computation should be dropped, IMO. Simply set it to the size of s.


// Difference returns a set containing the elements of s that are not in x.
func (s *Set[T]) Difference(x *Set[T]) *Set[T] {
result := New[T](max(0, s.Size()-x.Size()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Err... if x and s are disjoint the result will be equal to s so the actual max size it can reach is the size of s no matter what's the size of x is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. We are allocating memory, and in this case we must allocate it before knowing how much we really need.

This errs on the side of allocating as little as possible, which is still better than not allocating anything. The map will then grow automatically with the inserts.

We could allocate the maximum size (s.Size()) and then release what is not used, but Go doesn't currently have a way to shrink memory allocated to a map (except playing GC tricks), see golang/go#54766.

Last, we could just not preallocate anything. But it would have no positive impact compared to the current code, actually it would incur more allocations.

Or am I missing something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If x is larger than s, you're allocating 0.
If x is smaller than s, you're still likely to be under-allocating unless x is totally included in s which shouldn't be the general case.
So you're almost always under-allocating. Is it worth it? If you really want to avoid the growth, then pre-allocate s. For the current use case of having N input directories where N is small (<10), the waste is probably much better then the reallocation+move.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you're almost always under-allocating. Is it worth it?

That is the point. It is always worth it, in the sense that it cannot do harm and sometimes it does good.
For sure, for the sizes we are talking about for cogito usage, as you point out, the difference is unmeasurable.
Still, there is no harm, so I just want to be sure I understand your point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just don't think it's worth it.

I don't know how Go handles the growth but the usual way is to double the current allocation (when still under some limit) whenever you need more room and so you end up with over-allocation anyway in most cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@odormond I understand your position. We will have to agree to disagree :-)

@marco-m-pix4d marco-m-pix4d merged commit 6934616 into master Sep 1, 2022
@marco-m-pix4d marco-m-pix4d deleted the sets-data-structure branch September 1, 2022 08:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants