Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: hash: make the Sum method less confusing #21070

Closed
bcmills opened this issue Jul 18, 2017 · 12 comments
Closed

proposal: hash: make the Sum method less confusing #21070

bcmills opened this issue Jul 18, 2017 · 12 comments

Comments

@bcmills
Copy link
Contributor

bcmills commented Jul 18, 2017

The Go 1 hash.Hash interface has a Sum method:

    // Sum appends the current hash to b and returns the resulting slice.
    // It does not change the underlying hash state.
    Sum(b []byte) []byte

Individual packages (such as md5) contain Sum functions with very similar signatures but totally different meaning:

  // Sum returns the MD5 checksum of the data.
  func Sum(data []byte) [Size]byte

This makes (hash.Hash).Sum confusing, as illustrated in the review for http://golang.org/cl/49030.

I can see two possible improvements.

  1. We could rename the method to AppendSum (along the lines of the strconv Append functions). Adding such a method to the implementations (but not the Hash interface itself) would be backward-compatible with Go 1.
    // AppendSum appends the current hash to b and returns the resulting slice.
    // It does not change the underlying hash state.
    AppendSum(b []byte) []byte
  1. We could change Sum to not accept a parameter and always return a new slice. If we pair that with appropriate inlining and devirtualization optimizations, it could theoretically be as efficient as appending to an existing slice.
    // Sum returns a new slice containing the current hash.
    // It does not change the underlying hash state.
    Sum() []byte

I recommend option (1), because I think it would migrate more smoothly in binaries that mix Go 1 and Go 2.

@bradfitz
Copy link
Contributor

always return a new slice. If we pair that with appropriate inlining and devirtualization optimizations, it could theoretically be as efficient as appending to an existing slice.

I don't see that happening anytime soon. I think AppendSum and being explicit about memory is still best until we've proven we can make a compiler and/or GC smart enough to not worry about allocation costs.

@bradfitz bradfitz added the v2 An incompatible library change label Jul 18, 2017
@ianlancetaylor ianlancetaylor changed the title proposal (Go 2): hash: make the Sum method less confusing proposal: Go 2: hash: make the Sum method less confusing Feb 27, 2018
@gopherbot gopherbot added this to the Proposal milestone Feb 27, 2018
@ianlancetaylor
Copy link
Member

I agree that renaming the Sum method to AppendSum seems like an appropriate change.

@RaduBerinde
Copy link
Contributor

In go 1, we could rename the argument b to something more explicit like appendTo:

Sum(appendTo []byte) []byte

This is just a cosmetic, backwards compatible change but it can help avoid the confusion.

@bradfitz
Copy link
Contributor

This may not be worth the pain. We'd need to make a new v2 hash package, and then either make v2s of all implementations, or add an additional AppendSum methods to all types alongside their old Sum methods to implement the old [v1] hash package. Both seem kinda gross.

@as
Copy link
Contributor

as commented Nov 22, 2018

Why does the method need to be called AppendSum? What other Append operations could it possibly be referring to? If have an Appender interface type with the signature Append([]byte) []byte, naming the method with the extra suffix AppendSum prevents the standard library from accruing implicitly satisfied Appender types.

The original name Sum should have just been Append. The documentation for the method set seems to agree.

@akavel
Copy link
Contributor

akavel commented Dec 11, 2018

@bradfitz Because of the "pain" you mention, I believe if this might ever be done, "Go2" is the only point where it could be. Please don't casually dismiss it too easily. Please note there's also a huge pain in the current state of events. Because of this very issue, I personally have a huge yellow virtual "WARNING" sign attached to the whole hash package in my mind, with the small print saying: "never touch the hash package without cautiously reading the godoc; there's some well-known minefield somewhere there". And I'm certainly not a noob to Go, being in the community since the 2009. Notably, I do also believe that the current shape of the interface can be unnecessarily risky with regards to use in security-related applications (there's a reason why many Hash implementations are located in the crypto package), and in this domain I believe it's generally better to be safe than sorry, including in API design. For the sake of making the virtual world somewhat safer.

I understand this may not be a "prime time" candidate for the "Go2" process. But once some of the necessary Go2 refactoring approaches/mechanisms get tested and proven on some more "attractive" packages/issues, and people get used to them in the Go 2 transition period, I don't see why the "smaller papercuts" issues like this one cannot get pulled onto the bandwagon as well.

@detailyang
Copy link

Any plan? it's highly useful to minimize memory overhead when the go program will do the sum

BTW: Append API naming style is idiomatic:)

@ianlancetaylor ianlancetaylor removed v2 An incompatible library change Go2Cleanup labels Jun 21, 2023
@ianlancetaylor ianlancetaylor changed the title proposal: Go 2: hash: make the Sum method less confusing proposal: hash: make the Sum method less confusing Jun 21, 2023
@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Jun 21, 2023
@rsc
Copy link
Contributor

rsc commented Jun 21, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals Jun 21, 2023
@rsc
Copy link
Contributor

rsc commented Jun 28, 2023

There are many things we can redo in v2's of packages, but changing the names of methods in widely-used interfaces does not seem to be one of them. For the same reason I don't think we can change, for example, that zero-length Reads are allowed.

We can't add a method to hash.Hash, and some packages only return a hash.Hash, so we can't add methods to an exported concrete implementation because there isn't one.

It seems like the best we can do is provide better examples in the affected packages, and perhaps also a non-executable example in the doc comment for the Sum methods.

@adonovan
Copy link
Member

adonovan commented Jun 29, 2023

Perhaps this would be a sufficient addition to the doc comment:

    // Sum appends the current hash to b and returns the resulting slice.
    // It does not change the underlying hash state.
    //
    // To store the hash into an array of the appropriate size,
    // use the following form (illustrated for sha256):
    // 
    //		var hash [sha256.Size]byte
    //		h.Sum(hash[:0])
    Sum(b []byte) []byte

@rsc
Copy link
Contributor

rsc commented Jul 5, 2023

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

@rsc rsc moved this from Active to Likely Decline in Proposals Jul 5, 2023
@rsc
Copy link
Contributor

rsc commented Jul 12, 2023

No change in consensus, so declined.
— rsc for the proposal review group

@rsc rsc moved this from Likely Decline to Declined in Proposals Jul 12, 2023
@rsc rsc closed this as completed Jul 12, 2023
@golang golang locked and limited conversation to collaborators Jul 11, 2024
@rsc rsc removed this from Proposals Jul 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

10 participants