Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

string: overly-lazy copy from []byte to string #67758

Closed
adam-azarchs opened this issue May 31, 2024 · 4 comments
Closed

string: overly-lazy copy from []byte to string #67758

adam-azarchs opened this issue May 31, 2024 · 4 comments

Comments

@adam-azarchs
Copy link
Contributor

Go version

1.22

Output of go env in your module/workspace:

This reproduces just fine on go.dev/play, but locally I have

GOARCH='amd64'
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOOS='linux'
GOVERSION='go1.22.3'

What did you do?

func printStrings(header, s1, s2 string) {
	fmt.Println(header)
	fmt.Println(s1)
	fmt.Println(s2)
	fmt.Println()
}

func f(s string) string {
	return s
}

func main() {
	buffer := make([]byte, 0, len("some longer string"))
	buffer = append(buffer, "some "...)
	shortString := string(append(buffer, "short string"...))
	longString := string(append(buffer, "longer string"...))
	printStrings("correct:", shortString, string(append(buffer, "longer string"...)))
	printStrings("correct:", string(append(buffer, "short string"...)), longString)
	printStrings("incorrect:", string(append(buffer, "short string"...)), string(append(buffer, "longer string"...)))
	printStrings("correct:", f(string(append(buffer, "short string"...))), string(append(buffer, "longer string"...)))
}

https://go.dev/play/p/xHDQzbVo3W5

What did you see happen?

correct:
some short string
some longer string

correct:
some short string
some longer string

incorrect:
some longer strin
some longer string

correct:
some short string
some longer string

What did you expect to see?

correct:
some short string
some longer string

correct:
some short string
some longer string

incorrect:
some short string
some longer string

correct:
some short string
some longer string

I would expect for

a := f()
b := g()
h(a,b)

to have identical behavior to

h(f(), g())

but it seems not to be the case here if g() is string().

@ianlancetaylor
Copy link
Member

string(x) is a type conversion, not a function call. The order of evaluation rules (https://go.dev/ref/spec#Order_of_evaluation) have nothing to say about when type conversions are evaluated. In an expression like

	printStrings("incorrect:", string(append(buffer, "short string"...)), string(append(buffer, "longer string"...)))

the first call to append must run before the second call to append, but the type conversions to string may occur in any order and at any time after the append calls. In this case the second append call changes the contents of buffer, and the conversion to string is picking up the contents after the second call.

Closing because this behavior is permitted by the language spec.

@ianlancetaylor ianlancetaylor closed this as not planned Won't fix, can't repro, duplicate, stale May 31, 2024
@adam-azarchs
Copy link
Contributor Author

adam-azarchs commented Jun 1, 2024

Yes, it's true that this is technically permitted by the language spec. When it says

when evaluating the operands of an expression, ... all function calls, method calls, receive operations, and binary logical operations are evaluated in lexical left-to-right order.

it doesn't say anything about type conversions, which this technically is. However, it's no accident that the syntax for type conversion looks the same as the syntax for a function call, and in the case of string() converting from a []byte the compiler is in fact turning it into a function call

go/src/runtime/string.go

Lines 75 to 80 in 9b43bfb

// slicebytetostring converts a byte slice to a string.
// It is inserted by the compiler into generated code.
// ptr is a pointer to the first element of the slice;
// n is the length of the slice.
// Buf is a fixed-size buffer for the result,
// it is not nil if the result does not escape.

While that is of course an internal implementation detail, string(buf) looks like a function call, syntactically, and is in fact implemented as a function call, so I don't think it's unreasonable for someone to be surprised that this is undefined behavior.

In for example

func idString1(b []byte) string { return string(b) }
type idString2 string
func cat(s1, s2 string) { return s1 + s2 }
r1 := cat(idString1(append(buf, x)), idString2(append(buf, y)))
r2 := cat(idString2(append(buf, x)), idString1(append(buf, y)))

I think most people would be surprised that r1 and r2 do not match. Yes, order of operations for type conversions is undefined behavior, but I have a hard time seeing why. Possibly some future compiler optimization might be enabled by such reordering, but any case where such a reordering would have an observable effect would by definition be changing the behavior of programs that were previously generating correct outputs (or fixing the behavior of programs which weren't; this kind of string corruption is easy to not notice).

@ianlancetaylor
Copy link
Member

  1. Appending to the same buffer more than once in the same expression is clearly asking for trouble. Code like that should be discouraged even if it is clearly defined.
  2. The language is defined by a spec, not an implementation. We want code to be written to the spec, not the implementation. And in any case there are multiple implementations, and they behave differently when it comes to order of evaluation.
  3. If you really believe that the order of type conversions should be defined by the spec, please file a language change proposal. Thanks.

@zigo101
Copy link

zigo101 commented Jun 5, 2024

By the current spec, this is not a bug. It is just some counter-intuitive.
There is a proposal to make the rule more natural: #27804

BTW, the output of the following program changes (between Go toolchain 1.21 and 1.22). It is also not a bug.

package main

func main() {
  s := make([]byte, 0, 4)
  s = append(s, "go"...)
  defer println(
    string(append(s, "od"...)),
    string(append(s, "lf"...)),
  )
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants