Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reflect: add Value.FieldByIndexErr #48218

Closed
robpike opened this issue Sep 6, 2021 · 30 comments
Closed

reflect: add Value.FieldByIndexErr #48218

robpike opened this issue Sep 6, 2021 · 30 comments

Comments

@robpike
Copy link
Contributor

robpike commented Sep 6, 2021

See #48215, among others.

It's panicking because Value.FieldByIndex follows the chain of embedded fields and encounters a nil. The only way for the caller of FieldByIndex to prevent the panic for arbitrary data is essentially to implement the type walk itself. I have a fix for the issue that wraps FieldByIndex to turn the panic into an error, but it seems that reflect itself should provide that.

So I have a mild proposal that reflect.Value get a new method that returns an error instead of panicking. (We can't change the original as it would require a signature change.) It's trivial to implement; the only hard part is choosing a name.

@robpike robpike changed the title reflect: provide a variant of FieldByIndex that does not panic proposal: reflect: provide a variant of FieldByIndex that does not panic Sep 6, 2021
@gopherbot gopherbot added this to the Proposal milestone Sep 6, 2021
@Splizard
Copy link

Splizard commented Sep 7, 2021

TryFieldByIndex?

@Skarlso
Copy link
Contributor

Skarlso commented Sep 8, 2021

FieldByIndexErr - keeping it simple
FieldByIndexWithError
FieldByIndexNoPanic - more descriptive
StableFieldByIndex || FieldByIndexStable - following some convention here and there where Stable depicts no panicking or crashing allowed.

@mvdan
Copy link
Member

mvdan commented Sep 8, 2021

Names aside, we also have the option of CanFieldByIndex(index []int) bool, similar to others like CanInt or CanConvert. At least it would be more consistent with the rest of the API design.

@robpike
Copy link
Contributor Author

robpike commented Sep 8, 2021

@mvdan That would require a double iteration and arguably more code at the call site, which is unfortunate. Isn't the error return pattern a better model?

@mvdan
Copy link
Member

mvdan commented Sep 8, 2021

Probably. The intent was to list another option - the more API-consistent - for the sake of laying it all out.

@rsc
Copy link
Contributor

rsc commented Sep 8, 2021

This seems worth doing. I suggest FieldByIndexErr, and it returns an error instead of panicking for invalid indexes or nils, and the error for an unexpected embedded nil something like "nil in T1's embedded *T2", to give useful context about where the nil is.

@rsc
Copy link
Contributor

rsc commented Sep 8, 2021

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rhcarvalho
Copy link
Contributor

Is there any similar function in the stdlib that returns an error instead of panicking?

Naively looking through GOROOT I didn't see any other exported function ending with Err, so that does look a bit odd.

$ rg 'func [A-Z]\w+Err\(' $(go env GOROOT)/src --glob '*.go' --glob '!*_test.go'
$ rg 'func [A-Z]\w+Error\(' $(go env GOROOT)/src --glob '*.go' --glob '!*_test.go' | pbcopy
/usr/local/Cellar/go/1.17/libexec/src/syscall/syscall_plan9.go:func NewError(s string) error { return ErrorString(s) }
/usr/local/Cellar/go/1.17/libexec/src/syscall/zsyscall_windows.go:func GetLastError() (lasterr error) {
/usr/local/Cellar/go/1.17/libexec/src/os/error.go:func NewSyscallError(syscall string, err error) error {
/usr/local/Cellar/go/1.17/libexec/src/go/scanner/errors.go:func PrintError(w io.Writer, err error) {
/usr/local/Cellar/go/1.17/libexec/src/cmd/vendor/golang.org/x/mod/module/module.go:func VersionError(v Version, err error) error {
/usr/local/Cellar/go/1.17/libexec/src/cmd/vendor/golang.org/x/xerrors/adaptor.go:func FormatError(f Formatter, s fmt.State, verb rune) {
/usr/local/Cellar/go/1.17/libexec/src/cmd/vendor/golang.org/x/sys/windows/zsyscall_windows.go:func GetLastError() (lasterr error) {
/usr/local/Cellar/go/1.17/libexec/src/cmd/vendor/golang.org/x/sys/unix/gccgo.go:func SyscallNoError(trap, a1, a2, a3 uintptr) (r1, r2 uintptr) {
/usr/local/Cellar/go/1.17/libexec/src/cmd/vendor/golang.org/x/sys/unix/gccgo.go:func RawSyscallNoError(trap, a1, a2, a3 uintptr) (r1, r2 uintptr) {
/usr/local/Cellar/go/1.17/libexec/src/cmd/vendor/golang.org/x/sys/unix/syscall_linux_gc.go:func SyscallNoError(trap, a1, a2, a3 uintptr) (r1, r2 uintptr)
/usr/local/Cellar/go/1.17/libexec/src/cmd/vendor/golang.org/x/sys/unix/syscall_linux_gc.go:func RawSyscallNoError(trap, a1, a2, a3 uintptr) (r1, r2 uintptr)
/usr/local/Cellar/go/1.17/libexec/src/cmd/api/testdata/src/pkg/p2/p2.go:func NewError(s string) error {}
/usr/local/Cellar/go/1.17/libexec/src/cmd/go/internal/mvs/errors.go:func NewBuildListError(err error, path []module.Version, isVersionChange func(from, to module.Version) bool) *BuildListError {
/usr/local/Cellar/go/1.17/libexec/src/cmd/go/internal/robustio/robustio.go:func IsEphemeralError(err error) bool {
/usr/local/Cellar/go/1.17/libexec/src/cmd/vet/testdata/print/print.go:func UnexportedStringerOrError() {

In the opposite case, we use Must when we want a panic instead of an error, so perhaps what I'm asking is whether we have an established counterpart to "Must"?


I decided to look at my my GOPATH / module cache. I did find a few occurrences of func SomethingOrError(...) error. Would FieldByIndexOrError be a good fit for the func proposed here?

// FieldByIndexOrError returns the nested field corresponding to index.
// It returns a non-nil error if v's Kind is not struct or if (...).
func (v Value) FieldByIndexOrError(index []int) (Value, error)

@dsnet
Copy link
Member

dsnet commented Sep 9, 2021

Currently FieldByName returns a "zero Value" if no field was found. Can't we change FieldByIndex to also return a "zero Value" if no field was found. Generally, it is backwards compatible to switch from case where it panicked to one where it doesn't. Case that did rely on a panic will probably still panic since most usages of a zero Value panic.

@robpike
Copy link
Contributor Author

robpike commented Sep 9, 2021

@dsnet There are two reasons I'd rather not go that way, at least exclusively.

The first is that code that panics is annoying, but at least the panic points to the line where things went wrong. If FIeldByIndex returned a zero value, the code might (and existing code certainly would) continue to execute and would either crash somewhere else, making the problem more obscure, or execute to completion incorrectly.

The second is that FieldByIndex is unusual in that it loops. That's the root of the problem: It's difficult for the caller to protect against bad data because it might be nested deep. Thus when it does fail, the reason why is still hidden. I would like an error to come back because FieldByIndex knows what field was nil. If it just returns a zero value, the caller must still do all the work we are trying to avoid in order to give a helpful error message.

So although we might want to change the function not to panic as well, for the sake of debugging and helpful error messages we should have a variant that explains what went wrong, and an error return is the canonical way to do that.

@dsnet
Copy link
Member

dsnet commented Sep 9, 2021

but at least the panic points to the line where things went wrong

That's a good argument.

I would like an error to come back because FieldByIndex knows what field was nil. If it just returns a zero value, the caller must still do all the work we are trying to avoid in order to give a helpful error message.

Is the error intended for human consumption, machine interpretation, or both?

Many of the suggestions above return an error. I'm somewhat hesitant about an error return type because:

  1. I suspect there are valid use cases to be able to machine interpret where the problem occurred, and an error is a sharp API since you'll need to type assert it to some named error type for more semantic information, and
  2. the API of reflect thus far tends to avoid ever returning an error. As @mvdan notes, it seems that much of the API is built around predicates that tell you whether an operation is possible (e.g., Value.CanXXX) and the operation itself which may panic (e.g., Value.Set).

If we were to add new API, I'd prefer to see something that didn't return an error but returns sufficient information to explain what's happening. Perhaps:

// FieldByIndexN iterates through index successively trying to index the nth field of v.
// If the current value is not a struct, is a nil pointer, or the index is out of bounds,
// then iteration ends returning the current value and the iteration index.
//
// FieldByIndexN (if fully successful) is equivalent to returning (v.FieldByIndex(index), len(index)).
func (v Value) FieldByIndexN(index []int) (Value, int)

Thus, you can easily construct your own error message from this information, but also make use of this for machine interpretation of whether indexing was valid.

@robpike
Copy link
Contributor Author

robpike commented Sep 9, 2021

I understand the reluctance to go against the grain of the existing API in the reflect package, but Value.FieldByIndex is unusual (unique?) in the way it operates, and the reflect API is very old. I don't think we do the same thing today if we stared over. Those zero values are seriously annoying to program against, for example. I'd rather just know what went wrong. I've written more reflect-driven code than most, and it's always painful tracking down missed zero values when they go wrong.

For the particular FieldByIndexN you suggest, it unpacks the simple loop inside FieldByIndex and throws it back to the caller, making it needlessly cumbersome and slower. But that loop is already written, tightly, inside the existing function.

So although the arguments not to just return an error make sense in isolation, in modern Go errors are what we do. Inventing a new way of looping, or adhering to an API convention that hasn't weathered well, is so much messier than the trivial change to just add an error return.

@dsnet
Copy link
Member

dsnet commented Sep 12, 2021

If we're going to add new API for this, I'd like to discuss whether related functionality should also be added or not.

In the "encoding/json" package, we have logic during Unmarshal that is similar to Value.FieldByIndex, but instead of panicking on nil pointers, we expect it to allocate a new struct on our behalf and proceed indexing into the nested struct. You could call this Value.MutableFieldByIndex or something.

I'm not seriously proposing adding this method, but it's not clear to me why one function might meet the bar for inclusion and not this one.

In both cases, working around the lack of a native method for this is about 10-20 lines of code.

@robpike
Copy link
Contributor Author

robpike commented Sep 12, 2021

@dsnet I am not sure what you are saying here. You're mixing two packages and have some hypotheticals in there. Can you please clarify?

@dsnet
Copy link
Member

dsnet commented Sep 12, 2021

I'm making the argument that other useful variants of FieldByIndex exist beyond just having one that avoids panicking. Given the existence of other variants, I'd like to know why this variant meets the bar for inclusion over other reasonable variants (e.g., MutableFieldByIndex mentioned in my comment).

@robpike
Copy link
Contributor Author

robpike commented Sep 12, 2021

I suggest that there should have been only one variant, the one I am proposing, as it is sufficient and easy to use. History has saddled us with a panicking variant, which is troublesome.

Note that this is not a widely used function. In the standard library, outside of tests it appears only a handful of times, in gob and xml and template, and in one vendored package.

So another answer, which I would be fine with, is to leave things alone and I'll just deal with the panics with a wrapper. I opened the issue to get a sense of what people want. Sometimes "nothing new" is the right answer.

@dsnet
Copy link
Member

dsnet commented Sep 12, 2021

I suggest that there should have been only one variant, the one I am proposing, as it is sufficient and easy to use.

Easy to use is nice, but performance is key too. The currently proposed variant returns an error, which probably implies an allocation (otherwise I question the utility of an error return value). As such, I think that prevents it's use in performance sensitive code. For example, in "encoding/json":

fv := v
for _, i := range f.index {
if fv.Kind() == reflect.Ptr {
if fv.IsNil() {
continue FieldLoop
}
fv = fv.Elem()
}
fv = fv.Field(i)
}

While it would be correct to switch the logic above to the variant proposed here, we probably still wouldn't do it for performance reasons.

@Skarlso
Copy link
Contributor

Skarlso commented Sep 12, 2021

@dsnet Before jumping the "but does it perform" gun, I very much would like to see some metrics about this claim. I suspect the change is negligible, but definitely let's back that up with some numbers. ;)

@dsnet
Copy link
Member

dsnet commented Sep 12, 2021

I very much would like to see some metrics about this claim.

Sure. https://play.golang.org/p/D2O37buuqde shows that FieldByIndexN (as I proposed above) is 2.7x faster than FieldByIndexErr and this is with FieldByIndexErr returning a fairly useless error. If we have FieldByIndexErr spend effort to construct something more meaningful like which field was nil, then it ends up being 12x slower.

Also, usage of FieldByIndexN in the json logic would be something like:

fv, n := v.FieldByIndexN(f.index)
if n < len(f.index) {
    continue FieldLoop
}

@Skarlso
Copy link
Contributor

Skarlso commented Sep 12, 2021

Cheers, thanks for the numbers! That's a good point, so document that in the comment of the function that they will sacrifice performance in turn of stability.

@robpike
Copy link
Contributor Author

robpike commented Sep 12, 2021

@dsnet I don't understand your point about allocation. In the overwhelming majority of cases the error will be nil, and there will be no allocation. The case that causes the problem is a nested embedded field with a nil pointer to a struct inside. That's very rare.

But you seem to have a very different idea about this whole problem. I am proposing making it easier for the few clients that use this routine to be shorter and safer. There is a mismatch in our expectations here. Nothing I have seen in practice would indicate this is a bottleneck. But you are arguing that this a performance-critical function.

FieldByIndexN would serve, but its client would need to do more work to get a valuable error. I guess that's our other disagreement: I want a good error return from the function—in fact that's all I want—but you don't seem as interested in that.

Look, this is a tiny matter that shouldn't require this much discussion. Let's see what the proposal committee thinks.

@rsc
Copy link
Contributor

rsc commented Sep 12, 2021

I agree about this being overanalyzed.
In the reflect API, misuse - liking calling .Len on a pointer or passing invalid indexes - panics.
We are not going to reconsider that decision today.

When we added FieldByIndex, we thought the only possible error was misuse, hence no error.

Embedded nil struct pointers are not misuse:
they are a property of the input that cannot be predicted by the caller.
Given an index set that is clearly valid for t.FieldByIndex(index),
there is no way to predict whether it is valid for v.FieldByIndex(index)
short of executing the exact code that the function does.

Remember the context from #48215:

type A struct {
	S string
}
type B struct {
	*A
}
tmpl := template.Must(template.New("").Parse(`{{ .S }}`))
// tmpl.Execute panics
if err := tmpl.Execute(io.Discard, B{}); err != nil {
	t.Fatal(err)
}

When the template engine's call to v.FieldByIndex panics in this case, that is not from misuse.

Yes, we could look into returning a zero value from FieldByIndex.
But we adjusted the panic message in 2014.
I can only surmise we believed then that we couldn't change the fact of the panic for fear of breaking callers.
Surely that argument is only stronger now.

That leaves adding a variant that doesn't panic.
We could of course make the variant return a zero value, but then the error will be "no field S", which is confusing.
Returning an error lets us give a clearer message: "B has embedded nil *A".
That seems like a clear win.

Yes, there are no functions that return errors today in reflect,
but that's because the API was designed entirely so that the only potential failures were due to misuse.
The combination of nil embedded struct pointers and FieldByIndex is the only case I see that is a non-misuse failure.

We also have no precedent for what to do with a function that has no error result but turns out to need one.
We are usually more careful than that.
We just didn't see the interactions here early enough.

So we are in uncharted waters in two different ways here.
FieldByIndexErr seems like a completely reasonable course to chart in this unusual situation.

@rsc rsc changed the title proposal: reflect: provide a variant of FieldByIndex that does not panic proposal: reflect: add Value.FieldByIndexErr Sep 15, 2021
@bcmills
Copy link
Contributor

bcmills commented Sep 15, 2021

My litmus test for reflect is that it should express operations that are possible in ordinary Go code in a straightforward way, and should not try to simplify operations that are not possible in ordinary Go code.

The ordinary Go equivalent of the code in #48215 is something like this (https://play.golang.org/p/hgtM9jD2DnL):

func TestFieldAccess(t *testing.T) {
	type A struct {
		S string
	}
	type B struct {
		*A
	}

	b := B{}
	fmt.Println(b.S)
}

A panic-avoiding version of that ordinary Go code today would have to traverse each (implicit) field individually, and check for nil at each step:

	b := B{}
	if a := b.A; a != nil {
		fmt.Println(a.S)
	}

And that is exactly what the equivalent reflect code must do today, using reflect.Value.IsNil and reflect.Value.Field instead of reflect.FieldByIndex (#48218 (comment)).

A variant of the .S operation that avoids the panic might look like a _, ok form:

func TestFieldAccess(t *testing.T) {
	type A struct {
		S string
	}
	type B struct {
		*A
	}
	if s, ok := b.S; ok {
		fmt.Println(b.S)
	}
}

And that is more-or-less exactly what the proposed FieldByIndexErr would implement.

So I think we should add a panic-avoiding shorthand to the reflect.Value API if (and only if!) we add the corresponding panic-avoiding shorthand to the language proper. Otherwise, reflect-based code that needs to traverse embedded fields of pointer types can use the same pattern required in ordinary Go code.

@robpike
Copy link
Contributor Author

robpike commented Sep 15, 2021

Starting to believe the right response here is to make the function document a little stronger, and just to use the recover I've already written.

No good deed goes unpunished.

@rsc
Copy link
Contributor

rsc commented Sep 22, 2021

I disagree for the specific case of FieldByIndex. We could have avoided adding that in the first place, but we decided it was so incredibly common as to merit having it, to avoid making everyone reinvent it. Everyone will have to reinvent FieldByIndexErr too, one way or another. We should address this need.

It's true that FieldByIndex is special. But it really is special (and common).

@rsc
Copy link
Contributor

rsc commented Oct 6, 2021

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Oct 13, 2021

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: reflect: add Value.FieldByIndexErr reflect: add Value.FieldByIndexErr Oct 13, 2021
@rsc rsc modified the milestones: Proposal, Backlog Oct 13, 2021
@robpike robpike self-assigned this Oct 24, 2021
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/357962 mentions this issue: reflect: add FieldByIndexErr

@NicklasWallgren
Copy link

Has there been any discussions around the possibility of adding FieldByNameErr as well?

@ianlancetaylor
Copy link
Member

@NicklasWallgren Not that I know of. One can always call v.Type().FieldByName and then pass the Index field to v.FieldByIndexErr.

@rsc rsc unassigned robpike Jun 23, 2022
@rsc rsc moved this to Accepted in Proposals Aug 10, 2022
@rsc rsc added this to Proposals Aug 10, 2022
@rsc rsc removed this from Proposals Nov 2, 2022
@golang golang locked and limited conversation to collaborators Jun 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests