Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: panic on embedded struct pointer that implements Unmarshaler #30148

Closed
segevfiner opened this issue Feb 9, 2019 · 12 comments
Closed

Comments

@segevfiner
Copy link
Contributor

segevfiner commented Feb 9, 2019

What version of Go are you using (go version)?

$ go version
go version go1.11.5 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="<snip>/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="<snip>/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build174238184=/tmp/go-build -gno-record-gcc-switches"

What did you do?

https://play.golang.org/p/c6Q_ehlEHMQ

What did you expect to see?

Unmarshaling works.

What did you see instead?

panic: json: Unmarshal(nil *main.A)

Analysis

This is caused by encoding/json not initializing (new) the embedded struct pointer field. Type asserting the embedding struct to Unmarshaler works, with the interface value holding a pointer to the embedding struct, but on calling the UnmarshalJSON method it gets called with the embedded struct as the receiver, which is nil.

Also see go-yaml/yaml#436

@dsnet
Copy link
Member

dsnet commented Feb 9, 2019

I don't think there's anything that we can do without changing the reflect package (and since the reflect package is a dynamic representation of the Go language, this may mean changing the Go language itself).

The fundamental problem is that embedded types have their methods forwarded to the parent. Thus, type B has an UnmarshalJSON method. From the perspective of the json package, it has no idea that it needs to also allocate a value of type A since it sees the UnmarshalJSON method on B first and calls it. You could argue that json should investigate whether the UnmarshalJSON was forwarded from embedding, and if so, allocate the necessary embedded receivers. However, reflect doesn't provide an API to distinguish between methods directly declared on the parent, versus those from embedded children.

@dsnet
Copy link
Member

dsnet commented Feb 9, 2019

Also, the fact that you can't directly call UnmarshalJSON on an allocated B suggest that the user code itself is problematic.

var b B
err := b.UnmarshalJSON([]byte(`{"hello": "world"}`))
fmt.Println(err) // json: Unmarshal(nil *main.A)

@segevfiner
Copy link
Contributor Author

segevfiner commented Feb 10, 2019

I guess fixing this would require extending reflect with the ability to give you information about the chain of embedded structs from where a method implementation comes from. But maybe it also requires additional special handling so that an embedded UnmarshalJSON doesn't interfere with serializing the struct itself.

Any sort of input on the zero value of B will trigger this since encoding/json will try to call it's embedded UnmarshalJSON method.

@dsnet
Copy link
Member

dsnet commented Feb 10, 2019

There is no way in Go to embed a type and not have it's method set be forwarded. Embedding forwards both fields and methods and is a common gotcha in Go and a source of bugs.

That said, you can destructively cancel out the forwarded method if another embedded type has the same method. In such a case, the parent type will not have forwarded methods that have naming conflicts.

Consider this modified example of yours: https://play.golang.org/p/UnR4h63VkwH

However, I should note that this approach is a hack. In general I recommend against the use of embedding. It tends to cause more problems than it solves.

@katiehockman
Copy link
Contributor

@segevfiner is this still a bug that you want addressed, or do you feel that your question has been answered? If you think this is more of a question, and less of a bug at this point, you can continue the conversation here: https://golang.org/wiki/Questions

@segevfiner
Copy link
Contributor Author

Definitely a bug. If an embedded value works, why shouldn't an embedded pointer? Also considering that this is not documented anywhere, and calls the user UnmarshalJSON with a nil pointer receiver which is completely unexpected here and will likely cause a random panic suprising the user/developer.

@katiehockman
Copy link
Contributor

@segevfiner thanks for following up.

/cc @rsc

@katiehockman katiehockman added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Feb 11, 2019
@dsnet
Copy link
Member

dsnet commented Feb 11, 2019

Definitely a bug.

I disagree.

If an embedded value works, why shouldn't an embedded pointer?

Embedded pointers already have asymmetric behavior compared to the non-pointer versions.

Examples:

// type B { *A; *a }
b.A = 5        // may panic for embedded pointer, never panics for embedded value
b.a = new(int) // impossible to initialize an embedded unexported pointer (see #21357)

I don't believe asymmetry here is an argument that this is a bug.

calls the user UnmarshalJSON with a nil pointer receiver which is completely unexpected here

The json did call UnmarshalJSON with an allocated pointer to (*B).UnmarshalJSON. It is the Go language semantics that then forwarded the call to the underlying (*A).UnmarshalJSON with a nil pointer. Again, this is not the json package's fault but rather surprising behavior around embedding in Go.

cause a random panic surprising the user/developer.

The json package is not panicking. Your example code is calling panic on L30.

I don't think there is anything actionable here for json other than a proposal to first change reflect or the Go language.

@dsnet dsnet removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Feb 11, 2019
@segevfiner
Copy link
Contributor Author

Whether you call this a bug or not. It's still a behavior that surprised me and I didn't expect.

Of course they have different/asymmetric behavior, but Go does try to hide some of the differences where it can between pointers and values. Not all cases of course since they arw different after all.

Do consider that this can easily happen in some deeply nested struct inside some other struct and maybe even on some slice of such structs which allocates items dynamically.

Yeah json isn't panicking. But the user UnmarshalJSON called with an unexpected nil receiver can easily do.

Except documenting this, if there is any convention of documenting such gotchas... It's quite possible that fixing this would require extending/modifying reflect to be able to fix this.

@dsnet
Copy link
Member

dsnet commented Feb 11, 2019

Embedded types are in general difficult to use properly and naturally lead to surprisingly spooky action at a distance. This is a general gotcha that isn't specific to json. As such, I don't believe we need to complicate the json documentation (which is already complicated) with gotchas from the language itself. We cannot possibly document every edge-case and expect the documentation to still be readable, especially in a case where it is not the fault of json.

@katiehockman
Copy link
Contributor

I'm going to close this issue, then, and we can re-open if this needs to be evaluated again later on.

@sabey
Copy link

sabey commented Jan 24, 2020

If it's impossible to allocate a *nestedField couldn't we alternatively check within the indirect function if the Unmarshaller belongs to a struct, and if so loop over the anonymous fields and if only one anonymous field contains the Unmarshaller interface return an error that it's unsupported instead of panicking?

@golang golang locked and limited conversation to collaborators Jan 23, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants