Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: cannot unmarshal into unexported embedded pointer types #21357

Closed
dsnet opened this issue Aug 8, 2017 · 31 comments
Closed

encoding/json: cannot unmarshal into unexported embedded pointer types #21357

dsnet opened this issue Aug 8, 2017 · 31 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. release-blocker
Milestone

Comments

@dsnet
Copy link
Member

dsnet commented Aug 8, 2017

A consequence of fixing #21353 is that you cannot round-trip marshal/unmarshal the following:

type embed struct{ X int }
type S struct{ *embed }

in := &S{&embed{5}}
out := &S{}

b, _ := json.Marshal(in) // string(b) == `{"X": 5}`

// Panics, since out.embed is nil and it cannot initialize that field.
json.Unmarshal(b, out)

// Manually allocated embed would allow this to work:
out.embed = &embed{}
json.Unmarshal(b, out)

The problem is that the json package had been relying on a bug within the reflect package, where it was able to allocate the unexported field. With that being fixed in #21353, what should be the correct behavior when unmarshaling into a pointer of an embedded unexported struct type? We can either:

  • Always ignore the embedded struct
  • Only ignore the embedded struct if not allocated
  • Return an error when the embedded struct is not allocated

I believe the 3rd option is the least surprising.

\cc @zombiezen @adams-sarah @cybrcodr @jba @pongad

@dsnet dsnet added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. NeedsFix The path to resolution is known, but the work has not been done. release-blocker labels Aug 8, 2017
@dsnet dsnet added this to the Go1.10 milestone Aug 8, 2017
@gopherbot gopherbot removed the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Aug 8, 2017
@pongad
Copy link
Contributor

pongad commented Aug 9, 2017

Sorry if this has been raised before, I'm not yet very familiar with this issue.

What if we slightly change the problem to this:

package other

type embed struct{ X int }
type S struct { *embed }

func NewS(x int) *S {
  return &S{&embed{x}}
}

package main
in := other.NewS(5)
out := &other.S{}
b, _ := json.Marshal(in) // string(b) == `{"X": 5}`
json.Unmarshal(b, out)

I assume Unmarshal panics, because of the same reason.
But now we can't out.embed = &other.embed{} because we can't access the field right? IIUC, to make the round-trip work in general, packages must provide a way to allocate all embedded structs with exported members. This is easy in the above example, just call NewS(0), but I can see it being more problematic as structs grow.

I think choice 1 has the advantage that unmarshalling into an embedded field would unconditionally not work instead of working only if some properties you might not have control over is satisfied.

@dsnet
Copy link
Member Author

dsnet commented Aug 9, 2017

It's not json.Marshal, but json.Unmarshal because it does not have access to S.embed, which is a private field.

@pongad
Copy link
Contributor

pongad commented Aug 9, 2017

I assume Marshal panics, because of the same reason.

Do you mean in the line above? You're right, I mistyped. Edited.

@dsnet dsnet added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. and removed NeedsFix The path to resolution is known, but the work has not been done. labels Aug 9, 2017
@cybrcodr
Copy link
Contributor

cybrcodr commented Aug 9, 2017

I support the solution of returning an error. Users will need to either explicitly construct the embed instance if they do care about unmarshaling values into it, or if they don't, they can tag it with json:"-" to skip unmarshaling.

@gopherbot
Copy link

Change https://golang.org/cl/53643 mentions this issue: cmd/compile: consider pointers to unexported embedded types as unexported

@dsnet dsnet self-assigned this Aug 12, 2017
@rsc
Copy link
Contributor

rsc commented Aug 21, 2017

Maybe we should just ignore/reject pointers to unexported embedded types unconditionally. The "only if they're allocated" is a bit surprising.

@s-mang
Copy link
Contributor

s-mang commented Aug 21, 2017

Unless I am mistaken, this would also break something like:

type a struct {
    B string
}

type Parent struct {
    a *a `json:"A"`
}

right?

I think this is probably more common than an embedded unexported pointer.

@dsnet
Copy link
Member Author

dsnet commented Aug 21, 2017

In your example, it's the B field that is embedded into a, from the view of Parent, there is no embedding happening.

@rsc, are you proposing ignoring pointers to unexported embedded types only for unmarshal or marshal+unmarshal?

@s-mang
Copy link
Contributor

s-mang commented Aug 21, 2017

my mistake, it looks like json pkg ignores a field despite the tag.
i was sure we did this in datastore, but... no matter.

@gopherbot
Copy link

Change https://golang.org/cl/60410 mentions this issue: cmd/compile: fix and improve struct field reflect information

@rsc
Copy link
Contributor

rsc commented Aug 31, 2017

Yes, I was proposing to ignore pointers to unexported embedded types entirely.

@rsc
Copy link
Contributor

rsc commented Aug 31, 2017

(For both Marshal and Unmarshal, unconditionally, no matter what their values.)

@dsnet dsnet added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels Aug 31, 2017
@dsnet
Copy link
Member Author

dsnet commented Aug 31, 2017

I'll implement the change and see what breaks as a result of this (hopefully nothing).

gopherbot pushed a commit that referenced this issue Sep 5, 2017
The previous logic was overly complicated, generated suboptimally
encoded struct type descriptors, and mishandled embeddings of
predeclared universal types.

Fixes #21122.
Fixes #21353.
Fixes #21696.
Fixes #21702.
Updates #21357.

Change-Id: If34761fa6dbe4af2af59dee501e7f30845320376
Reviewed-on: https://go-review.googlesource.com/60410
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
@gopherbot
Copy link

Change https://golang.org/cl/76851 mentions this issue: encoding/json: always ignore embedded pointers to unexported struct types

@dsnet
Copy link
Member Author

dsnet commented Nov 10, 2017

I'm sad to discover that this breaks many packages in github.com/google/google-api-go-client.

For example in dataflow/v1b3/dataflow-gen.go:L264-L276:

func (s *ApproximateProgress) UnmarshalJSON(data []byte) error {
	type noMethod ApproximateProgress
	var s1 struct {
		PercentComplete gensupport.JSONFloat64 `json:"percentComplete"`
		*noMethod
	}
	s1.noMethod = (*noMethod)(s)
	if err := json.Unmarshal(data, &s1); err != nil {
		return err
	}
	s.PercentComplete = float64(s1.PercentComplete)
	return nil
}

The good news is that the fix is trivial: just export noMethod as NoMethod. Also, this is generated code, so it should be easy to fix the generator to stop doing this and output something valid.

\cc @zombiezen Who owns the generator for google-api-go-client?

@zombiezen
Copy link
Contributor

Effectively, @jba and I do. We can roll out a generated code fix pretty quickly.

@dsnet
Copy link
Member Author

dsnet commented Nov 10, 2017

Can you make the suggested change? Essentially s/noMethod/NoMethod/g. Since the type is defined within the function, it will not be publicly accessible.

I can review the CL.

@jba
Copy link
Contributor

jba commented Nov 10, 2017

I happen to be working in the neighborhood. I'll do it.

@zombiezen
Copy link
Contributor

https://code-review.googlesource.com/19571 regenerates the code.

@jba
Copy link
Contributor

jba commented Dec 1, 2017

As of 6be1c09, the json implementation is broken with respect to duplicate fields. Same-named fields at the same nesting level should annihilate each other, as this playground example shows. But at tip, the json-encoded string is {"Dup":1} instead of the correct {}.

@dsnet
Copy link
Member Author

dsnet commented Dec 1, 2017

If the new behavior is to consistently ignore any exported fields from embedded pointers to unexported structs, then this seems to be working as intended.

It seems inconsistent to me if we ignore unmarshaling exported fields from embedded pointers to unexported structs, but still consider such fields for field annihilation.

@jba
Copy link
Contributor

jba commented Dec 2, 2017

There's an inconsistency either way. Whatever the ability of the reflect package to allocate certain values, a duplicate field cannot be selected from a struct. That's a fact about the type, not about the value. As it stands now, encoding/json behaves inconsistently w.r.t. the type. With

type T struct { embed1;  *embed2 }
type embed1 struct{ Dup int }
type embed2 struct{ Dup int }

Dup is marshaled, but change *embed2 to embed2 or Embed2 or *Embed2, and it isn't. In all four cases, T{}.Dup will not compile.

@jba
Copy link
Contributor

jba commented Dec 2, 2017

I should say that this is obviously an edge case, and I have no stake in the answer. I just want to make sure we think about it.

@dsnet dsnet added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. and removed NeedsFix The path to resolution is known, but the work has not been done. labels Dec 2, 2017
@rsc
Copy link
Contributor

rsc commented Dec 4, 2017

CL 76851 was both mailed and submitted after the freeze, and it is breaking things in questionable ways, such that we're not sure what the right semantics are. Given all that, it seems like the right thing for Go 1.10 is to revert CL 76851, to restore the Go 1.9 behavior.

@dsnet
Copy link
Member Author

dsnet commented Dec 4, 2017

It’s not possible to revert to Go1.9 behavior without also reverting the compiler fix that CL/60410 addresses.

@rsc
Copy link
Contributor

rsc commented Dec 6, 2017

I looked through this again. This is really unfortunate. Going back to Joe's original example:

type embed struct{ X int }
type S struct{ *embed }

in := &S{&embed{5}}
out := &S{}

b, _ := json.Marshal(in) // string(b) == `{"X": 5}`

// Panics, since out.embed is nil and it cannot initialize that field.
json.Unmarshal(b, out)

// Manually allocated embed would allow this to work:
out.embed = &embed{}
json.Unmarshal(b, out)

We tried ignoring that field entirely, and that broke in various ways, so I think it seems pretty clear we should move on to Joe's option 3: return an error if we want to fill out embed but are now prevented from doing that.

Ignoring the field is much more surprising than I realized: it's a silent behavioral change. Option 3 should only change some former accidental successes into error results, which is usually preferred anyway.

@dsnet, can you look into a fix for encoding/json? (If not let me know and I will make time.) Thanks.

@dsnet
Copy link
Member Author

dsnet commented Dec 6, 2017

Can do.

@gopherbot
Copy link

Change https://golang.org/cl/82135 mentions this issue: encoding/json: error when trying to set an embedded pointer to unexported struct types

@dsnet
Copy link
Member Author

dsnet commented Dec 7, 2017

The new policy fixed all the known breakages inside Google (with the exception of googleapis/google-cloud-go#813) without any changes to user code.

lestrrat added a commit to lestrrat-go/jwx that referenced this issue Feb 2, 2018
sfllaw added a commit to sfllaw/hkp that referenced this issue Mar 3, 2018
Because of golang/go#21353, it is no longer possible for
`json.Unmarshal` to set values into unexported structs. This is reported
in golang/go#21357:

`json: cannot set embedded pointer to unexported struct: jsonhkp.publicKey`

The correct solution is to export the struct, because it is a public
interface.

Contributed by @quantcast: https://quantcast.com.
jba added a commit to googleapis/google-cloud-go that referenced this issue Jul 12, 2018
DO NOT SUBMIT: test still fails because of bug in encoding/json fix.

As of Go 1.10, a bug in reflect has been fixed, so it is no longer
possible to create pointers to embedded, unexported structs. As a result,
encoding/json's output has changed in these cases. (See golang/go#21357).

We make the same change in our fields package, for consistency.

Fixes #813.

Change-Id: I836ddc6f9c4e8ef9e37643b66f287f2916799c93
rvdwijngaard pushed a commit to rvdwijngaard/aws-xray-sdk-go that referenced this issue Sep 26, 2018
this commit fixes a runtime panic "cannot set embedded pointer to unexported struct: sampling.samplingRule"

see golang/go#21357
@golang golang locked and limited conversation to collaborators Dec 7, 2018
@rsc rsc unassigned dsnet Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. release-blocker
Projects
None yet
Development

No branches or pull requests

8 participants