Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: cannot unmarshal custom interface value #45512

Open
colin-sitehost opened this issue Apr 12, 2021 · 21 comments
Open

encoding/json: cannot unmarshal custom interface value #45512

colin-sitehost opened this issue Apr 12, 2021 · 21 comments
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@colin-sitehost
Copy link

colin-sitehost commented Apr 12, 2021

What version of Go are you using (go version)?

$ go version
go version go1.16.3 linux/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/user/.cache/go-build"
GOENV="/home/user/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/user/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/user/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.16.3"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/user/src/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build2204759132=/tmp/go-build -gno-record-gcc-switches"

What did you do?

https://play.golang.org/p/VMWbHeo54sj

p := private(*new(wrap))
err := json.Unmarshal([]byte("1"), &p)
fmt.Println(p, err)

p = new(wrap)
err = json.Unmarshal([]byte("1"), p)
fmt.Println(*p.(*wrap), err)

i := (interface{})(*new(wrap))
err = json.Unmarshal([]byte("1"), &i)
fmt.Println(i, err)

i = new(wrap)
err = json.Unmarshal([]byte("1"), i)
fmt.Println(*i.(*wrap), err)

What did you expect to see?

Any interface should be able to be assignable:

1 <nil>
1 <nil>
1 <nil>
1 <nil>

What did you see instead?

Instead it appears that interface{} is special, and unless main.private implements json.Unmarshaler it will error at runtime:

0 json: cannot unmarshal number into Go value of type main.private
1 <nil>
1 <nil>
1 <nil>

This restriction does not seem meaningful, and I could not find it documented in source why this was enforced. Can we not mutate the pointed value inside a provided interface regardless of type?

@mvdan
Copy link
Member

mvdan commented Apr 12, 2021

I wonder if this is a duplicate of #26946, since it also involved pointers to interface types which contain an underyling type to decode into. cc @dsnet

@colin-sitehost

This comment has been minimized.

@mvdan
Copy link
Member

mvdan commented Apr 12, 2021

Here's a tweaked program: https://play.golang.org/p/w6_c2eoUF7B

Note that, in the "it works" scenario, you end up decoding into interface{}, so you get a new value of default type float64 for a number.

That fails in the case where the interface has one method, because a basic type cannot satisfy a non-empty interface, hence the error.

What were you expecting should happen, in terms of the final types?

@mvdan
Copy link
Member

mvdan commented Apr 12, 2021

Also, if you were expecting to decode into the inner wrap type without a pointer, that can't work as it isn't addressable and thus modifiable. To allow that, the inner type needs to be a *wrap, just like in the other two cases which do work consistently.

So I think that this is working as intended, as far as I can tell. If you find anything confusing, perhaps we need better docs or error messages.

@dsnet
Copy link
Member

dsnet commented Apr 12, 2021

I consider the inability to unmarshal into an unaddressable value (which can occur with maps and interfaces) a flaw in the implementation. Unfortunately, it's hard to change this behavior as people have likely come to depend on the current behavior. 😞

@colin-sitehost
Copy link
Author

colin-sitehost commented Apr 12, 2021

So, to give a little more context, because I tried to boil down the example to just what was failing, but I messed that up: I am trying to deserialise an externally tagged enum:

type Exported struct{
     Shared string
     Enum Enum
}

type Enum interface{
        private()
}

type One struct{ /*...*/ }

func (One) private() {}

type Two struct{ /*...*/ }

func (Two) private() {}

type Three struct{ /*...*/ }

func (Three) private() {}

And I wanted to avoid having to do this:

func (e *Exported) UnmarshalJSON(b []byte) error {
        var wire struct{
               Shared string
               Type string
                Enum json.RawMessage
        }
        json.Unmarshal(b, &wire)
        switch wire.Type {
        case "one":
                var v One
                json.Unmarshal(wire.Enum, &v)
                e.Enum = v
        case "two":
                var v Two
                json.Unmarshal(wire.Enum, &v)
                e.Enum = v
        case "three":
                var v Three
                json.Unmarshal(wire.Enum, &v)
                e.Enum = v
        }
        return nil
}

Since all three cases do exactly the same thing, I wanted to do this:

        switch wire.Type {
        case "one":
                e.Enum = One{}
        case "two":
                e.Enum = Two{}
        case "three":
                e.Enum = Two{}
        }
        json.Unmarshal(wire.Enum, &e.Enum)

And it seemed intuitive that I should be able to address the concrete type inside Exported.Enum, but maybe that is not the case?

@mvdan
Copy link
Member

mvdan commented Apr 12, 2021

I think your best there is to use pointers. Interfaces always store pointers anyway, so it's not like storing a One is any more efficient than storing a *One.

@colin-sitehost
Copy link
Author

Is there a reason why I should not be allowed to use a value here? The limitation seems arbitrary.

I am also not sure what your perf argument is, but I would expect two pointers would be worse than one if that held water. Instead, I strongly prefer values in cases like this to enforce imutability.

@mvdan
Copy link
Member

mvdan commented Apr 12, 2021

Is there a reason why I should not be allowed to use a value here? The limitation seems arbitrary.

We likely can't change that behavior without breaking existing users, like @dsnet said earlier.

I would expect two pointers would be worse than one if that held water

Not sure what you mean. In your type One struct{ /*...*/ } example above, placing *One inside the interface is just one pointer/allocation. Placing a One is also one pointer/allocation, because interfaces can only contain pointers.

Instead, I strongly prefer values in cases like this to enforce imutability.

I tend to agree, but you're limited by what is possible between interfaces and the well established encoding/json behavior, I'm afraid.

@colin-sitehost
Copy link
Author

colin-sitehost commented Apr 12, 2021

We likely can't change that behavior without breaking existing users, like @dsnet said earlier.

Can you clarify or point to what would be a breaking change? I do not follow how adding the ability to do something that previously did not work is an issue.

I tend to agree, but you're limited by what is possible between interfaces and the well established encoding/json behavior, I'm afraid.

Is this documented anywhere? Calling this "well established" certainly caught me off guard.

@dsnet
Copy link
Member

dsnet commented Apr 12, 2021

If you haven't heard of it before, you might want to give Hyrum's Law a read.

The json package is I believe within the top 15 most used packages, and any change that results in an observable difference to how JSON is serialized or deserialized may break some user's use-case even if the new behavior is arguably better. A number of fixes we made in previous releases had to be rolled back because it was too disruptive.

@mknyszek mknyszek added this to the Backlog milestone Apr 12, 2021
@mknyszek mknyszek added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Apr 12, 2021
@mknyszek
Copy link
Contributor

Based on my reading of the issue, it doesn't look like this is something we can fix without the potential for widespread breakage. I'm inclined to close the issue, unless we can show that not many people depend on the behavior that might want to be changed.

@OneOfOne
Copy link
Contributor

@mknyszek can't it be fixed and be put behind a knob in the decoder?

@mknyszek
Copy link
Contributor

mknyszek commented Apr 12, 2021

I defer that decision to the owners of the package (of which @dsnet and @mvdan are two). In this context of this issue I am but a humble gardener that noticed this issue was not yet triaged. I don't know what the right answer is here.

@dsnet
Copy link
Member

dsnet commented Apr 12, 2021

There's quite a bit a functionality that we would like to change with json (as evidenced by the many issues opened against it). Theoretically, we can add a knob for every one of those changes, but that would lead to the json package being incomprehensible. We could also judiciously choose to only add knobs for the most pressing issues (for which this is probably not high on the list).

The best course of action is to systemically look at the entirety of the json package, and ask ourselves whether the summation of all the changes justify a v2 json. Maybe it does, maybe it doesn't.

@OneOfOne
Copy link
Contributor

@dsnet wouldn't that actually work fine though? adding a v2 package.

@dsnet
Copy link
Member

dsnet commented Apr 12, 2021

wouldn't that actually work fine though? adding a v2 package.

I'm not sure I understand the question. Of course a v2 package could fix this issue without worrying about backwards compatibility. That said, a v2 package is beyond the scope of this issue.

@colin-sitehost
Copy link
Author

I will say with generics so close, it may be better to try and poc an optimised implementation that does not rely upon runtime reflection, as it is so slow.

@dsnet
Copy link
Member

dsnet commented Apr 12, 2021

Generics may provide type safety in some situations, but is not a magic feature that's going to speed up JSON serialization. In order to make serialization of arbitrary Go data structures faster, you would need to be able to change the serialization logic at compile-time based on arbitrary Go types†. Generics does not provide that. The main ways to speed up JSON serialization‡ is either through the use of unsafe, which won't happen, or by code generating the (Un)MarshalJSON methods. However, tools for doing that exist outside the standard library.

† The generics proposal does allow type switching over specific types to provide specialized implementations. However, that doesn't help JSON since the set of possible types to switch over is not finite.
‡ I'm assuming that Go reflection is the bottleneck.

@colin-sitehost
Copy link
Author

In that case, I would request we start a v2 tracking issue to document what we would like to change and confirm when we hit a critical mass. (But I agree that is out of scope for this issue.)

@mvdan
Copy link
Member

mvdan commented May 5, 2021

@colin-sitehost that's been happening since last year, see https://twitter.com/mvdan_/status/1280133100673159171.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

5 participants