Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/gob: roundtrip fail for a struct embedding a field implementing Gob{Decoder,Encoder} #22172

Closed
cznic opened this issue Oct 7, 2017 · 8 comments

Comments

@cznic
Copy link
Contributor

cznic commented Oct 7, 2017

What version of Go are you using (go version)?

go version go1.9.1 linux/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

$ go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/jnml"
GORACE=""
GOROOT="/home/jnml/go"
GOTOOLDIR="/home/jnml/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build919481852=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
$ 

What did you do?

$ cat main.go 
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

type T int

func (t *T) GobEncode() ([]byte, error) { return nil, nil } // Alt. eg. 'return []byte{42}, nil' works the same.
func (t *T) GobDecode(b []byte) error   { return nil }

type U struct {
	B bool
	T // Alt. 'X T' makes the failure go away.
}

func main() {
	out := &U{
		B: true,
	}

	f := bytes.NewBuffer(nil)
	enc := gob.NewEncoder(f)
	if err := enc.Encode(out); err != nil {
		panic(err)
	}

	var in U
	dec := gob.NewDecoder(f)
	if err := dec.Decode(&in); err != nil {
		panic(err)
	}

	if in != *out {
		fmt.Printf("got\n%#v\nexp\n%#v\n", in, *out)
	}
}
jnml@4670:~/src/tmp$ go run main.go 

Playground

What did you expect to see?

Nothing

What did you see instead?

got
main.U{B:false, T:0}
exp
main.U{B:true, T:0}
@cznic cznic changed the title gob: roundtrip fail for a struct embedding a field implementing Gob{Decoder,Encoder} encoding/gob: roundtrip fail for a struct embedding a field implementing Gob{Decoder,Encoder} Oct 7, 2017
cznic pushed a commit to cznic/ir that referenced this issue Oct 7, 2017
Workaround for golang/go#22172.

	modified:   ir.go
	modified:   operation.go
	modified:   value.go
@dpinela
Copy link
Contributor

dpinela commented Oct 7, 2017

This isn't a bug; U gets the GobDecode and GobEncode methods from the embedded T, so the encoder and the decoder use these implementations - which don't know anything about U's fields, and so can't encode or decode them - instead of the normal recursive walk. This is also true of similar interfaces for JSON, XML and the like.

@robpike
Copy link
Contributor

robpike commented Oct 8, 2017

As explained, it is working as intended. That's how embedded fields behave.

@robpike robpike closed this as completed Oct 8, 2017
@cznic
Copy link
Contributor Author

cznic commented Oct 8, 2017

@robpike

FTR: I consider the "explanation" mostly irrelevant to the problem. It is clear how embedded fields work, but that just make us know why calling out.Gob{De,En}code works. But the program does not do that call. The call is performed in the stdlib's gob package and I'll try to explain why I think it's not correct to do so.

First of all, it's true that this is working as documented, because

A type that implements GobEncoder and GobDecoder has complete control over the representation of its data and may therefore contain things such as private fields, channels, and functions, which are not usually transmissible in gob streams.

However, I don't think the implementation is correct when seeing that U implements GobEncoder - but only through method promotion of T's method. The problem is that T's GobEncoder implementation has no way to know its receiver is embedded in another struct and access U's fields. This defeats the "complete control" quoted above. T has complete control of encoding U, when embedded in U, but in the same time no access to U's instance.

This can lead to some scenarios, which I consider problematic. One of them is the code in this issue, some random other:

  • Something can work for a long time well in production. A day comes when someone adds an ambedded field to a struct implementing GobEncoder. Information is lost the next time gob encoding is used to store the information. It can easily be noticed only too late.

  • Adding another embedded field of type say W, also implementing GobEncoder to the code in this issue makes the roundtrip work again, masking the problem. A day comes when someone removes one of the embedded fields and again, all except the one embedded field of U are not encoded anymore.

  • Spooky action at distance. The embedded field does not implement GobEncode, but it is itself a struct and it's a type imported from another package. A day comes when any level down the promotion chain someone implements GobEncode on one of the nested, embedded fields and from that point all information from all the higher level embedders cease to be encoded.

These are all risky situations, not only from the data loss point of view, but it can be also a security risk, when certain fields stay unexpectedly at zero values when decoded from a gob encoding vaues with knowingly non zero value fields.

I think the proper handling should consider only types implementing GobEncode by themselves, not by promotion, because, let me repeat myself, in the case of using GobEncode promoted methods the type, in this case U, has actually zero, not 'full' control over how its data are encoded and decoded because it's actually done by the instance of the embedded type which methods got promoted and actually called.

PS: I haven't checked, but I assume reflection can distinguish between promoted methods and methods implemented directly by a type. If that assumption is false then there's of course probably no way to change/fix this.

I don't want to prolongue talking on a closed issue, but if this stays considered not a bug, I may submit a proposal to change gob's behavior, if someone thinks it should be done.

@ianlancetaylor
Copy link
Contributor

Everything you describe is working as expected and documented. Yes, you should be careful about embedding types that implement a GobEncode method. No, we should not add a special case to encoding/gob when handling embedded types. And, as you raise as a possibility, I don't even see a way to do this.

@cznic
Copy link
Contributor Author

cznic commented Oct 8, 2017

@ianlancetaylor

Yes, you should be careful about embedding types that implement a GobEncode method.

I see, thank you. WDYT about filling a documentation issue suggesting adding a reformulation of the above, something like

Do not use a single embedded field implementing GobEncoder in combination with other exported fields which do not implement GobEncoder. Any non-zero values of those will be lost on Encode. Even when an embedded field does not implement GobEncoder, you're at risk that when the type implements GobEncoder any time later, which may not be even under your control, the data-loss situation will/may arise unexpectedly without warning."

? It's not too hard to fall into this trap. The intent is to possibly save others the time I spent debugging when I run into this situation.

And, as you raise as a possibility, I don't even see a way to do this.

By possibility you mean to change the behavior? I then assume that reflection cannot distinguish between "own" and promoted methods. Is that correct? If so, then maybe a separate issue for adding such feature to the reflect package should be filled. Any opinion about this?

@ianlancetaylor
Copy link
Contributor

Personally I think your doc suggestion adds too much text and in the wrong place. This kind of discussion seems more appropriate in a discussion of embedded types, as it is not specific to encoding/gob. For example, it comes up more often in the way that the fmt package handles the String method.

Yes, I meant that as far as I know the reflect package can not distinguish directly implemented methods and promoted methods. I don't think such a feature should be added. In general, the reflect package implements what the language implements, and in the language promoted methods are the same as methods defined directly on the type.

@ianlancetaylor
Copy link
Contributor

I'll add that I'm sorry you got bitten so badly by this. It is a corner of the language that can catch people by surprise. But if better docs are needed, let's make sure to put them in the right place.

@cznic
Copy link
Contributor Author

cznic commented Oct 8, 2017

Gob lesson learned and learning new things is always good, no problem here. Thank you for taking the time to reply.

@golang golang locked and limited conversation to collaborators Oct 8, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants