Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/gob: document or improve handling of untrusted data #20221

Open
josharian opened this issue May 3, 2017 · 4 comments
Open

encoding/gob: document or improve handling of untrusted data #20221

josharian opened this issue May 3, 2017 · 4 comments
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@josharian
Copy link
Contributor

josharian commented May 3, 2017

It is pretty easy to construct small malicious encoding/gob packets with large impacts, e.g. by using a large map hint (CL 40113).

I don't see:

  • anything in the encoding/gob docs that discusses safe handling of untrusted data
  • anything in the encoding/gob API that would support checking whether untrusted data is safe to decode
  • anything in the implementation that ensures e.g. that there is a relationship between the amount of data and the memory used; if there were limits on amplification, then a io.LimitedReader could be used

Although maybe I missed something.

It seems like we should add some or all of the above.

cc @bradfitz @gruszczy @randall77 @robpike

@josharian josharian added this to the Go1.9 milestone May 3, 2017
@gruszczy
Copy link
Contributor

  1. Untrusted data: Isn't it a generally good advice, that should be included regardless of the map behavior? Or was gob encoding guaranteed to behave well against untrusted data before? Python is pretty clear in pickle:

"Warning The pickle module is not secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source."

  1. Implementation:
    a) Can we check when decoding the map that the hint is capped (this was contested before in https://go-review.googlesource.com/c/40113/) just like slices are?

    // Take care with overflow in this calculation.
    if n < 0 || uint64(n) != u || nBytes > tooBig || (size > 0 && nBytes/size != u) {
    // We don't check n against buffer length here because if it's a slice
    // of interfaces, there will be buffer reloads.
    errorf("%s slice too big: %d elements of %d bytes", typ.Elem(), u, size)
    }

b) Independent of capping the hint, can we compare the size of the message to the proposed hint and cap it based on that? Something along the lines (in decodeMap):

func (dec *Decoder) decodeMap(mtyp reflect.Type, state *decoderState, value reflect.Value, keyOp, elemOp decOp, ovfl error) {
n := int(state.decodeUint())
# Ensure that the hint is not maliciously set to be very big on a small message.
msgSize := len(state.b.data)
if msgSize / (mtyp.Key().Size() + mtyp.Elem().Size()) < n {
n = msgSize / (mtyp.Key().Size() + mtyp.Elem().Size())
}

If someone sends multiple maps with large hints and not content, the first map creation should consume remaining (and potentially panic not being able to decode something).

  1. Encoding/gob documentation: After 2) can we guarantee that the decoded message will have roughly the same size as encoded message, so the user of the API can protect themselves against decoding maliciously large messages?

@josharian
Copy link
Contributor Author

Untrusted data: Isn't it a generally good advice, that should be included regardless of the map behavior?

Yes. It should have been documented long ago, like pickle. I think this probably applies to arrays and slices as well, although I haven't checked.

can we compare the size of the message to the proposed hint and cap it based on that?

Seems like a reasonable approach to me, but the devil is in the details. (Does [1000]int{999: 1} get encoded as a key/value pair? Should we reject it, where we previously accepted it? Is is just a multiplication factor, or is it an absolute size + multiplication factor? How does that work with piecemeal encoding of[1000][1000]int? Etc.)

After 2) can we guarantee that the decoded message will have roughly the same size as encoded message, so the user of the API can protect themselves against decoding maliciously large messages?

Probably. Again, it depends on the exact details. It would be nice to provide a guarantee of the form "decoding a gob of length n will allocate at most ~100*n bytes", if that is possible, but I suspect it may not be.

Failing that, documentation.

@bradfitz bradfitz modified the milestones: Go1.9Maybe, Go1.9 Jun 9, 2017
@bradfitz bradfitz added the NeedsFix The path to resolution is known, but the work has not been done. label Jun 28, 2017
@gopherbot
Copy link

CL https://golang.org/cl/47094 mentions this issue.

@bradfitz
Copy link
Contributor

Documented for Go 1.9. Will leave open to decide whether there's more to do.

gopherbot pushed a commit that referenced this issue Jun 29, 2017
And some double space after period cleanup while I'm here.
I guess my previous regexps missed these. My next cleaner should
probably use go/ast instead of perl.

Updates #20221

Change-Id: Idb051e7ac3a7fb1fb86e015f709e32139d065d92
Reviewed-on: https://go-review.googlesource.com/47094
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
@bradfitz bradfitz added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Jun 29, 2017
@gopherbot gopherbot removed the NeedsFix The path to resolution is known, but the work has not been done. label Jun 29, 2017
@bradfitz bradfitz modified the milestones: Go1.10, Unplanned Nov 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

4 participants