encoding/gob: oom in Decode #53369

Closed
catenacyber opened this issue Jun 14, 2022 · 8 comments
Labels
FrozenDueToAge, NeedsDecision (feedback is required from experts, contributors, and/or the community before a change can be made)

Comments

@catenacyber
Contributor

What version of Go are you using (go version)?

$ go version
go version go1.17.6 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/catena/Library/Caches/go-build"
GOENV="/Users/catena/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/catena/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/catena/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GOVCS=""
GOVERSION="go1.17.6"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/catena/go/src/github.com/catenacyber/go/src/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/pp/dc1dtf9x2js3v0jx_m010nqr0000gn/T/go-build4237848497=/tmp/go-build -gno-record-gcc-switches -fno-common"
GOROOT/bin/go version: go version go1.17.6 darwin/amd64
GOROOT/bin/go tool compile -V: compile version go1.17.6
uname -v: Darwin Kernel Version 21.3.0: Wed Jan  5 21:37:58 PST 2022; root:xnu-8019.80.24~20/RELEASE_X86_64
ProductName:	macOS
ProductVersion:	12.2.1
BuildVersion:	21D62
lldb --version: lldb-1316.0.9.41
Apple Swift version 5.6 (swiftlang-5.6.0.323.62 clang-1316.0.20.8)
gdb --version: GNU gdb (GDB) 9.1

What did you do?

Run https://go.dev/play/p/_e8MEYSVB8k?v=gotip

What did you expect to see?

The program finishing and printing some Hello output, without having allocated too much space

What did you see instead?

Nothing

Running heap profiling, I see that almost 4 GB was allocated from

encoding/gob.(*Decoder).readMessage
/usr/local/go/src/encoding/gob/decoder.go

  Total:      3.95GB     3.95GB (flat, cum) 99.91%
     96            .          .           	if dec.buf.Len() != 0 { 
     97            .          .           		// The buffer should always be empty now. 
     98            .          .           		panic("non-empty decoder buffer") 
     99            .          .           	} 
    100            .          .           	// Read the data 
    101       3.95GB     3.95GB           	dec.buf.Size(nbytes) 
    102            .          .           	_, dec.err = io.ReadFull(dec.r, dec.buf.Bytes()) 
    103            .          .           	if dec.err == io.EOF { 
    104            .          .           		dec.err = io.ErrUnexpectedEOF 
    105            .          .           	} 
    106            .          .           } 

I guess saferio could be used to fix this, as is done for other packages

Found by https://github.com/catenacyber/ngolo-fuzzing on oss-fuzz
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=48028
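
For illustration, a tiny hand-crafted stream along these lines can trigger the same allocation; the byte values below are an assumption based on gob's unsigned-integer wire encoding, not the actual fuzzer or playground input.

package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

func main() {
	// Hypothetical trigger: gob encodes the message length as an unsigned
	// integer; values >= 128 are sent as a negated byte count (here -4,
	// i.e. 0xfc) followed by the value in big-endian order. These five
	// bytes therefore claim a message of 0xffffffff bytes (~4 GiB).
	payload := []byte{0xfc, 0xff, 0xff, 0xff, 0xff}

	var s string
	err := gob.NewDecoder(bytes.NewReader(payload)).Decode(&s)
	fmt.Println("Hello", err) // Decode fails, but only after the buffer was sized
}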

@mvdan
Member

mvdan commented Jun 14, 2022

This is sort of expected; the current limit is

// tooBig provides a sanity check for sizes; used in several places. Upper limit
// of is 1GB on 32-bit systems, 8GB on 64-bit, allowing room to grow a little
// without overflow.
const tooBig = (1 << 30) << (^uint(0) >> 62)

It used to be 1GiB on 64-bit systems, but got raised to 8GiB per #27635. It is a hard-coded limit, so no single value will really be both high enough and low enough for everyone.

One potential middle ground to satisfy everyone could be to not allocate the entire buffer upfront if its size is huge - say, above 128MiB. In those "huge buffer" scenarios, we would let the buffer grow as the contents get read in, starting with a reasonably small buffer. This way we would still support content that is up to 8GiB in size, but we wouldn't allocate up to 8GiB with a <1KiB input payload, which is an extremely cheap OOM/crash attack. One could still make a Go program run out of memory with this strategy, but it would likely take gigabytes of input to actually do so, making the attack much less interesting in the face of network speeds, rate limits, etc.

That strategy would make real large inputs somewhat slower, as we would repeatedly grow and copy the buffer, but I think that's probably an OK tradeoff if the growth is fast enough, akin to append.
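
A rough sketch of that growth strategy, just to make the idea concrete (the helper name and the 1 MiB chunk size are placeholders, not a proposed API):

package gobsketch

import "io"

// readAtMost reads up to `declared` bytes from r without allocating the whole
// buffer upfront: memory grows with the data actually received, so a tiny
// input that lies about its size cannot force a multi-gigabyte allocation.
func readAtMost(r io.Reader, declared int) ([]byte, error) {
	const chunk = 1 << 20 // 1 MiB per read
	buf := make([]byte, 0, chunk)
	for len(buf) < declared {
		n := declared - len(buf)
		if n > chunk {
			n = chunk
		}
		start := len(buf)
		buf = append(buf, make([]byte, n)...)
		if _, err := io.ReadFull(r, buf[start:]); err != nil {
			return nil, err // short input fails here, long before `declared` bytes are allocated
		}
	}
	return buf, nil
}

With this shape, the worst a lying stream can waste is roughly one chunk of allocation rather than the full declared size.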

cc @robpike @bf @rsc and potentially @golang/security for visibility

@catenacyber
Contributor Author

One potential middle ground to satisfy everyone could be to not allocate the entire buffer upfront if its size is huge

Indeed, that is what is being done with saferio; see https://go-review.googlesource.com/c/go/+/412014/ for instance

@mvdan
Member

mvdan commented Jun 14, 2022

Ah, so my idea above isn't all that creative :) I think doing the same in gob is very reasonable.

@robpike
Contributor

robpike commented Jun 14, 2022

Not sure how allocating lazily helps reduce memory load. The decoder has the size of the data item. Assuming the size is correct, the right thing to do is to allocate the buffer and just do the read. It's simple and fast. Not only that, allocating piecemeal will result in more memory being allocated, not less.

Encoding/gob may not be the most robust system; it expects the data to be honest. Protecting against lies in the stream can only get you so far.

Fuzzing bugs are often like this, throwing random noise at things to see if they can break. For a data stream decoder capable of reading massive data items, it's easy to trigger an OOM and not at all clear how to prevent one. The feeble attempt involving that constant is the best we've come up with.

@robpike
Contributor

robpike commented Jun 14, 2022

Addendum: Gob is best used to stream data between related, known systems. It is not the best choice when accepting data from untrusted sources.

@catenacyber
Contributor Author

Not sure how allocating lazily helps reduce memory load.

The fuzzer/attacker needs to produce 1 GB of data (instead of a few bytes) to get gob to allocate 1 GB.

I am ok if this is the expected behavior, as gob should not be fed data from untrusted sources (as for index/suffixarray)

Another fix could be to let the user specify the tooBig value...

@mvdan
Member

mvdan commented Jun 15, 2022

allocating piecemeal will result in more memory being allocated, not less.

As far as OOMs go, I think we're worried about peak memory usage. For the case of a truly large input, growing the buffer in small steps is presumably not significantly worse than allocating the entire buffer at once, as we can rely on the GC to get rid of the older buffers. And if the user really wants to support reading huge inputs, I imagine they want to ensure they have plenty of memory available anyway - or otherwise, redesign their code to split the gob into smaller messages.

I also don't think that this will affect the vast majority of gob users, as they are hopefully not encoding gigabytes of data as a single message. saferio allows starting with a 10MiB buffer, for example, but we could allow gob to start with a larger buffer if we think that would be more appropriate.

it's easy to trigger an OOM

I generally disagree with your idea of "easy" here :) If I wanted to crash a website with this attack as it stands now, I would only have to send less than 1KiB of data. With the proposed fix, I would have to send many gigabytes - on my ~7 Mb/s upload connection, sending 8GiB would take upwards of two hours, assuming it doesn't fail. I could possibly rent a server somewhere with a faster connection, but they will usually charge you for using large amounts of bandwidth.
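
(For scale: 8 GiB is about 68.7 gigabits, so at ~7 Mb/s that is roughly 9,800 seconds of sustained uploading, i.e. around 2.7 hours.)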

And all of this assumes I would succeed at uploading 8GiB to a server without that failing somehow. If we talk about HTTP endpoints, for example, I've found it's pretty common to have proxies which limit the size of requests or rate limit the amount of data that a single client can send in a short period of time, either of which would make the proposed fix pretty effective.

@seankhliao added the NeedsDecision label (feedback is required from experts, contributors, and/or the community before a change can be made) on Jun 15, 2022
@gopherbot

Change https://go.dev/cl/413979 mentions this issue: encoding/gob: use saferio to read large buffer
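
The CL itself is not reproduced here, but the general shape of such a change in readMessage would be to replace the upfront Size/ReadFull pair shown above with a chunked read, roughly as follows. internal/saferio is only available inside the standard library, and SetBytes stands in for whatever mechanism hands the bytes to the decoder's buffer.

	// Sketch of the shape the fix could take inside readMessage; not the actual CL.
	var buf []byte
	buf, dec.err = saferio.ReadData(dec.r, uint64(nbytes)) // allocates in bounded chunks
	if dec.err == io.EOF {
		dec.err = io.ErrUnexpectedEOF
	}
	dec.buf.SetBytes(buf) // assumed helper that adopts buf as the decoder's buffer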
