encoding/gob: oom in Decode #53369

Closed
catenacyber opened this issue Jun 14, 2022 · 8 comments
Labels
FrozenDueToAge, NeedsDecision (feedback is required from experts, contributors, and/or the community before a change can be made)

Comments

@catenacyber
Contributor

What version of Go are you using (go version)?

$ go version
go version go1.17.6 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/catena/Library/Caches/go-build"
GOENV="/Users/catena/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/catena/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/catena/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GOVCS=""
GOVERSION="go1.17.6"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/catena/go/src/github.com/catenacyber/go/src/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/pp/dc1dtf9x2js3v0jx_m010nqr0000gn/T/go-build4237848497=/tmp/go-build -gno-record-gcc-switches -fno-common"
GOROOT/bin/go version: go version go1.17.6 darwin/amd64
GOROOT/bin/go tool compile -V: compile version go1.17.6
uname -v: Darwin Kernel Version 21.3.0: Wed Jan  5 21:37:58 PST 2022; root:xnu-8019.80.24~20/RELEASE_X86_64
ProductName:	macOS
ProductVersion:	12.2.1
BuildVersion:	21D62
lldb --version: lldb-1316.0.9.41
Apple Swift version 5.6 (swiftlang-5.6.0.323.62 clang-1316.0.20.8)
gdb --version: GNU gdb (GDB) 9.1

What did you do?

Run https://go.dev/play/p/_e8MEYSVB8k?v=gotip

What did you expect to see?

The program finishing and printing some Hello output, without having allocated too much space

What did you see instead?

Nothing

Running heap profiling, I see that almost 4 GB was allocated from

encoding/gob.(*Decoder).readMessage
/usr/local/go/src/encoding/gob/decoder.go

  Total:      3.95GB     3.95GB (flat, cum) 99.91%
     96            .          .           	if dec.buf.Len() != 0 { 
     97            .          .           		// The buffer should always be empty now. 
     98            .          .           		panic("non-empty decoder buffer") 
     99            .          .           	} 
    100            .          .           	// Read the data 
    101       3.95GB     3.95GB           	dec.buf.Size(nbytes) 
    102            .          .           	_, dec.err = io.ReadFull(dec.r, dec.buf.Bytes()) 
    103            .          .           	if dec.err == io.EOF { 
    104            .          .           		dec.err = io.ErrUnexpectedEOF 
    105            .          .           	} 
    106            .          .           } 

I guess saferio could be used to fix this, as is done for other packages

Found by https://github.com/catenacyber/ngolo-fuzzing on oss-fuzz
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=48028
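
For illustration, a tiny hand-crafted stream along these lines can trigger the same allocation; the byte values below are an assumption based on gob's unsigned-integer wire encoding, not the actual fuzzer or playground input.

package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

func main() {
	// Hypothetical trigger: gob encodes the message length as an unsigned
	// integer; values >= 128 are sent as a negated byte count (here -4,
	// i.e. 0xfc) followed by the value in big-endian order. These five
	// bytes therefore claim a message of 0xffffffff bytes (~4 GiB).
	payload := []byte{0xfc, 0xff, 0xff, 0xff, 0xff}

	var s string
	err := gob.NewDecoder(bytes.NewReader(payload)).Decode(&s)
	fmt.Println("Hello", err) // Decode fails, but only after the buffer was sized
}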

@mvdan
Member

mvdan commented Jun 14, 2022

This is sort of expected; the current limit is

// tooBig provides a sanity check for sizes; used in several places. Upper limit
// of is 1GB on 32-bit systems, 8GB on 64-bit, allowing room to grow a little
// without overflow.
const tooBig = (1 << 30) << (^uint(0) >> 62)

It used to be 1GiB on 64-bit systems, but got raised to 8GiB per #27635. It is a hard-coded limit, so no single value will really be both high enough and low enough for everyone.

One potential middle ground to satisfy everyone could be to not allocate the entire buffer upfront if its size is huge - say, above 128MiB. In those "huge buffer" scenarios, we would let the buffer grow as the contents get read in, starting with a reasonably small buffer. This way we would still support content that is up to 8GiB in size, but we wouldn't allocate up to 8GiB with a <1KiB input payload, which is an extremely cheap OOM/crash attack. One could still make a Go program run out of memory with this strategy, but it would likely take gigabytes of input to actually do so, making the attack much less interesting in the face of network speeds, rate limits, etc.

That strategy would make real large inputs somewhat slower, as we would repeatedly grow and copy the buffer, but I think that's probably an OK tradeoff if the growth is fast enough, akin to append.
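
A rough sketch of that growth strategy, just to make the idea concrete (the helper name and the 1 MiB chunk size are placeholders, not a proposed API):

package gobsketch

import "io"

// readAtMost reads up to `declared` bytes from r without allocating the whole
// buffer upfront: memory grows with the data actually received, so a tiny
// input that lies about its size cannot force a multi-gigabyte allocation.
func readAtMost(r io.Reader, declared int) ([]byte, error) {
	const chunk = 1 << 20 // 1 MiB per read
	buf := make([]byte, 0, chunk)
	for len(buf) < declared {
		n := declared - len(buf)
		if n > chunk {
			n = chunk
		}
		start := len(buf)
		buf = append(buf, make([]byte, n)...)
		if _, err := io.ReadFull(r, buf[start:]); err != nil {
			return nil, err // short input fails here, long before `declared` bytes are allocated
		}
	}
	return buf, nil
}

With this shape, the worst a lying stream can waste is roughly one chunk of allocation rather than the full declared size.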

cc @robpike @bf @rsc and potentially @golang/security for visibility

@catenacyber
Contributor Author

One potential middle ground to satisfy everyone could be to not allocate the entire buffer upfront if its size is huge

Indeed, that is what is being done with saferio; see https://go-review.googlesource.com/c/go/+/412014/ for instance

@mvdan
Member

mvdan commented Jun 14, 2022

Ah, so my idea above isn't all that creative :) I think doing the same in gob is very reasonable.

@robpike
Contributor

robpike commented Jun 14, 2022

Not sure how allocating lazily helps reduce memory load. The decoder has the size of the data item. Assuming the size is correct, the right thing to do is to allocate the buffer and just do the read. It's simple and fast. Not only that, allocating piecemeal will result in more memory being allocated, not less.

Encoding/gob may not be the most robust system; it expects the data to be honest. Protecting against lies in the stream can only get you so far.

Fuzzing bugs are often like this, throwing random noise at things to see if they can break. For a data stream decoder capable of reading massive data items, it's easy to trigger an OOM and not at all clear how to prevent one. The feeble attempt involving that constant is the best we've come up with.

@robpike
Contributor

robpike commented Jun 14, 2022

Addendum: Gob is best used to stream data between related, known systems. It is not the best choice when accepting data from untrusted sources.

@catenacyber
Contributor Author

Not sure how allocating lazily helps reduce memory load.

The fuzzer/attacker needs to produce 1 GB of data (instead of a few bytes) to get gob to allocate 1 GB.

I am ok if this is the expected behavior, as gob should not be fed data from untrusted sources (as for index/suffixarray)

Another fix could be to let the user specify the tooBig value...

@mvdan
Member

mvdan commented Jun 15, 2022

allocating piecemeal will result in more memory being allocated, not less.

As far as OOMs go, I think we're worried about peak memory usage. For the case of a truly large input, growing the buffer in small steps is presumably not significantly worse than allocating the entire buffer at once, as we can rely on the GC to get rid of the older buffers. And if the user really wants to support reading huge inputs, I imagine they want to ensure they have plenty of memory available anyway - or otherwise, redesign their code to split the gob into smaller messages.

I also don't think that this will affect the vast majority of gob users, as they are hopefully not encoding gigabytes of data as a single message. saferio allows starting with a 10MiB buffer, for example, but we could allow gob to start with a larger buffer if we think that would be more appropriate.

it's easy to trigger an OOM

I generally disagree with your idea of "easy" here :) If I wanted to crash a website with this attack as it stands now, I would only have to send less than 1KiB of data. With the proposed fix, I would have to send many gigabytes - on my ~7 Mb/s upload connection, sending 8GiB would take upwards of two hours, assuming it doesn't fail. I could possibly rent a server somewhere with a faster connection, but they will usually charge you for using large amounts of bandwidth.
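
(For scale: 8 GiB is about 68.7 gigabits, so at ~7 Mb/s that is roughly 9,800 seconds of sustained uploading, i.e. around 2.7 hours.)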

And all of this assumes I would succeed at uploading 8GiB to a server without that failing somehow. If we talk about HTTP endpoints, for example, I've found it's pretty common to have proxies which limit the size of requests or rate limit the amount of data that a single client can send in a short period of time, either of which would make the proposed fix pretty effective.

@seankhliao added the NeedsDecision label (feedback is required from experts, contributors, and/or the community before a change can be made) on Jun 15, 2022
@gopherbot

Change https://go.dev/cl/413979 mentions this issue: encoding/gob: use saferio to read large buffer
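
The CL itself is not reproduced here, but the general shape of such a change in readMessage would be to replace the upfront Size/ReadFull pair shown above with a chunked read, roughly as follows. internal/saferio is only available inside the standard library, and SetBytes stands in for whatever mechanism hands the bytes to the decoder's buffer.

	// Sketch of the shape the fix could take inside readMessage; not the actual CL.
	var buf []byte
	buf, dec.err = saferio.ReadData(dec.r, uint64(nbytes)) // allocates in bounded chunks
	if dec.err == io.EOF {
		dec.err = io.ErrUnexpectedEOF
	}
	dec.buf.SetBytes(buf) // assumed helper that adopts buf as the decoder's buffer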
