Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

debug/elf: NewFile can be tricked into reading the entire file into memory if it is corrupted #68454

Open
umanwizard opened this issue Jul 15, 2024 · 10 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@umanwizard
Copy link

Go version

go version go1.22.2 linux/arm64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/home/brennan/.cache/go-build'
GOENV='/home/brennan/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/brennan/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/brennan/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/gnu/store/vcp6w6z0syjwv383s0nywhy0svxk2d05-go-1.22.2/lib/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/gnu/store/vcp6w6z0syjwv383s0nywhy0svxk2d05-go-1.22.2/lib/go/pkg/tool/linux_arm64'
GOVCS=''
GOVERSION='go1.22.2'
GCCGO='gccgo'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build369678066=/tmp/go-build -gno-record-gcc-switches

What did you do?

Take a 64-bit little-endian ELF file at least 1GiB in size, and run the following script to corrupt it (by setting the shstrndx to point to a section of size 1GiB):

package main

import (
	"encoding/binary"
	"fmt"
	"os"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Println("Usage: go run main.go <filename>")
		return
	}

	filename := os.Args[1]
	file, err := os.OpenFile(filename, os.O_RDWR, 0)
	if err != nil {
		fmt.Println("Error opening file:", err)
		return
	}
	defer file.Close()

	// Write "1" to shstrndx
	_, err = file.WriteAt([]byte { 1, 0, 0, 0}, 62)
	if err != nil {
		fmt.Println("Error writing to file:", err)
		return
	}

	// Get offset of section 1
	var data [8]byte
	_, err = file.ReadAt(data[:], 40)
	if err != nil {
		fmt.Println("Error reading from file:", err)
		return
	}
	offset := int64(binary.LittleEndian.Uint64(data[:])) + 64

	// Change sec1 type to 3 (SHT_STRTAB), offset to 0, size to 1GiB
	file.WriteAt([]byte { 3, 0 }, offset + 4)
	file.WriteAt([]byte {0, 0, 0, 0, 0, 0, 0, 0}, offset + 24)
	file.WriteAt([]byte {0, 0, 0, 0x40, 0, 0, 0, 0}, offset + 32)
}

Then try to get the buildinfo, e.g. using https://github.com/kelseyhightower/buildinfo . This uses elf.NewFile under the hood.

What did you see happen?

The program consumes 1GiB of memory before failing.

What did you expect to see?

Memory use should be bounded.

The reason is that debug/elf reads the entire shstrtab into memory in order to get section names. It uses saferio.ReadData to do this.

ReadData's doc comment reads as follows:

// ReadData reads n bytes from the input stream, but avoids allocating
// all n bytes if n is large. This avoids crashing the program by
// allocating all n bytes in cases where n is incorrect.

However, this is misleading: it does not actually prevent allocating more than n bytes (which is 10MB in this case). It just prevents it from allocating that many at a time. It will happily allocate chunks in a loop until it has read up to n or the total file size.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 15, 2024
@ianlancetaylor
Copy link
Member

It would be a bug if there were a way to trick debug/elf into allocating more memory than exists in the file. That is what the saferio package avoids. That kind of thing is bad because an attacker can provide a small file and trick the program into allocating a great deal of memory.

I think it's less interesting when the program doesn't read more than the contents of a file. That means that for an attacker to get the program to allocate a lot of memory, the attacker has to provide a large file. That kind of attack doesn't scale, so it's not all that dangerous.

Of course I'm not opposed to fixing this if we can. But it doesn't seem high priority. In particular I don't think we should reduce the efficiency of debug/elf to fix this kind of problem.

@umanwizard
Copy link
Author

Note that e.g. with truncate you can create a sparse file of arbitrary length without actually occupying disk space.

@ianlancetaylor
Copy link
Member

That is true but that doesn't seem like a particularly likely remote attack. And a local attack can already allocate essentially arbitrary amounts of memory. Perhaps I am missing something.

@umanwizard
Copy link
Author

I think we are on the same page that it's not a critical security issue. I was actually not even thinking about security when I opened the issue, but rather performance/efficiency. It is annoying if malformed files can cause unexpected OOMs or GC pauses, even if it's not a security issue.

Also, I question the premise that it's not exploitable if the amount of memory an attacker can cause to be allocated is limited by the size of the files. Imagine a service that lets customers upload ELF files, e.g. in order to extract debug symbols. It wouldn't be hard to upload a big enough file to cause the service to consume more memory than is available to it.

The easiest ways to solve this would be to either (1) just use normal file read APIs to get data from the shstrtab, rather than slurping the whole thing in, or (2) to mmap it. Both of these would fall short of your requirement to not make debug/elf less efficient: the first inflates the number of syscalls required, and the second will make goroutines block on I/O. Another very simple solution is to just fail if the section is bigger than some value like 10 MiB. It's hard to imagine a legitimate ELF file whose shstrtab is bigger than that ...

@cherrymui cherrymui changed the title debug/elf: NewFile can be tricked into reading the entire file into memory if it is corrupted. debug/elf: NewFile can be tricked into reading the entire file into memory if it is corrupted Jul 16, 2024
@cherrymui cherrymui added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Jul 16, 2024
@cherrymui cherrymui added this to the Unplanned milestone Jul 16, 2024
@ianlancetaylor
Copy link
Member

It seems to me that there are other places in debug/elf where this class of problem can arise, such as the symbol table, the symbol string table, the symbol version sections, the dynamic section, the DWARF sections. What drew your attention particularly to the shstrtab section?

@umanwizard
Copy link
Author

Because that is the only section that is read into memory in full when elf.NewFile is called (unless I misread the code).

@ianlancetaylor
Copy link
Member

I expect that is true, but most users will also call the Symbols method or the Data method of some section or some other method that reads a section. They won't just call NewFile and not do anything else.

@prattmic
Copy link
Member

Regardless of whether this counts as a bug, security issue, etc, it is certainly an annoyance that getting symbols from a binary with a large symbol table requires a lot of memory. It would be nice if there was a way to search for a specific symbol or iterate over symbols without consuming as much memory.

cc @mknyszek @zpavlinovic

@prattmic
Copy link
Member

I took a look at a couple of large (500+ MB) C++ binaries I have laying around (as one does...). They had symtab sections around 40MB, which is a bit less than I expected. I'd say that amount of memory usage is not a huge issue, but could be painful for some applications.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

6 participants