Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: go tool can be tricked into linking a set of packages that do not typecheck as a whole #13119

Closed
momchil-velikov opened this issue Nov 1, 2015 · 7 comments

Comments

@momchil-velikov
Copy link
Contributor

Normally, when compiling a package, it is typechecked together with its imports, their imports , etc, transitively. It is possible to successfully compile a package, but later link it with a version of a package's import, that it has not been typechecked against.

When building a package, the go tool generates a buildID as a SHA1 sum of, basically, the list of package source file names and the buildIDs of the imported packages. The tool compares the expected/computed buildID with the actual buildID, recorded in the package binary, and decides if the package needs to be rebuilt.

Two different packages can have the same set of source file names and the same set of imported packages, consequently the same buildID. This is not entirely unlikely, for example, these might be two (binary incompatible) versions of the same package.

Consider two such packages A1 and A2, with identical buildID. A third package,C, imports A and is successfully typechecked and compiled against A2 as the actual package found for the import path A. If, later, package C is linked with A1, go tool may not notice that C needs to be rebuilt, because the computed buildID of C, which uses the buildID of A1, is identical to the recorded buildID of C, which uses the buildID of A2.

The steps to reproduce this problem are described in https://github.com/momchil-velikov/go-abuse
(download link https://github.com/momchil-velikov/go-abuse/archive/master.zip)
There are two examples, one of accessing a float32 as uint32 and one of writing to arbitrary memory location, obviously, without using the unsafe package.

It was tested with go-1.4.2, go-1.5-1 and master on GNU/Linux and MacOS X.

@momchil-velikov
Copy link
Contributor Author

@gopherbot
Copy link

CL https://golang.org/cl/16475 mentions this issue.

@ianlancetaylor
Copy link
Contributor

Your example seems to rely on manipulating the GOPATH so that one package is seen in one case and a different package with the same apparent name is seen in a different case. It's not obvious to me that it's important for the go tool to try to defend against people who do this kind of thing. You could do exactly the same thing by simply replacing the .a file. Your proposed CL makes that a bit harder, because now I can't just replace the .a file using cp, I have to actually edit the .a file to change the build note value.

Can you explain why this is worth changing?

@ianlancetaylor ianlancetaylor added this to the Unplanned milestone Nov 1, 2015
@momchil-velikov
Copy link
Contributor Author

I consider this issue worth fixing, because it ensures the program as a whole is well-typed, that whatever type-safety assertions the compiler made, while compiling an individual package, still hold at the time of linking.
That said, indeed, it is not possible to prevent users from making a conscious effort to subvert the Go type system, this change makes is harder for such accidental errors to go undetected.

@minux
Copy link
Member

minux commented Nov 2, 2015 via email

@momchil-velikov
Copy link
Contributor Author

I don't think this has anything to do with type safety of Go.

The ultimate goal of type-safety is to produce an executable, free of type errors. A type-checked
object or a library are of little value, as they aren't executed. Thus, achieving type safety is a task for the whole toolchain and runtime, not only for the compiler. I don't claim that this is an issue of the Go language, but with the "go" toolchain.

You're manipulating the objects behind the compiler's back.

I'm sorry, I'm not manipulating objects. I'm manipulating GOPATH, which is supposed to be
manipulated by users.

Actually you don't even need to have two GOPATH to do this.

OK, multiple different ways, by which one can achieve the same bad result just increase the severity of the problem.

The proposed change also introduces a major problem:
two different builds of the same package will produce
different buildID, so we lost the ability produce identical
binaries given identical inputs: the toolchain is no longer
deterministic.

"toolchain is no longer deterministic" is quite a statement, the output consist of eight concrete non-deterministic bytes, the rest is deterministic.

How is that property used at the moment? Whatever procedure compares the outputs of two builds,
isn't it possible to modify it, so it can determine if the difference is only is in specific eight bytes
in the whole output and reach to the same conclusions as before?

To really fix the problem without breaking the deterministic
constraint is very hard, we basically have to include a hash
of the source code into the buildID (just hashing the exported
interface is not enough, because it's possible to only change
the internal implementation of some non-exported functions
and achieve the same arbitrary memory read).

Yes, but hashing the entire source was already considered unacceptable tradeoff wrt. build speed,
which was one of things that I took into account when thinking of the various ways to fix this issue.

@ianlancetaylor
Copy link
Contributor

Every language that supports separate compilation is vulnerable to the issue you describe. I don't think this is serious or worth fixing. Sorry.

Simple deterministic output is a highly desirable property.

@golang golang locked and limited conversation to collaborators Nov 4, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants