cmd/go: speed up 'go run' by caching binaries #33468

Open
eliasnaur opened this issue Aug 5, 2019 · 20 comments
Labels
FeatureRequest, NeedsInvestigation, ToolSpeed

@eliasnaur
Contributor

eliasnaur commented Aug 5, 2019

What did you do?

With GO111MODULE=on and the Android SDK and NDK set up, this is the one-liner for creating an Android .apk app file for a Gio program:

$ go run gioui.org/cmd/gogio -target android gioui.org/apps/gophers

What did you expect to see?

go run running nearly as fast as a pre-installed command, that is:

$ go install gioui.org/cmd/gio
$ $GOBIN/gio -target android gioui.org/apps/gophers

What did you see instead?

@mvdan filed an issue that pointed out that while convenient, the command above is slowed down by go run always re-linking gioui.org/cmd/gio.

This issue is about caching the binary from go run so it achieves nearly the same speed for its second and subsequent runs as a pre-installed command.

Some slowdown is expected because go run needs to know whether a cached version is the newest. I expect that delay to be minimal with a proper GOPROXY set.

Gio issue #15 also points out that teaching users to use go run is bad, but I believe there are valid reasons to use go run:

  • It's more convenient. One line instead of (at least) two.
  • The user doesn't need to know about $GOBIN ($GOPATH/bin). And even if they do know about it, go run doesn't pollute a user's $GOBIN if they just wanted to test or demo a command from a README.
  • Experienced Go users can easily translate a go run command to its go install equivalent if they prefer.
  • Binary always up to date. I still regularly change the Gio library such that an updated cmd/gio is required. With go run the latest version is always used.
  • Avoids binary name clashes. It just so happens that gio already exists on my system (I believe it is a Gnome tool).
@mvdan
Member

mvdan commented Aug 5, 2019

Duplicate of #25416? Unfortunately, that issue was closed.

@theclapp
Contributor

theclapp commented Aug 5, 2019

@eliasnaur Be advised that your Gio issue #15 links to #15 here in this repo. Gio issue 15 is, of course, here.

@eliasnaur
Contributor Author

@theclapp oops! Thanks, fixed.

@jayconrod
Contributor

#25416 seems like the same issue. The rationale for closing it was that we'd prefer not to cache linked binaries, since they take up a lot of space, and in the case of go run and go test, the cache hit rate isn't all that high.

It might make sense to cache binaries if the cache eviction policy were more aggressive for binaries in particular. The cache would need to be a lot smarter though.

@theclapp
Contributor

theclapp commented Aug 5, 2019

Yeah, it seems like instead of caching go run, one's readme should use go build && ./the-executable. As near as I can tell from go build -x, go build does nothing if nothing has changed. The binary is right there to run again if you want it, and it is also clearly taking up space if you want to delete it.

In the particular case of Gio's gio command, I think it makes sense to go install it, instead of go running it every time.

@bcmills
Contributor

bcmills commented Aug 6, 2019

The resolution from last time was #25416 (comment) (amplified in #25416 (comment)).

@bcmills
Contributor

bcmills commented Aug 6, 2019

Note that, even with binary caching, go run would still be substantially slower than running the installed binary directly from $GOBIN, since go run would still need to inspect all of the relevant files and directories to see whether the sources have changed.

@bcmills
Contributor

bcmills commented Aug 6, 2019

> And even if they do know about it, go run doesn't pollute a user's $GOBIN if they just wanted to test or demo a command from a README.

Note that one can always set GOBIN=$(mktemp -d) to demo a command from a readme, or use go build -o and pass an explicit binary destination.

@bcmills
Contributor

bcmills commented Aug 6, 2019

> Experienced Go users can easily translate a go run command to its go install equivalent if they prefer.

I think that point also runs in the opposite direction, and more strongly: experienced Go users can easily translate a go install command to a go run command too, and new users are already confused about when go run should or should not work. We should teach new users about go build, go install, and GOBIN as early as we reasonably can, and package install instructions should normalize the use of those rather than go run.

@bcmills
Contributor

bcmills commented Aug 6, 2019

> I still regularly change the Gio library such that an updated cmd/gio is required. With go run the latest version is always used.

When working within a module in module mode, go run should produce a reproducible result, not always upgrade to the latest version. And when working outside of a module, it's not obvious whether go run of a specific package should work at all (see #32027).

(Also note that this point is in direct tension with binary caching: checking for the latest version is an expensive operation. If we assume that go run runs the latest version, then the relative speedup from caching the binary is substantially reduced.)

@dmitshur added the NeedsInvestigation label Aug 6, 2019
@dmitshur added this to the Go1.14 milestone Aug 6, 2019
@eliasnaur
Contributor Author

> Note that, even with binary caching, go run would still be substantially slower than running the installed binary directly from $GOBIN, since go run would still need to inspect all of the relevant files and directories to see whether the sources have changed.

What files and directories? The source files for gioui.org/cmd/gio and its dependencies are only stored in the cache, which is read-only and in a known state, right?

> Note that one can always set GOBIN=$(mktemp -d) to demo a command from a readme, or use go build -o and pass an explicit binary destination.

What can I tell Windows users?

Now that I think about it, perhaps what I like most about "go run" is that it is a simple cross platform way to run Go binaries regardless of environment variables. A "go run" variant for running (cached) binaries from $GOBIN would suffice. I could program my way out of checking version mismatches between the gio tool and the gio packages.

@bcmills
Contributor

bcmills commented Aug 6, 2019

> The source files for gioui.org/cmd/gio and its dependencies are only stored in the cache, which is read-only and in a known state, right?

I don't think we've currently baked any assumptions about the pristineness of the module cache into the build-caching logic. You're right that we could, though.

But we'd still have to at least check the go.mod file to ensure that the module configuration hasn't changed, and that means checking for the go.mod file, which is a not-entirely-trivial directory walk.
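To make the cost of that directory walk concrete, here is a minimal sketch of locating go.mod by checking the starting directory and then each parent in turn. The names findModuleRoot and errNoModule are chosen for this illustration, and the sketch makes no claim to match what cmd/go does internally; it only shows that every invocation pays for a stat call per directory level.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// errNoModule is returned when no go.mod is found in dir or any parent.
var errNoModule = errors.New("go.mod not found")

// findModuleRoot walks from dir up toward the filesystem root and returns
// the first directory that contains a non-directory entry named go.mod.
func findModuleRoot(dir string) (string, error) {
	dir, err := filepath.Abs(dir)
	if err != nil {
		return "", err
	}
	for {
		// One stat call per directory level.
		info, err := os.Stat(filepath.Join(dir, "go.mod"))
		if err == nil && !info.IsDir() {
			return dir, nil
		}
		parent := filepath.Dir(dir)
		if parent == dir {
			// Reached the root without finding go.mod.
			return "", errNoModule
		}
		dir = parent
	}
}

func main() {
	wd, err := os.Getwd()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	root, err := findModuleRoot(wd)
	if err != nil {
		fmt.Println("not inside a module:", err)
		return
	}
	fmt.Println("module root:", root)
}
```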

@bcmills
Contributor

bcmills commented Aug 6, 2019

> What can I tell Windows users?

You could give separate instructions for cmd.exe and for Unix-like shells. Or assume that they have their PATH configured appropriately (perhaps by reference to some other document) and tell everyone:

go install gioui.org/cmd/gio
gio -target android gioui.org/apps/gophers
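For readers who are unsure which directory needs to be on PATH, the rule go install follows can be sketched in a few lines: GOBIN if it is set, otherwise the bin directory under the first GOPATH entry, with GOPATH defaulting to a go directory in the user's home. The helper name installDir is made up for this example, and it reads only environment variables, so it ignores values stored with go env -w; running go env GOBIN GOPATH remains the authoritative answer.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// installDir reports where "go install" is expected to place a binary,
// given an environment lookup function and the user's home directory.
// It is a simplified illustration, not the go command's own logic.
func installDir(getenv func(string) string, home string) string {
	if bin := getenv("GOBIN"); bin != "" {
		return bin
	}
	gopath := getenv("GOPATH")
	if gopath == "" {
		// Default GOPATH is a "go" directory inside the home directory.
		gopath = filepath.Join(home, "go")
	}
	// GOPATH may list several directories; the first one is used.
	first := filepath.SplitList(gopath)[0]
	return filepath.Join(first, "bin")
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(installDir(os.Getenv, home))
}
```

Because the function takes the lookup as a parameter, the same code works on Windows and Unix-like systems and is easy to exercise with a fake environment.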

@eliasnaur
Contributor Author

Inspired by @bcmills's comments, I created #33518 for the problem, not just a possible solution (this issue). Thank you for your patience.

@eliasnaur
Contributor Author

eliasnaur commented Aug 7, 2019

> I still regularly change the Gio library such that an updated cmd/gio is required. With go run the latest version is always used.
>
> When working within a module in module mode, go run should produce a reproducible result, not always upgrade to the latest version. And when working outside of a module, it's not obvious whether go run of a specific package should work at all (see #32027).
>
> (Also note that this point is in direct tension with binary caching: checking for the latest version is an expensive operation. If we assume that go run runs the latest version, then the relative speedup from caching the binary is substantially reduced.)

Ok, so if we drop the requirement of using the latest version, which I agree is a dubious choice anyway, and if we assume that for most actual uses of the gogio tool the user is operating inside a module, can go run be made fast? I think so:

The first go run inside a module records the version of the tool in go.mod. So subsequent go runs of the same tool use the recorded version, which means that go run can immediately use a cached version of the tool binary.
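The idea can be sketched as a cache lookup keyed by the package path and the version pinned in go.mod. This is only an illustration of the proposal, not an existing go command feature: binaryCachePath, the cache directory, and the version string below are all hypothetical, and a real key would also have to cover the toolchain version, build flags, and target platform.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
)

// binaryCachePath returns a hypothetical cache location for the binary built
// from pkg at the module version recorded in go.mod. Because the version is
// pinned, the same inputs always map to the same file name.
func binaryCachePath(cacheDir, pkg, version string) string {
	sum := sha256.Sum256([]byte(pkg + "@" + version))
	return filepath.Join(cacheDir, hex.EncodeToString(sum[:]))
}

func main() {
	// Hypothetical cache directory and version, for illustration only.
	p := binaryCachePath("/tmp/go-run-cache", "gioui.org/cmd/gogio", "v0.0.0-example")
	fmt.Println(p)
}
```

With a scheme like this, a second go run could check whether the file exists and execute it directly, skipping the link step.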

I think this is interesting because as I argue in #33518 (comment), go run seems to be the correct choice for running gogio, not go install.

@myitcv
Member

myitcv commented Aug 19, 2019

FWIW, the points described in this issue and #33518 were the main reasons behind creating https://github.com/myitcv/gobin.

Also linking #30515, specifically this comment: #30515 (comment). The most recent discussion with @ianthehat on a golang-tools call was that "something like gobin" makes sense as a first cut (noting that whatever the solution, it needs to be part of the Go distribution, or else we just move the problem to installing another tool).

@matthewmueller

matthewmueller commented Apr 21, 2020

Hey folks, hopefully this is relevant to the discussion. I've been following the linked issues and discussions trying to find an answer to:

  • What's the recommended way to recompile a Go program during development?

Up until today, I thought that go run main.go was meant for development and go build was meant for production. I didn't realize that go run was meant for sporadic usage.

Could someone share the recommended way to recompile a Go program today? Maybe it's something like this?

go build && ./main

Should I be installing packages?

go build -i && ./main

What does it mean to cache binaries when your source code is changing?

@jayconrod
Contributor

Whether you use go build or go run, each compiled package will be stored in the build cache. So the only difference is whether the final binary is relinked when there are no changes.

go build checks whether the output binary exists and examines metadata stamped into the binary. If the binary is up to date, go build skips linking.

go run writes the binary to a temporary file, then deletes it when it's done. So there's no opportunity to skip linking and reuse it.

The -i flag is mostly obsolete. It copies compiled packages into $GOPATH/pkg. This used to speed up builds before the build cache was introduced in Go 1.10, but now it's only useful for installing packages (not usually needed anymore).

@matthewmueller

Thanks for the overview @jayconrod! One question:

> go build checks whether the output binary exists and examines metadata stamped into the binary. If the binary is up to date, go build skips linking.

Does this mean whenever there is a change to any package within your Go project, then go run and go build have the same performance?

Or is there some sort of up-to-date check with go build on the package-level?

@jayconrod
Contributor

> Does this mean whenever there is a change to any package within your Go project, then go run and go build have the same performance?

Yes, it should be nearly identical. If they're building the same packages in the same configuration, they should have the same hits and misses in the build cache.

> Or is there some sort of up-to-date check with go build on the package-level?

Each package and binary is stamped with a build id, which is used to check whether it's up-to-date. That's used by all build commands, not just go build. buildid.go explains how it works if you're curious.
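As a rough illustration of the stamping idea, the sketch below derives an ID by hashing a configuration string together with the contents of each input file, then compares it with the ID recorded in an existing output. It is a simplified stand-in and not the scheme described in buildid.go; the names contentID and upToDate and the configuration string are assumptions made for this example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// contentID hashes a build configuration together with a set of named
// inputs (file name -> contents). Names are sorted so the result does not
// depend on map iteration order.
func contentID(config string, inputs map[string][]byte) string {
	names := make([]string, 0, len(inputs))
	for name := range inputs {
		names = append(names, name)
	}
	sort.Strings(names)

	h := sha256.New()
	fmt.Fprintf(h, "config %q\n", config)
	for _, name := range names {
		fmt.Fprintf(h, "file %q %x\n", name, sha256.Sum256(inputs[name]))
	}
	return hex.EncodeToString(h.Sum(nil))
}

// upToDate reports whether the ID stamped into an existing output matches
// the ID computed from the current inputs.
func upToDate(stamped, current string) bool {
	return stamped != "" && stamped == current
}

func main() {
	src := map[string][]byte{"main.go": []byte("package main\n\nfunc main() {}\n")}
	stamped := contentID("linux/amd64", src)

	// Unchanged inputs produce the same ID, so the output can be reused.
	fmt.Println("unchanged:", upToDate(stamped, contentID("linux/amd64", src)))

	// Editing a source file changes the ID, so the output must be rebuilt.
	src["main.go"] = []byte("package main\n\nfunc main() { println(1) }\n")
	fmt.Println("after edit:", upToDate(stamped, contentID("linux/amd64", src)))
}
```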

9 participants