Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: cmd/compile: make -pack the default behavior and remove the flag #21705

Closed
mdempsky opened this issue Aug 31, 2017 · 12 comments
Closed

Comments

@mdempsky
Copy link
Member

TL;DR: I propose making cmd/compile's -pack flag the default behavior and removing the flag. The goal of this is to simplify the compiler's output file formats.

Background: Logically, cmd/compile produces two outputs when compiling a package: the "compiler object" (mainly package export data used by later compilations that import this package) and the "linker object" (compiled function instructions, global data, and other information used by the linker to create the final executable).

However, traditionally compilers only output a single file output. cmd/compile followed this pattern by embedding the compiler object data directly into the linker object with an ad hoc embedding strategy. Because the compiler object data was entirely textual at the time, the embedding framing was textual too (terminated with the sequence "$$").

Since then, the compiler has evolved in two pertinent ways:

  1. Adding the -pack flag, which allows the compiler to write a single standard UNIX archive file, so that tools can more easily find the compiler/linker output they specifically need.
  2. Changing the compiler object's export data representation from textual to binary.

Because the export data is now binary, the textual framing is somewhat tedious: '$' characters need to be escaped, lest they be misinterpreted as the end-of-compiler-object sequence. This escaping is unnecessary when the compiler object is separated from the linker object (such as when the -pack flag is used), but to keep tools simpler, we currently use escaping unconditionally.

By making -pack the default and only mode of operation, we can more easily and independently evolve the compiler and linker output formats.

cmd/go, gb, and Google's internal build system all use the -pack flag. Bazel's Go build rules did not, but uncovered #21703 because of it.

It's also worth noting that cmd/compile has its -linkobj flag, which allows writing the linker output to a separate file from the compiler output. The issues above don't apply when -linkobj is used, but for simplicity/consistency I propose we still make -pack the default/only mode here. However, I'm open to discussing or reevaluating this later.

@gopherbot gopherbot added this to the Proposal milestone Aug 31, 2017
@ianlancetaylor
Copy link
Contributor

I don't think this needs to be a formal proposal that is reviewed by the proposal committee. I think this can be decided by the compiler developers.

I think the main thing to think about is that people do invoke the compiler directly. I think we should keep the -pack flag, at least for a while, even if it becomes a no-op. And if possible the old workflow of invoking the compiler without -pack and then running go tool pack should continue to work, even if the formats being used are different. See #21703.

@mdempsky
Copy link
Member Author

@ianlancetaylor Perhaps a formal proposal is over the top. I mostly wanted to give users and third-party build systems a chance to comment on any issues with making -pack the default behavior.

I think leaving -pack as a noop for a while is fine.

I'm open to ensuring that "go tool compile a.go && go tool pack c a.a a.o" continues to work, but I don't have any good ideas how to accomplish that. I also worry the effort to make this work will exceed the effort of changing those use cases to just invoke "go tool compile -pack a.go" instead (which already works, and also avoids #21703 in older Go releases).

For what it's worth, "go tool compile a.go && go tool compile b.go" will continue to work fine whether a.go is compiled into a.o or a.a.

@ianlancetaylor
Copy link
Contributor

Can't go tool pack c a.a a.o simply observe that a.o is already an archive, and just copy it to a.a?

@mdempsky
Copy link
Member Author

mdempsky commented Aug 31, 2017

When we run "compile -pack a.go", we don't emit a.o. We only emit a.a, with a file entry named _go_.o.

We could potentially make "pack c X.a Y.o" a noop if 1) X.a already exists and contains a file named _go_.o, and 2) Y.o does not exist?

@ianlancetaylor
Copy link
Contributor

The interesting case is where -pack is not specified. I guess what we need to decide is, even if we always emit an archive, what file name we are going to choose if neither -pack nor -o are used?

@mdempsky
Copy link
Member Author

I see. To be explicit: in proposing that -pack become the default behavior, I meant that "compile a.go" would start emitting a a.a file (containing _go_.o), instead of emitting a a.o file. I.e., the same as "compile -pack a.go" does today.

@mdempsky
Copy link
Member Author

mdempsky commented Aug 31, 2017

It's worth noting that bazelbuild/rules_go, in the general case, actually invokes pack like so:

go tool asm asm1.s
go tool asm asm2.s
...
go tool compile -o pkg.o src1.go src2.go ...
go tool pack c pkg.a pkg.o asm1.o asm2.o ...

So we'd have to more generally recognize that pkg.a already exists and contains pkg.o (as _go_.o), but then still append the assembly object files.

@ianlancetaylor
Copy link
Contributor

Given that sequence of commands, presumably pkg.a would not exist, because the -o option would tell the compiler to generate pkg.o.

@mdempsky
Copy link
Member Author

mdempsky commented Sep 1, 2017

Ohh, I think I understand your earlier suggestion now.

So in the bazelbuild/rules_go scenario, yes, pkg.o would exist because of the -o pkg.o flag, but it would now actually be a UNIX archive. cmd/pack could potentially then recognize that (not unlike how today it recognizes Go object files to synthesize a __.PKGDEF entry), and start by simply copying it to pkg.a as is, like you suggested.

The scenario I had in mind earlier though is when -o is not used. In that situation, I would expect cmd/compile to emit a .a file by default, so there won't be any .o file. I don't know how to gracefully handle this situation transparently.

@rsc
Copy link
Contributor

rsc commented Oct 16, 2017

I really don't want to do this right now. The go command is happy with the arrangement and is the primary consumer. Are there other consumers for which this causes a problem? It seems hardly any work to keep working as is.

Also I'm working in the go command on making .x files supported well anyway.

I'm worried about breaking existing uses in other scripts, and I don't see any real benefit here. If you want to optimize the -pack path further, at the cost of performance on the non-pack path, that's fine with me. But please let's keep both.

@rsc
Copy link
Contributor

rsc commented Oct 23, 2017

Going to mark this proposal-declined. Again it's fine to optimize for the -pack code path, but please keep the .o output working. It should not be too difficult. If, at some later point, redoing the object file format bubbles sufficiently far up the stack, then I could see unifying them so that -pack is a no-op, but I don't see any significant benefit to that churn today.

@gopherbot
Copy link

Change https://golang.org/cl/102236 mentions this issue: cmd/compile: always write pack files

gopherbot pushed a commit that referenced this issue Mar 24, 2018
By always writing out pack files, the object file format can be
simplified somewhat. In particular, the export data format will no
longer require escaping, because the pack file provides appropriate
framing.

This CL does not affect build systems that use -pack, which includes
all major Go build systems (cmd/go, gb, bazel).

Also, existing package import logic already distinguishes pack/object
files based on file contents rather than file extension.

The only exception is cmd/pack, which specially handled object files
created by cmd/compile when used with the 'c' mode. This mode is
extended to now recognize the pack files produced by cmd/compile and
handle them as before.

Passes toolstash-check.

Updates #21705.
Updates #24512.

Change-Id: Idf131013bfebd73a5cde7e087eb19964503a9422
Reviewed-on: https://go-review.googlesource.com/102236
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@golang golang locked and limited conversation to collaborators Mar 22, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants