Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/link: support option -emit-relocs for the golang linker #49031

Open
hailingluo opened this issue Oct 18, 2021 · 24 comments
Open

cmd/link: support option -emit-relocs for the golang linker #49031

hailingluo opened this issue Oct 18, 2021 · 24 comments
Labels
FeatureRequest NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@hailingluo
Copy link

hailingluo commented Oct 18, 2021

What version of Go are you using (go version)?

$ go version 
go version go1.16.5 linux/arm64

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

What did you do?

I am working for supporting the option -emit-relocs for the golang linker in order to save the relocations.

@ALTree ALTree added this to the Unplanned milestone Oct 18, 2021
@ALTree
Copy link
Member

ALTree commented Oct 18, 2021

Also cc @cherrymui as cmd/link owner, since this looks like a proposal to add a feature.

@ALTree ALTree added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Oct 18, 2021
@yota9
Copy link

yota9 commented Oct 18, 2021

Hello @cherrymui ! This feature is needed to support binary optimizers e.g. llvm-bolt (we where speaking a bit about it #47908). Currently we are using external linker to use bolt optimizer, but it will be nice to have ability to use the golang linker and remove extra dependencies. Also the external linkers might have some bugs, e.g. both ld and gold linkers have some problems for aarch64 PIC binaries linked with emit-relocs flag e.g. https://sourceware.org/bugzilla/show_bug.cgi?id=28234

@hailingluo hailingluo changed the title cmd/link: support option -emit-reloc for the golang linker cmd/link: support option -emit-relocs for the golang linker Oct 18, 2021
@cherrymui
Copy link
Member

@yota9 Thanks for the information. However, instead of asking for individual features like this, I suggest you open a discussion (perhaps like a proposal issue) about binary optimization. Currently we don't support it, and I'm not sure we want to. The Go runtime makes quite a few assumptions about the binary, and it is important that we have full control of the toolchain (compiler, linker, assembler) and they are evolved together with the runtime. Post-editing the binary may break the assumptions and cause problems at run time. Supporting binary post-editing, even if possible, is not simple, clearly not as simple as this issue and #47908.

Instead, maybe you'd like to take a look at gccgo or gollvm. They are Go implementations based on GCC and LLVM toolchains, and are designed for more flexible backends with less assumptions about the code layout.

Until we have a clear and accepted plan about binary optimization or post-editing, I'd say we don't do this. Thanks.

@cherrymui
Copy link
Member

@hailingluo Is there any reason for emitting relocations other than what @yota9 mentioned? Thanks.

@cherrymui cherrymui added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Oct 18, 2021
@yota9
Copy link

yota9 commented Oct 18, 2021

@cherrymui Thank you for your answer! However the optimization already works and works fine :) The #47908 is just a small part to support optimizations on aarch64, for example the x86 doesn't need this so no extra patches are needed.

@cherrymui
Copy link
Member

Maybe we're not taking about the same thing, sorry. I'd like to understand what specific optimizations you do. For example, does it change instructions in a Go function? Does it change the order of Go functions in the text section? Inlining, outlining, block reordering, etc. for Go functions after linking? I wouldn't think any of them would work without a substantial change of the toolchain/runtime. If not the above, what is it? Thanks.

@yota9
Copy link

yota9 commented Oct 18, 2021

@cherrymui Mostrly we are talking about some instructions changes and basic block reorders. The functions order is changed too. I'm aware of internal golang structures, e.g. findfunctab, pcltab, firstmodule & etc, so all of that structures are updated according to the new layout of the binary :)

@cherrymui
Copy link
Member

Thanks for the information. I think we'd need to understand more about how you do that. Do you change those metadata in a post-editing tool? How does it change them, by decoding and re-encoding them after editing?

Note that the metadata are undocumented and are changing from time to time. We add/remove matadata, change the encoding or meaning as it goes. They are not intended to be understood by anything other than the Go toolchain/runtime. And it is important for them to remain opaque and can be changed flexibly. Thanks.

@yota9
Copy link

yota9 commented Oct 18, 2021

@cherrymui

Do you change those metadata in a post-editing tool? How does it change them, by decoding and re-encoding them after editing?

Yes, it is done during bolt work. The first stage decodes it in the original binary, the last stage patches the data and re-encodes it in the final binary.

Note that the metadata are undocumented

Oh yes, I know that, it took a while in GDB :))

and are changing from time to time.

Yes, I'm aware of that.. Currently I'm supporting 3 versions - 1.14.9/1.14.12 (the same from the tool standpoint) and 1.16.5. For example the 1.14.9 and 1.16.5 are different in pclntab, and It is taken into account. BTW I even found bug in 1.16.5 (#47218) during porting to it.

@hailingluo
Copy link
Author

hailingluo commented Oct 20, 2021

@hailingluo Is there any reason for emitting relocations other than what @yota9 mentioned? Thanks.

@cherrymui Hi, for now, it is the main purpose.

@thanm
Copy link
Contributor

thanm commented Oct 27, 2021

@hailingluo I'd like to expand a little on what @cherrymui said about the need to keep meta-data opaque and changing flexibly.

As an example of the sort of situation we want to avoid, consider the package moderngo/reflect2, available on github.

This package uses "unsafe" to reach into the Go runtime to access runtime-private data stuctures, in order to support reflect-like operations that run faster than the official Golang implementations. Authors of this package basically write code that looks at the Go version (1.x), and based on the version, make unsafe accesses to runtime data structures (using types that are essentially reverse-engineered to act as runtime equivalents). Other packages and programs import "reflect2", maybe because they like the speedup, or maybe by accident (via transitive dependency).

As the Go runtime evolves, however, Go team developers decide to make a change to a runtime data structure that happens to be one of the ones that is relied on by reflect2. When this happens, the problem manifests as a failure or crash in the Go team's "tip" testing (that is, the testing we do to make sure the tip of our main branch is OK) -- all of a sudden we see that we can no longer build some third-party application XYZ, or some other popular applicatoin ABC builds but crashes at runtime.

[I want to add that this is not a "theoretical' situation I'm talking about, this is something that has happened in the past more than once, and is, more than likely, going to crop up in the future].

At this point someone from the Go team has to take the time to track down these failures, analyze and debug them, determine what the issue is, then contact maintainer of moderngo/reflect2, who has to sit down and look at the new version of the runtime data structures, craft a patch for supporting the new thing, submit that to "moderngo/reflect2", tag that commit, publish a new version (with docs), and then applications XYZ and ABC have to decide to adopt that new version (which can also take time).

During all of that interval, our "tip" testing is broken for apps XYZ and ABC. In particular, some other developer may check in an unrelated change that breaks application XYZ for some other reason, but we won't find out about it right away, because we're still waiting for the fix from moderngo/reflect2.

The bottom line here is that this is a big drag on the productivity of the Go team. We want our compiler and runtime developers to be free to make changes to our runtime data structures and internal formats (pclntab, symtab, etc) without the danger of having our tip application testing blow up in random ways.

If we move ahead with having tools like BOLT do rewriting of private/internal data formats (e.g. pclntab), we're moving into another situation like moderngo/reflect2. It would mean that when we make a change to pclntab format during regular Go development, all of a sudden our tip testing of some application (you pick) starts to crash, because the app build process uses BOLT, and BOLT doesn't know about the new format.

Sorry for the lengthy response, but I wanted to try to spell out in detail the concerns here from the Go team. Thanks.

@yota9
Copy link

yota9 commented Oct 27, 2021

Hello @thanm ! Thank your for this detailed response, it was interesting to read. I just want to say that of course the BOLT tool and the moderngo package needs to be synchronized in order to work and are deeply depending on the golang version (as I told currently I support only 1.14.9/1.14.12 (the same from the tool standpoint) and 1.16.5 versions, I didn't test other versions and don't expect it to work without changes). I didn't test the package you was talking about (since currently our production seems to be doesn't use them), but of course we might be in situation when you can either use bolt tool, either some of the hacky code. It seems to be OK for me, since bolt tool usage is optional :)

@thanm
Copy link
Contributor

thanm commented Oct 27, 2021

@yota9 just to be clear, I was not trying to compare BOLT directly to modengo/reflect2, or to say that they are equivalent in any way, I was just using as an example of potential risks of having things that depend on private/internal aspects of Go.

Regarding "bolt tool usage is optional" -- not sure how to decode that. Does that mean nobody is ever going to use Bolt for their production builds? If so, why exactly is it that we need BOLT in the first place? :-)

@cherrymui
Copy link
Member

Specific to emitting relocations, it would be good if you could just use the external linker, as it already has such flags. This way we would not need to add and maintain a feature to the Go linker that may be rarely used (at least, not in Go's typical development flow).

If the gold linker has a bug with that, could you use LLD? Or have the bug fixed for gold? Or fix up the incorrect relocations in your code?

Also, for internally linked binary, since you already disassemble the binary, would it be okay to just use the addresses in the instructions with symbol table lookups? What information could you get with emitting relocations but not this way? Thanks.

@yota9
Copy link

yota9 commented Oct 27, 2021

@thanm Of course somebody will use them. And after that they will need to test it carefully :) I just want to say that it seems to be OK to have some restriction like the unsupported packages like above. The people who will use BOLT and will have some problems will have a choice - either use the package like moderngo, either BOLT, either patch the BOLT/moderngo so it will be able to support this exact package :) So I just want to understand what kind of risks and worries we are talking about, because obviously as it is the post-linker optimizer tool we always may broke something, even in simple C binaries. This is not 100% reliable tool and the production binaries must be tested carefully anyway.

@yota9
Copy link

yota9 commented Oct 27, 2021

@cherrymui

If the gold linker has a bug with that, could you use LLD? Or have the bug fixed for gold? Or fix up the incorrect relocations in your code?

Of course we are working in all directions, including this one :)

Also, for internally linked binary, since you already disassemble the binary, would it be okay to just use the addresses in the instructions with symbol table lookups? What information could you get with emitting relocations but not this way?

Here is the example:

   0x0000000000232028 <+40>:	movz	x3, #0x0, lsl #48
   0x000000000023202c <+44>:	movk	x3, #0x0, lsl #32
   0x0000000000232030 <+48>:	movk	x3, #0x21, lsl #16
   0x0000000000232034 <+52>:	movk	x3, #0x808

It is not possible to tell that this is an address written in register without the relocations. It might be just the 64 bit imm number filled in register. That is why we need the relocations to know that this is exactly the address, that is pointing to something :)

@cherrymui
Copy link
Member

Thanks. That example makes sense.

It is good that you're working on using the external linker, which I think is the best solution for this case. Thanks.

@yota9
Copy link

yota9 commented Oct 27, 2021

@cherrymui I think for now we will continue to investigate the golang linker too. But that time I would like to remind about aarch64 mappings symbols, there is no way to overcome this problem, so please don't just reject this patch, it is the most important one :)

@seankhliao seankhliao removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jun 12, 2022
@chao-peng
Copy link

Hello @cherrymui ! This feature is needed to support binary optimizers e.g. llvm-bolt (we where speaking a bit about it #47908). Currently we are using external linker to use bolt optimizer, but it will be nice to have ability to use the golang linker and remove extra dependencies. Also the external linkers might have some bugs, e.g. both ld and gold linkers have some problems for aarch64 PIC binaries linked with emit-relocs flag e.g. https://sourceware.org/bugzilla/show_bug.cgi?id=28234

@yota9 Hi,
May I ask which external linker you are using? Is there any references to this? We are facing the same issue here. Thank you!

@yota9
Copy link

yota9 commented Aug 25, 2022

@chao-peng Hello! AFAIK you can try lld, it might work well. As for us we are using gold now, I've made two patches for the linker. Unfortunately the patch with emit-relocs problem is waiting for it's turn, the first patch is under review now, it fixes veneer insertion for aarch64.

@zamazan4ik
Copy link

@yota9 excuse me for the necroposting :)

Did anything change since the last comment about LLVM BOLT support for Go binaries? Do you have a step-by-step guide about how to use LLVM BOLT with Go binaries? I am interested in applying this binary optimizer for Go programs too :)

@yota9
Copy link

yota9 commented Dec 3, 2023

@zamazan4ik Hi! The process of upstreaming is stopped, I don't have time for this and didn't see enough interest from both go and llvm-bolt community. the change is quite huge and I have no idea how to complete it now :) I was speaking about it with the BOLT creator, he said that there might be some interest now in this field and we might continue this process together one day, but I'm not sure when. The PoC could be found here, but it was not updated for a long time https://reviews.llvm.org/D124347 , so it might contain some unresolved bugs and doesn't support later versions. Internally we're continue to use it successfully in production, currently the latest version supported is 1.18* .

@zamazan4ik
Copy link

Thank you a lot for your efforts! Do you have some performance numbers after applying BOLT to Go binaries that you can share here? I am interested in how much performance did you get with applying BOLT.

currently the latest version supported is 1.18

It's a pity :( Since would be interesting to apply BOLT as an additional optimization step after Go PGO (that was introduced in Go 1.21). Anyway, great work!

@yota9
Copy link

yota9 commented Dec 3, 2023

Thank you a lot for your efforts! Do you have some performance numbers after applying BOLT to Go binaries that you can share here? I am interested in how much performance did you get with applying BOLT.

Sure, my colleague read the presentation for the llvm dev meeting, you can see some benchmarks here: https://youtu.be/AT-ttb6VwRA?si=OwWa0zFadTDwp4OS&t=402 . TLDR up to 19% :)

It's a pity :( Since would be interesting to apply BOLT as an additional optimization step after Go PGO (that was introduced in Go 1.21)

Well, the versions we're supporting are dictated by the version used by production team :) Usually we may expect one major version increase every year, so probably I won't expect 1.21 to be used in the nearest time. Although if it would be open sourced everyone would be able to help to support any version of course. I'm supporting it from 1.14 and usually it is not a very big problem to port it to the newer version. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FeatureRequest NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

8 participants