cmd/internal/objabi, cmd/link: direct calls not correctly identified on riscv64 #62465
cc @golang/riscv64 @golang/compiler |
Is this a regression in Go 1.21? Does it build with Go 1.20? Thanks. |
Yes, it is a regression. It builds with Go 1.20. |
I can confirm this issue, bisect shows d49719b is the first bad commit. cc @randall77 |
In triage, we think the linker failure is a red herring and that it's actually an issue with the write barrier on RISC-V. The problem is we can't really make progress without a RISC-V machine to access (we have a builder, but we don't have SSH privileges). @randall77 briefly looked into this, but got stuck there. While we look for a way to access a machine in the background, it would be really helpful if someone from @golang/riscv64 could continue investigation here with these breadcrumbs. Thanks. |
Feel free to send me |
An update: I can build amtool now, but I can't start it.
I found that defaults.go has an abnormally large init function (952KB). Both the PC failure and the link failure might be related to this large init function. |
I can reproduce the original error with 1.21.1. |
I suspect this is likely to be a riscv64 assembler or linker issue - some quick data points (on openbsd/riscv64):
The The next step is to figure out what that call is supposed to be targeting... |
Two more data points... the reported error message can be reproduced by forcing external linking:
The
|
@thanm This looks like maybe the lazy map stuff? That's the symbol that isn't reached. |
Thanks, yes, it does look related. I think the problem might be this code here: which doesn't include R_RISCV_PCREL_ITYPE. Linker deadcode is calling IsDirectCall here: and if that's not returning TRUE for the call from init func to outline map init fragment, that will be a problem. |
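To illustrate the failure mode described above, here is a minimal sketch in Go. It is a toy model, not the linker's code: the type `RelocType`, the constants, and the functions `isDirectCall` and `reachable` are hypothetical stand-ins, and the real deadcode pass is more involved. The sketch only shows that a traversal which trusts a direct-call predicate never reaches a target whose relocation kind the predicate omits.

```go
package main

import "fmt"

// RelocType is a hypothetical stand-in for the linker's relocation kinds.
type RelocType int

const (
	RCall            RelocType = iota // generic direct call (illustrative only)
	RRiscvCall                        // models R_RISCV_CALL
	RRiscvPCRelIType                  // models R_RISCV_PCREL_ITYPE
)

// Reloc is a simplified relocation: a kind plus the symbol it refers to.
type Reloc struct {
	Type   RelocType
	Target string
}

// isDirectCall models a predicate over relocation kinds. With includePCRel
// set to false it omits the AUIPC+JALR form, as the comment above describes.
func isDirectCall(t RelocType, includePCRel bool) bool {
	switch t {
	case RCall, RRiscvCall:
		return true
	case RRiscvPCRelIType:
		return includePCRel
	}
	return false
}

// reachable returns the symbols found by following only those relocations
// that isDirectCall accepts, starting from root.
func reachable(syms map[string][]Reloc, root string, includePCRel bool) map[string]bool {
	seen := map[string]bool{root: true}
	work := []string{root}
	for len(work) > 0 {
		s := work[len(work)-1]
		work = work[:len(work)-1]
		for _, r := range syms[s] {
			if !isDirectCall(r.Type, includePCRel) || seen[r.Target] {
				continue
			}
			seen[r.Target] = true
			work = append(work, r.Target)
		}
	}
	return seen
}

func main() {
	// Hypothetical symbol names: an init function calling an outlined
	// map init fragment through the AUIPC+JALR relocation kind.
	syms := map[string][]Reloc{
		"pkg.init": {{Type: RRiscvPCRelIType, Target: "pkg.map.init.0"}},
	}
	fmt.Println(reachable(syms, "pkg.init", false)["pkg.map.init.0"]) // false
	fmt.Println(reachable(syms, "pkg.init", true)["pkg.map.init.0"])  // true
}
```

In this model the map init fragment is reached only once the predicate also accepts the PC-relative relocation kind.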
I arrived at the same conclusion - what I believe is happening is that the assembler switches the https://go-review.googlesource.com/c/go/+/520095 That said, based on the comment above FWIW, commenting out the |
I thought the JAL to AUIPC+JALR rewrite still keeps an R_RISCV_CALL relocation around? So the linker can find the direct call target in a few places. Does it not? |
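For readers unfamiliar with the rewrite being discussed, the following Go sketch shows the standard RISC-V arithmetic behind it. A single JAL encodes a signed 21-bit, even, PC-relative offset (about ±1 MiB), while an AUIPC+JALR pair splits a 32-bit offset into an upper 20-bit part and a signed 12-bit part. The function names `fitsJAL` and `splitHiLo` are hypothetical, and this is not taken from the Go assembler; it only illustrates the instruction-level constraint.

```go
package main

import "fmt"

// fitsJAL reports whether a PC-relative byte offset can be encoded in a
// single JAL: a signed 21-bit immediate whose lowest bit is always zero.
func fitsJAL(off int64) bool {
	return off >= -(1<<20) && off < 1<<20 && off&1 == 0
}

// splitHiLo splits a PC-relative offset into the AUIPC immediate (hi) and
// the JALR immediate (lo). Adding 0x800 before shifting compensates for
// JALR sign-extending its 12-bit immediate, so lo lands in [-2048, 2047].
func splitHiLo(off int64) (hi, lo int64) {
	hi = (off + 0x800) >> 12
	lo = off - (hi << 12)
	return hi, lo
}

func main() {
	for _, off := range []int64{0x12345, 0x12FFF, -4, 1 << 20} {
		hi, lo := splitHiLo(off)
		fmt.Printf("off=%#x fitsJAL=%v hi=%#x lo=%d\n", off, fitsJAL(off), hi, lo)
	}
}
```

The two halves always satisfy `hi<<12 + lo == off`, which is what allows a relocation on the pair to be resolved once the final target address is known.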
No, the This is needed so that we can distinguish between the single instruction |
There are a number of places in the linker where the code needs to detect whether a relocation is a direct call, e.g.
For RISCV calls are converted to |
@4a6f656c I suggest that when rewriting JAL to AUIPC+JALR, we emit an R_RISCV_PCREL_ITYPE relocation but also keep the R_RISCV_CALL marker relocation. R_RISCV_CALL doesn't need to imply any code change, just tells the linker what the direct call target is. I thought it worked that way, but apparently it does not... I wonder if (and how) the current stack bounds check in the linker works for RISCV64... Maybe CL 520095 does so. I haven't looked at the CL.
I'm not sure why we need to distinguish between them. I was thinking R_RISCV_PCREL_ITYPE applies to actual code generation and R_RISCV_CALL is just a marker. |
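The proposal above can be sketched in Go as follows. This is a hypothetical model only and does not reflect CL 520095 or the actual assembler: the names `Reloc`, `rewriteCall`, `directCallTargets`, and `patchSites` are invented for illustration. The idea shown is that the rewritten call site carries two relocations at the same offset, one that drives instruction patching and one that acts purely as a marker naming the direct call target.

```go
package main

import "fmt"

// RelocType is a hypothetical stand-in for the linker's relocation kinds.
type RelocType int

const (
	RRiscvCall       RelocType = iota // models R_RISCV_CALL, used as a marker
	RRiscvPCRelIType                  // models R_RISCV_PCREL_ITYPE, drives patching
)

// Reloc is a simplified relocation record.
type Reloc struct {
	Off    int // byte offset of the call site within the function
	Type   RelocType
	Target string
}

// rewriteCall models emitting both relocations for one AUIPC+JALR call site.
func rewriteCall(off int, target string) []Reloc {
	return []Reloc{
		{Off: off, Type: RRiscvPCRelIType, Target: target}, // code generation
		{Off: off, Type: RRiscvCall, Target: target},       // marker only
	}
}

// directCallTargets is what a pass such as deadcode or a stack bounds check
// would consult: it looks only at the marker relocations.
func directCallTargets(relocs []Reloc) []string {
	var out []string
	for _, r := range relocs {
		if r.Type == RRiscvCall {
			out = append(out, r.Target)
		}
	}
	return out
}

// patchSites returns the offsets whose instructions must be rewritten,
// looking only at the PC-relative relocations.
func patchSites(relocs []Reloc) []int {
	var out []int
	for _, r := range relocs {
		if r.Type == RRiscvPCRelIType {
			out = append(out, r.Off)
		}
	}
	return out
}

func main() {
	relocs := rewriteCall(0x40, "pkg.map.init.0") // hypothetical symbol name
	fmt.Println(directCallTargets(relocs))
	fmt.Println(patchSites(relocs))
}
```

With this split, passes that need call targets never have to know which instruction sequence was emitted, at the cost of one extra relocation per rewritten call.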
I think we're all agreeing that this is a long standing bug that needs to be fixed - |
Currently, I see a number of ways to fix this:
I'll try to send a fix out today or tomorrow. |
Change https://go.dev/cl/520095 mentions this issue: |
Hi, could this be backported to 1.21? It's a regression, and I'm not sure how to work around it. |
@gopherbot Please open backport to 1.21. This is a build failure with no straightforward workaround. It is a regression from 1.20. |
Backport issue(s) opened: #63166 (for 1.20), #63167 (for 1.21). Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases. |
@ianlancetaylor Could we please retitle this issue and the backport issue - it is really "cmd/internal/objabi,cmd/link: direct calls not correctly identified on riscv64" (it does not actually have anything to do with write barriers). Also, just to clarify, this issue exists in 1.20 - however the code that makes it a noticeable issue does not. @zhsj One workaround for the issue is to compile with
|
I feel CL 520095 is a bit invasive for backporting. Could we use some simpler workaround, like adding some relocation type to IsDirectCall? Will it cause any problem? Thanks. |
This issue does not seem to be completely fixed. When building Go packages that contain aws-sdk-go with Go 1.22.0 on Arch Linux RISC-V, similar errors appear, though the output differs. Build log taken from https://archriscv.felixc.at/.status/log.htm?url=logs/rekor/rekor-1.3.5-1.log:
|
This failed to build on riscv64 due to a bug in the Go compiler. With the upgrade to Go 1.22 this bug has been fixed. Hence, we can enable mimir on riscv64 again.
See:
* https://gitlab.alpinelinux.org/alpine/aports/-/commit/44b844d6da762606eba8912f97877c565b70ef4f#note_363441
* golang/go#62465
Co-authored-by: raspbeguy <guy.godfroy@gugod.fr>
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
What did you expect to see?
Build successfully.
What did you see instead?