Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/link: internal link issue with outline atomics feature on arm64 #39466

Closed
shawndx opened this issue Jun 9, 2020 · 39 comments
Closed

cmd/link: internal link issue with outline atomics feature on arm64 #39466

shawndx opened this issue Jun 9, 2020 · 39 comments
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@shawndx
Copy link
Contributor

shawndx commented Jun 9, 2020

What version of Go are you using (go version)?

$ go version
tip

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env


GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/home/xiaji01/.cache/go-build"
GOENV="/home/xiaji01/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/xiaji01/.go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/xiaji01/.go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/xiaji01/util/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/xiaji01/util/go/pkg/tool/linux_arm64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build044102122=/tmp/go-build -gno-record-gcc-switches"

What did you do?

GCC-9.4 introduced the outline atomics support for arm64 and the feature was made default since GCC-10.1. Go's internal linking mode doesn't work well with the feature, here is what have been spotted so far and a sample case, further analysis is ongoing.

With GCC-10.1 and default CGO flags, or GCC-9.4+with an additional "-moutline-atomics" flag, the sample case runs into a link issue with '-linkmode=internal', internal linking tries to resolve undefined external references in libgcc.a, while the problematic variable __aarch64_have_lse_atomic and its init function don't get resolved.

$ go build -ldflags='-linkmode=internal'
libgcc(.text): unknown symbol __aarch64_have_lse_atomics

lse.c

int test_cas_atomic_int (int *ptr, int *expected, int desired)
{
  return __atomic_compare_exchange_n (ptr, expected, desired, 0, 0, 0);
}

mult.c

#include <stdlib.h>
#include <stdio.h>

extern int test_cas_atomic_int (int *ptr, int *expected, int desired);

int test_cgo(int a, int b) {
  int old = a;
  int new = b;

  printf("before: old = %d, new = %d\n", old, new);
  test_cas_atomic_int(&old, &old, new);
  printf("after: old = %d, new = %d\n", old, new);
  return new * old + old;
}

mult.go

package main

/*
extern int test_cgo(int a, int b);
*/
import "C"
import (
 "fmt"
)

func main() {
 a := 3
 b := 5
 c := C.test_cgo(C.int(a), C.int(b))
 fmt.Println("test_cgo:", int(c))
}

What did you expect to see?

Trying to apply GCC-10 and its new LSE features to CGO part, expect they work smoothly.

What did you see instead?

Internal linking doesn't work.

@shawndx shawndx changed the title cmd/link: cmd/link: internal link issue with outline atomics feature on arm64 Jun 9, 2020
@shawndx
Copy link
Contributor Author

shawndx commented Jun 9, 2020

Wondering if intentionally loading the lse-init.o from libgcc.a, if it's present and other criteria satisfies, and letting cgo call the init routine at start-up a right way.
@ianlancetaylor

@shawndx
Copy link
Contributor Author

shawndx commented Jun 9, 2020

more details regarding reproducing the issue:

  1. GCC-9.4+
    A. go build -ldflags='-linkmode=internal', no issue
    B. CGO_CFLAGS="-g -O2 -moutline-atomics" go build -ldflags='-linkmode=internal', error out

  2. GCC-10.1
    A. go build -ldflags='-linkmode=internal', error out
    B. CGO_CFLAGS="-g -O2 -mno-outline-atomics" go build -ldflags='-linkmode=internal', no issue

@ianlancetaylor
Copy link
Contributor

It's important for internal linking to work when using the standard library. We don't consider it important for internal linking to work when using cgo code that is not in the standard library. There are always going to be cases where it fails. This is probably just another of those cases.

Why do you want to use internal linking?

@shawndx
Copy link
Contributor Author

shawndx commented Jun 9, 2020

@ianlancetaylor It was exposed by ./all.bash, and could be reproduced by running "go test -ldflags='-linkmode=internal' under misc/cgo/test, then we created the sample case to analyze,here is the screen shot of ./all.bash, we didn't set any additional flags/variables.

By the way, given the following error message, is there any way to help identify which test causes the issue, thought it should be issue5242 and failed to reproduce the issue by extracting relevant code into a small case.

Thanks.

===============================================================

# misc/cgo/test
./test.go: In function 'issue5242':
./test.go:376:5: note: parameter passing for argument of type 'bar' changed in GCC 9.1
376 | int issue5242(foo f, bar b) {
       |     ^~~~~~~~~
# misc/cgo/test.test
libgcc(.text): unknown symbol __aarch64_have_lse_atomics
FAIL	misc/cgo/test [build failed]

@ianlancetaylor
Copy link
Contributor

Thanks. I misunderstood. Sounds like arm64 internal linking no longer works with GCC 10, at least if the C code calls any atomic instructions. The simple fix might be to add "arm64" to the cases for which (*tester).internalLink returns false in cmd/dist/test.go.

@shawndx
Copy link
Contributor Author

shawndx commented Jun 9, 2020

Thanks @ianlancetaylor , have a few more questions.

  1. Did you see any value and feasibility to have internal linking work with the new outline atomics?
  2. Besides the fix to dist test, I wonder if a meaningful message is expected rather than the "unknown symbol" text for user's cgo program, assume we decide not supporting internal linking with the outline atomics being enabled.

@ianlancetaylor
Copy link
Contributor

Looking at the implementation of -moutline-atomics, I now wonder why it doesn't work. The Go linker in internal linking mode will normally link in libgcc, which is whether the required symbols are defined. So it is probably worth somebody looking into exactly what is going wrong.

@ianlancetaylor ianlancetaylor added help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jun 9, 2020
@ianlancetaylor ianlancetaylor added this to the Backlog milestone Jun 9, 2020
@cherrymui
Copy link
Member

cherrymui commented Jun 9, 2020

I wonder if it has something to do with the symbol __aarch64_have_lse_atomics being marked as hidden in libgcc. It is not surprising to me if there is some bugs in the linker with hidden symbols. In the linker there are some scary TODOs with VisibilityHidden here and there, and, admittedly I don't really understand them. I'll take a careful look... (In the case of internal linking with libgcc.a, I don't know why VisibilityHidden even matters...)

@cherrymui
Copy link
Member

Out of curiosity, why GCC 10 defaults to outline atomics? (Seems a weird decision to me. I must have missed something...)

@shawndx
Copy link
Contributor Author

shawndx commented Jun 10, 2020

Looking at the implementation of -moutline-atomics, I now wonder why it doesn't work. The Go linker in internal linking mode will normally link in libgcc, which is whether the required symbols are defined. So it is probably worth somebody looking into exactly what is going wrong.

Thanks, I'd like to work on it, could you please help assign?

@shawndx
Copy link
Contributor Author

shawndx commented Jun 10, 2020

I wonder if it has something to do with the symbol __aarch64_have_lse_atomics being marked as hidden in libgcc. It is not surprising to me if there is some bugs in the linker with hidden symbols. In the linker there are some scary TODOs with VisibilityHidden here and there, and, admittedly I don't really understand them. I'll take a careful look... (In the case of internal linking with libgcc.a, I don't know why VisibilityHidden even matters...)

That symbol is of R_AARCH64_ADR_PREL_PG_HI21 and R_AARCH64_ADD_ABS_LO12_NC types, adding support to hidden visibility of these two types in linker could resolve the symbol, but looks like more changes are needed (tried a simple fix and build worked but execution crashed), among of them might be a call to init_have_lse_atomics at proper place.
Wondering if force loading the object file where the symbol resides (lse-init.o) and calling init_have_lse_atomics in cgo start-up code is reasonable.

@shawndx
Copy link
Contributor Author

shawndx commented Jun 10, 2020

Out of curiosity, why GCC 10 defaults to outline atomics? (Seems a weird decision to me. I must have missed something...)

// The author's serial check-ins
https://patchwork.ozlabs.org/project/gcc/list/?series=73689

// announcement email
https://lists.linaro.org/pipermail/cross-distro/2020-June/000937.html

@cherrymui
Copy link
Member

static void __attribute__((constructor))
init_have_lse_atomics (void)

We need to handle constructors in internal linking now? (I don't think we do currently.) That will be ... interesting.

@ianlancetaylor why do we always statically link in libgcc in internal linking mode? I wonder if we could provide an option for dynamic libgcc?

@ianlancetaylor
Copy link
Contributor

In general the way that cmd/link treats libgcc is driven by Windows systems. We don't want to assume that a Go program built for Windows will be run on a system that has a libgcc DLL available.

I suppose we could have an option to link dynamically against libgcc. It's a bit complex as libgcc is broken into two parts, one part that must be statically linked, so the linker is typically invoked with -lgcc_s -lgcc -lgcc_eh (and let's not even get into --as-needed). If we aren't going to invoke the external linker we would still have to figure out which parts of -lgcc we need to pull in.

I agree that it would be annoying to have to support __attribute__((constructor)) when doing internal linking. Bother.

@cherrymui
Copy link
Member

Thanks, @ianlancetaylor ! Statically linking in libgcc makes sense for Windows.

If we aren't going to invoke the external linker we would still have to figure out which parts of -lgcc we need to pull in.

Okay, it seems this is also a complex option...

It seems init_have_lse_atomics only sets a boolean. @shawn-xdji what happens if this function does not run? Will the program fail or just less efficient?

@shawndx
Copy link
Contributor Author

shawndx commented Jun 11, 2020

It seems init_have_lse_atomics only sets a boolean. @shawn-xdji what happens if this function does not run? Will the program fail or just less efficient?

@cherrymui The program will be less efficient in general, as a load-exclusive-store-exclusive loop will be emitted even on CPUs having the LSE feature if the init function is skipped.

@erifan
Copy link

erifan commented Jun 11, 2020

what happens if this function does not run? Will the program fail or just less efficient?

The symbol __aa64_have_atomics is the same as the implementation mechanism and function of arm64HasATOMICS in Go. Choose different versions of certain atomic functions by checking whether the hardware supports LSE.

If we don't guarantee that any cgo code written by the user will work in internal linking mode, maybe we can find the problematic test and add a cgo option -mno-outline-atomics to workaround this problem.

But libgcc.a seems to contain this symbol, 0000000000000000 B __aarch64_have_lse_atomics
I don't understand why the linker can't find it. There may be some work to do related to hidden symbols for the internal linker ? I am not sure, just guess.

@cherrymui
Copy link
Member

Thanks. If the program can still work correctly, it seems fine for a first step not to call the constructor. Internal linking is not the default mode so it is okay to be less efficient.

@shawn-xdji @erifan if I understand correctly, this only fails in tests, not actually standard library cgo packages (net, os/user, etc.) and commands (cmd/go, etc.)? If so, then it seems fine to skip the test or passing additional flag.

I don't understand why the linker can't find it. There may be some work to do related to hidden symbols for the internal linker ? I am not sure, just guess.

As I commented earlier (#39466 (comment)), it is probably a bug in handling of hidden symbols.

@shawndx
Copy link
Contributor Author

shawndx commented Jun 12, 2020

Thanks @cherrymui ,I'm submitting a change to disable the internal linking for cgo test first.

@gopherbot
Copy link

Change https://golang.org/cl/237858 mentions this issue: cmd/dist: fix internal linking check for arm64

@jcajka
Copy link
Contributor

jcajka commented Jun 24, 2020

This is still observable when building go from master/go1.15beta1 on arm64(with gcc-10), failing the tests in ./all.bash. Is there a plan to fix/workaround this issue for go1.15?

# misc/cgo/test.test
libgcc(.text): unknown symbol __aarch64_have_lse_atomics
FAIL	misc/cgo/test [build failed]

@shawndx
Copy link
Contributor Author

shawndx commented Jun 24, 2020

Hi @jcajka , a new patchset as per Cherry's suggestion is ready, but I was not able to have it upstreamed before out of office this week, hopefully it could be under review early next week.
Could you please change CGO_CFLAGS to add -mno-outline-atomics as a workaround for now? Thanks.

@jcajka
Copy link
Contributor

jcajka commented Jun 24, 2020

@shawn-xdji thank you :). I will try it.

@jcajka
Copy link
Contributor

jcajka commented Jun 25, 2020

works for me

@shawndx
Copy link
Contributor Author

shawndx commented Jun 29, 2020

Thanks @jcajka, could you please help to verify the Patchset 3 of https://go-review.googlesource.com/c/go/+/237858 as well? Thanks a lot.

@jcajka
Copy link
Contributor

jcajka commented Jun 30, 2020

@shawn-xdji sorry for long notice. I have tested it on Fedora 32(gcc10/binutils2.34), rockpro64 sbc and it resolves the test issue and I haven't seen any regressions. I haven't tested it much past the GC compile/testsuite.

On that note I have missed this issue originally in Fedora 32(go1.14)by mistake and honestly I haven't seen it in the wild(Fedora packages) only in the testsuite. But still it would be great to see it fixed in go1.15(next beta/rc) and I would also much appreciate fix in the upcoming go1.14 point release.

Do you plan back port for go1.14?

@shawndx
Copy link
Contributor Author

shawndx commented Jul 3, 2020

Enclose a single-file test-case: https://play.golang.org/p/J-AleD6Zeu6

@shawndx
Copy link
Contributor Author

shawndx commented Jul 3, 2020

Test7978 in misc/cgo/test is the only one case that experiences the problem now, and the test commands which will fail at build phase are:

  1. GOFLAGS=-ldflags=-linkmode=internal go test -tags=internal
  2. go test -buildmode=pie -ldflags=-linkmode=internal

I've tried two fixes:

  1. to isolate the test-case in separate files and guarded under build constraints, it's doable for command 1, but for command 2 looks like a new tag is needed.
  2. utilizing CGO_CFLAGS to pass -mno-outline-atomics for the two commands on arm64 if the C compiler supports the option.

@cherrymui and @ianlancetaylor, I'm sending an updated patch having the fix 2.

@zhsj
Copy link
Contributor

zhsj commented Aug 4, 2020

As Debian has bumped the default GCC version to 10, we can't build Go on Arm64.

https://buildd.debian.org/status/package.php?p=golang-1.14
https://buildd.debian.org/status/package.php?p=golang-1.15

And CL237858 can't apply on 1.14 👀

@cherrymui
Copy link
Member

To be clear, this only affects tests in the Go toolchain itself. It does not affect actual use of Go toolchain, at least in the default mode. If anyone sees a failure in test like this, please be aware that the toolchain is built successfully and usable.

@cherrymui
Copy link
Member

For Go 1.15 I think it is better for this to be as simple as possible. I think option 1 in #39466 (comment) seems reasonable. You could pass -tags=internal for the internal linking PIE case (and we should have done that).

@shawndx
Copy link
Contributor Author

shawndx commented Aug 5, 2020

I prefer to the option #1 too, but was concerned about the side-effect of using an exiting tag, good to know PIE test should be tagged 'internal', will update a new patch to replace the option #2 which is under reviewing.

@shawndx
Copy link
Contributor Author

shawndx commented Aug 5, 2020

Uploaded Patchset #5 for https://go-review.googlesource.com/c/go/+/237858

@gopherbot
Copy link

Change https://golang.org/cl/262357 mentions this issue: cmd/dist: fix build failure of misc/cgo/test on arm64

@ianlancetaylor ianlancetaylor modified the milestones: Backlog, Go1.16 Oct 14, 2020
gopherbot pushed a commit that referenced this issue Oct 27, 2020
misc/cgo/test fails in 'dist test' on arm64 if the C compiler is of GCC-9.4 or
above and its 'outline atomics' feature is enabled, since the internal linking
hasn't yet supported "__attribute__((constructor))" and also mis-handles hidden
visibility.

This change addresses the problem by skipping the internal linking cases of
misc/cgo/test on linux/arm64. It fixes 'dist test' failure only, user is expected to
pass a GCC option '-mno-outline-atomics' via CGO_CFLAGS if running into the same
problem when building cgo programs using internal linking.

Updates #39466

Change-Id: I57f9e85fca881e5fd2dae6c1b4446bce9e0c1975
Reviewed-on: https://go-review.googlesource.com/c/go/+/262357
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
@shawndx shawndx closed this as completed Nov 2, 2020
@cherrymui
Copy link
Member

We only skipped the test. I think we still want to fix internal linking, at least the hidden visibility part. (Not in Go 1.16.)

@cherrymui cherrymui reopened this Nov 2, 2020
@cherrymui cherrymui modified the milestones: Go1.16, Unplanned Nov 2, 2020
@gopherbot
Copy link

Change https://golang.org/cl/383554 mentions this issue: cmd/dist: skip internal linking tests on arm64

gopherbot pushed a commit that referenced this issue Feb 7, 2022
The previous workaround for issue #39466 only disabled this test for
Linux. However, the issue manifests for all arm64 systems with gcc 9.4
and above. The new netbsd-arm64 builder uses NetBSD-current with gcc
10.3, so it fails in the same way.

Updates #39466

Change-Id: I276a99a5e60914e5c22f74a680e461bea17cfe92
Reviewed-on: https://go-review.googlesource.com/c/go/+/383554
Trust: Benny Siegert <bsiegert@gmail.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@gopherbot
Copy link

Change https://go.dev/cl/404296 mentions this issue: cmd/link: fix handling of visibility hidden symbols

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels May 30, 2022
@dmitshur dmitshur modified the milestones: Unplanned, Go1.19 May 30, 2022
@gopherbot
Copy link

Change https://go.dev/cl/415074 mentions this issue: [release-branch.go1.17] cmd/dist: skip internal linking tests on arm64

gopherbot pushed a commit that referenced this issue Jun 29, 2022
The previous workaround for issue #39466 only disabled this test for
Linux. However, the issue manifests for all arm64 systems with gcc 9.4
and above. The new netbsd-arm64 builder uses NetBSD-current with gcc
10.3, so it fails in the same way.

Updates #39466.
For #53050.

Change-Id: I276a99a5e60914e5c22f74a680e461bea17cfe92
Reviewed-on: https://go-review.googlesource.com/c/go/+/383554
Trust: Benny Siegert <bsiegert@gmail.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
(cherry picked from commit 1d60513)
Reviewed-on: https://go-review.googlesource.com/c/go/+/415074
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
@golang golang locked and limited conversation to collaborators Jun 29, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

8 participants