Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/cgo: crosscall2 must save all callee-saved registers (xmm, VFP, etc.) #14876

Closed
sqweek opened this issue Mar 19, 2016 · 9 comments
Closed
Milestone

Comments

@sqweek
Copy link
Contributor

sqweek commented Mar 19, 2016

This was a bastard to diagnose. To summarise:

  1. Under windows the floating-point registers XMM6 to XMM15 are categorised as "non-volatile" across function calls (https://msdn.microsoft.com/en-us/library/9z1stfyw.aspx)
  2. memmove_amd64.s clobbers these registers in some cases without restoring their original values (specifically the 65 through 256 byte cases)

For pure go code this is perhaps a non-issue. In my case I have a long-lived C thread (spawned by portaudio) repeatedly calling back into go. After two days debugging it has become apparent that gcc is using XMM6 for a certain floating-point calculation, and is expecting the register to remain unmodified after executing the callback (which seems to be within its rights according to the MSDN link).

However my go code calls the builtin copy. At some stage it tries to copy 66 bytes, causing memmove to clobber XMM6. Once the callback returns the C code keeps referring to XMM6 in further floating point operations, producing garbage.

This particular set of circumstances has left me unable to create a simple reproducer (my workspace is currently full of debugging hacks at every level), but I am entirely satisfied with this explanation. I've observed XMM6 being clobbered after the callback via gcc, and any of the following changes cause the symptom to disappear:

  1. Restructure the code so that every call to copy is of the same size (2,304 bytes)
  2. Avoid the 66-byte copy invocation
  3. Comment out the JBE move_65through128 and JBE move_129through256 lines in memmove_amd64.s
  4. Add a conditional printf to the C function (even if the condition is never met at runtime - presumably gcc decides storing the value in a register is no longer worth it?)

gdb's register info before the callback:

xmm6           {..., v2_int64 = {0x40caafb7a95d3d0e, 0x0}, ...}
xmm7           {..., v2_int64 = {0x3fe0000000000000, 0x0}, ...}

ie. float64(xmm6) = 13663.434856085598 which is inline with previous timings reported by my program:

Adc Time:                0, Current:  13663.414530729, Dac:  13663.498625786
Adc Time:                0, Current:  13663.424691487, Dac:  13663.511714507

After the callback does its 66-byte memmove, gdb says:

xmm6           {..., v2_int64 = {0x7bf99add7ce993ee, 0x79bbab5b7ae9a2b2}, ...}
xmm7           {..., v2_int64 = {0x7701bed9786db4c5, 0x73d0d4a57578c981}, ...}

After accounting for endianness this exactly matches the last 32 bytes in the buffer I passed to copy. So there is little doubt left as to what is happening - the only remaining question is whether gcc is in error for expecting the register to remain unchanged, or go is in error for not restoring the register's value. As I said earlier it looks like go's fault, but I am not an expert in this area and a single MSDN article is perhaps not entirely authoritative.

I'm going to take a break from this problem now and enjoy the rest of my weekend, but happy to provide further information/prepare a minimal-as-possible reproducer as desired.

For completeness, my environment details:

What version of Go are you using (go version)?
go version go1.3.3 windows/amd64
go version go1.6 windows/amd64

(I've mainly been using 1.3.3 but have confirmed the problem still exists in 1.6)

What operating system and processor architecture are you using (go env)?
$ go env
set GOARCH=amd64
set GOBIN=
set GOCHAR=6
set GOEXE=.exe
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=d:/code/go
set GORACE=
set GOROOT=C:\Go
set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
set CC=gcc
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0
set CXX=g++
set CGO_ENABLED=1

$ gcc --version
gcc.exe (rev5, Built by MinGW-W64 project) 4.8.1
@sqweek
Copy link
Contributor Author

sqweek commented Mar 19, 2016

Apparently its not just gcc, Wikipedia has this to say:

In x86-64, Visual Studio 2008 stores floating point numbers in XMM6 and XMM7 (as well as XMM8 through XMM15); consequently, for x86-64, user-written assembly language routines must preserve XMM6 and XMM7 (as compared to x86 wherein user-written assembly language routines did not need to preserve XMM6 and XMM7). In other words, user-written assembly language routines must be updated to save/restore XMM6 and XMM7 before/after the function when being ported from x86 to x86-64.

@minux
Copy link
Member

minux commented Mar 19, 2016

memmove is ok as the it's using Go ABI (all registers are caller-save.)

The problem is in cgo callback code (crosscall2): it should also save
callee-saved xmm registers.

I can see that our arm version also fails to save vfp registers (s16-s31
(d8-d15) are callee-save.) ARM64 needs to save lower 64-bit of v8-v15.
AMD64 needs to save (only Windows requires this) lower 128-bit of
xmm6-xmm15. 386 is ok.

Only ppc64x is correct in this regard. However, the problem is that on
certain architectures, we don't know if we can access FP registers,
e.g. ARM with GOARM=5 and 386 with GO386=387.)

@minux minux changed the title memmove's register usage doesn't respect windows/amd64 calling convention? runtime/cgo: crosscall2 must save all callee-saved registers (xmm, VFP, etc.) Mar 19, 2016
@minux minux added this to the Go1.7 milestone Mar 19, 2016
@minux minux self-assigned this Mar 19, 2016
@ianlancetaylor ianlancetaylor modified the milestones: Go1.6.1, Go1.7 Mar 20, 2016
@gopherbot
Copy link

CL https://golang.org/cl/21085 mentions this issue.

@gopherbot
Copy link

CL https://golang.org/cl/21084 mentions this issue.

gopherbot pushed a commit that referenced this issue Mar 25, 2016
For #14876.

Change-Id: I33947f74e8058437a784862f1f064974afc99250
Reviewed-on: https://go-review.googlesource.com/21084
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
gopherbot pushed a commit that referenced this issue Mar 25, 2016
For #14876.

Change-Id: I0992859264cbaf9c9b691fad53345bbb01b4cf3b
Reviewed-on: https://go-review.googlesource.com/21085
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@bradfitz bradfitz modified the milestones: Go1.6.1, Go1.6.2 Apr 7, 2016
@rsc
Copy link
Contributor

rsc commented Apr 7, 2016

This sucks but it has been broken for many releases. I think it can wait until Go 1.7 to release a fix.

@rsc rsc modified the milestones: Go1.7, Go1.6.2 Apr 7, 2016
@rsc
Copy link
Contributor

rsc commented Apr 7, 2016

(Unless someone wants to explain why this is a regression since Go 1.5, which it appears not to be.)

@rsc rsc closed this as completed Apr 7, 2016
@rsc rsc reopened this Apr 7, 2016
@minux
Copy link
Member

minux commented Apr 7, 2016 via email

@gopherbot
Copy link

CL https://golang.org/cl/23140 mentions this issue.

@rsc
Copy link
Contributor

rsc commented May 18, 2016

Glad we fixed the others. Let's leave ARM for another day.

@rsc rsc modified the milestones: Go1.8, Go1.7 May 18, 2016
@golang golang locked and limited conversation to collaborators May 25, 2017
@rsc rsc unassigned minux Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants