Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile, cmd/link: use dyld TLV support on darwin #17200

Closed
crawshaw opened this issue Sep 23, 2016 · 33 comments
Closed

cmd/compile, cmd/link: use dyld TLV support on darwin #17200

crawshaw opened this issue Sep 23, 2016 · 33 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@crawshaw
Copy link
Member

crawshaw commented Sep 23, 2016

We have an unpleasant hack for thread-local storage on OS X. The compiler assumes we can use a fixed offset (pthread key) and on initialization we call pthread_create_key until we get it, crashing if we don't. No real programs crash yet, but eventually a Go program depending on enough C libraries will.

More details about how this works: https://github.com/golang/go/blob/0c7ccbf601976a/src/runtime/cgo/gcc_darwin_386.c#L23

In OS X 10.7, the dynamic linker, dyld, was taught some rudiments of thread-local variables (TLVs) for C11's __thread, and the C++ equivalent. If we are willing to drop support for Mac OS X 10.6, we can also drop the fixed offset hack.

I haven't been able to find any documentation for this feature. All there is is dyld source code and speculatively compiling C programs:

http://opensource.apple.com//source/dyld/dyld-195.5/src/threadLocalVariables.c
http://opensource.apple.com//source/dyld/dyld-195.5/src/threadLocalHelpers.s

In particular, here's two interesting pieces of C, and what they compile to on OS X:

int y;

void use_y() {
    y = 7;
}

_use_y:
0000000100000e80    pushq   %rbp
0000000100000e81    movq    %rsp, %rbp
0000000100000e84    leaq    _y(%rip), %rax
0000000100000e8b    movl    $0x7, (%rax)
0000000100000e91    popq    %rbp
0000000100000e92    retq
0000000100000e93    nopw    %cs:(%rax,%rax)
__thread int x;

void use_x() {
    x = 7;
}

_use_x:
0000000100000e60    pushq   %rbp
0000000100000e61    movq    %rsp, %rbp
0000000100000e64    leaq    _x(%rip), %rdi
0000000100000e6b    callq   *(%rdi)
0000000100000e6d    movl    $0x7, (%rax)
0000000100000e73    popq    %rbp
0000000100000e74    retq
0000000100000e75    nopw    %cs:(%rax,%rax)

So accessing a TLV (x) is just like accessing a global (x), except before using the value you call it like a function.

I believe what's happening here is the __thread x is a TLVDescriptor as defined in the threadLocalVariables.c linked above. The first field, thunk, is a callable function pointer that dyld fills out for you, if you ask for it nicely in the appropriate mach-o header field.

The extra call should be easy enough to generate in cmd/compile. And while I haven't extracted the mach-o TLV definition header yet, we only need the one, runtime.tlsg, so it should be easy enough to add that to cmd/link.

I'm not planning on working on this, I just wanted to write it down because I can't find details about this anywhere else. If anyone sees the error message:

runtime/cgo: pthread_key_create failed

or

runtime/cgo: could not obtain pthread_keys

from their Go program on OS X, I hope they find this.

@crawshaw crawshaw added this to the Unplanned milestone Sep 23, 2016
@bradfitz
Copy link
Contributor

If we are willing to drop support for Mac OS X 10.6, we can also drop the fixed offset hack.

We already have, in #9511. We also don't really support OS X 10.7, where we don't currently compile even. (But I think binaries might still sometimes work on 10.7, but we don't have a builder since it doesn't compile there, and our official min version remains 10.8)

@ianlancetaylor
Copy link
Contributor

I think that to avoid problems like the Sierra gettimeofday problem (#16272) we are going to have to change to always use external linking on Darwin, as we already do on Solaris. That should also fix this problem.

@crawshaw
Copy link
Member Author

crawshaw commented Sep 23, 2016

On using darwin's libc without external linking:

@ianlancetaylor I was thinking about that problem recently, as I have a similarly interesting situation with the Fuchsia port, where I'm going to need to switch to a VDSO for making syscalls.

In that case, the platform is ELF-compatible, so I can internally link, bring in the VDSO syscall wrappers with go:cgo_import_dynamic, and let the dynamic linker wire them up.

I think we could do something similar on darwin. Even though there is no spec for mach-o and dyld's behavior, one is effectively enforced on its authors by backwards compatibility. Old OS X binaries have to keep running.

So darwin programs will end up with a dynamic libc dependency, but don't require the darwin C++ toolchain to build.

@minux
Copy link
Member

minux commented Sep 24, 2016 via email

@ianlancetaylor
Copy link
Contributor

I don't like the idea of always using the external linker either. But Apple has clearly demonstrated that, unlike most Unix systems, they regard the stable ABI not as the kernel boundary, but as the shared C library boundary. And while it's true that we can do our own dynamic linking with the shared C library, that format is undocumented and also subject to change.

I don't feel all that strongly about this--I don't use Darwin myself. But we are still picking up the pieces from the Sierra breakage, and I think it is clear that it is going to break again. How can we best serve users of Go going forward?

@crawshaw
Copy link
Member Author

If we generate the same bytes from internal dynamic linking that we get from external dynamic linking, then we get the same number of breakages.

That means the choice is substantially different than using libc calls v. syscalls. I agree we should use darwin's libc.

@mattetti
Copy link
Contributor

The compiler assumes we can use a fixed offset (pthread key) and on initialization we call pthread_create_key until we get it, crashing if we don't. No real programs crash yet, but eventually a Go program depending on enough C libraries will.

Just for the record, we did see our code crash when using c-shared (on macOS). To work around this issue, we applied an unfortunate hack by pushing the fixed address :(

@ianlancetaylor
Copy link
Contributor

If we generate the same bytes from internal dynamic linking that we get from external dynamic linking, then we get the same number of breakages.

True, but a future release of MacOS will change the dynamic linking algorithm, and then we won't generate the same bytes any more. There is nothing at all that constrains Apple from changing things so that linking on version N requires an algorithm different than linking on version N-1. They aren't constrained by documentation or compatibility or tooling, they are only constrained by ensuring that programs linked on version N-1 continue to run on version N. That means that things will inevitably change, because there is nothing that will prevent it.

@crawshaw
Copy link
Member Author

crawshaw commented Sep 26, 2016

Now you are discussing toolchain compatibility, which I believe that is a far less important problem than end-user binaries.

On toolchain compatibility: I agree that we could fall out of sync easily. However, two points. Firstly, the consequences are far easier to manage. We release a new Go and say for macOS N, you need Go M.

Secondly, external linking does not help. The object files we produce for the external linker have as mercurial a format as the dynamic linker. If anything the problem with external linking is worse, because the only compiler Apple cares about is developed and released with a matching linker version in Xcode. The Go toolchain's external linking can be broken by Xcode point releases.

@minux
Copy link
Member

minux commented Sep 26, 2016 via email

@copumpkin
Copy link

Sorry all if I'm missing something, but I was just referred here from #17490. Am I understanding correctly that you're saying that Apple might change how one dynamically links a Mach-O binary to a library? I don't think that really happens much, as the format hasn't changed in years and is publicly documented (including by Apple themselves, although I can't find the page anymore 😮). The only compatibility-breaking changes I know of are that since 10.11, they stopped respecting custom dyld interpreter requests, and they've added a few oddball load commands over the past few years that might prevent binaries using them from running on older systems.

Dynamically linking against libSystem (Apple's recommended route) seems substantially different from writing your own syscall wrappers, which Apple has explicitly documented to be unsupported and dangerous behavior. It's not ideal to depend on Xcode, but writing a linker that knows how to dynamically link exactly one thing (with a well specified format) seems like an easier task than maintaining a mass of syscall wrappers that Apple has said they'll probably break over time.

@ianlancetaylor
Copy link
Contributor

I am saying that Apple does not document the dynamic linker behavior. The only thing they consider stable is invoking the XCode linker to produce a binary. If we do anything else we will eventually run into the problem that Go version N does not build on MacOS version Q. In other words, to build on MacOS version Q you need to run Go version N+1. I agree that if we do dynamic linking we can be in a better situation than we currently are, because today we see that Go version N built on MacOS version Q-1 does not run on MacOS version Q. If we implement the (undocumented) dynamic linking correctly, then we will avoid that problem: Go version N may not build on MacOS version Q, but at least a program bult with Go version N on MacOS version Q-1 will run on MacOS version Q.

That said it's important to note that this will not fix the bootstrapping problem. We will be forced to backport all dynamic linker changes to Go 1.4. Perhaps we can avoid that by forcing Go 1.4 to always use external linking on Darwin when bootstrapping.

@copumpkin
Copy link

copumpkin commented Jul 20, 2017

@ianlancetaylor I'm not necessarily disagreeing, but do you have some sort of statement from Apple about XCode's ld being the only supported way of linking things? Mach-O is a reasonably well documented format that predates Apple's use of it, and although Apple has extended it a bit since adopting it, the basic mechanisms for pulling in a dynamic dependency are fairly set in stone nowadays. I'm not saying that they won't change it someday, but given that they generally don't break old compiled executables in new macOS versions, I doubt it would be a sudden thing. Object format feels much more stable than syscall ABI, which is explicitly called out as not being stable.

Anyway, more broadly, I don't feel terribly strongly about Go using an external ld or implementing its own dynamic Mach-O generation code. The external dependency on Xcode seemed less in line with what Go does elsewhere so that's why I was suggesting a very simple dynamic Mach-O generation process in Go, but it's not as if I object to the other approach 😄

@ianlancetaylor
Copy link
Contributor

@copumpkin Yes, I have been told by Apple developers (that is, developers at Apple who work on the tools and libraries) that the only thing they consider stable is invoking the XCode linker to produce a binary.

@kofalt
Copy link

kofalt commented Jul 26, 2017

I believe I may have found a real-world instance of this problem occurring.

I have been working on a library for Matlab that links against Golang source code using -buildmode=c-shared. On linux, this works great, but on Darwin, Matlab crashes while loading the library:

$ /Applications/Matlab.app/bin/matlab -nojvm
 
>> addpath('/path/to/matlab-golang-crash')
>> loadlibrary('simple', 'simple.h')

runtime/cgo: could not obtain pthread_keys

Initially, I thought the error was in my code, so I created the smallest possible example with a function that has no arguments and no return values. This also crashed, using go 1.8.3 and 1.9rc1.

You can try it out for yourself with my matlab-golang-crash repo.
It's my current understanding that these two tools cannot be used together on Darwin.

There's an example crash here; let me know if there's anything I could do in the short term to mitigate this issue. I'd also be happy to file a separate issue if desired.


Notably, I also contacted Matlab support, and Ritesh Chandna of MathWorks had this to say:

I collaborated with my colleagues to discuss the issue you are facing and from our understanding the crash is coming from the signal handler. It is quite possible that here is a bug in Golangs signal handler or an incompatibility with MATLAB's signal handler. Because many sigsegv signals are generated by the jvm for garbage collection, signal handler problems are quickly found when using MATLAB. One thing to try is to run matlab -nojvm and see if the problem does not occur or happens much less frequently. It will force you to run MATLAB from a command line. If Go is writing anything useful to stdout or stderr before calling abort, we are more likely to see it on the command line than if MATLAB is launched from the .app icon.

It is quite difficult to guarantee compatibility between signal handers in a shared library and a main program. It is best if shared libraries do not contain or install signal handlers.

@ypujante
Copy link

I have run into the same issue as well. I was trying to build a VST (2.4) plugin with a decent language (not C++) and Go seemed like a great choice.

I wrote a small bootstrap (bundle) in C++ which implements the VSTPluginMain entry point and delegate to the Go code for the real work (I could never find a way to build a bundle with Go...). I was very happy to make it work entirely when using MrsWatson: I can process a sound file through my plugin and it works with all the logic written 100% in Go:

// simply multiply every input sample by 0.5 which reduces the volume by 3dB
func (p *MyPlugin) ProcessSamplesFloat32(inputs [][]float32, outputs [][]float32) {
	for i, input := range inputs {
		output := outputs[i]
		for j, sample := range input {
			output[j] = sample * 0.5
		}
	}
}

But when the plugin gets loaded via a real DAW (Reason/Maschine...), then it crashes due to calling abort() and the error message runtime/cgo: could not obtain pthread_keys (which is this code )

@mattetti
Copy link
Contributor

@ypujante we got something similar working but had to use some seriously bad hacks to just prove you could build an audio plugin in Go. The good news is that it works, the bad news is that there is a lot of internal work needed to make things work. I dropped you an email to see if you'd be interested in chatting offline.

@scisci
Copy link

scisci commented Feb 12, 2018

hi @mattetti what was the workaround you used? I just ran into this problem. I wrote a small VST plugin which imports a Go library using buildmode=c-archive. My VST crashes when it loads as a plugin:

runtime/cgo: could not obtain pthread_keys
	tried 0x11e 0x11f 0x120 0x121 0x122 0x123 0x124 0x125...

It looks like the Go developers were under the assumption that this Go initialization code would run immediately, but instead the plugin gets loaded at any time.

Would love to know how to work around this since I just spent a lot of time getting the plugin working in standalone mode.

@ypujante
Copy link

@scisci the workaround is to patch the go source code itself unfortunately (and create your own build of it) which is not sustainable as it will require to do that with every version of go. This is what @mattetti did: https://gist.github.com/mattetti/857bc651115096c921121be464d4669f. I didn't try it myself as I do not want to rely on a hack. I am very disappointed as I wanted to not have to deal with C++ anymore... and I am hoping that the go team will come with an actual fix for this issue (caused by a hack in the first place... see comments...).

@scisci
Copy link

scisci commented Feb 12, 2018

hey @ypujante that worked! Thanks for the quick response you and @mattetti saved this project for me.

Hopefully this issue will be resolved soon so the patch is not necessary.

@mattetti
Copy link
Contributor

mattetti commented Feb 12, 2018 via email

@scisci
Copy link

scisci commented Feb 12, 2018

hey @mattetti, this is only for an internal tool I'm making for myself, so an occasional crash is better than the thing never working. Have you discovered any safer ways of getting around this bug?

@ypujante
Copy link

I agree with @mattetti that this hack should not be used for production and I certainly would not recommend anybody using it for this purpose. That being said for an internal tool, I guess it's ok but prepare yourself for random problems...

I do hope too that the go team fixes this original problem as more and more people are having the same idea (writing VST plugins in a language that doesn't suck...).

@scisci
Copy link

scisci commented Feb 13, 2018

A small update in case anyone else comes across this thread. Using the patch above worked in a one VST host (JUCE demo host), but the plugin would not load into Logic or Ableton Live. I went back and increased the offsets of the patch to solve this. For AMD64, this was changing tlsoffset to 0xaa0, and increasing the pthread_key_t tofree array to 1024.

This allows the plugin to load and mostly work but I do get occasional crashes. Not sure if those are related to this hack or not (but probably).

I have little to no experience working this low level so really working in the dark here, but I don't want to have to rewrite my Go code in c++.

@crawshaw can you enlighten me on the logic for this hack? For instance the comment reads that key numbers start at 0x100, but since we don't know if we will be the first one, we try for 0x108. Does that basically saying that we have to be at most, the 8th call to pthread_key_create? So by increasing this offset we can be later and later?

For instance, maybe in audio software there are many many calls to pthread_key_create and that's whats causing the issue?

@lsegal
Copy link

lsegal commented Feb 13, 2018

can you enlighten me on the logic for this hack?

The hack works by brute-force searching for correct the TLS base address. Increasing the offset allows it to search further down in the list.

For instance, maybe in audio software there are many many calls to pthread_key_create and that's whats causing the issue?

That's the issue. If the host process spawns threads prior to loading the shared library, the offset will have changed.

@kofalt
Copy link

kofalt commented Feb 14, 2018

Thank you for the link to the patch. As per comments upthread I understand it's just a hack and should not be deployed, but it helped confirm my understanding of the problem.

@4h0q
Copy link

4h0q commented Mar 23, 2018

I've bumped into the issue for yet another case. The crash happens when statically linking a go c-archive library with a MacOS project with lots of dependences including other c-archive libraries

go build -buildmode=c-archive -o lib.a

runtime/cgo: could not obtain pthread_keys
	tried 0x110 0x111 0x112 0x113 0x115 0x116 0x117 0x118 0x119 0x11a 0x11b 0x11c 0x11d 0x11e 0x11f 0x120 0x121 0x122 0x123 0x124 0x125 0x126 0x127 0x128 0x129 0x12a 0x12b 0x12c 0x12d 0x12e 0x12f 0x130 0x131 0x132 0x133 0x134 0x135 0x136 0x137 0x138 0x139 0x13a 0x13b 0x13c 0x13d 0x13e 0x13f 0x140 0x141 0x142 0x143 0x144 0x145 0x146 0x147 0x148 0x149 0x14a 0x14b 0x14c 0x14d 0x14e 0x14f 0x150 0x151 0x152 0x153 0x154 0x155 0x156 0x157 0x158 0x159 0x15a 0x15b 0x15c 0x15d 0x15e 0x15f 0x160 0x161 0x162 0x163 0x164 0x165 0x166 0x167 0x168 0x169 0x16a 0x16b 0x16c 0x16d 0x16e 0x16f 0x170 0x171 0x172 0x173 0x174 0x175 0x176 0x177 0x178 0x179 0x17a 0x17b 0x17c 0x17d 0x17e 0x17f 0x180 0x181 0x182 0x183 0x184 0x185 0x186 0x187 0x188 0x189 0x18a 0x18b 0x18c 0x18d 0x18e 0x18f 0x190

@crawshaw any chances of addressing the issue soon?

@ianlancetaylor
Copy link
Contributor

@randall77 is looking at this.

@andybons andybons added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 23, 2018
@felipejfc
Copy link

felipejfc commented Jun 15, 2018

I've also hit this issue :/

EDIT

After hitting the issue, I've cloned go's source code from this repo and was going to make some patches for it to stop crashing, but it was already fixed in master, I guess it will come as a default in go 1.11

@randall77
Copy link
Contributor

Thanks for verifying that tip fixed this issue.
I will close this issue. If anyone is still having problems with tip (or 1.11, when it comes out) you can reopen.

@randall77
Copy link
Contributor

randall77 commented Jun 16, 2018 via email

@matti777
Copy link

I came across this while including a Go library via gomobile into my iOS project and then adding another library which seems to cause this; see https://github.com/AFNetworking/AFNetworking/issues/4245

This with go version go1.10.3 darwin/amd64

@ianlancetaylor
Copy link
Contributor

@matti777 This issue is closed for 1.11. Please try the 1.11 beta release to see if it fixes your problem. If not, please open a new issue with details.

@golang golang locked and limited conversation to collaborators Jul 12, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests