cmd/compile: runtime.zerovalue can be too small when using shared libraries #11491
I'm not sure this is a problem. The size is not associated with the size put into the map, it is associated with the size read during a map access. So any package that looks up in a map[K]V must have a zero value of size at least V. I believe the code does that correctly, as long as the linker does the maximum size calculation for each zerovalue in each shared library. I think you might be right that this is no longer used for 2-arg type casts.
I promise you that it can be a problem :-) (or I wouldn't have spent as long in gdb this morning). The situation is like this:

You compile some Go packages into a shared library. These packages contain a type T of, say, 128 bytes, but do not put T into a map, so the shared library contains a reasonably small zerovalue, say 64 bytes long. type.T.zero is set to point at this zerovalue. Then you link a package that does put T into a map against this shared library. When a map access misses, it copies type.T.zero into the value it returns. Only the first 64 bytes of this are guaranteed to be zero! Hilarity ensues.

(It's actually a bit more complicated than that, because runtime.zerovalue is currently a global symbol, so all type.*.zero pointers end up pointing at the one from the executable, but the effect is similar.)

I think all this type.*.zero maintenance is a bit silly really -- it's always the same pointer for all statically defined types, so why not just have a single zeroptr? I implemented this in ee83e7d and it seems to be a net reduction in complexity. The linker could assist by making sure that zeroptr initially points at a buffer at least as large as runtime.zerovalue today, but that would only make things a little simpler.
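The invariant being broken here is the one every Go program relies on: a miss on a map with value type T must yield T's zero value, which the 1.5 runtime produced by copying from the zerovalue buffer. This sketch just exercises those documented semantics; the type name and 128-byte size are illustrative, mirroring the example above:

```go
package main

import "fmt"

// T mirrors the 128-byte type from the scenario above; the name and
// size are illustrative, not taken from any real package.
type T [128]byte

func main() {
	m := make(map[string]T)
	v := m["missing"] // miss: the runtime copies T's zero value into v
	clean := true
	for _, b := range v {
		if b != 0 {
			clean = false // the bug left bytes past the zerovalue buffer as garbage
		}
	}
	fmt.Println(clean) // a correct runtime prints true
}
```

With the undersized zerovalue described above, only the first 64 bytes of v would be guaranteed zero.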
Right, I see, the pointer in T points to the zero value in the wrong shared library. Yuck. The problem with a single zero value is reflect. Map types made by the reflect package might need a larger zero value than any static map type. It's probably solvable, though, by returning nil to the reflect map operations, and conditioning their copy on checking the return value.

I think my change handles the reflect use case? It passes ./all.bash & go test reflect, at any rate...
Yes, locking & allocating at each map access should work. I had been worried about the contention from locking, but I guess >1K values aren't that likely.

I'm thinking there may be a slightly better way. The original reason for returning the zero value was that the caller didn't need to condition on the return value, it could just unconditionally copy the return value to the destination. This is good for small values, but for large values the overhead is in the noise, and it actually hurts because we use memmove and not memzero. So maybe we should reintroduce the condition on the return value for maps with values >1K? We'd have to compare the return value against &zero1024, but that seems like it would be ok.
It only locks & allocates when a map with a larger value type than ever seen before is created, that doesn't seem too bad? I can see the point about memset being preferable to memmove at some point -- so the map code would always return &zero1024 on a miss and the compiler, when compiling access to a map with a value type larger than 1K, would insert a check for that pointer being returned and insert a memset in that case? Sounds doable, but I don't know if using large value types in maps is very common...
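The lock-and-grow scheme being discussed can be sketched at user level. Everything here is illustrative: the name growZero is made up, and the real version was runtime-internal (it used the runtime's own mutex and unsafe pointers rather than a slice):

```go
package main

import (
	"fmt"
	"sync"
)

var (
	zeroLock sync.Mutex
	zeroBuf  = make([]byte, 1024) // starts at 1K, as in the discussion
)

// growZero ensures the shared zero buffer is at least size bytes long
// and returns it. The lock is only taken on map creation, and growth
// only happens the first time a larger value type than ever seen
// before shows up -- so contention is negligible in practice.
func growZero(size int) []byte {
	zeroLock.Lock()
	defer zeroLock.Unlock()
	if size > len(zeroBuf) {
		n := len(zeroBuf)
		for n < size {
			n *= 2 // grow geometrically so reallocation stays rare
		}
		zeroBuf = make([]byte, n)
	}
	return zeroBuf
}

func main() {
	fmt.Println(len(growZero(100)))  // 1024: no growth needed
	fmt.Println(len(growZero(3000))) // 4096: grown once, then reused
}
```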
Right, that's the algorithm I was trying to describe. I suspect value types >1K are very uncommon. You almost always want to use pointers-to-big-thing instead.

I guess I'm still a bit confused as to how the current setup doesn't work. There should only be a single shared library with the runtime package in it, and zerovalue is in the runtime package. Does the problem occur when compiling other shared libraries with other packages -- where the compiler ends up putting a large runtime.zerovalue symbol in that other shared library, and when that shared library is loaded, it can't affect the runtime.zerovalue that was already loaded in the runtime-containing shared library?

If my description is accurate, it sounds like the dynamic linker should at least barf when this happens instead of silently ignoring the problem. Is there any way to make the dynamic linker barf in this situation? I can imagine there might be other cases where the compiler introduces runtime data symbols that might interact badly with dynamic linking.
On 2 July 2015 at 13:47, Keith Randall notifications@github.com wrote:
In trying to explain exactly what goes wrong, I also became very confused.
The way to make this work is for runtime.zerovalue to be a common symbol. Common symbols have the exact semantics we want, courtesy of FORTRAN. In ELF that means setting the section index of the symbol to SHN_COMMON (== 0xfff2).

But changing the behaviour for large value types seems better to me. I'm not sure we even need to go to 1024, probably 256 would be fine. If the value type is <= 256 bytes, the compiler generates code to copy. If the value type is > 256 bytes, have the map code return nil for zero, and have the compiler generate if ret == nil { memset } else { memcpy }.
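The generated-code pattern being proposed can be written out as ordinary Go. mapaccessBig below is a hypothetical stand-in for the runtime's map-access helper under this scheme (the real helpers have different signatures); the point is the nil-check-plus-memclr shape of the call site:

```go
package main

import "fmt"

type huge [512]byte // deliberately over the proposed 256-byte threshold

// mapaccessBig stands in for the runtime helper under this scheme:
// it returns nil on a miss instead of a pointer into a zero buffer.
func mapaccessBig(m map[int]huge, k int) *huge {
	if v, ok := m[k]; ok {
		return &v
	}
	return nil
}

func main() {
	m := map[int]huge{}
	var dst huge
	dst[0] = 0xFF // dirty the destination so the memclr arm is visible

	// What the compiler would emit for `dst = m[42]` on a large value type:
	if p := mapaccessBig(m, 42); p == nil {
		dst = huge{} // memset/memclr: zero the destination in place
	} else {
		dst = *p // memcpy from the stored value
	}
	fmt.Println(dst == huge{}) // true: the miss zeroed dst
}
```

Zeroing in place on a miss is what makes memset possible instead of a memmove from a zero buffer.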
@mwhudson Do you want to try using SHN_COMMON for zerovalue when compiling shared or dynlink? That change might be small enough for 1.5. |
Sure, I'll give it a go.
So it turns out that this only works with gold, unless your binutils is built with --enable-elf-stt-common, which Ubuntu's isn't (the default is apparently to not enable it in case you want to copy your binary onto a system that has a dynamic loader from before 1999, sigh). I'm fine with depending on gold though.
We can definitely do the large-value-by-reference fix for 1.6. I think that change is too big for 1.5 though. Anything simpler you'd like to do for 1.5?
I was trying Ian's suggestion but I ran into https://sourceware.org/bugzilla/show_bug.cgi?id=18288, which is fixed but not part of any release yet. Feels like we should leave it for now, unless Ian has any other ideas.
My suggestion was to use SHN_COMMON. Did you try that? Using STT_COMMON too does no harm, but as you've discovered, in practice you need SHN_COMMON too.
Yes, but unless I'm missing something, the objects in the .so need to be STT_COMMON or the dynamic linker won't use the special "largest one with that name wins" behaviour we need. And that needs either ld.bfd built with non-standard options or an unreleased gold. I'd love to be wrong about something here btw :)
The GNU linker uses a heuristic to decide whether a symbol in a shared library is a common symbol: if the symbol is defined in a section that is allocated but not loaded, it will be treated as a common symbol when merged with a symbol in a regular object in the SHN_COMMON section. Unfortunately, gold doesn't do this.

But now it occurs to me that we are being distracted by this. Why not simply make runtime.zerovalue a hidden symbol?
Because then the type.zero of all types defined in a module is the size of the largest type that is put into a map in that module -- if a type is defined in one module but only put into a map in another, it can still be too small.
I'm not quite following. The type descriptor points to a specific zero value, and that pointer was set up at link time (leaving out types created by the reflect package, which are rare). Adding that type to a map is not going to use the local runtime.zerovalue value.
If zerovalue is local, a map miss will copy from the zerovalue of the module that defines the type, and that zerovalue is the size of the largest type seen in a map among the packages that go into that module. It's easy to make this go wrong -- put a large type but no use of it into a module and then link another module against it that puts this large type into a map. 66719bc makes zerovalue local and also contains a test that fails with that change. (A slightly different test, with TortureZeroValue in zvtlib rather than the exe, fails on tip.)
Ah, I see. OK, then, if a type is not seen as going into a map, we need to set its zero field to nil, rather than pointing to runtime.zerovalue. I think the map code will do the right thing.
No, it's always set to runtime.zerovalue and the map code does not do the right thing. But this is too much discussion of code that should just be deleted early in 1.6.
I think the map code does the right thing if the zero field is nil, because of the mapzero function in runtime/hashmap.go.
So I started to implement the idea of having the map code return a pointer to a fixed-size buffer of zeros on a miss and I've gotten stuck. In particular, I don't see the best way of handling the miss case for OINDEXMAP. Basically I don't know what to rewrite OINDEXMAP into when the value type is large: the access needs to turn into conditional code like the `if ret == nil { memset } else { memcpy }` sequence described above, but that's really a statement, and a map index is obviously an expression. This sort of expansion is clearly done by function inlining, but I don't know how to do that in walkexpr :/
I'm not sure either, but there must be a way. walkexpr can handle things like && which also have implicit control flow.
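The && case is the relevant precedent: the compiler turns the short-circuit expression into a temporary plus an if-statement, which is exactly the expression-to-statement rewrite OINDEXMAP would need. A source-level illustration of the transformation (the compiler does this on its IR, not on Go text):

```go
package main

import "fmt"

func a() bool { fmt.Println("a ran"); return false }
func b() bool { fmt.Println("b ran"); return true }

func main() {
	// Source form: r := a() && b()
	// Lowered form: a temporary plus control flow, so b() is only
	// evaluated when a() returns true -- here it never runs.
	r := false
	if a() {
		r = b()
	}
	fmt.Println(r)
}
```

Running this prints "a ran" and then "false"; "b ran" never appears, showing the control flow the expression form implies.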
This replaces the scheme where the compiler tracked the largest type used as a map value and allocated a block of zeroes large enough to return on a map miss with a scheme that is simpler in the runtime but more complicated in the compiler: just have a fixed-size block of zeroes (256 bytes here) and handle map access differently in the compiler when the value type is too large. What I have here doesn't quite work (try cranking the <= 256 in walk.go down to <= 8). Update golang#11491. Change-Id: I1da3aca96bc4e0f32129f3a53e31dd8079f79100
Getting closer... the Node-level stuff is pretty unfamiliar.
CL https://golang.org/cl/13784 mentions this issue. |
AIUI runtime.zerovalue is only used in maps: it is what is copied when a lookup misses. The compiler takes some care to make it be the smallest value needed: it is the size of the largest type that is seen being put into a map (there is a comment that suggests it's also used by 2-arg type casts, but I'm not sure that's true any more). Anyway, this kind of closed-world assumption is not safe in a shared library: just because the compiler / linker does not see a type being put into a map does not mean that it is not going to be put into a map in some module that links the shared library containing the type data.
I guess the safest fix is to make runtime.zerovalue a local symbol and ensure that the zerovalue for a shared library is as large as any type defined by that shared library (possibly only need to consider exported types?). Not sure if there is a small enough fix to be safe for 1.5.