runtime: need a better itab table #20505
This sounds like pretty much the same use-case that
Could the compiler generate a perfect hash for this data? That would save CPU by avoiding walks altogether, and it would avoid initialization at startup. The downsides are longer compile time, larger object size, and that it might not work well with plugins.
@huguesb It is possible to perfect hash the startup data, but that seems overkill. Just sizing the table appropriately would be enough. We'd need a dynamic table anyway, as not all of the data is available at startup. If we move itabs to the dynamic table on first lookup (the hot/cold table idea), the speed of the static tables is ~irrelevant.
How do you feel about graph coloring? For a type T that implements interfaces I1, I2, I3, create a clique on I1, I2, I3 to indicate that they must have different colors. Color the resulting graph, and construct tables (slices) mapping colors (now integers) to itabs. To convert Ix to Iy, extract Cx from Ix, then

Problem with this is that the graph coloring could be large-ish; I don't really know. It could happen at link time; it might be interesting to figure out if there was a way to do it on demand. Yes, I know it's an NP-complete problem. So is register allocation.
@dr2chase I totally don't understand what you are suggesting. What is this graph of? Interfaces? Why are two different interfaces interfering? Is this just to make a small index out of an interface name? We need to do this lookup even when the compiler never sees any of these types together (they are in different packages, say). So I don't think we could construct a complete interference graph at compile time anyway. Maybe at link time, but then plugins break it.
Goal is to make a small index out of an interface name. Two interfaces I1 and I2 interfere if some type T implements both of them (they can't both have the same index in T's table).

With plugins, the risk is that a new type TP makes two previously non-colliding interfaces collide, extending all the tables by one; a new interface IP also potentially extends all the tables. Conceivably, "failure" could continue the probe by some other means. How much of an event is loading a plugin?

More on the graph: graph nodes are interfaces, and graph edges indicate that two interfaces interfere (cannot share the same color) because they are both implemented by the same type. This is much the same as register-allocation graph coloring, where nodes are variables and an edge exists whenever two variables have overlapping lifetimes. In both cases, coloring is used to find and reuse small indices.
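To make the interference-graph idea concrete, here is a minimal sketch of the coloring step, with hypothetical names and a simple greedy coloring standing in for whatever the linker would actually use. Interfaces implemented by the same concrete type form edges; each color becomes a small integer index.

```go
package main

import "fmt"

// colorInterfaces greedily assigns each interface a small integer color such
// that interfering interfaces (both implemented by some type T) get distinct
// colors. Greedy coloring is a stand-in; a real implementation could use any
// heuristic, since optimal coloring is NP-complete.
func colorInterfaces(ifaces []string, edges map[[2]string]bool) map[string]int {
	color := make(map[string]int)
	for _, it := range ifaces {
		used := make(map[int]bool)
		for _, other := range ifaces {
			c, done := color[other]
			if done && (edges[[2]string{it, other}] || edges[[2]string{other, it}]) {
				used[c] = true
			}
		}
		c := 0
		for used[c] {
			c++ // pick the smallest color not used by a neighbor
		}
		color[it] = c
	}
	return color
}

func main() {
	// Some type T implements I1, I2, and I3, so they pairwise interfere;
	// I4 is independent and can reuse the smallest color.
	ifaces := []string{"I1", "I2", "I3", "I4"}
	edges := map[[2]string]bool{
		{"I1", "I2"}: true,
		{"I1", "I3"}: true,
		{"I2", "I3"}: true,
	}
	fmt.Println(colorInterfaces(ifaces, edges))
}
```

With the colors assigned, each type's table would map color to itab, so a conversion needs only an indexed load rather than a hash lookup.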
I think it would be worth doing some dynamic examination of real programs to see what dynamic types actually appear during specific interface-to-interface conversions. Where are all the interface-to-interface conversions coming from? How many are the fmt package checking for Formatter, GoStringer, and Stringer? Is it worth optimizing those specific cases by, for example, having special hash tables just for those types? I note that a type with methods always an

In general, my guess is that the great majority of interface-to-interface conversions are converting to an interface with a single method. Perhaps we could add to the type descriptor of an interface type with a single method a pointer to a hash table mapping concrete types to the appropriate itab, or nil if the type does not support the interface. The idea here is simply to split up the hash table.
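A minimal sketch of that "split up the hash table" idea, with toy types and hypothetical names (the real itab and type descriptors are runtime internals): each single-method interface type carries its own map from concrete type to itab, and a stored nil records that the type is known not to implement the interface.

```go
package main

import "fmt"

type itab struct{ methods string } // toy stand-in for the runtime's itab

// interfaceType is a toy stand-in for an interface type descriptor that has
// grown a per-interface cache, as suggested above.
type interfaceType struct {
	name  string
	cache map[string]*itab // concrete type name -> itab; nil means "does not implement"
}

// lookup distinguishes "cached as not implementing" (nil, true) from
// "never seen" (nil, false); the latter would fall back to a slow path.
func (it *interfaceType) lookup(concrete string) (tab *itab, seen bool) {
	tab, seen = it.cache[concrete]
	return tab, seen
}

func main() {
	stringer := &interfaceType{
		name: "fmt.Stringer",
		cache: map[string]*itab{
			"time.Duration": {methods: "String"},
			"int":           nil, // cached negative result
		},
	}
	tab, seen := stringer.lookup("time.Duration")
	fmt.Println(tab != nil, seen)
}
```

Each per-interface table would be far smaller than one global table keyed on the pair, so chains (or probes) stay short even with tens of thousands of itabs overall.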
Once we understand the nature of the table use (per Ian), I think a good next step would be to construct a benchmark program that reproduces the problem at different scales, so we can easily profile and experiment. |
Of the static entries, there were more than 30,000 proto.Message, plus another 3000 or so helpers, like the additional Oneof types. |
I think it would make sense to dynamically resize the itab hash table so that we can address both the dynamic and static itab entry cases in one go. I think this can be simpler than sync.Map: the hash table would need to be a slice (instead of a fixed-size array) and there would be an atomic pointer to the hash table slice. Lookups would follow the same "look twice: once without the lock, once with" pattern as today. If the entry is not found the second time, we proceed to the insertion case, which could be made to resize the table when above a given load factor.

During runtime init we may not be able to handle memory allocation (?); if that is a problem we can just disable hash resizing during early init. Another issue is that the itab hash modulo would not be a constant anymore, and we may want to avoid an integer divide on that path. Is there a strong requirement for the prime hash size? I suspect power-of-two hash sizes could be made to work too...
Another option is to precalculate the prime sequence and the magic factors (libdivide style).
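As an illustration of the magic-factor idea, here is a sketch using Lemire's fastmod trick, which is in the same family as libdivide: for a fixed divisor d, precompute M = floor(2^64/d) + 1 once, and then x % d for any 32-bit x needs only two multiplies and no divide instruction. Names are hypothetical.

```go
package main

import (
	"fmt"
	"math/bits"
)

// divisor holds a fixed divisor and its precomputed magic factor, so a prime
// table size could keep its fast modulo even though it is not a power of two.
type divisor struct {
	d uint64
	m uint64 // M = floor(2^64/d) + 1
}

func newDivisor(d uint32) divisor {
	return divisor{d: uint64(d), m: ^uint64(0)/uint64(d) + 1}
}

// mod computes x % d with two multiplies (exact for all 32-bit x).
func (dv divisor) mod(x uint32) uint32 {
	lowbits := dv.m * uint64(x)
	hi, _ := bits.Mul64(lowbits, dv.d) // high 64 bits of the 128-bit product
	return uint32(hi)
}

func main() {
	dv := newDivisor(1009) // the itab table's current prime size
	fmt.Println(dv.mod(123456789), 123456789%1009)
}
```

If the table resizes through a known sequence of primes, the matching magic factors can be precomputed alongside them and swapped in together with the new table.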
CL https://golang.org/cl/44338 mentions this issue. |
CL https://golang.org/cl/44339 mentions this issue. |
CL https://golang.org/cl/44341 mentions this issue. |
CL https://golang.org/cl/44472 mentions this issue. |
I noticed that we don't set an itab's function pointers at compile time. Instead, we currently do it at executable startup. Set the function pointers at compile time instead. This shortens startup time. It has no effect on normal binary size. Object files will have more relocations, but that isn't a big deal.

For PIE there are additional pointers that will need to be adjusted at load time. There are already other pointers in an itab that need to be adjusted, so the cache line will already be paged in. There might be some binary size overhead to mark these pointers. The "go test -c -buildmode=pie net/http" binary is 0.18% bigger.

Update #20505

Change-Id: I267c82489915b509ff66e512fc7319b2dd79b8f7
Reviewed-on: https://go-review.googlesource.com/44341
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
The runtime keeps a map from [interface type, concrete type] to the itab for that pair. Most interface->interface conversions need to look in this table to find the itab to use. (concrete type->interface conversions don't need this table any more - see CL 20901, CL 20902, and others.)
It is implemented as a hash table with 1009 entries and chained (singly-linked list) buckets.
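A toy model of that current scheme, assumed from the description above (names and hashing here are illustrative, not the runtime's actual code): a fixed array of 1009 buckets, each holding a singly-linked chain of itabs keyed by the (interface type, concrete type) pair.

```go
package main

import "fmt"

type itab struct {
	inter, typ string // stand-ins for the interface and concrete type descriptors
	link       *itab  // next itab in this bucket's chain
}

var hashTab [1009]*itab

func itabHash(inter, typ string) uint32 {
	var h uint32
	for i := 0; i < len(inter); i++ {
		h = h*31 + uint32(inter[i])
	}
	for i := 0; i < len(typ); i++ {
		h = h*31 + uint32(typ[i])
	}
	return h % 1009
}

// additab links a new itab onto the front of its bucket's chain.
func additab(m *itab) {
	h := itabHash(m.inter, m.typ)
	m.link = hashTab[h]
	hashTab[h] = m
}

// getitab walks the chain. With tens of thousands of entries spread over
// 1009 buckets, these walks average dozens of steps, which is the cost
// reported in this issue.
func getitab(inter, typ string) *itab {
	for m := hashTab[itabHash(inter, typ)]; m != nil; m = m.link {
		if m.inter == inter && m.typ == typ {
			return m
		}
	}
	return nil // the real runtime would build and insert the itab here
}

func main() {
	additab(&itab{inter: "fmt.Stringer", typ: "time.Duration"})
	fmt.Println(getitab("fmt.Stringer", "time.Duration") != nil)
}
```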
We've run into servers inside Google that have 43,540 entries in this table. That leads to lots of CPU getting wasted in walking long linked lists in convI2I calls.
Part of the problem is that we're now loading this table at startup with all the itabs that we build at compile time (introduced in the CLs referenced above), instead of just keeping dynamically-used entries in this table. Part of the problem may just be the sheer size of the code for the server in question (lots of protobufs, I think).
We should come up with a better way to organize the itab table. Maybe it would be enough to just implement move-to-front on the bucket lists? Or maybe keep a hot table and a cold table, put the itabs in the cold table at startup, and only move them to the hot table when referenced dynamically? Other ideas welcome.
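The move-to-front option can be sketched on a toy bucket as follows (itab fields are stand-ins, assumed for illustration): after a successful lookup, splice the found itab to the head of its chain so hot entries stay cheap to find. Note that this turns reads into writes, which would need care to preserve the lock-free lookups discussed below.

```go
package main

import "fmt"

type itab struct {
	inter, typ string // stand-ins for the interface and concrete type descriptors
	link       *itab  // next itab in this bucket's chain
}

// getitabMTF looks up (inter, typ) in one bucket's chain and, on a hit,
// moves the entry to the front of the chain.
func getitabMTF(bucket **itab, inter, typ string) *itab {
	prev := bucket
	for m := *bucket; m != nil; m = m.link {
		if m.inter == inter && m.typ == typ {
			*prev = m.link   // unlink from current position
			m.link = *bucket // splice onto the head
			*bucket = m
			return m
		}
		prev = &m.link
	}
	return nil
}

func main() {
	c := &itab{inter: "I3", typ: "T"}
	b := &itab{inter: "I2", typ: "T", link: c}
	head := &itab{inter: "I1", typ: "T", link: b}
	getitabMTF(&head, "I3", "T")
	fmt.Println(head.inter) // the hit is now at the front
}
```

The hot/cold variant avoids the write-on-read problem: the cold table is populated once at startup, and an itab migrates to the (smaller, hotter) table only on its first dynamic lookup, so steady-state reads stay read-only.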
The current implementation is ~append-only which makes the multithreading easy. I don't want to lose that property by locking on every call.
@walken-google @cherrymui