runtime: efficient access to thread-local data #8884

Open
dvyukov opened this issue Oct 7, 2014 · 2 comments
@dvyukov
Member

dvyukov commented Oct 7, 2014

Currently we have 3 performance issues with accesses to thread-local data (g/m/p):
1. Accesses require non-inlinable function calls.
2. The only thread-local var is now g, while the most frequently accessed data is in m. So
most of the accesses have an additional indirection.
3. We do lots of duplicate loads of g/m.

We need to:
1. Make the thread-local var m (instead of g).
2. Move stack guard of the current g into m (that's the only hot data in g).
3. Declare a runtime.curm variable in the runtime, teach the compiler to recognize it and turn
it into a TLS access.
4. Teach the compiler not to do unnecessary duplicate loads of curm (like in
https://golang.org/issue/4946).
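
To make the access paths concrete, here is a minimal Go model of the two layouts. It is only a sketch: the struct fields are heavily simplified, Go has no user-visible TLS so package-level variables stand in for the thread-local slot, and the names tlsG, locksViaG, locksViaM, lockTwiceNaive and lockTwiceHoisted are hypothetical. Only curm comes from the proposal above.

```go
package main

import "fmt"

// Hypothetical, heavily simplified stand-ins for the runtime's structures.
type m struct {
	locks      int
	stackguard uintptr // proposed home of the hot stack guard
}

type g struct {
	stackguard0 uintptr // current home of the stack guard
	m           *m
}

// Package-level variables stand in for the thread-local slot.
// tlsG models the current scheme, curm the proposed one.
var (
	tlsG *g
	curm *m
)

// locksViaG models the current path: load g, load g.m, then read locks.
func locksViaG() int { return tlsG.m.locks }

// locksViaM models the proposed path: load curm, then read locks.
func locksViaM() int { return curm.locks }

// lockTwiceNaive names curm for every access, modelling duplicate loads (issue 3).
func lockTwiceNaive() {
	curm.locks++
	curm.locks++
}

// lockTwiceHoisted loads curm once and reuses it, which is what
// item 4 asks the compiler to do automatically.
func lockTwiceHoisted() {
	mp := curm
	mp.locks++
	mp.locks++
}

func main() {
	mp := &m{stackguard: 0x1000}
	tlsG = &g{stackguard0: 0x1000, m: mp}
	curm = mp

	lockTwiceNaive()
	lockTwiceHoisted()
	fmt.Println(locksViaG(), locksViaM())
}
```

Both accessors observe the same counter; the difference the issue cares about is how many dependent loads each path needs, which this model shows only by the shape of the expressions.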
@rsc
Contributor

rsc commented Oct 7, 2014

Comment 1:

I believe that changing from g to m is a mistake.
The most frequently accessed thread-local data is g->stackguard0, which is in g. It is
accessed once per function call. g is also much easier to reason about in programs,
because it cannot change from line to line as a particular function executes.
Eventually I would like to put g back into a dedicated register on amd64, like we do on
arm. Then getting at g->stackguard0 will be just one load, and getting at m will be just
one load too.
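
For readers unfamiliar with why g->stackguard0 is touched once per function call, the following Go model sketches the per-call stack check. It is an assumption-laden simplification: the g type here has a single field, and needsMoreStack is a hypothetical name for a comparison that the real toolchain emits as a few instructions in the function prologue.

```go
package main

import "fmt"

// Hypothetical, simplified stand-in for the runtime's g.
type g struct {
	stackguard0 uintptr // stack pointer values at or below this trigger stack growth
}

// needsMoreStack models the prologue check: stacks grow down, so a stack
// pointer at or below the guard means the function must grow the stack first.
func needsMoreStack(gp *g, sp uintptr) bool {
	return sp <= gp.stackguard0
}

func main() {
	gp := &g{stackguard0: 0x2000}
	fmt.Println(needsMoreStack(gp, 0x3000)) // plenty of room above the guard
	fmt.Println(needsMoreStack(gp, 0x1800)) // below the guard
}
```

With g held in a dedicated register, the gp.stackguard0 read in this model corresponds to the single load described above.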

@dvyukov
Member Author

dvyukov commented Oct 8, 2014

Comment 2:

> The most frequently accessed thread-local data is g->stackguard0, which is in g.
Yes, it's the most frequently accessed; that's why I propose to move it to M. But there are
also m->mcache, m->locks, m->p and m->ptr/scalarargs. Duplicating them in G looks bad
because it will bloat G and open the door to bugs. Whereas what was called stackguard0 can
be moved to M rather than duplicated.
> g is also much easier to reason about in programs, because it cannot change from line to line as a particular function executes.
It's true that it can change, but I don't see how naming things differently changes
anything. It can change regardless of whether you call it 'm' or 'g->m'. If you want to
prevent m from changing, you do 'm->locks++' or 'g->m->locks++'. No difference (other
than the additional indirection).
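
The pinning idiom mentioned here can be modelled in Go as follows. This is a sketch under assumptions: the types are reduced to the one field involved, and pinViaG, pinViaM and unpin are hypothetical helper names chosen for illustration, not runtime functions.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the runtime's structures.
type m struct{ locks int }
type g struct{ m *m }

// pinViaG models 'g->m->locks++': reach m through g, then bump the counter.
func pinViaG(gp *g) *m {
	gp.m.locks++
	return gp.m
}

// pinViaM models 'm->locks++': bump the counter on m directly.
func pinViaM(mp *m) *m {
	mp.locks++
	return mp
}

// unpin undoes one pin on the given m.
func unpin(mp *m) { mp.locks-- }

func main() {
	mp := &m{}
	gp := &g{m: mp}

	a := pinViaG(gp)
	b := pinViaM(mp)
	fmt.Println(a == b, mp.locks) // same m, counter bumped by both paths

	unpin(a)
	unpin(b)
	fmt.Println(mp.locks)
}
```

Both helpers increment the same counter on the same m; the only difference in the model is the extra pointer hop through g.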
