Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: remove unnecessary allocations in convT2E #8892

Open
dvyukov opened this issue Oct 7, 2014 · 10 comments
Open

runtime: remove unnecessary allocations in convT2E #8892

dvyukov opened this issue Oct 7, 2014 · 10 comments
Milestone

Comments

@dvyukov
Copy link
Member

dvyukov commented Oct 7, 2014

This was at lengths discussed in:
https://golang.org/issue/8405
and previously on golang-dev:
https://groups.google.com/d/msg/golang-dev/pwUh0BVFpY0/zqJInvU3NkQJ

Namely, we should allocate heap block for scalars iff the scalar look like a pointer
into heap (otherwise GC will ignore it anyway).

This will allow to have 1-bit/word GC pointer type info *and* don't allocate additional
memory for scalars in interfaces in most cases.
@randall77
Copy link
Contributor

Comment 1:

So if we're allocating scalars sometimes in the data word and sometimes pointed to by
the data word, then users of the interface need to distinguish those cases.  So, for
example, assertI2T for scalars must check if the result looks like a pointer into the
heap and of so, dereference it.

@dvyukov
Copy link
Member Author

dvyukov commented Oct 7, 2014

Comment 2:

Right. If it's not intended to be a pointer but looks like a pointer, dereference it.

@rsc
Copy link
Contributor

rsc commented Oct 7, 2014

Comment 3:

I believe we should not do this. It's too clever, and it will come back to bit eus later.

@ysmolski
Copy link
Member

ysmolski commented Nov 6, 2018

@dvyukov, does this issue look relevant to you?

@dvyukov
Copy link
Member Author

dvyukov commented Nov 6, 2018

@ysmolsky I don't understand the question. What do you mean?

@ianlancetaylor
Copy link
Contributor

@dvyukov I think he is asking whether you think this issue should still be open.

Really this is a question for @aclements and @RLH. Is this a feasible optimization with the current GC?

@dvyukov
Copy link
Member Author

dvyukov commented Apr 25, 2020

It should still be a feasible optimization because we still support Go pointers to point to C heap, right?

Another potential implementation would work for value types that are smaller than pointer size (e.g. uint16, uint32 on 64-bits, bool, small structs, etc). Namely: let's say we have amd64, pointer size is 64 bits, let's say we are storing uint32 into an interface{}, when we are storing the value we put 0xffffffff into the high 32-bits. This makes this value invalid/outside-of-heap if treated as pointer. Extraction won't require any special code, we just take the low 32 bits.
This would require to calculate if this optimization is applicable for a type in the compiler (based on type size, and arch), and adding some high bits in runtime/compiler.
Here is a quick proof-of-concept:

--- a/src/cmd/compile/internal/gc/subr.go
+++ b/src/cmd/compile/internal/gc/subr.go
@@ -1857,6 +1857,10 @@ func isdirectiface(t *types.Type) bool {
                return t.NumFields() == 1 && isdirectiface(t.Field(0).Type)
        }
 
+       if t.Size() == 2 && t.Align == 2 {
+               return true
+       }
+
        return false
 }
 
--- a/src/runtime/iface.go
+++ b/src/runtime/iface.go
 func convT16(val uint16) (x unsafe.Pointer) {
-       if val < uint16(len(staticuint64s)) {
-               x = unsafe.Pointer(&staticuint64s[val])
-               if sys.BigEndian {
-                       x = add(x, 6)
-               }
-       } else {
-               x = mallocgc(2, uint16Type, false)
-               *(*uint16)(x) = val
-       }
-       return
+       return unsafe.Pointer(uintptr(val) | 1<<63)
 }

This passed bootstrap and almost passed go tool dist test. I guess it also needs some changes in the reflect package.
And at this point we obviously want to inline this conversion wholesale because there is no point in the function call now.

Besides avoiding mallocgc call and not generating garbage, this also makes accesses faster (no indirection).

Thoughts?

@dr2chase @mknyszek

@dvyukov
Copy link
Member Author

dvyukov commented Apr 25, 2020

This is inspired by a real use case. I need to store lots of integers in interfaces, but these won't fit into staticuint64s (up to thousands/tens of thousands). But they perfectly fit into uint32 and that's what I use as the type.

@josharian
Copy link
Contributor

I’m on my phone, but you probably also need to adjust ifaceData in the compiler. And ifaceeq and efaceeq in the runtime. (Also, it’s be more impressive to get uint8/uint32 working—uint16s are not heavily used.)

Another option, which is less likely to cause rare breakage (e.g. in assembly) is to keep an atomic counter in the slow path of convT32 for smallish values and, when used enough, persistent alloc and populate a staticuint64s-like array with a larger range.

Or keep a per-P cache of recently allocated values so that we can re-use them. (I tried this for string interning, and it was pretty straightforward.)

@dr2chase
Copy link
Contributor

dr2chase commented Apr 27, 2020 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants