Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: use staticuint64s #37612

Open
josharian opened this issue Mar 2, 2020 · 23 comments
Open

cmd/compile: use staticuint64s #37612

josharian opened this issue Mar 2, 2020 · 23 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. help wanted NeedsFix The path to resolution is known, but the work has not been done. Performance Suggested Issues that may be good for new contributors looking for work to do.
Milestone

Comments

@josharian
Copy link
Contributor

CL 216401 introduced a new static array of uint64s running from 0 to 255.

Two follow-ups here:

@josharian josharian added this to the Go1.15 milestone Mar 2, 2020
@josharian
Copy link
Contributor Author

This may be a decent starter issue for someone wanting to work on the compiler.

@josharian josharian added help wanted Suggested Issues that may be good for new contributors looking for work to do. labels Mar 2, 2020
@dmitshur dmitshur added the NeedsFix The path to resolution is known, but the work has not been done. label Mar 3, 2020
@dpinela
Copy link
Contributor

dpinela commented Mar 3, 2020

I'm not sure what this has to do with the compiler? It seems to be purely a runtime optimization - the linked CL doesn't touch cmd/compile at all.

@randall77
Copy link
Contributor

Look for staticbytes in cmd/compile/internal/gc.

@josharian
Copy link
Contributor Author

Right. But now that we have staticuint64s in the runtime, we can use it in the compiler.

For part one, grep the compiler for staticbytes and use staticuint64s instead. (The offset math will need to change a bit; see Keith's linked CL comment.) Then delete runtime.staticbytes.

For part two, in places where we only optimize byte-sized things, we can probably now optimize constant larger-int-sized things. The sites will be the same sites encountered in part one.

@dpinela
Copy link
Contributor

dpinela commented Mar 3, 2020

There are actually two more remaining uses of runtime.staticbytes, in src/runtime/string.go, for generating 1-character strings.

@josharian
Copy link
Contributor Author

Thanks! Those could also be converted, using similar arithmetic.

@gopherbot
Copy link

Change https://golang.org/cl/221957 mentions this issue: cmd/compile: use staticuint64s instead of staticbytes

@dpinela
Copy link
Contributor

dpinela commented Mar 4, 2020

I've sent a CL implementing the first part of the compiler change. I'll do the other two follow-ups later.

gopherbot pushed a commit that referenced this issue Mar 4, 2020
There are still two places in src/runtime/string.go that use
staticbytes, so we cannot delete it just yet.

There is a new codegen test to verify that the index calculation
is constant-folded, at least on amd64. ppc64, mips[64] and s390x
cannot currently do that.

There is also a new runtime benchmark to ensure that this does not
slow down performance (tested against parent commit):

name                      old time/op  new time/op  delta
ConvT2EByteSized/bool-4   1.07ns ± 1%  1.07ns ± 1%   ~     (p=0.060 n=14+15)
ConvT2EByteSized/uint8-4  1.06ns ± 1%  1.07ns ± 1%   ~     (p=0.095 n=14+15)

Updates #37612

Change-Id: I5ec30738edaa48cda78dfab4a78e24a32fa7fd6a
Reviewed-on: https://go-review.googlesource.com/c/go/+/221957
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
@gopherbot
Copy link

Change https://golang.org/cl/221979 mentions this issue: runtime: use staticuint64s instead of staticbytes for 1-length strings

gopherbot pushed a commit that referenced this issue Mar 5, 2020
This was the last remaining use of staticbytes, so we can now
delete it.

The new code appears slightly faster on amd64:

name                   old time/op  new time/op  delta
SliceByteToString/1-4  6.29ns ± 2%  5.89ns ± 1%  -6.46%  (p=0.000 n=14+14)

This may not be the case on the big-endian architectures, since they have
to do an extra addition.

Updates #37612

Change-Id: Icb84c5911ba025f798de152849992a55be99e4f3
Reviewed-on: https://go-review.googlesource.com/c/go/+/221979
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@dpinela
Copy link
Contributor

dpinela commented Mar 13, 2020

I was trying to do the small int constant part, but by the time walkexpr runs, the OLITERAL has been replaced with an ONAME node for a temporary:

before walk smallintiface
.   RETURN l(26) tc(1)
.   RETURN-list
.   .   CONVIFACE l(26) esc(h) tc(1) implicit(true) INTER-@0
.   .   .   NAME-codegen..stmp_3 l(26) x(0) class(PEXTERN) tc(1) used int

I can't figure out how to recover the value of the constant from that ONAME.

@josharian
Copy link
Contributor Author

You’ll probably need to make a companion change to order.go at the same time. I’m AFK for a little while. If that isn’t enough of a hint, let me know and I’ll expand on it when I am at a real keyboard.

@dpinela
Copy link
Contributor

dpinela commented Mar 14, 2020

I tried narrowing the condition in the OCONVIFACE case of (*Order).expr as follows:

// Don't use a temporary for integer constants that fit in a byte; walk will replace
// them with pointers into runtime.staticuint64s.
if _, needsaddr := convFuncName(n.Left.Type, n.Type); (needsaddr || isStaticCompositeLiteral(n.Left)) && !(n.Left.CanInt64() && n.Left.Int64() >= 0 && n.Left.Int64() < 256) {

That almost works, but I get a couple of test failures, which seem directly relevant to this code:

# go run run.go -- writebarrier.go

Unmatched Errors:
writebarrier.go:288: write barrier

FAIL	writebarrier.go	0.092s
# go run run.go -- fixedbugs/issue31782.go
exit status 1
panic: interface conversion: interface {} is nil, not int

goroutine 1 [running]:
main.main()
	/Users/dpinela/dev/golang-dev/go/test/fixedbugs/issue31782.go:23 +0x78
exit status 2

FAIL	fixedbugs/issue31782.go	0.308s

I've had a look at the code in order.go and sinit.go that handles composite literals, but I'm having a hard time following what's going on, so I'm stuck.

@josharian
Copy link
Contributor Author

Please post your changes in a CL so I can take a look. Thanks.

I’m still AFK, and due to coronavirus school closures, it is unclear when I will next have meaningful laptop time. But I will look as best as I can, next opportunity I get.

@josharian
Copy link
Contributor Author

I wonder—have you checked what code we generate right now for this case? I wonder whether we have enough existing optimizations at tip (probably not 1.14) to already optimally handle the case of a small constant int. (There’s an entirely separate question of handling an int that is non-constant but whose unsigned equivalent provably fits in one byte. That we definitely don’t handle well now. But I don’t think that that is what you were looking at, either.)

@gopherbot
Copy link

Change https://golang.org/cl/223358 mentions this issue: cmd/compile: use staticuint64s to optimize interface conversion of small integer constants (WIP)

@dpinela
Copy link
Contributor

dpinela commented Mar 14, 2020

Currently (both 1.14 and tip) we generate a static global with the constant's value and use a pointer to that, like we do for string literals:

"".X STEXT nosplit size=25 args=0x10 locals=0x0
	0x0000 00000 (smallintiface.go:3)	TEXT	"".X(SB), NOSPLIT|ABIInternal, $0-16
	0x0000 00000 (smallintiface.go:3)	PCDATA	$0, $-2
	0x0000 00000 (smallintiface.go:3)	PCDATA	$1, $-2
	0x0000 00000 (smallintiface.go:3)	FUNCDATA	$0, gclocals·568470801006e5c0dc3947ea998fe279(SB)
	0x0000 00000 (smallintiface.go:3)	FUNCDATA	$1, gclocals·69c1753bd5f81501d95132d08af04464(SB)
	0x0000 00000 (smallintiface.go:3)	FUNCDATA	$2, gclocals·9fb7f0986f647f17cb53dda1484e0f7a(SB)
	0x0000 00000 (smallintiface.go:4)	PCDATA	$0, $1
	0x0000 00000 (smallintiface.go:4)	PCDATA	$1, $1
	0x0000 00000 (smallintiface.go:4)	LEAQ	type.int(SB), AX
	0x0007 00007 (smallintiface.go:4)	PCDATA	$0, $0
	0x0007 00007 (smallintiface.go:4)	MOVQ	AX, "".~r0+8(SP)
	0x000c 00012 (smallintiface.go:4)	PCDATA	$0, $1
	0x000c 00012 (smallintiface.go:4)	LEAQ	""..stmp_0(SB), AX
	0x0013 00019 (smallintiface.go:4)	PCDATA	$0, $0
	0x0013 00019 (smallintiface.go:4)	MOVQ	AX, "".~r0+16(SP)
	0x0018 00024 (smallintiface.go:4)	RET
[...]
""..stmp_0 SRODATA size=8
	0x0000 01 00 00 00 00 00 00 00                          ........

This is optimal in execution time, but not in binary size: pointing into staticuint64s would allow us to elide these globals, saving 8 bytes each in the common case of int constants on 64-bit architectures.

@randall77 randall77 modified the milestones: Go1.15, Go1.16 May 21, 2020
@randall77 randall77 modified the milestones: Go1.16, Go1.17 Dec 14, 2020
@ianlancetaylor
Copy link
Contributor

Moving to Backlog milestone.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.17, Backlog Apr 27, 2021
@skonto
Copy link

skonto commented Jun 3, 2021

Hi, Is this case an example of how to use it?

@zikaeroh
Copy link
Contributor

zikaeroh commented Jul 29, 2021

I'm working on reviving https://golang.org/cl/223358 as an intro to working on the compiler; so far I've found a few interesting things.

The issue31782.go failure is due to the fact that the CL implemented this conversion by leaving OLITERALs in the tree after order. However, walkCompLit calls isStaticCompositeLiteral, which treats OLITERAL as something that can be statically represented, which skips anylit, and convert never gets run on the struct literal as the intent is that nothing has to be converted. The OLITERALs then get miscompiled. I've "solved" this by performing the interface conversion (a subset of convert) early in order. But, I'm not sure that this is a good idea (as order doesn't do this normally). I've also found that walkCompLit is never called with a struct literal that contains an OLITERAL (if a struct fails, it's for other reasons; maybe this used to not be true before the big ir refactoring), so maybe the fix is to disallow OLITERAL there and move this code back to being split between order and convert.

The writebarrier.go issue is actually preexisting; it turns out that ssa apparently doesn't recognize offsets off of global static variables to be accesses that don't need a write barrier. Ever since the original optimization (bool/bytes) was introduced, the following code has had write barriers:

func foo() []interface{} {
    return []interface{}{
        true,   // ERROR "write barrier"
        int8(3) // ERROR "write barrier"
    }
}

Both values are converted to interfaces that contain staticuint64s+8 and similar, which get a write barrier. The CL revealed this because one of the writebarrier.go tests used 7, which is small enough to be staticuint64'd. No existing test verified that small values didn't have write barriers during interface conversions.

I'm not really sure how to solve that one; I asked @mdempsky about this and it seems like ssa would need to change to understand offsets into globals, which seems like a more impactful change. (But, I'm looking into it! 🙂)

@zikaeroh
Copy link
Contributor

I think I've been able to address all of the shortcomings (including the SSA bit), and plan to send my CL stack for review when the tree opens (after dev.typeparams is merged, since recent generic dictionary stuff has modified interface conversion a bit, so I've rebased it there).

@gopherbot
Copy link

Change https://golang.org/cl/342131 mentions this issue: cmd/compile: use staticuint64s for small constants

@gopherbot
Copy link

Change https://golang.org/cl/342129 mentions this issue: cmd/compile: don't emit write barriers for offsets of global addresses

@gopherbot
Copy link

Change https://golang.org/cl/342130 mentions this issue: cmd/compile: treat more address patterns as global

gopherbot pushed a commit that referenced this issue Aug 23, 2021
Currently, write barriers aren't emitted for global addresses, but they
are emitted for addresses offset of global addresses.

This CL changes IsGlobalAddr to recognize offsets of global addresses
as globals too, removing write barriers for staticuint64s based
addresses. The logic added is the same as used in IsStackAddr.

Updates #37612

Change-Id: I537579f85b9ad02987d94f3ee0b4508b90097959
Reviewed-on: https://go-review.googlesource.com/c/go/+/342129
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. help wanted NeedsFix The path to resolution is known, but the work has not been done. Performance Suggested Issues that may be good for new contributors looking for work to do.
Projects
Status: Triage Backlog
Development

No branches or pull requests

8 participants