Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

slices: better generated code of slices.Clone? #60499

Closed
cuiweixie opened this issue May 30, 2023 · 2 comments
Closed

slices: better generated code of slices.Clone? #60499

cuiweixie opened this issue May 30, 2023 · 2 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@cuiweixie
Copy link
Contributor

code of slices.Clone in master is:

// Clone returns a copy of the slice.
// The elements are copied using assignment, so this is a shallow clone.
func Clone[S ~[]E, E any](s S) S {
	// Preserve nil in case it matters.
	if s == nil {
		return nil
	}
	return append(S([]E{}), s...)
}

test code:

package main

import (
	"slices"
)

func main() {
	s := []int{1, 2, 3}
	slices.Clone(s)
}

slices.Clone will be inline.
generate asm in amd64:

clone.go:10    0x1055cf6     488d442428    LEAQ 0x28(SP), AX        
clone.go:10       0x1055cfb     bb03000000    MOVL $0x3, BX           
clone.go:10       0x1055d00     31c9         XORL CX, CX             
clone.go:10       0x1055d02     4889df       MOVQ BX, DI             
clone.go:10       0x1055d05     488d3594470000    LEAQ runtime.rodata+18016(SB), SI  
clone.go:10       0x1055d0c     e8afa4feff    CALL runtime.growslice(SB)    
clone.go:10       0x1055d11     488d5c2428    LEAQ 0x28(SP), BX        
clone.go:10       0x1055d16     b918000000    MOVL $0x18, CX          
clone.go:10       0x1055d1b     0f1f440000    NOPL 0(AX)(AX*1)         
clone.go:10       0x1055d20     e8fbdfffff    CALL runtime.memmove(SB)

generate asm in arm64:

clone.go:10    0x100055d0c       9100c3e0      ADD $48, RSP, R0         
clone.go:10       0x100055d10       aa0303e1      MOVD R3, R1             
clone.go:10       0x100055d14       aa1f03e2      MOVD ZR, R2             
clone.go:10       0x100055d18       f00000e4      ADRP 126976(PC), R4          
clone.go:10       0x100055d1c       911a0084      ADD $1664, R4, R4        
clone.go:10       0x100055d20       97ffad14      CALL runtime.growslice(SB)    
clone.go:10       0x100055d24       9100c3e1      ADD $48, RSP, R1         
clone.go:10       0x100055d28       b27d07e2      ORR $24, ZR, R2             
clone.go:10       0x100055d2c       97fff8c5      CALL runtime.memmove(SB)

if we change slices.Clone to:

// Clone returns a copy of the slice.
// The elements are copied using assignment, so this is a shallow clone.
func Clone[S ~[]E, E any](s S) S {
	// Preserve nil in case it matters.
	if s == nil {
		return nil
	}
	return append(S(nil), s...)
}

generate code in amd64:

clone.go:10    0x1055cf6     31c0         XORL AX, AX             
clone.go:10       0x1055cf8     bb03000000    MOVL $0x3, BX           
clone.go:10       0x1055cfd     31c9         XORL CX, CX             
clone.go:10       0x1055cff     4889df       MOVQ BX, DI             
clone.go:10       0x1055d02     488d3597470000    LEAQ runtime.rodata+18016(SB), SI  
clone.go:10       0x1055d09     e8b2a4feff    CALL runtime.growslice(SB)    
clone.go:10       0x1055d0e     488d5c2428    LEAQ 0x28(SP), BX        
clone.go:10       0x1055d13     b918000000    MOVL $0x18, CX          
clone.go:10       0x1055d18     e803e0ffff    CALL runtime.memmove(SB)

generate code in arm64:

clone.go:10    0x100055d0c       aa1f03e0      MOVD ZR, R0             
clone.go:10       0x100055d10       aa0303e1      MOVD R3, R1             
clone.go:10       0x100055d14       aa1f03e2      MOVD ZR, R2             
clone.go:10       0x100055d18       f00000e4      ADRP 126976(PC), R4          
clone.go:10       0x100055d1c       911a0084      ADD $1664, R4, R4        
clone.go:10       0x100055d20       97ffad14      CALL runtime.growslice(SB)    
clone.go:10       0x100055d24       9100c3e1      ADD $48, RSP, R1         
clone.go:10       0x100055d28       b27d07e2      ORR $24, ZR, R2             
clone.go:10       0x100055d2c       97fff8c5      CALL runtime.memmove(SB)

I think

append(S(nil), s...)

is better than

append(S([]E{}), s...)

because, for example in amd64:
the difference is between XORL AX, AX and LEAQ 0x28(SP), AX.
I think XORL is better than LEAQ in here.
Do you agree with this?
If you agree with me, I will send a cl.

@gopherbot
Copy link

Change https://go.dev/cl/499135 mentions this issue: slices: better codegen of Clone

@bcmills bcmills added this to the Backlog milestone May 30, 2023
@bcmills bcmills added the compiler/runtime Issues related to the Go compiler and/or runtime. label May 30, 2023
@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 30, 2023
@mknyszek
Copy link
Contributor

Based on the abandoned CL, it seems like this idea doesn't work. Closing the issue. Please feel free to reopen if I'm wrong.

@mknyszek mknyszek closed this as not planned Won't fix, can't repro, duplicate, stale May 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

4 participants