cmd/compile: Use SDIV and UDIV for ARM #19118

Closed
benshi001 opened this issue Feb 16, 2017 · 15 comments

Comments

@benshi001
Member

benshi001 commented Feb 16, 2017

The UDIV and SDIV instructions are optional on ARM, so GCC for ARM by default compiles a/b into a call to the EABI helper __aeabi_uidiv(a, b). It emits "udiv a, b" directly only when -march=armv7ve is specified.

Go should also let the user choose between hardware and software division, perhaps through a GOARMHDIV environment variable?

@minux
Member

minux commented Feb 16, 2017 via email

@benshi001
Member Author

A hardware divider is usually much faster than a software one. However, there is no proper way for the div routine to decide at run time:

  1. An ARM register indicates whether a hardware divider is present, but it is only accessible at PL1, while normal Linux programs run at PL0.
  2. A PL0 program can read /proc/cpuinfo and look for the "idiva idivt" flags, but a div routine should not involve a file operation.

@benshi001
Member Author

There is no proper way for the div routine to decide at run time, unless the user specifies it explicitly.

@randall77
Contributor

It is easy enough to check /proc/cpuinfo once on startup and cache the result.
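
A minimal sketch of that idea, written as ordinary user-level Go rather than runtime code (the real runtime cannot import these packages). The names hasCPUFeature and hardwareDivide are hypothetical, and the "Features" line format is assumed to match what 32-bit ARM Linux kernels print:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
	"sync"
)

// hasCPUFeature reports whether any "Features" line in the given
// /proc/cpuinfo text lists flag as a whole word.
func hasCPUFeature(cpuinfo, flag string) bool {
	sc := bufio.NewScanner(strings.NewReader(cpuinfo))
	for sc.Scan() {
		key, val, ok := strings.Cut(sc.Text(), ":")
		if !ok || strings.TrimSpace(key) != "Features" {
			continue
		}
		for _, f := range strings.Fields(val) {
			if f == flag {
				return true
			}
		}
	}
	return false
}

var (
	hwDivOnce sync.Once
	hwDiv     bool
)

// hardwareDivide reads /proc/cpuinfo at most once and caches the answer,
// so later callers never touch the file system.
func hardwareDivide() bool {
	hwDivOnce.Do(func() {
		data, err := os.ReadFile("/proc/cpuinfo")
		if err != nil {
			return // assume no hardware divider if the file is unreadable
		}
		hwDiv = hasCPUFeature(string(data), "idiva")
	})
	return hwDiv
}

func main() {
	fmt.Println("hardware divide (idiva):", hardwareDivide())
}
```

Only the first call pays for the file read; every later call is a plain load of the cached flag, which addresses the concern about doing file operations inside the div routine.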

@benshi001
Member Author

Checking /proc/cpuinfo for the "idiva" flag on startup might be a way. What do the core developers think?

@cherrymui
Member

Checking on startup sounds good to me.

@minux
Member

minux commented Feb 16, 2017 via email

@benshi001
Member Author

I would vote for calling getauxval() with AT_HWCAP at startup.

@davecheney
Contributor

davecheney commented Feb 16, 2017 via email

@josharian josharian changed the title Use SDIV and UDIV for ARM cmd/compile: Use SDIV and UDIV for ARM Feb 17, 2017
@benshi001
Member Author

I have implemented this feature.
https://go-review.googlesource.com/#/c/37496/

A rough test shows the performance improves 40-50%.

@benshi001
Member Author

benshi001 commented Feb 27, 2017

Here is a rough test case:

package main

import (
	"fmt"
	"math/rand"
	"time"
)

func main() {
	var c, g int
	// Dividends, divisors and quotients.
	var a [70000000]uint32
	var b [70000000]uint32
	var d [70000000]uint32

	r := rand.New(rand.NewSource(time.Now().UnixNano()))

	// Fill the inputs with random values.
	for c = 0; c < len(b); c++ {
		a[c] = uint32(r.Intn(0x7ffffff0))
		// The +1 keeps every divisor non-zero, so the loop below cannot panic.
		b[c] = uint32(r.Intn(0x3ffffff0)) + 1
	}

	// Time ten passes of 70 million unsigned divisions each.
	for g = 0; g < 10; g++ {
		k := time.Now()
		for c = 0; c < len(b); c++ {
			d[c] = a[c] / b[c]
		}
		w := time.Now()
		fmt.Println(w.Sub(k))
	}
}

With the hardware divider, the output is
5.4371964s
4.209324055s
4.205575531s
4.205909284s
4.205245892s
4.218614714s
4.210164271s
4.205585791s
4.205325164s
4.207941386s

With the software divider, the output is
8.238922377s
5.943205893s
5.911372837s
5.914151872s
5.909159482s
5.933686273s
5.909419953s
5.90930063s
5.913579159s
5.909221097s
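
As a rough cross-check of the 40-50% figure, the following sketch averages the two sets of timings above and computes their ratio. Dropping the first run of each set as warm-up is an assumption of this sketch, not something stated in the thread, and the helper names are hypothetical. On these numbers the ratio comes out at roughly 1.4, that is, the hardware path does about 40% more divisions per second.

```go
package main

import "fmt"

// Timings in seconds, copied from the two result lists above,
// with the first (warm-up) run of each list left out.
var hardware = []float64{
	4.209324055, 4.205575531, 4.205909284, 4.205245892, 4.218614714,
	4.210164271, 4.205585791, 4.205325164, 4.207941386,
}

var software = []float64{
	5.943205893, 5.911372837, 5.914151872, 5.909159482, 5.933686273,
	5.909419953, 5.90930063, 5.913579159, 5.909221097,
}

// mean returns the arithmetic mean of xs.
func mean(xs []float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// speedup returns how many times faster the hardware runs are on average.
func speedup() float64 {
	return mean(software) / mean(hardware)
}

func main() {
	s := speedup()
	fmt.Printf("mean hardware: %.3fs\n", mean(hardware))
	fmt.Printf("mean software: %.3fs\n", mean(software))
	fmt.Printf("speedup: %.2fx (run time reduced by %.0f%%)\n", s, (1-1/s)*100)
}
```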

@gopherbot

CL https://golang.org/cl/37496 mentions this issue.

@bradfitz bradfitz added this to the Go1.9 milestone Mar 21, 2017
@benshi001
Member Author

How should we proceed with CL 37496? Should we keep the simulated DIV/DIVU/MOD/MODU, or remove them?

@benshi001
Member Author

In patch set 14 of CL 37496, I have

  1. rebased onto the newest master branch
  2. kept the old DIV/DIVU/MOD/MODU and added DIVHW/DIVUHW

@benshi001
Member Author

Is there any conclusion on this issue? Should we keep the simulated div/mod or not?

lparth pushed a commit to lparth/go that referenced this issue Apr 13, 2017
The hardware divider is an optional component of ARMv7. This patch
detects at run time whether it is available and uses it if so.

1. The hardware divider is detected at startup and a flag is set or
   cleared according to a particular bit of runtime.hwcap.
2. Each call of runtime.udiv checks this flag and decides whether to
   use the hardware division instruction.

A rough test shows the performance improves 40-50% for ARMv7. And
the compatibility of ARMv5/v6 is not broken.

fixes golang#19118

Change-Id: Ic586bc9659ebc169553ca2004d2bdb721df823ac
Reviewed-on: https://go-review.googlesource.com/37496
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
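
The two steps described in the commit message can be sketched in plain Go as follows. This is only an illustration of the dispatch pattern, not the actual runtime code, which is written in assembly. The names hardDiv, softUdiv and udiv are hypothetical, the software path is a textbook shift-and-subtract loop chosen for brevity, and Go's native `/` stands in for the UDIV instruction:

```go
package main

import "fmt"

// hardDiv stands in for the flag that would be set once at startup
// from the relevant hwcap bit (assumption of this sketch).
var hardDiv bool

// softUdiv divides n by d (d != 0) with a shift-and-subtract loop,
// producing one quotient bit per iteration.
func softUdiv(n, d uint32) (q, r uint32) {
	var rem uint64 // 64 bits so that the left shift cannot overflow
	for i := 31; i >= 0; i-- {
		rem = rem<<1 | uint64(n>>uint(i)&1)
		if rem >= uint64(d) {
			rem -= uint64(d)
			q |= 1 << uint(i)
		}
	}
	return q, uint32(rem)
}

// udiv checks the cached flag on every call and picks the hardware
// or the software path accordingly.
func udiv(n, d uint32) uint32 {
	if d == 0 {
		panic("integer divide by zero")
	}
	if hardDiv {
		return n / d // stands in for the UDIV instruction
	}
	q, _ := softUdiv(n, d)
	return q
}

func main() {
	for _, hw := range []bool{false, true} {
		hardDiv = hw
		fmt.Println("hardDiv =", hw, "1000000/7 =", udiv(1000000, 7))
	}
}
```

Because the flag is read on every call, one binary can run on cores with and without the divider, which is how compatibility with ARMv5/v6 can be kept.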
@golang golang locked and limited conversation to collaborators Apr 11, 2018