cmd/cgo, runtime: optimize C.GoString #23830

josharian · 2018-02-14T18:38:40Z

C.GoString ends up being translated into a call to runtime.gostring, which calls findnull, which does a byte-at-a-time search for the terminating NUL.

A quick hack to use bytes.IndexByte to find the NUL shows speed-ups of almost 40%. We cant use this directly, though, because of the danger of a vectorized read past the end of the string faulting.

Perhaps we could call strlen on the C side (which is presumably well-optimized) and pass the length in to the runtime? Or write a safe (length-oblivious, aligned-load-only) IndexByte for use here?

cc @ianlancetaylor @randall77

ianlancetaylor · 2018-02-14T18:50:13Z

I'm not sure I understand the concern about a fault from a vectorized read past the end of the string. Seems like that could happen in Go just as easily as in C.

bradfitz · 2018-02-14T18:54:47Z

@ianlancetaylor, because bytes.IndexByte starts with a slice (which contains a length), the assembly for that knows it's safe to do wide loads of many bytes up to that length only (or capacity, at least). C.GoString only takes a *byte but no length, so what if that byte pointer were 2-3 bytes before the end of mapped virtual memory space? If we invented some bogus slice header to give to bytes.IndesByte for GoString, we could be inviting IndexByte to fault on a bogus address.

That's how I read it, at least.

ianlancetaylor · 2018-02-15T00:53:53Z

Oh, yeah, OK. For C strings we need to do it slightly differently. Since vector instructions are always aligned, we are fine using vectors as long as we haven't found a zero yet, but we can't read ahead. We could write special purpose code for this, but I doubt it's worth it. We could call a function in runtime/cgo that calls strlen, but we would need a custom asm wrapper to put the arguments in the right place. Is this common enough to make this worth it?

josharian · 2018-02-15T01:57:29Z

A quick GitHub search says yes, it is used a lot. Whether it is in performance-critical paths, or whether the cost is worth it, I don't know. I'll move to Unplanned, unless/until someone comes forward and says they need it or someone is inspired to dig deeper and implement themselves.

TocarIP · 2018-02-16T20:41:51Z

I think using indexbyte with fake slice is safe, if all access is limited to 1 page, so doing something like:

p =  (*slice)(unsafe.Pointer(s))
p.len = int(uintptr(unsafe.Pointer(s)) + uintptr(4093)))/4096*4096 // round up to next pagesize
indexbyte(p,0)
...
for { // page at a time loop
p += 4096*i
p.len = 4096
indexbyte(p,0)
}

bcmills · 2018-02-16T20:50:52Z

@TocarIP That requires fairly strong assumptions about the optimizations present in the rest of the Go compiler and runtime. In general we have no guarantee there isn't some other thread writing to the address just after the string, and the Go memory model does not define behavior in the presence of data races, let alone data races that cross language boundaries.

We could, of course, check the rest of the runtime code to ensure that such races are benign, but it seems very difficult to write a test that would reliably detect violations due to future changes.

TocarIP · 2018-02-26T20:08:19Z

I'm not sure I understand what extra assumptions this introduces. IndexByte (at least asm version on amd64) already does out-of-bounds read, as long as they don't cross page for normal go strings.

randall77 · 2018-02-26T20:43:56Z

I think Ilya's strategy should be fine. IndexByte does not depend on data values past the first \0.
Reading data you don't end up looking at doesn't cause a problem on any architecture I know of (caveat: I/O mapped memory, but I'm pretty sure that's not at issue here). The only problem is making sure we don't accidentally read from unmapped memory, which Ilya's code solves.

gopherbot · 2018-02-28T02:24:37Z

Change https://golang.org/cl/97523 mentions this issue: runtime: use bytes.IndexByte in findnull

bradfitz · 2018-03-01T22:23:38Z

Reverted in https://go-review.googlesource.com/c/go/+/97995

/cc @TocarIP

gopherbot · 2018-03-01T23:06:36Z

Change https://golang.org/cl/98015 mentions this issue: runtime: use bytes.IndexByte in findnull

josharian added Performance help wanted labels Feb 14, 2018

josharian added this to the Go1.11 milestone Feb 14, 2018

josharian modified the milestones: Go1.11, Unplanned Feb 15, 2018

gopherbot closed this as completed in 7365fac Mar 1, 2018

bradfitz reopened this Mar 1, 2018

gopherbot closed this as completed in 91102bf Mar 9, 2018

golang locked and limited conversation to collaborators Mar 9, 2019

gopherbot added the FrozenDueToAge label Mar 9, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cmd/cgo, runtime: optimize C.GoString #23830

cmd/cgo, runtime: optimize C.GoString #23830

josharian commented Feb 14, 2018

ianlancetaylor commented Feb 14, 2018

bradfitz commented Feb 14, 2018

ianlancetaylor commented Feb 15, 2018

josharian commented Feb 15, 2018

TocarIP commented Feb 16, 2018

bcmills commented Feb 16, 2018 •

edited

Loading

TocarIP commented Feb 26, 2018

randall77 commented Feb 26, 2018

gopherbot commented Feb 28, 2018

bradfitz commented Mar 1, 2018

gopherbot commented Mar 1, 2018

cmd/cgo, runtime: optimize C.GoString #23830

cmd/cgo, runtime: optimize C.GoString #23830

Comments

josharian commented Feb 14, 2018

ianlancetaylor commented Feb 14, 2018

bradfitz commented Feb 14, 2018

ianlancetaylor commented Feb 15, 2018

josharian commented Feb 15, 2018

TocarIP commented Feb 16, 2018

bcmills commented Feb 16, 2018 • edited Loading

TocarIP commented Feb 26, 2018

randall77 commented Feb 26, 2018

gopherbot commented Feb 28, 2018

bradfitz commented Mar 1, 2018

gopherbot commented Mar 1, 2018

bcmills commented Feb 16, 2018 •

edited

Loading