Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/cgo, runtime: optimize C.GoString #23830

Closed
josharian opened this issue Feb 14, 2018 · 11 comments
Closed

cmd/cgo, runtime: optimize C.GoString #23830

josharian opened this issue Feb 14, 2018 · 11 comments

Comments

@josharian
Copy link
Contributor

C.GoString ends up being translated into a call to runtime.gostring, which calls findnull, which does a byte-at-a-time search for the terminating NUL.

A quick hack to use bytes.IndexByte to find the NUL shows speed-ups of almost 40%. We cant use this directly, though, because of the danger of a vectorized read past the end of the string faulting.

Perhaps we could call strlen on the C side (which is presumably well-optimized) and pass the length in to the runtime? Or write a safe (length-oblivious, aligned-load-only) IndexByte for use here?

cc @ianlancetaylor @randall77

@ianlancetaylor
Copy link
Contributor

I'm not sure I understand the concern about a fault from a vectorized read past the end of the string. Seems like that could happen in Go just as easily as in C.

@bradfitz
Copy link
Contributor

@ianlancetaylor, because bytes.IndexByte starts with a slice (which contains a length), the assembly for that knows it's safe to do wide loads of many bytes up to that length only (or capacity, at least). C.GoString only takes a *byte but no length, so what if that byte pointer were 2-3 bytes before the end of mapped virtual memory space? If we invented some bogus slice header to give to bytes.IndesByte for GoString, we could be inviting IndexByte to fault on a bogus address.

That's how I read it, at least.

@ianlancetaylor
Copy link
Contributor

Oh, yeah, OK. For C strings we need to do it slightly differently. Since vector instructions are always aligned, we are fine using vectors as long as we haven't found a zero yet, but we can't read ahead. We could write special purpose code for this, but I doubt it's worth it. We could call a function in runtime/cgo that calls strlen, but we would need a custom asm wrapper to put the arguments in the right place. Is this common enough to make this worth it?

@josharian
Copy link
Contributor Author

A quick GitHub search says yes, it is used a lot. Whether it is in performance-critical paths, or whether the cost is worth it, I don't know. I'll move to Unplanned, unless/until someone comes forward and says they need it or someone is inspired to dig deeper and implement themselves.

@josharian josharian modified the milestones: Go1.11, Unplanned Feb 15, 2018
@TocarIP
Copy link
Contributor

TocarIP commented Feb 16, 2018

I think using indexbyte with fake slice is safe, if all access is limited to 1 page, so doing something like:

p =  (*slice)(unsafe.Pointer(s))
p.len = int(uintptr(unsafe.Pointer(s)) + uintptr(4093)))/4096*4096 // round up to next pagesize
indexbyte(p,0)
...
for { // page at a time loop
p += 4096*i
p.len = 4096
indexbyte(p,0)
}

@bcmills
Copy link
Contributor

bcmills commented Feb 16, 2018

@TocarIP That requires fairly strong assumptions about the optimizations present in the rest of the Go compiler and runtime. In general we have no guarantee there isn't some other thread writing to the address just after the string, and the Go memory model does not define behavior in the presence of data races, let alone data races that cross language boundaries.

We could, of course, check the rest of the runtime code to ensure that such races are benign, but it seems very difficult to write a test that would reliably detect violations due to future changes.

@TocarIP
Copy link
Contributor

TocarIP commented Feb 26, 2018

I'm not sure I understand what extra assumptions this introduces. IndexByte (at least asm version on amd64) already does out-of-bounds read, as long as they don't cross page for normal go strings.

@randall77
Copy link
Contributor

I think Ilya's strategy should be fine. IndexByte does not depend on data values past the first \0.
Reading data you don't end up looking at doesn't cause a problem on any architecture I know of (caveat: I/O mapped memory, but I'm pretty sure that's not at issue here). The only problem is making sure we don't accidentally read from unmapped memory, which Ilya's code solves.

@gopherbot
Copy link

Change https://golang.org/cl/97523 mentions this issue: runtime: use bytes.IndexByte in findnull

@bradfitz
Copy link
Contributor

bradfitz commented Mar 1, 2018

@bradfitz bradfitz reopened this Mar 1, 2018
@gopherbot
Copy link

Change https://golang.org/cl/98015 mentions this issue: runtime: use bytes.IndexByte in findnull

@golang golang locked and limited conversation to collaborators Mar 9, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants