proposal: bytes: Introduce a FindFirstMultiByteChar API #34375

Closed
alex opened this issue Sep 18, 2019 · 3 comments

Comments

@alex
Contributor

alex commented Sep 18, 2019

A relatively common pattern in high-performance code that handles UTF-8 strings is an optimized path for input that consists entirely of single-byte (ASCII) runes.

Generally to accomplish that, you end up with an API that looks like FindFirstMultiByteChar([]byte) int, for example: https://github.com/ianlopshire/go-fixedwidth/blob/master/decode.go#L166-L175
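As a rough illustration of the kind of helper described above, here is a minimal byte-at-a-time sketch. The name FindFirstMultiByteChar and its signature come from the proposal; the choice to return -1 for all-ASCII input is an assumption made for this example, not something the proposal specifies.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// FindFirstMultiByteChar returns the index of the first byte that does not
// encode a single-byte (ASCII) rune. Returning -1 when every byte is ASCII
// is an assumption of this sketch.
func FindFirstMultiByteChar(b []byte) int {
	for i, c := range b {
		// utf8.RuneSelf is 0x80: bytes below it are complete one-byte runes.
		if c >= utf8.RuneSelf {
			return i
		}
	}
	return -1
}

func main() {
	fmt.Println(FindFirstMultiByteChar([]byte("hello"))) // -1
	fmt.Println(FindFirstMultiByteChar([]byte("héllo"))) // 1
}
```

A caller could take its ASCII-only fast path when the result is -1 and fall back to rune-by-rune decoding from the returned index otherwise.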

Having such an API in the Go standard library would be helpful on the basis of utility alone. However, this function also lends itself to a high-performance vectorized implementation: in local tests, simply unsafely casting the input to a []uintptr produces roughly linear speedups (presumably even greater speedups are available to brave souls willing to write AVX2 instructions).
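The word-at-a-time idea can be sketched without unsafe. This is not the code from the local tests mentioned above: it swaps the []uintptr cast for 8-byte loads via encoding/binary, and the function name findFirstMultiByteWord and the -1 return for all-ASCII input are assumptions made for this example. No performance claim is implied.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// highBits has the top bit set in each of the 8 bytes of a 64-bit word.
const highBits = 0x8080808080808080

// findFirstMultiByteWord (hypothetical name) scans 8 bytes per iteration and
// returns the index of the first byte >= 0x80, or -1 if the input is all ASCII.
func findFirstMultiByteWord(b []byte) int {
	i := 0
	for ; i+8 <= len(b); i += 8 {
		// Little-endian load: b[i] lands in the least significant byte.
		w := binary.LittleEndian.Uint64(b[i:])
		if m := w & highBits; m != 0 {
			// The lowest set bit sits in the first non-ASCII byte of the word.
			return i + bits.TrailingZeros64(m)/8
		}
	}
	// Check the remaining tail (fewer than 8 bytes) one byte at a time.
	for ; i < len(b); i++ {
		if b[i] >= 0x80 {
			return i
		}
	}
	return -1
}

func main() {
	fmt.Println(findFirstMultiByteWord([]byte("abcédefgh")))        // 3
	fmt.Println(findFirstMultiByteWord([]byte("abcdefghijé")))      // 10
	fmt.Println(findFirstMultiByteWord([]byte("plain ascii text"))) // -1
}
```

Masking with 0x80 in every byte lane tests eight bytes with one AND, which is the same trick a wider SIMD implementation would apply to larger registers.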

Because this operation is relatively common, and because the stdlib can offer a more optimized implementation than users are likely to write on their own, I think it would be beneficial to include it in the standard library. If there's interest, I'm happy to provide a patch.

@titanous titanous changed the title bytes: Introduce a FindFirstMultiByteChar API proposal: bytes: Introduce a FindFirstMultiByteChar API Sep 18, 2019
@gopherbot gopherbot added this to the Proposal milestone Sep 18, 2019
@rsc rsc added this to Incoming in Proposals (old) Dec 4, 2019
@rsc
Contributor

rsc commented Dec 11, 2019

I could see this being in utf8.LeadingASCIICount or something like that, maybe under a better name, but only if it were commonly needed and straightforward to use correctly. I am not sure whether either of those is true. Do you have data about either of those, or even anecdotes about when it would be used?

In general "we know how to implement this function very quickly" is not enough for inclusion in the standard library.

@rsc rsc moved this from Incoming to Active in Proposals (old) Dec 11, 2019
@rsc
Contributor

rsc commented Jan 8, 2020

Based on the discussion above, and in particular the lack of compelling use cases, this seems like a likely decline.

Leaving open for a week for final comments.

@rsc rsc moved this from Active to Likely Decline in Proposals (old) Jan 8, 2020
@rsc
Contributor

rsc commented Jan 15, 2020

No change in consensus, so declining.

@rsc rsc closed this as completed Jan 15, 2020
@rsc rsc moved this from Likely Decline to Declined in Proposals (old) Jan 15, 2020
@golang golang locked and limited conversation to collaborators Jan 14, 2021