Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

strings: genSplit improvement #18973

Closed
dongweigogo opened this issue Feb 7, 2017 · 7 comments
Closed

strings: genSplit improvement #18973

dongweigogo opened this issue Feb 7, 2017 · 7 comments

Comments

@dongweigogo
Copy link

dongweigogo commented Feb 7, 2017

#go1.7, linux, amd64

The genSplit function used by Split, SplitAfter and SplitAfterN is simplified in the IF statement of sep length, maybe for code expression. However it results in extra conditional executions in either case of len(sep)==1 or len(sep)>1

if s[i] == c && (len(sep) == 1 || s[i:i+len(sep)] == sep) {
...
}

Just seperating that into two parts according to sep length gains ~15% improvement in performace (15% time reduction) in either case of len(sep)==1 or len(sep)>1.

func genSplit(s, sep string, sepSave, n int) []string {

if n == 0 {
    return nil
}
if sep == "" {
    return explode(s, n)
}
if n < 0 {
    n = strings.Count(s, sep) + 1
}
c := sep[0]
start := 0
a := make([]string, n)
na := 0
if len(sep)==1 {
    for i := 0; i < len(s) && na+1 < n; i++ {
        if s[i] == c {
            a[na] = s[start : i+sepSave]
                na++
                start = i + 1
            }
        }
} else {
    for i := 0; i+len(sep) <= len(s) && na+1 < n; i++ {
        if s[i] == c && s[i:i+len(sep)] == sep {
            a[na] = s[start : i+sepSave]
            na++
            start = i + len(sep)
            i += len(sep) - 1
        }
    }
}
a[na] = s[start:]
return a[0 : na+1]

}

@valyala
Copy link
Contributor

valyala commented Feb 7, 2017

I'm looking into this right now. The CL will be ready soon

@valyala
Copy link
Contributor

valyala commented Feb 7, 2017

The CL

@gopherbot
Copy link

CL https://golang.org/cl/36510 mentions this issue.

@valyala
Copy link
Contributor

valyala commented Feb 7, 2017

I think the same optimization may be applied to bytes.Count* functions. Will give a try

@valyala
Copy link
Contributor

valyala commented Feb 7, 2017

Applied the same optimizations to bytes.Count at https://go-review.googlesource.com/36510 .

@dongweigogo
Copy link
Author

dongweigogo commented Feb 7, 2017

@valyala

Definitely, this can be applied to both Count* and Split* functions in both bytes and strings packages.

@ALTree ALTree changed the title strings.genSplit improvement strings: genSplit improvement Feb 7, 2017
@dongweigogo
Copy link
Author

dongweigogo commented Feb 7, 2017

@valyala,

In my benchmark results, the performance of your modified genSplit version does not improve as much as simply separating the code (as I post). Is there explaination for that?

@golang golang locked and limited conversation to collaborators Feb 8, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants