Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

strings: FieldsFunc speedups #19789

Closed
gaal opened this issue Mar 30, 2017 · 5 comments
Closed

strings: FieldsFunc speedups #19789

gaal opened this issue Mar 30, 2017 · 5 comments

Comments

@gaal
Copy link

gaal commented Mar 30, 2017

go version devel +214be5b302 Sun Mar 26 04:40:20 2017 +0000 linux/amd64

I was looking at profiles and noticed that strings.FieldsFunc was surprisingly hot.

In the following gist, I measure variations on splitting functions, including a proposed improvement of FieldsFunc that's 40% faster.
https://gist.github.com/gaal/497361737dd55125cf2de998b49d948b

Summarizing the results there:

BenchmarkFields/Fields-12                 500000              2327 ns/op          73.04 MB/s
BenchmarkFields/FieldsFuncUnicodeIsSpace-12              1000000              2294 ns/op          74.10 MB/s
BenchmarkFields/FieldsFuncAltUnicodeIsSpace-12           1000000              1394 ns/op         121.94 MB/s
BenchmarkFields/Split-12                                 3000000               543 ns/op         312.93 MB/s

strings.Split does less work than Fields etc., so it's not surprising that it's faster; I include it as a bound of sorts on potential savings.

FieldsFunc's slowness comes in part by it calling its predicate twice as often as it needs to: once to count the length of the output slice, and again to extract the fields. The alternate implementation has the same length in lines of code but instead finds the offsets to the field spans in one pass and copies them in another. It does the spans housekeeping using a preallocated slice that does not escape the function (the parameter 16 was chosen arbitrarily; presumably most clients will not have a large amount of inputs with many fields).

@bradfitz bradfitz changed the title strings.FieldsFunc speedups strings: FieldsFunc speedups Mar 30, 2017
@bradfitz bradfitz added this to the Go1.9Maybe milestone Mar 30, 2017
@dsnet
Copy link
Member

dsnet commented Mar 30, 2017

I have an open CL at https://go-review.googlesource.com/c/33108/. Apologies that I haven't gotten around to it lately.

@dsnet dsnet self-assigned this Mar 30, 2017
@martisch
Copy link
Contributor

martisch commented Mar 30, 2017

Issue #17856 is about speeding up Fields.

I made a cl to improve performance of fields based on dsnet lookup approach in CL 33108:
37959

I experimented with the idea of keeping span indexes but was waiting to discuss and commit the ascii speedup version first that only needs one allocation in general.

See comment in 37959: "Or more complex we could keep e.g. the indices of the first 8 (or more) splits in a local array and use those to do the splits in the ascii case later right away without any new scan. Something that we can also apply to FieldsFunc."

name                        old time/op    new time/op     delta
Fields/ASCII/16-2              262ns ± 0%      115ns ± 1%   -56.26%  (p=0.000 n=19+20)
Fields/ASCII/256-2            3.57µs ± 0%     0.72µs ± 1%   -79.74%  (p=0.000 n=19+19)
Fields/ASCII/4096-2           57.9µs ± 0%     10.3µs ± 0%   -82.29%  (p=0.000 n=19+20)
Fields/ASCII/65536-2           920µs ± 0%      216µs ± 0%   -76.47%  (p=0.000 n=20+20)
Fields/ASCII/1048576-2        14.8ms ± 0%      5.3ms ± 1%   -64.44%  (p=0.000 n=17+20)
Fields/Mixed/16-2              295ns ± 0%      180ns ± 0%   -38.98%  (p=0.000 n=20+16)
Fields/Mixed/256-2            3.80µs ± 0%     1.26µs ± 0%   -66.72%  (p=0.000 n=20+20)
Fields/Mixed/4096-2           65.4µs ± 0%     19.7µs ± 0%   -69.86%  (p=0.000 n=17+19)
Fields/Mixed/65536-2          1.07ms ± 0%     0.38ms ± 0%   -64.06%  (p=0.000 n=20+20)
Fields/Mixed/1048576-2        17.1ms ± 0%      6.2ms ± 0%   -63.62%  (p=0.000 n=20+18)

@gopherbot
Copy link

CL https://golang.org/cl/37959 mentions this issue.

@dsnet dsnet assigned martisch and unassigned dsnet Mar 31, 2017
gopherbot pushed a commit that referenced this issue Apr 4, 2017
- use a string lookup to detect if a single byte is a space character
- determine the exact number of fields for ASCII and
  a possibly underestimated number of fields for non ASCII strings
  by doing a separate byte for byte scan of the input string
  before collecting the fields in an extra pass
- provide a fast path for ASCII only strings when collecting the fields
- avoid utf8.DecodeRuneInString and unicode.IsSpace for ASCII characters

Used golang.org/cl/33108 from Joe Tsai as starting point.

name                      old time/op    new time/op     delta
Fields/ASCII/16              284ns ± 1%      116ns ± 2%   -59.30%  (p=0.000 n=9+10)
Fields/ASCII/256            3.81µs ± 1%     0.80µs ± 1%   -79.10%  (p=0.000 n=10+10)
Fields/ASCII/4096           61.4µs ± 1%     12.3µs ± 1%   -79.96%  (p=0.000 n=10+9)
Fields/ASCII/65536           982µs ± 1%      235µs ± 0%   -76.04%  (p=0.000 n=10+9)
Fields/ASCII/1048576        16.7ms ± 2%      5.4ms ± 1%   -67.52%  (p=0.000 n=10+10)
Fields/Mixed/16              314ns ± 1%      168ns ± 1%   -46.33%  (p=0.000 n=9+10)
Fields/Mixed/256            3.92µs ± 1%     1.17µs ± 1%   -70.19%  (p=0.000 n=10+10)
Fields/Mixed/4096           69.1µs ± 1%     19.0µs ± 1%   -72.53%  (p=0.000 n=10+10)
Fields/Mixed/65536          1.12ms ± 1%     0.39ms ± 0%   -65.37%  (p=0.000 n=10+9)
Fields/Mixed/1048576        19.0ms ± 2%      7.3ms ± 4%   -61.75%  (p=0.000 n=10+9)

name                      old speed      new speed       delta
Fields/ASCII/16           56.3MB/s ± 1%  138.1MB/s ± 2%  +145.31%  (p=0.000 n=9+10)
Fields/ASCII/256          67.1MB/s ± 1%  321.0MB/s ± 1%  +378.26%  (p=0.000 n=10+10)
Fields/ASCII/4096         66.7MB/s ± 1%  333.0MB/s ± 1%  +398.97%  (p=0.000 n=10+9)
Fields/ASCII/65536        66.7MB/s ± 1%  278.4MB/s ± 0%  +317.39%  (p=0.000 n=10+9)
Fields/ASCII/1048576      62.7MB/s ± 2%  192.9MB/s ± 1%  +207.82%  (p=0.000 n=10+10)
Fields/Mixed/16           51.0MB/s ± 2%   94.9MB/s ± 1%   +85.87%  (p=0.000 n=10+10)
Fields/Mixed/256          65.4MB/s ± 1%  219.2MB/s ± 1%  +235.33%  (p=0.000 n=10+10)
Fields/Mixed/4096         59.3MB/s ± 1%  215.7MB/s ± 1%  +263.98%  (p=0.000 n=10+10)
Fields/Mixed/65536        58.6MB/s ± 1%  169.1MB/s ± 0%  +188.73%  (p=0.000 n=10+9)
Fields/Mixed/1048576      55.1MB/s ± 2%  144.0MB/s ± 4%  +161.44%  (p=0.000 n=10+9)

Updates #19789
Updates #17856

Change-Id: If2ce1479542702e9cd65a82a462ba55ac8eb3876
Reviewed-on: https://go-review.googlesource.com/37959
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
@gopherbot
Copy link

CL https://golang.org/cl/39512 mentions this issue.

lparth pushed a commit to lparth/go that referenced this issue Apr 13, 2017
- use a string lookup to detect if a single byte is a space character
- determine the exact number of fields for ASCII and
  a possibly underestimated number of fields for non ASCII strings
  by doing a separate byte for byte scan of the input string
  before collecting the fields in an extra pass
- provide a fast path for ASCII only strings when collecting the fields
- avoid utf8.DecodeRuneInString and unicode.IsSpace for ASCII characters

Used golang.org/cl/33108 from Joe Tsai as starting point.

name                      old time/op    new time/op     delta
Fields/ASCII/16              284ns ± 1%      116ns ± 2%   -59.30%  (p=0.000 n=9+10)
Fields/ASCII/256            3.81µs ± 1%     0.80µs ± 1%   -79.10%  (p=0.000 n=10+10)
Fields/ASCII/4096           61.4µs ± 1%     12.3µs ± 1%   -79.96%  (p=0.000 n=10+9)
Fields/ASCII/65536           982µs ± 1%      235µs ± 0%   -76.04%  (p=0.000 n=10+9)
Fields/ASCII/1048576        16.7ms ± 2%      5.4ms ± 1%   -67.52%  (p=0.000 n=10+10)
Fields/Mixed/16              314ns ± 1%      168ns ± 1%   -46.33%  (p=0.000 n=9+10)
Fields/Mixed/256            3.92µs ± 1%     1.17µs ± 1%   -70.19%  (p=0.000 n=10+10)
Fields/Mixed/4096           69.1µs ± 1%     19.0µs ± 1%   -72.53%  (p=0.000 n=10+10)
Fields/Mixed/65536          1.12ms ± 1%     0.39ms ± 0%   -65.37%  (p=0.000 n=10+9)
Fields/Mixed/1048576        19.0ms ± 2%      7.3ms ± 4%   -61.75%  (p=0.000 n=10+9)

name                      old speed      new speed       delta
Fields/ASCII/16           56.3MB/s ± 1%  138.1MB/s ± 2%  +145.31%  (p=0.000 n=9+10)
Fields/ASCII/256          67.1MB/s ± 1%  321.0MB/s ± 1%  +378.26%  (p=0.000 n=10+10)
Fields/ASCII/4096         66.7MB/s ± 1%  333.0MB/s ± 1%  +398.97%  (p=0.000 n=10+9)
Fields/ASCII/65536        66.7MB/s ± 1%  278.4MB/s ± 0%  +317.39%  (p=0.000 n=10+9)
Fields/ASCII/1048576      62.7MB/s ± 2%  192.9MB/s ± 1%  +207.82%  (p=0.000 n=10+10)
Fields/Mixed/16           51.0MB/s ± 2%   94.9MB/s ± 1%   +85.87%  (p=0.000 n=10+10)
Fields/Mixed/256          65.4MB/s ± 1%  219.2MB/s ± 1%  +235.33%  (p=0.000 n=10+10)
Fields/Mixed/4096         59.3MB/s ± 1%  215.7MB/s ± 1%  +263.98%  (p=0.000 n=10+10)
Fields/Mixed/65536        58.6MB/s ± 1%  169.1MB/s ± 0%  +188.73%  (p=0.000 n=10+9)
Fields/Mixed/1048576      55.1MB/s ± 2%  144.0MB/s ± 4%  +161.44%  (p=0.000 n=10+9)

Updates golang#19789
Updates golang#17856

Change-Id: If2ce1479542702e9cd65a82a462ba55ac8eb3876
Reviewed-on: https://go-review.googlesource.com/37959
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
@gopherbot
Copy link

CL https://golang.org/cl/42810 mentions this issue.

@bradfitz bradfitz modified the milestones: Go1.10, Go1.9Maybe May 24, 2017
@golang golang locked and limited conversation to collaborators Aug 13, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants