regexp: support case-insensitive prefix strings #48955

Open
bboreham opened this issue Oct 13, 2021 · 2 comments
Labels: NeedsInvestigation, Performance

@bboreham
Contributor

For a pattern like foo.*bar, the regexp compiler will extract the string "foo" and search for it using strings.Index before trying anything else.
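
To see the behaviour described above from the outside, the exported `Regexp.LiteralPrefix` method reports the literal string a match must begin with. The sketch below is only an illustration: the helper names `literalPrefix` and `firstCandidate` are hypothetical, and the `strings.Index` call mimics the prefix search rather than reproducing the regexp package's internal code.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// literalPrefix (hypothetical helper) returns the literal string that
// every match of pattern must begin with, as reported by the regexp package.
func literalPrefix(pattern string) string {
	prefix, _ := regexp.MustCompile(pattern).LiteralPrefix()
	return prefix
}

// firstCandidate (hypothetical helper) mimics the fast path: jump to the
// first occurrence of the literal prefix before running the matcher.
func firstCandidate(text, prefix string) int {
	return strings.Index(text, prefix)
}

func main() {
	text := "xxxx foo and then bar"

	p := literalPrefix(`foo.*bar`)
	fmt.Printf("case-sensitive prefix: %q, first candidate at %d\n", p, firstCandidate(text, p))

	// With (?i) the prefix is what this issue is about; the value printed
	// here depends on the Go version, so no output is assumed.
	fmt.Printf("case-insensitive prefix: %q\n", literalPrefix(`(?i)foo.*bar`))
}
```
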

I propose that this be extended to the case-insensitive version (?i)foo.*bar.

I did a trial implementation just calling strings.EqualFold repeatedly, and it is much faster:

name                old time/op    new time/op     delta
Match/Easy0i/16-8     2.99ns ± 1%     2.91ns ± 1%    -2.74%  (p=0.008 n=5+5)
Match/Easy0i/32-8      589ns ± 1%       56ns ± 0%   -90.51%  (p=0.008 n=5+5)
Match/Easy0i/1K-8     17.3µs ± 2%      7.5µs ± 5%   -56.52%  (p=0.008 n=5+5)
Match/Easy0i/32K-8     693µs ± 0%      259µs ± 0%   -62.61%  (p=0.008 n=5+5)
Match/Easy0i/1M-8     22.8ms ± 6%      8.4ms ± 1%   -63.28%  (p=0.008 n=5+5)
Match/Easy0i/32M-8     714ms ± 1%      269ms ± 0%   -62.32%  (p=0.008 n=5+5)

name                old speed      new speed       delta
Match/Easy0i/16-8   5.35GB/s ± 1%   5.50GB/s ± 1%    +2.82%  (p=0.008 n=5+5)
Match/Easy0i/32-8   54.3MB/s ± 1%  572.6MB/s ± 0%  +954.04%  (p=0.008 n=5+5)
Match/Easy0i/1K-8   59.1MB/s ± 2%  135.9MB/s ± 5%  +130.11%  (p=0.008 n=5+5)
Match/Easy0i/32K-8  47.3MB/s ± 0%  126.5MB/s ± 0%  +167.44%  (p=0.008 n=5+5)
Match/Easy0i/1M-8   46.0MB/s ± 6%  125.0MB/s ± 1%  +171.98%  (p=0.008 n=5+5)
Match/Easy0i/32M-8  47.0MB/s ± 1%  124.8MB/s ± 0%  +165.39%  (p=0.008 n=5+5)
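
One possible reading of "calling strings.EqualFold repeatedly" is a sliding window over the input, sketched below. This is not the trial implementation from the issue or from the linked CL; `indexFold` is a hypothetical name, and the sketch assumes both the text and the prefix are ASCII, so that a match has the same byte length as the prefix. That assumption does not hold in general: under Unicode simple folding, for example, the Kelvin sign (U+212A) folds to `k` but occupies three bytes.

```go
package main

import (
	"fmt"
	"strings"
)

// indexFold (hypothetical) returns the byte index of the first
// case-insensitive occurrence of prefix in s, or -1 if there is none.
// ASSUMPTION: s and prefix are ASCII, so a match is exactly len(prefix) bytes.
func indexFold(s, prefix string) int {
	n := len(prefix)
	for i := 0; i+n <= len(s); i++ {
		// Compare the n-byte window starting at i under case folding.
		if strings.EqualFold(s[i:i+n], prefix) {
			return i
		}
	}
	return -1
}

func main() {
	fmt.Println(indexFold("xxFoObar", "foo"))
	fmt.Println(indexFold("xxbar", "foo"))
}
```

A production version would also need to handle non-ASCII folds and would probably skip ahead faster than one byte at a time.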

@toothrot added the NeedsInvestigation label on Oct 15, 2021
@toothrot added this to the Backlog milestone on Oct 15, 2021

@seankhliao
Member

feel free to send a CL / PR with benchmarks

@gopherbot

Change https://golang.org/cl/358756 mentions this issue: regexp: handle prefix string with fold-case
