Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regexp: regexp causes lock contention #24411

Closed
jkohen opened this issue Mar 15, 2018 · 7 comments
Closed

regexp: regexp causes lock contention #24411

jkohen opened this issue Mar 15, 2018 · 7 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. Performance
Milestone

Comments

@jkohen
Copy link
Contributor

jkohen commented Mar 15, 2018

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go1.10

Does this issue reproduce with the latest release?

I'm using the latest release.

What operating system and processor architecture are you using (go env)?

linux/amd64

What did you do?

I'm testing the Prometheus server 2.1 (prometheus.io) under high load (>300k samples/s), and I'm seeing contention within Regexp that prevents scaling further (see goroutines profile below). My investigation shows that this is caused by the lock around the machine objects.

I wrapped the Regexp objects in the Prometheus code with a sync.Pool, and the lock contention went away, allowing me to scale up to 500k samples/s and beyond (I'm in the process of load testing and I haven't found the new ceiling at the time of this post).

I would like to propose changing the Regexp implementation to use sync.Pool instead of a slice and a mutex. This is a very simple change limited to the Regexp.get and Regexp.put methods in https://golang.org/src/regexp/regexp.go. I'm happy to send in the change if this proposal gets accepted.

Redacted goroutine profile:

goroutine profile: total 74003
15383 @ 0x42d7ea 0x42d89e 0x43e364 0x43e07d 0x471238 0x70d7dd 0x70b010 0x711aa4 0x14db96f 0x14db308 0x1560847 0x155dc66 0x1560d7f 0x14e70ac 0x14e36b0 0x14e2421 0x45b581
#	0x43e07c	sync.runtime_SemacquireMutex+0x3c								/usr/lib/google-golang/src/runtime/sema.go:71
#	0x471237	sync.(*Mutex).Lock+0x107									/usr/lib/google-golang/src/sync/mutex.go:134
#	0x70d7dc	regexp.(*Regexp).get+0x3c									/usr/lib/google-golang/src/regexp/regexp.go:211
#	0x70b00f	regexp.(*Regexp).doExecute+0x3f									/usr/lib/google-golang/src/regexp/exec.go:419
#	0x711aa3	regexp.(*Regexp).FindStringSubmatchIndex+0x93							/usr/lib/google-golang/src/regexp/regexp.go:965
#	0x14db96e	github.com/jkohen/prometheus/vendor/github.com/prometheus/prometheus/pkg/relabel.relabel+0x5fe	/usr/local/google/home/jkohen/go/src/github.com/jkohen/prometheus/vendor/github.com/prometheus/prometheus/pkg/relabel/relabel.go:60
#	0x14db307	github.com/jkohen/prometheus/vendor/github.com/prometheus/prometheus/pkg/relabel.Process+0x77	/usr/local/google/home/jkohen/go/src/github.com/jkohen/prometheus/vendor/github.com/prometheus/prometheus/pkg/relabel/relabel.go:33

14426 @ 0x42d7ea 0x42d89e 0x43e364 0x43e07d 0x471238 0x70d8ed 0x70b1bc 0x711aa4 0x14db96f 0x14db308 0x1560847 0x155dc66 0x1560d7f 0x14e70ac 0x14e36b0 0x14e2421 0x45b581
#	0x43e07c	sync.runtime_SemacquireMutex+0x3c								/usr/lib/google-golang/src/runtime/sema.go:71
#	0x471237	sync.(*Mutex).Lock+0x107									/usr/lib/google-golang/src/sync/mutex.go:134
#	0x70d8ec	regexp.(*Regexp).put+0x3c									/usr/lib/google-golang/src/regexp/regexp.go:229
#	0x70b1bb	regexp.(*Regexp).doExecute+0x1eb								/usr/lib/google-golang/src/regexp/exec.go:456
#	0x711aa3	regexp.(*Regexp).FindStringSubmatchIndex+0x93							/usr/lib/google-golang/src/regexp/regexp.go:965
#	0x14db96e	github.com/jkohen/prometheus/vendor/github.com/prometheus/prometheus/pkg/relabel.relabel+0x5fe	/usr/local/google/home/jkohen/go/src/github.com/jkohen/prometheus/vendor/github.com/prometheus/prometheus/pkg/relabel/relabel.go:60
#	0x14db307	github.com/jkohen/prometheus/vendor/github.com/prometheus/prometheus/pkg/relabel.Process+0x77	/usr/local/google/home/jkohen/go/src/github.com/jkohen/prometheus/vendor/github.com/prometheus/prometheus/pkg/relabel/relabel.go:33
@ALTree ALTree changed the title Regexp causes lock contention regexp: regexp causes lock contention Mar 15, 2018
@ALTree ALTree added Performance NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Mar 15, 2018
@agnivade
Copy link
Contributor

Does copying the regexes across multiple goroutines help ?

https://golang.org/pkg/regexp/#Regexp.Copy

When using a Regexp in multiple goroutines, giving each goroutine its own copy helps to avoid lock contention.

@jkohen
Copy link
Contributor Author

jkohen commented Mar 15, 2018

Copying works in that the contention goes away, but adds pressure to the GC and increases memory usage. The code would have to be radically changed to support regexp copies per-goroutine, and Go maintainers have pushed back on cpu-local variables (which would fit the need ideally).

Note that this issue is about improving the performance of the internal Regexp machine cache, not about adding any new features or capabilities.

@jkohen jkohen closed this as completed Mar 15, 2018
@jkohen jkohen reopened this Mar 15, 2018
@bcmills
Copy link
Contributor

bcmills commented Mar 15, 2018

Go maintainers have pushed back on cpu-local variables (which would fit the need ideally).

That's #18802, and it's under active investigation.

But a CPU-local variable wouldn't necessarily work here: if the input is large (or the regexp is very complex), it's entirely possible that the goroutine will be descheduled (and rescheduled on a different CPU/thread) midway through the call.

sync.Pool seems like the right choice to me, if the benchmarks support it.

@gopherbot
Copy link

Change https://golang.org/cl/101715 mentions this issue: regexp: use sync.Pool to cache regexp.machine objects

@andybons andybons added this to the Go1.11 milestone Mar 26, 2018
@andybons andybons added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Mar 26, 2018
@bradfitz bradfitz reopened this Jul 9, 2018
@bradfitz bradfitz modified the milestones: Go1.11, Go1.12 Jul 9, 2018
@bradfitz
Copy link
Contributor

bradfitz commented Jul 9, 2018

@agnivade
Copy link
Contributor

@rsc - Just wanted to check up on this. I believe your regexp related CLs for 1.12 takes care of this ? Or is there still something to be done on this ?

@bradfitz
Copy link
Contributor

Marking fixed from:

https://golang.org/cl/139779
https://golang.org/cl/139780
https://golang.org/cl/139781
https://golang.org/cl/139782
https://golang.org/cl/139783
https://golang.org/cl/139784

That final commit adds the "Deprecated" text:

// Copy returns a new Regexp object copied from re.
// Calling Longest on one copy does not affect another.
//
// Deprecated: In earlier releases, when using a Regexp in multiple goroutines,
// giving each goroutine its own copy helped to avoid lock contention.
// As of Go 1.12, using Copy is no longer necessary to avoid lock contention.
// Copy may still be appropriate if the reason for its use is to make
// two copies with different Longest settings.
func (re *Regexp) Copy() *Regexp {
	re2 := *re
	return &re2
}

@golang golang locked and limited conversation to collaborators Nov 14, 2019
@rsc rsc removed their assignment Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. Performance
Projects
None yet
Development

No branches or pull requests

8 participants