
proposal: x/sync/semaphore: make semaphore resizable #29721

Closed
sherifabdlnaby opened this issue Jan 13, 2019 · 23 comments

Comments

@sherifabdlnaby

Hello,

Semaphores are often used to bound concurrency. For example, I was implementing a goroutine-pool pattern with weighted jobs, where a weighted semaphore is a better fit and more performant than a channel-based semaphore. A common feature of goroutine pools is being resizable, but the current x/sync/semaphore doesn't allow resizing the semaphore.

There are other non-channel-based semaphore implementations that support resizing, but I found several bugs/deadlocks using them, and they are all less performant than x/sync/semaphore.

The current implementation of x/sync/semaphore can easily be extended to be resizable without affecting performance.
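For context, a minimal sketch of the pattern I mean, bounding a pool of weighted jobs with x/sync/semaphore (the job weights and the limit of 8 are made up for illustration):

package main

import (
	"context"
	"fmt"

	"golang.org/x/sync/semaphore"
)

func main() {
	ctx := context.Background()
	sem := semaphore.NewWeighted(8) // total weight allowed to run concurrently

	jobs := []int64{1, 2, 3, 5, 2} // each job's weight
	for _, w := range jobs {
		// Blocks until enough weight is free (or ctx is cancelled).
		if err := sem.Acquire(ctx, w); err != nil {
			break
		}
		go func(w int64) {
			defer sem.Release(w)
			fmt.Println("running job of weight", w)
		}(w)
	}

	// Wait for all workers by draining the full capacity.
	if err := sem.Acquire(ctx, 8); err == nil {
		sem.Release(8)
	}
}

What x/sync/semaphore cannot do today is change that limit of 8 at runtime.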

@bcmills
Contributor

bcmills commented Jan 14, 2019

CC @jba @Sajmani

@bcmills
Contributor

bcmills commented Jan 14, 2019

This proposal needs more detail.

  • What are some concrete use-cases for resizable semaphores? (Links to existing code that could be replaced or improved by this change would be ideal.)

  • What is the specific API being proposed?

    • Does it add and remove capacity, or set the capacity to an absolute quantity?
    • If it sets the absolute quantity, what happens if that capacity is lower than the number of outstanding tokens?

@bcmills
Contributor

bcmills commented Jan 14, 2019

Note that if you have some upper bound, you can already “resize” a semaphore.Weighted by acquiring and releasing surplus tokens:

const (
	maxCap = […]
	initialCap = […]  // ≤ maxCap
)
sem := semaphore.NewWeighted(maxCap)
sem.Acquire(ctx, maxCap - initialCap)
// Effective capacity of sem is now initialCap,
// but it can be “resized” up to maxCap by releasing tokens.
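If this comes up in more than one place, the trick can be wrapped. A rough sketch, not an existing API — resizableSem, newResizableSem, and SetCapacity are hypothetical names, and it assumes the capacity never needs to exceed maxCap:

import (
	"context"
	"sync"

	"golang.org/x/sync/semaphore"
)

// resizableSem "resizes" a semaphore.Weighted by holding surplus tokens.
type resizableSem struct {
	sem *semaphore.Weighted

	mu  sync.Mutex
	cap int64 // current effective capacity
}

func newResizableSem(maxCap, initialCap int64) (*resizableSem, error) {
	s := &resizableSem{sem: semaphore.NewWeighted(maxCap), cap: maxCap}
	return s, s.SetCapacity(context.Background(), initialCap)
}

// SetCapacity shrinks by acquiring surplus tokens and grows by releasing them.
// newCap must stay within [0, maxCap]. Note that shrinking may block until
// enough outstanding tokens are released, and that blocked Acquire sits in
// the semaphore's FIFO queue like any other.
func (s *resizableSem) SetCapacity(ctx context.Context, newCap int64) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	switch {
	case newCap < s.cap:
		if err := s.sem.Acquire(ctx, s.cap-newCap); err != nil {
			return err
		}
	case newCap > s.cap:
		s.sem.Release(newCap - s.cap)
	}
	s.cap = newCap
	return nil
}

// Callers acquire and release their job weights on s.sem directly.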

@sherifabdlnaby
Author

sherifabdlnaby commented Jan 14, 2019

@bcmills

Example:

CockroachDB uses a semaphore for rate limiting. I found a bug (cockroachdb/cockroach#33554) in the semaphore package they were using that can potentially deadlock.
They decided to keep using that semaphore implementation because the bug was not currently reachable in their code and there aren't many alternative resizable semaphores. cockroachdb/cockroach#33554 (comment)

Fixing the bug in that semaphore implementation halved its performance (300% slower than x/sync/semaphore) according to benchmarks.


What is the specific API being proposed?

Does it add and remove capacity, or set the capacity to an absolute quantity?

It sets the capacity to an absolute quantity.

If it sets the absolute quantity, what happens if that capacity is lower than the number of outstanding tokens?

If the semaphore is resized to a capacity lower than the number of currently acquired tokens, it will not grant any new acquisitions until enough releases bring the outstanding tokens down to or below the new capacity.


More details

What happens to an Acquire(N) that was marked impossible and not added to the waiters list? And what happens to an Acquire(N) that was already in the waiting list when a resize lowers the capacity below N?

In the current implementation, an Acquire(N) where N > capacity is not added to the waiters' linked list because it can never be satisfied; it blocks forever unless cancelled by its ctx.

In the PR, I've added a new linked list for impossible waiters: when Acquire(N) is impossible it is added to the impossible-waiters list, and impossible waiters can become possible again if a resize makes the semaphore big enough.

So when Resize(N) is called, this is what happens (see the sketch below):

  1. Set the new size for the semaphore.
  2. Loop through the waiters list, moving the now-impossible waiters to the impossible-waiters list.
  3. Loop through the impossible-waiters list, moving the now-possible ones back to the waiters list.
  4. Signal possible waiters from the front of the waiters list to unblock them (the same mechanism as Release()).

A resize is O(W+I), where W is the length of the waiters list and I is the length of the impossible-waiters list.
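Roughly, a sketch of what such a Resize could look like. This is illustrative only, not the PR's actual code; the resizableWeighted struct, its fields, and the waiter type are assumed stand-ins for the real internals:

import (
	"container/list"
	"sync"
)

type waiter struct {
	n     int64
	ready chan struct{} // closed when the waiter may proceed
}

type resizableWeighted struct {
	mu                sync.Mutex
	size              int64     // current capacity
	cur               int64     // tokens currently held
	waiters           list.List // FIFO of waiter, each satisfiable (n <= size)
	impossibleWaiters list.List // waiters whose n currently exceeds size
}

func (s *resizableWeighted) Resize(n int64) {
	s.mu.Lock()
	defer s.mu.Unlock()

	// 1. Set the new size.
	s.size = n

	// 2. Move now-impossible waiters out of the main queue.
	for e := s.waiters.Front(); e != nil; {
		next := e.Next()
		if e.Value.(waiter).n > s.size {
			s.impossibleWaiters.PushBack(s.waiters.Remove(e))
		}
		e = next
	}

	// 3. Move now-possible waiters to the back of the main queue.
	for e := s.impossibleWaiters.Front(); e != nil; {
		next := e.Next()
		if e.Value.(waiter).n <= s.size {
			s.waiters.PushBack(s.impossibleWaiters.Remove(e))
		}
		e = next
	}

	// 4. Unblock waiters from the front, the same way Release would.
	for {
		front := s.waiters.Front()
		if front == nil {
			break
		}
		w := front.Value.(waiter)
		if s.size-s.cur < w.n {
			break // front waiter still can't proceed; preserve FIFO order
		}
		s.cur += w.n
		s.waiters.Remove(front)
		close(w.ready)
	}
}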

@sherifabdlnaby
Author

Note that if you have some upper bound, you can already “resize” a semaphore.Weighted by acquiring and releasing surplus tokens:

const (
	maxCap = […]
	initialCap = […]  // ≤ maxCap
)
sem := semaphore.NewWeighted(maxCap)
sem.Acquire(ctx, maxCap - initialCap)
// Effective capacity of sem is now initialCap,
// but it can be “resized” up to maxCap by releasing tokens.

Yes, this works effectively if you have an upper bound.
I believe a simple .Resize(N) would be simpler to use.

@bcmills
Contributor

bcmills commented Jan 14, 2019

math.MaxInt64 is always a valid upper bound.

So it seems the only behavioral difference today is the FIFO behavior of “impossible” tasks. Those are not generally something you want to have in a well-behaved system anyway, since they consume resources while waiting for cancellation; so, how often does that difference matter?

@sherifabdlnaby
Author

sherifabdlnaby commented Jan 17, 2019

@bcmills

math.MaxInt64 is always a valid upper bound.

Yes it is, but with that workaround, tasks that are impossible for the effective capacity are still added to the waiters list, and when such a task is at the front of the list it blocks all the other waiters, causing a deadlock, which is unexpected semaphore behavior.

I understand your point that impossible tasks are not generally something you want in a well-behaved system anyway.

But in my opinion a Resize() would be simpler to use than an ad-hoc way of resizing the semaphore that may deadlock unexpectedly because of the semaphore's implementation.
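To illustrate the deadlock I mean with the math.MaxInt64 workaround (the weights are made up; the sleep is only there to make the example deterministic enough to read):

package main

import (
	"context"
	"math"
	"time"

	"golang.org/x/sync/semaphore"
)

func main() {
	ctx := context.Background()
	sem := semaphore.NewWeighted(math.MaxInt64)

	// "Resize" the effective capacity down to 10.
	_ = sem.Acquire(ctx, math.MaxInt64-10)

	// This request exceeds the effective capacity (10) but not the real
	// size (MaxInt64), so instead of being rejected it joins the FIFO
	// queue as an ordinary waiter.
	go func() { _ = sem.Acquire(ctx, 20) }()
	time.Sleep(10 * time.Millisecond) // let the waiter enqueue (illustration only)

	// Ten tokens are free, but this smaller request queues behind the
	// 20-weight waiter and blocks until the semaphore is "resized" back
	// up or the big Acquire is cancelled.
	_ = sem.Acquire(ctx, 1)
}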

@ALTree ALTree changed the title x/sync/semaphore: make semaphore resizable. x/sync/semaphore: make semaphore resizable Jan 7, 2020
@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Feb 24, 2021
@rsc
Contributor

rsc commented Feb 24, 2021

/cc @jba

@rsc
Contributor

rsc commented Feb 24, 2021

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals (old) Feb 24, 2021
@sherifabdlnaby
Author

sherifabdlnaby commented Feb 24, 2021

@rsc This is great news. I've been using my fork for a long time, and having this as part of the official x/sync/semaphore would be very welcome.
I updated the PR golang/sync#3 and synced it with upstream changes.

@rsc
Contributor

rsc commented Mar 10, 2021

@sherifabdlnaby Can you say something about why you are maintaining a fork instead of using a wrapper that does the trick @bcmills mentioned: allocating a very large semaphore and then using Acquire and Release calls to take capacity away or bring it back?

@sherifabdlnaby
Author

sherifabdlnaby commented Mar 10, 2021

@rsc
The approach suggested by @bcmills was not ideal in my use case. In func (s *Weighted) notifyWaiters() #L117 you can see that when a resource is released, only the waiter at the front of the queue is tried; if there aren't enough tokens for it, the implementation explicitly avoids looking further back for a smaller acquire, in order to avoid starvation.

Hence, an Acquire with weight greater than the effective size of the semaphore blocks the whole semaphore until a "resize" makes that weight <= the effective size; smaller acquires behind it will not be granted, and it can block forever until the semaphore is resized again.
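For reference, that logic looks roughly like this (paraphrased from memory of the x/sync/semaphore source, so it may not match the current code exactly):

func (s *Weighted) notifyWaiters() {
	for {
		next := s.waiters.Front()
		if next == nil {
			break // no more waiters blocked
		}
		w := next.Value.(waiter)
		if s.size-s.cur < w.n {
			// Not enough tokens for the front waiter. It stops here
			// rather than scanning further back for a smaller request,
			// to avoid starving large requests.
			break
		}
		s.cur += w.n
		s.waiters.Remove(next)
		close(w.ready)
	}
}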

In my implementation, acquires whose weight is greater than the size of the semaphore are moved to a separate impossibleWaiters queue and are only added back to the main queue when it becomes possible for them to be acquired eventually (checked at every invocation of the resize() function).

In my use case, this was very important to the performance of the semaphore. I use it for dynamically bounding concurrency and needed a more performant method than channel-based semaphores.

@jba
Contributor

jba commented Mar 11, 2021

I don't see anything wrong with this proposal technically. It would be different if Acquire rejected impossible requests, but instead it lets them wait indefinitely. The only difference here (assuming Resize is never called) is some extra space, negligible compared to the goroutine's memory.

The effect of Resize cannot be obtained with the existing semaphore, for the reason explained above: when you resize down, you can deadlock because the waiter at the front of the list can never proceed.

The use case of weighted tasks with dynamic concurrency seems like a reasonable one.

@ajwerner

CockroachDB uses a semaphore for rate limiting. I found a bug (cockroachdb/cockroach#33554) in the semaphore package they were using that can potentially deadlock.
They decided to keep using that semaphore implementation because the bug was not currently reachable in their code and there aren't many alternative resizable semaphores. cockroachdb/cockroach#33554 (comment)

Fixing the bug in that semaphore implementation halved its performance (300% slower than x/sync/semaphore) according to benchmarks.

Thanks for the reminder on this. In other use cases, CockroachDB has generally decided to stop using the semaphore library altogether. That library is not very good from a fairness or allocation perspective. The quotapool library provides fairness, resizing, and dramatically more flexibility, and it has proven to be quite good. I'll pick it up in our internal limit package and try to work on getting it spun out under an Apache license somewhere.

@rsc
Contributor

rsc commented Mar 24, 2021

@sherifabdlnaby thanks for the reply, specifically:

In my implementation, acquires whose weight is greater than the size of the semaphore are moved to a separate impossibleWaiters queue and are only added back to the main queue when it becomes possible for them to be acquired eventually (checked at every invocation of the resize() function).

It sounds like your semaphore is not first come, first served, while the Go implementation is. So in addition to making it resizable you are also asking for the policy to be changed about how requests are satisfied. I am less sure about making that change. Thoughts on that part, @bcmills and @jba?

@sherifabdlnaby
Author

@rsc

It sounds like your semaphore is not first come, first served, while the Go implementation is. So in addition to making it resizable you are also asking for the policy to be changed about how requests are satisfied.

The policy stays first come, first served as long as the requests are satisfiable. When a request is impossible to satisfy given the current size of the semaphore, it is moved to the impossibleWaiters queue; when it becomes possible to satisfy again (after a resize), it is put back at the end of the main queue.

So essentially the policy is the same as long as you never call Resize() (and never have a request that is greater than the size of the semaphore), making the change backward compatible and non-breaking for existing behavior.

@jba
Contributor

jba commented Mar 25, 2021

@sherifabdlnaby, did you check out the quotapool package mentioned above? That sounds like an improvement on what you're proposing here.

@sherifabdlnaby
Author

@jba I didn't know about the quotapool package at the time I needed the resizable semaphore. It looks pretty sophisticated, and I will definitely try it.
I don't know how much overhead that sophistication adds; I'll have to benchmark it. The reason I needed a resizable semaphore is that I needed a more performant worker pool than one that uses channels as a semaphore.

@ajwerner

I don't know how much overhead that sophistication adds; I'll have to benchmark it. The reason I needed a resizable semaphore is that I needed a more performant worker pool than one that uses channels as a semaphore.

Let me know what you discover! I tuned it somewhat heavily. It has a fast path that avoids using channels when there is sufficient quota. Any allocations are pooled, so generally there are no allocations when interacting with it. The channels are also pooled: it is designed not to close them but to recycle them.

@rsc
Contributor

rsc commented Mar 31, 2021

Given that @sherifabdlnaby has a customized fork and also @ajwerner has written another package to serve the specific need, and given that to serve this need in x/sync/semaphore would require changing the queueing order for pending requests, it sounds like we probably should leave x/sync/semaphore's semantics alone and encourage the use of alternate packages when they are a better fit.

@rsc rsc changed the title x/sync/semaphore: make semaphore resizable proposal: x/sync/semaphore: make semaphore resizable Mar 31, 2021
@rsc
Contributor

rsc commented Apr 1, 2021

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

@rsc rsc moved this from Active to Likely Decline in Proposals (old) Apr 1, 2021
@sherifabdlnaby
Author

Thanks all for your time discussing this 🙏🏻

@rsc rsc moved this from Likely Decline to Declined in Proposals (old) Apr 7, 2021
@rsc
Contributor

rsc commented Apr 7, 2021

No change in consensus, so declined.
— rsc for the proposal review group

@golang golang locked and limited conversation to collaborators Apr 7, 2022