New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: runtime: statefullize most 2 select blocks in loops at run time for each goroutine. #18846
Comments
I read the issue a few times but frankly I couldn't understand what are you
proposing.
Are you proposing adding prioritizes to select cases?
|
part of. But only for select blocks in for loops. |
That's definitely an incompatible behavior change. Select cases are
specified to be selected pseudorandomly and there are programs whose
correctness depends on that property.
|
I think pseudorandom is not the fundamental purpose of the select mechanism, the fundamental purpose is to distribute chances on all cases averagely. |
Maybe statefullizing all direct select blocks in a function calling is easier. |
Before making a proposal, it's important to first define the problem you are going to solve. Then the proposal is a potential solution to that problem. It sounds like the problem you want to solve has to do with the performance of select. Can you post a benchmark showing the specific performance issue? |
The original intention for this proposal is to (efficiently) solve the second problem I mentioned in the first comment. In the following program, it is expected no sends and receives performed.
In other words, there are many sends and receives wasted in the above program. To solve the problem, the below version is often adopted in practice.
However, in practice, for the most passes in the sender and receiver loop, stopSignal is not closed, The first problem I mentioned in the fist comment is really a by-product in solving the second problem. |
Ok, I benchmarked the two versions in the above comment, it is a surprise that the two-selects-in-loop version performs quite well, much better than I expected.
@rsc which version do you recommend to use in practice? |
@golang101 please write a proper testing.Benchmark benchmark. |
Here is a benchmark:
I don't like writing go benchmarks, the results of go benchmarks wave too much. And sometimes, I must make some modifications to my program code to make it benchmarkable. |
@golang101 That's fine but it's very helpful for us if you can write them in Go benchmark format because it means we can vary basic execution settings using the usual mechanisms. The fact is, the language definition requires that f1 sometimes pick dataCh, even though stopSignal is always ready. There's no way around that without a language change, and I don't think there's nearly enough evidence for that. Also the proposed change has flaws: select in a loop is special but select in a function called from within a loop is not? What if the function is inlined (either by the programmer or by the compiler)? Then the semantics change? This is not easy to understand or explain. Also:
This is already true today. Select makes up an order and then picks the first one it finds that is ready to run. The non-blocking select in f2 is also already optimized: it's not a real select, which is why f2 is only 5% slower than f1 despite having 100% more selects. If there's any work here, I'd rather make the f1 select 5% faster, so that you will be happy with the overall performance of f2, instead of changing the semantics of select in a way that is subtle and difficult to explain. |
Yes, there are many details hard to handle in the proposed method if it is used as an overall method. Would it be a good idea to let programmers assign a sort method for some special select blocks?
random: the current implementation I'm still not familiar with go compiler source, I would make some hacking and write some benchmarks to check if it is worth supporting the custom sort methods later. |
The intention of the proposal is to resolve two problems in the current stateless implementation.
Select is slow.Statefullizing select blocks will reduce memory allocations, so less time is consumed.
Besides less memory allocations, we can use a simpler cases sorting method.
For example, assume there are 5 cases, the initial cases order is [0 1 2 3 4].
If case 2 is selected in the first pass, then the order changes to [0 1 3 4 2].
In other words, last selected case will get the smallest chance to be selected in the next pass.
By using this method, no cases will be in starvation status.
The cases order is cached in the hselect struct and the order can be stored as a linked list.
Following example is a common use pattern of select.
Once the channel stopSignal becomes unblocked, the goroutine should exit in the next pass.
However when dataCh is a buffered channel, even if stopSignal becomes unblocked,
the second case may be still selected for several passes.
To make the goroutine exit as early as possible,
following version is adopted in practice instead :
Select is already slow, more selects will make programs even slower.
By using the cases sorting method mentioned above,
the stopSignal case will be selected once stopSignal becomes unblocked in the one-select version,
no needs to adopt the two-selects version.
=====================================
Caching any select blocks would be hard,
caching one would be not able to cover some common select use patterns.
So caching two would be a good compromise.
Or only cache the cases orders, which will use much less cache memory than cache the whole hselect struct.
This optimization may change the select behavior a bit,
but will not break the compatibility.
The text was updated successfully, but these errors were encountered: