-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: go1.8 regression: sync/atomic loop elided #19182
Comments
Hmm. Let me tag some runtime and sync experts @aclements @dvyukov @randall77. |
Looking at the assembly, the increment (and in fact the whole for loop) has been (over-)optimized away. (Edit: I missed the jmp initially -- as @dvyukov points out, there is an empty loop.) I'll bisect. |
A git bisect indicates d6098e4.
/cc @randall77 |
This bug applies also to |
The atomic.AddUint64 loop is compiled to an empty loop. |
Is the runtime miscompiled like this? |
I don't think the runtime contains any loops like that, so yes (same optimizations) and no (no code matching this amusing optimization). Is this breaking any real applications? I had to write a compiler test program (for loop rescheduling, hence could not contain any function calls) very carefully to ensure that the loop remained, but that was only a test. |
@dr2chase: I'm using a for-loop with atomic read+cas, so I guess in go1.8 it won't work anymore: index.go#L55-L62. |
@rjeczalik from @dvyukov's description above, I don't think that type of code would be affected (the loop has an exit condition). |
The code in the original post is real enough. This program has side-effect – it prints to stdout. Someone can use this output to produce different results. |
It also affects |
Isn't 0 a valid value after the loop has executed 2^64 times and the atomic has overflowed? |
Yeah...after 584 years (at 1 ns / loop). |
This is perfectly valid given Go's memory model. Are there any plans to update the memory model to include happens-before semantics on the atomic operations? |
@nikdeapen there is always #5045: "define how sync/atomic interacts with memory model". However, I'm 99% certain that any memory model updates for #5045 won't add more happens-before edges to this program. The issue here is more one of visibility -- as @dvyukov said, the expectation is something like "atomic writes should be visible to other threads in a finite amount of time". Perhaps something similar to that would be codified in a future MM change regarding sync/atomic. |
@josharian where are you getting the 1ns from? Is there a minimum amount of time an operation needs to take? |
@cespare thanks for the reply. Java has volatile (and atomic) variables which provide happens-before semantics that allow them to be used as flags and other types of synchronizers. In Go you would also have provide some type of synchronization along with the atomic variable to ensure memory visibility. But if you are already using another type of synchronization it seems unlikely that you would even need the atomic update. Maybe i'm misssing something, but I just don't see too many realistic use cases for atomic variables if they don't have happens-before semantics. The "atomic writes should be visible to other threads in a finite amount of time" statement seem too "maybe-maybe-not" for a programming language. It seems like this would lead to bugs that are nearly impossible to test for. I'd love to see happens-before on atomics but mutexes and channels work great for now. |
Happens-before is unrelated to this case. Even if we document happens-before relations for atomic variables (note that currently they are still implied), there is nothing that will force the loop to see the updates as there is yet no any happens-before relations. Happens-before is only about visibility of secondary/dependent data, not about primary visibility. To guarantee that the loop prints increasing values over times we need 2 guarantees:
|
@dr2chase I can imagine some real uses of this (though, probably not very representative for Go). For example, during very fine-grained parallelization you may have a dedicated goroutine running on the HT sibling spin wait for input using atomic.Load, do some (inline) computations and store result back using atomic.Store, and then repeat. However, now I realized that any such program will necessary hang dead during GC (as the loop is not preemptible even with 1.7). And since we have forced GCs every few minutes, not allocating will not help. So I bet the original program hangs dead with 1.7 after 5 minutes, which can't be considered as "working". |
Can't verify that – running it on macOS for an hour now, it's working, numbers are increasing. |
"Atomic writes are visible to other goroutine in a finite amount of time."
Memory model rules are usually framed as "if a then b" as opposed to "time
and single events happen". I did a search for the above rule and couldn't
immediately find it in any C/C++ standards doc. A reference would be of
interest. I am also skeptical that a HW architectural spec would discuss
time and contain the guarantees we would need.
…On Tue, Feb 21, 2017 at 11:51 AM, Alexey Palazhchenko < ***@***.***> wrote:
So I bet the original program hangs dead with 1.7 after 5 minutes
Can't verify that – running it on macOS for an hour now, it's working,
numbers are increasing.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#19182 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AA7Wn5r4ATQWoIowS1-LY3u8XIBDoqa5ks5rexYDgaJpZM4MFV4Y>
.
|
I think the "hang in 5 minutes" estimate was based on a conservative guess about GC frequency in usual programs (as opposed to this tiny test). Throw a little gratuitous memory allocation in there, it will hang soon enough. The test program I carefully wrote to avoid this was for the rescheduling-check-on-loop-backedges experiment; default compilation with no rescheduling check, call-free infinite loops cause GC to block forever. |
CL https://golang.org/cl/37333 mentions this issue. |
@RLH The phrase to search for in the C++ standard is "forward progress." As far as I can tell the C++ standard does not require such a program to complete, but it does "encourage" it.
|
By-the-way, try that program on a uniprocessor, let me know what it prints. |
Yes thanks, it looks like the C++ documentation would consider a comparable
C++ program to be a valid program and the atomic operation is eventually
observable.
…-- break glass here --
From http://en.cppreference.com/w/cpp/language/memory_model
In a valid C++ program, every thread eventually does one of the following:
- terminate
- makes a call to an I/O library function
- reads or modifies a volatile
<http://en.cppreference.com/w/cpp/language/cv> object
- performs an atomic operation or a synchronization operation
No thread of execution can execute forever without performing any of these
observable behaviors.
Note that it means that a program with endless recursion or endless loop
(whether implemented as a for-statement
<http://en.cppreference.com/w/cpp/language/for> or by looping goto
<http://en.cppreference.com/w/cpp/language/goto> or otherwise) has undefined
behavior <http://en.cppreference.com/w/cpp/language/ub>. This allows the
compilers to remove all loops that have no observable behavior, without
having to prove that they would eventually terminate.
A thread is said to *make progress* if it performs one of the execution
steps above (I/O, volatile, atomic, or synchronization), blocks in a
standard library function, or calls an atomic lock-free function that does
not complete because of a non-blocked concurrent thread.
On Tue, Feb 21, 2017 at 4:11 PM, dr2chase ***@***.***> wrote:
By-the-way, try that program on a uniprocessor, let me know what it prints.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#19182 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AA7Wn_8CX9lKFzcxV5YGGFxZBLVrNLZYks5re1L6gaJpZM4MFV4Y>
.
|
@dr2chase 5 minutes is actually 2 minutes, and it is based on:
|
Yes, it's atypical, but it's still better to at least document the intention rather than say nothing. This is essentially the limit of what you can say in a formal language spec. |
Re-opening for cherry-pick to 1.8.1. |
This fixes #19182 upstream [1]. [1] golang/go#19182 Package-Manager: Portage-2.3.3, Repoman-2.3.2
CL https://golang.org/cl/39595 mentions this issue. |
Cherry-picked to release. |
…r do-not-remove Added a flag to generic and various architectures' atomic operations that are judged to have observable side effects and thus cannot be dead-code-eliminated. Test requires GOMAXPROCS > 1 without preemption in loop. Fixes #19182. Change-Id: Id2230031abd2cca0bbb32fd68fc8a58fb912070f Reviewed-on: https://go-review.googlesource.com/39595 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
sync/atomic: go1.8 regression: atomic.AddUint64 is unreliable
Sometimes
atomic.AddUint64
has no effect.Update: @cespare has discovered that the loop containing
atomic.AddUint64
has been elided.Update: @rjeczalik has reported that this behavior also occurs with
atomic.StoreUint64
andatomic.CompareAndSwapUint64
functions.atomic.go:
For go1.4, go1.5, go1.6, and go1.7:
The
atomic.AddUint64(&a, uint64(1))
statement works as expected.For go1.8 and go tip:
The
atomic.AddUint64(&a, uint64(1))
statement appears to have no effect.Interestingly, we can make go1.8 and go tip work with a simple code modification that should not have any effect:
atomic_pass.go:
After
AddUint64(&a, uint64(1)
,new
should be greater than zero andruntime.Gosched()
should not be executed. Substitutingprint()
forruntime.Gosched()
also works. If the lineruntime.Gosched() // or print()
is commented out the program reverts to failure.The text was updated successfully, but these errors were encountered: