Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: counter-intuitive comparison of zero-sized pointers wrapped in interfaces #65878

Closed
Merovius opened this issue Feb 22, 2024 · 39 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@Merovius
Copy link
Contributor

Go version

go version go1.22.0 linux/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/mero/.cache/go-build'
GOENV='/home/mero/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/mero/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/mero'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/mero/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='local'
GOTOOLDIR='/home/mero/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.22.0'
GCCGO='/usr/bin/gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/home/mero/tmp/x/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build624578175=/tmp/go-build -gno-record-gcc-switches'

What did you do?

Playground:

package main

import "fmt"

func main() {
	a, b := new(struct{}), new(struct{})
	x, y := any(a), any(b)
	fmt.Println(a == b, x == y, x.(*struct{}) == y.(*struct{}))
}

What did you see happen?

false true false

What did you expect to see?

Either false false false or true true true.

The spec defines comparison for interfaces as:

Interface types that are not type parameters are comparable. Two interface values are equal if they have identical dynamic types and equal dynamic values or if both have value nil.

Now, from this definition, the observed behavior should clearly be impossible. The dynamic values are not equal - either before or after wrapping them into any. However, the interfaces compare as equal.

I suspect there is an optimization going on, where the comparison function stored in the rtype short-circuits for pointers to zero-sized values, based on the permission for all zero-sized values to have the same address. But that is only correct, if the compiler does the same short-circuiting for ==, in my opinion.

Originally mentioned on golang-nuts.

@Merovius
Copy link
Contributor Author

Merovius commented Feb 22, 2024

Hm, on second thought, the spec allows this, strictly speaking:

Pointer types are comparable. Two pointer values are equal if they point to the same variable or if both have value nil. Pointers to distinct zero-size variables may or may not be equal.

I thought the exception was just that two zero-sized variables are allowed to have the same address, but if there is an exception for the pointer-comparison itself as well, that means a Go implementation would technically be allowed to return the result of a coin-flip, whenever such pointers are compared and literally any behavior here is correct.

TBQH, that still seems a pretty confusing situation. I would personally vote to at least in gc make this consistent (e.g. by making the comparison always true, regardless of the actual addresses). But it changes this from an implementation-bug into a working-as-confusingly-as-specified.

@bserdar
Copy link

bserdar commented Feb 22, 2024

@Merovius I don't think that part of the spec can be applied here. If a and b are pointers to a zero-sized value, then they are either always equal, or always different for a compilation of a program. That is:

fmt.Println(a==b)
fmt.Println(a==b)

Both printlns must print true, or false. It should not be the case that one prints true, and the other false.

The case here is: x, y := any(a), any(b) and a!=b but x==y . The problem is not with pointers to distinct zero-size variables, but with interfaces containing pointers to zero-size variables.

@mateusz834
Copy link
Member

mateusz834 commented Feb 22, 2024

Funny enough printing the pointers, make it return true.

a, b := new(struct{}), new(struct{})
x, y := any(a), any(b)
fmt.Printf("%p %p\n", a, b)
fmt.Println(a == b, x == y, x.(*struct{}) == y.(*struct{})) // true true true

Also:

a, b := new(struct{}), new(struct{})
x, y := any(a), any(b)
runtime.KeepAlive(&a)
fmt.Println(a == b, x == y, x.(*struct{}) == y.(*struct{})) // true true false
a, b := new(struct{}), new(struct{})
x, y := any(a), any(b)
runtime.KeepAlive(&a)
runtime.KeepAlive(&b)
fmt.Println(a == b, x == y, x.(*struct{}) == y.(*struct{})) // true true true
a, b := new(struct{}), new(struct{})
x, y := any(a), any(b)
fmt.Println(uintptr(unsafe.Pointer(a))) // 824634001135
fmt.Println(uintptr(unsafe.Pointer(b))) // 824634232559
fmt.Println(a == b, x == y, x.(*struct{}) == y.(*struct{})) // false true false

@mateusz834 mateusz834 added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Feb 22, 2024
@Merovius
Copy link
Contributor Author

If a and b are pointers to a zero-sized value, then they are either always equal, or always different for a compilation of a program.

That is not something the spec guarantees, no.

That is: […]. Both printlns must print true, or false. It should not be the case that one prints true, and the other false.

It's certainly confusing, but it is working as specified. There is nothing in the spec saying "a comparison of two values must always return the same thing".

FTR, that's why I made the distinction above. There are two places where pointers to zero-sized values are special-cased:

  1. It is allowed for any variables of zero size to have the same address.
  2. A comparison of two pointers to a zero-sized value may or may not be equal.

Your intuition would be right, if only the first was the case: In that case, every variable has a (consistent) address and the pointer-comparison would always have to compare those addresses.

But with the second exception, the comparison can return literally anything. There is no qualification on that "may or may not".

@Merovius
Copy link
Contributor Author

Merovius commented Feb 22, 2024

@bserdar To be clear, according to that spec sentence, it would be correct for a == a to be false. Confusing. But correct.

And note that this is not the only case where == is irreflexive. For example, x := math.NaN(); fmt.Println(x == x) is also false.

@mateusz834
Copy link
Member

The current implementation seems to return true, but only when a and b escape to the heap:

var escapes *struct{}

func main() {
	a, b := new(struct{}), new(struct{})
	x, y := any(a), any(b)
	escapes = a
	escapes = b
	fmt.Println(a == b, x == y, x.(*struct{}) == y.(*struct{})) // true true true
}
./main.go:10:13: new(struct {}) escapes to heap
./main.go:10:28: new(struct {}) escapes to heap

@bserdar
Copy link

bserdar commented Feb 22, 2024

@Merovius I see the point about NaN, but note that the behavior of NaN is consistent. My point here is that a program should not evaluate a==b differently for each comparison instance. If a==b and nothing modifies them, then another comparison of a==b should return the same. What we are observing here is that a!=b, but any(a)==any(b).

@Merovius
Copy link
Contributor Author

Merovius commented Feb 22, 2024

@bserdar If you want to argue that the program behaves incorrectly, you have to argue from the spec, not from your expectation. You say that comparisons "should" behave a certain way - but if the spec does not say that comparisons behave that way, the program behaves correctly.

Again, to be clear: I agree that it's confusing and that it should be fixed. But the program behaves in accordance with the language specification.

@seankhliao seankhliao changed the title incorrect comparison of zero-sized pointers wrapped in interfaces cmd/compile: incorrect comparison of zero-sized pointers wrapped in interfaces Feb 23, 2024
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Feb 23, 2024
@Liennie
Copy link

Liennie commented Feb 23, 2024

@Merovius

To be clear, according to that spec sentence, it would be correct for a == a to be false. Confusing. But correct.

I don't think that's the case. The spec talks about pointers to distinct zero-size variables. When comparing a to itself, both sides of the comparison point to the same variable, so the relevant part of the spec in this case is Two pointer values are equal if they point to the same variable and the result should always be true.

This also means that if you do something like this:

s := struct{}{}
a, b := &s, &s

a and b should also always be equal.

@Merovius
Copy link
Contributor Author

@Liennie fair enough. You are right.

@go101
Copy link

go101 commented Feb 25, 2024

A bug introduced in Go 1.9.

package main

type T struct {}

func main() {
	var a, b = &T{}, &T{}
	var x, y interface{} = a, b
	println(a == b, x == y)
}
$ gotv 1.8. run main.go
[Run]: $HOME/.cache/gotv/tag_go1.8.7/bin/go run main.go
true true
$ gotv 1.9. run main.go
[Run]: $HOME/.cache/gotv/tag_go1.9.7/bin/go run main.go
false true

@go101
Copy link

go101 commented Feb 25, 2024

The behavior of Go 1.9 and 1.10 is more weird.

package main

type T struct {}

func main() {
  var a, b = &T{}, &T{}
  if (a != b) { println(0) }
  if (a == b) { println(1) }
  println(a == b || a != b)
}
$ gotv 1.8. run a.go
[Run]: $HOME/.cache/gotv/tag_go1.8.7/bin/go run a.go
1
true
$ gotv 1.9. run a.go
[Run]: $HOME/.cache/gotv/tag_go1.9.7/bin/go run a.go
false
$ gotv 1.10. run a.go
[Run]: $HOME/.cache/gotv/tag_go1.10.8/bin/go run a.go
false
$ gotv 1.11. run a.go
[Run]: $HOME/.cache/gotv/tag_go1.11.13/bin/go run a.go
0
true

@Merovius
Copy link
Contributor Author

FWIW this is the issue where the spec phrasing was decided. It was pre Go 1.0. And Russ' comment is pretty clear, that there are no guarantees made whatsoever.

So, I disagree strongly with the level of certainty in the statement "this is a bug". It's counter-intuitive, but it is working as specified and it was intended that there are no guarantees made.

I believe any program relying on pretty much any behavior for comparison to zero-sized pointers (except a == a, as mentioned) is ultimately incorrect as I'm not sure we even can realistically specify behavior. So I don't think anything will really change, practically, for how we should write Go code.

To me, the only question is if we want to be nice and provide more predictable behavior than the spec defines. But even then, we should be aware that after a conversion to unsafe.Pointer for example, there is likely still going to be some confusing behavior.

@go101
Copy link

go101 commented Feb 25, 2024

@griesemer

@go101
Copy link

go101 commented Feb 25, 2024

The title of the issue is not correct. This problem is interface unrelated.
Interfaces are used just to prove there is something wrong.

package main

import "unsafe"

type T struct {
	_ struct{ x [0]func() }
}

func main() {
	var a, b = &T{}, &T{}
	println(uintptr(unsafe.Pointer(a)) == uintptr(unsafe.Pointer(b))) // true
	println(a == b) // false
}

@Merovius
Copy link
Contributor Author

Merovius commented Feb 25, 2024

@go101 as the person who filed the proposal, I reserve the right to decide what it's about (though if someone with more intimate knowledge about the root cause wants to rename it, they can, of course).

In particular, the unsafe.Pointer example seems unconvincing to me, as unsafe.Pointer is no longer a pointer to a zero-sized type. Note, also, that the behavior of converting between unsafe.Pointer and uintptr is completely implementation-defined. So it, also, can pretty much definitionally not behave incorrectly.

[edit] then again, it is actually incorrect in a different way [/edit]

@Merovius Merovius changed the title cmd/compile: incorrect comparison of zero-sized pointers wrapped in interfaces cmd/compile: counter-intuitive comparison of zero-sized pointers wrapped in interfaces Feb 25, 2024
@go101
Copy link

go101 commented Feb 25, 2024

Comparisons of zero-sized pointers wrapped in interfaces is not counter-intuitive, comparisons of zero-sized pointers is.

@go101
Copy link

go101 commented Feb 26, 2024

New one:

package main

var a, b [0]int
var p, q = &a, &b

func main() {
	println(&a == &b, p == q) // false true
}

Totally interface unrelated.

[edit]:

package main

var a, b [0]int
var p, q = &a, &b

func main() {
	if (p == q) {
		p, q = &a, &b
		println(p == q) // false
	}
}
package main

var a, b [0]int
var p, q = &a, &b

func main() {
	if p == q {
		x, y := &a, &b
		if x == p && y == q {
			println(x == y) // false
		}
	}
}

@randall77
Copy link
Contributor

I suspect most of the confusion here comes from this optimization: given two distinct variables in the program, the compiler is allowed to assume their addresses are different.

var x T
var y T

&x == &y optimizes to false. &x != &y optimizes to true.

Even though, when T is zero-sized, x and y might in fact have the same address. So &x == &y compiles to false, but comparing &x to &y, when the compiler doesn't know they are addresses of distinct variables, evaluates to true.

For example:

package main

type T struct{}

func main() {
	var x T
	var y T
	p, q := &x, &y
	println(p == q, id(p) == id(q))
}

//go:noinline
func id(p *T) *T {
	return p
}

This program prints false true. In the first expression p == q, the compiler can tell the addresses are from distinct declarations, and in the second expression id(p) == id(q), it can't, and compares the addresses at runtime.
Note that wrapping a value in an interface is conceptually identical to wrapping with the id function. It hides the source of the pointers being compared from the compiler. Same with wrapping in uintptr(unsafe.Pointer()).

I'm of the opinion that there's nothing to do here. The compiler is adhering to the spec. No one should be depending on equality of pointers to zero-sized variables. There are other weird cases for zero-sized variables having to do with escaping, allocation, stack slots, etc. Here are two fun examples:

package main

type T struct{}

func f() *T {
	var x T
	return &x
}

func main() {
	println(f() == f())
}

When run, it prints false. When run with -gcflags=-l, it prints true. This is because when inlined, the compiler can see the two instances of x, whereas when no inlining happens, both instances of x are allocated by malloc, which has a special case to always return the same address when allocating zero-sized things.

The second weird example:

package main

import "unsafe"

func main() {
	var x [0]int64
	var z [0]int16
	g(&x, &z)
	var y [1]int32
	h(&y)
}

//go:noinline
func g(p *[0]int64, q *[0]int16) {
	println(uintptr(unsafe.Pointer(p)) == uintptr(unsafe.Pointer(q)))
}

//go:noinline
func h(p *[1]int32) {
}

When run, it prints false. Comment out the var y [1]int32; h(&y) part, and it prints true. This is because without y present, x and z are allocated next to each other in the stack frame, and thus have the same address. When y is present, it happens to get laid out between x and z and causes them to have different addresses.

@go101
Copy link

go101 commented Feb 27, 2024

The spec has no ambiguities on whether or not the comparison results should be consistent between different builds/runs. But it is not clear enough on the point within the same run.

@Merovius
Copy link
Contributor Author

@randall77 I agree that the compiler adheres to the spec. My argument is that "always evaluate comparisons of pointers to zero-sized variables to true" also adheres to the spec. It still wouldn't allow anyone to rely on the behavior, of course. I don't suggest changing the spec and there would likely still be edge-cases (like conversion to unsafe.Pointer). But it might solve a good chunk of the confusion.

I don't have strong opinions either way. But I am kind of curious why that's not what's done right now. If nothing else, it seems it would save a few instructions and CPU cycles here and there.

@go101
Copy link

go101 commented Feb 27, 2024

Is the optimization specifically made for pointers to zero-size values?
If it is, then it is totally needless.

@Merovius
Copy link
Contributor Author

@go101 What makes it needless in your opinion? Would you expect var x [1<<30]struct{} to allocate space on the heap? If not, then how do you expect &x[0] == &x[1<<20] to behave? And if new(struct{}) would have to return a unique address, to make pointer-comparisons "work as expected", that also implies a small, but non-zero overhead in GC time to chase pointers.

And note that zero-sized types still have behavior. For example, putting a [0]func() field into a struct prevents it from being compared.

Once you accept that some zero-sized variables need to be allowed to have the same address, it seems to me a natural consequence that you can no longer really meaningfully specify the result of comparing their pointers.

I don't think it is practical to remove the flexibility the spec provides. Which is why, if anything, we should talk about what the implementation can do to make this less confusing. The advantage of arguing that gc should just optimize all such pointer-comparisons to true is that it is even more efficient, while also being more predictable and not requiring us to try and fully specify the exceptions we can't reasonably cover.

@go101
Copy link

go101 commented Feb 27, 2024

You completely misinterpreted what I said. All you said in the last comment is unrelated to what I said.
What I mean is just that &x == &y should return the same value as var a, b = &x, &y; a == b. That is it.
Any optimizations deliberately made to break this is unnecessary.

@randall77
Copy link
Contributor

that &x == &y should return the same value as var a, b = &x, &y; &a == &b

I think you mean &x == &y should return the same value as var a, b = &x, &y; a == b?

@randall77
Copy link
Contributor

If you declare that two pointers to zero-sized things should always compare equal, then you're going to have weird cases in other ways. Like:

package main

type T struct {
	a struct{}
	b int
}

func main() {
	var x, y T
	println(&x == &y, &x.a == &y.a, unsafe.Pointer(&x.a) == unsafe.Pointer(&y.a))
}

Will then print false true false.

It also means a bunch of special cases in the compiler, reflect, etc. that have to deal with this new special case.

@go101
Copy link

go101 commented Feb 27, 2024

@randall77

I think you mean &x == &y should return the same value as var a, b = &x, &y; a == b?

Yes. Sorry, it is a mistake. Fixed now.

Is your last comment replied to me? I don't declare that two pointers to zero-sized things should always compare equal.

@randall77
Copy link
Contributor

Is your last comment replied to me? I don't declare that two pointers to zero-sized things should always compare equal.

No, that's in response to @Merovius' suggestion of making all such comparisons return true.

@atdiar
Copy link

atdiar commented Feb 27, 2024

I think that pointers are not the issue.
Makes sense.

What can be confusing is when non nil distinct pointers are assigned to interface values.
The dynamic types should be equal, but what about the dynamic values?

https://go.dev/play/p/zMky3K0tS1l

I believe that's what @Merovius was initially pointing at in my understanding and what actually @go101 tries to hint at as well. Namely, if each interface holds:

  • a type that is a pointer type
  • a pointer to a pointer var each time (does it in the current implementation?)
    (noting the addresses are different so these pointer values should be different)

Then the interface value comparison should also return false?

Or is there a special optimization here?

(If however the interface value being created is simply another fat pointer to the same value that was pointed at, it could be understood that it might be optimized to reuse its own, previously created, internal pointer value, and such interface values comparisons would always return true)

Other than that I think that the behavior for 0 sized types and regular references is not surprising and the spec does well to not force equality.

Basically, the bit on interface comparisons is actually explained in @randall77 comment
#65878 (comment)
But this is implementation based so how should it affect the spec?

A few more comparisons:
https://go.dev/play/p/LXQ0FvJlIxM

@go101
Copy link

go101 commented Feb 27, 2024

This issue is not related to interfaces in general: #65878 (comment)

@randall77
Copy link
Contributor

@atdiar, I think you are overthinking it. An interface is, conceptually, just a pair of a type and a value of that type. That type need not be a pointer type, and the value need not be a pointer.
Comparisons of interfaces just compare the type, and if the types are the same compares the value as well.

In the implementation of interfaces sometimes there is an additional indirection, but semantically that indirection is undetectable (unless you play unsafe shenanigans).

@atdiar
Copy link

atdiar commented Feb 27, 2024

@randall77

Yes, I think I am getting confused, a little. Let me step back:

If I assign a pointer value to an interface variable, the interface should hold a value that is a pointer value plus a pointer type, right?

Or does it hold a pointer type and the value (the implementation may use a pointer value set to point to the same variable).

That's just me being confused. I think it's the second option as you explained. In which case there is indeed nothing to do here.

https://go.dev/play/p/LXQ0FvJlIxM

That's why a == any(a) and a == any(b)

Essentially, assigning a pointer to an interface just creates a fatter pointer.

@go101
Copy link

go101 commented Feb 27, 2024

The current gc doesn't allocate value to store pointer values in interfaces.

Again, how interface is implemented is unrelated to this issue.
The problem of the issue is that sometimes gc evaluates &x == &y as false,
even if the addresses of x and y are the same.

@atdiar
Copy link

atdiar commented Feb 27, 2024

@go101 sorry my mistake. It allocates a new pointer value. Not a new value. Will fix it.

Edit: so yes, completing your example, that seems to be another issue.
The (non-) issue about interfaces was about how they work and how interface comparison works.

This is strange but that's another question (added some comparisons to your code example) : https://go.dev/play/p/beMBI3V54Bi
Should probably be filed as another issue if this is problematic. (makes comparison intransitive)

@Merovius
Copy link
Contributor Author

@randall77

I suspect most of the confusion here comes from this optimization: given two distinct variables in the program, the compiler is allowed to assume their addresses are different.

Likewise, if it can prove that two pointers contain the same address, it is allowed to optimize the comparison to true. So am I correct in concluding that this comes down to the ordering of passes? i.e. the compiler apparently, when optimizing == knows what variables (and offsets into those) the pointers point to, but hasn't yet decided that these are zero-sized so will end up at the same address?

TBH I still think that while, yes, there will always be weird cases, this is still a particularly weird one and IMO worthy addressing one way or another. But I don't really have good arguments for why it is important. It's not like programs can start relying on any behavior. And I don't have any benchmarks to claim that there is a performance benefit. So this is really at best about minimizing the number of people we have to explain the corner-case of zero-sized variables to. And it does seem that it would require special-casing this in some way (if only by re-ordering passes or teaching the compiler to know about the same-address-optimization earlier).

So… I guess I'm inclined to close this. I don't really see how anyone's mind would be changed.

@atdiar
Copy link

atdiar commented Feb 27, 2024

@go101

Is the optimization specifically made for pointers to zero-size values?

I'd tend to think yes. As soon as a type has a size, it should force the addresses to be different. (unless unsafe things are done)

If it is, then it is totally needless.

I don't know. Seems to me that since we don't know whether the addresses will or will not be the same, they have been declared different in a first time, to mirror any other non zero-sized values.
Is it something that can be left to be evaluated at runtime instead?

@randall77
Copy link
Contributor

the compiler apparently, when optimizing == knows what variables (and offsets into those) the pointers point to, but hasn't yet decided that these are zero-sized so will end up at the same address?

The compiler knows if the things it points to are zero-sized or not. It just currently doesn't care - it performs the same logic either way (addresses of distinct variables are not equal, addresses of the same variable are equal).

It is true that if they are different zero-sized variables, the compiler doesn't necessarily yet know whether their runtime addresses will be equal or not. Some of that layout code runs pretty late. So it would be nontrivial (but maybe not that hard?) to have the constant-folding always agree with any runtime comparison.

@mknyszek mknyszek added this to the Backlog milestone Feb 28, 2024
@mdempsky
Copy link
Member

This gets reported about once a year; e.g., #23440, #47950, #52772, etc. I agree it's counter-intuitive, but I don't see any action to take within cmd/compile here, so I'm going to close this issue.

Perhaps cmd/vet should warn about taking the address of zero-sized variables and/or comparing pointers to zero-sized types. If folks are concerned about this issue, I'd recommend looking into whether either of those checks would meet the threshold for inclusion in cmd/vet.

@go101
Copy link

go101 commented Mar 1, 2024

This issue does provide new information which were not presented in the old issues.
The old ones just mentioned that the comparison results of pointers to zero-size values are not predictable.
The current issue emphasizes that the comparison results of pointers to the same specified zero-size values are inconsistent in the same run.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Development

No branches or pull requests

10 participants