Skip to content

cmd/compile: nilcheck is tightened in a branch allowing the other one to execute when it should have panicked #72860

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ggambetti opened this issue Mar 13, 2025 · 14 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@ggambetti
Copy link

Go version

go version go1.24.1 darwin/arm64

Output of go env in your module/workspace:

AR='ar'
CC='cc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='c++'
GCCGO='gccgo'
GO111MODULE=''
GOARCH='arm64'
GOARM64='v8.0'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/Users/giannigambetti/Library/Caches/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/Users/giannigambetti/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/0m/wl_xfppn5nn5gk2qcwj3ymwc0000gq/T/go-build178340693=/tmp/go-build -gno-record-gcc-switches -fno-common'
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMOD='/dev/null'
GOMODCACHE='/Users/giannigambetti/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/giannigambetti/go'
GOPRIVATE=''
GOPROXY='direct'
GOROOT='/opt/homebrew/Cellar/go/1.24.1/libexec'
GOSUMDB='off'
GOTELEMETRY='local'
GOTELEMETRYDIR='/Users/giannigambetti/Library/Application Support/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/opt/homebrew/Cellar/go/1.24.1/libexec/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.24.1'
GOWORK=''
PKG_CONFIG='pkg-config'

What did you do?

Working with a map to a pointer (map[string]*thing) a coworker remarked during a PR review that they were surprised it work, indeed the code would be expected to segfault.

I've reduced the code to a small reproduction sample which exhibits the behaviour: https://go.dev/play/p/6Jv5ePF2o4n.

The suspicious code in question (lines 19-22):

	t, ok := m[key]

	// nil pointer dereference that gets optimized to happen after the `!ok` check ... sometimes.
	valid := t.field >= 0

Computing the value of valid is expected to segfault, as t is always nil in the example snippet.

What did you see happen?

The code snippet executes successfully, printing out got: <nil> instead of segfaulting.

I checked it against all versions of Go available in go.dev/play: 1.24, 1.23, and dev branch. All of them exhibit this behaviour.

My guess is this might be a compiler optimization, because making changes to the code like calling fmt.Printf or printing out the value of valid later both cause the code to segfault as expected.

That is; I think it transforms:

v, ok := m[key]
valid := v.field >= 0
if !ok || !valid {

into

v, ok := m[key]
if !ok || !(v.field >= 0) {

Likely irrelevant, but: anecdotally 1.23 timed out while building this example several times.

What did you expect to see?

This program is expected to segfault. Computing the valid variable unconditionally would result in a dereferenced nil pointer.

This execution appears to violate the Go memory model and the order of evaluation, quoting:

Requirement 1: The memory operations in each goroutine must correspond to a correct sequential execution of that goroutine, given the values read from and written to memory. That execution must be consistent with the sequenced before relation, defined as the partial order requirements set out by the Go language specification for Go's control flow constructs as well as the order of evaluation for expressions.

Thanks to @nwchandler for pointing this apparent spec violation out to me.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Mar 13, 2025
@ianlancetaylor
Copy link
Member

Interesting. I see a segfault with gccgo and with Go versions up through 1.20. From 1.21 onward there is no segfault.

CC @golang/compiler

@ianlancetaylor
Copy link
Member

@khr Appears to be caused by https://go.dev/cl/537775 for #63657.

@Jorropo
Copy link
Member

Jorropo commented Mar 13, 2025

If no one has done so by tomorrow I'll fix this by making NilCheck return a (ptr, mem) tuple and add it back into the mem chains.
It is a side-effect I don't think we can not put it there.

@Jorropo Jorropo changed the title cmd/compile: compiler bug avoids segfault? cmd/compile: nilcheck is tightened in a branch allowing the other one to execute when it should have panicked Mar 13, 2025
@Jorropo Jorropo self-assigned this Mar 13, 2025
@randall77
Copy link
Contributor

I think a better fix is just to prevent the nil check from moving blocks during the tighten pass. CL incoming.

@randall77
Copy link
Contributor

@gopherbot please open backport issues.

@gopherbot
Copy link
Contributor

Backport issue(s) opened: #72862 (for 1.23), #72863 (for 1.24).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/657715 mentions this issue: cmd/compile: don't move nilCheck operations during tighten

@seankhliao seankhliao marked this as a duplicate of #72918 Mar 18, 2025
@dmitshur dmitshur added the NeedsFix The path to resolution is known, but the work has not been done. label Mar 18, 2025
@dmitshur dmitshur added this to the Go1.25 milestone Mar 18, 2025
@randall77
Copy link
Contributor

We've decided not to backport this change.
We've found instances of code that depend on the buggy behavior. Particularly, code that does:

ptr, err := f()
x := ptr.field
if err != nil { ... return/panic ... }
... use x ...

Where f returns a pointer and an error. f expects its callers to only use the ptr result if the error is nil. And in particular, f returns a nil pointer whenever it returns a non-nil error.

This code is incorrect, in that it dereferences the ptr result before checking the error. The Go spec says that it should nil pointer panic if f returns a nil pointer and non-nil error. But this issue hides the bug in the user's code by pushing the nil check to after the if block. So even though the user's code has a bug in it, that bug does not manifest before the fix to this issue.

Because there is probably code out there dependent on this bug, we're going to just fix this issue going forward (starting in 1.25). We don't want a point release to break anyone's (even incorrect) code without a very compelling reason.

@Jorropo
Copy link
Member

Jorropo commented Mar 18, 2025

1.25 changelog:

FIXED: the go compiler no longer fixes bugs in your code

😄

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/658955 mentions this issue: doc: document change in nil-ptr checking behavior

gopherbot pushed a commit that referenced this issue Mar 18, 2025
This could bite people during the 1.25 release, so make sure it
has good documentation in the release notes.

Update #72860

Change-Id: Ie9aaa219025a631e81ebc48461555c5fb898f43f
Reviewed-on: https://go-review.googlesource.com/c/go/+/658955
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
@Jorropo Jorropo removed their assignment Mar 19, 2025
@kaiserproger
Copy link

kaiserproger commented Mar 20, 2025

Does it work now, actually? I've tried to build from latest master, ran these snippet:

package main

import (
    "errors"
    "fmt"
)

type A struct{
    B int
}

func ptr[V any](source V, condition bool) (*V, error) {
    if condition {
        return nil, errors.New("test error")
    }

    return &source, nil
}

func main(){
    object, err := ptr(A{B: 42}, true)
    fmt.Println(object.B)
    if err != nil {
        fmt.Println("error case")
    }
}

And it compiles and falls into nil pointer panic on run, despite 3309658 says it should not.
What am I doing wrong?

Go version:

go version devel go1.25-ba50de8429 Thu Mar 20 11:23:35 2025 -0700 linux/amd64

@cherrymui
Copy link
Member

cherrymui commented Mar 20, 2025

It's the opposite: it says this program should always panic. Before the change, it (or some variant of it) may unexpectedly succeed instead of panicking.

@kaiserproger
Copy link

It's the opposite: it says this program should always panic. Before the change, it (or some variant of it) may unexpectedly succeed instead of panicking.

Oh, thank you for clarity. Seems like I misinterpret the document and the issue.
Anyway, if now program always panics here, why not to add that as a compiler error? Marking pointer dereference before nil-check as a compiler error IMO will eliminate roughly 9 of 10 panic problems.

@Jorropo
Copy link
Member

Jorropo commented Mar 21, 2025

This is radically different than this issue, this issue tl;dr is:

Hello 👋 we found some piece of code where the go compiler does not implement the go spec.

And then fixing the go compiler to implement the go spec properly.

What you are talking about is changing the go spec.

if now program always panics here, why not to add that as a compiler error?

Because a program which always panic is still a valid go program, so the compiler must compile it properly.

Marking pointer dereference before nil-check as a compiler error IMO will eliminate roughly 9 of 10 panic problems.

This is contrary to what the go spec requires, so we can't do that.

If you want to change the go spec I would be having a hard time believing this wouldn't have been proposed before but 2 minutes of searching failed to bring it up.
🤞 look below to see someone awesome linking the proposal to make pointers behave like rust's where *T would always be valid and nil would be Optional<*T> type.

https://github.com/uber-go/nilaway also exists and does pretty much exactly what you say altho warning opinion part incoming:
it is a waste of my time because it had extremely poor Signal-To-Noise ratio and were more focused on adding if v == nil { return someErr } everywhere* rather than fixing real bugs in my codebase.

*which don't actually fix bugs, just turn panics in errors

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

8 participants