Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: msan cannot handle structs with padding #26167

Open
benesch opened this issue Jul 1, 2018 · 17 comments
Open

cmd/compile: msan cannot handle structs with padding #26167

benesch opened this issue Jul 1, 2018 · 17 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@benesch
Copy link
Contributor

benesch commented Jul 1, 2018

What version of Go are you using (go version)?

go version devel +0dc814c Sat Jun 30 01:04:30 2018 +0000 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

linux/amd64

What did you do?

Attempted to dereference a C struct with padding from Go with the memory sanitizer enabled:

// msan.go

package main

/*
#include <stdlib.h>

struct s {
 	int i;
	char c;
};

struct s* mks(void) {
	struct s* s = malloc(sizeof(struct s));
	s->i = 0xdeadbeef;
	s->c = 'n';
	return s;
}
*/
import "C"
import "fmt"

func main() {
	s := *C.mks()
	fmt.Println(s.c)
}

I compiled with:

CC=clang-6.0 CXX=clang++-6.0 go build -msan -o msan msan.go

Upon execution, msan crashes spuriously:

~/go1.10/misc/cgo/testsanitizers/src/foo$ ./msan
Uninitialized bytes in __msan_check_mem_is_initialized at offset 5 inside [0x701000000000, 8)
==21637==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4e8b08  (/home/benesch/go1.10/misc/cgo/testsanitizers/src/foo/msan+0x4e8b08)

SUMMARY: MemorySanitizer: use-of-uninitialized-value (/home/benesch/go1.10/misc/cgo/testsanitizers/src/foo/msan+0x4e8b08) 
Exiting

The problem appears to be that the Go instrumentation is coarser than the C instrumentation. Only bytes 0-5 are marked as initialized by C (bytes 6-8 are padding), but Go asks msan to verify that all 8 bytes are initialized when it stores s.

In this particular example, there are two easy ways to soothe msan. The first is to remove the padding from the struct (e.g., struct s { int i; int c }). The second is to access fields within the struct without storing it into a temporary (e.g., fmt.Println(C.mks().c)). Neither of these "workarounds" are viable for real programs.

@benesch
Copy link
Contributor Author

benesch commented Jul 1, 2018

Well, this patch makes the test pass but causes a slew of problems on real programs:

--- a/src/cmd/compile/internal/gc/ssa.go
+++ b/src/cmd/compile/internal/gc/ssa.go
@@ -600,7 +600,13 @@ func (s *state) instrument(t *types.Type, addr *ssa.Value, wr bool) {
        var fn *obj.LSym
        needWidth := false
 
-       if flag_msan {
+       if flag_msan && t.Etype == TSTRUCT {
+               for _, f := range t.FieldSlice() {
+                       addr = s.newValue1I(ssa.OpOffPtr, types.NewPtr(t), f.Offset, addr)
+                       s.instrument(f.Type, addr, wr)
+               }
+               return
+       } else if flag_msan {
                fn = msanread
                if wr {
                        fn = msanwrite

Hmm.

@ALTree ALTree added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jul 1, 2018
@ALTree ALTree added this to the Go1.12 milestone Jul 1, 2018
@ALTree
Copy link
Member

ALTree commented Jul 1, 2018

I assume this also happens with go1.10 (so it's not a recent regression)?

@benesch
Copy link
Contributor Author

benesch commented Jul 1, 2018

Yep. Just verified that it happens with go1.10.3. Given that this case isn't tested by any of the sample programs in misc/cgo/testsanitizers I wouldn't be surprised if it's been broken since msan support was introduced.

@benesch
Copy link
Contributor Author

benesch commented Jul 1, 2018

Oh boy. This doesn't seem particularly easy to fix. LLVM's msan transformation (https://github.com/llvm-mirror/llvm/blob/6fed3d8070003bd65e0af3b70aaa04f7ca12260e/lib/Transforms/Instrumentation/MemorySanitizer.cpp#L1431) is more nuanced than gc's. Specifically, LLVM doesn't consider a load or a store of uninitialized memory to be problematic. It's what you do afterwards that counts. Your program won't blow up until e.g. you perform arithmetic using uninitialized memory or index into an array using uninitialized memory.

@ianlancetaylor
Copy link
Contributor

Perhaps we could modify gc to only call msanread/msanwrite for non-padding bytes when copying struct types.

@ianlancetaylor ianlancetaylor changed the title runtime/msan: cannot handle structs with padding cmd/compile: msan cannot handle structs with padding Jul 1, 2018
@benesch
Copy link
Contributor Author

benesch commented Jul 1, 2018 via email

@ianlancetaylor
Copy link
Contributor

Sorry, I didn't even know the compiler added padding fields at all. In widstruct in cmd/compile/internal/gc/align.go it just sets offsets based on alignment, implicitly adding padding.

@benesch
Copy link
Contributor Author

benesch commented Jul 1, 2018

Oh! Perhaps it's cmd/cgo that's to blame. Thanks. I'll keep digging.

@ianlancetaylor
Copy link
Contributor

Ah, yes, cmd/cgo does add padding fields to the Go version of C structs in (*typeconv).pad in cmd/cgo/gcc.go.

@benesch
Copy link
Contributor Author

benesch commented Jul 1, 2018

Ok, great. Excluding the blank identifiers that cmd/cgo writes from msan instrumentation lets Cockroach get even further in boot.

In other news, I've stumbled across an edge case that I'm not sure what to make of. Consider a program that uses an ostensibly-initialized C struct as a Go map key:

// msan.go

package main

/*
#include <stdlib.h>

struct s {
 	int i;
	char c;
};

struct s* mks(void) {
	struct s* s = malloc(sizeof(struct s));
	s->i = 0xdeadbeef;
	s->c = 'n';
	return s;
}
*/
import "C"

func main() {
	m := map[C.struct_s]struct{}{}
	m[*C.mks()] = struct{}{}
}

Go will read the uninitialized padding bytes to hash the struct for the map key. Yikes! Though perhaps a valid answer here is "don't do that."

@josharian
Copy link
Contributor

I’m AFK but I believe Syms have an IsBlank method.

@josharian
Copy link
Contributor

Go will read the uninitialized padding bytes to hash the struct for the map key.

The autogenerated hash and eq functions should ignore blank fields in structs. If they don’t, it’s a bug.

@benesch
Copy link
Contributor Author

benesch commented Jul 2, 2018

The autogenerated hash and eq functions should ignore blank fields in structs. If they don’t, it’s a bug.

Ah. Then perhaps mapassign and friends are simply reporting too coarsely to msan. Maybe here:

msanread(key, t.key.size)

@benesch
Copy link
Contributor Author

benesch commented Jul 2, 2018

I’m AFK but I believe Syms have an IsBlank method.

Yep, thanks. I've included the patch I've been experimenting with below. (I'm sure there's somewhere better to put this code.)

diff --git a/src/cmd/compile/internal/gc/ssa.go b/src/cmd/compile/internal/gc/ssa.go
index ff2b93d..234954d 100644
--- a/src/cmd/compile/internal/gc/ssa.go
+++ b/src/cmd/compile/internal/gc/ssa.go
@@ -600,6 +600,16 @@ func (s *state) instrument(t *types.Type, addr *ssa.Value, wr bool) {
        var fn *obj.LSym
        needWidth := false
 
+       if flag_msan && t.IsStruct() {
+               for _, f := range t.FieldSlice() {
+                       if !f.Sym.IsBlank() {
+                               faddr := s.newValue1I(ssa.OpOffPtr, f.Type.PtrTo(), f.Offset, addr)
+                               s.instrument(f.Type, faddr, wr)
+                       }
+               }
+               return
+       }
+
        if flag_msan {
                fn = msanread
                if wr {

@josharian
Copy link
Contributor

Then perhaps mapassign and friends are simply reporting too coarsely to msan

Looks like it. Perhaps there should be an msanreadtype that takes in type information and uses it to avoid reading blank fields.

Also, not that it matters for this conversation, but it appears to me (on my phone) that the msanread call there could be moved inside the “if map is nil” return path. Might improve perf.

The ideal way to share a patch is gerrit. Or a PR (which gets imported to gerrit).

@benesch
Copy link
Contributor Author

benesch commented Jul 2, 2018

Looks like it. Perhaps there should be an msanreadtype that takes in type information and uses it to avoid reading blank fields.

Yep, definitely an option. Or the calls to msanread could be moved closer to the actual memory reads in the hash functions.

@josharian
Copy link
Contributor

I believe the msanread call is there only to handle the case in which we don’t call the hash function, which should have its own instrumentation.

@andybons andybons modified the milestones: Go1.12, Go1.13 Feb 12, 2019
@andybons andybons modified the milestones: Go1.13, Go1.14 Jul 8, 2019
@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Status: Triage Backlog
Development

No branches or pull requests

7 participants