Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/race: document and fix "trap when race detected" #23577

Open
object88 opened this issue Jan 27, 2018 · 22 comments
Open

runtime/race: document and fix "trap when race detected" #23577

object88 opened this issue Jan 27, 2018 · 22 comments
Labels
Documentation help wanted NeedsFix The path to resolution is known, but the work has not been done. Proposal-Accepted RaceDetector
Milestone

Comments

@object88
Copy link

This is not so much a bug, as an ask.

When using the -race option for go test, I would like to see the arguments that are provided at the call site, down the stack, when a data race is detected. Currently, I only see the methods themselves. If I omit the -race option, I can see the data race as reported by a map concurrency problem, and this does include call parameters.

In my particular example, I am having trouble with a map that the go/types#Checker method uses. The concurrency issue is happening deep in the Go code, rather than something that I have immediate control over. (Note: I do not think that this is a problem in the Go type checker library at this point; no reason to. I'm sure I'm doing something wrong on my side.) I have many goroutines running a type checker at the same time. If I had parameters on the call stack, I could tell a lot more about the situation that causes the issue.

What version of Go are you using (go version)?

% go version
go version go1.9.3 darwin/amd64

Does this issue reproduce with the latest release?

I have not tried with the 1.10 beta.

What operating system and processor architecture are you using (go env)?

% go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/bropa18/work"
GORACE=""
GOROOT="/usr/local/Cellar/go/1.9.3/libexec"
GOTOOLDIR="/usr/local/Cellar/go/1.9.3/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/hx/fh6ygqmx4ls2cs9djgk5bhkm3nt0vs/T/go-build080984833=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"

What did you do?

Unfortunately, I don't have a small working example of this scenario. I found this while debugging a race condition in my... um, complex program. I can try to create a more minimal example at a later date, if the team thinks it's helpful.

What did you expect to see?

I would like to see the parameters in the call stack. This example from the map's concurrency checker is a great example:

fatal error: concurrent map read and map write

goroutine 1506 [running]:
runtime.throw(0x1299de0, 0x21)
	/usr/local/Cellar/go/1.9.3/libexec/src/runtime/panic.go:605 +0x95 fp=0xc43034cec8 sp=0xc43034cea8 pc=0x102b495
runtime.mapaccess1_faststr(0x124cbc0, 0xc42de16f90, 0xc43806c3c8, 0x1, 0xc438094f00)
	/usr/local/Cellar/go/1.9.3/libexec/src/runtime/hashmap_fast.go:217 +0x43a fp=0xc43034cf20 sp=0xc43034cec8 pc=0x100b99a
go/types.(*Scope).Lookup(...)
	/usr/local/Cellar/go/1.9.3/libexec/src/go/types/scope.go:71
go/types.(*Checker).selector(0xc42a704fc0, 0xc4382ca180, 0xc42d8b0820)
	/usr/local/Cellar/go/1.9.3/libexec/src/go/types/call.go:287 +0x1167 fp=0xc43034d210 sp=0xc43034cf20 pc=0x1184f27
go/types.(*Checker).typExprInternal(0xc42a704fc0, 0x13c83a0, 0xc42d8b0820, 0x0, 0x0, 0x0, 0x0, 0xc428b8f400, 0x145f6c8)
	/usr/local/Cellar/go/1.9.3/libexec/src/go/types/typexpr.go:234 +0x9f3 fp=0xc43034d3c0 sp=0xc43034d210 pc=0x11b4e43
go/types.(*Checker).typExpr(0xc42a704fc0, 0x13c83a0, 0xc42d8b0820, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/Cellar/go/1.9.3/libexec/src/go/types/typexpr.go:132 +0x89 fp=0xc43034d418 sp=0xc43034d3c0 pc=0x11b39f9
go/types.(*Checker).typ(0xc42a704fc0, 0x13c83a0, 0xc42d8b0820, 0x1, 0xa1970dd1f3)
	/usr/local/Cellar/go/1.9.3/libexec/src/go/types/typexpr.go:140 +0x63 fp=0xc43034d470 sp=0xc43034d418 pc=0x11b3b03
go/types.(*Checker).typExprInternal(0xc42a704fc0, 0x13c8460, 0xc42d8b0840, 0x0, 0x0, 0x0, 0x0, 0xc438094f00, 0xc430505060)
	/usr/local/Cellar/go/1.9.3/libexec/src/go/types/typexpr.go:276 +0x913 fp=0xc43034d620 sp=0xc43034d470 pc=0x11b4d63
go/types.(*Checker).typExpr(0xc42a704fc0, 0x13c8460, 0xc42d8b0840, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/Cellar/go/1.9.3/libexec/src/go/types/typexpr.go:132 +0x89 fp=0xc43034d678 sp=0xc43034d620 pc=0x11b39f9
go/types.(*Checker).typ(0xc42a704fc0, 0x13c8460, 0xc42d8b0840, 0xc420025300, 0x13ca860)
	/usr/local/Cellar/go/1.9.3/libexec/src/go/types/typexpr.go:140 +0x63 fp=0xc43034d6d0 sp=0xc43034d678 pc=0x11b3b03
go/types.(*Checker).collectParams(0xc42a704fc0, 0xc43823ef00, 0xc4380633b0, 0xc43823ef01, 0x0, 0x0, 0x0, 0x0)
	/usr/local/Cellar/go/1.9.3/libexec/src/go/types/typexpr.go:408 +0x4c2 fp=0xc43034d838 sp=0xc43034d6d0 pc=0x11b6372
go/types.(*Checker).funcType(0xc42a704fc0, 0xc438218cf0, 0x0, 0xc42d8b0e20)
	/usr/local/Cellar/go/1.9.3/libexec/src/go/types/typexpr.go:149 +0x1e0 fp=0xc43034d948 sp=0xc43034d838 pc=0x11b3d10
go/types.(*Checker).funcDecl(0xc42a704fc0, 0xc438095220, 0xc42fdcff20)
	/usr/local/Cellar/go/1.9.3/libexec/src/go/types/decl.go:333 +0xe2 fp=0xc43034d9e8 sp=0xc43034d948 pc=0x1189a92
go/types.(*Checker).objDecl(0xc42a704fc0, 0x13ca860, 0xc438095220, 0x0, 0xc43034db28, 0x0, 0x8)
	/usr/local/Cellar/go/1.9.3/libexec/src/go/types/decl.go:87 +0x38b fp=0xc43034dad8 sp=0xc43034d9e8 pc=0x11883bb
go/types.(*Checker).packageObjects(0xc42a704fc0, 0xc43219e400, 0x88, 0x90)
	/usr/local/Cellar/go/1.9.3/libexec/src/go/types/resolver.go:463 +0x127 fp=0xc43034db78 sp=0xc43034dad8 pc=0x11a5207
go/types.(*Checker).checkFiles(0xc42a704fc0, 0xc4380e6640, 0x3, 0x3, 0x0, 0x0)
	/usr/local/Cellar/go/1.9.3/libexec/src/go/types/check.go:239 +0xbd fp=0xc43034dbc0 sp=0xc43034db78 pc=0x1185e9d
go/types.(*Checker).Files(0xc42a704fc0, 0xc4380e6640, 0x3, 0x3, 0x1, 0x1)
	/usr/local/Cellar/go/1.9.3/libexec/src/go/types/check.go:230 +0x49 fp=0xc43034dc00 sp=0xc43034dbc0 pc=0x1185db9
github.com/object88/langd.(*Loader).processComplete(0xc4200a0120, 0xc4234d0500)
	/Users/bropa18/work/src/github.com/object88/langd/loader.go:444 +0x5b5 fp=0xc43034deb0 sp=0xc43034dc00 pc=0x11fc465
github.com/object88/langd.(*Loader).processStateChange(0xc4200a0120, 0xc4242e6030, 0x30)
	/Users/bropa18/work/src/github.com/object88/langd/loader.go:360 +0x75f fp=0xc43034dfc8 sp=0xc43034deb0 pc=0x11fbcef
runtime.goexit()
	/usr/local/Cellar/go/1.9.3/libexec/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc43034dfd0 sp=0xc43034dfc8 pc=0x105af41
created by github.com/object88/langd.(*Loader).Start.func1
	/Users/bropa18/work/src/github.com/object88/langd/loader.go:242 +0x15d

In particular, I can see the parameters to my processComplete method:

github.com/object88/langd.(*Loader).processComplete(0xc4200a0120, 0xc4234d0500)

By seeing the parameters, I can see whether or not my processComplete method is getting called with the same parameters simultaneously.

What did you see instead?

==================
WARNING: DATA RACE
Read at 0x00c44156b380 by goroutine 55:
  runtime.mapaccess1_faststr()
      /usr/local/Cellar/go/1.9.3/libexec/src/runtime/hashmap_fast.go:208 +0x0
  go/types.(*Checker).selector()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/scope.go:71 +0x1af1
  go/types.(*Checker).exprInternal()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/expr.go:1227 +0x2be0
  go/types.(*Checker).rawExpr()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/expr.go:959 +0x91
  go/types.(*Checker).exprOrType()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/expr.go:1573 +0x6c
  go/types.(*Checker).call()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/call.go:15 +0x8a
  go/types.(*Checker).exprInternal()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/expr.go:1424 +0x2a39
  go/types.(*Checker).rawExpr()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/expr.go:959 +0x91
  go/types.(*Checker).multiExpr()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/expr.go:1530 +0x72
  go/types.(*Checker).expr()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/expr.go:1524 +0x56
  go/types.(*Checker).varDecl()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/decl.go:166 +0x3bc
  go/types.(*Checker).objDecl()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/decl.go:81 +0x461
  go/types.(*Checker).packageObjects()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/resolver.go:463 +0x17e
  go/types.(*Checker).checkFiles()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/check.go:239 +0xca
  go/types.(*Checker).Files()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/check.go:230 +0x56
  github.com/object88/langd.(*Loader).processComplete()
      /Users/bropa18/work/src/github.com/object88/langd/loader.go:444 +0x830
  github.com/object88/langd.(*Loader).processStateChange()
      /Users/bropa18/work/src/github.com/object88/langd/loader.go:347 +0x3d2

I can observe that processComplete was invoked, but it's missing context:

github.com/object88/langd.(*Loader).processComplete()

Without the parameter information, I don't know whether or not the same data is being passed into the method.

@dgryski
Copy link
Contributor

dgryski commented Jan 27, 2018

/cc @dvyukov

@dvyukov
Copy link
Member

dvyukov commented Jan 27, 2018

I guess we could use the standard printing routine for the first stack. Race runtime would need to call back to ask for the whole current stack trace.

@object88
Copy link
Author

object88 commented Jan 27, 2018

I can't speak to the breadth of use cases, of course, but for the circumstance that I noticed this, the parameters would only be helpful if provided for both stacks in the conflict. In my particular case, I wanted to know if the calls into my processComplete were invoked with different parameters; it would tell me if the problem was that I was mismanaging the timing around the calls to processComplete, or whether I was mismanaging the shared objects involved in that call.

Aside, I am curious about how the race detector is implemented, but I didn't find the text "DATA RACE" in github.com/golang/go. Is there another repository that this lives in?

@odeke-em
Copy link
Member

@object88 thank you for the report and for asking about the runtime race detector. You might find this useful

runtime/race package contains the data race detector runtime library.
It is based on ThreadSanitizer race detector, that is currently a part of
the LLVM project (http://llvm.org/git/compiler-rt.git).
To update the .syso files use golang.org/x/build/cmd/racebuild.
Current runtime is built on rev 68e1532492f9b3fce0e9024f3c31411105965b11.

and as the doc there says, please take a look at https://github.com/golang/build/tree/master/cmd/racebuild aka package golang.org/x/build/cmd/racebuild

@odeke-em odeke-em changed the title Include parameters to DATA RACE stack output from go test -race proposal: runtime/race: propagate arguments from call site down to the stack during race detection Jan 27, 2018
@gopherbot gopherbot added this to the Proposal milestone Jan 27, 2018
@dvyukov
Copy link
Member

dvyukov commented Jan 27, 2018

Providing arguments for both stacks is infeasible.
Race detector memorizes stacks for all memory accesses ever happened in the execution (not possible to know ahead of time if any of them will lead to a race in future or not). Currently race detector spends 8 bytes per stack, and that's still lots of memory overall. Memorizing several hundreds of bytes per memory access is infeasible unfortunately.

@object88
Copy link
Author

Out of curiosity, is that 8 bytes per stack, or 8 bytes per stack frame?

Regardless, I hear that there's a very large memory cost involved. Rather than add to the cost of maintaining the parameters, etc., is there a feasible way (generally speaking, not necessarily using what currently exists) to read the stack from another goroutine?

@dvyukov
Copy link
Member

dvyukov commented Jan 28, 2018

Out of curiosity, is that 8 bytes per stack, or 8 bytes per stack frame?

8 bytes per full stack trace.

@dvyukov
Copy link
Member

dvyukov commented Jan 28, 2018

is there a feasible way (generally speaking, not necessarily using what currently exists) to read the stack from another goroutine?

runtime.GoroutineProfile does this.
But note that in our case the other goroutine is already in a completely different stack that is unrelated to the race. So I don't see how it can help here.

@object88
Copy link
Author

But note that in our case the other goroutine is already in a completely different stack that is unrelated to the race.

Do you mean that the race detector does not cause the program to come to a complete pause across all goroutines, so that even if we were to inspect the other goroutine, it's in all likelihood have changed state to such a degree that we may not get any useful information, or at least not with any consistency?

@dvyukov
Copy link
Member

dvyukov commented Jan 28, 2018

Generally there is no point in time when both racing goroutines are at these stacks at the same time, so it does not matter if it pauses or not.

@object88
Copy link
Author

I'm a little confused; I wouldn't expect the racing goroutines to have identical stacks. What I have observed is that while they have the same small number of initial stack frames, the execution has diverged at the time of the race. I didn't include the whole data race dump initially for brevity's sake, so perhaps I haven't been clear. Here is a full example:

==================
WARNING: DATA RACE
Write at 0x00c441f6d200 by goroutine 145:
  runtime.mapassign_faststr()
      /usr/local/Cellar/go/1.9.3/libexec/src/runtime/hashmap_fast.go:774 +0x0
  go/types.(*Scope).Insert()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/scope.go:106 +0x136
  go/types.(*Checker).declare()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/decl.go:28 +0x98
  go/types.(*Checker).declarePkgObj()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/resolver.go:114 +0x144
  go/types.(*Checker).collectObjects()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/resolver.go:383 +0x9e2
  go/types.(*Checker).checkFiles()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/check.go:237 +0xae
  go/types.(*Checker).Files()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/check.go:230 +0x56
  github.com/object88/langd.(*Loader).processComplete()
      /Users/bropa18/work/src/github.com/object88/langd/loader.go:452 +0x813
  github.com/object88/langd.(*Loader).processStateChange()
      /Users/bropa18/work/src/github.com/object88/langd/loader.go:360 +0xa7a

Previous read at 0x00c441f6d200 by goroutine 46:
  runtime.mapaccess1_faststr()
      /usr/local/Cellar/go/1.9.3/libexec/src/runtime/hashmap_fast.go:208 +0x0
  go/types.(*Checker).selector()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/scope.go:71 +0x1af1
  go/types.(*Checker).typExprInternal()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/typexpr.go:234 +0xed4
  go/types.(*Checker).typExpr()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/typexpr.go:132 +0x96
  go/types.(*Checker).typ()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/typexpr.go:140 +0x74
  go/types.(*Checker).typExprInternal()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/typexpr.go:276 +0xd1e
  go/types.(*Checker).typExpr()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/typexpr.go:132 +0x96
  go/types.(*Checker).typ()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/typexpr.go:140 +0x74
  go/types.(*Checker).varDecl()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/decl.go:143 +0x51f
  go/types.(*Checker).declStmt()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/decl.go:416 +0xcaf
  go/types.(*Checker).stmt()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/stmt.go:307 +0x5c83
  go/types.(*Checker).stmtList()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/stmt.go:110 +0x95
  go/types.(*Checker).funcBody()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/stmt.go:42 +0x2f4
  go/types.(*Checker).functionBodies()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/resolver.go:476 +0x122
  go/types.(*Checker).checkFiles()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/check.go:241 +0xd8
  go/types.(*Checker).Files()
      /usr/local/Cellar/go/1.9.3/libexec/src/go/types/check.go:230 +0x56
  github.com/object88/langd.(*Loader).processComplete()
      /Users/bropa18/work/src/github.com/object88/langd/loader.go:452 +0x813
  github.com/object88/langd.(*Loader).processStateChange()
      /Users/bropa18/work/src/github.com/object88/langd/loader.go:360 +0xa7a

Goroutine 145 (running) created at:
  github.com/object88/langd.(*Loader).Start.func1()
      /Users/bropa18/work/src/github.com/object88/langd/loader.go:242 +0x197

Goroutine 46 (running) created at:
  github.com/object88/langd.(*Loader).Start.func1()
      /Users/bropa18/work/src/github.com/object88/langd/loader.go:242 +0x197
==================

@rsc
Copy link
Contributor

rsc commented Jan 29, 2018

@dvyukov, is there a way to run the program so it crashes at the point where the race is detected? Then at least someone can run under a debugger and get a stack trace for the live half of the race.

@dvyukov
Copy link
Member

dvyukov commented Jan 30, 2018

GORACE="abort_on_error=1 halt_on_error=1" halts execution immediately. However, runtime seems to fail to unwind race runtime and does not show the racy stack:

==================
WARNING: DATA RACE
Read at 0x00c42014e018 by goroutine 7:
  runtime.slicecopy()
      src/runtime/slice.go:192 +0x0
  command-line-arguments_test.TestRaceSliceStruct.func1()
      src/runtime/race/testdata/mop_test.go:1749 +0xae

Previous write at 0x00c42014e018 by goroutine 6:
  command-line-arguments_test.TestRaceSliceStruct()
      src/runtime/race/testdata/mop_test.go:1752 +0xda
  testing.tRunner()
      src/testing/testing.go:777 +0x16d

Goroutine 7 (running) created at:
  command-line-arguments_test.TestRaceSliceStruct()
      src/runtime/race/testdata/mop_test.go:1747 +0xbd
  testing.tRunner()
      src/testing/testing.go:777 +0x16d

Goroutine 6 (running) created at:
  testing.(*T).Run()
      src/testing/testing.go:824 +0x564
  testing.runTests.func1()
      src/testing/testing.go:1063 +0xa4
  testing.tRunner()
      src/testing/testing.go:777 +0x16d
  testing.runTests()
      src/testing/testing.go:1061 +0x4e1
  testing.(*M).Run()
      src/testing/testing.go:978 +0x2cd
  main.main()
      _testmain.go:298 +0x22a
==================
SIGABRT: abort
PC=0x7f7bb8e8ac37 m=0 sigcode=18446744073709551610

goroutine 36 [running]:
runtime: unknown pc 0x7f7bb8e8ac37
stack: frame={sp:0x7fff901b1118, fp:0x0} stack=[0xc42010a800,0xc42010b000)

runtime: unknown pc 0x7f7bb8e8ac37
stack: frame={sp:0x7fff901b1118, fp:0x0} stack=[0xc42010a800,0xc42010b000)

created by command-line-arguments_test.TestRaceSliceStruct
	src/runtime/race/testdata/mop_test.go:1747 +0xbe

goroutine 1 [chan receive]:
testing.(*T).Run(0xc42014a000, 0x5de902, 0x13, 0x5e5b08, 0xc4200bdc01)
	src/testing/testing.go:825 +0x597
testing.runTests.func1(0xc42014a000)
	src/testing/testing.go:1063 +0xa5
testing.tRunner(0xc42014a000, 0xc4200bddb0)
	src/testing/testing.go:777 +0x16e
testing.runTests(0xc420110040, 0x6d22e0, 0x81, 0x81, 0xc420132000)
	src/testing/testing.go:1061 +0x4e2
testing.(*M).Run(0xc420132000, 0x0)
	src/testing/testing.go:978 +0x2ce
main.main()
	_testmain.go:298 +0x22b

goroutine 35 [chan receive]:
command-line-arguments_test.TestRaceSliceStruct(0xc42014a0f0)
	src/runtime/race/testdata/mop_test.go:1753 +0xff
testing.tRunner(0xc42014a0f0, 0x5e5b08)
	src/testing/testing.go:777 +0x16e
created by testing.(*T).Run
	src/testing/testing.go:824 +0x565

@dvyukov
Copy link
Member

dvyukov commented Jan 30, 2018

I wouldn't expect the racing goroutines to have identical stacks.

They don't. Did I say otherwise somewhere?

@object88
Copy link
Author

No, I misunderstood, my apologies.

@rsc
Copy link
Contributor

rsc commented Feb 5, 2018

It seems like we should:

  1. Document GORACE="abort_on_error=1 halt_on_error=1".
  2. Make the stack trace from the runtime work.
  3. Make sure that delve can get a stack trace too.

@rsc rsc changed the title proposal: runtime/race: propagate arguments from call site down to the stack during race detection runtime/race: document and fix "trap when race detected" Feb 5, 2018
@rsc rsc added NeedsFix The path to resolution is known, but the work has not been done. and removed Documentation labels Feb 5, 2018
@rsc rsc modified the milestones: Proposal, Go1.11 Feb 5, 2018
@object88
Copy link
Author

object88 commented Feb 7, 2018

I would like to participate in this work, even if it is just documentation; what's a good way to go about contributing?

@ianlancetaylor
Copy link
Contributor

The docs for -race at https://golang.org/cmd/go point to https://golang.org/doc/articles/race_detector.html, which does in fact document halt_on_error but does not document abort_on_error. We should document abort_on_error in doc/articles/race_detector.html in the Go repo.

We should also make sure that it reports the correct stack trace; above, @dvyukov says that it does not. I don't know what's involved there.

For general contribution instructions, see https://golang.org/doc/contribute.html.

Thanks.

@dvyukov
Copy link
Member

dvyukov commented Feb 7, 2018

We should also make sure that it reports the correct stack trace; above, @dvyukov says that it does not. I don't know what's involved there.

Race runtime raises SIGABRT inside of itself on partially switched Go stack (see comment on runtime·racecallbackthunk). It maybe hard to unwind from there. The simplest way to achieve this would probably be to call back into Go runtime from race runtime, and then Go runtime can print perfectly valid stack and terminate.
However, updating race runtime involves rebuilding all syso files, but last time I looked racebuild utility was broken (something with gomote). I wrote something about this somewhere, but did not hear back as far as I remember. I don't know what's the state of racebuild now.

@dgryski
Copy link
Contributor

dgryski commented Feb 7, 2018

@dvyukov Aside from the fact that the tooling is currently broken, are there any fixes in the upstream race detector that we would want to import or is it relatively stable?

@josharian
Copy link
Contributor

Not urgent, but it would be nice to pull in the fix for #20139.

@dvyukov
Copy link
Member

dvyukov commented Feb 7, 2018

Nothing particularly important.
I forgot about #20139, but, yes, would be nice to pull for some workloads.
But generally I used to update race runtime once per release to keep things under control, there are usually some issues accumulate over time. There is now also #23731 and it would be nice to separate general race runtime update from ppc work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation help wanted NeedsFix The path to resolution is known, but the work has not been done. Proposal-Accepted RaceDetector
Projects
None yet
Development

No branches or pull requests

8 participants