x/exp/shiny: memory corruption on windows #48086

Open
theinternetftw opened this issue Aug 31, 2021 · 6 comments
Labels: NeedsInvestigation (someone must examine and confirm this is a valid issue and not a duplicate of an existing one), OS-Windows


theinternetftw commented Aug 31, 2021

What version of Go are you using (go version)?

$ go version
go version go1.17 windows/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=%LocalAppData%\go-build
set GOENV=%AppData%\go\env
set GOEXE=.exe
set GOEXPERIMENT=
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=C:\dev\lang\go_path\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\dev\lang\go_path
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=C:\dev\lang\go\1.17
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=C:\dev\lang\go\1.17\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=go1.17
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=NUL
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\dev\cygwin64\tmp\go-build1333133088=/tmp/go-build -gno-record-gcc-switches

What did you do?

https://gist.github.com/theinternetftw/95151dd990fe955644275e13993ef69e

What did you expect to see?

To never see the "memory corruption!" Println

What did you see instead?

The "memory corruption" Println appears quite regularly. (As did memory corruption of large slices in my actual program)

Discussion

I have stopped this behavior by having shiny not call SendMessage outside of the main os thread.

I stopped major investigation once I fixed it for myself and didn't see any recurrence, but a few thoughts on what might be causing it that may help:

  • Perhaps the corruption is caused by waiting on SendMessage specifically when it targets a window, or when SendMessage takes some time to return a result.

  • It might be relevant that the os thread lock is permanently grabbed by the main driver thread, but lockOSThread is nonetheless called later when the syscall.Syscall family of functions is used.
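
The workaround described above, keeping SendMessage calls on the main OS thread, can be illustrated with a generic Go pattern. This is a minimal sketch, not the actual change made to shiny: the `uiThread` type and its `Loop`, `Do`, and `Stop` methods are hypothetical names, and the closure passed to `Do` stands in for whatever Win32 call must stay on the main thread. It relies on the documented behavior that calling `runtime.LockOSThread` from `init` causes `main` to run on the startup thread.

```go
package main

import (
	"fmt"
	"runtime"
)

// init runs on the startup thread. Locking here keeps the main goroutine
// on that thread for the lifetime of the program.
func init() {
	runtime.LockOSThread()
}

// uiThread (hypothetical) queues closures for the goroutine that runs Loop.
type uiThread struct {
	calls chan func()
}

func newUIThread() *uiThread {
	return &uiThread{calls: make(chan func())}
}

// Loop runs queued closures on the calling goroutine until Stop is called.
// When called from main, that goroutine is the one locked in init.
func (u *uiThread) Loop() {
	for f := range u.calls {
		f()
	}
}

// Do hands f to the Loop goroutine and blocks until f has returned.
func (u *uiThread) Do(f func()) {
	done := make(chan struct{})
	u.calls <- func() {
		defer close(done)
		f()
	}
	<-done
}

// Stop ends Loop. No further Do calls are allowed afterwards.
func (u *uiThread) Stop() {
	close(u.calls)
}

func main() {
	ui := newUIThread()
	go func() {
		defer ui.Stop()
		for i := 0; i < 3; i++ {
			i := i
			// Stand-in for a call that must not leave the main thread.
			ui.Do(func() { fmt.Println("call", i, "ran on the Loop goroutine") })
		}
	}()
	ui.Loop()
}
```

A real Windows driver already runs a message pump on that thread, so the channel loop here would have to be integrated with it; the sketch only shows the hand-off. Calling `Do` from inside a closure that is already running on the loop would deadlock.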

gopherbot added this to the Unreleased milestone on Aug 31, 2021
cherrymui added the NeedsInvestigation and OS-Windows labels on Sep 1, 2021
@cherrymui
Member

> It might be relevant that the os thread lock is permanently grabbed by the main driver thread, but lockOSThread is nonetheless called later when the syscall.Syscall family of functions is used.

lockOSThread locks the goroutine to the current OS thread, which is not necessarily the main thread. So the lockOSThread around syscall.Syscall does not guarantee the syscall is made on the main thread.

@cherrymui
Member

cc @nigeltao
cc @alexbrainman @bufflig for Windows.

@nigeltao
Contributor

nigeltao commented Sep 3, 2021

I really don't know much about Windows programming generally and SendMessage specifically, so I'll leave this for others.

@alexbrainman
Member

@theinternetftw

I built your program using the current tip of both the main Go repo and golang.org/x/exp. I ran your program 20 times and did not see the memory corruption message printed even once.

What am I doing wrong?

Thank you.

Alex

@theinternetftw
Author

theinternetftw commented Sep 4, 2021

There's a comment at the top of the code that shows a script to restart the program in a loop continuously until it errors out. It was by using such a script that it took around a minute to trigger the bug.

In that comment I gave a bash script for running the program in a loop (I'm using cygwin). The batch file version would be:

@echo off
:loop
.\repro.exe
if %errorlevel% == 1 (exit)
goto loop

Today, on two different computers (both Windows 10 version 19042.1165), after compiling with gotip, the fastest it printed the message was 2s, the slowest was 5m50s, and the average was around 3m. That is several times slower than when I tested it just before originally posting this bug, for whatever reason.

( Edit: I had a paragraph here about a version of the repro code that triggers the bug faster, but I just discovered that while on one computer, it triggers the bug reliably within 10 seconds, on a different computer it's not triggered faster at all. I'll leave it here for completeness, but it may or may not be useful to you: http://theinternetftw.com/code/shinyrepro2.zip )

Cheers.

@alexbrainman
Member

> The batch file version would be:

@theinternetftw thank you very much for your instructions. Indeed I can reproduce your problem.

Unfortunately I don't know what the problem is, so I am leaving it for others to decide what to do here.

Alex
