runtime: allow reading sysfs, etc with ignoring event scanning error #30817

Closed
stgraber opened this issue Mar 13, 2019 · 7 comments
Labels: FrozenDueToAge, NeedsFix (the path to resolution is known, but the work has not been done)

@stgraber

What version of Go are you using (go version)?

go version devel +870cfe6484 Wed Mar 13 21:44:45 2019 +0000 linux/amd64

Does this issue reproduce with the latest release?

No, this is only reproducible with master; 1.10, 1.11 and 1.12 are all unaffected.

What operating system and processor architecture are you using (go env)?

Linux buildd01 4.15.0-46-generic #49-Ubuntu SMP Wed Feb 6 09:33:07 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

go env Output
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/root/go"
GOPROXY=""
GORACE=""
GOROOT="/lxc-ci/build/cache/gimme/versions/go"
GOTMPDIR=""
GOTOOLDIR="/lxc-ci/build/cache/gimme/versions/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build092171511=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Attempted multiple parallel reads of the same file from /sys

package main

import (
	"fmt"
	"io/ioutil"
	"sync"
	"time"
)

func main() {
	wg := sync.WaitGroup{}

	readSysfs := func() {
		for i := 0; i < 100; i++ {
			_, err := ioutil.ReadFile("/sys/devices/system/cpu/cpu0/topology/core_id")
			if err != nil {
				fmt.Printf("error: %v\n", err)
			}
			time.Sleep(100 * time.Millisecond)
		}
		wg.Done()
	}

	for i := 0; i < 4; i++ {
		wg.Add(1)
		go readSysfs()
	}

	wg.Wait()
}

What did you expect to see?

No errors, all reads succeeding as they do with all other tested Go versions

What did you see instead?

root@buildd01:~# /lxc-ci/build/cache/gimme/versions/go/bin/go run test.go 
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable

Tests so far show about a 15% failure rate on that ReadFile call.

@stgraber
Author

We've seen this pop up in LXD's automated testing over the past two days or so, which suggests a fairly recent regression.

@tmthrgd
Contributor

tmthrgd commented Mar 13, 2019

This is possibly caused by CL 166497 for #30624.

@mikioh changed the title from "Current Go master breaks reading files from /sys" to "runtime: allow reading sysfs, etc with ignoring event scanning error" on Mar 14, 2019
@mikioh added this to the Go1.13 milestone on Mar 14, 2019
@mikioh
Contributor

mikioh commented Mar 14, 2019

Thanks for the report. In general, it is better to use your own polling to read sysfs-like virtual files, because the runtime-integrated network poller never uses EPOLLPRI, or the EPOLLPRI+EPOLLERR pair, which can carry a special signal such as actual data reception from the underlying device. For backward compatibility, I'll make the poller a bit more conservative so that most user-configured files ignore event scanning errors, except for cases like a misconfigured /dev/net/tun that would block subsequent I/O calls forever.

@ianlancetaylor, any opinion?
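
To make the "own polling" suggestion above concrete, here is a minimal sketch, assuming Linux and only the standard `syscall` package. It opens the file with raw system calls, so the descriptor never reaches the runtime poller, and it registers the descriptor with a private epoll instance for EPOLLPRI and EPOLLERR. The helper name `waitFor` and the 100 ms timeout are illustrative assumptions, not anything taken from the Go runtime or from this thread.

```go
package main

import (
	"fmt"
	"syscall"
)

// waitFor is a hypothetical helper: it waits until one of the requested
// epoll events fires on fd or timeoutMs milliseconds pass. It returns the
// reported event mask, or 0 if the wait timed out.
func waitFor(fd int, want uint32, timeoutMs int) (uint32, error) {
	epfd, err := syscall.EpollCreate1(syscall.EPOLL_CLOEXEC)
	if err != nil {
		return 0, err
	}
	defer syscall.Close(epfd)

	ev := syscall.EpollEvent{Events: want, Fd: int32(fd)}
	if err := syscall.EpollCtl(epfd, syscall.EPOLL_CTL_ADD, fd, &ev); err != nil {
		return 0, err
	}

	events := make([]syscall.EpollEvent, 1)
	for {
		n, err := syscall.EpollWait(epfd, events, timeoutMs)
		if err == syscall.EINTR {
			continue // interrupted by a signal; wait again
		}
		if err != nil {
			return 0, err
		}
		if n == 0 {
			return 0, nil // timed out
		}
		return events[0].Events, nil
	}
}

func main() {
	// Path taken from the report above.
	const path = "/sys/devices/system/cpu/cpu0/topology/core_id"

	// Raw open: no *os.File, so the runtime poller is not involved.
	fd, err := syscall.Open(path, syscall.O_RDONLY|syscall.O_CLOEXEC, 0)
	if err != nil {
		fmt.Println("open:", err)
		return
	}
	defer syscall.Close(fd)

	mask, err := waitFor(fd, syscall.EPOLLPRI|syscall.EPOLLERR, 100)
	if err != nil {
		fmt.Println("epoll:", err)
		return
	}
	fmt.Printf("event mask: %#x\n", mask)
}
```

The caller decides what a given combination of bits means for its particular file, which is the part the shared runtime poller cannot know.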

@ianlancetaylor
Contributor

I don't know enough to have an opinion. Why would the poller sometimes return EPOLLERR here?

@ianlancetaylor added the NeedsFix label on Mar 15, 2019
@mikioh
Contributor

mikioh commented Mar 15, 2019

For example, on some special files, EPOLLIN+EPOLLOUT indicates that the underlying device is ready for operation, and EPOLLPRI+EPOLLERR indicates actual data reception. Other files use other combinations. Fortunately, by convention we can treat a lone POLLERR, EPOLLERR or EV_ERROR as a critical state, just as a mask with all events set is treated as the end of a session; see https://go-review.googlesource.com/c/go/+/167777
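
The convention described above can be sketched as a small predicate. This is an illustration of the idea only, not the code from the linked change: the name `isCriticalPollError` is hypothetical, and the constants are declared locally with the values Linux uses for the corresponding epoll bits so that the sketch has no platform-specific imports.

```go
package main

import "fmt"

// Event bits with the same values Linux uses for epoll.
const (
	epollIn  uint32 = 0x1
	epollPri uint32 = 0x2
	epollOut uint32 = 0x4
	epollErr uint32 = 0x8
)

// isCriticalPollError reports whether an event mask should be treated as a
// critical polling error. Under the convention sketched here, that is the
// case only when the error bit is the sole bit set; an error bit combined
// with other bits (for example EPOLLPRI+EPOLLERR) is left to the caller.
func isCriticalPollError(events uint32) bool {
	return events == epollErr
}

func main() {
	samples := []struct {
		name string
		mask uint32
	}{
		{"EPOLLERR", epollErr},
		{"EPOLLPRI+EPOLLERR", epollPri | epollErr},
		{"EPOLLIN+EPOLLOUT", epollIn | epollOut},
	}
	for _, s := range samples {
		fmt.Printf("%-18s critical=%v\n", s.name, isCriticalPollError(s.mask))
	}
}
```

An exact comparison rather than a bit test is the point of the sketch: a bit test would also flag the EPOLLPRI+EPOLLERR combination that some special files use as a normal signal.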

@gopherbot

Change https://golang.org/cl/167777 mentions this issue: runtime, internal/poll: report only critical event scanning error

@mvdan
Member

mvdan commented Mar 17, 2019

I hit this while trying to use https://github.com/aclements/perflock on a recent Go build. I was scratching my head for a good fifteen minutes until I realised it wasn't setting the right CPU frequency because of these read errors.

@mikioh removed the OS-Linux label on Mar 18, 2019
@golang locked and limited conversation to collaborators on Mar 18, 2020