context: (*cancelCtx).Done() channel type assertion error #60681

Closed
hangpark opened this issue Jun 8, 2023 · 7 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

@hangpark

hangpark commented Jun 8, 2023

What version of Go are you using (go version)?

$ go version
go version go1.19.9 linux/arm64

Does this issue reproduce with the latest release?

This issue appears sporadically and a consistent method of reproduction has not yet been identified.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_arm64"
GOVCS=""
GOVERSION="go1.19.9"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -pthread -fno-caret-diagnostics -Qunused-arguments -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3855752231=/tmp/go-build -gno-record-gcc-switches"

What did you do?

We're building our server application using the golang:1.19-alpine3.17 Docker image and running it on alpine:3.17, as outlined in the attached Dockerfile:

Dockerfile

XXXXX marks masked content.

# syntax=docker/dockerfile:1
FROM golang:1.19-alpine3.17 AS builder
RUN apk add --update --no-cache gcc g++ make git openssh-client \
  && mkdir -p -m 0600 ~/.ssh \
  && ssh-keyscan github.com >> ~/.ssh/known_hosts \
  && git config --global url."git@github.com:XXXXX/".insteadOf "https://github.com/XXXXX/" \
  && go env -w GOPRIVATE=github.com/XXXXX
WORKDIR /app
COPY ./go.mod ./go.sum ./Makefile ./
RUN --mount=type=ssh make init -o init-lint
COPY ./ ./
RUN make -o init

FROM alpine:3.17
ENV TZ=XXXXX
RUN apk add --update --no-cache ca-certificates tzdata \
  && cp /usr/share/zoneinfo/$TZ /etc/localtime \
  && echo $TZ > /etc/timezone
WORKDIR /app
COPY --from=builder /app/bin/ ./
COPY ./configs/ ./configs/
CMD ["./XXXXX"]
EXPOSE 80

The application is an HTTP server based on gin v1.7.7, handling an average of 1K RPM (requests per minute) per container in our AWS EKS setup.

Our deployment runs liveness and readiness health checks against each pod. Each check sends a simple HTTP request and expects a 200 OK response; the handler behind it returns 200 OK without any business-related processing. The application also has numerous endpoints for actual business operations.

Issue Description

Since May 31, we've encountered an error several times, as shown in the stack trace below.

*runtime.TypeAssertionError: interface conversion: interface {} is , not chan struct {}
  File "/go/pkg/mod/github.com/XXXXX/XXXXX@v1.0.25/pkg/httpserver/middleware/sentry.go", line 58, in Sentry.func1.1.1
  File "/usr/local/go/src/context/context.go", line 361, in (*cancelCtx).Done
  File "/go/pkg/mod/github.com/XXXXX/XXXXX@v1.0.25/pkg/httpserver/middleware/timeout.go", line 81, in Timeout.func2.1
  File "/go/pkg/mod/github.com/XXXXX/XXXXX@v1.0.25/pkg/httpserver/middleware/sentry.go", line 75, in Sentry.func1.1
  File "/go/pkg/mod/github.com/XXXXX/XXXXX@v1.0.25/pkg/httpserver/middleware/recovery.go", line 23, in Recovery.func1.1
  File "/go/pkg/mod/github.com/XXXXX/XXXXX@v1.0.25/pkg/httpserver/middleware/wraperror.go", line 17, in WrapError.func1.1
  File "/go/pkg/mod/github.com/XXXXX/XXXXX@v1.0.25/pkg/httpserver/middleware/logging.go", line 28, in AccessLogging.func1.1.1
  File "/go/pkg/mod/github.com/XXXXX/XXXXX@v1.0.25/pkg/util/timeutil/timeutil.go", line 41, in Process
  File "/go/pkg/mod/github.com/XXXXX/XXXXX@v1.0.25/pkg/httpserver/middleware/logging.go", line 27, in AccessLogging.func1.1
  File "/go/pkg/mod/github.com/XXXXX/XXXXX@v1.0.25/pkg/httpserver/ginrouter/ginrouter.go", line 242, in (*ginRouter).toGinHandler.func1
  File "/go/pkg/mod/github.com/gin-gonic/gin@v1.7.7/context.go", line 168, in (*Context).Next
  File "/go/pkg/mod/github.com/gin-contrib/gzip@v0.0.3/handler.go", line 60, in (*gzipHandler).Handle
  File "/go/pkg/mod/github.com/gin-gonic/gin@v1.7.7/context.go", line 168, in (*Context).Next
  File "/go/pkg/mod/github.com/gin-gonic/gin@v1.7.7/gin.go", line 555, in (*Engine).handleHTTPRequest
  File "/go/pkg/mod/github.com/gin-gonic/gin@v1.7.7/gin.go", line 511, in (*Engine).ServeHTTP
  File "/usr/local/go/src/net/http/server.go", line 2947, in serverHandler.ServeHTTP
  File "/usr/local/go/src/net/http/server.go", line 1991, in (*conn).serve

This specific error occurred on four separate occasions: May 31, Jun 4, Jun 5, and Jun 8.
(Note that the Docker image we use updated Go to 1.19.9 on May 3 and then to 1.19.10 on Jun 7, but the Jun 8 case was still running 1.19.9.)

Importantly, all four instances of this panic occurred while processing the health check endpoint.

The panic originates at /usr/local/go/src/context/context.go, line 361:

func (c *cancelCtx) Done() <-chan struct{} {
	d := c.done.Load()
	if d != nil {
		return d.(chan struct{}) // !!Line 361!!
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	d = c.done.Load()
	if d == nil {
		d = make(chan struct{})
		c.done.Store(d)
	}
	return d.(chan struct{})
}

We have not yet determined why this type assertion fails.

The most interesting part is the panic message:

*runtime.TypeAssertionError: interface conversion: interface {} is , not chan struct {}

The actual type of the interface {} is not specified in the message - it merely appears as an empty string between is and ,.
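For comparison, the sketch below shows what this panic message normally looks like when an interface {} holds an ordinary value or nil. The helper assertChan is hypothetical and only mimics the assertion in Done(). In both ordinary cases the runtime prints something between "is" and the comma, so an empty name is consistent with (though not proof of) the interface's type word pointing at something other than valid type metadata.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// assertChan mimics the assertion in (*cancelCtx).Done and returns the panic
// message, or "ok" if the assertion succeeds.
func assertChan(v any) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	ch := v.(chan struct{}) // panics with *runtime.TypeAssertionError on mismatch
	_ = ch
	return "ok"
}

func main() {
	// The expected case: the atomic.Value holds a chan struct{}.
	var done atomic.Value
	done.Store(make(chan struct{}))
	fmt.Println(assertChan(done.Load()))
	// → ok

	// A wrong concrete type: the message names the type.
	fmt.Println(assertChan(42))
	// → interface conversion: interface {} is int, not chan struct {}

	// A nil interface: the message says "nil".
	fmt.Println(assertChan(nil))
	// → interface conversion: interface {} is nil, not chan struct {}
}
```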

As this panic occurs only sporadically and is not currently reproducible, we're seeking insights from anyone who might have experienced a similar issue. If additional information is required for debugging, let us know what further data we can collect from our production application and how to get it.

@seankhliao
Member

gin has a custom context implementation. Have you ruled it out as the source of the error?

@seankhliao seankhliao added the WaitingForInfo label Jun 8, 2023
@bcmills
Contributor

bcmills commented Jun 8, 2023

Have you tried running the program under the race detector? (Does it flag anything?)

@hangpark
Author

hangpark commented Jun 9, 2023

@seankhliao If the context you mentioned is gin's *gin.Context struct, which holds a lot of information about a single request, then yes, we are using it. However, we do not use its methods. Instead, we take the standard *http.Request that it holds and use the standard library context from that request.

Furthermore, the stack trace shows that the panic occurred inside the standard library's cancellable context. It is true that the context created by gin serves as the parent context, since many middlewares derive child contexts from it using the built-in context.WithXXX() functions. But even if we did use gin's own context implementation, could a custom context implementation acting as a parent interfere with a child context created by the built-in functions?

Additionally, it is strange that the panic message does not display the actual type of the "done" channel (it is definitely not nil). How is this possible? Could it be due to unsafe pointer operations?

If we are missing any detailed knowledge about gin, please provide us with that information. Thank you.

@hangpark
Author

hangpark commented Jun 9, 2023

@bcmills No, I haven't tried that yet, but I can set it up and run with it. Is it sufficient to build the application with the -race option? Since panics are only occurring sporadically, it may take some time to get a result.

@bcmills
Contributor

bcmills commented Jun 9, 2023

Is it sufficient to build the application with the -race option? Since panics are only occurring sporadically, it may take some time to get a result.

Yep, that should do the trick. Thanks!

@bcmills bcmills added NeedsInvestigation, compiler/runtime, WaitingForInfo and removed WaitingForInfo labels Jun 9, 2023
@bcmills bcmills added this to the Backlog milestone Jun 9, 2023
@ianlancetaylor
Contributor

Additionally, it is strange that the panic message does not display the actual type of the "done" channel (it is definitely not nil). How is this possible?

It's not.

Could it be due to unsafe pointer operations?

Yes.

@gopherbot

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@gopherbot gopherbot closed this as not planned Jul 9, 2023

5 participants