Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/godoc/proxy: proxying to playground sometimes hangs perpetually #28134

Closed
andybons opened this issue Oct 10, 2018 · 11 comments
Closed

x/tools/godoc/proxy: proxying to playground sometimes hangs perpetually #28134

andybons opened this issue Oct 10, 2018 · 11 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Soon This needs to be done soon. (regressions, serious bugs, outages)
Milestone

Comments

@andybons
Copy link
Member

andybons commented Oct 10, 2018

Running the program below gives very different results. Sometimes hanging for three minutes until finally giving up, or going through without issue. This only tests the /compile endpoint. It's uncertain whether this affects the /share endpoint as well.

Worth flagging since we just switched over to flex on golang.org.
/cc @broady @dmitshur @bradfitz

package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
	"time"
)

func main() {
	buf := bytes.NewBuffer([]byte("version=2&body=package+main%0Aimport+%22fmt%22%0Afunc+main()%7Bfmt
.Println(%22Hello%2C+%E4%B8%96%E7%95%8C%22)%7D"))
	t := time.Now()
	resp, err := http.Post("https://golang.org/compile", "application/x-www-form-urlencoded; charset=
UTF-8", buf)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	io.Copy(os.Stderr, resp.Body)
	fmt.Fprintln(os.Stderr)
	fmt.Fprintf(os.Stderr, "Took %v\n", time.Since(t))
}
@andybons andybons added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 10, 2018
@andybons andybons added this to the Unplanned milestone Oct 10, 2018
@bradfitz
Copy link
Contributor

My guess is the http.Transport's persistent connections die over time (credentials timeout in socket gateway?) and because there are no heartbeats or anything, it just blocks forever.

I'd start by disabling the Transport's KeepAlives and see if that fixes it.

If so, we can escalate the bug to the Flex team.

@broady
Copy link
Contributor

broady commented Oct 11, 2018

From logs...

2018/10/11 00:28:43 proxy.go:69: ERROR compile error: making request: Post https://play.golang.org/compile: context canceled

Does that confirm/deny your hypothesis, Brad?

Request times are all over the place, too. Some over 60 seconds.

image

@bradfitz
Copy link
Contributor

That doesn't actual tell me much.

@broady
Copy link
Contributor

broady commented Oct 11, 2018

Hm, OK. Will add some more logging on both ends.
It sounds like the remote (play.golang.org) closed the connection, though?

credentials timeout in socket gateway?

Definitely not this -- socket gateway is not used anymore.

@bradfitz
Copy link
Contributor

Actually what would help the most is a goroutine dump (of all goroutines) if the proxy is taking longer than N seconds. (start a timer.AfterFunc, cancel it on successful proxy, then have the AfterFunc func log all goroutine stacks)

That'd show me where the Transport code is stuck.

credentials timeout in socket gateway?

Definitely not this -- socket gateway is not used anymore.

I think it probably still exists a few layers down in some form, even if the API is now just "use Linux system calls".

Or other theory: the Flex tasks for play.golang.org are changing in count or starting/stopping or somehow rescheduling and the "TCP" connections from the proxy are just getting stuck without ever seeing a RST. (I still mean to do #25883 for Go 1.12, which would fix this if it were the problem.)

@agnivade
Copy link
Contributor

Yes, I have been seeing this too.

A short reproducer using curl. If someone just wants to copy paste.

curl 'https://golang.org/compile' -H 'Accept: application/json, text/javascript, */*; q=0.01' -H 'Referer: https://golang.org/pkg/path/' -H 'Origin: https://golang.org' -H 'X-Requested-With: XMLHttpRequest' -H 'Content-Type: application/x-www-form-urlencoded; charset=UTF-8' --data 'version=2&body=package+main%0A%0Aimport+(%0A%09%22fmt%22%0A%09%22path%22%0A)%0A%0Afunc+main()+%7B%0A%09fmt.Println(path.Ext(%22%2Fa%2Fb%2Fc%2Fbar.css%22))%0A%09fmt.Println(path.Ext(%22%2F%22))%0A%09fmt.Println(path.Ext(%22%22))%0A%7D%0A' -v

@broady
Copy link
Contributor

broady commented Oct 11, 2018

Unrelated, but it seems /share proxying is even more broken. I'll look at that next.

I've deployed a version with some extra logging, but can't reproduce the hang on that new version. Very odd. I'm going to redirect production traffic over to that momentarily to see if there's some other interaction going on.

I think it probably still exists a few layers down in some form, even if the API is now just "use Linux system calls".

Flex runs on a GCE VM, so there's no gVisor happening here.

@broady
Copy link
Contributor

broady commented Oct 11, 2018

Goroutine dump:
https://gist.github.com/broady/6914f318c2fdaec46f29f24747ec03bf

It's odd there are so many goroutines for the /share endpoint - it looks like there isn't even any incoming requests for share (or they just never complete). So, this probably actually is related to /share being broken?

Is the share proxy exhausting all of the connections available? Note that they're both hitting the same host (play.golang.org).

Oddness about the httputil reverse proxy:

  • Running locally, it serves a 404.
  • In prod, it appears to hang.

Maybe there's some Flex header that needs to be filtered out? (Still doesn't explain the 404 running locally)

Replacing the reverse proxy with a simplified proxy seems to alleviate the issue. I'm out of time to investigate today, but I'll send a CL with a workaround -- a simplified proxy for /share.

@gopherbot
Copy link

Change https://golang.org/cl/141718 mentions this issue: godoc/proxy: workaround for infinite redirect on ReverseProxy

@gopherbot
Copy link

Change https://golang.org/cl/153858 mentions this issue: [release-branch.go1.11] godoc/proxy: remove use of httputil.ReverseProxy for /share

gopherbot pushed a commit to golang/tools that referenced this issue Dec 12, 2018
…oxy for /share

ReverseProxy doesn't re-set the Request's Host field, only
Request.URL.Host.
The HTTP/2 client prefers Request.Host over Request.URL.Host, so this
results in the request being sent back to the host that originally
accepted the request.
This results in an infinite redirect (and consumption of many connections to
itself).
See Issue golang/go#28168 for details.

Replace it with a simple proxy that drops all the headers (except
Content-Type).

I tried setting the proxy.Director, but it still didn't work. Could do
with some more investigation.

Fixes golang/go#28134.

Change-Id: I5051ce72a379dcacfbe8484f58f8cf7d9385024d
Reviewed-on: https://go-review.googlesource.com/c/141718
Run-TryBot: Chris Broadfoot <cbro@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
(cherry picked from commit 837e805)
Reviewed-on: https://go-review.googlesource.com/c/153858
Run-TryBot: Andrew Bonventre <andybons@golang.org>
@andybons
Copy link
Member Author

This is live on golang.org. Thanks, @broady!

@golang golang locked and limited conversation to collaborators Dec 12, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Soon This needs to be done soon. (regressions, serious bugs, outages)
Projects
None yet
Development

No branches or pull requests

5 participants