Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http/httputil: ReverseProxy appends trailing slash to url with empty path #50337

Open
nkreiger opened this issue Dec 24, 2021 · 8 comments
Open
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@nkreiger
Copy link

What version of Go are you using (go version)?

$ go version
go version go1.18beta1 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/noahkreiger/Library/Caches/go-build"
GOENV="/Users/noahkreiger/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/noahkreiger/go/pkg/mod"
GONOPROXY="github.com/nkreiger"
GONOSUMDB="github.com/nkreiger"
GOOS="darwin"
GOPATH="/Users/noahkreiger/go"
GOPRIVATE="github.com/nkreiger"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/noahkreiger/go/go1.18beta1"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/noahkreiger/go/go1.18beta1/pkg/tool/darwin_amd64"
GOVCS=""
GOVERSION="go1.18beta1"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS="-Wno-error -Wno-nullability-completeness -Wno-expansion-to-defined -Wbuiltin-requires-header"
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/4w/ymct9nsd21j2tmf_8k7csf340000gn/T/go-build285603268=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Attempted to serve a reverse proxy where the base URL had a RawPath and Path value of "", an empty string.

// Serve a reverse proxy for a given url
func serveReverseProxy(target string, res http.ResponseWriter, req *http.Request) {
	// parse the url
	url, _ := url.Parse(target)

	// create the reverse proxy
	proxy := httputil.NewSingleHostReverseProxy(url)

	// Note that ServeHttp is non blocking and uses a go routine under the hood
	proxy.ServeHTTP(res, req)
}

target = "http://localhost:808/health"

req.URL.Path = ""
req.URL.RawPath = ""

What did you expect to see?

I expected the new single host reverse proxy to rout the request to http://locahost:8080/health

What did you see instead?

It was routed to http://localhost:8080/health/ which causes the route to throw a 404 Not Found because of the extra suffix "/"

Potential Easy Fix

return a + "/" + b

it this line to be

case !aslash && !bslash:
		if b == "" {
			return a
		}
		return a + "/" + b
	}

so if the base path is empty, it just returns the new path passed in instead of appending an extra slash...

@seankhliao seankhliao changed the title affected/package: net/http/httputil net/http/httputil: ReverseProxy appends trailing slash to url with empty path Dec 24, 2021
@gopherbot
Copy link

Change https://golang.org/cl/374276 mentions this issue: net/http/httputil: This change modifies Go to not add trailing slash on a direct reverse proxy Fixes #50337

@thanm thanm added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Dec 28, 2021
@thanm
Copy link
Contributor

thanm commented Dec 28, 2021

@neild per owners.

@septiajio
Copy link

Same here, I got 404 because of extra suffix "/"

@seankhliao seankhliao added this to the Unplanned milestone Aug 20, 2022
@krackers
Copy link

krackers commented Mar 20, 2023

Is it valid per spec to have an empty url path though? I thought the resource being requested always has to be well-defined, so if you have

GET / HTTP/2

then you'd have path as /, with no way for a client to send a request with it being empty. See the following part of RFC 2616:

The most common form of Request-URI is that used to identify a
   resource on an origin server or gateway. In this case the absolute
   path of the URI MUST be transmitted (see [section 3.2.1](https://www.rfc-editor.org/rfc/rfc2616#section-3.2.1), abs_path) as
   the Request-URI, and the network location of the URI (authority) MUST
   be transmitted in a Host header field. For example, a client wishing
   to retrieve the resource above directly from the origin server would
   create a TCP connection to port 80 of the host "www.w3.org" and send
   the lines:

       GET /pub/WWW/TheProject.html HTTP/1.1
       Host: www.w3.org

   followed by the remainder of the Request. Note that the absolute path
   cannot be empty; if none is present in the original URI, it MUST be
   given as "/" (the server root).

@tmalaher-telus
Copy link

@krackers has a point.

Maybe modify reverseproxy.go:singleJoiningSlash as follows (comments added):

func singleJoiningSlash(a, b string) string {
	aslash := strings.HasSuffix(a, "/")
	bslash := strings.HasPrefix(b, "/")
	switch {
	case aslash && bslash:
		return a + b[1:] // avoid doubled slashes when both are present
	case !aslash && !bslash: // neither has a slash, make sure there is one, or at least the result is non-blank
		if b=="" {
		    if a=="" {
		        return "/" // if both are blank, return a slash
		    }
		    return a // avoid adding a trailing slash if b is blank
		}
		return a + "/" + b // both non-blank, separate with slash
	}
	return a + b // only one of a or b has a trailing/leading slash, just concatenate
}

The reverseproxy.go code is a bit strange. I've quoted it below and added //??? comments where I want to editorialize.

func joinURLPath(a, b *url.URL) (path, rawpath string) {
	if a.RawPath == "" && b.RawPath == "" { 
                //??? this is the only place singleJoiningSlash is called (other than in tests)
		return singleJoiningSlash(a.Path, b.Path), ""
 	}
	// Same as singleJoiningSlash, but uses EscapedPath to determine
	// whether a slash should be added
	apath := a.EscapedPath()
	bpath := b.EscapedPath()

	aslash := strings.HasSuffix(apath, "/")
	bslash := strings.HasPrefix(bpath, "/")

	switch {
	case aslash && bslash:
		return a.Path + b.Path[1:], apath + bpath[1:]
	case !aslash && !bslash:
                //??? do we have the same issue here? Does it matter?
		return a.Path + "/" + b.Path, apath + "/" + bpath
	}
	return a.Path + b.Path, apath + bpath
}

@krackers
Copy link

krackers commented May 20, 2023

I was actually meaning that since it's not valid to have a GET request with empty resource being requested, we shouldn't need to handle this case in httputil itself.

As I understand, the api of reverseproxy is really meant to route paths under a directory-like structure. So if you reverse proxy to http://locahost:8080 and request /health then the request made is for http://locahost:8080/health. Likewise if you route to http://locahost:8080/health and request /foo then the request is for http://locahost:8080/health/foo.

Now OP's situation is that if you route to http://locahost:8080/health and then request /, then the request is for http://locahost:8080/health/ which may not be defined. However think this case should be handled with a custom handler func in your code, rather than trying to hack an empty-path to signal that a trailing slash should not be appended (since an empty-path is invalid per spec).

@nicolasopt
Copy link

nicolasopt commented Oct 13, 2023

Hi,

This is my solution:

`func serveReverseProxy(target string, res http.ResponseWriter, req *http.Request) {

url, _ := url.Parse(target)

req.URL = url    // <----------  add this
url.Path = ""   // <----------  add this

proxy := httputil.NewSingleHostReverseProxy(url)

proxy.ServeHTTP(res, req)

}`

@neild
Copy link
Contributor

neild commented Apr 19, 2024

As @krackers says: An HTTP request URL can't contain an empty path. Should we be adding a special case to handle one, given that our existing behavior doesn't seem particularly wrong?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

9 participants