Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/url: JoinPath generates bad path if initial url path is empty string #58605

Open
perj opened this issue Feb 20, 2023 · 18 comments
Open

net/url: JoinPath generates bad path if initial url path is empty string #58605

perj opened this issue Feb 20, 2023 · 18 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

Comments

@perj
Copy link

perj commented Feb 20, 2023

What version of Go are you using (go version)?

$ go version
go version go1.20.1 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOOS="linux"

What did you do?

https://go.dev/play/p/wXMvYKC1FSY

What did you expect to see?

200 OK

What did you see instead?

400 Bad Request

The request sent to the server starts with GET api/endpoint. It should be GET /api/endpoint. The former generates a 400 error in url.ParseRequestURI.

@ianlancetaylor ianlancetaylor changed the title net/url: url.URL.JoinPath generates bad path if initial url path is empty string. net/url: JoinPath generates bad path if initial url path is empty string Feb 20, 2023
@ianlancetaylor
Copy link
Contributor

CC @carlmjohnson

The issue here is that before the call to req.URL.JoinPath the req.Path field the empty string. After the call the req.Path field is api/endpoint. Should the empty string imply the root?

@earthboundkid
Copy link
Contributor

Yes, I think so. It's a sort of unfortunate fact about url.URL.Path that "" and "/" are different but in most cases should be treated as equivalent.

@earthboundkid
Copy link
Contributor

Hmm, a quick and dirty attempt at fixing it causes these test errors:

   url_test.go:2199: JoinPath("a", []) = "/a", <nil>, want "a", nil
    url_test.go:2211: Parse("a").JoinPath([]) = "/a", <nil>, want "a", nil
    url_test.go:2199: JoinPath("a", ["b"]) = "/a/b", <nil>, want "a/b", nil
    url_test.go:2211: Parse("a").JoinPath(["b"]) = "/a/b", <nil>, want "a/b", nil
    url_test.go:2199: JoinPath("a", ["../b"]) = "/b", <nil>, want "b", nil
    url_test.go:2211: Parse("a").JoinPath(["../b"]) = "/b", <nil>, want "b", nil
    url_test.go:2199: JoinPath("a", ["../../b"]) = "/b", <nil>, want "b", nil
    url_test.go:2211: Parse("a").JoinPath(["../../b"]) = "/b", <nil>, want "b", nil
    url_test.go:2199: JoinPath("", ["a"]) = "/a", <nil>, want "a", nil
    url_test.go:2211: Parse("").JoinPath(["a"]) = "/a", <nil>, want "a", nil
    url_test.go:2199: JoinPath("", ["../a"]) = "/a", <nil>, want "a", nil
    url_test.go:2211: Parse("").JoinPath(["../a"]) = "/a", <nil>, want "a", nil

I'm not sure what I think the output should be here.

@perj
Copy link
Author

perj commented Feb 21, 2023

I'm sort of thinking empty path might only be the same as / if the hostname is not empty. If it's empty then it's a relative url, right?

I'd probably just special case this...

if elem[0] == "" && u.Host != "" {
	elem[0] = "/"
}

@earthboundkid
Copy link
Contributor

diff --git a/src/net/url/url.go b/src/net/url/url.go
index d530a50d40..bdc85ef6fa 100644
--- a/src/net/url/url.go
+++ b/src/net/url/url.go
@@ -1209,6 +1209,9 @@ func (u *URL) JoinPath(elem ...string) *URL {
 	if strings.HasSuffix(elem[len(elem)-1], "/") && !strings.HasSuffix(p, "/") {
 		p += "/"
 	}
+	if p != "" && p[0] != '/' && u.Host != "" {
+		p = "/" + p
+	}
 	url := *u
 	url.setPath(p)
 	return &url
diff --git a/src/net/url/url_test.go b/src/net/url/url_test.go
index 577cf631c8..df7f9dea9c 100644
--- a/src/net/url/url_test.go
+++ b/src/net/url/url_test.go
@@ -2202,6 +2202,9 @@ func TestJoinPath(t *testing.T) {
 		u, err := Parse(tt.base)
 		if err == nil {
 			u = u.JoinPath(tt.elem...)
+			if u.Path != "" && u.Path[0] != '/' && u.Host != "" {
+				t.Errorf("Parse(%q).JoinPath(%q).Path = %q", tt.base, tt.elem, u.Path)
+			}
 			out = u.String()
 		}
 		if out != tt.out || (err == nil) != (tt.out != "") {

This changes it to add a slash if the hostname is not empty, and it passes tests, but like I said, I'm not sure what the correct behavior is here.

@thanm thanm added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Feb 21, 2023
@thanm
Copy link
Contributor

thanm commented Feb 21, 2023

@neild @rsc per owners

@neild
Copy link
Contributor

neild commented Feb 21, 2023

The doc for Path states:

Path  string // path (relative paths may omit leading slash)

This implies absolute paths must not omit the leading slash, but URL.RequestURI contains a special case to return "/" when Path is empty. So perhaps the rule is that absolute paths must not omit the leading slash, but an empty path is equivalent to "/".

I think the most consistent behavior here would be for JoinPath to consider an empty Path to be equivalent to "/", same as RequestURI.

@earthboundkid
Copy link
Contributor

How about this then:

diff --git a/src/net/url/url.go b/src/net/url/url.go
index d530a50d40..c7415897e2 100644
--- a/src/net/url/url.go
+++ b/src/net/url/url.go
@@ -1200,7 +1200,10 @@ func (u *URL) JoinPath(elem ...string) *URL {
 		// Return a relative path if u is relative,
 		// but ensure that it contains no ../ elements.
 		elem[0] = "/" + elem[0]
-		p = path.Join(elem...)[1:]
+		p = path.Join(elem...)
+		if elem[0] != "/" || u.Host == "" {
+			p = p[1:]
+		}
 	} else {
 		p = path.Join(elem...)
 	}

@gopherbot
Copy link

Change https://go.dev/cl/469935 mentions this issue: net/url: consider an empty base Path as equivalent to / in JoinPath

@neild
Copy link
Contributor

neild commented Feb 21, 2023

That's still handling the case where Host is unset differently. I think consistency with RequestURI would be to just convert Path to "/" when it is empty.

https://go.dev/cl/469935 takes that approach. What do you think?

@perj
Copy link
Author

perj commented Apr 11, 2023

Not sure if I'm supposed to be pushing this in any kind of manner. If so, consider yourselves pushed. :)

peanball added a commit to sap-contributions/pcap-release that referenced this issue Apr 12, 2023
URL.JoinPath, is buggy: golang/go#58605. By setting the path to '/' when it's empty we work around the issue and can still use JoinPath.
@robryk
Copy link
Contributor

robryk commented Aug 19, 2023

Note that currently the value of Path in result of url.Parse depends on whether the original URL ended in / or not: https://go.dev/play/p/H6H902lJTam.

I'd argue that this might be considered to also be a bug in url.Parse: for two URLs that (I believe) should be equivalent it generates meaningfully different URL structs.

@haton14
Copy link

haton14 commented Sep 5, 2023

https://go.dev/play/p/l7jfSgP24id
I also receive a 400 response in func (c *Client) Do(req *Request) due to the malformed struct generated by JoinPath.

@jpcope
Copy link

jpcope commented Oct 12, 2023

Glad I caught this in a unit test before updating my client lib from http.NewRequest. Easy enough to workaround though - set Path to "/" on the base url.URL type if empty before calling JoinPath - if you know the right-hand-side arg is non-empty.

@rsc
Copy link
Contributor

rsc commented Mar 14, 2024

This change broke a variety of tests inside Google, indicating that it would probably break a variety of tests and potentially real-world uses outside Google as well. I rolled back the change. If we roll it forward again we should probably put it behind a GODEBUG, like urljoinpathslash=0 to go back to the old behavior.

@rsc rsc reopened this Mar 14, 2024
@neild
Copy link
Contributor

neild commented Mar 18, 2024

I think this is a case where if we would want to add a GODEBUG, that's sufficient evidence that we just shouldn't make the change.

@earthboundkid
Copy link
Contributor

What were the broken tests at Google testing?

@neild
Copy link
Contributor

neild commented Mar 19, 2024

Two cases of non-test code, and one case of test code doing variations on:

u, err := url.JoinPath("https://", "hostname", "some", "path", "components")

I think this is clearly a misuse of JoinPath; internally, it's producing a URL with no Host and a path of "hostname/some/path/components", which happens to stringify as if "hostname" was a host. However, it "works" today and this change would have broken real code, not just tests.


One case of non-test code doing this and relying on the result:

u, err := url.JoinPath("", "foo")

I don't understand why you'd do this, but it's not wrong.


One case of non-test code doing this and a test relying on the result when part1 is not absolute:

u, err := url.Parse("https://hostname")
if err != nil {
  return nil, err
}
u = u.JoinPath("part1", "part2")

I think that this might be a case where the production code will always use an absolute part1, but I'm not sure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

10 participants