Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/url: add OmitHost bool to URL #46059

Closed
TimothyGu opened this issue May 8, 2021 · 9 comments
Closed

net/url: add OmitHost bool to URL #46059

TimothyGu opened this issue May 8, 2021 · 9 comments

Comments

@TimothyGu
Copy link
Contributor

What version of Go are you using (go version)?

$ go version
go version go1.16.3 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/timothy-gu/.cache/go-build"
GOENV="/home/timothy-gu/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/timothy-gu/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/timothy-gu/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.16.3"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/timothy-gu/dev/go/go/src/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1221289449=/tmp/go-build -gno-record-gcc-switches"

What did you do?

https://play.golang.org/p/OFrp4cOVW8j

import (
	"fmt"
	"net/url"
)

u, _ := url.Parse("myscheme:path")
fmt.Printf("myscheme:path\t%s\n", u)

u, _ = url.Parse("myscheme:/path")
fmt.Printf("myscheme:/path\t%s\n", u)

u, _ = url.Parse("myscheme://path")
fmt.Printf("myscheme://path\t%s\n", u)

u, _ = url.Parse("myscheme:///path")
fmt.Printf("myscheme:///path\t%s\n", u)

What did you expect to see?

All the parsed URLs roundtrip.

myscheme:path	myscheme:path
myscheme:/path	myscheme:/path
myscheme://path	myscheme://path
myscheme:///path	myscheme:///path

What did you see instead?

myscheme:path	myscheme:path
myscheme:/path	myscheme:///path    <-- three slashes
myscheme://path	myscheme://path
myscheme:///path	myscheme:///path

The issue here is that myscheme:/path and myscheme:///path are treated as the same URL, both parsing to

&url.URL{
	Scheme: "myscheme",
	Host: "",
	Path: "/path",
}

Yet, they are materially different URLs. According to RFC 3986, myscheme:/path should be treated as having a path-absolute, which does not have an authority defined. On the other hand, myscheme:///path does have an authority, albeit an empty one.

Whether authority is undefined or empty is important for the recomposition algorithm in the RFC (i.e., URL.String):

      if defined(authority) then
         append "//" to result;
         append authority to result;
      endif;

I can imagine two different ways to solve this problem:

  1. Add a ForceAuthority boolean to url.URL, such that a true value implies an authority that is present even if Host is "" and User is nil. This has the advantage of being analogous to ForceQuery. However, it can run into compatibility problems: existing code that creates a file URL from scratch will now have their URL serialized to file:/home/... rather than the expected file:///home/....

  2. Add a NoAuthority boolean to url.URL, such that a false value implies an authority is present. url.Parse will set this field if a / (but not //) is present right after the scheme. This maintains the previous behavior for any manually created URLs, but fixes the Parse/String roundtrip for URLs with a single slash.

I propose approach 2.

As a historical note, this limitation was known when introduced in cdd6ae1 (Go 1.1) to fix #4189:

go/src/net/url/url_test.go

Lines 166 to 174 in b38b1b2

// non-authority with path
{
"mailto:/webmaster@golang.org",
&URL{
Scheme: "mailto",
Path: "/webmaster@golang.org",
},
"mailto:///webmaster@golang.org", // unfortunate compromise
},
At the time, the argument against fixing this bug was a desire to avoid introducing more fields to url.URL. Since Go 1.1, we have added quite a few new fields to url.URL: RawPath (1.5), ForceQuery (1.7), and RawFragment (1.15). I think we should still maintain a high bar when introducing new fields, but the memory cost for adding a new boolean is low (zero in fact, if we pack the structure properly).

/cc @rsc @bradfitz

@gopherbot gopherbot added this to the Proposal milestone May 8, 2021
@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) May 8, 2021
@TimothyGu
Copy link
Contributor Author

I'll also add that among other URL parsers tested, Go is almost unique in not distinguishing between myscheme:/path and myscheme:///path. Chrome, Firefox, and Safari, as well as Ruby, Rust, Node.js, and the C uriparser library, all distinguish them.

Python represents an exception, but it normalizes myscheme:///path to myscheme:/path, the opposite from Go. (Python also treats file URLs differently from non-web URLs, which is also different from Go.)

@rsc
Copy link
Contributor

rsc commented May 12, 2021

We don't call that section Authority, we just call it Host.
It sounds like we should add OmitHost bool // do not emit empty host (authority) to url.URL?

@rsc
Copy link
Contributor

rsc commented May 12, 2021

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals (old) May 12, 2021
@TimothyGu
Copy link
Contributor Author

Ah, sorry I mixed up Host and Hostname(). OmitHost is a good name.

@rsc
Copy link
Contributor

rsc commented May 19, 2021

It sounds like people are in favor of adding OmitHost to URL. Will retitle.

@rsc rsc changed the title proposal: net/url: distinguish empty and not present authority proposal: net/url: add OmitHost bool to URL May 19, 2021
@rsc
Copy link
Contributor

rsc commented May 19, 2021

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc rsc moved this from Active to Likely Accept in Proposals (old) May 19, 2021
@rsc rsc moved this from Likely Accept to Accepted in Proposals (old) May 26, 2021
@rsc
Copy link
Contributor

rsc commented May 26, 2021

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: net/url: add OmitHost bool to URL net/url: add OmitHost bool to URL May 26, 2021
@rsc rsc modified the milestones: Proposal, Backlog May 26, 2021
@odeke-em
Copy link
Member

odeke-em commented Mar 5, 2022

Kindly cc-ing @willpoint and @kirbyquerby to send a fix.

@gopherbot
Copy link

Change https://go.dev/cl/391294 mentions this issue: net/url: add OmitHost bool to url.URL

@dmitshur dmitshur modified the milestones: Backlog, Go1.19 Mar 14, 2022
longsleep added a commit to longsleep/caddy that referenced this issue Dec 6, 2022
Since Go 1.19 the net/url URL.Parse function is changed when there is no
host authority. This change broke the unix socket URL handling in the
reverse proxy code. This change adds auto detection for the behavior
and thus supports both old and new Go versions without having to add
build constraints.

Related: golang/go#46059
@golang golang locked and limited conversation to collaborators Mar 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
No open projects
Development

No branches or pull requests

5 participants