proposal: cmd/go: add go cli version to the User-Agent string #35699

Closed
ankushchadha opened this issue Nov 19, 2019 · 14 comments

@ankushchadha
ankushchadha commented Nov 19, 2019

What: The proposal is to add Go CLI version number to the User-Agent string.

Why: This information will help goproxies (such as gocenter.io) to understand the uptake of new versions of Go and will help to build a better ecosystem.

@gopherbot gopherbot added this to the Proposal milestone Nov 19, 2019
@katiehockman katiehockman changed the title Proposal: Add go cli version to the User-Agent string proposal: cmd/go: add go cli version to the User-Agent string Nov 19, 2019
@katiehockman
Contributor

/cc @bcmills @jayconrod

@jayconrod
Contributor

I asked about this a while back when proxy.golang.org was being developed. We were considering different behavior for the <modpath>@<version>/@v/list endpoint based on the Go client version.

@rsc argued against supporting this. I think his reasoning was that this makes it more complicated for proxies to cache responses from other proxies.

At the moment, the simplest way to implement a proxy is to serve files from a module cache directory tree. If a request comes in and something is missing, you can run go mod download to fetch from downstream proxies or origin servers. If the Go version affected the result, this wouldn't work because you might have run go mod download previously with the wrong version of go.
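
A minimal sketch of the file-serving proxy described above, in Go. It assumes a directory laid out like the go command's download cache (`<module>/@v/list`, `.info`, `.mod`, `.zip`). The names `cacheProxy`, `demoCache`, `fetch`, and the module `example.com/hello` are hypothetical and exist only for illustration, and the cache-miss step (running go mod download) is left out. The handler never looks at request headers, so every client gets the same bytes regardless of its User-Agent.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
)

// cacheProxy serves a module download cache directory as plain static files.
// It does not inspect headers, so responses cannot vary by client version.
func cacheProxy(downloadDir string) http.Handler {
	return http.FileServer(http.Dir(downloadDir))
}

// demoCache builds a throwaway directory shaped like a download cache that
// holds one version list for the hypothetical module example.com/hello.
func demoCache() (string, error) {
	dir, err := os.MkdirTemp("", "modcache")
	if err != nil {
		return "", err
	}
	vdir := filepath.Join(dir, "example.com", "hello", "@v")
	if err := os.MkdirAll(vdir, 0o755); err != nil {
		return "", err
	}
	err = os.WriteFile(filepath.Join(vdir, "list"), []byte("v1.0.0\n"), 0o644)
	return dir, err
}

// fetch sends one GET with the given User-Agent straight to the handler and
// returns the status code and body.
func fetch(h http.Handler, path, userAgent string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	req.Header.Set("User-Agent", userAgent)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	body, _ := io.ReadAll(rec.Result().Body)
	return rec.Code, string(body)
}

func main() {
	dir, err := demoCache()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer os.RemoveAll(dir)

	h := cacheProxy(dir)
	for _, ua := range []string{"Go-http-client/1.1", "anything-else/9.9"} {
		code, body := fetch(h, "/example.com/hello/@v/list", ua)
		fmt.Printf("%s -> %d %q\n", ua, code, body)
	}
}
```

If the response depended on the client version, a file fetched earlier by a different go version could not be reused this way.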

I know the original motivation for this proposal was analytics and not changing behavior, but I expect that if the client version were included anywhere, it would be way too tempting to use it to alter behavior.

In any case, the client version may not be useful for logging. Clients have their own caches, and proxy requests frequently come from other proxies and CI systems, not necessarily the go command itself. So the number of proxy requests from a given go version doesn't correlate all that well to usage of that version.

@hyangah
Contributor

hyangah commented Nov 19, 2019

I was tempted to use the client's Go version, if available, to change responses accordingly (for example, it could help with a proxy's Go version migration), but as @jayconrod pointed out, it complicates caching. The User-Agent header field is informational only, and some public HTTP caching services and CDNs ignore it in caching decisions. If we ever want to provide the Go version so that proxies can alter their behavior, the information should be in either query parameters or paths.
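
To illustrate the caching point, here is a rough sketch in Go of a shared cache that keys entries on method and URL only. The `cacheKey` and `newRequest` helpers, the host `proxy.example`, and the `go=1.13` query parameter are all hypothetical; nothing in the proxy protocol defines such a parameter. Two requests that differ only in User-Agent collapse to one cache entry, while a version carried in the URL produces a separate one.

```go
package main

import (
	"fmt"
	"net/http"
)

// cacheKey mimics a cache that ignores headers such as User-Agent and keys
// entries on the method and full URL only (an assumption about the cache).
func cacheKey(r *http.Request) string {
	return r.Method + " " + r.URL.String()
}

// newRequest is a hypothetical helper that builds a GET with a User-Agent.
func newRequest(rawURL, userAgent string) *http.Request {
	req, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		panic(err)
	}
	req.Header.Set("User-Agent", userAgent)
	return req
}

func main() {
	const list = "https://proxy.example/example.com/hello/@v/list"

	a := cacheKey(newRequest(list, "Go-http-client/1.1"))
	b := cacheKey(newRequest(list, "some-other-client/2.0"))
	// Hypothetical: version passed in the URL instead of a header.
	c := cacheKey(newRequest(list+"?go=1.13", "Go-http-client/1.1"))

	fmt.Println("same key for different User-Agents:", a == b)
	fmt.Println("different key when the URL carries the version:", a != c)
}
```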

But I think this is still useful for enterprise users whose proxy is the very first in the proxy chain, for example to detect old go clients and warn about them. If a large fraction of traffic doesn't carry the user-agent because it comes through special CI systems or proxies, that is also useful information to consider when analyzing the proxy logs.

@rsc
Contributor

rsc commented Nov 20, 2019

For the specific case of the go command's user agent, we very much do not want proxies to start (or even to be able to start) sending different responses to different versions of the go command. That doesn't work with plain HTTP proxies in the middle.

For the use case mentioned by @ankushchadha, namely gathering statistics and telemetry about users of a Go proxy, that's a significant change to make, with important privacy implications. Since Go 1, we have never done that. (There were some public stats about package usage collected way back in 2010 in the experimental goinstall. When we turned that into go get, the reporting went away.)

This is not something we should do lightly - it is a slippery slope to start here. Where does it stop? I don't think we're ready for that very large conversation today. For the foreseeable future I think it's something we should not do at all. If we really needed to gather stats for something, I think we'd want to have it be opt-in, not on by default.

@hyangah
Contributor

hyangah commented Nov 22, 2019

Even though I don't think the Go CLI version is more private than what the proxy protocol already carries, I understand @rsc's concern about the trend and agree that we need to keep the information sent to the proxy/origin to a minimum (only what is necessary for the functionality).

I am fine with withdrawing this proposal - enterprises can track the tool usage with a better and more reliable mechanism than this.

@ankushchadha What do you think?

@yoav
yoav commented Nov 22, 2019

The user-agent header is used by pretty much every single package manager that exists today in the standard RFC format where the client version is part of the "product token" value.

The original intent behind providing this header includes, among other reasons, statistical information. There are some good reasons for this (more below) and I'm not sure why exposing the version of golang used by a client amounts to a privacy smell, especially when this is already an established industry standard.

If the question is one of trust, that is, of not letting proxy implementers fall into the trap of crafting responses that take the client version into account, then I think there are already many pitfalls that can tempt a careless implementer into choosing the wrong path. One can also ask whether the default "Go-http-client/1.1" string that clients currently send is misleading, as it takes further investigation to realize that it is merely a pseudo value that does not necessarily reflect the goal of the HTTP spec. To remove any confusion, the modules spec can state explicitly that proxy implementers should not rely on the user-agent value to create different responses.

Contrary to what may be the common belief, enterprises do not track or even have practical ways to track versions of developer tools installed on a developer machine or on a CI host - it is simply not something traditional IT would deal with. The only way for someone running an enterprise proxy to evaluate the usage of client versions is to look at the user agent.
This is required not just for tracking, for example, the uptake of a new version, but also to be able to support users - to determine things like the use of vulnerable clients, whether support for older clients can be safely deprecated, whether an unsafe TLS cipher that newer client versions do not require can be removed, etc.
These are all real world examples and, currently, it is impossible to achieve without knowing the golang version being used. I sincerely hope this makes sense and that this proposal can be reconsidered.
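
For reference, a simplified sketch in Go of how a proxy operator might read product tokens (`name/version`) out of a User-Agent value for statistics. The `parseProducts` function and the `example-client` and `helper` product names are hypothetical, and the parser only approximates the RFC grammar: it splits on whitespace and skips parenthesized comments. Note that for the string the go command sends today it yields "1.1", which is not the Go release.

```go
package main

import (
	"fmt"
	"strings"
)

// product is one "name/version" token from a User-Agent value.
type product struct {
	Name, Version string
}

// parseProducts extracts product tokens and ignores "(...)" comments.
// It is a simplification, not a full implementation of the HTTP grammar.
func parseProducts(ua string) []product {
	var out []product
	var tok strings.Builder
	depth := 0 // nesting level of parenthesized comments

	flush := func() {
		if tok.Len() == 0 {
			return
		}
		name, version, _ := strings.Cut(tok.String(), "/")
		out = append(out, product{Name: name, Version: version})
		tok.Reset()
	}

	for _, r := range ua {
		switch {
		case r == '(':
			flush()
			depth++
		case r == ')':
			if depth > 0 {
				depth--
			}
		case depth > 0:
			// Inside a comment: skip.
		case r == ' ' || r == '\t':
			flush()
		default:
			tok.WriteRune(r)
		}
	}
	flush()
	return out
}

func main() {
	for _, ua := range []string{
		"Go-http-client/1.1",
		"example-client/2.3.4 (linux; amd64) helper/0.1", // hypothetical
	} {
		fmt.Printf("%q -> %+v\n", ua, parseProducts(ua))
	}
}
```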

Thanks

@ianlancetaylor
Contributor

@yoav I think the uses are clear. But changing this value exposes information to far more than just enterprises tracking their internal users. It exposes the information about what Go release someone is using across the entire Internet. Perhaps that is fine, but it's not obviously fine. As @rsc says above, this seems like something that has to be opt-in, not on by default.

@yoav
yoav commented Nov 25, 2019

Thanks, @ianlancetaylor. Putting aside the question of whether the version information should be considered sensitive, with TLS, this is only exposed to the proxy endpoint, or not exposed at all when using an organization-wide proxy (since it will commonly cache remote modules using its own UA).

Having the real version in the UA product token is tremendously helpful for endpoint maintainers and my feedback is coming from JFrog's experience in supporting production proxies for thousands of organizations.
Of course, this is a call the golang team needs to make, but not including this goes against accepted industry standards, making golang a unique exception in that respect. An "obfuscate-by-default" approach probably means that, practically, the current situation will stay as is. A different, less consistent, approach may be to include this information in module proxy requests, rather than globally.

@ianlancetaylor
Contributor

Telemetry is a complicated topic. The phrasing of this issue suggests changing the User-Agent string in the net/http package, which would affect many Go programs. Exposing a version number only in module proxy requests would be a different matter, but even there it's not clear that different organizations want to report their version usage to any proxy that they use. I wouldn't describe this as obfuscate-by-default; I would call it private-by-default.

Perhaps we need to think in terms of some way that people can choose whether to report version information, so that enterprises can use that information when using a private proxy. For example, perhaps the GOPROXY environment variable could indicate whether to report the version number. I really don't know if that is a good idea, but it might address the enterprise use case without leading everyone to expose what version of Go they are using to every proxy they use.

@bcmills
Contributor

bcmills commented Nov 25, 2019

@ianlancetaylor, my interpretation is that the request is for the go command itself to set its User-Agent, rather than changing the default net/http User-Agent string for each Go release.
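
To make the distinction concrete, here is a sketch in Go of one program setting its own User-Agent through a wrapping `http.RoundTripper`, leaving the net/http default untouched for every other program. This is not how cmd/go is implemented; `uaTransport`, `newClient`, `observedUserAgent`, and the product name `hypothetical-go-cmd` are made up for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"runtime"
)

// uaTransport sets a fixed User-Agent on each outgoing request.
type uaTransport struct {
	base      http.RoundTripper
	userAgent string
}

func (t uaTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// A RoundTripper must not modify the caller's request, so clone it.
	clone := req.Clone(req.Context())
	clone.Header.Set("User-Agent", t.userAgent)
	return t.base.RoundTrip(clone)
}

// versionedUserAgent builds a product token from the running toolchain's
// version. Development builds may report a string that is not a clean token.
func versionedUserAgent() string {
	return "hypothetical-go-cmd/" + runtime.Version()
}

// newClient returns a client that sends userAgent, or the net/http default
// when userAgent is empty.
func newClient(userAgent string) *http.Client {
	if userAgent == "" {
		return &http.Client{}
	}
	return &http.Client{Transport: uaTransport{base: http.DefaultTransport, userAgent: userAgent}}
}

// observedUserAgent reports the User-Agent a local test server receives.
func observedUserAgent(c *http.Client) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, r.UserAgent())
	}))
	defer srv.Close()

	resp, err := c.Get(srv.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	for _, c := range []*http.Client{newClient(""), newClient(versionedUserAgent())} {
		ua, err := observedUserAgent(c)
		if err != nil {
			fmt.Println("request failed:", err)
			return
		}
		fmt.Println("server saw:", ua)
	}
}
```

Only the client built with the wrapper reports a version; any other program using net/http keeps sending the default string.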

@yoav
yoav commented Nov 26, 2019

@bcmills, this was the original intent, I believe.
@ianlancetaylor, "obfuscate" because the pseudo value being sent today (Go-http-client/1.1) obscures the Go version. Maybe it's too strong a word, but since there is no Go-http-client library that is versioned separately from Go releases, I find the version number misleading.

@ianlancetaylor
Contributor

Sorry for the confusing use of 1.1. It's not intended to be misleading. The 1.1 is not intended to reflect the Go version being used. The docs say

// NOTE: This is not intended to reflect the actual Go version being used.
// It was changed at the time of Go 1.1 release because the former User-Agent
// had ended up on a blacklist for some intrusion detection systems.
// See https://codereview.appspot.com/7532043.

@rsc rsc added this to Active in Proposals (old) Nov 27, 2019
@rsc
Contributor

rsc commented Nov 27, 2019

Regardless, we have not been exposing this information to date, and we should not start without careful thought, especially when the proposed rationale is for collecting information about users. That's a big deal.

For collection from willing enterprises, that's a bigger discussion too, and likely would encompass a lot more than GOPROXY go command versions - build times, other performance, etc.

This specific request seems like a likely decline. We can have a broader discussion elsewhere.

@rsc rsc moved this from Active to Likely Decline in Proposals (old) Dec 4, 2019
@rsc
Contributor

rsc commented Dec 4, 2019

No final comments, so declining.

@rsc rsc closed this as completed Dec 4, 2019
@rsc rsc moved this from Likely Decline to Declined in Proposals (old) Dec 11, 2019
@golang golang locked and limited conversation to collaborators Dec 3, 2020