Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: Header.Get should return the last header rather than the first #51493

Open
adam-p opened this issue Mar 4, 2022 · 12 comments
Open
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@adam-p
Copy link
Contributor

adam-p commented Mar 4, 2022

What version of Go are you using (go version)?

$ go version

go version go1.17 windows/amd64

Does this issue reproduce with the latest release?

Yes.

Description

I recently did research into the misuse of the X-Forwarded-For header to get the "real" client IP, especially by rate limiters. Of the six Go-based rate limiter implementations I looked at, every single one was using http.Header.Get and running the risk of falling prey to spoofing, resulting in rate limiter escape and memory exhaustion. (Concrete non-redacted example here.)

According to the HTTP/1.1 RFC (2616)1:

Multiple message-header fields with the same field-name MAY be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list [i.e., #(values)]. It MUST be possible to combine the multiple header fields into one "field-name: field-value" pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma. The order in which header fields with the same field-name are received is therefore significant to the interpretation of the combined field value, and thus a proxy MUST NOT change the order of these field values when a message is forwarded.

I think that's saying that the order of same-name headers is important for "comma-separated list" headers and no others (since there shouldn't be multiple headers for non-list headers). So whether Header.Get returns the first or last header doesn't matter for non-list headers.

Two such "comma-separated list" headers are X-Forwarded-For and Forwarded. With those headers, the leftmost part of the list is completely untrustworthy, while the rightmost is added by trusted reverse proxies (if they exist). So it matters very much whether Header.Get returns the first or the last.

In general, if the "real" client IP is to be used for anything remote security-related, it must be the rightmost value that's used (or rightmost-ish, depending on number of reverse proxies). The leftmost value can be used, but since it's trivial to spoof (just add the XFF header to the request), it must be done with great care.

So, I argue that:

  • first or last is irrelevant to non-list headers
  • one of the most common list headers is X-Forwarded-For (weasel words: "one of")
  • the most common way to use X-Forwarded-For is (or should be) to take the rightmost value of all present XFF headers
  • taking the rightmost value from the first XFF header is a security failure
  • for many other multi-header values -- like Set-Cookie -- it doesn't matter if the first or last is used (weasel word: "many")
  • ... therefore Header.Get should return the last header

I haven't done a comprehensive survey of different multi-headers, so I can't make those weasel-word statements more concrete.

As an example from another language, Twisted uses the last header value when calling the equivalent of Header.Get.

(For even more thoughts about multiple XFF header problems, see my blog post.)

The simplest fix for this would be change the behaviour of Header.Get. This would cause problems with code that expects to get the first header -- the obvious example being code that actually wants the leftmost XFF IP value.

Another approach would be to add a Header.GetLast function. Unfortunately, a lot of people would still naively pick Header.Get, and it obviously wouldn't fix any existing incorrect uses of Header.Get. It would still be an improvement, however -- right now people look at the API and thing "I don't want a slice, just a value" and they have only one choice.

Another possibility might be to add a go vet check for Header.Get combined with "split by comma" combined with "take last". (Except sometimes it's "reverse-find command and subslice from there".) (I don't know anything about how go vet works, so no idea if that's feasible.)

Another approach would be to break the API and add an argument to Header.Get. But that's obviously not going to happen.

ETA another suggestion I thought of below: Add a method that returns all of the headers with the same name combined into a comma-separated list. This is per RFC. It's also what some reverse proxies already do (like AWS ALB). And it makes it easy to correctly get the N-from-rightmost XFF IP.

Footnotes

  1. I believe this is inherited unchanged into HTTP/2.

@seankhliao seankhliao added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 4, 2022
@seankhliao
Copy link
Member

Note net/http.Header.Get currently documents:

Get gets the first value associated with the given key.

Changing it to get the last value is also considered a breaking change.

On X-Forwarded-For, in the environments I work in, there can be multiple layers of proxies, so using the last value would be wrong (points to internal proxy), and the edge proxies already don't trust / discard any incoming X-Forwarded-For. Eg envoy has extensions for trusted client addresses ref

cc @neild

@neild
Copy link
Contributor

neild commented Mar 4, 2022

As @seankhliao says, header.Get is working as documented. For better or worse, there's no way to change it now.

I'm dubious that any API change or vet check can compensate for the need to exercise extreme care around using client-provided headers for security purposes.

@adam-p
Copy link
Contributor Author

adam-p commented Mar 5, 2022

On X-Forwarded-For, in the environments I work in, there can be multiple layers of proxies, so using the last value would be wrong (points to internal proxy)

I think you're conflating the "last header value" with the "last value in the comma-separated list within that header". Using the last value would be wrong, but you'd still be counting from the right so you'd still want the last header.

...Except when your trusted proxy IPs cross header boundaries.

Which makes me think of another, probably better, API addition suggestion: Add a method that returns all of the headers with the same name combined into a comma-separated list. This is per RFC. It's also what some reverse proxies already do (like AWS ALB). And it makes it easy to correctly get the N-from-rightmost XFF IP.

@seankhliao
Copy link
Member

It doesn't really matter which one you get, the point was you cannot trust/use the information in there unless your proxies sanitize / inject extra information.

Structured headers are #41046

@adam-p
Copy link
Contributor Author

adam-p commented Mar 5, 2022

It doesn't really matter which one you get, the point was you cannot trust/use the information in there unless your proxies sanitize / inject extra information.

Yeah, but that's part of my point. If you use the last header you're much more likely to be using the header with the values added by your own proxies than you will be if you use the first. So the current Header.Get design is less-secure-by-default.

As @seankhliao says, header.Get is working as documented. For better or worse, there's no way to change it now.

I know. Part of my goal is to surface the problem to the general consciousness. There are a lot of projects doing this wrong -- even big ones -- and this bit of the stdlib design is part of the problem.

@beoran
Copy link

beoran commented Mar 5, 2022

The method Get is working is documented. I think the only thing we could do is add a warning to the documentation of .Get and/or of Headers about the security risks involved.

@adam-p
Copy link
Contributor Author

adam-p commented Mar 5, 2022

The method Get is working is documented. I think the only thing we could do is add a warning to the documentation of .Get and/or of Headers about the security risks involved.

That is probably the optimal realistic outcome.

@francislavoie
Copy link

Related: #50465

@francislavoie
Copy link

Also, I'd like to suggest that an alternative here would be to add a new .Last method which would be the same as .Get except returns the last header value. And also document the reason why someone might use one over the other. i.e. signature:

func (h Header) Last(key string) string

@beoran
Copy link

beoran commented Mar 6, 2022

Is the last header really the best one, though? In this case all headers should probably be considered.

@francislavoie
Copy link

The point is, it depends on the usecase whether the first or last or all values should be considered. But right now, Go's API only makes it easy to look at the first or get a slice of all the values, and getting the last requires doing a bit more work:

func lastHeaderValue(h http.Header, field string) (value string, ok bool) {
	values := h.Values(field)
	if len(values) == 0 {
		return "", false
	}
	return values[len(values)-1], true
}

@adam-p
Copy link
Contributor Author

adam-p commented Mar 6, 2022

While I think that "get last" is better than "get first", it's still not ideal. For example, getting the "rightmost" XFF IP is actually "count from the right depending on number of trusted proxies; or search from the right for the first IP that is not on the 'trusted' list; or search from the right for the first non-private/internal IP". That means that the desired IP might not actually be in the last header, since the search might need to go back further.

If we're considering a new method, then I think that the best "get single header value" would be either:

  • Header.Full(headerName) -- returns comma-separated join of all matching headers
  • Header.Full(headerName, separator string) -- joined with arbitrary separator

This is partly inspired by looking at NodeJS's message.headers, which does this:

Duplicates in raw headers are handled in the following ways, depending on the header name:

  • Duplicates of age, authorization, content-length, content-type, etag, expires, from, host, if-modified-since, if-unmodified-since, last-modified, location, max-forwards, proxy-authorization, referer, retry-after, server, or user-agent are discarded.
  • set-cookie is always an array. Duplicates are added to the array.
  • For duplicate cookie headers, the values are joined together with '; '.
  • For all other headers, the values are joined together with ', '.

So the default is comma (good for XFF; adheres best to RFC 2616), but cookie gets a semicolon. I think comma is best for set-cookie, as that character seems to be forbidden in the value (and is RFC-compliant).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

5 participants