proposal: net/http: normalize comma-separated headers #62471

Open
CAFxX opened this issue Sep 6, 2023 · 2 comments

CAFxX commented Sep 6, 2023

Consider the following example:

	// normally the headers arrive from untrusted sources, this is just for exposition
	h := http.Header{}
	h.Add("Accept", "bar, baz")
	h.Add("Accept", "foo")

	// Intuitively this should print: []string{"bar", "baz", "foo"}
	// instead of:                    []string{"bar, baz", "foo"}
	fmt.Printf("%#v\n", h.Values("Accept"))

I am not sure we can or want to modify the behaviour of Values, but I would argue we should still offer a way to get a "normalized" list of values: anyone who actually receives a request with those headers will almost certainly want to perform that normalization. This is especially important because skipping it (which is what the current functions steer callers towards) may lead to header-confusion issues.

Just to define "normalization", I am thinking of something like this:

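var values []string
for _, s := range h.Values(key) {
	// Each stored value may itself be a comma-separated list.
	for _, v := range strings.Split(s, ",") {
		// Strip optional whitespace (space or tab) around each element.
		values = append(values, strings.Trim(v, " \t"))
	}
}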

that, AFAICT, is compliant with RFC 7230.

Assuming we don't want to change the behavior of the existing functions, this could be exposed (the name is a strawman) as

func (h Header) ValuesParsed(k string) (values []string) {
	for _, s := range h.Values(k) {
		for _, v := range strings.Split(s, ",") {
			values = append(values, strings.Trim(v, " \t"))
		}
	}
	return
}

(a proper implementation will likely attempt to minimize overhead, so possibly something closer to this example)

As a side note, it's a bit unfortunate that Values is already taken, as RFC 7230 uses the notation #(values) explicitly to mean a comma-separated list of values (that is semantically equivalent to having those values, in order, spread across multiple headers with the same name).

A sender MUST NOT generate multiple header fields with the same field
name in a message unless either the entire field value for that
header field is defined as a comma-separated list [i.e., #(values)]
or the header field is a well-known exception (as noted below).

A recipient MAY combine multiple header fields with the same field
name into one "field-name: field-value" pair, without changing the
semantics of the message, by appending each subsequent field value to
the combined field value in order, separated by a comma. The order
in which header fields with the same field name are received is
therefore significant to the interpretation of the combined field
value; a proxy MUST NOT change the order of these field values when
forwarding a message.

gopherbot added this to the Proposal milestone Sep 6, 2023
@ianlancetaylor
Contributor

CC @neild @bradfitz

@seankhliao
Member

I think this could be rolled into #41046
