Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: Allow obtaining original header capitalization #37834

Open
JohnRusk opened this issue Mar 13, 2020 · 14 comments
Open

net/http: Allow obtaining original header capitalization #37834

JohnRusk opened this issue Mar 13, 2020 · 14 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@JohnRusk
Copy link

What version of Go are you using (go version)?

1.13

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

Windows, x64

What did you do?

Make a HTTP request, read it's headers, send those headers somewhere else.

What did you expect to see?

Headers are preserved exactly

What did you see instead?

Header capitialization gets canonicalized

I realize this has been discussed before (about 7 years ago).

However, in the time since then many users have been adversely affected by this. E.g.
traefik/traefik#466
Azure/azure-storage-azcopy#113

So I'd like to find out, would the Go team consider not a change to the current behaviour, but simply a way for an HTTPResponse to provide an additional map, that maps canonicalizedName -> originalName. If we could just get that map, then those of us who really need header case preservation could use the information it contains to achieve what we need. (Find out the original capitalization when we read a response, and then directly manipulate the outbound request map when we forward that data on).

I'd be happy to contribute the code for the above, if we expected it to be accepted.

@toothrot toothrot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 13, 2020
@toothrot toothrot added this to the Backlog milestone Mar 13, 2020
@toothrot
Copy link
Contributor

Hey @JohnRusk! I understand that one of your concerns is with Azure/azure-storage-azcopy#113, in which the API is expecting capitalized headers, despite RFC2616 specifying that header fields are to be treated as case-insensitive.

Issue #5022 was resolved by preserving case on outbound requests:
f0396ca

I don't think it's reasonable to keep a mapping of every header field transformation around for every request. As @bradfitz mentioned, http/2 explicitly lowercases all fields. I feel like making changes here for broken servers (which I understand are out of your control) doesn't seem reasonable.

Is it possible to keep a map of these special header fields around in your application and re-write them on your outbound request? I assume there is a subset that are sensitive, and it is not all HTTP headers. Their casing is preserved on outbound requests as I understand #5022.

/cc @bradfitz @rsc

@JohnRusk
Copy link
Author

Is it possible to keep a map of these special header fields around in your application and re-write them on your outbound request?

Not really, because the set of header fields that exists is customer-determined, and may vary from customer to customer and even from file to file.

I don't think it's reasonable to keep a mapping of every header field transformation around for every request.

What about some kind of hook then, some kind of function that get's called as they are pulled of the wire, and before they are canonlicalized. If the hook is null, it doesn't get called.

e.g.(roughly)

headerName := ... <raw value off the wire>
if request.headerNameNotifier != nil {
   request.headerNameNotifier(headerName)
}
headerName := canonalicalize(headerName)

Then customers like us could provide request.headerNameNotifier and everyone else could leave it nil.

@bradfitz
Copy link
Contributor

There are a few dup bugs in this space with a bunch of conversation. I can't find them at the moment (GitHub search is failing me) but they're somewhere.

@JohnRusk
Copy link
Author

I had trouble searching too. I have managed to find a few though. Are any of these the ones you were thinking of?

@JohnRusk
Copy link
Author

@ACECEO yes, it's a workaround in the sense that it works. No in the sense that it's a totally different product, with a different emphasis and style of usage. For customers who use the tool I work on, AzCopy, this Go issue remains a blocker.

@JohnRusk
Copy link
Author

@bradfitz Any update re thoughts on this? As noted above, the key issue is that we currently can't write case-preserving code that reads headers and forwards them on. In our tool, AzCopy, this is a big deal for certain customers who need to move their data around in Azure.

@zezha-msft
Copy link

Hi @ACECEO @toothrot, any additional thought?

@cy33hc
Copy link

cy33hc commented Jun 4, 2020

RFC2616

I've read the RFC2616, it just mentions that the header is case-insensitive. Nothing says that you should modify the the headers as you forward them.

@bradfitz
Copy link
Contributor

bradfitz commented Jun 4, 2020

I feel like I've replied to this a number of places, but I'll summarize here as well.

  • HTTP/2 makes this whole issue moot. So this bug only matters for HTTP/1<->HTTP/1 interactions, which becomes less relevant with each passing month.
  • Code that's sensitive to HTTP/1 header case is already broken; catering towards such code implicitly encourages such brokenness or at least allows it to continue, without providing pressure on those applications to get fixed

As such, any fix here must have super minimal cost, both in terms of API surface (cogntitive load cost, reading more godoc) and runtime cost (we shouldn't allocate or populate a new data structure with this info unless the calling code opted in to wanting that). Notably, we can't change the representation of http.Header at this point.

It probably needs to be opt-in at the http.Server level. Perhaps a new bool there. That means "middleware" can't assume the original case is present, and can't ask for it to be present retroactively, as we'd only parse/preserve it if they opted in. So perhaps that's too inconvenient.

Maybe we just add a new RawHTTP1Header []byte field to http.Request, documenting that it's only for HTTP/1 and it's the exact bytes from the Request-Line to the final closing newlines. Then let code outside of the standard library deal with it. It'd also need to be documented that the slice is not a copy and it aliases some internal buffer and is only valid until the next request on the connection. That kinda sucks as an API requirement, but it's not too dissimilar to e.g. https://golang.org/pkg/bufio/#Reader.ReadSlice. And it sucks less than making all users pay for an allocation that very little code will ever use.

If somebody has an alternate API proposal or implementation I'll take a look, assuming it meets these general requirements.

@JohnRusk
Copy link
Author

JohnRusk commented Jun 4, 2020

Hi Brad. That suggestion about RawHTTP1Header sounds like a good option. Minimal impact, but enough to let us do what we need to. I presume the internal buffer, which it aliases, already exists. Is that right? If so, what about exposing it as method instead of a field? Perhaps called something like GetRawHTTP1Header() or CopyRawHttp1Header()? The method would make a copy from the internal buffer. But the copy would only happen if the method was called. So the only users who would pay the price of the allocation are those who actually need and use this feature. Perhaps that would be cleaner, in terms of API design, because the result of the method call is indeed a copy, and so there's no need to worry about the duration of its validity. (You just need to call the method before the next request is made on the connection).

Will it work client side too? (Just checking because you mention http.Server in the earlier paragraph, but we need it client-side, and I imagine other users will too.)

@bradfitz
Copy link
Contributor

bradfitz commented Jun 5, 2020

Perhaps called something like GetRawHTTP1Header() or CopyRawHttp1Header()? The method would make a copy from the internal buffer.

My reservation with a method is that they render pretty prominently in godoc HTML compared to, say, a struct field.

Perhaps that would be cleaner, in terms of API design, because the result of the method call is indeed a copy, and so there's no need to worry about the duration of its validity.

It would be cleaner, but OTOH this is definitely in the realm of "you should know what you're doing" territory so making a super comfortable & safe API isn't the highest priority.

BTW, the other constraint that would need to be documented on the lifetime of this slice is that it's only valid before the Request.Body is read from. Because that can also advance the buffer.

But now that I think about it more, I'm not even sure that this API works, as I think we only use a 2KB or 4KB bufio.Reader to read from the connection (so that's all we have in memory at a time) and we permit by default up to 1MB of headers (https://golang.org/pkg/net/http/#DefaultMaxHeaderBytes). So RawHTTP1Header []byte on by default won't work, at least in the general case. (And documenting that it only works for "small" HTTP requests seems pretty terrible.)

So we probably do need some sort of opt-in mechanism, and then it can't work in HTTP middleware packages that don't have control over the http.Server setup.

Will it work client side too?

Yeah, it probably should. I don't think we'd want to accept a change that only did one, lest it turn out that whatever API we pick isn't sufficient for the other. Seeing it work with both would be a good sign that the API was sufficient.

@JohnRusk
Copy link
Author

JohnRusk commented Jun 6, 2020

Good point re whether it should be documented prominently or not. (I agree with you that "not prominent" is preferable).

The question of whether it only works on small requests seems key. I think folks who have meaningful info in headers (such as some of our users) could easily have more than 4KB. Do you have any other suggestions? Shall we (our team here) see if we can come up with any alternative that meets the general requirements that you've outlined?

@gufertum
Copy link

As I understand this problem, this prevents many users to just use traefik as a plugin-proxy, because it ALTERS the headers. There is a ton of software which is using case-sensitive headers and a transparent proxy should not change them...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

7 participants