net/http: UTF in HTTP header field value #49627
Comments
@neild, any idea what to do here?
This looks like there must be a duplicate, or perhaps there is some other reason.
Percent-encoding the header is clearly not correct. If you want to encode a header value you should do that yourself, since there is (so far as I know) no universal specification for the encoding of header values. We do perform some validation of header values, but it is currently limited to changing newlines to whitespace. This does prevent an invalid header value from completely changing the content of a request, but it permits invalid header values. If we do anything here, I think it should be to drop invalid header values entirely. I suspect someone has managed to depend on the current behavior, however, and sending a bad header and receiving an error back is arguably more informative to the user than dropping the header altogether.
https://datatracker.ietf.org/doc/html/rfc2616 https://datatracker.ietf.org/doc/html/rfc2047 Since this is a documented standard, it would be nice to have it in the implementation.
RFC 7230 supersedes 2616, and says:
I also note that the example above doesn't appear to be an invalid header value, since it contains only characters which match the I still think that if you've got a header field that should be RFC-2047 encoded, then you should apply that encoding yourself.
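A minimal sketch of applying RFC 2047 encoding yourself, using the standard library's `mime` package. The helper names `encodeRFC2047` and `decodeRFC2047` are hypothetical, and whether the peer actually decodes encoded-words in this particular header is an assumption that depends on the application.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
)

// encodeRFC2047 is a hypothetical helper: it wraps v in RFC 2047
// encoded-words (Q-encoding, UTF-8 charset). Pure-ASCII input that
// needs no encoding is returned unchanged by mime.QEncoding.Encode.
func encodeRFC2047(v string) string {
	return mime.QEncoding.Encode("utf-8", v)
}

// decodeRFC2047 is the matching hypothetical helper for the receiver.
func decodeRFC2047(v string) (string, error) {
	return new(mime.WordDecoder).DecodeHeader(v)
}

func main() {
	h := http.Header{}
	h.Add("utf-test", encodeRFC2047("ąčęėįšųū„“ž"))
	fmt.Println(h.Get("utf-test"))

	decoded, err := decodeRFC2047(h.Get("utf-test"))
	fmt.Println(decoded, err)
}
```

Long values may be split into several encoded-words separated by a space; `DecodeHeader` joins them again on the receiving side.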
That does not look like an invalid header field value. It would be an invalid header field name, but as @neild pointed out it's valid to have arbitrary data in header fields according to RFC 7230. The reason it renders as
Note that we will send an invalid header field value if you put actually invalid data in the value. e.g., We don't have a good path to return errors in header value validation to the user, however, so our options are limited to eliding invalid data, dropping the header, sending the bad value and letting the server sort it out, or (not a good idea) panicking.
If there are characters that should never go into a header field value, they should be escaped or removed. When I use a library, I expect it to do these validations for me. Thank you for your patience. Golang has the best community!
What version of Go are you using (go version)?
1.17.3
Does this issue reproduce with the latest release?
Yes
What did you do?
Using this: https://cs.opensource.google/go/go/+/refs/tags/go1.17.3:src/net/http/header.go;l=28
w.Header().Add("utf-test", "ąčęėįšųū„“ž")
What did you expect to see?
utf-test: %C4%85%C4%8D%C4%99%C4%97%C4%AF%C5%A1%C5%B3%C5%AB%E2%80%9E%E2%80%9C%C5%BE
What did you see instead?
utf-test: ����įšųū��ž
The current http/header implementation automatically produces invalid HTTP requests if UTF-8 text is used in a header field value. So I guess it would be much better to use percent-encoding, even though it is not in the RFC.