net/http: Request.Form field data may include HTML entities #45479
Labels: FrozenDueToAge, NeedsInvestigation
I'm no HTTP expert, so this could be working as intended, but I notice that when an HTML form is posted, the document's encoding is also used to encode the form data. So, if the document is Latin-1, and a text field contains "😃", which cannot be represented in Latin-1, then the client (Chrome, in my case) produces a URL ending in `?f=%26%23128515%3B`, which is the URL-encoding of `&#128515;`, which is the HTML entity reference for U+1F603, a smiley face.

Is Chrome's behavior, of using an HTML entity reference `&#...;` when the form encoding cannot express the form data,

(a) normal for a client, or
(b) Chrome attempting to fail gracefully when asked to do the impossible?

If (a), are servers expected to handle HTML entity references in form data? I could see no mention of this expectation in the net/http package code or docs. I also can't see how a server could distinguish a Chrome-introduced HTML entity reference from a form that literally contained that sequence, which makes me think (a) is not the answer.
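
To make the ambiguity concrete, here is a minimal sketch using the query string reported above. It assumes `url.ParseQuery` behaves like the form parsing behind `Request.Form` (percent-decoding only); `formValue` is a hypothetical helper, not part of net/http, and `html.UnescapeString` is shown only as something a server could apply itself.

```go
package main

import (
	"fmt"
	"html"
	"net/url"
)

// formValue is a hypothetical helper: it percent-decodes a raw query
// string with url.ParseQuery and returns the first value for key.
func formValue(rawQuery, key string) (string, error) {
	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", err
	}
	return values.Get(key), nil
}

func main() {
	// Query string as produced by Chrome for a Latin-1 document.
	v, err := formValue("f=%26%23128515%3B", "f")
	if err != nil {
		panic(err)
	}
	// After percent-decoding, the value is still the entity reference text.
	fmt.Printf("percent-decoded: %q\n", v)
	// A server could unescape it, but it would do the same to a value
	// where the user literally typed the characters "&#128515;".
	fmt.Printf("entity-decoded:  %q\n", html.UnescapeString(v))
}
```

A user who types the nine characters `&#128515;` into the field submits exactly the same query string, so unescaping on the server would silently change that input as well.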
If (b), is the usual advice "don't let that happen", in other words, make sure the HTML document's encoding is explicitly set to something like UTF-8?
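
If that is the advice, one way to follow it from a Go server might look like the sketch below. The handler name `serveForm`, the page markup, and the helper `contentTypeOf` are all illustrative assumptions; the idea is only that the charset is declared both in the `Content-Type` header and in the document itself.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// formPage declares UTF-8 in the document and on the form (illustrative markup).
const formPage = `<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>form</title></head>
<body>
<form method="get" action="/" accept-charset="utf-8">
<input type="text" name="f">
<input type="submit">
</form>
</body>
</html>
`

// serveForm is a hypothetical handler that serves the form page with an
// explicit UTF-8 charset in the Content-Type header.
func serveForm(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/html; charset=utf-8")
	fmt.Fprint(w, formPage)
}

// contentTypeOf runs a handler against a recorder and reports the
// Content-Type it set; used here only to check the sketch.
func contentTypeOf(h http.HandlerFunc) string {
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Header().Get("Content-Type")
}

func main() {
	fmt.Println(contentTypeOf(serveForm))
}
```

With the page encoded as UTF-8, a browser can percent-encode U+1F603 directly as its UTF-8 bytes (`%F0%9F%98%83`) and has no reason to fall back to an entity reference.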
A word of documentation in net/http might help. StackOverflow and the usual sources were surprisingly unhelpful.