x/text/encoding: UTF-16 decoder handles unpaired surrogates incorrectly #39492
Labels
NeedsInvestigation
When decoding some strings containing unpaired surrogates, the UTF-16 decoder produces the wrong number of `\ufffd` runes. Some examples:

- `\xdc\x00\xdc\x00`: expected result `\ufffd\ufffd` (two copies of `\ufffd`), actual result `\ufffd`.
- `\xd8\x00\x00`: expected result `\ufffd`, actual result `\ufffd\ufffd`.

The expected results are derived from a WHATWG spec.
Also, the name of the internal function `isHighSurrogate` is misleading: it actually checks whether the argument is a low surrogate.

Code to reproduce: