cmd/compile: NFD-normalized unicode identifiers result in malformed error messages #33271
Comments
I think the spec is pretty clear.
and
and
What is vague? There are outstanding issues to broaden the definition of identifiers, but as it stands the spec quite clearly does not include combining characters.
Ok. Yeah I misread that originally, missing the definition of |
I'm not following why just the error message should be fixed: why is that code illegal? The spec probably should switch from using the
For reference, this issue was not a feature request for that. When I originally filed this, what I considered a bug was that the error message does not properly protect the combining mark, so it composes with the quotes in compatible text renderers, as seen at the end of this error (rendered in Atom):
The reason I had referenced the spec was that I was checking the appropriate definition and missed the
As of now, I am not very well acquainted with Go's parser, and have not yet pinned down where this error is created. Later (probably after GopherCon) I can try to read through it and make a fix.
To see how other systems handle this, I looked at the Wikipedia page on combining characters. Wikipedia prepends a
@jadr2ddude Thanks for clarifying. What you are asking makes sense, but I'm not sure "combining characters in identifiers" is a common enough mistake to warrant special handling in how error messages are displayed. Moreover, it seems that applying the combining character to a
Also cc @griesemer
Eh, this isn't too critical, and the problem might not be worth the fix.
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
What operating system and processor architecture are you using (go env)?
What did you do?
I used an NFD-normalized Unicode identifier.
Smallest reproducible playground: https://play.golang.org/p/wf6_glAAIRP
Internally, the ä is encoded as two Unicode code points rather than one. If it is NFC-normalized (one code point), it works. The spec is fairly vague as to how NFD-normalized Unicode should be handled.
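To make the difference between the two encodings concrete, here is a minimal sketch (not the playground program linked above) that lists the code points of each form and checks them against Unicode's Letter categories, which is what the Go spec uses to define identifier letters. The helper name `describe` is made up for illustration.

```go
package main

import (
	"fmt"
	"unicode"
)

// describe is a hypothetical helper: it returns one line per code point
// in s, noting whether the code point is a Unicode letter (categories
// Lu, Ll, Lt, Lm, Lo) or a combining mark (categories Mn, Mc, Me).
func describe(s string) []string {
	var out []string
	for _, r := range s {
		out = append(out, fmt.Sprintf("%U letter=%t mark=%t",
			r, unicode.IsLetter(r), unicode.IsMark(r)))
	}
	return out
}

func main() {
	nfc := "\u00e4"  // precomposed ä: a single code point
	nfd := "a\u0308" // 'a' followed by COMBINING DIAERESIS: two code points

	fmt.Println("NFC:")
	for _, line := range describe(nfc) {
		fmt.Println("  " + line)
	}
	fmt.Println("NFD:")
	for _, line := range describe(nfd) {
		fmt.Println("  " + line)
	}
}
```

The second code point of the NFD form is a mark, not a letter, which is why it is not accepted as part of an identifier even though both strings usually render identically.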
What did you expect to see?
Either:
What did you see instead?
On some text renderers, the combining mark actually merges into the quote at the end of the error message.
Solutions
First, the spec should probably be clarified as to how it processes multi-codepoint letters.
Additionally, combining marks should probably not be written into the error message in raw form.
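One possible approach to the second point, sketched under assumptions: before a character is embedded in a message, replace any combining mark with its code point notation so it cannot attach to a neighbouring quote. The function `escapeMarks` and the message text are hypothetical and are not taken from the compiler's source.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// escapeMarks is a hypothetical helper, not part of cmd/compile. It
// rewrites every combining mark (Unicode categories Mn, Mc, Me) in s as
// "<U+XXXX>" and leaves all other code points untouched.
func escapeMarks(s string) string {
	var b strings.Builder
	for _, r := range s {
		if unicode.IsMark(r) {
			fmt.Fprintf(&b, "<%U>", r)
			continue
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	offending := "\u0308" // COMBINING DIAERESIS on its own

	// Illustrative message only; the real compiler wording may differ.
	fmt.Printf("raw:     unexpected character '%s'\n", offending)
	fmt.Printf("escaped: unexpected character '%s'\n", escapeMarks(offending))
}
```

In the raw line a renderer may draw the diaeresis on top of the opening quote; the escaped line stays plain ASCII for the mark and renders the same everywhere.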