x/text: incorrect convert from utf8 to gb18030 #41990

Closed
xiongjiwei opened this issue Oct 15, 2020 · 3 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

@xiongjiwei

What version of Go are you using (go version)?

$ go version
go version go1.13.8 linux/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

What did you do?

https://play.golang.org/p/HzLLJkSCkd3

What did you expect to see?

AAA1

What did you see instead?

83389837

@gopherbot gopherbot added this to the Unreleased milestone Oct 15, 2020
@robpike
Contributor

robpike commented Oct 15, 2020

You are printing the output using %x, which will show the bytes of the result as hex values. Use %s instead. The output will still not be AAA1, but it will be more plausible. Here is a slightly rewritten version. https://play.golang.org/p/krw1NpUSeCV

Please let us know if this addresses the issue for you.
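To illustrate the difference between the two verbs, here is a minimal sketch using only the standard library. The helper name `hexAndText` is hypothetical, and the byte values are the ones reported in this issue; this is not the code from either playground link.

```go
package main

import "fmt"

// hexAndText formats the same byte slice twice: once with %x
// (each byte as two lowercase hex digits) and once with %s
// (the bytes written out unchanged as text).
func hexAndText(b []byte) (string, string) {
	return fmt.Sprintf("%x", b), fmt.Sprintf("%s", b)
}

func main() {
	// The four bytes reported under "What did you see instead?".
	b := []byte{0x83, 0x38, 0x98, 0x37}
	h, s := hexAndText(b)
	fmt.Println(h)        // 83389837
	fmt.Printf("%q\n", s) // the raw bytes, shown with escapes

	// The expected two-byte result, printed the same way.
	// %x gives lowercase "aaa1"; %X would give "AAA1".
	h2, _ := hexAndText([]byte{0xAA, 0xA1})
	fmt.Println(h2) // aaa1
}
```

Note that `%x` is the appropriate verb when the goal is to inspect the encoded bytes rather than to display them as text.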

@xiongjiwei
Author

xiongjiwei commented Oct 15, 2020

@robpike thank you for your comment, but I use %x on purpose: the Unicode code point U+E000, encoded as GB18030, should be AAA1.

@toothrot toothrot added the NeedsInvestigation label Oct 15, 2020
@kennytm

kennytm commented Sep 30, 2021

For reference, "\x83\x38\x98\x37" is the GB18030 encoding of U+F014, not U+E000. Both characters are within the PUA though.

(Also the bug is still not fixed.)
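The arithmetic behind that observation can be sketched as follows. Four-byte GB18030 sequences are numbered linearly (first and third bytes from 0x81, second and fourth from 0x30), and a range table then maps the linear index to a code point. The function names are hypothetical, and the single range entry used here (index 33550 corresponds to U+E865) is an assumption recalled from the WHATWG Encoding Standard's gb18030 ranges index, not taken from the x/text source.

```go
package main

import "fmt"

// fourByteIndex returns the linear index of a four-byte GB18030 sequence.
// The first and third bytes count from 0x81 (126 values each); the second
// and fourth bytes count from 0x30 (10 values each).
func fourByteIndex(b1, b2, b3, b4 byte) int {
	return (int(b1)-0x81)*12600 + (int(b2)-0x30)*1260 +
		(int(b3)-0x81)*10 + (int(b4) - 0x30)
}

// Assumption: one entry of the gb18030 ranges index, under which
// linear index 33550 maps to U+E865 and later indexes follow linearly.
const (
	assumedRangeIndex = 33550
	assumedRangeRune  = 0xE865
)

// runeInAssumedRange maps a linear index to a code point using only the
// assumed range entry above; it is not a general GB18030 decoder.
func runeInAssumedRange(idx int) rune {
	return rune(assumedRangeRune + idx - assumedRangeIndex)
}

func main() {
	idx := fourByteIndex(0x83, 0x38, 0x98, 0x37)
	fmt.Println(idx)                                 // 35517
	fmt.Printf("U+%04X\n", runeInAssumedRange(idx)) // U+F014
}
```

Under that assumption the bytes 83 38 98 37 land on U+F014, which agrees with the comment above.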

@golang golang locked and limited conversation to collaborators Sep 30, 2022