os: Windows encoding is incorrect when output is piped or redirected #7826

Closed
gopherbot opened this issue Apr 20, 2014 · 4 comments

@gopherbot

by zr95.vip:

Before filing a bug, please check whether it has been fixed since the
latest release. Search the issue tracker and check that you're running the
latest version of Go:

Run "go version" and compare against
http://golang.org/doc/devel/release.html  If a newer version of Go exists,
install it and retry what you did to reproduce the problem.

Thanks.

What does 'go version' print?

go version go1.2.1 windows/amd64

What steps reproduce the problem?
If possible, include a link to a program on play.golang.org.

1. Build the well-known hello-world program from the golang.org homepage:

package main

import "fmt"

func main() {
    fmt.Println("Hello, 世界")
}

Then run go build. Suppose the resulting executable is hello.exe.

2. In cmd or Windows PowerShell (code page set to cp936):
  .\hello.exe | more
3. OR:
  .\hello.exe > a.txt
  type a.txt

What happened?

It prints:

Hello, 涓栫晫

What should have happened instead?

It should print:

Hello, 世界

Please provide any additional information below.

Only non-ASCII characters are affected.

The output may vary with different code page settings.

Redirected output differs between cmd and PowerShell:

.\hello.exe > a.txt

When this is run in cmd, a.txt contains:

0000000: 4865 6c6c 6f2c 20e4 b896 e795 8c0a       Hello, .......

This is a valid UTF-8 string. The content is displayed correctly if a.txt is viewed
in an editor that supports UTF-8 encoding.
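
The claim that the cmd output is valid UTF-8 can be checked with a small Go sketch. The bytes are copied from the hex dump of a.txt in this report; the names cmdDump and inspect are made up for this illustration and are not part of any library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// cmdDump holds the bytes of a.txt as shown in the cmd hex dump.
// (Hypothetical name, used only for this illustration.)
var cmdDump = []byte{
	0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x2c, 0x20, // "Hello, "
	0xe4, 0xb8, 0x96, // 世
	0xe7, 0x95, 0x8c, // 界
	0x0a, // newline
}

// inspect reports whether raw is valid UTF-8 and how many runes it holds.
func inspect(raw []byte) (valid bool, runes int) {
	return utf8.Valid(raw), utf8.RuneCount(raw)
}

func main() {
	valid, runes := inspect(cmdDump)
	fmt.Printf("valid UTF-8: %v, %d bytes, %d runes\n", valid, len(cmdDump), runes)
	fmt.Print(string(cmdDump))
}
```

The 14 bytes decode to 10 runes, which is consistent with Go writing the string unchanged and cmd redirecting the raw bytes without re-encoding them.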

In PowerShell, a.txt contains:

0000000: fffe 4800 6500 6c00 6c00 6f00 2c00 2000  ..H.e.l.l.o.,. .
0000010: 936d 2b68 6b66 0d00 0a00                 .m+hkf....

This is the UTF-8 output, decoded as if it were cp936 and then re-encoded as
little-endian UTF-16 with a byte order mark (ff fe) and a CRLF line ending.
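
As a rough check of the UTF-16 half of that explanation, the sketch below decodes the PowerShell dump as little-endian UTF-16. The bytes are copied from the hex dump in this report; psDump and decodeUTF16LE are hypothetical names chosen for the example. The cp936 decoding step is not reproduced here, since the Go standard library does not include that code page.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// psDump holds the bytes of a.txt as shown in the PowerShell hex dump.
// (Hypothetical name, used only for this illustration.)
var psDump = []byte{
	0xff, 0xfe, // byte order mark, little-endian
	0x48, 0x00, 0x65, 0x00, 0x6c, 0x00, 0x6c, 0x00, 0x6f, 0x00, // "Hello"
	0x2c, 0x00, 0x20, 0x00, // ", "
	0x93, 0x6d, 0x2b, 0x68, 0x6b, 0x66, // U+6D93 U+682B U+666B
	0x0d, 0x00, 0x0a, 0x00, // CR LF
}

// decodeUTF16LE decodes little-endian UTF-16 bytes into a Go string,
// dropping a leading byte order mark if one is present.
func decodeUTF16LE(b []byte) string {
	units := make([]uint16, 0, len(b)/2)
	for i := 0; i+1 < len(b); i += 2 {
		units = append(units, binary.LittleEndian.Uint16(b[i:]))
	}
	if len(units) > 0 && units[0] == 0xFEFF {
		units = units[1:]
	}
	return string(utf16.Decode(units))
}

func main() {
	// %q makes the trailing \r\n visible.
	fmt.Printf("%q\n", decodeUTF16LE(psDump))
}
```

The three non-ASCII code units are the same garbled characters quoted under "What happened?", which suggests the damage occurs before the UTF-16 encoding step, not in Go's output.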

The problem also shows up with godoc:

godoc math/cmplx Rect | more

The symbol θ is then displayed incorrectly.

@ianlancetaylor
Contributor

Comment 1:

Labels changed: added repo-main, release-none, os-windows.

@alexbrainman
Member

Comment 2:

Your program is supposed to output the UTF-8 encoded string "Hello, 世界", and it does
that (please check the contents of a.txt). What the "more" and "type" commands do with it is a
different matter altogether. I don't see how this has anything to do with Go.
I can see how things "are not nice", but I don't see how we can improve the situation.
Perhaps you have some suggestions.
Alex

Status changed to WaitingForReply.

@gopherbot
Author

Comment 3 by zr95.vip:

I agree that Go did the right thing and that this is more an issue with the Windows commands.
Maybe Windows users should build their own UTF-8 versions of the "more" and "type" tools;
there is not much that Go can do.
There are other encoding issues with Go's builtin functions print, println and panic.
I'll create another entry for those.

@minux
Member

minux commented Apr 25, 2014

Comment 4:

Status changed to Retracted.

@golang golang locked and limited conversation to collaborators Jun 25, 2016
This issue was closed.