x/sys/windows: use windows.ReadFile read chinese failed #54389

Closed
fzdwx opened this issue Aug 11, 2022 · 11 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

@fzdwx

fzdwx commented Aug 11, 2022

What version of Go are you using (go version)?

$ go version
go version go1.18 windows/amd64

Does this issue reproduce with the latest release?

Yes, it still reproduces after upgrading to v0.0.0-20220808155132-1c4a2a72c664.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
set GO111MODULE=on 
set GOARCH=amd64                                          
set GOBIN=                                                
set GOCACHE=C:\Users\Administrator\AppData\Local\go-build 
set GOENV=C:\Users\Administrator\AppData\Roaming\go\env   
set GOEXE=.exe                                            
set GOEXPERIMENT=                                         
set GOFLAGS=                                              
set GOHOSTARCH=amd64                                      
set GOHOSTOS=windows                                      
set GOINSECURE=                                           
set GOMODCACHE=C:\Users\Administrator\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\Administrator
set GOPRIVATE=
set GOPROXY=https://goproxy.cn,direct
set GOROOT=C:\Users\Administrator\go\go1.18
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=C:\Users\Administrator\go\go1.18\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=go1.18
set GCCGO=gccgo
set GOAMD64=v1
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=0
set GOMOD=E:\project\cancelreader\go.mod
set GOWORK=
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=C:\Users\ADMINI~1\AppData\Local\Temp\go-build2492465113=/tmp/go-build -gno-record-gcc-switches

What did you do?

Read Chinese text from the console.

https://go.dev/play/p/e3o6XINrBIv

demo code
package main

import (
	"fmt"
	"golang.org/x/sys/windows"
	"syscall"
	"unicode/utf16"
)

var fileShareValidFlags uint32 = 0x00000007

func main() {
	conin, err := windows.CreateFile(
		&(utf16.Encode([]rune("CONIN$\x00"))[0]), windows.GENERIC_READ|windows.GENERIC_WRITE,
		fileShareValidFlags, nil, windows.OPEN_EXISTING, windows.FILE_FLAG_OVERLAPPED, 0)
	if err != nil {
		fmt.Println(err)
		return
	}

	bytes, err := f1(conin)
	if err != nil {
		fmt.Println(err)
		return
	}

	fmt.Println(string(bytes))
}

func f1(handle windows.Handle) ([]byte, error) {
	one := make([]byte, 1024)
	var n uint32
	ov, err := newOverlapped()
	if err != nil {
		return nil, err
	}
	defer windows.CloseHandle(ov.HEvent)

	err = windows.ReadFile(handle, one, &n, ov)
	if err == syscall.ERROR_IO_PENDING {
		// The overlapped read has not finished yet; wait for it to complete.
		err = windows.GetOverlappedResult(handle, ov, &n, true)
	}
	if err != nil {
		return nil, err
	}

	return one[:n], nil
}

func newOverlapped() (*windows.Overlapped, error) {
	var ov windows.Overlapped
	h, err := windows.CreateEvent(nil, 0, 1, nil)
	if h == 0 {
		return nil, err
	}
	ov.HEvent = h
	return &ov, nil
}

What did you expect to see?

The Chinese input is read and printed correctly.

What did you see instead?

The Chinese input is not decoded correctly.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Aug 11, 2022
@gopherbot gopherbot added this to the Unreleased milestone Aug 11, 2022
@thanm thanm removed the compiler/runtime Issues related to the Go compiler and/or runtime. label Aug 11, 2022
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Aug 11, 2022
@thanm
Contributor

thanm commented Aug 11, 2022

In your demo code, you read a slice of bytes from the console, then convert to a string:

fmt.Println(string(bytes))

which seems to assume that those bytes are UTF-8 encoded. Are you sure this is the case? If some other encoding is being used, that would explain why you are seeing garbage output.

Example https://go.dev/play/p/_A2_ideqH8H

package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"log"
	"os"
	"unicode/utf16"
)

func main() {
	// Write a file containing UTF-16 encoded data.
	const file = "/tmp/foo.data"
	x := utf16.Encode([]rune("世界"))
	of, err := os.OpenFile(file, os.O_RDWR|os.O_CREATE|os.O_TRUNC, 0666)
	if err != nil {
		log.Fatal(err)
	}
	if err := binary.Write(of, binary.LittleEndian, x); err != nil {
		log.Fatal(err)
	}
	of.Close()

	// Read back the bytes from that file.
	content, err2 := os.ReadFile(file)
	if err2 != nil {
		log.Fatal(err2)
	}

	// Convert to string (assumes UTF-8).
	fmt.Printf("%s\n", string(content))

	// Better conversion.
	var nx [2]uint16
	b := bytes.NewBuffer(content)
	if err := binary.Read(b, binary.LittleEndian, &nx); err != nil {
		log.Fatalf("binary.Read: %v", err)
	}
	fmt.Printf("after binary.Read: %+v\n", nx)
	v := utf16.Decode(nx[:])
	fmt.Printf("%c %c\n", v[0], v[1])
}
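
The example above reads exactly two code units into a fixed-size array. A minimal sketch of the same idea for input of any length might look like the following. The helper name decodeUTF16LE is hypothetical, and the sketch assumes the bytes are little-endian UTF-16 with no byte order mark.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// decodeUTF16LE is a hypothetical helper: it interprets b as little-endian
// UTF-16 code units and returns the equivalent Go (UTF-8) string.
// A trailing odd byte, if any, is ignored.
func decodeUTF16LE(b []byte) string {
	units := make([]uint16, len(b)/2)
	for i := range units {
		units[i] = binary.LittleEndian.Uint16(b[2*i:])
	}
	return string(utf16.Decode(units))
}

func main() {
	// "世界" encoded as UTF-16LE: U+4E16, U+754C.
	raw := []byte{0x16, 0x4e, 0x4c, 0x75}
	fmt.Println(decodeUTF16LE(raw))
}
```

This only helps when the source really is UTF-16; it does not apply to bytes in a legacy code page.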

@thanm thanm added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. and removed compiler/runtime Issues related to the Go compiler and/or runtime. labels Aug 11, 2022
@fzdwx
Author

fzdwx commented Aug 11, 2022

I'm sure it's UTF-8, @thanm.

@fzdwx
Author

fzdwx commented Aug 11, 2022

@thanm You can take a look at this gif, it's what I just ran with my code
demo

@thanm
Contributor

thanm commented Aug 11, 2022

I can't read your GIF -- please post in text format. Thanks.

@fzdwx
Author

fzdwx commented Aug 11, 2022

OK, here is the code: https://go.dev/play/p/e6nVtwf7GLH

input
output:

好
bytes to string ��

bytes  [186 195 13 10]

input asd
output:

asd
bytes to string asd

bytes  [97 115 100 13 10]

@thanm
Contributor

thanm commented Aug 11, 2022

Thanks. In that case, I don't have a good idea why ReadFile would be returning [186 195 13 10]; as far as I can tell, that doesn't have anything to do with '好'. Have you tried writing '好' to a regular file and then reading that back?

@fzdwx
Author

fzdwx commented Aug 11, 2022

OK, I wrote another test program that reads data from a regular file and from the console: https://go.dev/play/p/T1qkWs1DOsF

output:

foo data to string 好
foo bytes [229 165 189 10]
=====================
please input word:好 
consle data to string ��
consle bytes [186 195 13 10]

It looks like the problem is specific to reading from the console.
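
One way to check the two byte slices reported above is to ask whether each is valid UTF-8 before converting it to a string. This is a minimal sketch; the helper name describe is hypothetical, and it only tells you that the console bytes are not UTF-8, not which encoding they are in.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper: it reports whether b is valid UTF-8.
// string(b) only yields readable text when it is.
func describe(b []byte) string {
	if utf8.Valid(b) {
		return fmt.Sprintf("valid UTF-8: %q", string(b))
	}
	return fmt.Sprintf("not valid UTF-8: % x", b)
}

func main() {
	fmt.Println(describe([]byte{229, 165, 189, 10})) // bytes read from the regular file
	fmt.Println(describe([]byte{186, 195, 13, 10}))  // bytes read from the console
}
```

The file bytes are the UTF-8 encoding of '好' (e5 a5 bd), while the console bytes start with 0xba, which cannot begin a UTF-8 sequence.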

@thanm thanm added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Aug 11, 2022
@thanm
Contributor

thanm commented Aug 11, 2022

Thanks for that.

@golang/runtime @alexbrainman

@tdakkota

It seems ReadFile returns the console input in the system encoding rather than UTF-8. Decoding the same bytes as GB18030 gives the expected character:

package main

import (
	"fmt"

	"golang.org/x/text/encoding/simplifiedchinese"
)

func main() {
	decodeBytes, err := simplifiedchinese.GB18030.NewDecoder().Bytes([]byte{186, 195, 13, 10})
	if err != nil {
		panic(err)
	}
	str := string(decodeBytes)
	fmt.Printf("%q", str)
}

Output:

"好\r\n"
Program exited.

@thanm
Contributor

thanm commented Aug 12, 2022

Thank you @tdakkota .

@fzdwx, is there anything to fix in Go here, or should I close this bug out?

@fzdwx
Author

fzdwx commented Aug 12, 2022

@thanm @tdakkota Thank you all!
