cmd/go: Program exits abruptly with the 語 character #50105

Closed
gkamathe opened this issue Dec 11, 2021 · 4 comments

gkamathe commented Dec 11, 2021

I was playing around with a simple program to see how bytes and runes work:

$ cat main.go 
package main

import (
	"fmt"
)

func main() {
	s := "hello"
	for i := 0; i < len(s); i++ {
		c := s[i]
		fmt.Printf("index : %d has char : %c which is hex : 0x%x\n", i, c, c)
	}

	for i, rchar := range s {
		fmt.Printf("Rune at byte position : %d is : %#U\n", i, rchar)
	}
}
$ 

Running it gave the expected results

$ go run main.go 
index : 0 has char : h which is hex : 0x68
index : 1 has char : e which is hex : 0x65
index : 2 has char : l which is hex : 0x6c
index : 3 has char : l which is hex : 0x6c
index : 4 has char : o which is hex : 0x6f
Rune at byte position : 0 is : U+0068 'h'
Rune at byte position : 1 is : U+0065 'e'
Rune at byte position : 2 is : U+006C 'l'
Rune at byte position : 3 is : U+006C 'l'
Rune at byte position : 4 is : U+006F 'o'
$ 
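For reference, the first loop indexes the string byte by byte, while the range loop decodes it rune by rune. For ASCII text like "hello" the two agree, but they diverge once a string contains multi-byte UTF-8 characters. A minimal sketch of that difference (counts is a hypothetical helper name, not part of the program above):

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// counts returns the number of bytes and the number of runes in s.
// (Hypothetical helper, introduced only for this illustration.)
func counts(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	for _, s := range []string{"hello", "日本語"} {
		b, r := counts(s)
		fmt.Printf("%q: %d bytes, %d runes\n", s, b, r)
	}
}
```

So the index loop runs once per byte, and the range loop runs once per decoded character.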

Next I thought of using words from a different language for s

$ cat main.go 
package main

import (
	"fmt"
)

func main() {
	// s := "hello"
	s := "日本語"
	for i := 0; i < len(s); i++ {
		c := s[i]
		fmt.Printf("index : %d has char : %c which is hex : 0x%x\n", i, c, c)
	}

	for i, rchar := range s {
		fmt.Printf("Rune at byte position : %d is : %#U\n", i, rchar)
	}
}
$ 

Look at what happened when I ran it: the second for loop apparently never ran.
What?

$ go run main.go 
index : 0 has char : æ which is hex : 0xe6
index : 1 has char :  which is hex : 0x97
index : 2 has char : ¥ which is hex : 0xa5
index : 3 has char : æ which is hex : 0xe6
index : 4 has char :  which is hex : 0x9c
index : 5 has char : ¬ which is hex : 0xac
index : 6 has char : è which is hex : 0xe8
index : 7 has char : ª which is hex : 0xaa
index : 8 has char : $ 

OK, next I inserted a fmt.Println that prints a "======" separator between the two loops:

$ cat main.go 
package main

import (
	"fmt"
)

func main() {
	// s := "hello"
	s := "日本語"
	for i := 0; i < len(s); i++ {
		c := s[i]
		fmt.Printf("index : %d has char : %c which is hex : 0x%x\n", i, c, c)
	}

	fmt.Println("========================")
	for i, rchar := range s {
		fmt.Printf("Rune at byte position : %d is : %#U\n", i, rchar)
	}
}
$ 

Same behavior. So the problem seems to lie in the first for loop:

$ go run main.go 
index : 0 has char : æ which is hex : 0xe6
index : 1 has char :  which is hex : 0x97
index : 2 has char : ¥ which is hex : 0xa5
index : 3 has char : æ which is hex : 0xe6
index : 4 has char :  which is hex : 0x9c
index : 5 has char : ¬ which is hex : 0xac
index : 6 has char : è which is hex : 0xe8
index : 7 has char : ª which is hex : 0xaa
index : 8 has char : $ 

To narrow down the issue, I chose a different set of characters from the same language

$ cat main.go 
package main

import (
	"fmt"
)

func main() {
	// s := "日本語"
	s := "こんにちは"
	for i := 0; i < len(s); i++ {
		c := s[i]
		fmt.Printf("index : %d has char : %c which is hex : 0x%x\n", i, c, c)
	}

	for i, rchar := range s {
		fmt.Printf("Rune at byte position : %d is : %#U\n", i, rchar)
	}
}
$ 

This time it ran fine and the second for loop executed. Hmmm.

$ go run main.go 
index : 0 has char : ã which is hex : 0xe3
index : 1 has char :  which is hex : 0x81
index : 2 has char :  which is hex : 0x93
index : 3 has char : ã which is hex : 0xe3
index : 4 has char :  which is hex : 0x82
index : 5 has char :  which is hex : 0x93
index : 6 has char : ã which is hex : 0xe3
index : 7 has char :  which is hex : 0x81
index : 8 has char : « which is hex : 0xab
index : 9 has char : ã which is hex : 0xe3
index : 10 has char :  which is hex : 0x81
index : 11 has char : ¡ which is hex : 0xa1
index : 12 has char : ã which is hex : 0xe3
index : 13 has char :  which is hex : 0x81
index : 14 has char : ¯ which is hex : 0xaf
Rune at byte position : 0 is : U+3053 'こ'
Rune at byte position : 3 is : U+3093 'ん'
Rune at byte position : 6 is : U+306B 'に'
Rune at byte position : 9 is : U+3061 'ち'
Rune at byte position : 12 is : U+306F 'は'
$ 

I then focused on the problematic string, testing one character at a time:

s := "日"

Program worked fine for this character

$ go run main.go 
index : 0 has char : æ which is hex : 0xe6
index : 1 has char :  which is hex : 0x97
index : 2 has char : ¥ which is hex : 0xa5
Rune at byte position : 0 is : U+65E5 '日'
$ 

Moved to the next character

s := "本"

Again worked fine

$ go run main.go 
index : 0 has char : æ which is hex : 0xe6
index : 1 has char :  which is hex : 0x9c
index : 2 has char : ¬ which is hex : 0xac
Rune at byte position : 0 is : U+672C '本'
$ 

Which leaves the final character

s := "語"

And, as expected, the program seems to exit here: the second for loop never printed.

$ go run main.go 
index : 0 has char : è which is hex : 0xe8
index : 1 has char : ª which is hex : 0xaa
index : 2 has char : $ 

I tried running the program on the Go playground and it seemed to work fine there.
I am using the following version of Go on a Fedora distribution:

$ go version
go version go1.16.8 linux/amd64
$ 

If you notice where the exit happens

index : 2 has char : $ 

It seems it's coming from this line, so is %c or %x the problem here?
Is anybody else able to reproduce it on the exact same version?

fmt.Printf("index : %d has char : %c which is hex : 0x%x\n", i, c, c)
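A minimal sketch for checking whether %c is involved, assuming the terminal is reacting to the raw character that %c writes: the same loop with %q in place of %c, so any non-printable value is shown as a Go escape instead of being sent to the terminal unchanged. describeBytes is a hypothetical helper name, not something from the program above.

```go
package main

import "fmt"

// describeBytes returns one line per byte of s. It uses %q instead of %c,
// so non-printable values appear as escapes rather than being written raw.
// (Hypothetical helper, introduced only for this illustration.)
func describeBytes(s string) []string {
	lines := make([]string, 0, len(s))
	for i := 0; i < len(s); i++ {
		c := s[i]
		// %q on an integer prints a single-quoted character literal and
		// escapes anything that is not printable.
		lines = append(lines, fmt.Sprintf("index : %d has char : %q which is hex : 0x%x", i, c, c))
	}
	return lines
}

func main() {
	for _, line := range describeBytes("語") {
		fmt.Println(line)
	}
}
```

If the run completes with this change, that would point at how the terminal handles the character rather than at fmt itself.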

I also casually tried to reproduce the issue on several other (non-Google) online Go playgrounds, adding the line below to the program to print the Go version:

fmt.Println(runtime.Version())

I also see the issue with the version below on this online playground (https://www.onlinegdb.com/online_go_compiler):

go1.12.7
@seankhliao
Member

The program doesn't exit early; your terminal just can't display what it considers to be invalid output.

Closing as not a bug in go
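One way to check this without involving the terminal, sketched here with a hypothetical render helper: write both loops into a strings.Builder and print the result with %q, which escapes control characters so the complete output becomes visible.

```go
package main

import (
	"fmt"
	"strings"
)

// render runs both loops from the report, but writes into a strings.Builder
// instead of the terminal, so nothing interprets the bytes on the way.
// (Hypothetical helper, introduced only for this illustration.)
func render(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		c := s[i]
		fmt.Fprintf(&b, "index : %d has char : %c which is hex : 0x%x\n", i, c, c)
	}
	for i, r := range s {
		fmt.Fprintf(&b, "Rune at byte position : %d is : %#U\n", i, r)
	}
	return b.String()
}

func main() {
	out := render("語")
	// %q escapes control characters, so the whole captured output is shown.
	fmt.Printf("%q\n", out)
	fmt.Println("second loop ran:", strings.Contains(out, "Rune at byte position : 0"))
}
```

The captured string contains the "Rune at byte position" line, and the byte 0x9e shows up as the escape \u009e, a control character written by %c.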


zyxkad commented Dec 11, 2021

Hi, can you tell us your platform?
I tried running this program on my computer:

$ go version
go version go1.17.2 darwin/amd64

I couldn't find any problem.
[Two screenshots of the program output were attached: Screen Shot 2021-12-11 at 9 20 10 AM, Screen Shot 2021-12-11 at 9 21 00 AM]

@gkamathe
Author

@seankhliao In that case, shouldn't the second for loop at least run and print output beginning with "Rune at byte position..."? Even if it can't print the actual characters?

@zyxkad I am running Fedora on my laptop and already gave the Go version above. Could you try the online Go compiler at https://www.onlinegdb.com/online_go_compiler with the program below? You can see that the second loop doesn't show all the characters there either:

package main

import (
	"fmt"
	"runtime"
)

func main() {
	s := "日本語"
	for i := 0; i < len(s); i++ {
		c := s[i]
		fmt.Printf("index : %d has char : %c which is hex : 0x%x\n", i, c, c)
	}

	for i, rchar := range s {
		fmt.Printf("Rune at byte position : %d is : %#U\n", i, rchar)
	}

	fmt.Println(runtime.Version())
}

Here is the output I see on the above Golang online playground

index : 0 has char : æ which is hex : 0xe6
index : 1 has char :  which is hex : 0x97
index : 2 has char : ¥ which is hex : 0xa5
index : 3 has char : æ which is hex : 0xe6
index : 4 has char :  which is hex : 0x9c
index : 5 has char : ¬ which is hex : 0xac
index : 6 has char : è which is hex : 0xe8
index : 7 has char : ª which is hex : 0xaa
index : 8 has char : '
Rune at byte position : 3 is : U+672C '本'
Rune at byte position : 6 is : U+8A9E '語'
go1.12.7
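A possible explanation for why only 語 triggers this, offered as an assumption about the terminal rather than something confirmed in this thread: %c applied to a byte prints the Unicode code point with that value, and byte values 0x80 to 0x9f map to C1 control characters. The last byte of 語 is 0x9e, that is U+009E, which ECMA-48 names PM (Privacy Message). Some terminals may treat PM as the start of a control string and discard what follows until a terminator. The sketch below lists which bytes land in the control range; controlBytes is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unicode"
)

// controlBytes reports the byte offsets in s whose values, when printed
// with %c, become Unicode control characters.
// (Hypothetical helper, introduced only for this illustration.)
func controlBytes(s string) []int {
	var idx []int
	for i := 0; i < len(s); i++ {
		// rune(s[i]) is the code point %c would print for this byte.
		if unicode.IsControl(rune(s[i])) {
			idx = append(idx, i)
		}
	}
	return idx
}

func main() {
	for _, s := range []string{"日本語", "こんにちは"} {
		for _, i := range controlBytes(s) {
			fmt.Printf("%q: byte %d = 0x%x prints as %U\n", s, i, s[i], rune(s[i]))
		}
	}
}
```

Both strings contain bytes in the control range, but only 日本語 contains 0x9e, which would be consistent with こんにちは printing without trouble. How a given terminal reacts to each control character is the unverified part.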

@ianlancetaylor
Contributor

For questions about how Go works and how to use it, please use a forum, not the issue tracker. See https://golang.org/wiki/Questions. Thanks.

@golang golang locked and limited conversation to collaborators Dec 11, 2022