encoding/json: misleading error for non-ASCII characters outside of strings #58713

Closed
LukeShu opened this issue Feb 24, 2023 · 3 comments

LukeShu commented Feb 24, 2023

What version of Go are you using (go version)?

$ go version
go version go1.20.1 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/lukeshu/.cache/go-build"
GOENV="/home/lukeshu/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/lukeshu/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/lukeshu/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.20.1"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/run/user/1000/tmpdir/go-build2806408646=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I fed the JSON decoder a document containing a non-ASCII Unicode character outside of a string. That character is not one of the characters that may appear outside of a string in a JSON document ({}[],:" plus digits, +-.eE, whitespace, and the letters of true, false, and null), so I expected the decoder to complain about that Unicode character.

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	var obj any
	err := json.Unmarshal([]byte(`😀`), &obj)
	fmt.Println(err)
}

https://go.dev/play/p/joi025GTvFM

What did you expect to see?

I expected to see the error invalid character '😀' looking for beginning of value, or perhaps invalid character '\xf0' looking for beginning of value.
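To make the two expected forms concrete, here is a minimal sketch of how an offending character could be quoted for an error message. The helper name describeOffending is hypothetical and is not part of encoding/json; it only assumes the standard unicode/utf8 and strconv packages. It decodes a whole rune when the input is valid UTF-8 and falls back to a \x escape for a byte that does not decode.

```go
package main

import (
	"fmt"
	"strconv"
	"unicode/utf8"
)

// describeOffending is a hypothetical helper (not part of encoding/json).
// It quotes the first character of data for use in an error message.
func describeOffending(data []byte) string {
	if len(data) == 0 {
		return "<empty>"
	}
	r, size := utf8.DecodeRune(data)
	if r == utf8.RuneError && size == 1 {
		// Invalid UTF-8: report the raw byte instead of treating it as a rune.
		return fmt.Sprintf(`'\x%02x'`, data[0])
	}
	// Valid UTF-8: quote the complete rune.
	return strconv.QuoteRune(r)
}

func main() {
	fmt.Println(describeOffending([]byte("😀")))   // complete four-byte rune
	fmt.Println(describeOffending([]byte{0xf0})) // lone lead byte, invalid on its own
}
```

With this approach the emoji input would be reported as '😀', and a stray 0xf0 byte would be reported as '\xf0'; neither case produces a character that is absent from the input.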

What did you see instead?

I got the error invalid character 'ð' looking for beginning of value. This is a nonsense error, as the character ð is not present in the input. Note that the UTF-8 encoding of 😀 (U+1F600) is []byte{0xf0, 0x9f, 0x98, 0x80} and that ð is U+00F0; the first byte of the UTF-8-encoded character is being taken as a complete unencoded rune.
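The mix-up can be reproduced outside of the decoder. The sketch below is only an illustration of the byte-versus-rune confusion, not the decoder's actual code; the function names firstByteAsRune and firstRune are made up for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstByteAsRune reinterprets the first byte of s as a code point,
// which is the mistake described above.
func firstByteAsRune(s string) rune {
	return rune(s[0])
}

// firstRune decodes the first complete UTF-8 sequence of s.
func firstRune(s string) rune {
	r, _ := utf8.DecodeRuneInString(s)
	return r
}

func main() {
	s := "😀"
	fmt.Printf("% x\n", []byte(s))         // f0 9f 98 80
	fmt.Printf("%q\n", firstByteAsRune(s)) // 'ð' (U+00F0)
	fmt.Printf("%q\n", firstRune(s))       // '😀' (U+1F600)
}
```

Converting the lead byte 0xf0 directly to a rune yields U+00F0, which is where the ð in the message comes from.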

Contributor

thanm commented Feb 27, 2023

This looks very similar to #58680. I am going to go ahead and dup the two bugs; please let me know if you object.

Contributor

thanm commented Feb 27, 2023

Dup of #58680.

thanm closed this as completed Feb 27, 2023
Author

LukeShu commented Feb 28, 2023

I'm OK with closing it because @dsnet says he's working on a rewrite that will fix a bunch of issues like this. But I do believe that this has a different root cause than #58680, and that if #58680 were getting a targeted bugfix, this would be worth keeping open as a separate bug.

golang locked and limited conversation to collaborators Feb 28, 2024