encoding/json: Unmarshal & json.(*Decoder).Token report different values for SyntaxError.Offset for the same input #34543

Open
maxatome opened this issue Sep 25, 2019 · 5 comments · May be fixed by #43716
Labels
help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
@maxatome

What version of Go are you using (go version)?

$ go version
go version go1.13 freebsd/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/max/.cache/go-build"
GOENV="/home/max/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="freebsd"
GONOPROXY=""
GONOSUMDB=""
GOOS="freebsd"
GOPATH="/home/max/Projet/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/freebsd_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build707409473=/tmp/go-build -gno-record-gcc-switches"

What did you do?

https://play.golang.org/p/zkVEGITpYIo

What did you expect to see?

The same Offset for the two error cases, as it is the same input.

What did you see instead?

Different Offset.

@bcmills changed the title from "json.Unmarshal & json.(*Decoder).Token give ≠ error json.SyntaxError.Offset" to "encoding/json: Unmarshal & json.(*Decoder).Token report different values for SyntaxError.Offset for the same input" on Sep 26, 2019
@bcmills added the help wanted and NeedsInvestigation labels on Sep 26, 2019
@bcmills
Contributor

bcmills commented Sep 26, 2019

CC @dsnet @mvdan

@bcmills added this to the Unplanned milestone on Sep 26, 2019
babolivier added a commit to babolivier/go that referenced this issue Oct 1, 2019
Decoder.Token() goes through the bytes of the JSON payload and returns the
next JSON token. When it encounters a token which isn't a square bracket,
a curly bracket, a colon or a comma, it calls Decoder.Decode() to process
it. Decode() increments the byte counter of the decoder's internal scanner
on each byte it encounters.

However, this increment isn't done if the character appears in the list
mentioned previously. As a result, the scanner's byte counter does not
correctly reflect the number of bytes it has processed in the payload, and
errors returned to the caller, e.g. SyntaxError instances, carry
inconsistent offset values.

Fixes golang#34543
@gopherbot

Change https://golang.org/cl/198047 mentions this issue: encoding/json: fix byte counter increments when using decoder.Token()

@mvdan
Member

mvdan commented Oct 10, 2019

The CL above could be the right fix, but it has no tests. If someone else wants to send a CL with a test, that would be helpful.

@gopherbot

Change https://golang.org/cl/284078 mentions this issue: encoding/json: fix byte counter increments when using decoder.Token()

AlexanderYastrebov added a commit to AlexanderYastrebov/go that referenced this issue Oct 13, 2021
The stream decoder does not count whitespace, empty objects, or empty
arrays in the syntax error offset. This change removes offset tracking from
the scanner and relies on the calling code to provide the correct value.

Fixes golang#44811, golang#34543
@gopherbot

Change https://golang.org/cl/355729 mentions this issue: encoding/json: calculate correct SyntaxError.Offset in the stream
