encoding/json: unable to set options when unmarshaling #17654
Labels: FrozenDueToAge, NeedsDecision (feedback is required from experts, contributors, and/or the community before a change can be made)
What version of Go are you using (go version)?
go version go1.7.3 darwin/amd64
What did you do?
Attempted to make Go pass JSON parsing tests which include large numbers and trailing invalid JSON.
Background
json.Unmarshal and json.Decoder behave differently beyond the fact that one parses from a []byte and the other from an io.Reader: https://play.golang.org/p/64EDWtq5M6
Unmarshal fails, Decode succeeds. This is expected behavior, as Decoder is documented to process a stream of JSON: it decodes the first valid value and stops before reading any of the invalid input.
Problem
The problem comes in when one wants the behavior of Unmarshal (interpreting the entire input as a single JSON element and returning an error if it is invalid) and also wants to set decoding options, such as UseNumber(). The only way to set UseNumber() is via Decoder.
This is highlighted in the results of a recent comparison of JSON parsers (article/repo). Go did well, except for a few test cases. The test cases relevant to this issue involve large numbers (y_number_huge_exp.json, y_number_real_pos_overflow.json, y_number_real_neg_overflow.json).
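To make the conflict concrete, here is a minimal sketch that runs both APIs over two inputs. The inputs are illustrative stand-ins chosen for this example, not the contents of the test files named above, and the helper names viaUnmarshal and viaDecoder are made up for the sketch.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// viaUnmarshal parses data with json.Unmarshal, which validates the
// entire input before decoding it.
func viaUnmarshal(data []byte) (interface{}, error) {
	var v interface{}
	err := json.Unmarshal(data, &v)
	return v, err
}

// viaDecoder parses the first JSON value in data using a Decoder with
// UseNumber set; anything after that value is left unread.
func viaDecoder(data []byte) (interface{}, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.UseNumber()
	var v interface{}
	err := dec.Decode(&v)
	return v, err
}

func main() {
	inputs := []string{
		`[123123e100000]`, // number too large for a float64
		`[""],`,           // valid array followed by a stray comma
	}
	for _, in := range inputs {
		_, uErr := viaUnmarshal([]byte(in))
		_, dErr := viaDecoder([]byte(in))
		fmt.Printf("%-18s Unmarshal err: %v\n", in, uErr)
		fmt.Printf("%-18s Decoder err:   %v\n", in, dErr)
	}
}
```

Unmarshal reports an error for both inputs, while the Decoder with UseNumber() accepts both, so neither API alone gives "accept the huge number, reject the stray comma".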
On Twitter, @rsc noted that Go would pass the large number tests if the UseNumber() option were set. Unfortunately, switching to Decoder in order to set it also causes Go to fail a number of other tests, such as n_array_comma_after_close.json. This test expects a failure due to the comma after the array, but of course Decoder never reads the comma if Decode() is only called once.
Workarounds
It is possible to get the desired behavior by checking the Decoder for more data (https://play.golang.org/p/TKLyTWmo0M), but this is unnecessary extra work considering that UseNumber() only sets an option on decodeState, which is also used inside of Unmarshal.
Potential Fixes
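As a baseline for comparing the fixes, the workaround of checking the Decoder for more data can be sketched as a small helper. This is one way to do the check and not necessarily what the linked playground does; the name unmarshalUseNumber is hypothetical. It relies on a second Decode call returning io.EOF when only whitespace remains.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// unmarshalUseNumber is a hypothetical helper: it decodes exactly one JSON
// value from data with UseNumber set and rejects any trailing data.
func unmarshalUseNumber(data []byte, v interface{}) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.UseNumber()
	if err := dec.Decode(v); err != nil {
		return err
	}
	// A second Decode must hit io.EOF; anything else means the input
	// contained more than a single top-level value.
	var extra json.RawMessage
	switch err := dec.Decode(&extra); err {
	case io.EOF:
		return nil
	case nil:
		return errors.New("unexpected second JSON value after top-level value")
	default:
		return err
	}
}

func main() {
	for _, in := range []string{`[123123e100000]`, `[""],`, `[1] [2]`} {
		var v interface{}
		err := unmarshalUseNumber([]byte(in), &v)
		fmt.Printf("%-18s err: %v\n", in, err)
	}
}
```

Unlike json.Unmarshal, this decodes the first value before it looks at the rest of the input, so v may already be populated when a trailing-data error is returned.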
New Function
The simplest solution would be to create an UnmarshalUseNumber function, similar to what was suggested in #7067. This is not ideal for the reason stated in #7067 (comment):
To be fair, this isn't completely without precedent, as MarshalIndent already exists.
New Decoder Type
Create something like NewUnmarshaler, which would return a struct that had UseNumber() and Unmarshal([]byte, interface{}) error methods, allowing options to be set, but with the behavior of json.Unmarshal. However, NewUnmarshaler is a poor name because one might reasonably expect it to return an Unmarshaler, which is already a defined interface.
In my opinion, this is the best general approach, but I have been unable to come up with a good name to suggest for the function and returned struct type.