
encoding/json: unmarshalling null into non-nullable golang types #33835

Open

Qhesz opened this issue Aug 26, 2019 · 13 comments
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.

Comments

Qhesz commented Aug 26, 2019

Does this issue reproduce with the latest release?

As of 1.12, yes.

What did you do?

I unmarshalled null into an int, string, bool etc. (any non-nullable Golang type)

What did you expect to see?

A type mismatch error, similar to if I'd unmarshalled a string into an int, or any other type mismatch.

What did you see instead?

No error at all, unmarshalling 'succeeds' at deserialising null into my non-nullable type.
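
A minimal reproduction sketch of the behaviour described above (the struct and field names are illustrative, not from the original report):

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	target := struct {
		Count int    `json:"count"`
		Name  string `json:"name"`
	}{Count: 42, Name: "before"}

	// Both fields are null in the JSON, yet Unmarshal reports no error
	// and leaves the existing values untouched.
	err := json.Unmarshal([]byte(`{"count": null, "name": null}`), &target)
	fmt.Println(err, target.Count, target.Name) // <nil> 42 before
}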

Discussion

I'm aware of #2540 , I'm specifically arguing for the reversal of that decision. I'm aware the documentation says "Because null is often used in JSON to mean “not present,” unmarshalling a JSON null into any other Go type has no effect on the value and produces no error.". I'm arguing against that behaviour.

1.

The unmarshaller shouldn't have an opinion on what null "means", it should accurately map json types into their golang equivalents.

2.

If I'm unmarshalling into a specific golang type, I've gone to the effort of statically defining the type that I expect. The unmarshaller should assist me, and reject any poorly typed input (which it does well for all types other than null). If I wanted to marshal into a nullable type, I would have specified a nullable type.

3.

myVar int and myVar *int currently behave the same way when unmarshalling. In golang, an optional value is often implemented as a pointer, so I read these as "int" and "optional int", and I would expect them to behave differently (see the sketch after this list).

4.

Currently, json struct tags (using the standard library encoding/json) and db struct tags (using the standard library database/sql) behave differently. In the sql packages, nullable types are explicit, and trying to scan an SQL NULL into a non-nullable type is an error. For sql, as in my point 2 above, a pointer to a value can be used as the nullable version of that type.

5.

It is impossible for me to override this behaviour of the unmarshaller. Defining my own type NoReallyIMeantInt int doesn't help. Unmarshalling still succeeds, because the current code parses the null first, and returns early without looking at the Golang type. (It explicitly sets interfaces, pointers, maps and slices to their zero values, and ignores all other types.)
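
As a sketch of point 3 (the names below are mine, not from the report): both fields receive a JSON null without any error; the only difference is that the pointer is set to nil while the plain int is left untouched.

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	n := 7
	v := struct {
		Plain   int  `json:"plain"`
		Pointer *int `json:"pointer"`
	}{Plain: 7, Pointer: &n}

	// Neither field produces an error for null.
	err := json.Unmarshal([]byte(`{"plain": null, "pointer": null}`), &v)
	fmt.Println(err, v.Plain, v.Pointer) // <nil> 7 <nil>
}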

Practicalities

I'm aware this decision was made a long time ago. Realistically, what are the chances that this will be fixed now? Is it a compatibility concern? Do we think people deliberately rely on the current unmarshalling behaviour of null?

bcmills (Contributor) commented Aug 26, 2019

CC @dsnet, @bradfitz, @mvdan for encoding/json.

Realistically, what are the chances that this will be fixed now?

Low, since it would change the run-time behavior of the program without a corresponding compile-time error.

Per https://go.googlesource.com/proposal/+/refs/heads/master/design/28221-go2-transitions.md#language-redefinitions: “In order for the Go ecosystem to survive a transition to Go 2, we must minimize these sorts of redefinitions.”

Is it a compatibility concern?

Yes.

Do we think people deliberately rely on the current unmarshalling behaviour of null?

encoding/json is widely used, so per Hyrum's Law I would expect that people are relying on the current behavior. I do not know how often that choice is deliberate.

@bcmills bcmills changed the title encoding/json unmarshalling null into non-nullable golang types encoding/json: unmarshalling null into non-nullable golang types Aug 26, 2019
@bcmills bcmills added this to the Unplanned milestone Aug 26, 2019
@bcmills bcmills added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Aug 26, 2019
flimzy (Contributor) commented Aug 26, 2019

IMO, this could only be changed with an opt-in to the json.Decoder type, similar to the DisallowUnknownFields() option added in Go 1.10.

Maybe call it EnforceStrictNull()?
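
For reference, a sketch of the existing opt-in this would mirror. EnforceStrictNull is only a suggested name and does not exist in encoding/json; DisallowUnknownFields does:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

func main() {
	var v struct {
		Name string `json:"name"`
	}

	dec := json.NewDecoder(bytes.NewReader([]byte(`{"name": "x", "extra": 1}`)))
	// Existing opt-in since Go 1.10: unknown object keys become an error.
	// A strict-null mode would presumably be enabled the same way.
	dec.DisallowUnknownFields()
	fmt.Println(dec.Decode(&v)) // json: unknown field "extra"
}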

Qhesz (Author) commented Aug 26, 2019

A json.Decoder opt-in would be better than nothing. I'll have a look at making a pull request.

Qhesz (Author) commented Aug 26, 2019

Edit: the code already lets custom UnmarshalJSON handle nulls, I was wrong.
A note about implementation: when I define my own type, I should be able to define whether it can unmarshal null or not.

For example, I would like to be able to define an OptionalInt struct such as:

type OptionalInt struct {
  Present bool
  Value int
}

And define that null is a valid json-value that can be unmarshalled into it.

Which means that the implementation should probably use UnmarshalJSON("null"), and not just whether the primitive type (struct, bool, etc) is nullable or not.
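
A rough sketch of such a type, assuming it opts in to null by implementing json.Unmarshaler itself (the exact semantics here are mine, not settled):

package main

import (
	"encoding/json"
	"fmt"
)

// OptionalInt records whether a value was present. Because it handles
// "null" explicitly in UnmarshalJSON, a strict-null decoder could still
// allow null for this type while rejecting it for a plain int.
type OptionalInt struct {
	Present bool
	Value   int
}

func (o *OptionalInt) UnmarshalJSON(b []byte) error {
	if string(b) == "null" {
		*o = OptionalInt{} // null means "explicitly absent"
		return nil
	}
	if err := json.Unmarshal(b, &o.Value); err != nil {
		return err
	}
	o.Present = true
	return nil
}

func main() {
	var a, b OptionalInt
	fmt.Println(json.Unmarshal([]byte(`3`), &a), a)    // <nil> {true 3}
	fmt.Println(json.Unmarshal([]byte(`null`), &b), b) // <nil> {false 0}
}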

(This only becomes an issue when using EnforceStrictNull - previously, OptionalInt would have been skipped entirely when the json was null, and I could abuse the zero value to determine whether it was skipped over, and infer that it was either null or that the field was absent entirely.)

flimzy (Contributor) commented Aug 27, 2019

I've been thinking about this, and I'm not sure that adding an option to json.Decoder is actually the right approach. It may be better to do this in a struct tag, even though I know that can mean a lot more work in some cases.

Imagine that a project which wants to take advantage of "strict null" interpretation (this proposal) embeds (in the JSON sense, not the Go struct sense) some third-party struct, as from a third-party SDK, a protobuf definition, or something else, which does not expect strict nulls.

Example:

import "foo"

type MyCompositeType struct {
    Key   string    `json:"key"`   // I want strict nulls here
    Value foo.Value `json:"value"` // but not here
}

If foo.Value expects null to be skipped, then this struct will break in strict-null mode, and there will be no recourse.

The only solution I can think of, but maybe there are others, would be to support this mode by a struct tag:

type MyCompositeType struct {
    Key   string    `json:"key,strictnull"` // I want strict nulls here
    Value foo.Value `json:"value"`          // but not here
}

This leaves the behavior of embedded types unaltered.

Qhesz (Author) commented Aug 27, 2019

Just thinking out loud, you could wrap the foo.Value in your own type like:

type FooValue foo.Value
// or
type FooValue struct {
  foo.Value
}

And override UnmarshalJSON to use the standard decoder:

// Assuming the struct-embedding variant above: unmarshal into the embedded
// foo.Value so the call doesn't recurse back into this method.
func (f *FooValue) UnmarshalJSON(bytes []byte) error {
  return json.Unmarshal(bytes, &f.Value)
}

Then you'd get the standard no-error behaviour.

Just trying to understand your example a bit better: if you're using strict nulls, and foo.Value is a non-nullable type (== the Foo field is required), surely you want the error? So that's not the interesting case.
If foo.Value is a nullable type (== the Foo field is optional), but its unmarshaller relied on null being skipped rather than handling it, you'd now get an unwanted error. This seems to be the interesting case, and we're talking about a nullable type.

I get your point, a nullable unmarshaller implemented with the current default decoder would likely not actually handle nulls, and thus break when used with strict nulls. Which is a compatibility argument - people relied on the old behaviour, and were not forced to handle null explicitly.

(EDIT: that's not how the current code works, the type would explicitly ignore null in its UnmarshalJSON. That is, it has already been hardcoded to be nullable in terms of json, and the interesting case is a bit different...)

If I was the project using strict nulls, I think I'd be ok overriding types to fix up 3rd party types. (I'm already being anal about nulls by opting into strict nulls after all). How do you feel about the overriding approach compared to struct tags?

If I was the SDK exporting the Golang type, I would eventually add explicit null handling to my nullable types, so that they would work regardless of which Decoder settings the end user picked. EDIT: there is no middle ground - the type must explicitly be strict about null or not. I suspect all current types are not strict, to match the current behaviour for builtin types.

( And in the magical far flung future, everyone would migrate to strict nulls, and it would become the default :) I can only dream )

flimzy (Contributor) commented Aug 28, 2019

A creative solution. It obviously only works for defined types (e.g. map[string]interface{} wouldn't work). Maybe that's okay? I'll continue to ponder.

Qhesz (Author) commented Aug 29, 2019

It doesn't seem that creative - but I appreciate the compliment nonetheless! This is the exact same thing you do if you want to add json serialisation to an existing type that doesn't have any, such as net.IPNet.

You would have to clearly define which of the builtin types are nullable (in the sense that unmarshalling null into them does not return an error). In general, I'd argue that almost all the builtin types are not nullable, and if you want a nullable version, use *type. Which would leave pointers and interfaces as nullable. And would mean maps and slices are not nullable.

Which gets a bit murky, because maps and slices have a nil value. But both of those nil values behave like the non-nil-but-empty value. So Golang itself doesn't strongly distinguish between the two, and doesn't really help us decide. The existing encoding/json code always initialises maps, which is treating them as non-nullable. From that I'd be comfortable defining them as not nullable.
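
A quick illustration of that weak nil/empty distinction (a minimal sketch, just standard library behaviour):

package main

import "fmt"

func main() {
	var s []int          // nil slice
	var m map[string]int // nil map

	// The nil values support len, reads and (for slices) append
	// just like their empty counterparts.
	fmt.Println(len(s), len(m)) // 0 0
	s = append(s, 1)            // appending to a nil slice works
	fmt.Println(s, m["miss"])   // [1] 0 (reading a nil map works too)
}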

Qhesz (Author) commented Aug 31, 2019

Whoops, my point 5 is wrong - I misread the source code. It is possible to write your own UnmarshalJSON that is strict about nulls. The existing code will call your unmarshaller with null, and you have the opportunity to return an error.

Which means I could go to the effort of making my own set of jsonNotNullX types like this:

type example struct {
	Foo string
}

type jsonNotNullExample struct {
	example
}

func (n *jsonNotNullExample) UnmarshalJSON(b []byte) error {
	if string(b) == "null" {
		return fmt.Errorf("strict null error")
	}

	return json.Unmarshal(b, &n.example)
}

But then I'm either polluting my code with json-specific types, or have to somehow then translate every jsonNotNullX type and every struct that contains a jsonNotNullX type back into the original types. Which is way more code than is worth it for strict null deserialisation.

Qhesz (Author) commented Aug 31, 2019

Being wrong about 5 also means that if I wanted to use strictNulls, include 3rd party types like time.Time, and have them also be strict about nulls, I would have to override them to be not-json-nullable. I'd have to do this for all 3rd party types I include, because they would currently all be explicitly json-nullable in their UnmarshalJSON functions.
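
For example, a sketch of what overriding one such type might look like; strictTime is a hypothetical wrapper name, and the pattern simply rejects null before delegating to time.Time's own UnmarshalJSON:

package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// strictTime rejects JSON null instead of silently ignoring it,
// then delegates everything else to time.Time.
type strictTime struct {
	time.Time
}

func (t *strictTime) UnmarshalJSON(b []byte) error {
	if string(b) == "null" {
		return fmt.Errorf("unexpected JSON null for time value")
	}
	return t.Time.UnmarshalJSON(b)
}

func main() {
	var t strictTime
	fmt.Println(json.Unmarshal([]byte(`null`), &t))                   // error
	fmt.Println(json.Unmarshal([]byte(`"2019-08-31T00:00:00Z"`), &t)) // <nil>
}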

Qhesz (Author) commented Aug 31, 2019

I can't see 3rd party types ever being fixed to be strict. They can't change the current UnmarshalJSON behaviour, because that's a breaking change. And there is no context provided to UnmarshalJSON that gives access to the decoder settings, so they can't dynamically choose between behaviours.

We've produced an ecosystem where all publicly exported types are explicitly json-nullable, and I can't imagine that's ever going to change. That's really demotivating.

ostcar commented Oct 29, 2020

I ran into this problem and I really did not expect the current behavior. I have code like this one: https://play.golang.org/p/8-HkG0HkXeN

Maybe you could at least document the current behavior, that null == 0?

mvdan (Member) commented Oct 30, 2020

@ostcar as per the original post, the current behavior is already documented:

The JSON null value unmarshals into an interface, map, pointer, or slice by setting that Go value to nil. Because null is often used in JSON to mean “not present,” unmarshaling a JSON null into any other Go type has no effect on the value and produces no error.

Which composes with how arrays are decoded:

To unmarshal a JSON array into a slice, Unmarshal resets the slice length to zero and then appends each element to the slice.

So any null JSON array elements result in zero value Go elements being appended to the slice.
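
A quick illustration of that composition (a minimal sketch, not from the original comment):

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	var nums []int
	// The null element produces no error; it simply becomes the zero value.
	err := json.Unmarshal([]byte(`[1, null, 3]`), &nums)
	fmt.Println(err, nums) // <nil> [1 0 3]
}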

I strongly agree that this behavior can't be changed in v1. I've already had multiple reasonable json "bug fixes" reverted in previous releases because they broke too much existing code, and this would almost certainly be reverted as well.

I think we should rephrase this issue to be about giving the developer more control as to whether they want to be lenient and accept nulls anywhere, or be stricter and only allow them in certain places. It seems in line with other previous json proposals like not allowing unknown fields, disallowing case insensitive matching in object keys, and so on. It's certainly an issue we should keep in mind for future JSON API/design changes.
