Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: encoding/json: nilasempty to encode nil-slices as [] #37711

Open
lukebarton opened this issue Mar 6, 2020 · 46 comments
Open

proposal: encoding/json: nilasempty to encode nil-slices as [] #37711

lukebarton opened this issue Mar 6, 2020 · 46 comments

Comments

@lukebarton
Copy link

lukebarton commented Mar 6, 2020

https://go-review.googlesource.com/c/go/+/205897

This is an improvement over @pjebs PR/Proposal (which can be found here: #27589) for the following reasons:

  • The previous PR had a naming issue which was unresolved. As such I spent a long time stewing over the name of the option, and ended up going with nilasempty which I think works very well to describe what it does, and even gives a clue as to why it exists
  • The encoder functions are entirely responsible for the encoding of their nil versions resulting in a cleaner implementation
  • There are separate tests

This is the cleanest, most straightforward, most complete change that could implement this feature in the existing encoding/json package. There are good, clean, readable tests too.

If the maintainers agree, I really think it'd be great to get this merged so that we can start benefiting from it's presence in the next suitable release! Let's put this shortcoming to bed 👍

The issue

I have an json API contract that says my response must look like this:

{
  "foo": ["some"],
  "bar": ["values", "for the consumer"], 
  "baz": []
}

(empty-array expected over null)

So naturally I create a type to help me satisfy the contract

type MyResponse struct {
  Foo: []string `json:"foo"`,
  Bar: []string `json:"bar"`,
  Baz: []string `json:"baz"`,
}

Then someone else comes along and creates a mock response for testing

myMock := MyResponse{
  Foo: []string{"blah"},
  Bar: []string{"blah"},
}

The type checker is happy with a nil slice for Baz

json.marshal(myMock)

for already understood reasons, this results in

{
  "foo": ["blah"],
  "bar": ["blah"],
  "baz": null,
}

which, of course does not satisfy the contract as intended.

In conclusion of that, the type system is not doing anything wrong, but it's not helping us satisfy contracts we have in json.

Moving forward

So what can we do? The obvious choice is a constructor to initialise MyResponse -- but Go doesn't have any way of enforcing constructors, so it has to be opt-in as far as usage goes, which is going to be forgotten and the type checker still isn't going to complain to save us from ourselves. It's not a satisfactory solution.

Digging deeper

I began thinking about where the fault lies - is it in Go's nil slices? is it in the json encoder? is it in our usage of types?

I concluded that the error does lie in the choice the json encoder makes in choosing how to encode a nil-slice and I'll do my best to explain why I think that's the case.

The author of the encoder chose to encode a nil-slice as null -- but why? The code that returns null is inside a function which knows it's encoding a slice.

func (se sliceEncoder) encode(e *encodeState, v reflect.Value, opts encOpts) {
if v.IsNil() {
e.WriteString("null")
return
}
se.arrayEnc(e, v, opts)
}

The reason that decision was made seems to be because it's nil under the hood, and nil == null.

So now I want to consider, why is the zero value of a slice nil in the first place? Correct me if I'm wrong (I'm relatively new to Go) but it's a memory allocation optimisation, meaning Go only needs to allocate memory when it gets it's first values -- which is totally smart when designing a memory-efficient language!

So, I'll say that nil-slices are just an implementation detail of Go in it's aim of being memory efficient out-of-the-box as a programming language -- It looks like a slice, it quacks like a slice, but aha! it's a nil under the hood -- but it's still a slice.

Aha!

Now let's look at the perspective of the encoder - the encoder appears to be encoding the underlying nil to null. But now we understand that the slice being nil is just an implementation detail of Go lang. The encoder should be encoding the values in the context of their type eg. []string to [], and definitely not converting the underlying zero-value-of-a-slice-nil to it's closest json equivelent of null. In wonderful irony, representing the underlying nil (which is a memory allocation optimisation for Go) as null in json actually costs more bytes over-the-wire than the empty array version []!

So how do we address this misstep? Ideally, mirroring omitempty and as the less common behaviour, nullempty would encode empty arrays as null. But to be backward compatible, something like nilasempty would be best.

Summary

In summary, nil-slices are an implementation detail of the Go language and nil-slices should be treated as plain-old slices by the json encoder - it's a slice and it's empty, and it should be represented in json as [] when specifying something like nilasempty to enable this new behaviour in a backwards compatible way.

@gopherbot gopherbot added this to the Proposal milestone Mar 6, 2020
@lukebarton lukebarton changed the title proposal: encoding/json nilasempty to encode nil-slices as `[] proposal: encoding/json nilasempty to encode nil-slices as [] Mar 6, 2020
@lukebarton
Copy link
Author

@kortschak @slrz rather than just thumbsdowning, or confused emoji'ing could you please give some actual feedback?

This issue continues to prevent Go being a primary choice for building a JSON API.
JSON schemas that look like { foo: null | Thing[] } are horrible for consumers to work with, because they need to do null checks. Serialising nil-slices to null causes that. The only alternative is to initialise empty-slices where slices exist through extraneous constructor functions. And it's still not possible to guard against people from initialising an object of the type directly (which would result in the nil-slices). It's not a very nice developer experience at all, and thus why I would choose another language. It disappoints me that this issue hasn't been progressed when there's a perfectly good patch, complete with tests, that's well-aligned with the aproaches of the existing json/encoder.

@kortschak
Copy link
Contributor

kortschak commented May 4, 2020

nil maps to null, and []T{} maps to []. This already just works out of the box, is simple and requires no explanation for users or addition to the API.

https://play.golang.org/p/5yJEv7hmrxf

If nil-safe deserialisation is required, then json.Unmarshaler can be implemented to take that into account.

@neilalexander
Copy link
Contributor

Having been bitten by bugs caused by problems just like this in the past, I am very much in support of a nilasempty option, or some similar way of configuring such optional behaviour.

In a controlled environment, or in a context where you can specifically plan for nulls in the JSON, the default behaviour makes sense.

In an uncontrolled environment where you can’t necessarily prevent someone from passing in nil map or slices as inputs, or where interoperability with existing software is required, the only way to avoid this is to pepper the code with checks for nil maps or slices everywhere. This quickly gets verbose, it quickly gets tiresome and it is still all too easy to shoot yourself in the foot and land on a panic down the line as a result.

The solution proposed is a reasonable and pragmatic one. Being expected to write custom marshallers and unmarshallers to avoid this problem is yet another place to introduce verbosity, complexity and to, in all likeliness, shoot yourself in the foot some more.

@ToadKing
Copy link

ToadKing commented May 5, 2020

The whole point of JSON is for exchanging data between two different systems. Having to do extra steps to get the data in a format for other systems just makes it harder to use JSON generated with Go with programs not written in Go.

A common usecase I use (and I'm sure other people use) is something like this:

var filtered []something

for _, item := range fullList {
	if isValid(item) {
		filtered = append(filtered, item)
	}
}

In this case, an empty filtered variable would be null if marshaled to JSON, when any place this is passed to is probably expecting an empty array. Since a nil slice behaves exactly like an empty slice in every other situation there shouldn't be extra steps that have to be done only for the JSON package.

Personally, if not for the 1.0 promise, I would want this new behavior to be the default. Implementing a data-exchange format with a language-specific quirk built into it is just bad design. null shouldn't show up except when a field is explicitly a pointer type.

@lukebarton
Copy link
Author

lukebarton commented May 14, 2020

@kortschak Your example might be just fine, if this were typical.

I think it appears to you to be less-of-an-issue because it's your example is only toy-size, scale this up to something more real world and the problem becomes more obvious:
https://play.golang.org/p/le_PN4NMw3u

Do you really want to have to initialise all those empty slices everywhere you use a T?

Now, considering this example, let's say I add J:
https://play.golang.org/p/0kQ_3L3YkQV

If I want to avoid an uninitialised J serializing to null I have to go through my entire codebase ensuring that I am initialising J as an []string{} in every single place that I am initialising a T. Not nice, especially when one gets missed. And remember here, we're not talking about just me remembering to do it, but everyone else on the team. Not gonna happen.

Constructor function are something to be forgotten about too, and they can't be enforced. Anyone can initialise a broken T{} with nil-slices that ends up serialising incorrectly -- a problem that doesn't manifest itself until runtime.

Honestly, given a type of T{ F []string }, a developer trying to build an API returning JSON would expect and bog-standard T{} to serialize to: { "F": [] } not { "F": null }. That's why this should be in the stdlib and as @ToadKing says -- if it weren't for never breaking backward compat, this behaviour should be the default.

There's a perfectly good patch sat here that would enable us to serialize in the way most developers would expect, without resorting to custom serializers or unenforcable constructors. We still have to litter our types with nilasempty property traits -- it's still not great, but it's better to have that available as part of the stdlib than all of the alternatives.

@kortschak
Copy link
Contributor

kortschak commented May 14, 2020

Anything can be made to look bad given an appropriate strawman.

If I want to avoid an uninitialised J serializing to null I have to go through my entire codebase ensuring that I am initialising J as an []string{} in every single place that I am initialising a T. Not nice, especially when one gets missed. And remember here, we're not talking about just me remembering to do it, but everyone else on the team. Not gonna happen.

You're right, it's not going to happen. This is what construction helpers are for.

If I were forced to write something like that and were bound by a broken JSON ecosystem then I would say it this way, https://play.golang.org/p/_wbj8qAOAWz, or this way https://play.golang.org/p/noNNRLiO3TI, or this way https://play.golang.org/p/eehvczp12f6 (at a stretch).

I'd also like to address this comment:

So now I want to consider, why is the zero value of a slice nil in the first place? Correct me if I'm wrong (I'm relatively new to Go) but it's a memory allocation optimisation, meaning Go only needs to allocate memory when it gets it's first values -- which is totally smart when designing a memory-efficient language!

This is not the case. It is not merely a memory allocation optimisation, it is a reflection of a fundamental property of Go slice values, that []T{} is not the same as []T(nil). This difference is important in many situations. Just because other languages do not make this distinction does not mean that the difference is not important here.

@neilalexander
Copy link
Contributor

neilalexander commented May 14, 2020

@kortschak Your proposals are perfect illustrations of how difficult it can be to avoid this problem in the real world. You either need to be meticulous to ensure that you don’t secretly have nil slices somewhere, or you need to define custom marshalling, or you need to define custom types. None of these are good solutions.

The point here is that this behaviour feels unexpected, because it isn’t apparent to the programmer from looking at the field type alone that this should even be possible without tracing every use of that field.

If you start with an array type, then you would expect the JSON output to be an array. The fact that nil slices act like empty arrays in Go is an implementation detail that is pretty much irrelevant here. You asked for an array. The type states it should be an array. It should be an array.

@ToadKing
Copy link

ToadKing commented May 14, 2020

The fact is that unless you explicitly check for nil, a nil slice and a slice of length 0 are functionally equivalent in Go code in any case I can think of. You can use both in range loops, call len on both, unpack with ..., and so on. There's no reason for this to appear as null in the one place where being null is detrimental for the sole purpose of the code: serializing data for exchanging between programs.

This is not the case. It is not merely a memory allocation optimisation, it is a reflection of a fundamental property of Go slice values, that []T{} is not the same as []T(nil). This difference is important in many situations. Just because other languages do not make this distinction does not mean that the difference is not important here.

I can't think of a single situation where a length 0 slice and a nil slice should behave differently. If it makes sense that they're different internally in the runtime, then that difference shouldn't be exposed in standard packages.

@kortschak
Copy link
Contributor

kortschak commented May 14, 2020

The point here is that this behaviour feels unexpected, because it isn’t apparent to the programmer from looking at the field type alone that this should even be possible without tracing every use of that field.

This is a subjective statement dressed as fact. The behaviour that exists now is entirely unsurprising to me based on the behaviour of types and values in the Go language.

There's no reason for this to appear as null in the one place where being null is detrimental for the sole purpose of the code: serializing data for exchanging between programs.

It is perfectly reasonably to serialise the null state since that may be (and often is) important. That some programs in some languages make incorrect assumptions about the difference between null and [] should not be papered over here at the expense of clarity and complexity.

I can't think of a single situation where a length 0 slice and a nil slice should behave differently. If it makes sense that they're different internally in the runtime, then that difference shouldn't be exposed in standard packages.

It is exposed in the language, by the == nil check.

@neilalexander
Copy link
Contributor

It is perfectly reasonably to serial the null state since that may (and often is) important.

Which is why nilasempty feels like a pragmatic solution - it’s optional. It doesn’t break the convention of the defaults, nor does it break anything that exists today.

That some programs in some languages make incorrect assumptions about the difference between null and [] should not be papered over here

Some languages simply don’t make these distinctions. JSON is a data interchange format. We have to expect that not everyone will have nil slices. Is it not OK to accommodate that?

at the expense of clarity and complexity.

Expecting custom marshallers to solve this problem is definitely adding extra complexity.

This is a subjective statement dressed as fact. The behaviour that exists now is entirely unsurprising to me based on the behaviour of types and values in the Go language.

Your expertise isn’t in question but rather the soul of the defaults. The problem is that this is a really easy hole to fall into and this PR is a sensible way to make it avoidable.

@ToadKing
Copy link

It is exposed in the language, by the == nil check.

My point being that outside of that one check, a nil slice and a zero-length slice behave the same. They behave so much the same it's easy to use a nil slices in place of zero-length slices which is the reason we have a proposal for this feature in the first place, since the JSON package appears to be one of the only places doing this check.

@kortschak
Copy link
Contributor

kortschak commented May 14, 2020

This appears to be going nowhere. My last comment here until others comment; I have addressed the issue of only giving a 👎 raised in this comment.

Which is why nilasempty feels like a pragmatic solution - it’s optional. It doesn’t break the convention of the defaults, nor does it break anything that exists today.

It adds additional complexity to the API of the JSON package that is not necessary.

Some languages simply don’t make these distinctions. JSON is a data interchange format. We have to expect that not everyone will have nil slices. Is it not OK to accommodate that?

This is already accommodated by virtue of the json.Encoder interface in a way that is consistent with the Go language's design principles of orthogonality and composability. nilasempty is neither orthogonal nor composable (it interacts with other tags).

Expecting custom marshallers to solve this problem is definitely adding extra complexity.

There are two types of complexity here. Local complexity and global complexity. Adding this adds global complexity that everyone needs to understand, using the language as it exists adds local complexity to some packages. By writing those packages using well established Go coding practices of locality that complexity can be well understood.

My point being that outside of that one check, a nil slice and a zero-length slice behave the same.

My counterpoint was that ignoring that one point is an invalid position to argue from. The availability of the nil check is important and so should be ignored. It is used in json-client code and reflects the behaviour of values within the Go language.

@ToadKing
Copy link

It is used in json-client code and reflects the behaviour of values within the Go language.

JSON is not a Go language construct though. It's a near-universal standard. Having Go language behavior be a part of the standard library for it is counter-productive. May as well call it GoJSON instead at that point.

@lukebarton
Copy link
Author

It's a shame the answer seems to be "Your JSON schema and the JSON ecosystem are broken" and "The Go-way is the right way".

I'm disappointed by Go's odd determination on behalf of all it's JSON users that null is preferred over an empty array, despite it being more bits over the wire/on the disk and forces schemas to expect the less straightforward union of array and null, rather than just array.

It's far more common to encounter an empty array for an empty array than null when dealing with JSON. I don't really mind how Go wants to optimise itself, but I don't want to have to answer the question "why is this null instead of just an empty array?" with "well, we use Go and that's what their marshaller does with uninitialised slices, oh and apparently we're all doing JSON wrong".

"But why didn't you just overcome the problem with more code?" Well, we had types and a standard lib marshaller, and this was the most sane way we could use them together. The alternative was a poop developer experience fraught with room for error. It's just more simple for us to give you null, sorry.

I'm really trying to address this issue. I've sat down and understood how the Marshaller works and answered why it doesn't just output an empty array based on the type definition. I worked out a way to fit this corrected behaviour in sensibly. I jumped through all the hoops and created a decent PR with tests.

Exposing Go's penchant for memory optimised slices by JSON marshalling a T{ Foo string Bar []string } to { "Foo": "", "Bar": null } is odd. An uninitialised string is "" but that []string? Yup. You're getting that as null my friend.

There's a PR here that adds a property trait in line with all the other json marshalling traits. It doesn't break anything and accepting it would make a lot of people happy.

@as
Copy link
Contributor

as commented May 15, 2020

This is a problem that occurs in custom JSON parsers, often in low-level c or c++. Are there any other situations where it creates a burden? I see lots of theoretical points being made, but are there any examples you can share that don't involve a custom JSON parser that is not RFC compliant?

I think the question that needs to be answered here is, why is this such a problem that Go needs an additional annotation to solve it instead of allocating the objects yourself to conform to the contract you have chosen?

{
  "foo": ["some"],
  "bar": ["values", "for the consumer"], 
  "baz": []
}

This contract is actually satisfied by many types, because baz is an abstract json array of anything, not just []string.

@ToadKing
Copy link

The problem isn't with parsers not handling null properly. The issue is null in a place of an array breaks schema and pretty much every language that isn't Go would require extra logic to check if a value is null or an array.

For values that are created right before JSON marshaling they can be created manually, but more often than not slices are passed around or in structs passed in to functions that handle marshaling and transmission. The empty slice creation will either have to be done for every possible slice field and every call to Marshal, or for the caller to make sure the slice isn't nil, which would just make every single caller have to do their own slice checks instead.

I also want to emphasize that this isn't just theoretical issues the PR attempts to solve: We have a Go server that talks exclusively with a web client with JavaScript, and the issues around null values being in place of empty arrays have been issues we have hit many many times before.

@lukebarton
Copy link
Author

lukebarton commented May 15, 2020

I also work on about 20 APIs that serve web clients.

This isn't about the web client not deserialising properly. The web client receives the JSON just fine - as @ToadKing says, we just have to force the client into null checks because that's how Go works.

Now I do hear the argument "Just initialise empty slices", I'm listening, it crossed our minds already. We've been here. I'll explain why it didn't work for me:

Setting the scene: You've got 20 APIs across 5 teams. You're hiring, new devs are coming in with variable Go experience. You need to implement typical JSON schemas that don't yield null where there's an empty array.

  1. Initialising empty slices at each point of use for types that are evolving is just not maintainable. People will forget. And it'll be a mess. Never gonna happen.
  2. "Ah! Use a constructor" sure but there are a few problems: there's no way to enforce the use of a constructor - if I want to export the type to type function params for example, then structs of that type are creatable and I can't stop that. There's no way to make using the type directly to create new structs off-limits. We'd rely on every developer not to use one of the languages features and to go through one of an army of constructors that now also need to be written, maintained and tested.

Since I'm writing about other solutions, I'll address the custom marshaller too: Yes, we could use a custom marshaller, but if a developer wants JSON, it's reasonable they'll reach for the stdlib marshaller, and across teams and projects, its hard to stop them because it'll work fine too. Until it doesn't. Now we have things using different serialisers and some bugs, some of which we aren't even aware of yet because we only see them at run time. Sure, we could write tests for this, but do we really want to write a whole new class of test for something we could lean on the type system and serialiser to do for us? Not really.

In summary, I can't enforce things like "Never initialise using the type" or "Never use stdlib json/marshaller" for very long. The moment you look away, some new hire will do it. It's not really the kind of environment I want devs to have to navigate.

"Oh hey, you need to make sure you use the 'nilasempty' trait to make sure it serialises the property to an empty array to satisfy our contracts where empty arrays are expected rather than null" is a much, much easier conversation to have. Especially since these traits will be easy to spot when comparing your new type to any of the existing types in the codebase (and you probably copypasta'd it anyway, right?)

@as
Copy link
Contributor

as commented May 15, 2020

I think that this is a proposal of convenience, perhaps for specific requirements your team may have, but I also don't think it's the silver bullet for your specific requirements either. Let's ignore the infrequency this is requested, the barrier of entry for adding a feature to stdlib, and performance implication for the sake of argument:

You want a tag that guarantees an invariant that slices are never nil, but at the same time we have concerns about enforcing a constructor. This is actually the same problem, because now you need to trust everyone to have this tag on your structs, for both sides of the protocol. Let's assume we have this tag now, and remove all nil checks. Now you also need to trust that your data structure is never, ever created any other way, or have nil checks for them anyway. The only way to really guarantee anything now is by transmitting it over the wire and having the magic tag. Is this a robust solution?

You can already range over a nil slice in Go, so there is no need for a nil check in the first place. The language did consider the corner cases. The Go struct exits the wire and enters the other wire exactly like it never left at all, and you don't have to check for nil on a slice anyway unless you're indexing it, but that's ok because indexing a zero-length slice anywhere will crash your program in most languages anyway, and it's ok to call len and cap on a nil slice, and its ok to range over it. And if your slice type has a method on it, you can have a method receiver work on that too!

I'm convinced there is no need for Go to solve this problem, because Go doesn't actually have this problem since its not a json problem as the structures are isomorphic on the sender and receiver.

If you still want to guarantee this for another language, it is a function you can have in your repo that calls reflect and initializes slice types to an empty slice value. This would work a lot better for you because it would become a property of your marshaller, and you could enforce this for any type in any package. Given that, I still think it would be helpful to see examples of some languages where this is a problem on clients, because I'm convinced Go is not one of them.

@lukebarton
Copy link
Author

lukebarton commented May 15, 2020

The whole world is not Go though. Browsers do not implement Go, they implement JavaScript for example. Sure go ahead and scoff at JavaScript clients. I have to deal with them.

If I give a JavaScript client { "Foo": null } when the agreed schema is { "Foo": [] }, then my Go service isn't conforming to the schema. We avoid null when crafting public facing JSON schemas because it's basically just forcing the client into a null check for literally no reason. null is rarely used in public facing JSON schemas, except in absence of objects e.g. instead of an empty object {}.

We're talking about Go's ability to fulfill an agreed schema. I'm being told that the schema must be wrong because Go. Not everything is Go. If you have two Go services and want to pass serialised JSON between them, sure, null serialisation for nil slices can be ignored because if, on both ends, Go thinks it's okay to pretend nil is actually a slice then everything works fine. But the whole world is not Go. We're building APIs to be consumed by all the languages in the world and currently the JSON marshaller makes choices that go against the grain of most public JSON schemas written. The GoJSON analogy that @ToadKing used was spot on.

I feel like we're crossing wires here because people who 'understand' this PR and it's other permutations (there are several other issues and PRs around this same subject) want to use Go as a language to help us build APIs that conform to schemas that aren't how the json/marshaller decided Go does JSON. I feel like @as and @kortschak are just looking at it from serialising and deserialising JSON to and from Go, and they're right - in that use case, everything is hunky dory. The whole world is not built on Go though.

And what language feature isn't a feature of convenience?

At present, I would not choose Go as a language for a JSON API, all down to the behavior of the json/marshall package because of the sheer number of hoops that need to be jumped through to address the shortcomings, which then leave you with a bunch more shortcomings to deal with afterwards. That's why we're here.

This PR is a simple trait. It's in line with everything else in the package. It's not huge. It's not a massive performance hit. It's literally a flag to say "hey, I don't care that you're using a nil slice under the hood, I just want an empty array like my type definition actually says when you serialise it to JSON".

I am on your side @as and @kortschak as much as I'm on @ToadKing's. I don't understand what the big deal about this PR is, given the headaches that it saves. Thinking the whole world is Go and if Go consumes Go-serialised JSON then there's no problem, doesn't really help anyone but people that only write Go that talks to Go.

@kortschak
Copy link
Contributor

The GoJSON analogy that @ToadKing used was spot on.

We already have an implementation of the JSON spec that is a Go dialect. https://play.golang.org/p/Oohiu_3SjJ6

I feel like @as and @kortschak are just looking at it from serialising and deserialising JSON to and from Go

No, this is not what I am saying. Although I have said it plenty of times here already, the behaviour of the JSON serialisation reflect the nature of the language, and there a perfectly reasonably ways to achieve what a ,nilasempty would effect, but with IMO greater clarity (a marshaler method is much clearer in its effect than a struct tag).

@lukebarton
Copy link
Author

lukebarton commented May 15, 2020

(a marshaler method is much clearer in its effect than a struct tag).

That's falling into the 'custom marshaler' category of solution.

And to be fair, you have omitempty as a tag, which is in the same category as nilasempty.

I'm going to leave the conversation at this: I, and plenty of other people, have struggled to craft JSON APIs because Go's stdlib serialises nil slices to null and the alternatives aren't ideal developer experiences by any stretch of the imagination. Several people have sought ways to address this in the form of several PR's, of which this is the latest.

I don't have a desire for a long drawn out battle over this. I'll say though, that given a fresh choice of Go or something else for a web-based JSON API, I'd totally choose something else. I don't want to have to bring my own serialiser, or write a bunch of constructors just to get the damn serialiser to output an empty array where there's an slice type that just happens to be empty. I want to be able to rely on the type system and the standard library JSON serialiser of a language I'm using. I can't do that with Go without mandating a bunch of against-the-grain disciplines as already discussed. After all, programming languages are for humans, to make their life better when developing software. Go doesn't make things things easy for us as schema-conforming JSON API developers, in fact, the experience is pretty rubbish, that's all.

If that's the end of the conversation, then so be it. We at least tried to help make Go better in relation to this use case, which isn't exactly niche. Cheers for the help @ToadKing, @neilalexander whoever you guys are :) nods in acknowledgement of shared struggle

@lpar
Copy link

lpar commented May 15, 2020

I don't want to have to bring my own serialiser, or write a bunch of constructors just to get the damn serialiser to output an empty array where there's an slice type that just happens to be empty.

Your problem isn't with slice types that are empty, it's with slice types that are nil. Stop your code from creating slices that are nil and you won't get nils in your JSON.

Yes, that might mean writing and using a constructor function, but that's not hard or onerous surely? I've never found myself thinking "Oh no, I have to write a constructor".

The alternative is to tailor the behavior of the serializer, and as you know there's already a way to do that too. Why not make yourself a JSONStringSlice type, stick it in a library, and use it everywhere? I almost always have to do something like that for any kind of date or time field that's going to JSON or coming from JSON. (In fact, I have a custom string list type I use in Kotlin to solve JSON string array issues.)

@ToadKing
Copy link

Your problem isn't with slice types that are empty, it's with slice types that are nil. Stop your code from creating slices that are nil and you won't get nils in your JSON.

That's a disingenuous argument, because literally everywhere else in Go, nil slices and empty slices behave in the exact same way. I can pass nil to, say, strings.Join and it treats it like an empty slice. There are even standard libraries that return nil slices to signify empty slices (like strings.SplitN). The JSON package is the odd one out. Having to refactor code to deal with one standard library that does things differently shouldn't need to be done. The common argument against this keeps coming back to "if you want an empty slice then make an empty slice, not a nil slice" but that has never been the case for every other part of the Go language and standard libraries. If there's supposed to be important differences between the two, then nil slices shouldn't behave like empty slices in every non-JSON package part of Go.

@lpar
Copy link

lpar commented May 15, 2020

@ToadKing I disagree that nil slices and empty slices should be treated the same way by the JSON serialization. Both behaviors are useful.

Edit: It would also break a lot of existing code, so I think it's a non-starter.

@lukebarton
Copy link
Author

lukebarton commented May 15, 2020

This change wouldn't break anything. It's an addition, not a changing existing behaviour and provides both behaviors, at the choice of the user.

@sermojohn
Copy link

Use a custom MarshalJSON function that sets nil slices to empty ones before marshaling. The tag proposal, to me, would be insufficient even as a custom package. The tag needs to be on all the slices anyway, so the dial for the entire feature should be on the marshaller, not the object.

You will never want null on some slice but not another one, so having it configurable per-slice is not ideal. Enforce this behavior in one function, and your entire problem disappears, along with the burden of creating and enforcing tags for everything.

At this point, I'm not even talking about this as a proposal, I'm saying there's a very simple way to solve your specific business problem with one utility package.

I can see real scenarios where the empty slice is a requirement for a mandatory field, while null is acceptable for an optional field, where the omitempty tag can be used.

I still see the value of setting a tag on a slice field, so the JSON marshaller encodes a nil slice pointer in Go to a empty array in the JSON format. I would expect this to have no consequence to the memory footprint of the struct, because only the serialization output will be affected.

@cben
Copy link

cben commented Oct 1, 2020

I feel part of the arguing is that different people have different goals:

  • Some want an array in JSON, always, even when empty. This is doable today by using non-nil slices but is cumbersome, and violates widely held equivalence, and this proposal would solve it.

  • Some want the ability for a single field to sometimes omit it and sometimes make it a (possibly empty) array. This is now impossible without resorting to custom marshallers! proposal: encoding/json: add omitnil option #22480 would make it easy by distinguishing between nil and 0-length slices.

nil slice make([]Foo, 0) tag
null []
omitted omitted omitempty
[] [] nilasempty #37711
omitted [] omitnil #22480

These two proposals don't technically conflict, but conceptually they pull in opposite directions.
Does this package hold 0-length and nil slices as equivalent or not?
Well, let's start by agreeing the current mixed answer is confusing 😜

  • If you need control between omission / null / array, that's harder yet... Can pointer to slice work (untested, confusing as slices are already a pointer type)? Well interface{} surely works, as it can represent any JSON, but that's not what people want with strongly-typed serialization... IMHO the disctintion between omission and null is unusual enough (for this package) to justify a custom marshaller.

[Everything I say above is for marshalling; didn't check if it holds for reading.]

@DeedleFake
Copy link

I had trouble with this recently. I'm working on a backend that's accessed by an iOS app, and the iOS developer was having some issues getting Swift to accept an array field that it expected to be there but was completely absent because of the use of omitempty, and I couldn't use null, either, so I wound up having to manually insert empty arrays into every possible case where an array could be returned, which was, to say the least, not fun. This would simplify stuff a ton if it gets accepted.

To clarify, as an example, I'm sending {}, and his end expected {"some_field": []}, and it didn't work.

Perhaps something like #23637 can also fix the issue if non-constant fields are allowed in the tags, thus allowing for things like custom, per-field logic without the need for string tags that set every possible rule that someone might want. For example, you could allow for a transformation function that the field value is called on and then the result of that, if it's present, is marshalled instead. Then you could just do

func(v []int) []int {
  if v == nil {
    return []int{}
  }
  return v
}

With generics you could even make that a function in the json package, such as func NilAsEmpty[T any](v []T) []T and then just use json.NilAsEmpty[int], or something.

@cbelsole
Copy link

cbelsole commented Oct 24, 2020

FWIW, we dealt with this situation 3 years ago and ended up forking encoding/json to add a MarshalSafeCollections() which ensures maps and slices are always represent as {} and []. API consistency and lack of decision making on a per struct basis made development a lot easier, less verbose, and less error-prone on the front end since we are not checking as many fields to see if they are null. There may be one-off cases where you want to return a null. In that case you can use a pointer to a collection. Eg *[]int or *map[string]int

package main

import (
	"fmt"

	"github.com/homelight/json"
)

type A struct {
	S        []int  `json:"s"`
	PtrS     *[]int `json:"ptr_s"`
	OmitS    []int  `json:"omit_s,omitempty"`
	OmitPtrS *[]int `json:"omit_ptr_s,omitempty"`
}

func main() {
	var a A
	
	b1, _ := json.Marshal(a)
	b2, _ := json.MarshalSafeCollections(a)

	fmt.Println(string(b1))
	fmt.Println(string(b2))
}
{"s":null,"ptr_s":null}
{"s":[],"ptr_s":null}

@dsnet dsnet changed the title proposal: encoding/json nilasempty to encode nil-slices as [] proposal: encoding/json: nilasempty to encode nil-slices as [] Dec 31, 2020
@mitar
Copy link
Contributor

mitar commented Jan 6, 2022

In support of addressing this issue.

@lukebarton: I would suggest that the proposal is clarified how would one control the behavior if a slice is nested somewhere inside of a map[string]interface{} value? Would tag apply recursively?

@MaxHorstmann
Copy link

+1 on this proposal, in case there's any hope of revitalizing it. Fwiw, Go generics make it easier now to replace an empty slice with a nil slice, as someone pointed out in a great Stack Overflow answer.

@hookenz
Copy link

hookenz commented Jul 19, 2022

I want this too. omitempty is good, but nilasempty is required for my contract also. We're jumping through long winded and error-prone hoops to get around it right now

@szmcdull
Copy link

szmcdull commented Feb 3, 2023

I found another example of how nil slice cannot be avoided, when it is generated by a variadic function, with no argument parsed to it.

john-floren-gravwell pushed a commit to john-floren-gravwell/gravwell that referenced this issue Apr 21, 2023
Javascript really does not like to get a null when it expects an empty array,
but that's what Go will send if your struct contains e.g. a []string and
you don't explicitly initialize the slice. These custom marshallers make sure
we send an empty array, making the frontend's life easier.

Go desperately needs some way to tell the JSON encoder that we want this
behavior, see:

golang/go#37711
golang/go#27589
golang/go#27813
@bjornbyte
Copy link

While it's disappointing that this change doesn't look like it will make it in, using generics you can now get just about the same effect

type ValidJSONArray[T any] []T

func (n ValidJSONArray[T]) MarshalJSON() ([]byte, error) {
	if n == nil {
		n = make([]T, 0)
	}
	return json.Marshal([]T(n))
}

func TestNotNullSlice_basics(t *testing.T) {
	// here is the problem we want to avoid
	var nilSlice []string
	bytes, _ := json.Marshal(nilSlice)
	t.Log(string(bytes)) // output: null

	// using generics we can force the nil slice to marshal to an empty array
	bytes, _ = json.Marshal(ValidJSONArray[string](nilSlice))
	t.Log(string(bytes)) // output: []

	// and it works with empty slices
	emptySlice := []string{}
	bytes, _ = json.Marshal(ValidJSONArray[string](emptySlice))
	t.Log(string(bytes)) // output: []

	// or slices that have stuff
	nonEmptySlice := []string{"a", "b"}
	bytes, _ = json.Marshal(ValidJSONArray[string](nonEmptySlice))
	t.Log(string(bytes)) // output: ["a","b"]
}

func TestNotNullSlice_InStruct(t *testing.T) {
	type testStruct struct {
		Name string
		Tags ValidJSONArray[string]
	}
	withNil := testStruct{
		Name: "Alice",
	}

	bytes, _ := json.Marshal(withNil)
	t.Log(string(bytes)) // output {"Name":"Alice","Tags":[]}

	withValue := testStruct{
		Name: "Bob",
		Tags: []string{"cheese"},
	}

	bytes, _ = json.Marshal(withValue)
	t.Log(string(bytes)) // output {"Name":"Bob","Tags":["cheese"]}
}

Hopefully that saves folks some pain.

@VojtechVitek
Copy link

As a runtime alternative to the above generic type, I just wrote this little package, which recursively initializes all nil slices in a given object using reflect pkg. This effectively forces json.Marshal() to render empty array [] instead of null values.

https://github.com/golang-cz/nilslice

Might be useful for some, until we have a proper solution merged to encoding/json stdlib pkg.

import "github.com/golang-cz/nilslice"
type Payload struct {
	Items []Item `json:"items"`
}

payload := &Payload{}

b, _ = json.Marshal(nilslice.Initialize(payload))
fmt.Println(string(b))
// {"items": []}

@dsnet
Copy link
Member

dsnet commented Oct 6, 2023

Hi all, we kicked off a discussion for a possible "encoding/json/v2" package that addresses the spirit of this proposal.
By default v2 proposes that a nil slice or map would marshal as an empty JSON array or object.

For flexibility, it provides the ability to alter this behavior with:

  • json.Marshal(v, json.FormatNilSliceAsNull(false)) at the marshal-call level
  • struct{ SliceField []T `json:",format:emitnull"` } at the struct-field level

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests