proposal: spec: add sum types / discriminated unions #19412

Open
DemiMarie opened this issue Mar 5, 2017 · 425 comments
Labels: LanguageChange, NeedsInvestigation, Proposal, v2

@DemiMarie

This is a proposal for sum types, also known as discriminated unions. Sum types in Go should essentially act like interfaces, except that:

  • they are value types, like structs
  • the types contained in them are fixed at compile-time

Sum types can be matched with a switch statement. The compiler checks that all variants are matched. Inside the arms of the switch statement, the value can be used as if it is of the variant that was matched.

@ianlancetaylor
Contributor

ianlancetaylor commented Mar 6, 2017

This has been discussed several times in the past, starting from before the open source release. The past consensus has been that sum types do not add very much to interface types. Once you sort it all out, what you get in the end is an interface type where the compiler checks that you've filled in all the cases of a type switch. That's a fairly small benefit for a new language change.

If you want to push this proposal along further, you will need to write a more complete proposal doc, including: What is the syntax? Precisely how do they work? (You say they are "value types", but interface types are also value types). What are the trade-offs?
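The "interface type plus a checked type switch" description can be approximated in current Go with a sealed interface. This is a minimal sketch; `Shape`, `Circle`, `Square`, and `Describe` are hypothetical names chosen for illustration. The compiler does not check that the switch covers every variant, which is the gap the proposal is about.

```go
package main

import "fmt"

// Shape is a hypothetical "closed" interface: the unexported method
// means only types in this package can implement it.
type Shape interface{ isShape() }

type Circle struct{ R float64 }
type Square struct{ Side float64 }

func (Circle) isShape() {}
func (Square) isShape() {}

// Describe switches over the known variants. The compiler does not
// verify that every variant is handled, so a default arm is needed.
func Describe(s Shape) string {
	switch v := s.(type) {
	case Circle:
		return fmt.Sprintf("circle r=%g", v.R)
	case Square:
		return fmt.Sprintf("square side=%g", v.Side)
	default:
		// Reached for a nil Shape or for a variant added later.
		return "unknown"
	}
}

func main() {
	fmt.Println(Describe(Circle{R: 1}))
	fmt.Println(Describe(Square{Side: 2}))
	fmt.Println(Describe(nil))
}
```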

bradfitz added this to the Proposal milestone Mar 6, 2017
rsc changed the title from "Proposal: Discriminated unions" to "proposal: spec: add sum types / discriminated unions" Mar 6, 2017
@rsc
Contributor

rsc commented Mar 6, 2017

See https://www.reddit.com/r/golang/comments/46bd5h/ama_we_are_the_go_contributors_ask_us_anything/d03t6ji/?st=ixp2gf04&sh=7d6920db for some past discussion to be aware of.

@griesemer
Contributor

I think this is too significant a change of the type system for Go1 and there's no pressing need.
I suggest we revisit this in the larger context of Go 2.

rsc added the v2 label Mar 13, 2017
@rogpeppe
Contributor

rogpeppe commented Mar 22, 2017

Thanks for creating this proposal. I've been toying with this idea for a year or so now.
The following is as far as I've got with a concrete proposal. I think
"choice type" might actually be a better name than "sum type", but YMMV.

Sum types in Go

A sum type is represented by two or more types combined with the "|"
operator.

type: type1 | type2 ...

Values of the resulting type can only hold one of the specified types. The
type is treated as an interface type - its dynamic type is that of the
value that's assigned to it.

As a special case, "nil" can be used to indicate whether the value can
become nil.

For example:

type maybeInt nil | int

The method set of the sum type holds the intersection of the method set
of all its component types, excluding any methods that have the same
name but different signatures.

Like any other interface type, a sum type may be the subject of a dynamic
type conversion. In type switches, the first arm of the switch that
matches the stored type will be chosen.

The zero value of a sum type is the zero value of the first type in
the sum.

When assigning a value to a sum type, if the value can fit into more
than one of the possible types, then the first is chosen.

For example:

var x int|float64 = 13

would result in a value with dynamic type int, but

var x int|float64 = 3.13

would result in a value with dynamic type float64.
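A loose analogy exists in current Go: assigning an untyped constant to an empty interface picks the constant's default type. The sketch below only shows that existing behaviour; it is not the proposed "first type that fits" rule, and `dynType` is a hypothetical helper.

```go
package main

import "fmt"

// dynType reports the dynamic type stored in an interface value.
func dynType(v any) string { return fmt.Sprintf("%T", v) }

func main() {
	var x any = 13   // untyped integer constant defaults to int
	var y any = 3.13 // untyped floating-point constant defaults to float64
	fmt.Println(dynType(x), dynType(y))
}
```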

Implementation

A naive implementation could implement sum types exactly as interface
values. A more sophisticated approach could use a representation
appropriate to the set of possible values.

For example a sum type consisting only of concrete types without pointers
could be implemented with a non-pointer type, using an extra value to
remember the actual type.

For sum-of-struct-types, it might even be possible to use spare padding
bytes common to the structs for that purpose.

@bcmills
Contributor

bcmills commented Mar 22, 2017

@rogpeppe How would that interact with type assertions and type switches? Presumably it would be a compile-time error to have a case on a type (or assertion to a type) that is not a member of the sum. Would it also be an error to have a nonexhaustive switch on such a type?

@josharian
Contributor

For type switches, if you have

type T int | interface{}

and you do:

switch t := t.(type) {
  case int:
    // ...

and t contains an interface{} containing an int, does it match the first case? What if the first case is case interface{}?

Or can sum types contain only concrete types?

What about type T interface{} | nil? If you write

var t T = nil

what is t's type? Or is that construction forbidden? A similar question arises for type T []int | nil, so it's not just about interfaces.

@rogpeppe
Contributor

Yes, I think it would be reasonable to make it a compile-time error
to have a case that can't be matched. I'm not sure whether it's
a good idea to disallow non-exhaustive switches on such a type - we
don't require exhaustiveness anywhere else. One thing that might
be good though: if the switch is exhaustive, we could drop the requirement
for a default to make it a terminating statement.

That means that you can get the compiler to error if you have:

func addOne(x int|float64) int|float64 {
    switch x := x.(type) {
    case int:
        return x + 1
    case float64:
        return x + 1
    }
}

and you change the sum type to add an extra case.
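For comparison, here is a sketch of the same function in current Go, with `any` standing in for the hypothetical `int|float64`. Because a type switch without a default is not a terminating statement today, something must follow the switch; a panic is used here as one possible choice.

```go
package main

import "fmt"

// addOne emulates the proposed int|float64 parameter with any.
func addOne(x any) any {
	switch x := x.(type) {
	case int:
		return x + 1
	case float64:
		return x + 1
	}
	// Required today: without this the compiler reports a missing return.
	panic(fmt.Sprintf("addOne: unexpected type %T", x))
}

func main() {
	fmt.Println(addOne(1), addOne(1.5))
}
```

With the suggested rule, the trailing panic would be unnecessary, and adding a third type to the sum would turn the function into a compile error instead of a runtime panic.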

@rogpeppe
Contributor

rogpeppe commented Mar 22, 2017

For type switches, if you have

type T int | interface{}

and you do:

switch t := t.(type) {
  case int:
    // ...

and t contains an interface{} containing an int, does it match the first case? What if the first case is case interface{}?

t can't contain an interface{} containing an int. t is an interface
type just like any other interface type, except that it can only
contain the enumerated set of types that it consists of.
Just like an interface{} can't contain an interface{} containing an int.

Sum types can match interface types, but they still just get a concrete
type for the dynamic value. For example, it would be fine to have:

type R io.Reader | io.ReadCloser

What about type T interface{} | nil? If you write

var t T = nil

what is t's type? Or is that construction forbidden? A similar question arises for type T []int | nil, so it's not just about interfaces.

According to the proposal above, you get the first item
in the sum that the value can be assigned to, so
you'd get the nil interface.

In fact interface{} | nil is technically redundant, because any interface{}
can be nil.

For []int | nil, a nil []int is not the same as a nil interface, so the
concrete value of ([]int|nil)(nil) would be []int(nil) not untyped nil.

@bcmills
Contributor

bcmills commented Mar 22, 2017

The []int | nil case is interesting. I would expect the nil in the type declaration to always mean "the nil interface value", in which case

type T []int | nil
var x T = nil

would imply that x is the nil interface, not the nil []int.

That value would be distinct from the nil []int encoded in the same type:

var y T = []int(nil)  // y != x
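The same distinction already exists for ordinary interfaces in current Go. A minimal sketch, with `any` standing in for the hypothetical sum type and `isNilInterface` as an illustrative helper:

```go
package main

import "fmt"

// isNilInterface reports whether v is the nil interface value,
// that is, an interface with no dynamic type at all.
func isNilInterface(v any) bool { return v == nil }

func main() {
	var x any = nil        // nil interface: no dynamic type
	var y any = []int(nil) // dynamic type []int holding a nil slice
	fmt.Println(isNilInterface(x), isNilInterface(y))
}
```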

@jimmyfrasche
Member

Wouldn't nil always be required even if the sum is all value types? Otherwise what would var x int64 | float64 be? My first thought, extrapolating from the other rules, would be the zero value of the first type, but then what about var x interface{} | int? It would, as @bcmills points out, have to be a distinct sum nil.

It seems overly subtle.

Exhaustive type switches would be nice. You could always add an empty default: when it's not the desired behavior.

@rogpeppe
Contributor

The proposal says "When assigning a value to a sum type, if the value can fit into more
than one of the possible types, then the first is chosen."

So, with:

type T []int | nil
var x T = nil

x would have concrete type []int because nil is assignable to []int and []int is the first element of the type. It would be equal to any other []int(nil) value.

Wouldn't nil always be required even if the sum is all value types? Otherwise what would var x int64 | float64 be?

The proposal says "The zero value of a sum type is the zero value of the first type in
the sum.", so the answer is int64(0).

My first thought, extrapolating from the other rules, would be the zero value of the first type, but then what about var x interface{} | int? It would, as @bcmills points out, have to be a distinct sum nil

No, it would just be the usual interface nil value in that case. That type (interface{} | nil) is redundant. Perhaps it might be a good idea to make it a compiler error to specify sum types where one element is a superset of another, as I can't currently see any point in defining such a type.

@ianlancetaylor
Contributor

The zero value of a sum type is the zero value of the first type in the sum.

That is an interesting suggestion, but since the sum type must record somewhere the type of the value that it currently holds, I believe it means that the zero value of the sum type is not all-bytes-zero, which would make it different from every other type in Go. Or perhaps we could add an exception saying that if the type information is not present, then the value is the zero value of the first type listed, but then I'm not sure how to represent nil if it is not the first type listed.

@jimmyfrasche
Member

So (stuff) | nil only makes sense when nothing in (stuff) can be nil and nil | (stuff) means something different depending on whether anything in stuff can be nil? What value does nil add?

@ianlancetaylor I believe many functional languages implement (closed) sum types essentially like how you would in C

struct {
    int which;
    union {
        A a;
        B b;
        C c;
    } summands;
}

If which indexes into the union's fields in order (0 = a, 1 = b, 2 = c), the zero value definition works out to all bytes being zero. And you'd need to store the types elsewhere, unlike with interfaces. You'd also need special handling for the nil tag of some kind wherever you store the type info.

That would make unions value types instead of special interfaces, which is also interesting.
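A rough Go rendering of the tag-plus-payload layout, written by hand for the hypothetical sum `int | string`. All names are illustrative, and unlike a C union the two payload fields do not share storage here; the point is only that a zero tag makes the all-zero value mean "zero value of the first variant".

```go
package main

import "fmt"

// IntOrString is a hand-written stand-in for the hypothetical sum type int | string.
type IntOrString struct {
	which uint8 // 0 = int (first variant), 1 = string
	i     int
	s     string
}

func FromInt(i int) IntOrString       { return IntOrString{which: 0, i: i} }
func FromString(s string) IntOrString { return IntOrString{which: 1, s: s} }

// String formats the value according to the variant recorded in the tag.
func (v IntOrString) String() string {
	switch v.which {
	case 0:
		return fmt.Sprintf("int(%d)", v.i)
	default:
		return fmt.Sprintf("string(%q)", v.s)
	}
}

func main() {
	var zero IntOrString // all fields zero: tag 0 selects int, payload is 0
	fmt.Println(zero, FromInt(7), FromString("hi"))
}
```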

@shanemhansen
Contributor

shanemhansen commented Mar 22, 2017

Is there a way to make the all zero value work if the field which records the type has a zero value representing the first type? I'm assuming that one possible way for this to be represented would be:

type A = B|C
struct A {
  choice byte // value 0 or 1
  value ?// (thing big enough to store B | C)
}

[edit]

Sorry @jimmyfrasche beat me to the punch.

@jimmyfrasche
Member

Is there anything added by nil that couldn't be done with

type S int | string | struct{}
var None struct{}

?

That seems like it avoids a lot of the confusion (that I have, at least)

@jimmyfrasche
Member

Or better

type (
     None struct{}
     S int | string | None
)

that way you could type switch on None and assign with None{}

@bcmills
Contributor

bcmills commented Mar 22, 2017

@jimmyfrasche struct{} is not equal to nil. It's a minor detail, but it would make type-switches on sums needlessly(?) diverge from type-switches on other types.

@jimmyfrasche
Member

@bcmills It wasn't my intent to claim otherwise—I meant that it could be used for the same purpose as differentiating a lack of value without overlapping with the meaning of nil in any of the types in the sum.

@jimmyfrasche
Member

@rogpeppe what does this print?

// r is an io.Reader interface value holding a type that also implements io.Closer
var v io.ReadCloser | io.Reader = r
switch v.(type) {
case io.ReadCloser: fmt.Println("ReadCloser")
case io.Reader: fmt.Println("Reader")
}

I would assume "Reader"

@bcmills
Contributor

bcmills commented Mar 22, 2017

@jimmyfrasche I would assume ReadCloser, same as you'd get from a type-switch on any other interface.

(And I would also expect sums which include only interface types to use no more space than a regular interface, although I suppose that an explicit tag could save a bit of lookup overhead in the type-switch.)

@jimmyfrasche
Member

@bcmills it's the assignment that's interesting, consider: https://play.golang.org/p/PzmWCYex6R

@rogpeppe
Contributor

@ianlancetaylor That's an excellent point to raise, thanks. I don't think it's hard to get around though, although it does imply that my "naive implementation" suggestion is itself too naive. A sum type, although treated as an interface type, does not have to actually contain a direct pointer to the type and its method set - instead it could, when appropriate, contain an integer tag that implies the type. That tag could be non-zero even when the type itself is nil.

Given:

 var x int | nil = nil

the runtime value of x need not be all zeros. When switching on the type of x or converting
it to another interface type, the tag could be indirected through a small table containing
the actual type pointers.

Another possibility would be to allow a nil type only if it's the first element, but
that precludes constructions like:

var t nil | int
var u float64 | t

@rogpeppe
Contributor

@jimmyfrasche I would assume ReadCloser, same as you'd get from a type-switch on any other interface.

Yes.

@bcmills it's the assignment that's interesting, consider: https://play.golang.org/p/PzmWCYex6R

I don't get this. Why would "this [...] have to be valid for the type switch to print ReadCloser"?
Like any interface type, a sum type would store no more than the concrete value of what's in it.

When there are several interface types in a sum, the runtime representation is just an interface value - it's just that we know that the underlying value must implement one or more of the declared possibilities.

That is, when you assign something to a type (I1 | I2) where both I1 and I2 are interface types, it's not possible to tell later whether the value you put in was known to implement I1 or I2 at the time.

@jimmyfrasche
Member

If you have a type that's io.ReadCloser | io.Reader you can't be sure when you type switch or assert on io.Reader that it's not an io.ReadCloser unless assignment to a sum type unboxes and reboxes the interface.

@jimmyfrasche
Member

Going the other way, if you had io.Reader | io.ReadCloser it would either never accept an io.ReadCloser because it goes strictly left-to-right or the implementation would have to search for the "best matching" interface from all interfaces in the sum but that cannot be well defined.

@meln5674

meln5674 commented Aug 13, 2023

@DeedleFake That's fair, and if that's a hard rule, I can certainly try to find some way without any new keywords. That said, do you have any objections to the general concept of something that looks like a struct, but quacks like a union?

@DeedleFake

It seems very similar to how unions have been proposed in the past. Right now one of the most fundamental questions seems to be what benefit a completely separate union type would give over simply extending the already existing type list interfaces. For me personally, my biggest problem with the interface approach is that they can always be nil, and unfortunately I don't think that's easily solvable. I do kind of like the parity of interface{} having infinitely many possible values, struct{} having one, and union{} having zero, but that doesn't justify an entire new category of types only mildly different from existing systems.

@Merovius
Contributor

Merovius commented Aug 13, 2023

@meln5674 I have read the entire thread, but I can't link to any specific instance - but yes, this is definitely within the space of things that have been suggested. The main impasse when it comes to union/sum types is the zero value:

  1. You can have a special sentinel zero value, as is the case with #57644 (proposal: spec: sum types based on general interfaces). It's a bit unfortunate though, as it means every switch has to have a default or nil case, which reduces some of the utility of union/sum types.
  2. You can have it be automatically derived, usually this takes the form your design takes, where the zero value of the first named alternative is the zero value of the union/sum value. This is unfortunate, because it means the semantics of a union/sum type depend on the order the cases are defined.
  3. You can explicitly mark the zero value case and use the zero value of that as the zero value of the union/sum type. This is similar to the previous solution. It makes the semantics order-independent, but requires us to define extra syntax.

All of this becomes more contentious once you restrict yourself to re-using the interface{ a | b } syntax, which seems a pretty likely restriction at this point - it just seems too confusing to have two separate syntax constructs to express union-like things, which have pretty different usage. This basically means the latter two cases are unlikely to happen - currently, interface{ a | b } means the same as interface{ b | a } and that would be bad to change, and expressing a "zero value marker" syntax would then beg the question what that means in the context of a constraint.

So, personally, I don't think it's very likely that we would do anything but #57644 - but who knows, I've been surprised before.

@meln5674

meln5674 commented Aug 13, 2023

I can see how the order mattering would be a problem, as well as not wanting superfluous syntax for the same thing. However, one issue I have (and discuss) is that, unless I've missed something, interface{ a | a } is not valid, yet it is a very valid thing to want to express (consider Haskell's data Foo = Int | Int). To me, they represent two fundamentally different things (as I discuss in the linked document comparing how interfaces and sum types are actually inverses, architecturally). This also covers the issue that you raise with the nil case: these unions cannot be nil, because they are a fundamentally different thing. That said, if the keyword/syntax and zero value are an issue, what about this?

type foo struct interface { A int | default B int }

This would not introduce any new keywords (though I'm not sure what it would do to the parse tree/lexer), resolve the issue I mentioned above, not require nil, and provide an order-agnostic zero value.

Thoughts?

@DemiMarie
Author

@Merovius: what about not having a zero value at all, and instead reporting a compilation error if a discriminated union is used in a context where a zero value is required?

@Merovius
Contributor

Merovius commented Aug 13, 2023

@meln5674 data Foo = Int | Int is not actually valid Haskell. You mean data Foo = Left Int | Right Int (for example). That has also been discussed above, as the difference between "sum types" (what Haskell has) and "union types" (what C has and #57644 proposes). Here is the start of that conversation. Under #57644, you'd have to write type Left int; type Right int; type Foo interface{ Left | Right }.

This would be an advantage a dedicated syntax could have (that it would be able to express sum types, instead of union types), but that advantage doesn't invalidate anything else I said.
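The `Left`/`Right` encoding already compiles in current Go, but only as a type constraint rather than as the type of a variable. A minimal sketch, where `side` is a hypothetical helper:

```go
package main

import "fmt"

type Left int
type Right int

// Foo can only be used as a type constraint in current Go,
// not as the type of a variable, field, or parameter.
type Foo interface{ Left | Right }

// side reports which alternative T is. The conversion to any is needed
// because a type switch cannot be applied to a type parameter value directly.
func side[T Foo](v T) string {
	switch any(v).(type) {
	case Left:
		return "left"
	case Right:
		return "right"
	}
	return "unreachable"
}

func main() {
	fmt.Println(side(Left(1)), side(Right(1)))
}
```

Both alternatives have underlying type int, yet the distinct defined types keep them apart, which is how the union-style encoding recovers the tagging that Haskell's constructors provide.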

@DemiMarie That has been discussed above. The zero value is a fundamental concept of Go and it is not practical to have types without a zero value. For example, you'd have to explain what s := make([]T, 1); s[0] evaluates to. Or ch := make(chan T); close(ch); v, _ := <-ch; v. Or m := make(map[struct{}]T); m[struct{}{}]. Or………

It's not going to happen.
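The three cases mentioned can be run as-is in current Go. In this sketch `T` is a hypothetical struct type and `zeros` a helper that collects the values the language produces without any explicit initialization.

```go
package main

import "fmt"

type T struct{ N int }

// zeros returns three values of type T that were never explicitly set.
func zeros() [3]T {
	s := make([]T, 1) // slice elements start out as zero values

	ch := make(chan T)
	close(ch)
	v, _ := <-ch // receiving from a closed channel yields the zero value

	m := make(map[struct{}]T) // a missing key yields the zero value

	return [3]T{s[0], v, m[struct{}{}]}
}

func main() {
	fmt.Println(zeros())
}
```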

@meln5674

Yes, that's what I meant, thanks for pointing that out, it's been a while since I've actually written Haskell.

I suppose that would be closer (albeit with lots of extra boilerplate), however, (again, unless I've missed something), that doesn't allow the use of struct tags. For a concrete example, I'm a big fan of this library, which allows embedding EBNF into struct tags to generate lexers and parsers. A union type construct that allows struct tags seems like a huge value add for it, and I could see something similar for other packages like even encoding/json. Combined with the removal of nil, I feel that an additional construct is worth the extra complexity.

In any case, thank you for at least humoring me, I'll watch this space.

@Merovius
Contributor

Merovius commented Aug 13, 2023

For what it's worth, my own personal favorite alternative to #57644 is @zephyrtronium's #54685. It is a pretty well-written proposal that has managed like few others to fit sum types into Go, in my opinion. It still faces the same basic issues, but if we would want to add some sort of union/sum type and we would not re-use the union-element syntax from general interfaces, it would definitely be what I'd put my weight behind. So it might be useful to contrast with #57644, to see opposite ends of the decision spectrum here.

@Splizard

@meln5674
Go already supports union types through generics and struct-reflection. Including support for marshaling and struct tags (see some of my earlier replies). What practical benefits does a language change have?

@meln5674

meln5674 commented Aug 14, 2023 via email

@Splizard

Splizard commented Aug 15, 2023

@meln5674
Have a look at this https://go.dev/play/p/Pp06ahQrt5-
(https://pkg.go.dev/github.com/splizard/tagged)

type Float tagged.Union[[8]byte, struct {
	Bits32 tagged.As[Float, float32]
	Bits64 tagged.As[Float, float64]
}]
var FloatWith = tagged.Fields(Float{})

var pi Float
pi = FloatWith.Bits32.New(math.Pi)
pi = FloatWith.Bits64.New(math.Pi)

switch tagged.FieldOf(pi) {
	case FloatWith.Bits32.Field:
		var f32 float32 = FloatWith.Bits32.Get(pi)
		fmt.Println(f32)
	case FloatWith.Bits64.Field:
		var f64 float64 = FloatWith.Bits64.Get(pi)
		fmt.Println(f64)
}

This has good performance, it is type-safe and IMO has a good developer experience. Best of all, unlike any of the proposals here, it is supported today. You can use it. You can customise it. It's only a couple hundred lines. I haven't included reflection support or marshalling in this implementation but it is easy to add (I've done it before). If the unsafe code in this particular implementation makes you nervous, you can implement unions with an underlying interface/any. Depends on your performance use-case.

@rami3l

rami3l commented Aug 15, 2023

@Splizard Thanks for the library!

However, I do want to point out that there is a distinction between having something available as a library API and having it natively supported by the language (or its stdlib). In the former case, a tagged union will be much less likely to appear in public API surfaces.

@meln5674

While that's an interesting approach, and perhaps this isn't the appropriate place for a code review, I can see a number of things that would steer me away from using this in production code.

First, this isn't type-safe. I suspect that you and I are using that word to mean different things. For example, if you were to mix up the cases on a switch, that would not be a compiler error, but a runtime panic. Compare this to a switch t := x.(type). Similarly, if one incorrectly guesses the largest size of any field and supplies the wrong buffer type parameter, then that is also a runtime panic. If it is your position that it is the programmer's responsibility, not the language's, to catch such errors, I would suggest you investigate a new and upcoming language by the name of FORTRAN77, which aligns with such philosophy. Reading the source code, unless I'm misunderstanding, it appears some methods panic if a field is a struct with a pointer, which would be a complete non-starter for a huge amount of Go I've seen.

Please don't interpret this post as "this is everything wrong with your code, your library is bad, and you should feel bad", that is absolutely not what I'm trying to say, this is quite a clever piece of code, and I imagine it was a useful exercise to write it, but rather, these are the things that are an inevitable consequence of implementing tagged unions using workarounds, because the language lacks appropriate facilities to implement them correctly. The entire purpose of this, my, and other similar proposals is to add those facilities so that they don't need to be workarounds, plus all the benefits of a unified ecosystem that rami3l touched on.

If you genuinely believe that this is the correct approach, then you should propose adding something based on it to the standard library, and if you do, I wish you the best of luck.

@Merovius
Contributor

Merovius commented Aug 15, 2023

@meln5674 @Splizard I don't think this discussion is particularly helpful. In particular, I'll note that there is a certain amount of snark entering here. Let's just note that 1. there are certain ways to work around the lack of sum/union types in Go, but 2. this issue is about adding first-party, language-level support for them - which is a feature request that has a place (even though I'm personally not that on board with its usefulness either).

@gophun

gophun commented Aug 16, 2023

Your proposal introduces a new keyword, union, and is therefore not backwards compatible.

Adding a new keyword does not seem to be completely ruled out, as noted by @rsc. So if we made unions separate with the simple "zero value based on first type" rule:

type foo union { a | b }

Then these

interface { a | b }
interface { a | b ; c | d }

could be short form for:

interface { union{ a | b } }
interface { union{ a | b } ; union{ c | d } }

Within interfaces the order of types in the unions wouldn't matter.

There would be only one source of type options (union), and it would integrate nicely into other places, such as interfaces as demonstrated above.

We could still generalize interfaces as proposed in #57644 later, if we feel that we want to have nilable unions, or we could leave them the way they are today.

@Merovius
Contributor

Merovius commented Aug 16, 2023

@gophun The main argument against adding a separate union concept like that (regardless of whether or not that requires a new keyword - see #54685 for a design that doesn't require a new keyword) is the conceptual overlap with interfaces. That is, the main reason not to do it, is that we'd then have both unions in general interfaces and unions-as-types, which serve very similar functions, but are different concepts, which is not very orthogonal.

(Also, obligatory note that all of this has been discussed above, at length)

@gophun

gophun commented Aug 16, 2023

@gophun The main argument against adding a separate union concept like that [...] is the conceptual overlap with interfaces

I had hoped that I demonstrated that there is no conceptual overlap with interfaces, but a conceptual composition.

They are orthogonal concepts, but they can be composed, given these short-hand rules:

interface { union { a | b } } <=> interface { a | b } <=> a | b (in type parameter lists)

@Merovius
Contributor

Merovius commented Aug 16, 2023

The conceptual overlap is that both represent a list of concrete types. Preventing them from being used interchangeably doesn't prevent that overlap in concept - it's what makes it confusing.

@gophun

gophun commented Aug 16, 2023

int and interface { int } also both represent a list of concrete types (of size one), but I don't see them as conceptual overlap. In my mind unions are a building block, and everything can come together in interface as a big composing brace.

@gophun

gophun commented Aug 16, 2023

Ideally, I would prefer just a | b instead of union{ a | b }, as it would drive home the idea that the a | b in interface{ a | b } is just another component of the interface and not a direct feature of interface (as it is currently), but that's probably a parsing ambiguity (although I'm not sure). It would also reduce nicely to a in the single type case (e.g. int).

@gophun

gophun commented Aug 16, 2023

but that's probably a parsing ambiguity (although I'm not sure)

Can anybody help me find a potential parsing ambiguity? These seem to be possible to distinguish from the bitwise | operator:

type foo a | b		// union type definition

type foo a		// single type case

var x a | b		// variable declaration

interface { a | b }	// component of the interface (not a direct feature of 'interface', unlike today)

struct {
	F a | b		// used as type for struct field
}

foo.(a|b)		// type assertion

func f(x a|b)		// function parameter

func f[T a|b]()		// type parameter

@Merovius
Contributor

Merovius commented Aug 16, 2023

@gophun I agree that int and interface{ int } do not have a lot of conceptual overlap. I disagree that both represent a list of types. int is a type - a set of values. interface{ int } is a set of types. A bag containing an apple is not the same thing as an apple.

FTR you might well disagree with the judgement or not find the argument of conceptual overlap persuasive. But that is the reason we have not accepted any proposal for a separate union/sum-type construct. And if we accepted using a separate concept, I'd be strongly in favor of making it actual sum types, not union types - if we have to pay the cost of overlap and a new concept anyways, we might as well make it the better one.

As for the ambiguity, I don't believe there is an ambiguity per se, but that syntax might require more lookahead than Go has traditionally been comfortable with.

@gophun

gophun commented Aug 16, 2023

@Merovius
In the place where interfaces with type lists are accepted today, which is type parameter constraints, [T interface{int}] and [T int] are identical. There is no difference between bag and apple here, both represent a type set.
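The equivalence of the two constraint spellings can be checked in current Go. A small sketch, with `doubleA` and `doubleB` as hypothetical function names:

```go
package main

import "fmt"

// doubleA spells the constraint out as an interface with a type element.
func doubleA[T interface{ int }](v T) T { return v * 2 }

// doubleB uses the shorthand that omits the enclosing interface{ ... }.
func doubleB[T int](v T) T { return v * 2 }

func main() {
	fmt.Println(doubleA(21), doubleB(21))
}
```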

@Merovius
Contributor

Merovius commented Aug 16, 2023

Fair enough. I still don't think that invalidates the argument that having two separate ways to express a union of types is a downside of that idea. Note that union-elements were not introduced to be able to write interface{ int }, but to write interface{ a | b }. That there is some overlap in a corner-case does not mean interface{ a | b } does not carry its own weight. It also doesn't remove the overlap between interface{ a | b } and union { a | b }.

Again, it's fine to not find these arguments persuasive. But they are the reason we haven't done this yet and it might be beneficial to try to understand them, instead of trying to argue them away.

@gophun

gophun commented Aug 16, 2023

It also doesn't remove the overlap between interface{ a | b } and union { a | b }.

As you know, since you have followed my train of thought, I actually prefer the variant without the union keyword. So it would be the overlap between interface{a | b} and a | b, which already exists today as the overlap between [T interface{a | b}] and [T a | b].

@Merovius
Contributor

@gophun Okay. I can see that you are not interested in understanding the context of this discussion.

@Merovius
Contributor

To recap, from my perspective:

  1. Over on proposal: spec: sum types based on general interfaces #57644 you suggested that the zero value of interface{ a | b } should be the zero value of the first type mentioned. I explained that this is an issue, as currently interface{ a | b } and interface{ b | a } are equivalent and because interface{ a | b ; b | a } is valid and has no clear "first type".
  2. To that, your response was to suggest a new thing, which you called union, in which the order does matter. Note that the actual syntax and name do not matter here - the point is, you suggested inventing a new thing, in which order matters, which would thus have to be different from general interfaces, to solve this problem.
  3. When I pointed out that this new thing would have significant overlap with general interfaces and union elements, your response was to say "no it doesn't". And then to pivot towards "I actually mean a | b", which is just a building block in interface{ a | b } - which brings us back to the ordering problem.

The point is, either a) you are suggesting a new thing, which has significant conceptual overlap with union elements in interfaces, or b) you are suggesting to assign a type-meaning to union elements in interfaces, in which case you have the zero-value and/or ordering problem.

This isn't a syntax problem. It's a problem with the concepts.
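Point 1 can be checked against the language as it exists today. The following is a minimal sketch, assuming Go 1.18 or later; the names `AB`, `BA`, `Both`, and the helper functions are hypothetical, with `int` and `string` standing in for `a` and `b`:

```go
package main

import "fmt"

// AB and BA list the same union elements in opposite order.
type AB interface{ int | string }
type BA interface{ string | int }

// Both contains two union elements, one per ordering, so it has no
// single "first type". Its type set is the intersection of the two.
type Both interface {
	int | string
	string | int
}

func typeName[T any](v T) string { return fmt.Sprintf("%T", v) }

func viaAB[T AB](v T) string { return typeName(v) }
func viaBA[T BA](v T) string { return typeName(v) }

// Forwarding compiles in both directions, so each constraint's type
// set is a subset of the other's.
func abToBA[T AB](v T) string { return viaBA(v) }
func baToAB[T BA](v T) string { return viaAB(v) }

// A T constrained by Both also satisfies AB.
func bothToAB[T Both](v T) string { return viaAB(v) }

func main() {
	fmt.Println(abToBA(1), baToAB("x"), bothToAB(2))
}
```

Since the compiler treats all three constraints as the same type set, a rule such as "the zero value is the zero value of the first listed type" has nothing stable to refer to.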

@gophun

gophun commented Aug 16, 2023

It's true that my thoughts are evolving, that's what a discussion is good for. My goal is to find a solution that has the desired non-nilable property of a union type, integrates well with the existing language, and that does not have more conceptual overlaps than what already exists.

And then to pivot towards "I actually mean a | b", which is just a building block in interface{ a | b } - which brings us back to the ordering problem.

No, it does not bring us back to that.

a | b would be a standalone type. When used in a variable context like var x a | b, the ordering matters (for the zero value). When this type is embedded in an interface declaration like interface { a | b }, the ordering does not matter, for one of two reasons. Either the interface is used in a type parameter context to describe a type set for a constraint, where the ordering does not matter. Or, if #57644 were adopted at some later point (which I do not necessarily propose, but which would be possible), the zero value of interface { a | b } would be nil, just like the zero value of interface { int } would be nil.
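The two zero values being contrasted can be sketched in today's Go. Since `var x a | b` is not valid syntax, the struct `IntOrString` below is a hypothetical hand-written stand-in for the proposed standalone type `int | string`; it only emulates the "zero value of the first type" behaviour and is not how the proposal would be implemented:

```go
package main

import "fmt"

// IntOrString is a hypothetical emulation of a standalone `int | string`
// type. The zero value of the tag selects the first variant (int).
type IntOrString struct {
	isString bool // false: the int variant is active
	i        int
	s        string
}

// Value returns the active variant as a dynamically typed value.
func (u IntOrString) Value() any {
	if u.isString {
		return u.s
	}
	return u.i
}

func main() {
	var u IntOrString // zero value holds int(0), never nil
	var i any         // zero value of an interface type is nil

	fmt.Printf("%T %v\n", u.Value(), u.Value()) // int 0
	fmt.Println(i == nil)                       // true
}
```

Swapping the order of the variants in such a type would change which variant the zero value holds, which is why the ordering matters for the standalone type but not for a constraint.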

@Merovius
Contributor

Merovius commented Aug 16, 2023

Fair enough, we are talking about inventing a new thing with conceptual overlap.

TBQH it is quite confusing that you are choosing to focus on a syntax that gives the new thing also syntactical overlap with general interfaces. For example, if a|b is a type, I would expect interface{ a | b } to be "the constraint of a type to being a|b", just like interface{ int } is "the constraint of a type to be int". Not something else - and yes, it means something else, because a|b and b|a are different types, but interface{ a|b } and interface{ b|a } are the same constraint. But really, that's a problem with the syntax - syntax can be solved after we have an idea of the semantics.

I really don't believe you can escape the dichotomy I outlined above. Either union types are the same as general interfaces - in which case you can't rely on order to answer the zero value problem - or they are not - in which case there are now two very similar, but distinct concepts in the language. This isn't really a thing to "solve". Instead, if we are to add union (or sum) types to Go, we'll just have to accept the downsides of one of those branches.

@gophun

gophun commented Aug 16, 2023

if a|b is a type, I would expect interface{ a | b } to be "the constraint of a type to being a|b", just like interface{ int } is "the constraint of a type to be int". Not something else

This is true; I hadn't seen this. So consider the idea scrapped.
