proposal: spec: permit write defined type in type term #52318

Open
changkun opened this issue Apr 13, 2022 · 12 comments

@changkun
Member

Author background

Would you consider yourself a novice, intermediate, or experienced Go programmer?

It depends on whom I am compared with and which metrics define these levels. A few objective measures: using Go since 2018; 1 year of cloud-native industrial backend development; 3 years of research prototyping in 3D graphics (rendering and geometry processing); spare-time projects in Go, which can be found in my GitHub profile; contributing to the Go project; etc. Probably "experienced"?

What other languages do you have experience with?

C++ (wrote a book), Python and R (for research and relevant data analysis), JS and TypeScript (for teaching in university), etc.

Related proposals

Has this idea, or one like it, been proposed before?

I don't know, and I hope not. There were some discussions in #52295.

Does this affect error handling?

Probably not.

Is this about generics?

Yes.

Proposal

What is the proposed change?

I propose permitting defined types in type terms that use the tilde operator, so that ~T is allowed in a type constraint even when T is a defined type.

For instance, the following code does not work in Go 1.18 because the tilde operator can only be applied to a type other than a defined type:

package main

type Type struct{}

type MyType Type

func F1[T ~Type](t T)     {} // ERROR: invalid use of ~ (underlying type of Type is struct{})
func F2[T ~struct{}](t T) {} // OK
func F3[T ~MyType](t T)   {} // ERROR: invalid use of ~ (underlying type of MyType is struct{})

func main() {
	x := Type{}
	y := MyType{}
	F1(x) // ERROR: cannot implement ~Type (empty type set)
	F1(y) // ERROR: cannot implement ~Type (empty type set)

	F2(x) // OK
	F2(y) // OK
}

Who does this proposal help, and why?

See some more elaborated examples in #52295.

Please describe as precisely as possible the change to the language.

I spent some time reading the relevant part of the spec and then prepared a change to the current 1.18 spec for discussion; see: https://go.dev/cl/400095

What would change in the language spec?

I spent some time reading the relevant part of the spec and then prepared a change to the current 1.18 spec for discussion; see: https://go.dev/cl/400095

Please also describe the change informally, as in a class teaching Go.

Allow ~ to be followed by a defined type.

Is this change backward compatible?

I think and hope so. But I am not entirely an expert on the spec.

From what I can see so far, this change only unlocks code that could not be written before.
Hence I would expect it to be backward compatible.

Possibly some change of description regarding what is T's underlying type (I hope the wording is correct), but I guess this only stays on a conceptual level and does not affect the practice (?) Again, I am not a language expert.

Orthogonality: how does this change interact or overlap with existing features?

I don't think there is an overlap, but I don't know enough about the spec.

Is the goal of this change a performance improvement?

Probably not, but I am not certain on how the change could associate with performance.

Costs

Would this change make Go easier or harder to learn, and why?

I think it at least well addresses some of my confusion regarding the use of the language feature of type constraints. See discussions in #52295.
Thanks @cherrymui for the discussion and suggestions.

What is the cost of this proposal? (Every language change has a cost).

Sorry, I don't know enough, probably complicating the concept of "underlying type" for people reading into the spec?

How many tools (such as vet, gopls, gofmt, goimports, etc.) would be affected?

Sorry, I don't know enough about this aspect.

What is the compile time cost?

Sorry, I don't know enough about this aspect.

What is the run time cost?

Sorry, I don't know enough about this aspect.

Can you describe a possible implementation?

Sorry, I am not familiar enough with the current implementation details and am also not yet a compiler expert. I might imagine that inferring the underlying type of a type that is redefined multiple times might be slow or slightly inconvenient (?)

Do you have a prototype? (This is not required.)

For the spec, yes: https://go.dev/cl/400095
For the implementation, no.

@changkun changkun added LanguageChange Proposal generics Issue is related to generics labels Apr 13, 2022
@gopherbot gopherbot added this to the Proposal milestone Apr 13, 2022
@gopherbot

Change https://go.dev/cl/400095 mentions this issue: spec: permit write defined type in type term

@ianlancetaylor
Contributor

CC @griesemer

This seems to me to be potentially confusing. It seems natural to think that ~MyInt only accepts type arguments that are defined as MyInt, but in fact it accepts other types as well.

This change also doesn't seem very necessary. I can write

type MyUnderlyingType = struct { a int }
type MyStruct struct { a int }
type MyOtherStruct struct { a int }
func F[T ~MyUnderlyingType]() {}
var _ = F[MyStruct]
var _ = F[MyOtherStruct]

It seems to me that this gives me the same effect without changing the meaning of ~Type.
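
A minimal runnable sketch of the alias approach described above, for readers who want to try it. The names Underlying and FieldA are illustrative assumptions and do not appear in the thread; the point is only that an alias to a struct literal can follow ~ today.

```go
package main

import "fmt"

// Underlying is an alias (note the '='), so ~Underlying is the same
// constraint as ~struct{ a int }. The name is hypothetical.
type Underlying = struct{ a int }

type MyStruct struct{ a int }
type MyOtherStruct struct{ a int }

// FieldA accepts any type whose underlying type is struct{ a int }.
// The value is converted to the underlying type before the field is read.
func FieldA[T ~Underlying](t T) int {
	return Underlying(t).a
}

func main() {
	fmt.Println(FieldA(MyStruct{a: 3}))
	fmt.Println(FieldA(MyOtherStruct{a: 7}))
}
```

Both defined types satisfy the constraint because they share the underlying type struct{ a int }, not because either was declared from the other.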

@changkun
Member Author

changkun commented Apr 13, 2022

This seems to me to be potentially confusing. It seems natural to think that ~MyInt only accepts type arguments that are defined as MyInt, but in fact it accepts other types as well.

I noticed this while prototyping the spec change, and intentionally wrote the proposal to allow it:

<p>
Additionally, two constraints with only embedded type elements (not a method) are the same if their underlying types are the same:
</p>

<pre>
[T ~MyInt]                   // = [T ~int]
[T ~MyStruct]                // = [T ~struct{a int}]
[T ~MyOtherStruct]           // = [T ~struct{b int}]
</pre>

So that we could guarantee the concept of "underlying type" stays with (what I understood) the current meaning in the spec, and also not touch the potentially relevant "core type" concept.

Still, it's quite surprising for me to hear it is "natural to think that ~MyInt only accepts types that are defined as MyInt", because MyInt is used with a tilde, and currently, it refers back to the language predeclared or composite types.

For

func F[T MyInt]()

we might agree on this "natural thinking", but I can't easily support it for ~MyInt.
If we adopted it, we would need to permit every defined type to be an "underlying type". The "underlying type" concept would then no longer align with the current spec, and it would be unclear what the core type is in this case and how to differentiate "underlying type" from "core type".

type MyInt int
type MyAnotherInt MyInt
func G[T ~int]()          // 1, underlying type is int
func G[T ~MyInt]()        // 2, underlying type is MyInt
func G[T ~MyAnotherInt]() // 3, underlying type is MyAnotherInt

I also don't know whether this "natural thinking" implies that the type sets defined by 1, 2, and 3 get smaller and smaller, or why that would make sense.
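
For comparison, here is a small sketch of how the spec behaves today: a type declared from another defined type takes that type's underlying type, so MyInt and MyAnotherInt both have underlying type int and both satisfy ~int. The function name Double is an illustrative assumption.

```go
package main

import "fmt"

type MyInt int
type MyAnotherInt MyInt // underlying type is int, not MyInt

// Double accepts any type whose underlying type is int and returns a
// value of the same type argument.
func Double[T ~int](v T) T {
	return v * 2
}

func main() {
	fmt.Println(Double(4))               // T is int
	fmt.Println(Double(MyInt(2)))        // T is MyInt
	fmt.Println(Double(MyAnotherInt(3))) // T is MyAnotherInt
}
```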

Nevertheless, I have to say I am not a language expert and therefore can't fully evaluate these two ways of thinking or their consequences. But I think the main purpose of this proposal is to allow code that we can't write today, which relates to your follow-up comment:

This change also doesn't seem very necessary. I can write

type MyUnderlyingType = struct { a int }
type MyStruct struct { a int }
type MyOtherStruct struct { a int }
func F[T ~MyUnderlyingType]() {}
var _ = F[MyStruct]
var _ = F[MyOtherStruct]

It seems to me that this gives me the same effect without changing the meaning of ~Type.

This was discussed in #52295.
The trick may be a workaround for simple cases where the underlying type can be written out, but it does not work when things get complicated:

  1. It cannot handle a self-referential struct, where the underlying type needs a field that points to itself.
  2. The underlying type may need to define methods.
  3. The underlying type may contain unexported fields that cannot be written out from another package.

Specifically:

package main

import "sync"

type MyComplexStruct struct { // maybe from an external package
	a int
	b *MyComplexStruct
	c sync.Mutex
}

type S = MyComplexStruct

func F[T ~S](t T) {}

func main() {
	F(MyComplexStruct{}) // ERROR
}

A more complicated example could be found in #52295.

@changkun
Member Author

changkun commented Apr 13, 2022

I just want to add more thoughts regarding the "natural thinking" and how it might lead to conflicts. The following is how code works today. Hence I think the "underlying type" really means the language's predeclared or composite types:

package main

type Type struct{}      // the underlying type of Type is struct{}
type MyType Type        // the underlying type of MyType is struct{}
type MyAnotherType Type // the underlying type of MyAnotherType is struct{}

func F[T ~struct{}](t T) {} // Today

func main() {
	x := Type{}
	y := MyType{}
	z := MyAnotherType{}

	F(x) // OK
	F(y) // OK
	F(z) // OK
}

Say if we can write this:

func F[T ~MyType](t T) {} 

and disallow

F(Type{})

Then I guess we also can't write:

F(MyAnotherType{})

Because MyAnotherType is not defined by MyType.
Now, let's have

type MyThirdType MyType
type MyFouthType MyThirdType

Can we do the following, and why? Following Ian's comment in #52318 (comment), this would be possible:

F(MyFouthType{})

Should we say "the underlying type of MyFouthType is MyType"?
or, should we say "the core type of MyFouthType is MyType"?
or, should we say "MyFouthType is 'derived' type from MyType"?

Either way, the "natural thinking" seems to introduce behavior that is inconsistent with how the code works today.

@changkun
Member Author

This seems to me to be potentially confusing. It seems natural to think that ~MyInt only accepts type arguments that are defined as MyInt, but in fact, it accepts other types as well.

After rethinking this argument with a fresh mind, I see the root issue is that ~ cannot state the "direction" of type approximation, and the key ambiguity at the moment is how to compare the "size" of the type set ~MyInt and ~int when:

type (
    A  int
    B A
    C A
    D B
    E B
    F C
)
int -> A -> B -> D
       |    |
       |    + -> E
       |
       + -> C -> F

There are three cases of understanding:

  1. This proposal (similar to == B): ~B includes int, and any successor types derived from int, such as A, B, ..., E, and F.
  2. Ian's intuition (similar to >= B): ~B includes B and the successor types derived from it, e.g., D, E, and any subsequent types declared from them, but it does not include its ancestors A or int, nor its sibling C or C's successor F.
  3. One more possibility (similar to <= B): ~B includes all of its ancestors A and int, and its direct successor types D, E, but not C, F.
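
The three readings can be made concrete with a toy model of the declaration tree above. This is only a sketch: types are represented as strings, the function names are assumptions invented for illustration, and nothing here reflects how the compiler computes type sets.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// parent maps each defined type to the type it was declared from,
// mirroring the declarations above. "int" is the root and has no entry.
var parent = map[string]string{
	"A": "int", "B": "A", "C": "A", "D": "B", "E": "B", "F": "C",
}

// chain returns name followed by every ancestor up to the root.
func chain(name string) []string {
	out := []string{name}
	for p, ok := parent[name]; ok; p, ok = parent[p] {
		out = append(out, p)
	}
	return out
}

func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// typeSet lists, in sorted order, every modeled type that keep accepts.
func typeSet(keep func(t string) bool) string {
	names := []string{"int"}
	for t := range parent {
		names = append(names, t)
	}
	sort.Strings(names)
	var out []string
	for _, t := range names {
		if keep(t) {
			out = append(out, t)
		}
	}
	return strings.Join(out, " ")
}

// SameUnderlying models option 1: every type sharing the same root.
func SameUnderlying(base string) string {
	root := func(t string) string { c := chain(t); return c[len(c)-1] }
	return typeSet(func(t string) bool { return root(t) == root(base) })
}

// Successors models option 2: base and everything declared from it.
func Successors(base string) string {
	return typeSet(func(t string) bool { return contains(chain(t), base) })
}

// AncestorsAndChildren models option 3: base, its ancestors, and its
// direct successors.
func AncestorsAndChildren(base string) string {
	return typeSet(func(t string) bool {
		return contains(chain(base), t) || parent[t] == base
	})
}

func main() {
	fmt.Println("option 1:", SameUnderlying("B"))
	fmt.Println("option 2:", Successors("B"))
	fmt.Println("option 3:", AncestorsAndChildren("B"))
}
```

Under this model, option 1 is the only reading in which ~B and ~int denote the same set, which is the equivalence this proposal relies on.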

This ambiguity arises because, after the type declaration type MyInt int, MyInt can be considered either a successor of int or something approximately equivalent to int. Should we consider ~MyInt as something approximately equivalent to int?

For simplicity, I think the 1st option fits best because it only considers whether types have the same underlying type. The 2nd option carries a bit more cognitive load, since a person needs to understand the "order" of defined types. This could become very complicated when types are redefined in many different ways at scale.

One additional downside of the 2nd option concerns the ~ operator itself: it is not an asymmetric operator like < or >. One could imagine a distinction between ~T and T~, but I expect that would further complicate the language, which seems to deviate from the initial goal of the proposal.

@ianlancetaylor
Contributor

We're not going to make any generics changes any time soon, so putting this on hold.

@sammy-hughes

sammy-hughes commented Apr 13, 2022

If the purpose of the tilde is tied to the normal form of a type, then the current behavior is appropriate. A named type may have associated methods, and a new type defined from it has a different (e.g. empty) method set, so an assertion of structural compatibility does not guarantee compatibility of meaning.
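
The point about method sets can be checked with a short sketch. The types Celsius and Reading and the helpers HasString and Raw are hypothetical names chosen for illustration; the behavior shown is that a type declared from a defined type does not inherit its methods, while both still satisfy the same ~float64 constraint.

```go
package main

import "fmt"

// Celsius is a defined type with a String method.
type Celsius float64

func (c Celsius) String() string {
	return fmt.Sprintf("%.1f C", float64(c))
}

// Reading is declared from Celsius. It shares the underlying type
// float64 but does not inherit Celsius's methods.
type Reading Celsius

// HasString reports whether v's dynamic type implements fmt.Stringer.
func HasString(v any) bool {
	_, ok := v.(fmt.Stringer)
	return ok
}

// Raw accepts both types because the constraint looks only at the
// underlying type, not at the method set.
func Raw[T ~float64](v T) float64 {
	return float64(v)
}

func main() {
	fmt.Println(HasString(Celsius(21.5)), HasString(Reading(21.5)))
	fmt.Println(Raw(Celsius(21.5)), Raw(Reading(21.5)))
}
```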

Example 1, supposing the following:

type A struct{X int; Y float64;}
type B struct{X int; Y float64;}
type C A

What should "~A" mean? Would it mean "Assignable to struct{X int; Y float64;}" or "Derived at some point from A"?

If the former, then it's misleading: the notation suggests that B is excluded, yet B is apparently valid. You would also be asserting that any type, having any particular meaning, which is coercible to A, can be re-expressed in terms of A, with meaning as if it were A.

If the latter, you're proposing a new facility which is potentially problematic, and in effect another overhaul of the type system.

Example 2, suppose the following:

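// package alpha
package alpha

type A struct{X int; Y float64;}
type b A // unexported
type C b

// package beta (a separate package)
package beta

import "alpha"

func f[T ~alpha.A](x T) T { return x; } // hypothetical: ~alpha.A is not valid today
func main() {
    _ = f(alpha.C{1,1.0})
}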

The type resolution must resolve a lineage which includes an unexported type. How can that be resolved?

The contrary case, where ~A means "Assignable to A", as shorthand for ~struct{X int; Y float64}, is admittedly shorter, but it also requires special knowledge to understand.

For what it's worth, the supportable proposal would be to suggest some kind of "type as value" scheme, such that one could reason about the types of X, versus Y and Z, such as if type of X equals type of Y else if type of X equals type of Y. However, if you wanted this, you can accomplish it by trying bad assignments and recovering. If this is a feature that then meaningfully moves the boundaries of what can be expressed, at least then it can be proven that the mechanism is possible and the facility has value, and a proposal could address the performance implications of intending a panic/recover cycle. Further, such improvements would benefit the rest of the language, as well.

@changkun
Member Author

changkun commented Apr 13, 2022

Sorry, I don't understand where your "Assignable to ..." came from. The whole conversation context is about type constraints. Your example

f(alpha.C{1,1.0})

instantiates f to:

func f(x alpha.C) alpha.C { return x; }

There is nothing in relation to assignability to alpha.A whatsoever.
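
A runnable approximation of this point, under stated assumptions: everything is placed in one package, float64 replaces float (which is not a Go type), and the constraint is written as a struct literal because ~A is not valid today. Assigning the result to a variable of type C confirms that T is inferred as C itself.

```go
package main

import "fmt"

type A struct {
	X int
	Y float64
}
type b A
type C b

// f is constrained by the shared underlying type. The type argument is
// inferred from the call site and is the argument's own type.
func f[T ~struct {
	X int
	Y float64
}](x T) T {
	return x
}

func main() {
	// This compiles only because f's result type is C here.
	var c C = f(C{1, 1.0})
	fmt.Printf("%T %v\n", c, c)
}
```

No value of type A is involved at any point, so assignability to A never comes into play.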

