cmd/compile: package-scope and function-scope types get the same name #38893

Open
ianlancetaylor opened this issue May 6, 2020 · 12 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@ianlancetaylor
Contributor

This program shows that we can get two different types with the same name. This is permitted by the reflect package, but it seems unnecessarily confusing. Should the compiler generate different names for these types?

package main

import (
	"fmt"
	"reflect"
)

type myint64 int64

func main() {
	var i1 myint64 = 100
	fmt.Printf("%T: %v\n", i1, i1)

	t1 := reflect.TypeOf(i1)

	type myint64 int32
	var i3 myint64 = 100
	fmt.Printf("%T: %v\n", i3, i3)

	t2 := reflect.TypeOf(i3)

	fmt.Printf("t1: %v %v\n", t1.Kind(), t1.String())
	fmt.Printf("t2: %v %v\n", t2.Kind(), t2.String())
	fmt.Println(t1 == t2)
}

Output:

main.myint64: 100
main.myint64: 100
t1: int64 main.myint64
t2: int32 main.myint64
false
@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 6, 2020
@ianlancetaylor ianlancetaylor added this to the Go1.16 milestone May 6, 2020
@JohnReedLOL

JohnReedLOL commented May 6, 2020

@ianlancetaylor I need to warn you that you are treading into some messiness. Consider this:

package main

import "fmt"

type myint64 int64

const myi myint64 = 999
const myj myint64 = 999

func main() {
	var closure1 = func (i myint64) { fmt.Printf("i: %T: %v\n", i, i) }

	type myint64 int64 // Note that this is a DIFFERENT myint64 than the one up top.
	var closure2 = func (j myint64) { fmt.Printf("j: %T: %v\n", j, j) }

	closure1(myi)
	closure2(myj) // Cannot use myj (type myint64) as type myint64 in argument to closure2
	closure1(myint64(myi)) // Cannot use myint64(myi) (type myint64) as type myint64 in argument to closure1
	closure2(myint64(myj))
}

In the above example, there are two different types named myint64, both with underlying type int64, but closure1 accepts only the first myint64 and closure2 accepts only the second.

That's when they both refer to the same underlying type. Now consider what can happen if they refer to different underlying types:

package main

type myint64 int64

const myi int64 = 100
const myj int16 = 100

func main() {
	// This takes in the first kind of myint64 declared above.
	var closure1 = func (i myint64) { }

	type myint64 int16 // This is another kind of myint64.
	var closure2 = func (j myint64) { }

	closure1(myi) // Cannot use myi (type int64) as type myint64 in argument to closure1
	closure2(myj) // Cannot use myj (type int16) as type myint64 in argument to closure2
	closure1(myint64(myi)) // Cannot use myint64(myi) (type myint64) as type myint64 in argument to closure1
	closure2(myint64(myj))

	{
		// Let's try to call closure1 by redefining myint64 in an inner scope.
		type myint64 int64 // Oops, this int64 is different than the one that closure1 takes in!
		closure1(myint64(myi)) // Cannot use 'myint64(myi)' (type myint64) as type myint64
	}
}

It gets worse. Each inner scope can declare a type with the same name as (but a different underlying Kind than) the declaration in the outer scope. You can have a closure in a closure in a closure, and each inner closure can define myint64 with a different underlying Kind than the outer closure did, like so:

package main

import (
	"fmt"
	"reflect"
)

type myint64 int64

func main() {
	var i1 myint64 = 10
	fmt.Printf("i1: %T: %v\n", i1, i1)

	t1 := reflect.TypeOf(i1)

	type myint64 int32
	var i2 myint64 = 10
	fmt.Printf("i2: %T: %v\n", i2, i2)

	t2 := reflect.TypeOf(i2)

	fmt.Printf("t1: %v %v\n", t1.Kind(), t1.String())
	fmt.Printf("t2: %v %v\n", t2.Kind(), t2.String())
	fmt.Println("t1 == t2? ", t1 == t2)

	var closure func() = func() {
		type myint64 int16
		var i3 myint64 = 10
		fmt.Printf("i3: %T: %v\n", i3, i3)
		t3 := reflect.TypeOf(i3)
		fmt.Printf("t3: %v %v\n", t3.Kind(), t3.String())
		fmt.Println("t3 == t1? ", t3 == t1)
		fmt.Println("t3 == t2? ", t3 == t2)

		var innerClosure func() = func() {
			type myint64 int8
			var i4 myint64 = 10
			fmt.Printf("i4: %T: %v\n", i4, i4)
			t4 := reflect.TypeOf(i4)
			fmt.Printf("t4: %v %v\n", t4.Kind(), t4.String())
			fmt.Println("t4 == t1 || t4 == t2? ", t4 == t1 || t4 == t2)
			fmt.Println("t4 == t3? ", t4 == t3)
		}
		innerClosure()
	}
	closure()
}
/*
Output:
i1: main.myint64: 10
i2: main.myint64: 10
t1: int64 main.myint64
t2: int32 main.myint64
t1 == t2?  false
i3: main.myint64: 10
t3: int16 main.myint64
t3 == t1?  false
t3 == t2?  false
i4: main.myint64: 10
t4: int8 main.myint64
t4 == t1 || t4 == t2?  false
t4 == t3?  false
*/

Go's type system obviously has some holes. I think the issue is that the type name of a variable (e.g. myint64) doesn't actually tell you the variable's Kind. You can have four different variables, all of type myint64, and they can all have different Kinds. Maybe put the Kind in the printed type of the variable: instead of %T printing main.myint64, it could print main.myint64-Int64, main.myint64-Int32, main.myint64-Int16, etc.

Either that or throw a deprecation warning at compile time if people declare a new type with the same name as an existing type.

But allowing the inclusion of -Kind would make it clearer what the underlying Kind is:

var i1 main.myint64-Int16 = 1
var i2 main.myint64-Int32 = 2
var i3 main.myint64-Int64 = 3

{ // Inner scope
  type myint64 struct{ s string }
  var i4 main.myint64-Struct = myint64{"hello"}
  var i5 main.myint64-Struct = struct{ s string }{"world"}
}

When it prints with %T, it can print the entire main.myint64-Int16 instead of just main.myint64, so you can see the Kind without having to call reflect.TypeOf(i1).Kind().
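The effect of the suggested -Kind suffix can be approximated today in user code. The following is a minimal sketch, assuming a hypothetical helper named describe and a "name-kind" label format of our own choosing; neither fmt nor reflect produces this format by itself, and reflect prints Kind names in lower case.

```go
package main

import (
	"fmt"
	"reflect"
)

type myint64 int64

// describe joins the type's String() with its Kind. The "name-kind" format is
// a hypothetical convention for this sketch. Passing a nil interface value
// would panic, because reflect.TypeOf(nil) returns nil.
func describe(v interface{}) string {
	t := reflect.TypeOf(v)
	return fmt.Sprintf("%s-%s", t.String(), t.Kind())
}

// labels describes one value of the package-scope myint64 and one value of a
// function-scope myint64 that shadows it.
func labels() (string, string) {
	var outer myint64 = 1 // package-scope type, underlying int64
	type myint64 int32    // shadows the package-scope type from here on
	var inner myint64 = 2 // function-scope type, underlying int32
	return describe(outer), describe(inner)
}

func main() {
	a, b := labels()
	fmt.Println(a)
	fmt.Println(b)
}
```

The labels still collide when two same-named types also share a Kind, so this only helps in the cases where the underlying Kinds differ.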

@JohnReedLOL

JohnReedLOL commented May 6, 2020

@ianlancetaylor This example below should not compile. It's too confusing. If two declared types have the same package, name, and Kind, they should be interchangeable or the code should not compile.

package main

import "fmt"

/* This example should not compile. The two MyTypes should conflict. */
func main() {
	type MyType int
	var closure1 = func (i MyType) { fmt.Printf("i: %T: %v\n", i, i) }
	{
		var myInt0 MyType = 100
		type MyType int // Note: this is a DIFFERENT MyType than the one above.
		var closure2 = func (j MyType) { fmt.Printf("j: %T: %v\n", j, j) }
		var myInt1 = 100
		var myInt2 MyType = 100 // This MyType is not the same as the MyType on myInt0.
		closure1(myInt0)
		// closure1(MyType(100)) // Cannot use 'MyType(100)' (type MyType) as type MyType
		closure2(100)
		// closure2(myInt0) // Cannot use 'myInt0' (type MyType) as type MyType
		closure2(MyType(myInt1))
		closure2(myInt2)
	}
}
/*
Output:
i: main.MyType: 100
j: main.MyType: 100
j: main.MyType: 100
j: main.MyType: 100
 */

@ianlancetaylor
Contributor Author

I don't see why you say that Go's type system has holes.

Don't confuse the name of the type as printed by %T or as used by the reflect package with the actual type. As the reflect documentation already says, there are cases where different types can have the same name (see the String method in https://golang.org/pkg/reflect/#Type). This is what your example programs show.

Your example in #38893 (comment) works as the language spec describes. Each type MyType declaration introduces a new type. The different MyType types are separate types that happen to have the same name. Go permits names to be shadowed, and this is just an example of that.

@JohnReedLOL

JohnReedLOL commented May 7, 2020

"I don't see why you say that Go's type system has holes."

I misspoke. It's a little unintuitive is all.

Now before I continue, I'd like to take a little digression. Take this very mundane Go code snippet:

type Point struct {X int; Y int}

Coming from other programming languages like Kotlin and Scala, I expected that code to express the same intent as this code:

// Kotlin
data class Point(var X: Int, var Y: Int)
// Scala
final case class Point(var X: Int, var Y: Int)

That being said, after learning about type declarations, I now see that this Go code means something fundamentally different. It means that a new, unique type named Point is being defined with struct {X int; Y int} as its underlying type (without modification to the original type). struct {X int; Y int} and Point can be used to instantiate structs the same way as one another, but they are different types (although converting between them is not necessary because struct {X int; Y int} is not a defined type). If I wrote type Point = struct {X int; Y int}, that would mean something different: an alias for the same type.
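The difference between a type definition and an alias declaration can be checked with reflect. This is a small sketch; the names PointAlias and compare are made up for illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// Point is a defined type: a new type whose underlying type is the struct.
type Point struct{ X, Y int }

// PointAlias is an alias: just another name for the unnamed struct type.
type PointAlias = struct{ X, Y int }

// compare reports whether Point and PointAlias are identical to the
// unnamed struct type they were declared from.
func compare() (definedSame, aliasSame bool) {
	lit := struct{ X, Y int }{1, 2}

	// Both assignments compile without a conversion: the underlying types
	// are identical and lit's type is not a defined type.
	var p Point = lit
	var a PointAlias = lit

	litType := reflect.TypeOf(lit)
	return reflect.TypeOf(p) == litType, reflect.TypeOf(a) == litType
}

func main() {
	definedSame, aliasSame := compare()
	fmt.Println("Point identical to struct literal type?", definedSame)
	fmt.Println("PointAlias identical to struct literal type?", aliasSame)
}
```

The defined type reports false and the alias reports true, which is the distinction the paragraph above describes.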

With that in mind, let's take a look at this example code:

package main

import (
	"fmt"
	"reflect"
)

type MyInt int
var myInt MyInt = 1

/*
Note: I am expecting all the types in []types to be different from one another.
*/
func main() {
	type MyInt int
	var types [5]reflect.Type = [5]reflect.Type{}
	types[0] = reflect.TypeOf(myInt) // Set types[0]
	var myInt MyInt = 1
	types[1] = reflect.TypeOf(myInt) // Set types[1]
	// Surprise: types[2] and types[3] are the same type.
	for i := 2; i < 4; i++ {
		type MyInt int
		var int0 MyInt = 1
		types[i] = reflect.TypeOf(int0)
	}
	{
		// Set types[4]
		type MyInt int
		var int0 MyInt = 1
		types[4] = reflect.TypeOf(int0)
	}

	var type0 = types[0]
	var type1 = types[1]
	var type2 = types[2]
	var type3 = types[3]
	var type4 = types[4]

	fmt.Printf("type0. Kind: %v, String: %v\n", type0.Kind(), type0.String())
	fmt.Printf("type1. Kind: %v, String: %v\n", type1.Kind(), type1.String())
	fmt.Printf("type2. Kind: %v, String: %v\n", type2.Kind(), type2.String())
	fmt.Printf("type3. Kind: %v, String: %v\n", type3.Kind(), type3.String())
	fmt.Printf("type4. Kind: %v, String: %v\n", type4.Kind(), type4.String())

	fmt.Println("type0 == type0?", type0 == type0)
	fmt.Println("type0 == type1?", type0 == type1)
	fmt.Println("type0 == type2?", type0 == type2)
	fmt.Println("type0 == type3?", type0 == type3)
	fmt.Println("type0 == type4?", type0 == type4, "\n")

	fmt.Println("type1 == type1?", type1 == type1)
	fmt.Println("type1 == type2?", type1 == type2)
	fmt.Println("type1 == type3?", type1 == type3)
	fmt.Println("type1 == type4?", type1 == type4, "\n")

	fmt.Println("type2 == type2?", type2 == type2)
	fmt.Println("type2 == type3?", type2 == type3)
	fmt.Println("type2 == type4?", type2 == type4, "\n")

	fmt.Println("type3 == type3?", type3 == type3)
	fmt.Println("type3 == type4?", type3 == type4, "\n")

	fmt.Println("type4 == type4?", type4 == type4, "\n")
}
/*
Output:
type0. Kind: int, String: main.MyInt
type1. Kind: int, String: main.MyInt
type2. Kind: int, String: main.MyInt
type3. Kind: int, String: main.MyInt
type4. Kind: int, String: main.MyInt

type0 == type0? true
type0 == type1? false
type0 == type2? false
type0 == type3? false
type0 == type4? false

type1 == type1? true
type1 == type2? false
type1 == type3? false
type1 == type4? false

type2 == type2? true
type2 == type3? true
type2 == type4? false

type3 == type3? true
type3 == type4? false

type4 == type4? true

*/

Now as a beginner this is a little unintuitive, but given that Go was created by the creator of C, it makes some sense. Notice that when type MyInt int is inside of a loop it doesn't instantiate a new type for each iteration of the loop (even though it does for a similar non-loop scope). The type declaration behaves like a const static variable from the C programming language. Like if I jump over type MyInt int with a goto it still gets defined and used sort of like a const static variable in C does. I haven't programmed in C since Operating Systems 101 and I found this behavior unexpected.
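The same compile-time behavior shows up across function calls, not only loop iterations. In this sketch (function names are hypothetical), a type declared inside a function is one type no matter how often the function runs, while an identically written declaration in another function is a different type.

```go
package main

import (
	"fmt"
	"reflect"
)

// localType declares MyInt in its own scope and returns its reflect.Type.
func localType() reflect.Type {
	type MyInt int
	return reflect.TypeOf(MyInt(0))
}

// otherLocalType contains a textually identical but separate declaration.
func otherLocalType() reflect.Type {
	type MyInt int
	return reflect.TypeOf(MyInt(0))
}

func main() {
	// One declaration yields one type, however many times it is executed.
	fmt.Println("same declaration, two calls:", localType() == localType())
	// Two declarations yield two types, even with the same name and Kind.
	fmt.Println("two declarations:", localType() == otherLocalType())
	// The printed names are nevertheless the same.
	fmt.Println("same String():", localType().String() == otherLocalType().String())
}
```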

If your code compiles and you have one type declaration on line 10 and another one with the same name on line 20, everything between line 10 and line 20 with that name gets the first type and everything after line 20 gets the second type, regardless of the presence of loops, closures, goto, etc, and regardless of whether the two type declarations refer to the same Kind or different Kinds.

Oh, and then the compiler says "Cannot use type MyType as type MyType", and when you try to cast your variable to MyType you can't because your variable actually needs to be cast to the MyType on line 10, not the MyType on line 20, which isn't obvious from the error message.
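One way to keep a shadowed type reachable is to declare an alias for it under a different name before the shadowing declaration. This is only a sketch of a possible workaround, not something proposed in this thread; the alias name outerMyType and the helper names are made up.

```go
package main

import (
	"fmt"
	"reflect"
)

type MyType int

// describeOuter accepts only the package-scope MyType.
func describeOuter(v MyType) string {
	return fmt.Sprintf("%T=%v", v, v)
}

// convertAcrossShadow converts a value of the function-scope MyType to the
// package-scope MyType through an alias declared before the shadowing.
func convertAcrossShadow() (string, bool) {
	type outerMyType = MyType // alias: still names the package-scope type
	type MyType int           // shadows the package-scope MyType from here on

	var inner MyType = 1
	// describeOuter(inner) would not compile: inner has the function-scope type.
	// describeOuter(MyType(inner)) would not compile either, for the same reason.
	s := describeOuter(outerMyType(inner))

	same := reflect.TypeOf(inner) == reflect.TypeOf(outerMyType(inner))
	return s, same
}

func main() {
	s, same := convertAcrossShadow()
	fmt.Println(s)
	fmt.Println("inner and outer identical?", same)
}
```

The conversion is allowed because both types share the underlying type int; the alias only supplies a name that the inner declaration has not shadowed.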

At the very least type declarations should give their enclosing scope (maybe more - in Scala types have a whole path). The types are static so it shouldn't be super hard to get that information. Like if there is one MyType in package scope and another in function foo, the first could be package.MyType and the second could be package.foo.MyType. If inside function foo MyType was declared in a closure assigned to a variable named bar, it would be nice if it were package.foo.bar.MyType instead of some long ugly anonymous lambda stacktrace name like package.foo.$$Lambda$1/791452441.MyType, but then what about if the closure is anonymous and gets immediately invoked without being assigned? Even then going straight to the enclosing function (package.foo.MyType or package.foo.$Closure.MyType) would probably be better than package.foo.$$Lambda$1/791452441.MyType.
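Scope-qualified labels of the kind described above can be derived from source code with go/ast. The following is a rough sketch under several assumptions: the function name scopedTypeNames, the "package.func.Type" label format, and the embedded sample source are all invented for illustration, and nested closures or blocks are not distinguished (every type inside foo is labeled with foo only).

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src is a made-up sample file with a package-scope and a function-scope MyInt.
const src = `package demo

type MyInt int

func foo() {
	type MyInt int32
	_ = MyInt(0)
}
`

// scopedTypeNames labels each type declaration with its package name and,
// for declarations inside a top-level function, that function's name.
func scopedTypeNames(source string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", source, 0)
	if err != nil {
		return nil, err
	}
	pkg := file.Name.Name
	var out []string
	for _, decl := range file.Decls {
		switch d := decl.(type) {
		case *ast.GenDecl:
			// Package-scope declarations: keep only the type specs.
			for _, spec := range d.Specs {
				if ts, ok := spec.(*ast.TypeSpec); ok {
					out = append(out, pkg+"."+ts.Name.Name)
				}
			}
		case *ast.FuncDecl:
			if d.Body == nil {
				continue
			}
			// Function-scope declarations anywhere inside the body.
			ast.Inspect(d.Body, func(n ast.Node) bool {
				if ts, ok := n.(*ast.TypeSpec); ok {
					out = append(out, pkg+"."+d.Name.Name+"."+ts.Name.Name)
				}
				return true
			})
		}
	}
	return out, nil
}

func main() {
	labels, err := scopedTypeNames(src)
	if err != nil {
		panic(err)
	}
	for _, l := range labels {
		fmt.Println(l)
	}
}
```

This works on source text only; it does not change what the compiler, %T, or reflect.Type.String() report.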

Also, if the whole path with enclosing scope could show in the output of reflect.Type.String() and in the compiler error message, that would make things less unintuitive. Less "which MyType is it talking about?" and no "I can't cast this because it's shadowed by some other MyType in another scope." Like if you could explicitly cast package.foo.MyType(variable) as a different type cast than package.MyType(variable), that would be better.

Also, if it were my programming language I would require types to have unique fully qualified names. Like even in Scala I don't like when I do a type path or type projection for the type of a class inside another class (ex. AClass.BClass or AClass#BClass) and there are two classes with the same name inside of the outer class. In Scala it just grabs the one that is declared closest to the top, but I would prefer it if that sort of scenario couldn't come up in the first place.

@JohnReedLOL

JohnReedLOL commented May 8, 2020

"Like even in Scala I don't like when I do a type path or type projection for the type of a class inside another class (ex. AClass.BClass or AClass#BClass) and there are two classes with the same name inside of the outer class. In Scala it just grabs the one that is declared closest to the top, but I would prefer it if that sort of scenario couldn't come up in the first place."

Correction:

// These objects are singletons within singletons...
object Failure {
  {
    {
      object Failure {
        object Failure {
          def printFail(): Unit = {
            println("Failure at the top!")
          }
        }
      }
    }
    object Failure {
      object Failure {
        def printFail(): Unit = {
          println("Failure in the middle!")
        }
      }
    }
  }
  object Failure {
    object Failure {
      def printFail(): Unit = {
        println("Failure at the end!")
      }
    }
  }
}

object Main {
  def main(args: Array[String]): Unit = {
    // Guess what this line prints...
    Failure.Failure.Failure.printFail()
  }
}
/*
Output:
Failure at the end!
 */

Yeah, this is horrible, but at least you can access singletons within singletons and types within types in Scala (as long as they're not shadowed). In Go if a type was defined and used inside of a function and that function has been returned from, that type is totally inaccessible regardless of where in the function it was defined or whether it was declared exported or unexported.

I have no expertise whatsoever, but I like the idea of being able to access an exported, declared type via a unique path like: package.function.$closure.innerFunction.$scope.Type, where $closure is an unnamed closure, $scope is an unnamed scope, and Type is the declared type that I am referring to. If all the declared types had unique paths, that would be more intuitive. Then I could think of the types like being in a tree.

@mdempsky
Member

Getting back to @ianlancetaylor's original question:

"Should the compiler generate different names for these types?"

I don't see a need to. reflect.Type.String already warns that String has collisions from package names, and so reflect.Type values should be compared directly to test type identity.
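As a concrete illustration of comparing reflect.Type values directly, the sketch below keys one map by String() and another by the reflect.Type itself. The function name countKeys is made up for this example.

```go
package main

import (
	"fmt"
	"reflect"
)

type myint64 int64

// countKeys records two same-named types in a map keyed by name and in a
// map keyed by reflect.Type, and returns how many entries each map holds.
func countKeys() (byName, byType int) {
	var outer myint64 = 1 // package-scope type
	type myint64 int32    // function-scope type with the same name
	var inner myint64 = 2

	names := map[string]bool{}
	types := map[reflect.Type]bool{}
	for _, v := range []interface{}{outer, inner} {
		t := reflect.TypeOf(v)
		names[t.String()] = true // both types print as main.myint64
		types[t] = true          // reflect.Type values are comparable
	}
	return len(names), len(types)
}

func main() {
	byName, byType := countKeys()
	fmt.Println("entries keyed by String():", byName)
	fmt.Println("entries keyed by reflect.Type:", byType)
}
```

The name-keyed map ends up with one entry and the type-keyed map with two, so any registry that must tell the types apart needs reflect.Type as its key.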

I think we can wait until we have a report of a real-world use case where this would be useful.

@aclements
Member

@ianlancetaylor, would it be okay to bump this to Unplanned?

@mdempsky
Member

mdempsky commented Dec 1, 2020

I think so, yes.

@aclements aclements modified the milestones: Go1.16, Unplanned Dec 1, 2020
@ianlancetaylor
Contributor Author

It's true that String says that the result may not be unique, but it strongly implies that this is because of package names, not because it will use the same string within a single package.

I think this should be in Backlog, not Unplanned, and I will change it accordingly.

@ianlancetaylor ianlancetaylor modified the milestones: Unplanned, Backlog Dec 2, 2020
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022
@arvidfm

arvidfm commented Jan 30, 2024

I don't know if Name() necessarily needs to be unique (within a given package), but it would be nice to have some way of getting a (predictable) unique name of a type, which isn't something that seems to be provided at all at the moment. (I'm currently struggling with mapping a type definition from an AST to a type obtained using reflection when the type is defined inside a function, since as far as I can see reflect.Type doesn't provide anything that could help with guessing the scope the type was defined in.)
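To make the limitation concrete, this sketch collects the name-related strings reflect exposes for a package-scope type and a function-scope type of the same name. The identity struct and the helper names are hypothetical, and the example assumes both declarations live in the same package.

```go
package main

import (
	"fmt"
	"reflect"
)

type MyInt int

// identity gathers the name-related strings that reflect.Type exposes.
type identity struct {
	Name, PkgPath, Str string
}

func identityOf(t reflect.Type) identity {
	return identity{Name: t.Name(), PkgPath: t.PkgPath(), Str: t.String()}
}

// compareIdentities reports whether the two MyInt types expose the same
// strings, and whether they are the same type.
func compareIdentities() (sameStrings, sameType bool) {
	outer := reflect.TypeOf(MyInt(0)) // package-scope MyInt
	type MyInt int                    // function-scope MyInt
	inner := reflect.TypeOf(MyInt(0))
	return identityOf(outer) == identityOf(inner), outer == inner
}

func main() {
	sameStrings, sameType := compareIdentities()
	fmt.Println("same Name/PkgPath/String:", sameStrings)
	fmt.Println("same type:", sameType)
}
```

None of the three strings records the enclosing function, which is why they cannot be matched to a particular ast.TypeSpec without extra bookkeeping.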

@mdempsky
Member

I think this issue is blocked waiting on the proposed user-visible changes. For example:

Why are unique strings needed? Why can't types.Type values be disambiguated directly with ==?

How should the strings be further disambiguated? How stable do the disambiguators need to be?

What changes need to be made to the reflect API? These will need to go through the proposal approval process.

@arvidfm

arvidfm commented Jan 30, 2024

"Why are unique strings needed? Why can't types.Type values be disambiguated directly with ==?"

In my case it's because I need to relate a reflect.Type to an ast.TypeSpec, so I can't compare them directly.

"How should the strings be further disambiguated? How stable do the disambiguators need to be?"

I just came across #55924 which seems relevant - having some way of accessing the link name for any type might just do the trick. Though that's assuming that the way the link name is constructed is documented and could be calculated deterministically from the AST of a package.

For some more context on my specific issue, given a reflect.Type or one of its fields, I would like to be able to look up the corresponding doc comment from the source code. Incidentally, if a link name (or whatever would be used here) was also generated for anonymous structs to help locate them in the source code, that would also make my life a lot easier. (I'm currently having to do a lot of bookkeeping to keep track of how anonymous structs are accessed.)
