proposal: Go 2: add untyped() type conversion for integer constants #31076

Closed
MichaelTJones opened this issue Mar 27, 2019 · 15 comments
Labels
FrozenDueToAge LanguageChange Proposal v2 A language change or incompatible library change
@MichaelTJones
Contributor

MichaelTJones commented Mar 27, 2019

Proposal

This proposal extends the late-type-binding aspect of untyped constants to situations where the construction of a constant caused a type to be set, but the result is best used/considered as an untyped constant.

Background

There is a gentle robustness to untyped constants. I can add a plain "3" to any numeric expression without casts and verbosity. I want that for derived constants as well and offer the rationalization that even where typed constants are important in a computation, as in ^uint16(0), the result may not itself be usefully typed—the 65535 that results is just the integer 65535, so why should I not be able to add it to an int or uint loop index? untyped(expr) solves this.
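As a minimal sketch of the status quo this paragraph describes, written against current Go (which has no untyped built-in), the following contrasts a derived constant that picked up a type with the same number written plainly. The names maxTyped, maxPlain, and sum are illustrative only and do not come from the proposal.

```go
package main

import "fmt"

// maxTyped carries the type uint16 because of the conversion in its expression.
const maxTyped = ^uint16(0)

// maxPlain is the same number written as an untyped constant.
const maxPlain = 65535

// sum adds both constants to an int loop index.
func sum(i int) int {
	// "i + maxTyped" would not compile: mismatched types int and uint16.
	// The typed constant needs an explicit conversion; the untyped one does not.
	return i + int(maxTyped) + maxPlain
}

func main() {
	fmt.Println(sum(1))
}
```

The conversion int(maxTyped) is the verbosity that untyped(expr) is meant to remove.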

Example

Here is an example from a few days ago using the present state:

import (
	"unsafe"
)

//1, 2, 4, 8
const wordBytes = int(unsafe.Sizeof(uintptr(0)))

// 0, 1, 2, 3
const logWordBytes = 0 +
	3*((wordBytes>>3)&1) + // 64-bit machine aligns 0 of 0..7
	2*((wordBytes>>2)&1) + // 32-bit machine aligns 0 of 0..3
	1*((wordBytes>>1)&1) + // 16-bit machine aligns 0 of 0..1
	0 //  8-bit machine always aligns

// 0, 0b1, 0b11, 0b111
const alignMask = uintptr(wordBytes - 1) // == (1<<logWordBytes)-1

Neither the int in int(unsafe.Sizeof(uintptr(0))) nor the uintptr in uintptr(wordBytes - 1) was what I wanted; I wanted a plain untyped integer. (As Rob asked in a video, "What was wrong with the old integer 80?") It was just that the expressions garnered a type and I could not "un-type" them.

I would like to see this instead:

import (
	"unsafe"
)

//1, 2, 4, 8
const wordBytes = untyped(unsafe.Sizeof(uintptr(0)))

// 0, 1, 2, 3
const logWordBytes = 0 +
	3*((wordBytes>>3)&1) + // 64-bit machine aligns 0 of 0..7
	2*((wordBytes>>2)&1) + // 32-bit machine aligns 0 of 0..3
	1*((wordBytes>>1)&1) + // 16-bit machine aligns 0 of 0..1
	0 //  8-bit machine always aligns

// 0b0, 0b1, 0b11, 0b111
const alignMask = wordBytes - 1 // == (1<<logWordBytes)-1

...which would have given me 8, 3, and 7 as simple constants for further convenient use.

Compatibility

This does introduce a new symbol so it is not impossible that it would be seen as violating the Go1 promise.

Jan Mercl pointed out that users can create a type of their own named untyped, as shown here in the Go Playground, so that makes it a guarantee-breaking change.

Ian Lance Taylor commented that it may be seen as non-breaking under the notion that "If we treat untyped as an ordinary predeclared identifier, then I think the proposal remains backward compatible. The new predeclared untyped is not available in packages that defined the name untyped in package scope. And there is no other way write untyped(EXPR)."

In any case the change is localized: nothing changes unless a user has created a type named "untyped". If they have, the compiler, go vet, and go fix will know and can respond as appropriate. One such response is to use the user's type conversion, in which case it is not a change in the contract sense, though it admittedly becomes a potential source of confusion in that hypothetical case.
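To make the compatibility concern concrete, here is a small sketch of a program that is legal in current Go and already uses the name. The type untyped and the function convert are hypothetical user code, not part of any proposal text. Under the reading Ian describes, a package-scope declaration like this one would simply shadow a future predeclared untyped, the same way user declarations shadow other predeclared identifiers today.

```go
package main

import "fmt"

// untyped is a user-defined type. The name is legal today because it is
// neither a keyword nor a predeclared identifier.
type untyped int

// convert uses the user's type. Here untyped(5) is an ordinary conversion
// to the named type above, so the result is typed.
func convert() untyped {
	return untyped(5)
}

func main() {
	fmt.Printf("%T %v\n", convert(), convert())
}
```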

Ambiguities

This proposal originally focused on integer constants, but discussion here has made clear that it is just as useful in general. Utility and the appeal of orthogonality say that untyped() could remove the complex128 type from a complex constant just as beneficially.

Implementation

Simple change in the semantic actions of the parser to drop whatever type has been set on a constant.

@gopherbot gopherbot added this to the Proposal milestone Mar 27, 2019
@ianlancetaylor ianlancetaylor changed the title proposal: add untyped() type conversion for integer constants proposal: Go 2: add untyped() type conversion for integer constants Mar 27, 2019
@ianlancetaylor ianlancetaylor added LanguageChange v2 A language change or incompatible library change labels Mar 27, 2019
@ianlancetaylor
Contributor

Thanks. This leads to the obvious question of how to handle float, complex, rune, string, and bool constants. These constants too can be both untyped and have a type. Should untyped work for them as well? For example, if untyped is given a value of type float32 or float64, should it return a value of untyped float?

@DeedleFake

You can't normally call a function in a constant instantiation, for obvious reasons. Apparently the unsafe pseudo-package is an exception to this since they're compile-time operations that, as their documentation says, return constants. I didn't realize before that this meant that they could be used like that, though.

The problem you're trying to address seems to stem from the fact that they're returning typed constants. Maybe it would make more sense for them to return untyped constants directly, rather than adding a new pseudo-conversion that would probably only be used in conjunction with them. Alternatively, make this an unsafe package operation, too: unsafe.Untype(), for example.

@MichaelTJones
Contributor Author

[responding to Ian]

The notion of untyped(expr) should be universal if that makes sense parse-wise. I had not really considered what an untyped(complex) would be other than complex, but now I realize from your comment that it is the complex64 or complex128 tagging that would be removed. Yes, I would change the proposal to make it general.

[responding to DeedleFake]
Indeed although each part of this deserves careful comment:

  • Neither untyped() nor sizeof() nor hypothetical peers like typeOf(), offsetOf(), minValueOf() are function calls. They are fixed attributes of the argument being exported from the compiler's symbol table.

  • The general bias against function calls in constant expressions is regrettable. Pure functions should indeed be allowed in such cases. Indeed, consider how much better it would be to initialize a const sqrt2 = math.Sqrt(2) by calling the math function during compilation rather than setting const sqrt2 = 665857.0/470832.0 via a fraction, using an init function and its global variable, through cut-and-paste from Ivy/Mathematica, or hoping that the math package has the special value that you want squirreled away.

  • That was just an example that came up recently. I'm "sure" emotionally but not recalling hard fact that this has come up before in cases not involving unsafe... Aha! I made one up just now so I do have fact that it is not just about package unsafe, though I still don't remember the previous case:

const h = "hello"
const l = len(h)
const example = byte(5) + l
# xor
./xor.go:26:25: invalid operation: byte(5) + l (mismatched types byte and int)

...and here's an example. The length of "hello" is not the number 5, but rather Go's int(5). Now, why would this be the optimal circumstance? Is it important for static code analysis to halt a developer's "wild" abuse of adding the length of the string to an unsigned variable? Should len() return a number cast to unsigned, since it can never be negative and our grandchildren may have >63-bit string cardinalities (thus making normal uses cast it to int)? Why would len() want to define a type more specific than the equivalent of inserting the symbols " 5 " into the parse stream?
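For comparison, a sketch of how the failing snippet compiles in current Go: the constant produced by len has type int, and an explicit constant conversion is the available way to bring it back in line. The conversion is legal here only because the value 5 is representable as a byte.

```go
package main

import "fmt"

const h = "hello"

// l is a constant, but len gives it the type int.
const l = len(h)

// "byte(5) + l" does not compile (mismatched types byte and int).
// Converting the constant makes both operands byte.
const example = byte(5) + byte(l)

func main() {
	fmt.Printf("%T %v\n", example, example)
}
```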

My thought was just to leave everything as is and offer untyped() as a rescue means for people like me who want their 5 to be a 5 and their log2ByteSize to be a 3. On reflection, maybe this was aiming too low. Maybe it is const type assignment that should be made even more just in time for Go2. I am now sitting here thinking what the implication of this rule would be: never assign a type to a const expression unless absolutely unavoidable and then unassign that type as soon as is possible.

Some examples:

const pdp1140mem = ^uint16(0)

evaluated as

a zero of no type, "0"
forced to be uint16, giving 0b00000000_00000000
all 16 bits inverted, giving 0b11111111_11111111
returned as a value, so untyped(expr) to give the typeless "65535"

This is great for me, but is complicated if the developer then tries

const pdp1140mask = ^pdp1140mem

because this has potentially confusing meanings (not confusing mathematically, just that masking a high-one and inverting would not turn that leading zero into a one, as may be expected if they are thinking the constant has 16-bittedness as an attribute).
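The difference can be observed in current Go by writing the number both ways. This is a sketch with made-up names (typedMem, plainMem, and so on); plainMem stands in for what pdp1140mem would be if its type were dropped automatically.

```go
package main

import "fmt"

// Typed: the constant keeps its 16-bittedness.
const typedMem = ^uint16(0) // uint16 constant 65535
const typedMask = ^typedMem // still uint16: all 16 bits flip back, giving 0

// Untyped: the constant is just the integer 65535.
const plainMem = 65535
const plainMask = ^plainMem // for untyped constants ^x is -1 ^ x, giving -65536

func main() {
	fmt.Println(typedMem, typedMask)
	fmt.Println(plainMem, plainMask)
}
```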

Alternatives that work are:

const pdp1140mem uint16 = ^uint16(0)
const pdp1140mask = ^pdp1140mem

or

const pdp1140mem = uint16(^uint16(0))
const pdp1140mask = ^pdp1140mem

or

const pdp1140mem = ^uint16(0)
const pdp1140mask = ^uint16(pdp1140mem)

These all make sense, but highlight the breaking changes that newly understood ultra late type binding and rapid unbinding would bring. In their defense, they could be automagically rewritten by the gofix tooling.

About pure functions

Not only the math functions, but math/bits as well. I have had constants that could wisely be done as const to help the compiler but I could not do it other than via init functions or as a variable. I mused a few months ago about pure functions and compile-time evaluation. It would be great as a general facility, but for the restricted domain of math special functions and bit shuffles/reversals and the like, this could indeed be handled by a single switch statement, almost instantaneous evaluation, and then a happy typeless constant would result for the benefit of the compiler's future optimizations.
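A sketch of the status quo being described, using the real math/bits.Reverse8 function: because a function call is not a constant expression, the result has to live in a package-level variable rather than a const. The name reversed is illustrative.

```go
package main

import (
	"fmt"
	"math/bits"
)

// "const reversed = bits.Reverse8(0x5b)" does not compile today, because
// a call to an ordinary function is not a constant expression.
// A package-level variable is the usual fallback.
var reversed = bits.Reverse8(0x5b)

func main() {
	fmt.Printf("%08b\n", reversed)
}
```

The value is fixed by the source text, yet it is typed (uint8), mutable, and not usable in other constant expressions.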

@randall77
Contributor

Here's a nice motivating example for pure functions, from runtime/stack.go:

const (
	// The minimum stack size to allocate.
	// The hackery here rounds FixedStack0 up to a power of 2.
	_FixedStack0 = _StackMin + _StackSystem
	_FixedStack1 = _FixedStack0 - 1
	_FixedStack2 = _FixedStack1 | (_FixedStack1 >> 1)
	_FixedStack3 = _FixedStack2 | (_FixedStack2 >> 2)
	_FixedStack4 = _FixedStack3 | (_FixedStack3 >> 4)
	_FixedStack5 = _FixedStack4 | (_FixedStack4 >> 8)
	_FixedStack6 = _FixedStack5 | (_FixedStack5 >> 16)
	_FixedStack  = _FixedStack6 + 1
)

A pure round-up-to-power-of-two function would be nice instead.
That said, I think this was the only time that I missed using pure functions in const expressions in 6 years of writing Go.

@MichaelTJones
Contributor Author

MichaelTJones commented Mar 30, 2019

Alas, the round-up-to-power-of-two function does require an if test for "am I already a power of two" or else what you have here. With the if passed to prove not 2^n, it can be a simple shift computed from the count of leading zeroes. Without C's ternary operator or the Algol-like a := if b then c else d fi, you must have the code you show (with the missing ">> 32" last step for generality).

We could add this to bits, and then with pure function support it would be there for you.

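import "math/bits"

// RoundUpPowerTwo returns the smallest power of two >= n (1 for n == 0).
// The result overflows to 0 when n > 1<<63.
func RoundUpPowerTwo(n uint64) uint64 {
	// special cases
	switch {
	case n == 0:
		return 1
	case n&(n-1) == 0: // already a power of two
		return n
	}

	// general case: next higher power of two
	// (64 - LeadingZeros64(n) is the bit length of n)
	return uint64(1) << uint(64-bits.LeadingZeros64(n))
}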

@griesemer
Contributor

griesemer commented May 2, 2019

I'm very much sympathetic to this idea - it would indeed be nice to have an "untype" mechanism. I also agree that most of these problems would disappear if some of the built-in functions such as len, cap, and the unsafe functions would not return an int but an untyped integer (which would default to an int upon assignment to a variable) if we have a constant expression. (I should look into this, it may be possible to do this in a backward-compatible way).

There may still be a need to explicitly remove a type for which we might want to have an untype built-in that takes a typed value where the underlying type is any of the built-in types, and which simply removes that type.

Regarding your specific example at the top, computing wordBytes as an untyped value, that can be done easily w/o any of this, though. See math/bits/bits.go at the top for an example (it can be refined).
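A sketch of that approach, modeled on the shift idiom used for the word-size constant in math/bits and adapted here to uintptr. It assumes a 32-bit or 64-bit target only (it does not cover the 8-bit and 16-bit cases from the original example), and the names wordBits, asFloat, and asByte are illustrative.

```go
package main

import "fmt"

// wordBits is 32 or 64. The left operand 32 is untyped, so the result of
// the constant shift is untyped even though the shift count is a uintptr.
const wordBits = 32 << (^uintptr(0) >> 63)

// Both derived constants stay untyped as well.
const wordBytes = wordBits / 8
const alignMask = wordBytes - 1

// These declarations compile only because alignMask is untyped.
var asFloat float64 = alignMask
var asByte byte = alignMask

func main() {
	fmt.Println(wordBits, wordBytes, alignMask, asFloat, asByte)
}
```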

Finally, we actually do have a mechanism in the language to remove type information from numeric values ! :-) That operation is (left or right) shift. Shifts are the only operation that take two operands of different types to produce a result of one of those types (the type of the left operand). If the left operand is untyped, then the result remains untyped no matter the type of the right operand. This can be used to assemble any integer from untyped bits. Here's an example:

package main

import "fmt"

const (
	// typed x
	x byte = 0x5b // 0b_0101_1011

	// untyped bits of x
	b0 = 1 - 1>>(x>>0&1)
	b1 = 1 - 1>>(x>>1&1)
	b2 = 1 - 1>>(x>>2&1)
	b3 = 1 - 1>>(x>>3&1)
	b4 = 1 - 1>>(x>>4&1)
	b5 = 1 - 1>>(x>>5&1)
	b6 = 1 - 1>>(x>>6&1)
	b7 = 1 - 1>>(x>>7&1)

	// create an untyped u from the untyped bits of x
	u = b0<<0 | b1<<1 | b2<<2 | b3<<3 | b4<<4 | b5<<5 | b6<<6 | b7<<7

	// verify that u is untyped
	// (if it were typed we couldn't assign it to a complex128)
	_ complex128 = u
)

func main() {
	fmt.Println(b7, b6, b5, b4, b3, b2, b1, b0)
	fmt.Printf("%08b == %08b (%v)\n", x, u, x == u)
}

The scheme can be applied to any integer built-in type which can be "sampled" via the use of shifts; its source code size is O(n) where n is the number of bits of the original value... - fun, fun, fun!

@griesemer
Contributor

I added #31795.

@griesemer
Contributor

PS: If len would return an untyped integer for a constant expression, one could do len((*[x]byte)(nil)) to remove the type of x.
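For reference, a sketch of what that expression does in current Go. It is already a constant expression, since the operand is a pointer to an array and involves no function calls or channel receives, but today the resulting constant has type int rather than being untyped. The name n is illustrative.

```go
package main

import "fmt"

// typed x, as in the earlier example
const x byte = 0x5b

// n is a constant with the value of x. In current Go its type is int;
// under the alternate proposal it would be an untyped integer.
const n = len((*[x]byte)(nil))

func main() {
	fmt.Printf("%T %v\n", n, n)
}
```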

@ianlancetaylor
Contributor

@MichaelTJones Do you think that @griesemer 's alternate proposal #31795 would address the use cases of an untyped conversion?

@griesemer
Contributor

ping @MichaelTJones

@bradfitz
Contributor

Closing, as @MichaelTJones replied at #31795 (comment) that #31795 would work instead.

@griesemer
Contributor

I'm fine with closing this, but for the record, @MichaelTJones's reply came before I mentioned that we couldn't do #31795 in a backward-compatible way for the unsafe.Sizeof/Alignof/Offsetof functions because they return a uintptr. (I suspect one of the common use cases is wanting to have the size of a type as an untyped value, which would require unsafe.Sizeof.) But we could provide a trivial gofix (or even gofmt).

@MichaelTJones
Contributor Author

MichaelTJones commented May 29, 2019 via email

@bradfitz
Contributor

bradfitz commented Dec 3, 2019

Turns out #31795 was closed.

@MichaelTJones, do you want this re-opened?

@MichaelTJones
Contributor Author

MichaelTJones commented Dec 3, 2019 via email

@golang golang locked and limited conversation to collaborators Dec 2, 2020