proposal: spec: allow constants of arbitrary data structure type #6386

Open
gopherbot opened this issue Sep 14, 2013 · 49 comments
Labels
LanguageChange v2 A language change or incompatible library change
Milestone

Comments

@gopherbot

by RickySeltzer:

var each1 = []byte{'e', 'a', 'c', 'h'}
const each2 = []byte{'e', 'a', 'c', 'h'}

The 'var' is accepted, the 'const' is not. This is a defect in the language spec and
design.

1. What is a short input program that triggers the error?
http://play.golang.org/p/Jbo9waCn_h

2. What is the full compiler output?
prog.go:7: const initializer []byte literal is not a constant
[process exited with non-zero status]

@robpike
Contributor

robpike commented Sep 14, 2013

Comment 1:

Issue #6388 has been merged into this issue.

@robpike
Contributor

robpike commented Sep 14, 2013

Comment 2:

Labels changed: added priority-someday, languagechange, removed priority-triage, go1.2maybe.

Status changed to Accepted.

@cznic
Contributor

cznic commented Sep 16, 2013

Comment 3:

I'm against this. If it had to have constant semantics, then its run-time costs would be
the same as today, only hidden.
        const c = []byte{1}
        a := c
        a[0] = 42
        b := c
        fmt.Println(b[0] == 1)
The above can print 'true' only if c's backing array is copied in the assignment to 'a'.
However, the const declaration gives the illusion of always using the same backing array,
as is the case for a string's backing array.
In other words: 1. nothing is gained by const []T, and 2. run-time costs get hidden and
are thus potentially confusing.
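
For reference, the aliasing this comment is concerned with can be observed today with an ordinary variable. The following is a minimal sketch, not code from the thread: `shared` and `aliasDemo` are illustrative names, and `shared` is a `var` because Go rejects slice constants. Assigning a slice copies only the slice header, so a write through one copy is visible through the other.

```go
package main

import "fmt"

// shared stands in for the hypothetical constant c. It has to be a var,
// because the compiler does not accept slice constants.
var shared = []byte{1}

// aliasDemo writes through one copy of the slice and returns the first
// element as seen through a second copy.
func aliasDemo() byte {
	a := shared // copies the slice header, not the backing array
	a[0] = 42
	b := shared
	return b[0]
}

func main() {
	// Prints false: a and b refer to the same backing array.
	fmt.Println(aliasDemo() == 1)
}
```

For the snippet in the comment to print 'true', every use of `c` would need its own copy of the backing array, which is the hidden cost being described.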

@gopherbot
Author

Comment 4 by RickySeltzer:

If slice constants have this hidden cost problem, could we at least have constant arrays
and arrays of arrays?
 const sentence = [...][...]byte{"a", "series", "of", "pieces", "of", "text"}
1. This should be sufficiently const that it could, in the right environment, go into
ROM or the code segment.  That is, it should ideally be immutable.
2. I should be able to type it without getting RSI from repeating '[]byte' for every
word.
3. For byte structures like this, it isn't absolutely necessary that we would be allowed
to enter arbitrary runes, although that would be technically feasible, and useful. 
Especially for those who aren't English-language programmers.  And it would be
consistent with the rest of Go.

@gopherbot
Author

Comment 5 by RickySeltzer:

Change "arbitrary runes" ==> "arbitrary Unicode characters with large code points".

@griesemer
Contributor

Comment 6:

Some comments:
1) This is neither a defect of the language nor the design. The language was
_deliberately_ designed to only permit constants of basic types.
2) The implications of such a change are much more far-reaching than meets the eye:
there are numerous open questions that would have to be answered _satisfactorily_; and I
don't think we are there yet.
For instance, if we allow such constants, where is the limit? Do we allow constant maps?
What about constant channels? Constant pointers? Is it just a special case for slices?
etc.
A first step might be to allow constant arrays and structs as long as they are only
composed of fields that can have constant types themselves.
An even smaller step (which I do think we should do) is to make "foo"[i] a constant if i
is a constant (right now it's a non-constant byte).
Finally, note that it's often not very hard for a compiler to detect that a
package-level variable is never modified. Thus, an implementation may choose to optimize
the variable initialization and possibly compute it at compile time. At this point, the
const declaration only serves as documentation and for safety; there's no performance
loss anymore.
But again, we have tried to keep the type system (incl. what constants are) relatively
simple so that it doesn't get in the way. It's not clear the benefits are worth the
price in additional complexity.

Status changed to LongTerm.
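
As a sketch of the current rules mentioned in this comment: indexing a constant string yields an ordinary, non-constant byte, whereas `len` of a constant string is itself a constant. The identifiers `word`, `wordBuf`, and `firstByte` are illustrative only.

```go
package main

import "fmt"

const word = "foo"

// len of a constant string is a constant, so it may size an array type.
type wordBuf [len(word)]byte

// firstByte has to be a variable: word[0] is not a constant expression.
var firstByte = word[0]

// const bad = word[0] // rejected by the compiler: word[0] is not constant

func main() {
	var buf wordBuf
	fmt.Println(len(buf), firstByte)
}
```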

@gopherbot
Author

Comment 7 by RickySeltzer:

In many projects I have programmed, there is a need for non-writable initialized data
larger than a single variable.  This is analogous to an 'asm' directive that initializes
bits in the code segment.  Something that the compiler has to do anyway.
Also, embedded systems often need to put data into read-only-memory (ROM).  
This has more to do with storage class than the type system.  Declarations rather like
the following might be simplest:
const (
   data =  []byte("The quick brown fox jumped over the lazy dog.")
   π = float64(3.14159)
   bits = []uint32{0x12345678, 0xBabeAbed}
)
I don't see any additional complexity here.  But it might be easier to introduce a new
modifier, "readonly" to supplement "const", if for some reason the above chokes the
grammar. 
Putting data in the code segment is just my way of saying that it's simple.  Actually,
it would be a security risk.  Anything marked "const" or "readonly" should be NX (Not
eXecutable).

@gopherbot
Author

Comment 8 by RickySeltzer:

Also, the language and runtime currently make strings immutable.  So the "Hidden cost"
problem in #4, above, already exists.  Just extend this to const.

@cznic
Contributor

cznic commented Oct 19, 2013

Comment 9:

@8: No, you cannot do &str[expr], nor can you do str[expr] = expr, but you can do all of
that with a []byte, for example.
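
The distinction drawn here can be sketched as follows. This is an illustration rather than code from the thread, and `each` and `mutableCopy` are assumed names. Converting a string to `[]byte` copies the bytes, so the slice may be modified and addressed while the string stays unchanged.

```go
package main

import "fmt"

const each = "each"

// mutableCopy returns a fresh []byte holding the bytes of the constant.
// The conversion copies, so writes to the result never reach the string.
func mutableCopy() []byte {
	return []byte(each)
}

func main() {
	b := mutableCopy()

	// Allowed: slice elements are assignable.
	b[0] = 'E'

	// Allowed: slice elements are addressable.
	p := &b[1]
	*p = 'A'

	// Neither of the following compiles for a string:
	// each[0] = 'E'
	// _ = &each[0]

	fmt.Println(string(b), each)
}
```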

@rsc
Contributor

rsc commented Nov 27, 2013

Comment 10:

Labels changed: added go1.3maybe.

@gopherbot
Author

Comment 11 by RickySeltzer:

Ah.  It was a typo on my part to use slice notation []byte, instead of array notation,
[...]byte.  I don't propose constant slices.  I propose constant (initialized) arrays. 
See here: http://play.golang.org/p/eCn6ip--w0.

@rsc
Contributor

rsc commented Dec 4, 2013

Comment 12:

Labels changed: added release-none, removed go1.3maybe.

@rsc
Contributor

rsc commented Dec 4, 2013

Comment 13:

Labels changed: added repo-main.

@OneOfOne
Contributor

any news about this?

It'd be nice to have something like const keys = [...]string{"a", "b", ... }
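
One workaround available today, sketched here on the assumption that producing a fresh copy per call is acceptable, is to return the array from a function. Arrays have value semantics, so a caller can only ever modify its own copy. The function name `keys` and its contents are illustrative.

```go
package main

import "fmt"

// keys returns a new copy of the array on every call, so no caller can
// alter what later callers receive.
func keys() [3]string {
	return [...]string{"a", "b", "c"}
}

func main() {
	k := keys()
	k[0] = "changed" // modifies only the local copy
	fmt.Println(k[0], keys()[0])
}
```

This gives read-only behaviour by construction, but it is not a constant: `keys()` cannot be used in constant expressions.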

@griesemer
Contributor

This will not change in the foreseeable future.

The bar for language changes is extremely high at this point. "It'd be nice" is certainly not sufficient even as a starting point. To have a chance of even just being considered, there would need to be a full proposal together with a detailed analysis of cost and benefit.

In this specific case, extending the concept of constants beyond basic types would be a significant change with all kinds of repercussions. Regarding the comment in the initial issue report, I'd also like to add that the current situation is not a "defect" in the spec: it was conceived this way pretty much from day one, for very good reasons.

Leaving open for a future Go 2 if there will ever be one.

@griesemer added the v2 label (a language change or incompatible library change) on Feb 12, 2016
snsinfu added a commit to snsinfu/learn-go that referenced this issue May 8, 2017
This is a simple state machine implementation.

- Go has no array constants [1]
- Go has no builtin enum, use an idiom [2]

[1]: golang/go#6386
[2]: http://stackoverflow.com/a/14426447/5266681
@rsc changed the title from "spec: array constants" to "spec: allow constants of arbitrary data structure type" on Jun 17, 2017
@rsc changed the title from "spec: allow constants of arbitrary data structure type" to "proposal: spec: allow constants of arbitrary data structure type" on Jun 17, 2017
@ianlancetaylor
Contributor

This is about whether Go should have immutable values. There are a number of things to consider.

One thing to consider is how to handle

const s = [...]int{ f() }

That is, do the elements of a const array have to be const themselves? Is it possible to set up such an array in an init function?

People have already raised questions about const slices. Another question is whether const values are addressable. If I can take a pointer to a const value, such as a field in a const struct, presumably I can't modify the value through that pointer, but what stops me from doing that? This proposal doesn't have any language mechanism for distinguishing a normal pointer from a pointer to a const (which I think is a good thing). So something has to catch erroneous writes and, presumably, panic. If we can put the initializer in read-only memory then I guess that will happen automatically, but then it's hard to initialize the elements in a function.

One idea I've mentioned elsewhere is a freeze function, that would do a shallow copy of a value into memory that is then made read-only (somehow). I don't know if that can be implemented efficiently but it is an approach to this general problem that does not involve a language change.

@CAFxX
Contributor

CAFxX commented Apr 15, 2020

The only reason listed above to bar addressability is that it precludes initialization by a function; see #6386 (comment). Why is that a problematic limitation?

I may be missing something here, but even initialization by a function does not sound impossible. Excluding partial evaluation à la GraalVM, it should be possible to do initialization at runtime, taking care to store const values in pages that will be marked read-only as soon as initialization is over. It wouldn't be trivial to implement, but it shouldn't necessarily prevent addressability either.

@griesemer
Contributor

@CAFxX Marking memory holding constants as read-only after initialization would certainly provide the most flexibility. But w/o evaluation at compile-time, the compiler won't know the values of those constants, and transitively, the values of any constant expressions depending on those constants. A variety of compile-time checks would need to become runtime checks. Not impossible (and perhaps still backward-compatible), but something to consider.

Hence the question whether the much simpler approach would provide enough benefit.

@networkimprov

networkimprov commented Apr 15, 2020

As structs are typically passed by pointer, the inability to use a constant struct where a pointer is required is a rather severe limitation.

To support compile-time evaluation, could the compiler generate offsets into a table for constant addresses that gets populated at runtime?

@ianlancetaylor
Contributor

ianlancetaylor commented Apr 15, 2020

@changkun It's a good point that we should consider permitting const for values of interface type. But it's still hard to permit initializing them with a function while retaining safety.

@networkimprov The reason to ban addressability is that if a program can take the address of a value, then we have no completely reliable way of preventing that program from using that pointer to change the value. A const value that can be changed during program execution cannot really be described as a constant. Yes, this is a severe restriction.

The idea of making the memory read-only after it has been initialized is certainly tempting, but it's hard to see how to make that completely reliable in all cases.

Generating offsets into a table would imply that different values of the same type have different kinds of values. That seems difficult to implement.

These are all serious concerns and may well mean that the simple approach suggested in #6386 (comment) is not really useful.

@networkimprov

offsets into a table would imply that different values of the same type have different kinds of values

Sorry, I didn't follow; could you elaborate? (I meant that the compiler would replace pointers to constants with a table lookup yielding a pointer.)

making the memory read-only after it has been initialized is certainly tempting, but it's hard to see how to make that completely reliable

In what cases would that not work?

@ianlancetaylor
Contributor

@networkimprov Perhaps I don't understand your suggestion. The issue is how to handle a composite literal with an embedded pointer, where that pointer might point to some package-scope variable that can in turn be changed while the program is running. The problem is that the struct field (say) has a pointer type. I think you are suggesting that the pointer type could be represented not as a pointer, but as an index into a table. But then we have a pointer type that could be either a normal pointer value or an index into a table. How does arbitrary code, which may be in a different package, know how to interpret that value? As I say, perhaps I misunderstand your suggestion, so can you clarify? Thanks.

Making memory read-only after it has been initialized would not work if we can't reliably identify and collect all the memory in question. We can only mark entire pages as read-only, so we have to be sure that all read-only memory is in one set of pages while all modifiable memory is in a different set of pages. Suppose now that we initialize a constant value with a slice of some non-constant variable, where the values being sliced themselves have pointers. What should we do and how can we make it completely reliable?

@networkimprov

Suppose now we initialize a constant value with a slice of some non-constant variable ...
... a composite literal with an embedded pointer, where that pointer might point to some package-scope variable

Wouldn't those fail as initialization by non-constant expression? Shouldn't pointers that are constant only reference constants?

you are suggesting that the pointer type could be represented not as a pointer, but as an index into a table. But then we have a pointer type that could be either a normal pointer value or an index into a table.

Apologies if this is naive, but I'm suggesting the compiler produce a complete table lookup at any site taking the address of a constant. It can't just substitute an offset for an address.

const kVa = 1 // address placed in const_addrs at startup
const kVb = 2 // ditto
const kVc = 3    // not in const_addrs
const kPa = &kVa // not in const_addrs
...
f(&kVa, &kVb) // compiles to f((*int)(const_addrs + 0), (*int)(const_addrs + 8))

func f(a, b *int) { *a = 9 } // runtime error for above call

*kPa = 9 // compiler error

@ianlancetaylor
Contributor

OK, even if we only permit pointers to constant values, still other code can pull out those pointers and modify the values to which they point, so that can only be permitted if we can be absolutely certain that we can always put those values in read-only memory.

I think I now understand what you are getting at with your constant offset, but I don't understand how that solves the problem. That is an implementation that we could use if we assume that we can always reliably identify which memory has to be read-only, and if we assume that we have a way to initialize that memory and then make it read-only. I find it hard to convince myself that both of those are true on all the platforms for which Go works.

@networkimprov

The offsets into a table of addresses are meant to support compile-time evaluation, where each address of a constant needs a compile-time value, raised above by @griesemer.

On which platforms do you think read-only memory could be difficult to achieve, and why? Maybe we could pull in domain experts to comment on them...

In what cases would it be difficult to determine that an object should be read-only?

@griesemer
Contributor

@networkimprov Let's look at a concrete example:

type List struct {
    next *List
    data int
}

const c = List{next: &List{next: &List{}}}
var v = List{next: &List{next: &List{}}}

Are you suggesting that the next field behaves differently when we have a "constant list" vs in the normal case (offset vs regular pointer)?

@ianlancetaylor
Contributor

Consider wasm, for example. Also, the bare metal tamago port that is under discussion.

@CAFxX
Contributor

CAFxX commented Apr 16, 2020

I completely agree that there may be simpler options with a narrower scope. But just to ensure I am getting the right picture:

But w/o evaluation at compile-time, the compiler won't know the values of those constants, and transitively, the values of any constant expressions depending on those constants.

The compiler wouldn't know the values of those constants(-at-runtime), but would know their addresses, so other constant expressions that depend on those values would become constant(-at-runtime) as well, i.e. transitively be initialized at startup, and then marked read-only.

The compiler would need to track that these are constants(-at-runtime) and report an error if code attempts to modify their value.

In a sense, they would behave like const for writes, and like var for reads.

As compilers improve and are progressively able to partially evaluate more at compile-time, more of these constant-at-runtime instances would turn into regular constants and not require initialization at runtime.

A variety of compile-time checks would need to become runtime checks.

Even in the model described just now? Wouldn't marking read-only the memory used to store the constants at runtime sidestep additional runtime checks?

@griesemer
Contributor

@CAFxX No, it wouldn't. For instance, given an array a and a constant expression x, right now a[x] will not require a runtime index bounds check. But if we don't know the exact value of x, it will need an index bounds check at runtime. The compiler may become smarter over time, but we still could not guarantee constant evaluation in general.

I am not saying this is a show-stopper, but it is something that would have to be taken into account and which would change existing behavior. What now leads to a compile-time error may lead to a runtime panic if the compiler cannot evaluate all constant values at compile time.
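
The difference between the two kinds of check can be illustrated with today's Go. This is a sketch, and `table` and `at` are assumed names. A constant out-of-range index on an array is rejected by the compiler, while the same index held in a variable is only caught when the program runs.

```go
package main

import "fmt"

var table = [3]int{10, 20, 30}

// at indexes table with a run-time value and converts the resulting
// panic into an error so the caller can observe it.
func at(i int) (v int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return table[i], nil
}

func main() {
	// Constant index: the following line does not compile.
	// _ = table[5]

	// Variable index: the bounds check happens at run time.
	_, err := at(5)
	fmt.Println(err != nil)
}
```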

@networkimprov

networkimprov commented Apr 16, 2020

Are you suggesting that the next field behaves differently when we have a "constant list" ...?

EDIT: there would be no difference to the user between the two List chains you defined, except that for the const one:
a) c.next.data = v gives a compiler error,
b) var p = &c ... p.next.data = v gives a runtime error.

If you printed a .next field, you'd see an address from the const_addrs table I described in #6386 (comment).

Note that I am not advocating "constant-at-runtime" semantics.

Consider wasm, for example. Also, the bare metal tamago port that is under discussion.

@ianlancetaylor wasm lacks rather a lot of features; should a nascent world limit Go progress on mature platforms? And it appears this issue has been raised: WebAssembly/design#1278.

Re tamago, I'd imagine they'll happily evolve as Go does; cc @abarisani.

@griesemer
Contributor

@networkimprov What happens if one assigns that const list c to variable v in my example?

@networkimprov

networkimprov commented Apr 16, 2020

That copies the outermost List into a variable, with same effect as:

const kL = List{next: &List{}}
var   vL = List{next: &kL}
...
vL.data = v      // ok
kL.data = v      // compiler error
kL.next.data = v // compiler error
vL.next.data = v // runtime error
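
For comparison, this is how the same copy behaves with variables today. It is a sketch only: `original` stands in for the hypothetical constant list and `copyAndWrite` is an illustrative helper. Assigning a struct copies it shallowly, so the outer node is duplicated while the nodes reached through `next` remain shared.

```go
package main

import "fmt"

type List struct {
	next *List
	data int
}

// original stands in for the hypothetical constant list.
var original = List{next: &List{data: 1}}

// copyAndWrite copies the outer node, writes through the copy, and returns
// what the original's outer and inner nodes hold afterwards.
func copyAndWrite() (outer, inner int) {
	v := original   // shallow copy: v.next == original.next
	v.data = 7      // affects only the copy
	v.next.data = 9 // affects the node shared with original
	return original.data, original.next.data
}

func main() {
	fmt.Println(copyAndWrite())
}
```

Under the proposal sketched in the comment, the last write is the one that would become a runtime error, because the inner node would live in read-only memory.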

@abarisani

Re tamago, I'd imagine they'll happily evolve as Go does; cc @abarisani.

I can confirm.

@marcel

marcel commented Apr 17, 2020

It is unbelievable that we cannot write constant errors:

- const ErrVerification = errors.New("crypto/rsa: verification error")
+ var ErrVerification = errors.New("crypto/rsa: verification error")

You can though @changkun...

type Error string

func (e Error) Error() string {
  return string(e)
}

const ErrVerification = Error("crypto/rsa: verification error")

@changkun
Member

You can though @changkun...

type Error string

func (e Error) Error() string {
  return string(e)
}

const ErrVerification = Error("crypto/rsa: verification error")

No. It does not solve the problem generally, because this approach cannot be applied to const errChain = wrap(errArgs). The fundamental problem is that we are missing immutable data, whereas typecasting here is just a verbose workaround for unpacked error.
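
Both sides of this exchange can be seen in a small sketch. The `Error` type is the one quoted above; `ErrWrapped` and `wrapsVerification` are illustrative names, and the message text is shortened. A string-based error can be a constant, but an error built by a function call such as `fmt.Errorf` can only be a variable.

```go
package main

import (
	"errors"
	"fmt"
)

// Error is a string-based error type, so its values can be constants.
type Error string

func (e Error) Error() string { return string(e) }

const ErrVerification = Error("verification error")

// ErrWrapped is produced by a function call, so it cannot be declared const.
// Being an exported variable, it could be reassigned by any importing package.
var ErrWrapped = fmt.Errorf("rsa: %w", ErrVerification)

// wrapsVerification reports whether ErrWrapped still wraps ErrVerification.
func wrapsVerification() bool {
	return errors.Is(ErrWrapped, ErrVerification)
}

func main() {
	// ErrVerification = Error("x") // does not compile: cannot assign to a constant
	fmt.Println(wrapsVerification())
}
```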

@ValarDragon

ValarDragon commented Sep 11, 2021

Just wanted to bump / voice immense support for this. As @changkun notes, this is honestly critical for the security of the code. Additionally, it's huge for API safety. There's a large class of situations where you want publicly exported structs that other modules should be able to read, but not accidentally set.

(I mean these can still be gotten around via many unsafe hacks, but that's not the point. The point is to get people to not accidentally mess up, due to the inability to provide a safe API.)
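
Without language support, the usual approximation is to keep the data unexported and hand out copies through accessors. The following is a minimal sketch of that pattern. The type `Limits`, its fields, and the accessor names are all hypothetical, and everything sits in `package main` only to keep the example self-contained. In practice the type would live in its own package.

```go
package main

import "fmt"

// Limits is an illustrative configuration type. Its fields are unexported,
// so other packages could read them only through the methods below.
type Limits struct {
	maxConns int
	hosts    []string
}

var defaults = Limits{maxConns: 8, hosts: []string{"a", "b"}}

// Defaults returns a copy of the package defaults.
func Defaults() Limits { return defaults }

// MaxConns returns the connection limit.
func (l Limits) MaxConns() int { return l.maxConns }

// Hosts returns a copy of the host list so callers cannot alter the original.
func (l Limits) Hosts() []string {
	return append([]string(nil), l.hosts...)
}

func main() {
	h := Defaults().Hosts()
	h[0] = "changed" // modifies only the caller's copy
	fmt.Println(Defaults().Hosts()[0], Defaults().MaxConns())
}
```

The cost is the boilerplate and the copying, which is part of what the comment is objecting to.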

@go101

go101 commented Mar 5, 2022

Just a reminder that, if arrays could be declared as constants, then there is a parsing ambiguity in the following code:

type C[T [int]*string] struct{}

@go101

go101 commented May 4, 2022

It looks like the custom generics feature has sentenced this proposal to death.

const S = [2]int{1, 2}
const T = 2

// The following declaration is always thought as a generic type declaration.
type BoolArray [S[1] * T]bool // T (untyped int constant 2) is not a type

// If S is a constant array, then the above line may also be viewed as an array type declaration.

[update]: To avoid absolutely sentencing the constant array proposal to death,
elements of constant array must not be treated as constants,
just as elements of constant strings are not treated as constants (they are just immutable but not constants).

@leaxoy

This comment was marked as duplicate.

@myaaaaaaaaa

I think a more interesting approach would be to try and adapt the untyped nature of Go's consts to other Kinds:

package main

func main() {
	const aConst = {
		1: "a",
		3: "b",
		5: "c",
	}
	var aMap1 map[int]string = aConst // OK
	var aMap2 map[any]any    = aConst // OK
	var aSlice []string      = aConst // OK
	var aArray [4]string     = aConst // Index out of bounds error

	const bConst = {
		"name": "bob",
		"age":  30,
	}
	var bMap1 map[string]any = bConst // OK
	var bMap2 map[string]int = bConst // Incompatible assign error due to "name": "bob"
	var bStruct1 struct {             // OK
		name  string
		email string
		age   int
	} = bConst
	var bStruct2 struct { // Incompatible assign error due to lacking an "age" field
		name string
	} = bConst
}
