proposal: Go 2: read-only types #22876

Open
jba opened this issue Nov 25, 2017 · 49 comments
Labels
LanguageChange, NeedsInvestigation, Proposal, v2

@jba
Contributor

jba commented Nov 25, 2017

I propose adding read-only types to Go. Read-only types have two related benefits:

  1. The compiler guarantees that values of read-only type cannot be changed, eliminating unintended modifications that can cause subtle bugs.
  2. Copying as a defense against modification can be reduced, improving efficiency.

An additional minor benefit is the ability to take the address of constants.

This proposal makes significant changes to the language, so it is intended for Go 2.

All new syntax in this proposal is provisional and subject to bikeshedding.

Basics

All types have one of two permissions: read-only or read-write. Permission is a property of types, but I sometimes write "read-only value" to mean a value of read-only type.

A type preceded by ro is a read-only type. The identifier ro is pronounced row. It is a keyword. There is no notation for the read-write permission; any type not marked with ro is read-write.

The ro modifier can be applied to slices, arrays, maps, pointers, structs, channels and interfaces. It cannot be applied to any other type, including a read-only type: ro ro T is illegal.

It is a compile-time error to

  • modify a value of read-only type,
  • pass a read-only slice as the first argument of append,
  • use slicing to extend the length of a read-only slice,
  • or send to or receive from a read-only channel.

A value of read-only type is not necessarily immutable, because it may also be referenced through another type that is not read-only.

Examples:

  1. A function can assert that it will not modify its argument.
func transmit(data ro []byte) { ... }

The compiler guarantees that the bytes of data will not be altered by transmit.

  2. A method can return an unexported field of its type without fear that it will be changed by the caller.
type BufferedReader struct {
  buf []byte
}

func (b *BufferedReader) Buffer() ro []byte {
  return b.buf
}
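
For comparison, here is a minimal sketch of the choice Example 2 faces in Go as it stands today, without ro. BufferCopy is a hypothetical name added for illustration. The method either hands out its internal slice and trusts the caller, or pays for a copy.

```go
package main

import "fmt"

// BufferedReader mirrors the type in the example above.
type BufferedReader struct {
	buf []byte
}

// Buffer returns the internal slice directly, so callers can modify it.
func (b *BufferedReader) Buffer() []byte { return b.buf }

// BufferCopy (hypothetical) returns a defensive copy instead.
func (b *BufferedReader) BufferCopy() []byte {
	out := make([]byte, len(b.buf))
	copy(out, b.buf)
	return out
}

func main() {
	r := &BufferedReader{buf: []byte("abc")}

	r.BufferCopy()[0] = 'X'    // writes to the copy only
	fmt.Println(string(r.buf)) // abc

	r.Buffer()[0] = 'X'        // writes to the reader's own state
	fmt.Println(string(r.buf)) // Xbc
}
```

With a return type of ro []byte, the second write would be rejected at compile time and the copy would be unnecessary.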

This proposal is concerned exclusively with avoiding modifications to values, not variables. Thus it allows assignment to variables of read-only type.

var EOF ro error = errors.New("EOF")
...
EOF = nil

One could imagine a companion proposal that also used ro, but to restrict assignment:

ro var EOF = ... // cannot assign to EOF

I don't pursue that idea here.

Conversions

There is an automatic conversion from T to ro T. For instance, an actual parameter of type []int can be passed to a formal parameter of type ro []int. This conversion operates at any level: a [][]int can be converted to a []ro []int for example.

There is an automatic conversion from string to ro []byte. It does not apply to nested occurrences: there is no conversion from [][]string to []ro []byte, for example.

(Rationale: ro does not change the representation of a type, so there is no cost to adding ro to any type, at any depth. A constant-time change in representation is required to convert from string to ro []byte because the latter is one word larger. Applying this change to every element of a slice, array or map would require a complete copy.)

Transitivity

Permissions are transitive: a component retrieved from a read-only value is treated as read-only.

For example, consider var a ro []*int. It is not only illegal to assign to a[i]; it is also illegal to assign to *a[i].

Transitivity increases safety, and it can also simplify reasoning about read-only types. For example, what is the difference between ro *int and *ro int? With transitivity, the first is equivalent to ro *ro int, so the difference is just the permission of the full type.
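
A small sketch in current Go of why the permission has to be transitive. Protecting only the outer slice, here by copying it, leaves the ints behind the pointers writable. shallowCopy is an illustrative helper, not part of the proposal.

```go
package main

import "fmt"

// shallowCopy duplicates the slice and the pointers it holds,
// but not the ints those pointers refer to.
func shallowCopy(a []*int) []*int {
	out := make([]*int, len(a))
	copy(out, a)
	return out
}

func main() {
	n := 1
	a := []*int{&n}

	c := shallowCopy(a)
	*c[0] = 42 // writes through the shared pointer

	fmt.Println(*a[0]) // 42
}
```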

The Address Operator

If v has type ro T, then &v has type *ro T.

If v has type T, then ro &v has type ro *T. This bit of syntax simplifies constructing read-only pointers to struct literals, like ro &S{a: 1, b: 2}.

Taking the address of constants is permitted, including constant literals. If c is a constant of type T, then &c is of type ro *T and is equivalent to

func() ro *T { v := c; return &v }()

Read-Only Interfaces

Any method of an interface may be preceded by ro. This indicates that the receiver of the method must have read-only type.

type S interface {
   ro Marshal() ([]byte, error)
   Unmarshal(ro []byte) error
}

If I is an interface type, then ro I is effectively the sub-interface that contains just the read-only methods of I. If type T implements I, then type ro T implements ro I.

Read-only interfaces can prevent code duplication that might otherwise result from the combination of read-only types and interfaces. Consider the following code from the sort package:

type Interface interface {
	Less(i, j int) bool
	Len() int
	Swap(i, j int)
}
 
func Sort(data Interface) {
	… code using Less, Len, and Swap …
}
 
func IsSorted(data Interface) bool {
	… code using only Less and Len …
}

type IntSlice []int
func (x IntSlice) Less(i, j int) bool { return x[i] < x[j] }
func (x IntSlice) Len() int { return len(x) }
func (x IntSlice) Swap(i, j int) { x[i], x[j] = x[j], x[i] }
 
func Ints(a []int) { // invoked as sort.Ints
	Sort(IntSlice(a))
}
 
func IntsAreSorted(a []int) bool {
	return IsSorted(IntSlice(a))
}

We would like to allow IntsAreSorted to accept a read-only slice, since it does not change its argument. But we cannot convert ro []int to IntSlice, because the Swap method modifies its receiver. It seems we must copy code somewhere.

The solution is to mark the first two methods of the interface as read-only:

type Interface interface {
	ro Less(i, j int) bool
	ro Len() int
	Swap(i, j int)
}

func (x ro IntSlice) Less(i, j int) bool { return x[i] < x[j] }
func (x ro IntSlice) Len() int { return len(x) }

Now we can write IsSorted in terms of the read-only sub-interface:

func IsSorted(data ro Interface) bool {
	… code using only Less and Len …
}

and call it on a read-only slice:

func IntsAreSorted(a ro []int) bool {
	return IsSorted(ro IntSlice(a))
}
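
For readers who want to try the shape of this in today's Go, the read-only sub-interface can be written out by hand. This is only a sketch: Ordered is a hypothetical name, and nothing here stops an implementation of Less or Len from modifying its receiver, which is the guarantee ro would add.

```go
package main

import "fmt"

// Ordered is a hand-written stand-in for the proposal's "ro Interface":
// only the methods that do not need to modify the receiver.
type Ordered interface {
	Len() int
	Less(i, j int) bool
}

// Interface adds the mutating method, as in package sort.
type Interface interface {
	Ordered
	Swap(i, j int)
}

// IsSorted needs only the read-only methods.
func IsSorted(data Ordered) bool {
	for i := 1; i < data.Len(); i++ {
		if data.Less(i, i-1) {
			return false
		}
	}
	return true
}

type IntSlice []int

func (x IntSlice) Len() int           { return len(x) }
func (x IntSlice) Less(i, j int) bool { return x[i] < x[j] }
func (x IntSlice) Swap(i, j int)      { x[i], x[j] = x[j], x[i] }

// IntSlice satisfies the full interface as well as the sub-interface.
var _ Interface = IntSlice(nil)

func main() {
	fmt.Println(IsSorted(IntSlice{1, 2, 3})) // true
	fmt.Println(IsSorted(IntSlice{3, 1, 2})) // false
}
```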

Permission Genericity

One of the problems with read-only types is that they lead to duplicate functions. For example, consider this trivial function, ignoring its obvious problem with zero-length slices:

func tail1(x []int) []int { return x[1:] }

We cannot call tail1 on values of type ro []int, but we can take advantage of the automatic conversion to write

func tail2(x ro []int) ro []int { return x[1:] }

Thanks to the conversion from read-write to read-only types, tail2 can be passed an []int. But it loses type information, because the return type is always ro []int. So the first of these calls is legal but the second is not:

var a = []int{1,2,3}
a = tail1(a)
a = tail2(a) // illegal: attempt to assign ro []int to []int

If we had to write two variants of every function like this, the benefits of read-only types would be outweighed by the pain they cause.

To deal with this problem, most programming languages rely on overloading. If Go had overloading, we would name both of the above functions tail, and the compiler would choose which to call based on the argument type. But we do not want to add overloading to Go.

Instead, we can add generics to Go—but just for permissions. Hence permission genericity.

Any type inside a function, including a return type, may be preceded by ro? instead of ro. If ro? appears in a function, it must appear in the function's argument list.

A function with an ro? argument a must type-check in two ways:

  • a has type ro T and ro? is treated as ro.
  • a has type T and ro? is treated as absent.

In calls to a function with a return type ro? T, the effective return type is T if the ro? argument a is a read-write type, and ro T if a is a read-only type.

Here is tail using this feature:

func tail(x ro? []int) ro? []int { return x[1:] }

tail type-checks because:

  • With x declared as ro []int, the slice expression can be assigned to the effective return type ro []int.
  • With x declared as []int, the slice expression can be assigned to the effective return type []int.

This call succeeds because the effective return type of tail is ro []int when the argument is ro []int:

var a = ro []int{1,2,3}
a = tail(a)

This call also succeeds, because tail returns []int when its argument is []int:

var b = []int{1,2,3}
b = tail(b)
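
The "return type follows the argument type" behavior can be loosely imitated with type parameters (Go 1.18 or later, which postdates this proposal). This is an analogy under stated assumptions, not an implementation of ro?. Frozen is a hypothetical named type standing in for ro []int, and nothing prevents writes to it.

```go
package main

import "fmt"

// Frozen is a hypothetical stand-in for "ro []int". It enforces nothing;
// it only makes the two call sites below have different static types.
type Frozen []int

// tail has a single implementation. S is inferred at each call site,
// so the result type matches the argument type.
func tail[S ~[]int](x S) S { return x[1:] }

func main() {
	var a = Frozen{1, 2, 3}
	a = tail(a) // result has type Frozen

	var b = []int{1, 2, 3}
	b = tail(b) // result has type []int

	fmt.Println(a, b) // [2 3] [2 3]
}
```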

Multiple, independent permissions can be expressed by using ro?, ro??, etc. (If the only feasible type-checking algorithm is exponential, implementations may restrict the number of distinct ro?... forms in the same function to a reasonable maximum, like ten.)

In an interface declaration, ro? may be used before the method name to refer to the receiver.

type I interface {
  ro? Tail() ro? I
}

There are no automatic conversions from function signatures using ro? to signatures that do not use ro?. Such conversions can be written explicitly. Examples:

func tail(x ro? []int) ro? []int { return x[1:] }

var (
    f1 func(x ro? []int) ro? []int = tail  // legal: same type
    f2 func(ro []int) ro []int = tail      // illegal: attempted automatic conversion
    f3 = (func(ro []int) ro []int)(tail)   // legal: explicit conversion
)

Permission genericity can be implemented completely within the compiler. It requires no run-time support. A function annotated with ro? requires only a single implementation.

Strengths of This Proposal

Fewer Bugs

The use of ro should reduce the number of bugs where memory is inadvertently modified. There will be fewer race conditions where two goroutines modify the same memory. One goroutine can still modify the memory that another goroutine reads, so not all race conditions will be eliminated.

Less Copying

Returning a reference to a value's unexported state can safely be done without copying the state, as shown in Example 2 above.

Many functions take []byte arguments. Passing a string to such a function requires a copy. If the argument can be changed to ro []byte, the copy won't be necessary.
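
A sketch of the copy in question, in current Go. sum is a made-up function standing in for any API that takes []byte. The language defines the conversion []byte(s) as producing a fresh slice, so writes to the result can never reach the string.

```go
package main

import "fmt"

// sum stands in for any library function that takes []byte.
func sum(data []byte) int {
	total := 0
	for _, b := range data {
		total += int(b)
	}
	return total
}

func main() {
	s := "abc"

	b := []byte(s) // the conversion yields a new slice holding a copy of s
	b[0] = 'X'     // so this write cannot affect s

	fmt.Println(s, string(b)) // abc Xbc
	fmt.Println(sum([]byte(s))) // 294
}
```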

Clearer Documentation

Function documentation often states conditions that promise that the function doesn't modify its argument, or that extract a promise from the caller not to modify a return value. If ro arguments and return types are used, those conditions are enforced by the compiler, so they can be deleted from the documentation. Furthermore, readers know that in a well-designed function, a non-ro argument will be written along at least one code path.

Better Static Analysis Tools

Read-only annotations will make it easier for some tools to do their job. For example, consider a tool that checks whether a piece of memory is modified by a goroutine after it sends it on a channel, which may indicate a race condition. Of course if the value is itself read-only, there is nothing to do. But even if it isn't, the tool can do its job by checking for writes locally, and also observing that the value is passed to other functions only via read-only argument. Without ro annotations, the check would be difficult (requiring examining the code of functions not in the current package) or impossible (if the call was through an interface).

Less Duplication in the Standard Library

Many functions in the standard library can be removed, or implemented as wrappers over other functions. Many of these involve the string and []byte types.

If the io.Writer.Write method's argument becomes read-only, then io.WriteString is no longer necessary.

Functions in the strings package that do not return strings can be eliminated if the corresponding bytes method uses ro. For example, strings.Index(string, string) int can be eliminated in favor of (or can trivially wrap) bytes.Index(ro []byte, ro []byte) int. This amounts to 18 functions (including Replacer.WriteString). Also, the strings.Reader type can be eliminated.

Functions that return string cannot be eliminated, but they can be implemented as wrappers around the corresponding bytes function. For example, bytes.ToLower would have the signature func ToLower(s ro? []byte) ro? []byte, and the strings version could look like

func ToLower(s string) string {
    return string(bytes.ToLower(s))
}

The conversion to string involves a copy, but ToLower already contains a conversion from []byte to string, so there is no change in efficiency.

Not all strings functions can wrap a bytes function with no loss of efficiency. For instance, strings.TrimSpace currently does not copy, but wrapping it around bytes.TrimSpace would require a conversion from []byte to string.

Adding ro to the language without some sort of permission genericity would result in additional duplication in the bytes package, since functions that returned a []byte would need a corresponding function returning ro []byte. Permission genericity avoids this additional duplication, as described above.

Pointers to Literals

Sometimes it's useful to distinguish the absence of a value from the zero value. For example, in the original Google protobuf implementation (still used widely within Google), a primitive-typed field of a message may contain its default value, or may be absent.

The best translation of this feature into Go is to use pointers, so that, for example, an integer protobuf field maps to the Go type *int. That works well except for initialization: without pointers to literals, one must write

i := 3
m := &Message{I: &i}

or use a helper function.
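
One possible helper, sketched with type parameters (Go 1.18 or later, so not available when this was written). The name ptr and the Message type are assumptions for illustration. Note that the pointer it returns is read-write, unlike the ro *int this proposal would give to &3.

```go
package main

import "fmt"

// Message is a hypothetical message type with an optional integer field.
type Message struct {
	I *int
}

// ptr returns a pointer to a fresh copy of v.
func ptr[T any](v T) *T { return &v }

func main() {
	m := &Message{I: ptr(3)}
	fmt.Println(*m.I) // 3

	*m.I = 4 // allowed: the pointer is read-write
	fmt.Println(*m.I) // 4
}
```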

In Go as it currently stands, an expression like &3 cannot be permitted because assignment through the resulting pointer would be problematic. But if we stipulate that &3 has type ro *int, then assignment is impossible and the problem goes away.

Weaknesses of This Proposal

Loss of Generality

Having both T and ro T in the language reduces the opportunities for writing general code. For example, an interface method with a []int parameter cannot be satisfied by a concrete method that takes ro []int. A function variable of type func() ro []int cannot be assigned a function of type func() []int. Supporting these cases would start Go down the road of covariance/contravariance, which would be another large change to the language.

Problems Going from string to ro []byte

When we change an argument from string to ro []byte, we may eliminate copying at the call site, but it can reappear elsewhere because the guarantee is weaker: the argument is no longer immutable, so it is subject to change by code outside the function. For example, os.Open returns an error that contains the filename. If the filename were not immutable, it would have to be copied into the error message. Data structures like caches that need to remember their methods' arguments would also have to copy.
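
A sketch of the hazard, using []byte in current Go since a ro []byte argument would behave the same way from the callee's point of view. keyLog and its methods are hypothetical. A structure that remembers its arguments has to copy them, because the caller may still change the bytes.

```go
package main

import "fmt"

// keyLog is a hypothetical structure that remembers its arguments.
type keyLog struct {
	keys [][]byte
}

// addAlias stores the caller's slice directly.
func (l *keyLog) addAlias(k []byte) { l.keys = append(l.keys, k) }

// addCopy stores a private copy of the bytes.
func (l *keyLog) addCopy(k []byte) {
	l.keys = append(l.keys, append([]byte(nil), k...))
}

func main() {
	var l keyLog
	name := []byte("a.txt")

	l.addAlias(name)
	l.addCopy(name)

	name[0] = 'b' // the caller reuses its buffer

	fmt.Println(string(l.keys[0]), string(l.keys[1])) // b.txt a.txt
}
```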

Also, replacing string with ro []byte would mean that implementers could no longer compare via operators, range over Unicode runes, or use values as map keys.

Subsumed by Generics

Permission genericity could be subsumed by a suitably general design for generics. No such design for Go exists today. All known constraints on generic types use interfaces to express that satisfying types must provide all the interface's methods. The only other form of constraint is syntactic: for instance, one can write []T, where T is a generic type variable, enforcing that only slice types can match. What is needed is a constraint of the form "T is either []S or ro []S", that is, permission genericity. A generics proposal that included permissions would probably drop the syntax of this proposal and use identifiers for permissions, e.g.

gen <T, perm Ro> func tail(x Ro []T) Ro []T { return x[1:] }

Missing Immutability

This proposal lacks a permission for immutability. Such a permission has obvious charms: immutable values are goroutine-safe, and conversion between strings and immutable byte slices would work in both directions.

The problem is how to construct immutable values. Literals of immutable type would only get one so far. For example, how could a program construct an immutable slice of the first N primes, where N is a parameter? The two easy answers—deep copying, or letting the programmer assert immutability—are both unpalatable. Other solutions exist, but they would require additional features on top of this proposal. Simply adding an im keyword would not be enough.

Does Not Prevent Data Races

A value cannot be modified through a read-only reference, but there may be other references to it that can be modified concurrently. So this proposal prevents some but not all data races. Modern languages like Rust, Pony and Midori have shown that it is possible to eliminate all data races at compile time. But the cost in complexity is high, and the value unclear—there would still be many opportunities for race conditions. If Go wanted to explore this route, I would argue that the current proposal is a good starting point.

References

Brad Fitzpatrick's read-only slice proposal

Russ Cox's evaluation of the proposal. This document identifies the problem with the sort package discussed above, and raises the problem of loss of generality as well as the issues that arise in moving from string to ro []byte.

Discussion on golang-dev

@jba added the v2 and Proposal labels Nov 25, 2017
@gopherbot added this to the Proposal milestone Nov 25, 2017
@ianlancetaylor changed the title from "proposal: read-only types" to "proposal: Go 2: read-only types" Nov 26, 2017
@ianlancetaylor
Contributor

I understand the desire for this kind of thing, but I am not particularly fond of this kind of proposal. This approach seems very similar to the const qualifier in C, with the useful addition of permission genericity. I wrote about some of my concerns with const in https://www.airs.com/blog/archives/428.

You've identified the problems well: this does not provide immutability, and it does not avoid data races. I would like to see a workable proposal for immutability, and I would love to see one that avoids data races. This is not those proposals.

Using ro in a function parameter amounts to a promise that the function does not change the contents of that argument. That is a useful promise, but it is one of many possible useful promises. Is there a reason beyond familiarity with C that we should elevate this promise into the language? Go programs often rely on documentation rather than enforcement. There are many structs with exported fields with documentation about who is permitted to modify those fields. Similarly we document that a Write method that implements io.Writer may not modify its argument slice. Why put one promise into the language but not the other?

In general this is an area where experience reports can help guide Go 2 development. Does this proposal help with real problems that Go programmers have encountered?

@jba
Contributor Author

jba commented Nov 26, 2017

I would like to see a workable proposal for immutability, and I would love to see one that avoids data races. This is not those proposals.

I'm continuing to think about those things, but I wanted to get this proposal out there for two reasons. One, I think any proposal for immutability will have this as a subset. ro T is a subtype of both T and im T, so it will likely show up in any reasonable proposal involving im. (Permission genericity gets around using ro for functions, but you still might want it for data. Consider a data structure that wants to store both T and im T.) It's probably not an accident that Rust, Pony and Midori all have read-only types in addition to immutable ones.

The second reason I wanted to share this is that it serves as a counterexample to anyone who thinks adding read-only types to Go is just a matter of adding a keyword.

Does this proposal help with real problems that Go programmers have encountered?

Yes. At the recent Google-internal Go conference, @thockin specifically asked for const, citing bugs in Kubernetes due to inadvertent modification of values returned from caches. I think Alex Turcu also mentioned that he wanted something like this for an internal video ads system.

@ianlancetaylor
Contributor

What do you think of a builtin freeze function that returns an immutable shallow copy of an object? That would fix the cache problem without modifying the type system. (The returned value would be immutable in that any attempt to modify it would cause a run time panic.)

@jba
Contributor Author

jba commented Nov 28, 2017

Out of curiosity, how does that work? And how does it detect modification of a nested value?

@ianlancetaylor
Contributor

I don't know exactly how it works, which is why I haven't written a proposal for it. One conceivable implementation would be to mmap a new page, copy the object in, and then to mprotect that page, but the difficulties are obvious.

For a nested value, you use freeze multiple times, as desired.

@neild
Contributor

neild commented Nov 28, 2017

I'm not following the distinction between values and variables in this proposal. Why is the modification of the value stored in a permitted below?

var a ro int
a = 1 // Modifying an ro int via a variable.

var b *ro int = &a
*b = 1 // Modifying an ro int via a pointer.

A nitpick: Pointers-to-constants are entirely orthogonal to the rest of this proposal and (IMO) distract from the meat of it. Go already has syntax for constructing non-zero pointers to compound types; providing a similar facility for non-compound types does not require the addition of read-only values to the language. e.g., #19966.

@willfaught
Contributor

@jba Could you accomplish the same thing with overriding type operations? Is it important that the read-only property be at the type level? For example, string (basically a read-only []byte) could be defined as something like this:

type string []byte

func (s string) []=(index int) byte {
    panic("not supported")
}

This doesn't require any changes to the type system, and seems to be backward-compatible at first glance.

@jba
Contributor Author

jba commented Nov 29, 2017

@neild ro int is the same thing as int (actually, I disallow it, but that could go either way). ints are already immutable: you can't modify an int, only copy and change it. So your code is equivalent to

var a int
a = 1 // Modifying an ro int via a variable.

var b *int = &a
*b = 1 // Modifying an ro int via a pointer.

and of course both of those assignments are equally legal. The assignment in

var c ro *int = &a
*c = 1

would not be, but c itself could be changed.

I'm trying to avoid proposing both a type modifier and what C would call a "storage class," out of hygiene. (See Ian's blog post that he linked to above for a criticism of how C const conflates those.)

@jba
Contributor Author

jba commented Nov 29, 2017

@willfaught I don't think operator overloading is a good fit for Go. One of the nice things about the language is that every bit of syntax has a fixed meaning.

@willfaught
Contributor

It seems identical to how methods and embedding work. Like the selector a.b, the operation a[b] could also be overridden. It would simplify things for operators to just be methods (that can be aggressively inlined).

@neild
Contributor

neild commented Nov 30, 2017

@jba Your proposal says:

Transitivity increases safety, and it can also simplify reasoning about read-only types. For example, what is the difference between ro *int and *ro int? With transitivity, the first is equivalent to ro *ro int, so the difference is just the permission of the full type.

The existence of *ro int implies the existence of ro int, doesn't it? If not, why not and what is the type of *p where p is a *ro int?

You also say:

It is a compile-time error to modify a value of read-only type,

I can't square this with it being legal to modify the value of c in your example:

var c ro *int = &a

The variable c has type ro *int. The value contained within c is a value of read-only type. Why can it be modified?

@ianlancetaylor
Contributor

@willfaught Operator overloading is a very different idea that should be discussed in a separate proposal, not this one.

@jba
Contributor Author

jba commented Dec 4, 2017

Transitivity increases safety, and it can also simplify reasoning about read-only types. For example, what is the difference between ro *int and *ro int? With transitivity, the first is equivalent to ro *ro int, so the difference is just the permission of the full type.

The existence of *ro int implies the existence of ro int, doesn't it? If not, why not and what is the type of *p where p is a *ro int?

That's a bug in my proposal. I chose a poor example. Replace int with []int.

It is a compile-time error to modify a value of read-only type,

I can't square this with it being legal to modify the value of c in your example:

var c ro *int = &a

The variable c has type ro *int. The value contained within c is a value of read-only type. Why can it be modified?

The it in your last sentence refers to the value of read-only type, the ro *int. That value cannot be modified; *c = 3 is illegal. But you can change the binding of c. There is nothing in my proposal that restricts the semantics of variable bindings.

The situation is analogous to

var s string = "x"
s = "y"

The value is immutable, but the variable binding is not.

@neild
Contributor

neild commented Dec 4, 2017

It is possible that I have misunderstood the spec, but this is not consistent with my understanding of variable assignment. s = "y" does not change the binding of s; it changes the value of the variable bound to s.

@jba
Contributor Author

jba commented Dec 5, 2017

I guess I'm using the word "binding" wrong. I was thinking variables are bound to their values, and you're saying identifiers are bound to variables, which have values. Anyway, you can change variable-value associations, but some values cannot be modified.

@Spriithy

Why not reuse the already existing const keyword to ease readability and stick with Go's spirit of not obfuscating intent ?

My point here is that ro is an obfuscating keyword that hides intent to non-aware readers. Again, as you stated earlier, this is merely a suggestion and syntax comes last.

Other point, say I have a read only type for ints. Is such type declarable (as in type T = ro int) ?

If yes, do I declare an instance of such type using the var or const keyword since it is a non modifiable type ?

type T = ro int

var x T = 55

// or

const y T = 98

Moreover, wouldn't it be enough to allow constant pointers ?

Other point, what about compound types ? Say, using these declarations

type S struct {
    Exported int
}

type RoS = ro S

Does this snippet compile ? If not, what errors are thrown ? If yes, what is the expected behavior ? Does it panic ? If yes, how does the runtime detect this ?

func main() {
    ros := &RoS{Exported: 55}
    p := &ros.Exported
    *p = 98
}

What about this one ?

func main() {
    ros := &RoS{Exported: 55}
    p := (*int)(unsafe.Pointer(&ros.Exported))
    *p = 98
}

@jaekwon

jaekwon commented Dec 29, 2017

I just want to point to two proposals, one for immutable slices and one for pointerized structs that I think in combination amounts to a simpler set of language changes than what is proposed here. Please take a look!

What is needed is a constraint of the form "T is either []S or ro []S", that is, permission genericity.

Check out the any modifier in the immutable slices proposal.

Pointerized structs

Here's a concrete example. Here is one way to control write-access to structs. Copying is trivial, you can just do var g Foo = f from anywhere, even outside the module that declares Foo.

type Foo struct {
  value interface{}
}
func (f *Foo) SetValue(interface{}) {...}
func (f Foo) GetValue() interface{} {...}

The other way is to protect the struct with a mutex:

type Foo struct {
  mtx sync.RWMutex
  value interface{}
}
func (f *Foo) SetValue(interface{}) {...} // Lock/Unlock
func (f *Foo) GetValue() interface{} {...} // RLock/RUnlock

Here's a full pointerized struct version:

type foo struct* {
  Value interface{}
}
func (f foo) GetValue() interface{} {...}

type Foo struct {
  mtx sync.RWMutex
  foo
}
func (f *Foo) SetValue(interface{}) {...} // Lock/Unlock
func (f *Foo) GetValue() interface{} {...} // RLock/RUnlock

f = Foo{...}
f.SetValue(...) // ok, f is addressable
g := f
g.SetValue(...) // ok, g is addressable
func id(f Foo) Foo { return f } // returns a non-addressable copy
id(g).SetValue(...) // compile-error, not addressable.
id(g).GetValue(...) // calls foo.GetValue, mtx not needed

Q: So why readonly slices? It seems natural to create a "view" into an arbitrary-length array of objects without copying. For one, it's a required performance optimization. Second, there's no way to mark any items of a slice to be externally immutable, as can be done with private struct fields. For these reasons, readonly slices appear to be natural and necessary (for lack of any alternative).

@wora

wora commented Dec 29, 2017

I think this design would lead to significant complexity in practice, similar to C++ const. A couple of key issues:

  • The caller is free to modify the value while it looks like a constant to the callee.
  • If you read a field of ro T, what is the type of the field value? F or ro F?
  • Having libraries to consistently use this new feature can be very challenging and costly.

One cheap alternative is to introduce a documentary type annotation, which just documents that the value should not be changed. There is no enforcement, but it offers a design contract between caller and callee. Go doesn't provide in-process security anyway; a bad library can do arbitrary damage. I am not sure whether we need to guard against this at the language level.

@jba
Contributor Author

jba commented Jan 3, 2018

@Spriithy:

Why not reuse the already existing const keyword to ease readability and stick with Go's spirit of not obfuscating intent ?

const is about the identifier-value binding, while ro modifies types. I think it would be more confusing to conflate the two.

.. do I declare an instance of [an ro type] using the var or const keyword since it is a non modifiable type ?

var, because the variable can still be set to a different value.

Moreover, wouldn't it be enough to allow constant pointers ?

No, ro is useful for anything that has structure, like maps and slices. You might want to return a map from a function without worrying that your callers will modify it, for example.

Does this snippet compile ? If not, what errors are thrown ? If yes, what is the expected behavior ? Does it panic ? If yes, how does the runtime detect this ?

func main() {
    ros := &RoS{Exported: 55}
    p := &ros.Exported
    *p = 98
}

It fails to compile. p has type ro *int, so the assignment *p = 98 is illegal.

What about [using unsafe]?

Of course, all bets are off with unsafe.

@jba
Contributor Author

jba commented Jan 3, 2018

@jaekwon:

Check out the any modifier in the immutable slices proposal.

I don't see how any actually works. Say I have x, which may be an roPeeker or an rwPeeker. Now I do

if y, ok  := x.(interface{ Peek(int) any []byte }); ok {
   b := y.Peek(3)
   b[1] = 17 // ???
}

Can I assign to elements of b or not? Hopefully the compiler somehow knows and reports an error just in case x was an roPeeker. But I don't see how it knows that.

Here is one way to create immutable structs:

type Foo struct {
  value interface{}
}
func (f *Foo) SetValue(interface{}) {...}
func (f Foo) GetValue() interface{} {...}

I don't understand this. What is immutable? Certainly not Foo—you can set its value field. (The field may as well be exported.) Is the thing I put in value immutable? Maybe; depends what I put there:

var f Foo
f.SetValue([]int{1})
x := f.GetValue()
x.([]int)[0] = 2 // Nope, not immutable.

@jba
Contributor Author

jba commented Jan 3, 2018

@wora:

I think this design would lead to significant complexity in practice, similar to C++ const.

I think it's a little less complex, but yes, I basically agree.

The caller is free to modify the value while it looks like a constant to the callee.

It doesn't look like a constant, it looks like a readonly value.

If you read a field of ro T, what is the type of the field value? F or ro F?

ro F

@jaekwon

jaekwon commented Jan 9, 2018

@jba

I don't see how any actually works. Say I have x, which may be an roPeeker or an rwPeeker. Now I do

if y, ok  := x.(interface{ Peek(int) any []byte }); ok {
   b := y.Peek(3)
   b[1] = 17 // ???
}

Can I assign to elements of b or not? Hopefully the compiler somehow knows and reports an error just in case x was an roPeeker. But I don't see how it knows that.

No, you can't. any means it might be read-only, so first you must cast to a writeable.

y  := x.(interface{ Peek(int) any []byte })
if wy, ok  := x.(interface{ Peek(int) []byte }); ok {
   b := wy.Peek(3)
   b[1] = 17
}

@jba

Here is one way to create immutable structs:

type Foo struct {
  value interface{}
}
func (f *Foo) SetValue(interface{}) {...}
func (f Foo) GetValue() interface{} {...}

I don't understand this. What is immutable? Certainly not Foo—you can set its value field. (The field may as well be exported.)

I meant, you pass by value (e.g. copy) to prevent others from writing to it. Immutable is an overloaded word... I was using it to refer to pass-by-copy semantics.

f := &Foo{value: "somevalue"}
f.SetValue("othervalue") // `f` is a pointer
g := *f
g.SetValue("another") // can't, g is a readonly copy.

The use-cases for ro struct{} overlap significantly with the use-cases for g := *f, and the latter already exists. We don't need transitive ro as long as all field values are non-pointer types.
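The caveat about non-pointer field types can be seen in today's Go with a small sketch. The `point` and `bag` types below are hypothetical and exist only for illustration: copying a struct protects the original only when no field refers to shared memory.

```go
package main

import "fmt"

// point has only non-pointer fields, so copying it yields a fully independent value.
type point struct{ X, Y int }

// bag holds a slice, which shares its backing array with every copy of the struct.
type bag struct{ Items []int }

func main() {
	p := &point{X: 1, Y: 2}
	q := *p // independent copy
	q.X = 99
	fmt.Println(p.X, q.X) // the original is unaffected

	b := &bag{Items: []int{1, 2, 3}}
	c := *b // copies the slice header only, not the elements
	c.Items[0] = 99
	fmt.Println(b.Items[0], c.Items[0]) // the write is visible through both
}
```

The second case is where a transitive read-only permission would differ from a plain copy.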

But I also acknowledge that Golang1 isn't perfectly suited for this kind of usage, because it forces you to write verbose and type-unsafe syntax to get the behavior you want... Here's an example with an (immutable) tree-like structure:

type Node interface {
    AssertIsNode()
}
type node struct {
  Left Node
  Right Node
}
func (_ node) AssertIsNode() {}

// Using the struct is cumbersome, but overall this has the behavior we want.

// Interfaces are pointer-like in how they are represented in memory,
// copying is quick and efficient.
var tree Node = ...
maliciousCode(tree) // cannot mutate my copy

// But using this as a struct is cumbersome and type-unsafe.
var leftValue = tree.(node).Left.(node).Value

Maybe one way to make this easier is to declare a struct to be "pointerized"...

type Node struct* {
  Left Node
  Right Node
}

var n Node = nil // not the same as a zero value.
n.Left = ... // runtime error
n = Node{}
n.Left = ...
n.Right = ...

var n2 = n
n2.Left = ... // This won't affect `n`.
n2.Left.Left = ... // compile-time error, n2.Left is not addressable.

n.Left = n // circular references are OK.

Please check out #23162

@jba Please check out the update to the last comment: #22876 (comment)

@andlabs
Contributor

andlabs commented Jan 31, 2018

Has anyone listed all the existing proposals for declaring a block of data to be stored in read-only memory?

@JavierZunzunegui

Not sure if this is still under debate, but weighing in:

Focusing on Permission Genericity:

A function with an ro? argument a must type-check in two ways:

  • a has type ro T and ro? is treated as ro.
  • a has type T and ro? is treated as absent.

In calls to a function with a return type ro? T, the effective return type is T if the ro? argument a is a read-write type, and ro T if a is a read-only type.

Using tail as in your definition,

func tail(x ro? []int) ro? []int { return x[1:] }

what is the type of x in x := tail?

I doubt you want func(ro? []int) ro? []int, as you are only introducing ro-qualified types and not 'optional' ro?-qualified types. Which means the type of x must be specified somehow and, more significantly, you have tail being different to x, i.e. type tail(a) may differ from type x(a).

As I see it you have to remove the genericity. There are two natural options.

  1. limit to one ro? (no ro??)
  2. remove ro? altogether

Either way the key difference is that any ro-able function foo (or even for every function, as an identity) has two forms: foo and ro foo. The key difference is the argument does not define the function (no genericity), i.e.

_ = foo // type without extra ro
_ = ro foo // type with additional ro

var a A // assume ro-able
_ = foo(a) // output is non-ro
_ = foo(ro A(a)) // output is still non-ro (no genericity)

// note in the below the syntax is (ro foo)(...), NOT ro (foo(...))
_ = ro foo(a) // BAD! won't compile
_ = ro foo(ro A(a)) // output is ro (no genericity)

Choosing between 1: limit to one ro? (no ro??) or 2: remove ro? altogether I am not sure on; 1 is more powerful but more inconvenient for the developer.

@JavierZunzunegui

Focusing on Missing Immutability:

This proposal lacks a permission for immutability. Such a permission has obvious charms: immutable values are goroutine-safe, and conversion between strings and immutable byte slices would work in both directions.

The problem is how to construct immutable values. Literals of immutable type would only get one so far. For example, how could a program construct an immutable slice of the first N primes, where N is a parameter? The two easy answers—deep copying, or letting the programmer assert immutability—are both unpalatable. Other solutions exist, but they would require additional features on top of this proposal. Simply adding an im keyword would not be enough.

While correct, I think this statement misses that ro could actually bring us a form of immutability. The key is while ro is a developer-level feature, immutability would be a compiler-level feature, existing only for performance purposes. This means: no im or comparable syntax, and the developer effectively not knowing if their ro is also immutable (in that all immutable are ro but not all ros are immutable). The difference is only in performance, the logic for the two must be identical and any immutable ro could be treated as a plain ro and produce identical results.

The immutability could be summarized as:
IF the compiler can assert all references to a ro variable are also ro themselves, it can treat the variable as immutable
(in the absence of unsafe)

To achieve this, the compiler must effectively do escape analysis on ro's, i.e. a non ro-escaping ro can be treated as immutable, but one that ro-escapes can't.

An example:

func foo1() []int {
  return append([]int{1}, 2)
}
// x is ro, and can also be considered immutable
x := ro []int(foo1())

func bar2([]int) {}
func foo2() []int {
  out := append([]int{1}, 2)
  bar2(out)
  return out
}
// y is ro but can't be considered immutable, the variable ro-escapes in bar2
// (the example is so simple that a smart compiler could actually identify that it doesn't escape, but I am not making that case here given Go prioritizes compile time)
y := ro []int(foo2())

func bar3(ro []int) {}
func foo3() []int {
  out := append([]int{1}, 2)
  bar3(out)
  return out
}
// z is ro and considered immutable, the variable escapes but as a ro, and that's OK
z := ro []int(foo3())

Note this means that immutable ros are not actually (necessarily) created as immutable. In its simplest terms:

func foo() ro []int {
  out := []int{1} // not immutable, not even ro
  out = append(out, 1) // not immutable or ro (and in fact is being modified!)
  return out // ro and immutable, despite having been non-immutable (and non-ro) before. Has NOT been deep-copied
}

In this sense immutable ros are not identical to const (in the current go sense, not referring to C-style const). The guarantee is not "once first allocated, this memory is unchanged", but rather "from this point on, this memory is unchanged".
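A loosely similar "mutable while being built, immutable from then on" pattern already exists in today's Go: `strings.Builder` accumulates bytes in a writable buffer and is documented as minimizing memory copying when producing the final string. The sketch below is only an analogy to the idea described above, not part of the proposal, and `joinSquares` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// joinSquares builds its result in a mutable buffer and then hands it out as
// an immutable string.
func joinSquares(n int) string {
	var sb strings.Builder
	for i := 1; i <= n; i++ {
		if i > 1 {
			sb.WriteByte(',')
		}
		fmt.Fprintf(&sb, "%d", i*i)
	}
	// From this point on, callers cannot modify the returned bytes.
	return sb.String()
}

func main() {
	fmt.Println(joinSquares(4))
}
```

The difference from the idea in this comment is that `strings.Builder` is a library type with a hand-written guarantee, whereas here the compiler would have to establish the same guarantee through analysis.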

The benefits of this are exactly as defined in this proposal:

Such a permission has obvious charms: immutable values are goroutine-safe, and conversion between strings and immutable byte slices would work in both directions.

It is only that the gatekeeper to these performance gains is the compiler, not the developer - at least not explicitly. Write good ro code, and you'll be likely to get the advantages.

Concluding
ro, as defined in this proposal, does (or rather may, if we choose such an immutability approach) bring immutability performance gains. And what's best, since those gains are the responsibility of the compiler and don't change the code output, we can have ro added to go2 and start writing ro-compliant code, and only progressively add this kind of immutability support.

Quoting from the proposal:

[...] or letting the programmer assert immutability

Replace programmer with compiler, that's all I'm trying to say.

@jba
Contributor Author

jba commented Nov 18, 2018

what is the type of x in x := tail?

It is the exact type of tail, with the ro?s. I answer that in the original proposal. See the paragraph beginning "There are no automatic conversions...".

immutability would be a compiler-level feature, existing only for performance purposes.

But those performance gains come from how programmers write code. Say I'm writing a cache that accepts ro T values. Even if the compiler can prove immutability, I can't, so I have to copy them. Or say I'm calling a function Foo(int) ro []int. The compiler may be able to prove that the return value is immutable, but I have to code assuming it isn't.
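The cache case can be sketched in today's Go, with `[]byte` standing in for an `ro T` value. The `cache` type and its methods are hypothetical names used only for illustration. Because the cache cannot tell whether someone else still holds a writable reference, it copies on the way in and on the way out.

```go
package main

import "fmt"

// cache is a hypothetical in-memory cache keyed by string.
type cache struct {
	entries map[string][]byte
}

func newCache() *cache {
	return &cache{entries: make(map[string][]byte)}
}

// Put stores a private copy of val, because the caller may still hold a
// writable reference to the same backing array.
func (c *cache) Put(key string, val []byte) {
	c.entries[key] = append([]byte(nil), val...)
}

// Get returns a copy for the same reason: the caller must not be able to
// alter the cached bytes.
func (c *cache) Get(key string) []byte {
	return append([]byte(nil), c.entries[key]...)
}

func main() {
	c := newCache()
	buf := []byte("abc")
	c.Put("k", buf)
	buf[0] = 'X' // the caller mutates its own slice after the Put
	fmt.Println(string(c.Get("k")))
}
```

A read-only parameter type would rule out writes by the cache itself, but it would not remove the copy in `Put`, since the caller could still modify the original.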

@JavierZunzunegui

But those performance gains come from how programmers write code. Say I'm writing a cache that accepts ro T values. Even if the compiler can prove immutability, I can't, so I have to copy them. Or say I'm calling a function Foo(int) ro []int. The compiler may be able to prove that the return value is immutable, but I have to code assuming it isn't.

Yes, the programmers still have a part to play in writing both safe and performant code - more so than if there was im in the language, but less than without ro support (at least with ro the caller of Foo can't change the []int, a decent gain).

Even if the compiler can prove immutability, I can't, so I have to copy them

No more than you may have to do in current (non-ro) golang. My suggestion on building immutability on top of ro is purely a performance concept, if the code was wrong with ro then my point about immutability changes nothing. If you can't trust the callers of your ro method to be sensible you have no choice but to copy it. Having said that, if you do copy it you will (or should) at least generate an immutable ro []int, so may still get some performance advantages whatever path you take.

If you want to ensure the ro is immutable to avoid copying (the ideal situation) maybe we can add something like:

//go:immutable


but I think generally that may introduce many issues as the compiler makes no promises that logically immutable ros will be treated as such (and it may take time before the compiler does a good job at it).

@ghost

ghost commented Jan 15, 2019

It would be nice to have read-only maps. I asked in Slack how to do the following:

const database_config := map[string]string{
    "host": "localhost",
    "port": "2114",
    "username": "foo",
    "password": "bar",
    "name": "db_wiki",
}
And I was pointed here, since I'm totally new to Go.

@ianlancetaylor
Contributor

@Ookma-Kyi Constant map variables are an aspect of immutability and are covered by #6386. This proposal is less about immutability than it is about ensuring that functions don't change certain values.
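Until something like either proposal exists, one common workaround in today's Go is to keep the map unexported and expose only a lookup function. The following is a minimal sketch under that assumption; `databaseConfig` and `ConfigValue` are hypothetical names, and the map is still mutable from inside its own package.

```go
package main

import "fmt"

// databaseConfig is unexported, so only code in this package can modify it.
var databaseConfig = map[string]string{
	"host": "localhost",
	"port": "2114",
}

// ConfigValue is a hypothetical accessor that exposes lookups but not the map itself.
func ConfigValue(key string) (string, bool) {
	v, ok := databaseConfig[key]
	return v, ok
}

func main() {
	host, ok := ConfigValue("host")
	fmt.Println(host, ok)
}
```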

@jcburley
Contributor

jcburley commented Dec 29, 2020

Interesting idea, but please don't repeat the bodge of having ro, like C's const, apply to whatever is to the left or right of the keyword.

As have others, decades of wrestling with C's const taught me to consistently place it so it modified only what was to the left, as in * const int meaning "a pointer via which an int will not be modified".

In that sense, var ro ... would logically mean the same thing as const, which is what consistent right-to-left reading (as Go already requires for things like [4][8]byte) would indicate, and thus would probably just be disallowed, as would func foo(a ro int) and func foo(a ro *int).

Having already worked on large codebases with plenty of const * const * const int and such, because the programmers didn't quite understand the rules or weren't sure readers would, I believe it'd simplify things to be strict about the direction to which ro pertains, versus C's "to the left, unless there's 'nothing there' [which means what, to the naive reader?], in which case to the right".

@wora

wora commented Dec 29, 2020 via email

@mihaigalos

mihaigalos commented Jul 11, 2022

Interesting idea. Here are some thoughts:

ro vs mut

I would like to see a harmonizing of read-only types in Golang with Rust and Vlang. The latter two use mut instead to denote modifiable types. All other types are read-only by default and can only be initialized once.

The reasoning is that if you forget to specify any qualifier, the type is by default non-mutable (const, read-only) and adheres to the principle of "Make it easy to use correctly and hard to be used incorrectly".

Mutable receivers

This proposal is, imho, very nice because having a mutable or read-only datatype would simplify receivers to a great extent.

The pass-by-value or pass-by-reference receiver would not be needed anymore and can be deprecated in favor of mutable/non-mutable receivers. This is a much simpler concept to explain to new developers than say, memory locations and pointers.

Only pass-by-reference would be used and the compiler would error in the case of mutating a non-mutable receiver.
The benefit here is obviously the elimination of an unnecessary copy, since pass-by-value receivers would be obsolete.
The syntax can then be simplified to remove the pointer asterisk for receivers (or keep it for legacy purposes).
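For reference, this is how the two receiver kinds behave in today's Go, which is the distinction the suggestion above would replace. The `counter` type is a hypothetical example.

```go
package main

import "fmt"

type counter struct{ n int }

// incCopy has a value receiver: it increments a copy, so the caller's counter
// is unchanged.
func (c counter) incCopy() { c.n++ }

// inc has a pointer receiver: it increments the caller's counter in place.
func (c *counter) inc() { c.n++ }

func main() {
	var c counter
	c.incCopy()
	fmt.Println(c.n) // the increment was applied to a copy
	c.inc()
	fmt.Println(c.n) // the increment was applied to c itself
}
```

Today the mistake in `incCopy` compiles silently. Under the suggestion above, it would presumably be reported as a write through a non-mutable receiver.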

@tv42

tv42 commented Jul 11, 2022

@mihaigalos The language you are proposing seems to have very little intersection with Go1 as it already exists. Practically no existing Go program would work.

@mihaigalos

Hi @tv42. Is this not the correct thread to discuss breaking changes?
I thought that was what Go2 was all about - ignore the part with legacy purposes in my original post.

@josharian
Contributor

@mihaigalos useful background on breaking changes and Go2: https://go.googlesource.com/proposal/+/master/design/28221-go2-transitions.md

@amery

amery commented Apr 26, 2023

having a ro or readonly behaving like const in C wouldn't be a breaking change and makes code more secure

@jba
Contributor Author

jba commented Apr 28, 2023

@amery, it would be a breaking change in some cases:

var f func(*T) = F

This breaks if

func F(*T) {...}

is changed to

func F(ro *T) {...}

@amery

amery commented Apr 28, 2023

@amery, it would be a breaking change in some cases:

var f func(*T) = F

This breaks if

func F(*T) {...}

is changed to

func F(ro *T) {...}

even without new features you can't assume you can change F's signature and expect everyone using it to remain happy. to me, what makes a change breaking is that old code stops working with the new release of the compiler. this is not the case here
