Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: pointer package #38298

Closed
carnott-snap opened this issue Apr 8, 2020 · 25 comments
Closed

proposal: pointer package #38298

carnott-snap opened this issue Apr 8, 2020 · 25 comments

Comments

@carnott-snap
Copy link

carnott-snap commented Apr 8, 2020

My apologies if this is a duplicate proposal, but the keywords are too generic to make search useful.

Description

Due to addressability rules, some literals cannot have pointers made of them inline. This has led to a number of third party implementations of this logic:

func Bool(v bool) *bool                   { return &v }
func Byte(v byte) *byte                   { return &v }
func Complex128(v complex128) *complex128 { return &v }
func Complex64(v complex64) *complex64    { return &v }
func Float32(v float32) *float32          { return &v }
func Float64(v float64) *float64          { return &v }
func Int(v int) *int                      { return &v }
func Int16(v int16) *int16                { return &v }
func Int32(v int32) *int32                { return &v }
func Int64(v int64) *int64                { return &v }
func Int8(v int8) *int8                   { return &v }
func Rune(v rune) *rune                   { return &v }
func String(v string) *string             { return &v }
func Uint(v uint) *uint                   { return &v }
func Uint16(v uint16) *uint16             { return &v }
func Uint32(v uint32) *uint32             { return &v }
func Uint64(v uint64) *uint64             { return &v }
func Uint8(v uint8) *uint8                { return &v }
func Uintptr(v uintptr) *uintptr          { return &v }

I think pointer is a better package than the alternative, ptr, since that could conflict with common variable names. I would like to propose that this logic be merged into one of the following locations:

  • pointer
  • golang.org/x/pointer
  • golang.org/x/tools/pointer
  • golang.org/x/exp/pointer

Rebuttal

While I am pretty sure this could be solved with the current generics proposal, or reflection, it is my opinion that these 18 functions provide sufficient benefit to justify their existence today.

Exclusions

Many third party implementations include other primitives that help with other pointer issues. Due to a variety of issues I do not think they warrant inclusion today, and are broken out below for completeness.

func Xxx(pkg.Xxx) *pkg.Xxx

Functions for named types like time.Time or time.Duration are also useful, but even limiting ourselves to the stdlib could be a lot of symbols.

func XxxEqual(*xxx, *xxx) bool

While this can be helpful, I think reflect.DeepEqual meets the needs, at some runtime cost.

func XxxSlice([]xxx) []*xxx

Pointering a slice can be painful, but the lack of support for custom types means this is more about convenience than covering addressability issues. This feels like a problem for generics.

func XxxYyyMap(map[xxx]yyy) map[xxx]*yyy

This just feels really niche, and would benefit more from a generic solution. Even more than slices, because of explosion of possibilities, this suffers from the lack of support for custom types.

func XxxValue(*xxx) xxx

This is really just a convenience function for "give me the default value if nil, instead of panicking". It is helpful, but really should apply to all types, so I vote generic solution.

func XxxYyyValueMap(map[xxx]*yyy) map[xxx]yyy / func XxxValueSlice([]xxx) []xxx

All the problems with Map and Slice, plus the complexities of Value: use generics.

Prior Art

@gopherbot gopherbot added this to the Proposal milestone Apr 8, 2020
@cespare
Copy link
Contributor

cespare commented Apr 8, 2020

I often need these, and also get annoyed by the proliferation of silly helper packages.

I think a nicer fix would be a small language change to allow easier construction of pointers from literals:

p := &int64(3) // shorthand for 'n := int64(3); p := &n'
p := &"hello"  // shorthand for 's := "hello"; p := &s'
p := &0        // shorthand for 'n := 0; p := &n'; p has type *int

I thought we had an issue for something like this, but now I'm having trouble finding it.

@zikaeroh
Copy link
Contributor

zikaeroh commented Apr 8, 2020

I thought we had an issue for something like this, but now I'm having trouble finding it.

That's probably #9097. See also the previous #37302.

@carnott-snap
Copy link
Author

That's probably #9097. See also the previous #37302.

Awesome, thanks @zikaeroh. That being said this proposal seems to differ even from #37302, since it does not ask to be a builtin or part of the stdlib. I agree #9097 would be better, but the conversation seems stalled, I think this would be nice to throw somewhere, even golang.org/x/exp.

@rsc rsc added this to Incoming in Proposals (old) Apr 15, 2020
@urandom2
Copy link
Contributor

Requesting inclusion in #33502.

@pjebs
Copy link
Contributor

pjebs commented May 8, 2020

@cespare

In my https://github.com/rocketlaunchr/igo#address-operator package, I solved it doing this:

v := "string"
vptr := &[]string{"string"}[0]  // inlined instead of  using &v

@carnott-snap
Copy link
Author

carnott-snap commented May 11, 2020

I am unclear on the perf implications, but it seems like everything escapes to the heap, meaning this is all just sugar: (though technically &[]string{"string"}[0] has a larger memory footprint, since the slice cannot be inlined like a func can be)

package main

import "fmt"

func main() {
	a := "string"
	p0 := &a
	p1 := func() *string { b := "string"; return &b }()
	p2 := func(c string) *string { return &c }("string")
	p3 := &[]string{"string"}[0]

	fmt.Println(a, p0, p1, p2, p3)
}
[user@localhost test]$ go build -gcflags -m -o /dev/null 
# test
./main.go:8:8: can inline main.func1
./main.go:8:51: inlining call to main.func1
./main.go:9:8: can inline main.func2
./main.go:9:44: inlining call to main.func2
./main.go:12:13: inlining call to fmt.Println
./main.go:6:2: moved to heap: a
./main.go:8:51: moved to heap: b
./main.go:9:44: moved to heap: c
./main.go:10:17: []string literal escapes to heap
./main.go:12:13: a escapes to heap
./main.go:12:13: []interface {} literal does not escape
<autogenerated>:1: .this does not escape

@pjebs
Copy link
Contributor

pjebs commented May 12, 2020

@carnott-snap For my https://github.com/rocketlaunchr/igo, in your opinion which technique should I use in terms of performance or memory? (obviously they are all close to negligible but
just wondering)

@carnott-snap
Copy link
Author

carnott-snap commented May 12, 2020

Benchmarking seems to show that all of these are operating at 30 ns/op, so that should not matter. p0 is the most space efficient (a on the heap, p0 on the stack), but with p1 and p2, they should be equivalent. My concern with p3 is that while it is easier to write, you leave a dangling []string on the heap, but the compiler may be smart enough to realise that you drop the reference to the slice? Since you are generating code anyway, I would expand to p0, or p1/p2 if you want the curt syntax, lexical scoping, and are fine possibly missing the inlining. Though the lexical scoping could be done with {}s too:

var p4 *string
{
        a := "string"
        p4 = &a
}

@as
Copy link
Contributor

as commented May 12, 2020

Another alternative is to give new a value parameter.

i := new(int, 8)
s := new(string, "foo")

I don't like this idea.

@pjebs
Copy link
Contributor

pjebs commented May 13, 2020

@as Even though you thumbed down your suggestion, new seems as convenient as using a separate pointer package.

eg

import p "pointerpkg"

b := p.Bool(true)

vs

b := new(bool, true)

@as
Copy link
Contributor

as commented May 13, 2020

While I don’t like it, I admit it may be of value for the discussion. It seems more orthogonal than a package for every scalar type.

@CAFxX
Copy link
Contributor

CAFxX commented May 14, 2020

Just a random observation for evaluating the proposal: if the goal to be achieved by taking the pointer is signaling whether the value is defined or not (this comes up often e.g. when handling serialization formats like JSON or protobuf to distinguish between a value being unset or having the zero value), then arguably a better solution would be, with generics, some kind of optional container (like std::optional in C++ or std::option in rust).

@carnott-snap
Copy link
Author

My proposal does not speak to why pointers of scalars are needed, in part to prevent bikeshedding, but there are a number of other use cases that require a pointer to a value: flag returns incomplete entries, encoding/json mutates pointer parameters.

To your point about optionals, I agree and tend to prefer multi return, func do() (v T, ok bool) or packages, like github.com/antihax/optional, over *T. Unfortunately, the community does not seem so enthusiastic. Maybe that will change with generics, #15292, but unless they add a Box type, we cannot get away from pointers for mutability.

@rsc
Copy link
Contributor

rsc commented May 20, 2020

It is important to discuss why a proposal is needed. That is key to evaluating its benefit. If all a proposal has are costs, we will not adopt it. In this case, I don't actually see why this proposal is needed, except for protocol buffers, which already define these helpers.

The comment above this one mentions package flag and encoding/json. But this package wouldn't seem to help either.


For package flag, you register a flag with

var count = flag.Int("count", 1, "run `n` times")

That doesn't require the new pointer package. There is also a helper flag.IntVar that takes a *int, which can be used like this:

var count = 1
func init() { flag.IntVar(&count, "n", count, "run `n` times")

(This is more awkward to write but sometimes you want count to be a plain int instead of a *int.)
There's no use for the pointer package here either. You cannot write:

func init() { flag.IntVar(pointer.Int(), "n", 1, "run `n` times")

because now you have no way to access the flag value. You could write:

var count = pointer.Int(1)
func init() { flag.IntVar(count, "n", 1, "run `n` times")

but that's equivalent to the first example above (using flag.Int), meaning the flag API does not benefit from the pointer package, and it's arguably worse than the second example - if you're going to the trouble to write the code this way, you might as well get a non-pointer variable for the effort.


For package json, the main place that requires a pointer is the call to Unmarshal. Just like with flag.IntVar, if you pass the result of pointer.Foo into Unmarshal, you have no way to retrieve the value.

It's true that some people write structs with *int fields to use with package json. Maybe populating those structs would benefit from package pointer. Overall I'd rather see more use of "missing = 0" and the ,omitempty tag, which performs better anyway, than explicit pointers. The pointer package would encourage an inefficient and more clunky way of using package json.


What other use cases would justify adding this package?

@carnott-snap
Copy link
Author

carnott-snap commented May 20, 2020

It is important to discuss why a proposal is needed. That is key to evaluating its benefit. If all a proposal has are costs, we will not adopt it. In this case, I don't actually see why this proposal is needed, except for protocol buffers, which already define these helpers.

Apologies, I missed one; will have added it to the list at the beginning.

My argument is that this proliferation of these helpers is problematic. It is not just protocol buffers, note that thrift does the same thing. Do we expect every new IDL or interested author to implement them? There is a clear need from the community, and no canonical implementation, so can we publish one somewhere? It need not be stdlib, I am happy with it living in golang.org/x/pointer or golang.org/x/exp/pointer.

but that's equivalent to the first example above (using flag.Int), meaning the flag API does not benefit from the pointer package, and it's arguably worse than the second example - if you're going to the trouble to write the code this way, you might as well get a non-pointer variable for the effort.

This sentiment is contrary to the preferences I have seen in the wild. Customers will pull in the whole AWS sdk just to get aws.Int so that they can store var count = aws.Int(1).

It's true that some people write structs with *int fields to use with package json. Maybe populating those structs would benefit from package pointer.

Overall I'd rather see more use of "missing = 0" and the ,omitempty tag, which performs better anyway, than explicit pointers. The pointer package would encourage an inefficient and more clunky way of using package json.

What other use cases would justify adding this package?

I see two primary use cases: mutability and optionality:

For mutability, I agree that taking an inline pointer to a stack variable is likely sufficient, and frequently what I do, but some people still prefer to grab a pointer helper, and store var count *int directly.

For optionality, we can say that you should use meaningful defaults, but people currently do not listen, or need to be able to represent &"" and nil. We could wait for generics, but I feel like I would get the same pushback for types.Pointer(v Scalar). Finally, we could expose a first class optional interface, this too would be better with generics, but even a stop gap interface would be an improvement over the status quo.

@as
Copy link
Contributor

as commented May 21, 2020

I think the point is, aws.String is the cure worse than the disease. It would be better to propose an elegant solution that eliminates the need for pointer types in APIs, rather than one that makes them more ubiquitous.

@rsc
Copy link
Contributor

rsc commented May 27, 2020

Honestly, it was probably a mistake in proto to use pointers for optionality. I don't really want to encourage more of that.

This sentiment is contrary to the preferences I have seen in the wild. Customers will pull in the whole AWS sdk just to get aws.Int so that they can store var count = aws.Int(1).

If the customer is already using AWS, that seems fine. If not, that seems a bit strange (and maybe a teachable moment about dependencies). I certainly don't think it's the norm to import aws just for a *int constructor.

It is worth noting that this whole package reduces to a single line with the contracts proposal that @ianlancetaylor talked about at Gophercon last year:

func PointerTo(type T)(t T) *T { return &t }

and you could use it like PointerTo(1), without needing to say (int).

Are there uses other than proto-like IDLs?

@rsc rsc moved this from Incoming to Active in Proposals (old) May 27, 2020
@benjaminjkraft
Copy link

A bit more detail on some proto-like use cases other than proto itself.

I see this problem in our (@Khan) codebase with GraphQL APIs (for us, via gqlgen and shurcooL/graphql), and with database models (for us, via Google Cloud Datastore). Only shurcooL/graphql includes pointer helpers, although obviously the others could.

Perhaps using pointers for optionality was the wrong choice for these packages, too, but it's definitely not just proto that encourages it. Datastore does have an omitempty option, but the way gqlgen works makes it less clear how that could be done. And for both we do sometimes need both zero and null, sometimes for semantic reasons and sometimes for legacy reasons or interop with other languages.

One thing to add to that: the most common place where this comes out to be a problem for us is in tests. In non-test code the value you want to take the address of often comes from application logic, and assigning it to a variable may already be necessary or improve readability. But in tests you often just want to use a literal string/int/etc., and it's clearest to put it inline.

(Personally, I still end up ambivalent on the proposal: I don't like the cognitive overhead of having a function that you might have to look up what it does. A standard library package is less bad than a third-party package one is less likely to have seen, but it also sanctions the practice. Assigning a variable, or some of the options in #9097, have more obvious effect even if you've never seen them before.)

@carnott-snap
Copy link
Author

carnott-snap commented May 27, 2020

Honestly, it was probably a mistake in proto to use pointers for optionality. I don't really want to encourage more of that.

In practice and because of my functional background, I agree, but I have yet to find a good solution for optionality. My desired outcome is a solution to pointer helper diaspora, so I am open to options. Is there interest in an option package, or is your suggestion to simply not?

E.g. without generics we cannot get the filter/map/reduce benefits:

var s string = option.Some("string").Map(strings.ToLower).Or("default")

And are just left with the same pointer interface, just cluttered with different boilerplate:

var s option.String        // var s *string
s = strings.Some("string") // s = func() *string() { s := "string"; return &s }()
if s.Nil() { /* ... */ }   // if s == nil { /* ... */ } 
return s.Value()           // return *s

@rsc
Copy link
Contributor

rsc commented Jun 3, 2020

Generic options might be interesting but wouldn't retroactively change the existing APIs.
It still seems like if this is important for your code (especially in a test), it's fine to write the one-line helper functions you need. I often end up doing that for all kind of data structures in tests, not just simple pointers. And those helpers will get even easier if we adopt the generics design, as noted before.

Based on the discussion above, this seems like a likely decline.

@rsc rsc moved this from Active to Likely Decline in Proposals (old) Jun 3, 2020
@carnott-snap
Copy link
Author

carnott-snap commented Jun 3, 2020

I think we are on the same page, but I would also like to change the status quo. Can we codify your claim that "[i]t was probably a mistake in proto to use pointers for optionality. I don't really want to encourage more of that". One option would be to give a clear suggestion for optionality, like:

  • It should not be implemented with pointers.
  • Defaults should be the missing case.
  • Use a wrapper type if you need to differentiate missing from default.

Also, I am curious if there is nowhere in the listed import paths where this package could be published. I agree that stdlib is not right, but putting a deprecated package in golang.org/x/exp would be a nice virtue signal to the community.

@ianlancetaylor
Copy link
Contributor

Speaking personally, I think the right choice for optionality is going to depend on other details. In something like proto, I think the natural approaches would be to add a presence field for each value field, or to use one or more uint bitmasks recording presence information for each field. The point is that it should not be implemented with pointers, as that puts more load on the garbage collector and destroys memory locality. For data structures other than proto, different guidelines may apply. It depends on how they are used.

Anybody can create a go-gettable pointer package, it doesn't have to be in x/exp.

@carnott-snap
Copy link
Author

Do you think there is anywhere that would be reasonable to document the problems and alternatives of using pointers for optionality? This question has come up several times at work, but I do not have anything I can point to, at least in part due to my own naivete.

@ianlancetaylor
Copy link
Contributor

Write a blog post?

I don't know of a good place for that kind of information on the Go website. Sorry.

@rsc
Copy link
Contributor

rsc commented Jun 10, 2020

No change in consensus, so declined.

@rsc rsc closed this as completed Jun 10, 2020
@rsc rsc moved this from Likely Decline to Declined in Proposals (old) Jun 10, 2020
@golang golang locked and limited conversation to collaborators Jun 10, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
No open projects
Development

No branches or pull requests