Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Go 2: allow taking the address of a function result value or constant #22647

Closed
benhoyt opened this issue Nov 9, 2017 · 17 comments
Closed
Labels
FrozenDueToAge LanguageChange Proposal v2 A language change or incompatible library change
Milestone

Comments

@benhoyt
Copy link
Contributor

benhoyt commented Nov 9, 2017

As I noted in http://benhoyt.com/writings/learning-go/#general-quirks, one of the (few) language quirks I noticed when learning Go is that I couldn't take the address of a function result. For example, here's what I wrote at first and thought should work:

mystruct.TimePtr = &time.Now()

But of course it didn't because, according to the spec, the thing you're taking the address of needs to be "addressable" or, as a special case, a composite literal like &Point{2, 3}.

It seems like this would be a backwards-compatible change to the Go language spec to allow this, and have the compiler figure out that it needs to create a temporary variable (on the heap or stack) to take the address of. The compiler could expand it to the equivalent of:

tmp := time.Now()
mystruct.TimePtr = &tmp

And it seems I'm not the only one who has run into this in slightly different ways -- see one, two, three, and four.

In particular, the Amazon AWS SDK has a whole bunch of helper functions like aws.String (docs here), just so you can use constants as "optional parameters" for pointer fields in input structs, eg:

input := &s3.PutObjectInput{
    ACL:         aws.String("public-read"),
    Body:        bytes.NewReader(data),
    Bucket:      aws.String("mybucket"),
    ContentType: aws.String("image/jpeg"),
    Key:         &key,
}
_, err := s3ImageClient.PutObject(input)

Before they added this (see here), you'd have to do this in a very unwieldy way:

acl := "public-read"
bucket := "mybucket"
contentType := "image/jpeg"
input := &s3.PutObjectInput{
    ACL:         &acl,
    Body:        bytes.NewReader(data),
    Bucket:      &bucket,
    ContentType: &contentType,
    Key:         &key,
}

With this proposal in place, not only could the AWS SDK get rid of all those aws.String-style helpers, but there'd be an obvious, clean way to write this:

input := &s3.PutObjectInput{
    ACL:         &"public-read",
    Body:        bytes.NewReader(data),
    Bucket:      &"mybucket",
    ContentType: &"image/jpeg",
    Key:         &key,
}

Obviously there are a couple of things to think about:

  1. Would it be added just for the things people seem to need, that is, function calls and constants? Or for arbitrary expressions like &(x + 1234). I think it'd be nice and orthogonal if it allows arbitrary expressions, but if there's a good reason to be conservative, I think just allowing &functionCall() and &constant would be a great start.
  2. Is the syntax fully backwards compatible? I'm not a syntax expert, but I think so, as this is simply an extension of the special case you can already do with &Point{2, 3}.
  3. You couldn't take the address of multi-valued functions (this would be a compile-time error). But as multi-valued functions are already special in some ways, that seems fine.
@gopherbot gopherbot added this to the Proposal milestone Nov 9, 2017
@dsnet dsnet added the v2 A language change or incompatible library change label Nov 9, 2017
@dsnet
Copy link
Member

dsnet commented Nov 9, 2017

This is related to #19966, which is the same thing, but for primitive types.

and also #9097.

@benhoyt
Copy link
Contributor Author

benhoyt commented Nov 9, 2017

Ah, thanks @dsnet, I'd missed that (I searched, but I searched for address instead of &, which is hard to search for!).

Despite the fact that &functionCall() is what I ran into, I think allowing &constant is probably more important -- assuming we're going to pick and choose. So I'd be three-quarters happy if #19966 was included in Go 2. :-)

I've looked at #9097 but don't really go for the new(int, 5) or the &int(5) syntax -- neither of those seems particularly obvious to me. However, I also see this proposal which proposes new(5) -- I don't like this as much as straight &5 but it seems reasonable.

I'd also like to point out two instances of the newInt() / newString() workaround helper functions in the standard library, albeit in tests: encoding/asn1/asn1_test.go and text/template/exec_test.go.

@as
Copy link
Contributor

as commented Nov 10, 2017

The easiest way to write that Amazon API would have been to not have pointers ruthlessly pointing to everything (including immutable types). Is the need indicated anywhere outside those examples already mentioned in the proposal?

@benhoyt
Copy link
Contributor Author

benhoyt commented Nov 10, 2017

@as But as you can see from the AWS issue thread, they did it like that for a reason -- because they need to distinguish between zero value (say 0 or empty string) and an unset value:

One important thing to note that I'm not sure how other SDKs handle, is the distinction in AWS APIs between an "unset value"-- this is why we've been ensuring that all types are pointers, even for primitive values.

Similar with the Go protobuf library -- it has helper functions for all the basic types to help with setting optional fields.

So there are two major libraries that have this issue, plus the two instances in the stdlib tests, plus two popular StackOverflow questions and two golang-nuts threads that I linked to, as well as my own experience. In that light, it seems a bit dismissive to ask "is the need indicated anywhere" outside the 9 examples and 2 existing issues...

@as
Copy link
Contributor

as commented Nov 10, 2017

it seems a bit dismissive to ask "is the need indicated anywhere" outside the 9 examples and 2 existing issues...

My apologies. It was not intended to be dismissive, the purpose was inquisition and it served to expand the discussion.

because they need to distinguish between zero value (say 0 or empty string) and an unset value

In my opinion, it's important to see whether they need to do it or just want to do it. The thread shows some concern about how an Amazon service might change it's own semantics in the future, breaking the SDK if nil and empty are indistinguishable. If this is the only reason, it seems self-serving with respect to the scope and doesn't get to the root of the "why". Is there a clear reason why nil and "" should be distinguishable other than consistency with some other languages or frameworks? I contested that AWS example for this reason. I don't know the story about protocol buffers though.

@rasky
Copy link
Member

rasky commented Nov 10, 2017

Another related use-case is Create().Use(), where Create returns a value and Use has a pointer receiver. You can’t do that in Go and are forced to use a temporary, but it doesn’t appear to be a good reason for this limitation.

@benhoyt
Copy link
Contributor Author

benhoyt commented Nov 10, 2017

@as Thanks. And what you say below is a fair point:

In my opinion, it's important to see whether they need to do it or just want to do it. ... Is there a clear reason why nil and "" should be distinguishable other than consistency with some other languages or frameworks?

I can definitely see the need to distinguish between zero value and unset value in some cases, but with a quick glance at the actual s3.PutObjectInput struct here, for all of the keys I've shown and most of the others I haven't shown they're either required (Bucket and Key) or an empty string wouldn't be valid (ACL, ContentType). In fact, in the case of ACL, if I were designing the API by hand I'd probably go with an enum: ACLPublicRead, etc. My guess is that the API surface of AWS is so large they (partially? fully?) autogenerate the code from an API description (see this commit for example).

Still, such use cases are reasonable -- I think they probably didn't choose to do it this way lightly, given the size and need for consistency in their API. It'd be equally awkward having most parameters be values and some (maybe ContentLength?) be pointers/nil-able.

In any case, I think the more important use cases would probably be where I ran into it, for things like nullable database fields. If Go had option types (and I know there are proposals for that as well) that would solve this situation too -- you don't really want a pointer, you just want to denote the fact that a field can be set or unset.

@ianlancetaylor
Copy link
Contributor

Create().Use() is an interesting case. If Use takes a pointer receiver, and modifies the memory to which its receiver points, then writing Create().Use() seems likely to be a bug.

It's also an interesting question in general: if we permit &E for arbitrary E by creating a temporary memory location, should we do the same for E.M() for arbitrary E when M uses a pointer receiver? My guess is that the latter is much more likely to introduce bugs than the former, but I could be wrong.

@faiface
Copy link

faiface commented Nov 10, 2017

@ianlancetaylor Using Create().Use() is already perfectly possible when Create() returns a pointer (and Use has a pointer receiver). It's also useful sometimes, and when it's a bug, it's not harder to find than any other bug.

A situation when Create() returns a value (non-pointer) type, but Use() has a pointer receiver is very unusual IMHO. Struct types usually fall into two categories: pointer types and value types. Constructors for pointer types return pointers and their methods use pointer receivers likewise. Constructors for value types return a value and methods use value receivers.

@rasky
Copy link
Member

rasky commented Nov 10, 2017

Create().Use() is an interesting case. If Use takes a pointer receiver, and modifies the memory to which its receiver points, then writing Create().Use() seems likely to be a bug.

I don't think that's true in general. For instance, the above syntax is very common for logging libraries where the intent of the user is printing a line; the fact that the object is modified or not in the process is an implementation detail of the library itself.

Obviously, the libraries can still be written today, using either always a pointer or always a value (for both Create and Use), but I still can't see why a mix should be forbidden (and just for the case of a temporary, while allowed for an explicit temporary).

There's also a performance argument. Go's escape analysis is far from perfect, and elision of useless receiver copies is non existent; every method call with a value receiver generates a copy of the receiver, irrespective of the fact that the copy is actually required or not. In fact, if the method does not modify the value, the copy is useless most of the time. I tend to prefer pointer receivers for this reason unless the value is really simple (eg: a basic type).

@as
Copy link
Contributor

as commented Nov 11, 2017

There's also a performance argument. Go's escape analysis is far from perfect, and elision of useless receiver copies is non existent; every method call with a value receiver generates a copy of the receiver, irrespective of the fact that the copy is actually required or not. In fact, if the method does not modify the value, the copy is useless most of the time. I tend to prefer pointer receivers for this reason unless the value is really simple (eg: a basic type).

Those statements are implementation details not covered by the compatibility promise (unlike the proposed change would be), even if they are correct (and without benchmarks that is impossible to know)

@bcmills
Copy link
Contributor

bcmills commented Feb 21, 2018

If Use takes a pointer receiver, and modifies the memory to which its receiver points, then writing Create().Use() seems likely to be a bug.

At the moment, pointer receivers conflate two otherwise-orthogonal properties: efficiency (avoiding unnecessary copies) and identity (ensuring that writes do or do not go to the underlying object).

If Use accepts a pointer receiver for efficiency reasons, then Create().Use() should be allowed. If Use accepts a pointer receiver in order to preserve identity, then it is indeed likely to be a bug.

In the wild, this comes up in the form of the Builder pattern: if you want to write a Builder-like API in Go but pass pointers for efficiency reasons, you must wrap those pointers in an extra struct type in order to have a value receiver. (Thankfully, the Builder pattern doesn't come up often because we can usually use composite literals instead.)

@bcmills
Copy link
Contributor

bcmills commented Feb 21, 2018

Go's escape analysis is far from perfect, and elision of useless receiver copies is non existent; every method call with a value receiver generates a copy of the receiver, irrespective of the fact that the copy is actually required or not.
[…]

Those statements are implementation details

That may be true, but as abstractions they're very leaky. Fundamentally, escape analysis and copy elision both rely on some form of alias analysis, which is difficult to do precisely — especially in a language as dynamic as Go. If you want to elide a copy of a function argument, you need to know not only that the callee doesn't modify the argument directly, but also that it doesn't modify the argument indirectly (such as through a passed-in function or interface argument).

You could argue that a sufficiently clever compiler should be able to do that analysis anyway, but the Go compiler seems unlikely to be that sufficiently clever any time soon. So the viable options seem to be either to accept that pointer-receivers-as-optimization will sometimes make call sites more awkward, or accept that pointer-receivers-as-identity will allow aliasing bugs to occur.

It seems to me that the use of garbage collection already biases Go toward cleaner call sites at the expense of aliasing bugs, so I think making function arguments addressable would be fairly consistent with the language as it stands. (But I freely admit that that's entirely subjective.)

@griesemer griesemer changed the title proposal: Allow taking the address of a function call or constant proposal: Allow taking the address of a function result value or constant Mar 27, 2018
@ianlancetaylor ianlancetaylor changed the title proposal: Allow taking the address of a function result value or constant proposal: Go 2: allow taking the address of a function result value or constant Mar 27, 2018
@ianlancetaylor
Copy link
Contributor

#9097 proposes a more general syntax that works for all kinds of expressions, not just function calls and constants. Closing this issue in favor of that one. We can discuss syntax details over there.

@benhoyt
Copy link
Contributor Author

benhoyt commented Mar 28, 2018

@ianlancetaylor Hmm, I'm obviously biased, but I think this issue is better than #9097, for a few reasons:

  1. The report is more fleshed out and clearly worded, and gives real examples ("experience report") rather than just "we should add this". The description also contains links to other people asking about this on StackOverflow and golang-nuts.
  2. The title is maybe a bit narrow, but this issue does cover the more general support for taking the address of an expression, it's just an operator precedence thing -- see note about things like &(x + 1234) at the end of my description. That said, I think taking the address of general expressions would be rare.
  3. I think the &"foo" and &1234 syntax is a natural extension of the existing &Foo{42} syntax: describe a thing, then take its address. Whereas &int(foo) looks more like type coercion or a function call, not just "taking an address". Additionally, &1234 is more succinct (though that's not necessarily a virtue by itself).
  4. I know it's not a popularity contest, but this one has 12 upvotes and the other only 5.

Don't get me wrong, I think the &int(1234) syntax is okay and would be a win, just seems this was closed a bit prematurely. Or maybe I'll argue for the simpler &1234 syntax over at #9097. :-)

@ianlancetaylor
Copy link
Contributor

Yes, please discuss over at #9097. The two issues are basically the same idea with different syntax, so I'm consolidating the discussion in one place, the earlier issue.

@benhoyt
Copy link
Contributor Author

benhoyt commented Mar 28, 2018

Will do, thanks.

@golang golang locked and limited conversation to collaborators Mar 28, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge LanguageChange Proposal v2 A language change or incompatible library change
Projects
None yet
Development

No branches or pull requests

9 participants