Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reflect: add Value.Caller #49340

Open
twmb opened this issue Nov 4, 2021 · 63 comments
Open

reflect: add Value.Caller #49340

twmb opened this issue Nov 4, 2021 · 63 comments

Comments

@twmb
Copy link
Contributor

twmb commented Nov 4, 2021

reflect.Call allocates a []reflect.Value slice of length nout for the number of return arguments from the function. This is fine for a one-off call of a function, but when repeatedly invoking a function, reflect.Call allocations can quickly add up.

As an example, I have a helper package that sorts arbitrary types, and if a type has a func (t T) Less(other T) bool (i.e., Name == "Less", NumIn() == 1, NumOut() == 1, In(0) == T, Out(0) == bool), the package calls that function for sorting. This is done within sort.Slice, meaning Less is called as many times as sort.Slice needs to compare. For an ordered slice of 1000 elements, each sort.Slice allocates 25735 times.

The only four allocations in this benchmark are (tip @ 2cf85b1)

  105.50MB 68.28% 68.28%   105.50MB 68.28%  reflect.Value.call /home/twmb/go/go.tip/src/reflect/value.go:565
      41MB 26.53% 94.81%       41MB 26.53%  reflect.methodReceiver /home/twmb/go/go.tip/src/reflect/value.go:862
    4.50MB  2.91% 97.72%     4.50MB  2.91%  reflect.Value.call /home/twmb/go/go.tip/src/reflect/value.go:609
    1.50MB  0.97% 98.70%     1.50MB  0.97%  runtime.allocm /home/twmb/go/go.tip/src/runtime/proc.go:1866

70% of the allocations are from allocating the output argument slice. If possible, I could provide the input slice and reuse it for all Calls, which would effectively eliminate this alloc.

I looked into line 862 and into the runtime, I'm not sure how to get rid of that allocation: the code takes the address of an unsafe pointer, and that allocates; if possible it would be quite nice to get rid of this alloc as well but this would likely require deeper wiring into the runtime (the point here is to have a method pointer). Line 609 has a TODO that could eliminate the allocation. runtime/proc.go:1866 looks to be unrelated.

reflect.Value.call can be:

func (v Value) call(op string, in, out []Value) []Value {

If out is non-nil, then it is used rather than allocating a new slice. If the input out is too small, the code will panic. Essentially, line 565 turns into:

ret = out
if ret == nil {
    ret = make([]Value, nout)
}
if len(ret) < nout { // optional check
    panic("short output slice")
}
@gopherbot gopherbot added this to the Proposal milestone Nov 4, 2021
@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Nov 4, 2021
@ianlancetaylor
Copy link
Contributor

Going forward would it make sense to write your package using generics?

@twmb
Copy link
Contributor Author

twmb commented Nov 4, 2021

I don't think so. This is doing deep sorting on struct fields (admittedly this is not a high priority package). If the field is a slice, I check the inner type for the method. With generics, even if I know the method has a Less function, I cannot type assert into Lesser[T] because T requires instantiation and I do not know the type.

@rsc
Copy link
Contributor

rsc commented Nov 10, 2021

If we add CallInto or something like it, does reflect.Call go down to zero allocations?
It would be good to know we were done with this particular issue, at least for Call.

@rsc
Copy link
Contributor

rsc commented Nov 10, 2021

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals (old) Nov 10, 2021
@dsnet
Copy link
Member

dsnet commented Nov 10, 2021

This seems like a duplicate of #43732.

If we add CallInto or something like it, does reflect.Call go down to zero allocations?

According to my comment #43732 (comment), we can get it down to zero allocations. While there is a run time improvement of 50%, it wasn't a great as I had hoped.

@twmb
Copy link
Contributor Author

twmb commented Nov 11, 2021

Interesting, yes I think this is a duplicate of #43732. Does the CL linked take care of methods? I don't see a change in methodReceiver, which was the remaining allocation I ran into.

I also tried removing the allocation locally -- I got a similar sort of speedup: not as much as I hoped, but the allocation drop is somewhat nice regardless.

@dsnet
Copy link
Member

dsnet commented Nov 11, 2021

Does the CL linked take care of methods?

I didn't do anything special for method functions. I'll take a look later if something is needed there.

@rsc
Copy link
Contributor

rsc commented Dec 1, 2021

Indeed, this is a duplicate of #43732, but this one is in the active column, so I'll close that one.

@rsc
Copy link
Contributor

rsc commented Dec 1, 2021

It sounds like people basically agree about the signature (even in #43732): it takes two []Value and returns nothing.
But we need a name, and we need to figure out whether it handles variadic functions like Call or like CallSlice.
If it handles them like Call, doesn't that allocate? So probably it should handle them like CallSlice.
Maybe it would be CallSlices?
Other names welcome but CallInto and CallWith both sound like they are missing something.

@rsc
Copy link
Contributor

rsc commented Dec 8, 2021

What should we do about Call vs CallSlice? We probably don't want to add two new functions.
And if we do this, are we down to zero allocations, so we know we won't need to add another variant in the future?

@Splizard
Copy link

Splizard commented Dec 9, 2021

The naming problem could be avoided by introducing a reflect.Caller type.

//Caller is a low-level type used to make 
//zero-allocation function calls.
type Caller struct {}

I suspect that adding this type has an additional advantage, it leaves the door open to a NewCaller/Caller constructor of the type being able to optimise repeat calls to the function it's calling.

echo := reflect.ValueOf(func(in string) string { return in }).Caller()

or

echo := reflect.NewCaller(reflect.ValueOf(func(in string) string { return in }))

then

results := make([]reflect.Value, 1)
args := []reflect.Value{
    reflect.ValueOf("hello"),
}

for i := 0; i < 100; i++ {
    echo.Call(args, results) //more efficient
}

What should we do about Call vs CallSlice?

reflect.Caller provides at least two options:

  1. Have two methods on the reflect.Caller type, Call and CallSlice.
  2. Have reflect.Caller.Call behave like CallSlice when it is constructed from a variadic function.

It can then be considered that reflect.Value.Call and reflect.Value.CallSlice are shorthand/abstractions around the use of the "low-level" Caller type.

And if we do this, are we down to zero allocations, so we know we won't need to add another variant in the future?

In my echo example above, it is my understanding that the string header returned from echo will escape to the heap when it is converted into a reflect.Value and placed in the results slice. Therefore 'zero-allocation' is only possible by providing pointers in the results slice for Call to write into. I think it should be the requirement of reflect.Caller's user to allocate all the results needed to call the function. Only reflect.Value.Call, reflect.Value.CallSlice and reflect.Caller's constructor should be allowed to allocate.

results := []reflect.Value{new(string)} //all results must be pre-allocated.
args := []reflect.Value{reflect.ValueOf("hello")}
for i := 0; i < 100; i++ {
    echo.Call(args, results) //zero-allocations.
}

@twmb
Copy link
Contributor Author

twmb commented Dec 10, 2021

Adding the ability to specify the output slice removes allocations for non-method function calls. For methods, there still is an allocation to create a method pointer. Since call immediately uses and then discards this method pointer, it would be good if this allocation could be removed, but I'm not sure how, and removing the allocation for output arguments does not remove the method pointer allocation.

@rsc
Copy link
Contributor

rsc commented Jan 5, 2022

It sounds like we are having trouble getting to an API that will actually guarantee zero allocations and that is simple enough to be a reasonable API change. Does anyone see how to thread this needle?

@rsc
Copy link
Contributor

rsc commented Jan 19, 2022

It still sounds like we don't have a final API to decide about,
one that will guarantee zero allocations and is simple enough to present to users.
Does anyone want to try to do that?

@Splizard
Copy link

@twmb
Is this something that a reflect.Caller type could handle? If the reflect.Value is a method, allocate this method pointer and reuse it across sequential calls.

@twmb
Copy link
Contributor Author

twmb commented Jan 20, 2022

I think reflect.Caller is a nifty idea: the method pointer is allocated once, and function lookups are cached (and this work eliminated) across repeated invocation. I'm not sure what the code would look like but I'm 👍 on the idea.

Also, reflect.Caller could have one single method. Stealing some documentation from the current reflect.Value.Call,

// Call calls the function v with the input arguments in. For example, if len(in)
// == 3, v.Call(in) represents the Go call v(in[0], in[1], in[2]).  It returns the
// output results as Values. As in Go, each input argument must be assignable to
// the type of the function's corresponding input parameter. If v is a variadic
// function, Call creates the variadic slice parameter itself, copying in the
// corresponding values.
//
// The returned slice is reused across repeated invocations to call. If you wish
// to keep the argument slice itself, it must be copied.
func (c reflect.Caller) Call(args []reflect.Value) []reflect.Value

In this scheme, the returned argument slice can be allocated once on the first call, and reused across future invocations.

Alternatively, this type could have a second method. CallInto was questioned above, and I'm not so sure on CallWith myself (with what?), but perhaps CallWithReturns (still open to better names)?

In summary for new APIs,

func NewCaller(interface{}) reflect.Caller
func (v reflect.Value) Caller() reflect.Caller

type reflect.Caller struct { ... }

func (c reflect.Caller) Call(args []reflect.Value) []reflect.Value // reuses return slice across repeated invocations
func (c reflect.Caller) CallWithReturns(args, returns []reflect.Value)

I believe it should be possible to have zero allocation calls with this helper type.

@rsc
Copy link
Contributor

rsc commented Jan 26, 2022

Is Caller just a cache for allocated things? If so, maybe it shouldn't embed the receiver? That is, maybe it should be

var caller reflect.Caller
caller.Call(v, args)

because then you can do

caller.Call(otherV, otherArgs)

Does anyone want to try implementing this?

@rsc
Copy link
Contributor

rsc commented Feb 9, 2022

Sounds like we are waiting on someone to implement this to understand whether it satisfies the performance goals for the proposal.

@rsc
Copy link
Contributor

rsc commented Feb 23, 2022

Placed on hold.
— rsc for the proposal review group

@jimmyfrasche
Copy link
Member

FuncOf corresponds to declaration and Call with application

@rsc
Copy link
Contributor

rsc commented Dec 21, 2022

Talked to @ianlancetaylor and others at proposal review. Let's go back to in, out. Otherwise this seems good. Any objections?

@rsc
Copy link
Contributor

rsc commented Jan 4, 2023

The current proposal is:

type Caller struct { ... }
func (v Value) Caller() *Caller
func (c *Caller) NumIn() int
func (c *Caller) NumOut() int
func (c *Caller) Call(in, out []Value)
func (c *Caller) CallSlice(in, out []Value)

@rsc
Copy link
Contributor

rsc commented Jan 4, 2023

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Jan 11, 2023

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: reflect: add Value.Caller reflect: add Value.Caller Jan 11, 2023
@rsc rsc modified the milestones: Proposal, Backlog Jan 11, 2023
@dsnet
Copy link
Member

dsnet commented Jan 11, 2023

Overall, the API LGTM. One minor change is that perhaps we should simply expose the type of the function rather than provide the NumIn and NumOut methods. It's not enough to know the number of arguments, but we need to know the types, which we can get from the type.

Thus, just:

func (*Caller) Type() Type

@nrwiersma
Copy link

I have run into an issue with reusing the output values. Initially I had used the following to reuse where possible:

s := out[i].ptr
if out[i].typ != tv || out[i].flag&flagIndir != flagIndir || s == nil {
	s = unsafe_New(tv.common())
	out[i] = Value{tv.common(), s, flagIndir | flag(tv.Kind())}
}

found here

This was to not break the current behaviour of call, but as it turns out, this allows for the following in the out:

var i int
out := []Value{ValueOf(i)}

This cases some serious weirdness, and it seems I have not way to detect this case, as the only flag set is flagIndir (outside of kind flags).

This could be resolved by checking out[i].CanSet() but then the newly created Value must have flagAddr which, as stated above, breaks the current implementation of Value.Call and Value.CallSlice and a test.

It is unclear to me how to proceed or if in fact I have made a mistake somewhere.

@gopherbot
Copy link

Change https://go.dev/cl/482375 mentions this issue: reflect: add reflect.Caller

@ianlancetaylor
Copy link
Contributor

@nrwiersma What is the status of your work on this issue? I see that we now have https://go.dev/cl/482375 to implement this.

@nrwiersma
Copy link

@ianlancetaylor I ran into a road block which I don't know how to resolve without changing the behaviour of Value.Call and a test. Details are here: #49340 (comment).

I saw the implementation, and it seems it suffers the same issue I ran into, allowing

var i int
out := []Value{ValueOf(i)}

@ianlancetaylor
Copy link
Contributor

Sorry, I'm not sure I entirely understand what you mean. Can you give a short example using Value.Caller where this causes a problem? Thanks.

@nrwiersma
Copy link

Sure. This is from a test I expected to fail, and it does, kind of.

fn := func() int {
	return 27
}

var i int
v := ValueOf(i)
out := []Value{v}

ValueOf(fn).Caller().Call([]Value{}, out)

if i != 27 {
	t.Errorf("unexpected result for output value; got %d, want 27", i)
}

The actual error you get is unexpected result for output value; got 27, want 27. v actually should be v := ValueOf(&i).Elem().

The issue here is that v is not addressable, and normally this should panic. But as call uses flags flagIndir | flag(tv.Kind()) and to make the output slice reusable, this is allowed and undetectable (I dont know if call made the output or if it was provided by the user). If I were to enforce flagAddr on the output values I ether cannot reuse output values created by call as the dont have this flag or I add flagAddr to the call created output values and I break the existing behaviour of Value.Call and a test (I believe TestFunc). That is my conundrum.

@cuiweixie
Copy link
Contributor

Sure. This is from a test I expected to fail, and it does, kind of.

fn := func() int {
	return 27
}

var i int
v := ValueOf(i)
out := []Value{v}

ValueOf(fn).Caller().Call([]Value{}, out)

if i != 27 {
	t.Errorf("unexpected result for output value; got %d, want 27", i)
}

The actual error you get is unexpected result for output value; got 27, want 27. v actually should be v := ValueOf(&i).Elem().

The issue here is that v is not addressable, and normally this should panic. But as call uses flags flagIndir | flag(tv.Kind()) and to make the output slice reusable, this is allowed and undetectable (I dont know if call made the output or if it was provided by the user). If I were to enforce flagAddr on the output values I ether cannot reuse output values created by call as the dont have this flag or I add flagAddr to the call created output values and I break the existing behaviour of Value.Call and a test (I believe TestFunc). That is my conundrum.

It output in my mac m1:

unexpected result for output value; got 0, want 27

I think this is reasonable.

@ianlancetaylor
Copy link
Contributor

@nrwiersma I see, thanks for the example. I agree that the behavior is not ideal but I think it's OK. It seems understandable. And it doesn't seem like a likely mistake.

@gopherbot
Copy link

Change https://go.dev/cl/483255 mentions this issue: reflect: add Value.Caller to minimise allocations

@bcmills
Copy link
Contributor

bcmills commented Apr 21, 2023

In order for this to be useful, I'm responsible for pooling the reflect.Caller objects. How do I do that when I'm managing an unbounded number of discrete functions? For each function, I can't just have one pre-prepared reflect.Caller since reflect.Caller is not concurrent safe. I need a pool of them for each function. It does not scale to have a sync.Pool for each of these.

In looking at the concrete CL, it seems that there is still a sync.Pool access internally on each invocation, which can lead to substantial cache contention (see #24479) and scalability issues, especially for Caller instances wrapping short and very efficient functions.

If the reflect.Caller has that sort of nontrivial buffer management, then I don't really see the point of it: it seems like most of its improvements could be made without an API change, by adding a sync.Map or two to store the pre-computed information keyed by rtype and methodIndex. (I suppose there is still an advantage to making the out slice an out-parameter, but it's not clear to me that that optimization is worth defining an entire receiver type.)

Moreover: once we promise that Caller is safe for concurrent use we cannot reverse that decision, but if we disallow concurrent use for now, we can easily revisit that decision and make it safe later. (Making a concurrency-safe API no longer so is a breaking change, but making a non-concurrency-safe API safe for concurrent use is strictly compatible.)

So I think it is important that the Call and CallSlice methods of reflect.Caller not be safe for concurrent use today.

@bcmills
Copy link
Contributor

bcmills commented Apr 21, 2023

What's the behavior if the provided out has elements that mismatch what the function actually outputs? I propose:

If the n-th output type is assignable to out[n].Type() and out[n].CanSet, then the output is directly stored using the equivalent of out[n].Set.

That seems too subtle to me: making a small change to a program that was previously allocating new values could cause it to silently switch to overwriting existing values (or vice-versa), the resulting aliasing bugs seem likely to be difficult to diagnose.

Perhaps instead the Caller could require that the out slice is always prepopulated with values of correct types for which CanSet is true, and provide a separate method method to allocate and initialize values as needed?

That method could look like:

// MakeOut returns a slice of Values suitable for passing as the out argument
// to Call or CallSlice.
func (*Caller) MakeOut() []Value

@nrwiersma
Copy link

So I think it is important that the Call and CallSlice methods of reflect.Caller not be safe for concurrent use today.

This would mean that Value.Call and Value.CallSlice could not use Caller, meaning Caller would need to be a entirely new way of function calling, or the Value would need its own sync.Pool in order to use it.
Also it would make no sense to provide the out values to Call and CallSilce as the Caller should be able to entirely manage the output values.

@bcmills
Copy link
Contributor

bcmills commented Apr 24, 2023

This would mean that Value.Call and Value.CallSlice could not use Caller

Why is that a problem? Even in https://go.dev/cl/483255/6 a new Caller instance is being allocated for each call to Value.Call or Value.CallSlice and is not being shared across multiple concurrent calls.

Also it would make no sense to provide the out values to Call and [CallSlice] as the Caller should be able to entirely manage the output values.

That doesn't follow. It still seems useful to pass in the out slice to allow the user of Caller to avoid unnecessary allocations, especially if they are reusing the same allocated slice across multiple calls to different functions.

@nrwiersma
Copy link

Fair enough. If this is the new direction, I don't have the time to completely restart this PR. In that case I would rather close it and let someone else have a go.

@ianlancetaylor
Copy link
Contributor

@bcmills Note that the sync.Pool is only accessed when there are no output arguments.

Personally I think the current behavior of the output slice is correct: it gets automatically initialized for you, and if you keep passing the same output slice, it doesn't do any additional allocations. Yes, if you change the output slice in some way it may not behave as you expect. Don't do that. Or, we could panic if the element in the output slice is both initialized and the wrong type.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Accepted
Development

Successfully merging a pull request may close this issue.