Go 1.1 Function Calls

Russ Cox

February 2013

The Go 1.0 runtime uses dynamic code generation to implement closures. I took this approach primarily out of expedience: it avoided toolchain-wide changes to the representation of function values and to the function calling convention. However, it is clear that in the long term we should not depend on dynamic code generation, since it limits the environments in which Go can run. It also complicates minor parts of the toolchain: the stack trace code has ugly heuristics to handle closures, and the gdb support cannot get past a closure in a stack trace. The canonical solution is to represent a closure as a pair of pointers, one to static code and one to a dynamic context giving access to the captured variables.

As part of addressing the dynamic code generation problem for Go 1.1, it is worth taking a broader look at the way Go implements function calls. This document describes the general requirements and the current implementation and then a proposed new implementation that does away with dynamic code generation and at the same time enables receiver-curried method expressions and removes a panic from package reflect. I plan to do this for Go 1.1.

This document supersedes the “Alternatives to Dynamic Code Generation in Go” document I circulated in September 2012.

Kinds of Function Calls

This document uses the word “function” to mean any block of Go code that can be executed, and it uses the word “call” to mean the mechanism by which a function is invoked.

In Go, there are four different kinds of functions:

top-level func
method with value receiver
method with pointer receiver
func literal

And there are five different kinds of calls:

direct call of top-level func
direct call of method with value receiver
direct call of method with pointer receiver
indirect call of func value
indirect call of method on interface

The following Go program demonstrates all the possible function/call pairings.

package main

func TopLevel(x int) {}

type Pointer struct{}

func (*Pointer) M(int) {}

type Value struct{}

func (Value) M(int) {}

type Interface interface { M(int) }

var literal = func(x int) {}

func main() {
	// direct call of top-level func
	TopLevel(1)

	// direct call of method with value receiver (two spellings, but same)
	var v Value
	v.M(1)
	Value.M(v, 1)

	// direct call of method with pointer receiver (two spellings, but same)
	var p Pointer
	(&p).M(1)
	(*Pointer).M(&p, 1)

	// indirect call of func value (×4)
	f1 := TopLevel
	f1(1)
	f2 := Value.M
	f2(v, 1)
	f3 := (*Pointer).M
	f3(&p, 1)
	f4 := literal
	f4(1)

	// indirect call of method on interface (×3)
	var i Interface
	i = v
	i.M(1)
	i = &v
	i.M(1)
	i = &p
	i.M(1)
	Interface.M(i, 1)
	Interface.M(v, 1)
	Interface.M(&p, 1)
}

As the program shows, there are ten possible combinations of function and call:

direct call of top-level func /
direct call of method with value receiver /
direct call of method with pointer receiver /
indirect call of func value / set to top-level func
indirect call of func value / set to value method
indirect call of func value / set to pointer method
indirect call of func value / set to func literal
indirect call of method on interface / containing value with value method
indirect call of method on interface / containing pointer with value method
indirect call of method on interface / containing pointer with pointer method

In the list, a slash separates what is known at compile time from what is only found out at run time. The code generated at compile time for an indirect call cannot depend on the run-time values; instead, some of the indirect cases are handled by generating adapter functions that fit the expectations of the indirect call.

Current Implementation

This section describes the current implementation of the possible function calls.

Direct call of top-level func. A direct call of a top-level func passes all arguments on the stack, expecting results to occupy the successive stack positions. This matches the calling convention used by the associated C compilers.

Direct call of method. In order to use the same generated code for both an indirect call of a func value and for a direct call, the code generated for a method (both value and pointer receivers) is chosen to have the same calling convention as a top-level function with the receiver as a leading argument.

Indirect call of func value. An indirect call is treated as identical to a direct call of a top-level func except for the actual CALL or BL instruction: in this case, the func value is treated as containing the address of the code to execute. This choice means that an indirect call to a top-level func dispatches to the same function that the direct call does. As mentioned above, an indirect call of a method dispatches to the direct method function. That leaves an indirect call of a func value set to a func literal.

In general a func literal value has two parts: a function that can be generated at compile time, and then some associated hidden arguments that are known at run time when the func literal expression is evaluated and its value saved. In order to match the calling convention expected for an indirect call of a func value, the current implementation uses run time-generated code that supplies the hidden arguments to a compile time-generated function body.

Indirect call of method on interface. An interface value is a pair (type, word), where the word can be the typed value stored in the interface, if it fits, or else a pointer to the typed value. An interface method call retrieves from the type the address of the method code to execute and then calls it like an indirect call with the word as a leading argument. If the typed value is itself a pointer and the method is a pointer method, then the calling convention matches that of the direct and indirect calls above, so that the call can dispatch to the same function they use. If the typed value is a non-pointer of exactly one word in size and the method is a value method, the same optimization applies. Otherwise, for the remaining cases, namely a value method invoked on a typed pointer or on a non-pointer that is not exactly one word in size, the code used for other call contexts does not have the same calling convention. An adapter function must be generated at compile time, and the address of that adapter function stored in the method table consulted during the interface call. For example, the Value type above would use in its method table an adapter function like:

func Value.M·i(word uintptr, x int) {
	v := *(*Value)(unsafe.Pointer(&word))
	v.M(x)
}

If a Value exceeded one word in size (instead of, in this case, being smaller than one word), the adapter would look the same but omit the & in &word.

Problems with the Current Implementation

The current implementation is elegant, entirely dictated by three choices: (1) a direct call should match C’s conventions, (2) an indirect call should dispatch to the same code as a direct call, and (3) an interface call should dispatch to the same code as a direct call in the case of a pointer with pointer methods. However, there are three shortcomings.

Run-time code generation. In order to make func literal values fit the model for indirect calls, the current implementation uses run-time code generation to implement the capturing of local variables. In contexts such as embedded or sandboxed systems, run-time code generation can be expensive or impossible. Go cannot depend on it. The usual solution is to make func values two words, one code pointer referring to compile time-generated code and one data pointer referring to captured local variables.

Method Values. In discussions over a year ago, we reached consensus on what the meaning of method values would be if we added them to the language. They would look like:

var r io.Reader
f := r.Read // f has type func([]byte) (int, error)
n, err := f(p)

Even once we reached that agreement, I did not bother to send out a spec change, because the implementation of “f := r.Read” would hide the allocation of a closure. It seemed better to force people to write:

f := func(p []byte) (int, error) { return r.Read(p) }

and make the closure explicit. Also I was lazy and did not want to implement it. But I have always treated it as a “someday we’ll want to do this.” The fact that “f := r.Read” is disallowed is a common surprise among new programmers. The two-word func value representation makes possible a trivial implementation of “f := r.Read”.

Reflect. In reflection, v.Method(i) returns a Value corresponding to the i’th method of v with the receiver v pre-bound. For example, assuming the i’th method is named F,

v.Method(i).Call(ValueOf(x), ValueOf(y))

is the reflect equivalent of v.F(x, y). The fact that Method and Call are two different steps means that v.Method(i) by itself must evaluate to something. Today it evaluates to a reflect.Value that can be used in Call and have its Type inspected. However, the Interface method, as in

v.Method(i).Interface()

panics, because there is no Go value to return in the interface{}.

The two-word func value representation makes it trivial to create a true method value here with v pre-bound, just as it makes it trivial to create one in the “f := r.Read” case. So the Interface call would not need to panic anymore. (It is important to fix both of these at the same time; if we fix reflect but disallow “f := r.Read” people will resort to using reflect to work around the deficiency in the language proper, a situation we should avoid.)

New Implementation

We propose a new implementation that changes only the “indirect call of func value” mechanism. All other call details remain unchanged.

The new implementation avoids the need for runtime code generation by making a func value a pointer to a variable-sized block of data memory, in which the first word holds a code pointer and the remainder holds additional data that can be used by the called code. This diagram shows the memory layout of a func variable in the current implementation in grey, with the changes required by the proposed new implementation in black:

In the current implementation, the func value holds a pointer to the actual code to be run during a call. The new implementation introduces an indirection through the data block.

In the current implementation, a call sequence looks like:

MOV …, R1
CALL R1

In the new implementation, a call sequence adds an indirection, and it also leaves the indirect block address (that is, the address of the middle box) in a well-known register (R0 here):

MOV …, R0
MOV 0(R0), R1
CALL R1 # called code can access “data” using R0
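The new call sequence can be modeled in ordinary Go to make the indirection concrete. This is only a sketch: the block type, its field names, and the call helper are hypothetical, and an explicit context parameter stands in for the well-known register R0.

```go
package main

import "fmt"

// block models the proposed indirect block: a code pointer in the
// first field, followed by optional data.
type block struct {
	code func(ctxt *block, x int) int // stands in for the word at 0(R0)
	data int                          // stands in for the trailing data
}

// call models the new call sequence: load the code pointer through the
// block, then call it with the block address available as context.
func call(f *block, x int) int {
	code := f.code    // MOV 0(R0), R1
	return code(f, x) // CALL R1
}

// double ignores its context, like a func with no associated data.
func double(_ *block, x int) int { return 2 * x }

// addData reaches its extra data through the context pointer.
func addData(ctxt *block, x int) int { return x + ctxt.data }

func main() {
	plain := &block{code: double}
	withData := &block{code: addData, data: 10}
	fmt.Println(call(plain, 3), call(withData, 3))
}
```

The caller runs the same two steps in both cases; only the called code decides whether to look at the data.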

Consider each of the possible ways to initialize a Go func value, from the program above:

f1 := TopLevel
f1(1)
f2 := Value.M
f2(v, 1)
f3 := (*Pointer).M
f3(&p, 1)
f4 := literal
f4(1)

Except when calling a func literal that captures outer local variables, there is no associated data, so the memory layout reduces to:

In this case, the middle box is simply a C function pointer, so the Go func value is simply a pointer to a C function pointer. The extra C function pointer word must be allocated, but the fact of the assignment is known to the compiler ahead of time, so the pointer word can be allocated in read-only data and only once for each function being stored in a func value, no matter how many places in the program do so.

That is, the assignment “f := MyFunc” would generate code like:

MOV $MyFunc·f(SB), f
DATA MyFunc·f(SB)/8, $MyFunc(SB)
GLOBL MyFunc·f(SB), 10, $8

The actual store instruction records a pointer to the read-only data MyFunc·f, which itself holds a pointer to the actual code MyFunc. The ·f suffix establishes a separate name space for these indirection words. In the GLOBL declaration, the flag word 10 sets bits 8 (read-only memory) and 2 (duplicate definitions can be merged in the final binary).

This applies to all functions without associated data. In the snippet above, f1, f2, f3, and f4 without captured variables all use this pattern.

If a func literal (f4) does capture variables, then a larger indirect block must be allocated at run time. The first word points at the compile time-generated function, and the rest of the block holds the pointers to the captured variables. The “runtime.closure” function, called to create a closure, currently allocates memory and then fills in actual machine instructions to produce a code sequence. The new implementation need only allocate memory and copy the necessary code and data pointers into it, little more than filling out a composite literal.
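To show why creating such a closure amounts to little more than a composite literal, here is a hand-written sketch of what the lowering of a capturing func literal might look like. The closure type, counterBody, and newCounter are hypothetical names, and the context is passed explicitly rather than in a register.

```go
package main

import "fmt"

// closure models an indirect block for a func literal of type
// func() int that captures one outer variable.
type closure struct {
	code func(c *closure) int // compile-time-generated body
	n    *int                 // pointer to the captured variable
}

// counterBody is what might be emitted for the literal
//	func() int { n++; return n }
// with the captured n reached through the context block.
func counterBody(c *closure) int {
	*c.n++
	return *c.n
}

// newCounter models evaluating the func literal: allocate a block and
// fill in the code pointer and the captured pointer. No machine code
// is generated at run time.
func newCounter(n *int) *closure {
	return &closure{code: counterBody, n: n}
}

func main() {
	n := 0
	f := newCounter(&n)
	f.code(f)
	f.code(f)
	fmt.Println(n) // the closure and its creator share n
}
```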

Assigning a method value to a func variable, as in “f := r.Read”, will allocate an indirect block containing a pointer to an adapter function and then, as the data, a copy of r. The statement “f := r.Read” then compiles into, approximately:

type funcValue struct {
	f func([]byte) (int, error)
	r io.Reader
}

func readAdapter(b []byte) (int, error) {
	r := (*io.Reader)(R0+8)
	return r.Read(b)
}

f := &funcValue{readAdapter, r}

Reflection. The fact that a pointer to a C function pointer is a valid Go func value means that reflection can generate a Go func value by using a pointer into its own tables. For example, a concrete type like *bytes.Buffer has an associated method table. Suppose table[0] holds the code address (C function pointer) for (*Buffer).Read, table[1] holds the address for (*Buffer).Write, and so on. When a reflect call like

reflect.TypeOf(new(bytes.Buffer)).Method(0)

must create a Go func value corresponding to (*Buffer).Read, it can return &table[0] instead of needing to allocate an explicit indirect block.

One of the goals of allowing “f := r.Read” was to enable

f := reflect.ValueOf(new(bytes.Buffer)).Method(0).Interface()

to have the same effect; currently it panics in the call to Interface. The obvious implementation is to generate a function like readAdapter above for every method and to record pointers to those functions in the reflection tables. This has the drawback of growing the reflection tables and growing the text segment, by including adapters that will in most cases never be used. Instead, we can include a single adapter for use by reflect no matter what the function. It would need to be written in assembly, and the effect of its use would make the call above generate something like:

typedef struct funcValue funcValue;
struct funcValue {
	void (*adapter)(void);
	void (*fn)(void);
	uintptr rcvrArgBytes;
	uintptr inArgBytes;
	uintptr outArgBytes;
	byte rcvr[0];
}

f := malloc(sizeof(funcValue) + sizeof(*bytes.Buffer))
f.adapter = adapter
f.fn = (*bytes.Buffer).Read;
f.rcvrArgBytes = sizeof(*bytes.Buffer);
f.inArgBytes = sizeof([]byte)
f.outArgBytes = sizeof(int, error);
memmove(&f.rcvr, r, sizeof(*bytes.Buffer));

and the adapter would need to use the context recorded in funcValue to make a generic function call, similar to (but simpler than) what reflect.Call does today. This would be slightly slower than pre-generating custom adapters, but it avoids the space overheads.

Properties of the New Implementation

Compared to the current implementation, the new implementation sacrifices the match between indirect calls of Go and C functions. In particular, the runtime has thus far assumed that a Go func(int) and a C void(*)(int) are the same type. The distinction will need to be introduced, in the same way that the runtime must distinguish a Go string from a C char*. The new implementation preserves the other two, more important properties: an indirect call dispatches to the same code as a direct call, and an interface call dispatches to the same code as a direct method call in the case of a pointer with pointer methods.

The new implementation introduces a memory load immediately before the indirect call of a func value. It is possible that this will stall indirect CALL instructions somewhat, but this only affects calls using func values, not interface calls.

The new implementation preserves the in-memory size of a func value, so code copying func values need not change, and code written in C or assembly with func-valued arguments need not recompute argument frame or local variable offsets. Of course, the func value meaning has changed, even if the size has not, so C or assembly trying to call a Go func value will need adjustment.

The new implementation does not grow the memory tables required for run-time reflection. An indirect function word is included only when a statement like “f := MyFunc” is included in the binary; reflection uses a different strategy that avoids allocation but at the same time reuses existing tables.

Backwards Incompatibility

Almost no existing code will need to change. The only incompatibility is in C or assembly that calls Go func values. Such code will need to be updated to account for the indirect block.

Differences in the way direct-compiled Go code creates func values and the way reflection creates func values mean that two func values referring to the same underlying code - even in the absence of closures - may use different pointers to get to it. Since func comparison is disallowed, this change can only break programs using unsafe, and such programs would also break in the presence of shared libraries when using gccgo.

It is now impossible for reflect’s Value.Pointer to return a unique identifier for a function. One possibility is to return the address of the indirect block, which would ensure that distinct functions have different “Pointer()”s. Unfortunately, then the result is not useful with runtime.FuncForPC, which some people might currently depend on. Instead, we will make Value.Pointer return the associated code pointer and document that different functions (for example, different instances of the same func literal, or all functions using the generic reflect adapter) may return the same pointer. The only guarantee is that the result of Value.Pointer on a func is zero if and only if the func is nil.

Implementation Plan

The implementation can follow these steps. Each step results in a working tree.

1. Make the func value an indirect word, keeping run-time code generation. All func values will look like the second picture (no associated data). This will require compiler changes to generate the MyFunc·f words, runtime changes to distinguish C function pointers from Go func values, and reflect changes to understand the new layouts.

2. Change the func-literal implementation to save captured pointers in the indirect blocks instead of using run-time code generation. This is mainly compiler changes, plus deleting the run-time code generation function runtime.closure.

3. Change the reflect.MakeFunc implementation to avoid run-time code generation as well (fixes issues 3736, 3738, 4081).

4. Delete the closure hacks from traceback routines, possibly other places.

5. Add support for “f := r.Read” (fixes issue 2280).

6. Make reflect’s v.Method(i).Interface() work (fixes issue 1517).