runtime: documentation on runtime.Pinner and cgo slightly conflicting or confusing #62380

seebs · 2023-08-30T18:56:07Z

What version of Go are you using (`go version`)?

1.21 (you can tell because Pinner)

Does this issue reproduce with the latest release?

Sure.

What operating system and processor architecture are you using (`go env`)?

N/A

What did you do?

Read documentation. I'm aware that this is an extremely unusual use case.

What did you expect to see?

I expected words as clear as an unmuddied lake, leading me to a perfect and full understanding of the semantics of pointers, cgo, runtime.Pinner, and their interactions.

What did you see instead?

Well, gosh.

Lemme quote the two things:

runtime.Pinner:

Pin pins a Go object, preventing it from being moved or freed by the garbage collector until the Unpin method has been called.

A pointer to a pinned object can be directly stored in C memory or can be contained in Go memory passed to C functions. If the pinned object itself contains pointers to Go objects, these objects must be pinned separately if they are going to be accessed from C code.

cgo:

C code may not keep a copy of a Go pointer after the call returns, unless the memory it points to is pinned with runtime.Pinner and the Pinner is not unpinned while the Go pointer is stored in C memory. This implies that C code may not keep a copy of a string, slice, channel, and so forth, because they cannot be pinned with runtime.Pinner.

and cgo again:

The GoString type also may not be pinned with runtime.Pinner. Because it includes a Go pointer, the memory it points to is only pinned for the duration of the call; GoString values may not be retained by C code.

I originally thought my complaint was "runtime.Pinner should document that it can't pin these". But then I realized: It sort of does! But it does so only indirectly. You are responsible for knowing that string, slice, channel, "and so forth", cannot be pinned, because they contain pointers to Go objects.

I don't have 100% confidence that everyone would agree that it's obvious, for instance, that a map can't be pinned. I've read the internals and know that a map contains things that are pointer-like, and I guess you can figure out that it has to if you spend a bit of time pondering what it does, but it's not directly stated.

What should we do?

First, I think the documentation for Pinner should clearly enumerate the cases that involve built-in data types, which I think is just string, slice, channel, and map. Maybe. Unless there's other less obvious types. I just feel like this would improve life for relative novices who might not have an intuitive sense of all the internals. I ... suppose interfaces? Like, if you have an object o whose type is interface{...}, then o is actually a {data, typeinfo} pair. If you then pin &o, in fact that contains inner pointers. And on consideration, I have no idea whether the type-info part of the interface needs pinning, but I also have a vague recollection that it's a fairly magic word -- wasn't something somewhere around there at one point being done as an xor of an interface-related and type-related pointers, so that whichever one you had in hand, you could use it to find the other? But also I think the interface-related-structs are currently not collectable. I don't know. But I am pretty sure that pinning the interface doesn't pin its data pointer, but also that a lot of Go programmers are not particularly aware that an interface holding a non-pointer type actually holds a pointer to an object of that type.

There is a secondary question: Should Pinner handle strings and slices? I think I am going to argue that no, it should not. The rationale here is that, while it would be convenient, it would also be a weird special exception. There's no other cases where an object is basically a struct with a pointer in it and Pin automatically pins the pointer, and we specifically warn users that this won't happen. On the other hand, you could probably p.Pin(unsafe.StringData(s)), right? Similarly, unsafe.SliceData(s) should be pinnable. I think.

I don't know. I can see some appeal to Pin handling slices and strings, possibly even as exceptions to the general "if p is not a pointer, Pin panics". In this world, p.Pin(str) allocates a copy of the string header, stores that copy of the string header in an interface{}, and then as a special case, the Pin code checks for (1) string, (2) slice, or (3) any pointer type, and for string and slice, it implicitly pins the data pointer. If this were documented as its behavior, it wouldn't be too surprising. I could maybe see a case for doing that for interfaces, but that feels weirder and riskier, partially because so many interfaces (such as this one right here built around a string header in this very example) have a pointer only to a transient copy of an object somewhere, which is usually Not What You Meant To Pin.

It seems like it's just plain impossible to do this for maps. I think it might be possible for channels, maybe? But I'm not sure why or when you would ever want that -- I think nothing in a channel is meaningfully usable to cgo code.

The other option, I think, would be to explain that if you want to pin a string or slice's data, you pin the StringData or SliceData pointers.

So I think I'm advocating for the documentation for Pin to be expanded, and the documentation for cgo to be clearer or what you can or can't pin or pass to it. I'm especially confused because cgo talks about GoString separately from string, but implies that the memory the GoString points to is already automatically pinned for the duration of the call. Which... I guess that makes sense, it would be a horrible pain if it weren't true. But I am genuinely slightly unclear on whether these descriptions are slightly-conflicting or whether they're both describing the same thing.

I am deeply ambivalent about the idea of special-casing slices and strings as valid arguments to Pin. I'm even more ambivalent about what should happen if you try to pin the address-of a slice or string object. I think my intuition is that if you are trying to pin the address of the object, that should work like normal pointer pinning and not pin recursively. But then we end up with a weird thing where if you want to pin (the data of) a slice or string, you either have to pay for an allocation for a copy of their header, or use unsafe.S*Data on them to specify it more directly. I ... don't know.

I think the current semantics are acceptable and work fine, so this could be treated purely as a documentation problem. On the other hand, I don't think I'd hate having pin special-case slices and strings. But that would also make the documentation problem slightly larger.

The text was updated successfully, but these errors were encountered:

dmitshur · 2023-08-31T15:02:15Z

CC @golang/runtime.

Splizard · 2023-10-08T07:11:22Z

Note than Pin(unsafe.StringData(str)) can panic (argument is not a Go pointer).

ShiromMakkad · 2023-12-12T22:01:03Z

I agree that this documentation change will be helpful. Here's my example of pinning a Go string:

package main

import "fmt"
import "unsafe"
import "runtime"

func print_str(str unsafe.Pointer, length int) {
        for i := 0; i < length; i++ {
                fmt.Printf("%c", *(*byte)(unsafe.Add(str, i)))
        }
        fmt.Println()
}

func main() {
        tst := "test"
        tst_ptr := unsafe.Pointer(unsafe.StringData(tst))

        pnr := new(runtime.Pinner)
        pnr.Pin(&tst)

        print_str(tst_ptr, len(tst))

        pnr.Unpin()
}

This is my first time writing Go code, so I doubt it's perfect. I'm not really sure how Go knows the size of the pointer to pin when you call Pin(). From reading the code, Go allocates the heap in blocks called spans and when you call pin, it pins the entire span, not just what you pin. This should pin the result from unsafe.StringData but the docs aren't totally clear. Maybe tst just contains metadata and a pointer to the real string data and that metadata is being pinned. Unfortunately, like @Splizard said, you can't pin unsafe.StringData, and you can't pin unsafe.Pointer either.

It would be nice if the docs had an example of pinning a string. It's the most common use case, especially for language interop.

As far as I can see, Go strings are not null terminated, so to pass this into C, you'll need to keep track of the size. For Rust, things are more seamless.

gopherbot added the Documentation label Aug 30, 2023

ianlancetaylor changed the title ~~docs: (runtime, cgo): documentation on runtime.Pinner and cgo slightly conflicting or confusing~~ runtime: documentation on runtime.Pinner and cgo slightly conflicting or confusing Aug 30, 2023

gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Aug 30, 2023

dmitshur added this to the Backlog milestone Aug 31, 2023

dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 31, 2023

mknyszek self-assigned this Sep 6, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

runtime: documentation on runtime.Pinner and cgo slightly conflicting or confusing #62380

runtime: documentation on runtime.Pinner and cgo slightly conflicting or confusing #62380

seebs commented Aug 30, 2023

dmitshur commented Aug 31, 2023

Splizard commented Oct 8, 2023

ShiromMakkad commented Dec 12, 2023

runtime: documentation on runtime.Pinner and cgo slightly conflicting or confusing #62380

runtime: documentation on runtime.Pinner and cgo slightly conflicting or confusing #62380

Comments

seebs commented Aug 30, 2023

What version of Go are you using (go version)?

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using (go env)?

What did you do?

What did you expect to see?

What did you see instead?

What should we do?

dmitshur commented Aug 31, 2023

Splizard commented Oct 8, 2023

ShiromMakkad commented Dec 12, 2023

What version of Go are you using (`go version`)?

What operating system and processor architecture are you using (`go env`)?