Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: unsafe: clarify unsafe.Pointer rules for package syscall #34684

Open
mdempsky opened this issue Oct 3, 2019 · 28 comments
Open

proposal: unsafe: clarify unsafe.Pointer rules for package syscall #34684

mdempsky opened this issue Oct 3, 2019 · 28 comments

Comments

@mdempsky
Copy link
Member

mdempsky commented Oct 3, 2019

unsafe.Pointer's rule 4 says:

Conversion of a Pointer to a uintptr when calling syscall.Syscall.

The Syscall functions in package syscall pass their uintptr arguments directly to the operating system, which then may, depending on the details of the call, reinterpret some of them as pointers. That is, the system call implementation is implicitly converting certain arguments back from uintptr to pointer.

It talks about "the Syscall functions", but on Windows, there's also Proc.Call and LazyProc.Call.

Q1: Are these two functions "Syscall functions"? Strictly speaking, I would interpret "Syscall functions" to mean Syscall{,6,9,12,15,18}. Perhaps the docs should be clarified.

Q2: Do functions have to be called directly, or are indirect calls allowed? The rule explicitly warns that conversions have to be performed directly in the argument list, but it doesn't say anything about indirect calls.

Q3: For Proc.Call and LazyProc.Call, are direct calls via method expressions allowed? E.g., if x.Call(uintptr(p), 0, 0) is valid, then is (*Proc).Call(x, uintptr(p), 0, 0) also valid?

Clarifying this is relevant to determining how far we need to go in fixing #34474.

Incidentally, Q2 and Q3 also apply to rule 5. The status quo there is that cmd/compile safely handles unsafe.Pointer(f(...)) for all f(...), so we do allow indirect and method expression calls to reflect.Value.Pointer and reflect.Value.UnsafeAddr.

/cc @rsc @ianlancetaylor

@mdempsky mdempsky added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Oct 3, 2019
@ianlancetaylor
Copy link
Contributor

ianlancetaylor commented Oct 3, 2019

The part you quoted is the rationale. The real rule is in the part you didn't quote:

The compiler handles a Pointer converted to a uintptr in the argument list of a call to a function implemented in assembly by arranging that the referenced allocated object, if any, is retained and not moved until the call completes, even though from the types alone it would appear that the object is no longer needed during the call.

For the compiler to recognize this pattern, the conversion must appear in the argument list:

My answers, such as they are, to your questions.

Q1: Are these two functions "Syscall functions"? Strictly speaking, I would interpret "Syscall functions" to mean Syscall{,6,9,12,15,18}. Perhaps the docs should be clarified.

Per above, the rule applies to any function implemented in assembly. syscall.Proc.Call and syscall.LazyProc.Call are not implemented in assembly, and, as such, this rule does not apply to them.

Q2: Do functions have to be called directly, or are indirect calls allowed? The rule explicitly warns that conversions have to be performed directly in the argument list, but it doesn't say anything about indirect calls.

I think the functions have to be called directly, and we should update the unsafe package docs to say that.

Q3: For Proc.Call and LazyProc.Call, are direct calls via method expressions allowed? E.g., if x.Call(uintptr(p), 0, 0) is valid, then is (*Proc).Call(x, uintptr(p), 0, 0) also valid?

I suppose that first we have to come up with a rule for Proc.Call and LazyProc.Call.

When I introduced go:uintptrescapes for #16035, my thinking was basically that there was no general rule that could make that code work, but since it already existed and people were using it we should at least try to keep those programs working. It wasn't intended to introduce a new general unsafe.Pointer exception, just one for those two methods.

@mdempsky
Copy link
Member Author

mdempsky commented Oct 3, 2019

The part you quoted is the rationale. The real rule is in the part you didn't quote:

Hm, I've always interpreted the first line as the rule, and everything below as the rationale, restrictions, and details. For example, that rule 4 is you can convert unsafe.Pointer-to-uintptr when calling syscall.Syscall*, and the part you describe as the real rule is just an explanation of the implementation details about how the compilers implement this today (i.e., that syscall.Syscall* are implemented as assembly functions, and the compilers generically handle calls to assembly functions this way).

If rule 4 is in fact that all assembly functions should be handled that way (with syscall.Syscall as just a special case thereof), I think we should reword the first sentence to better emphasize that.

@ianlancetaylor
Copy link
Contributor

Yes, my interpretation is that this applies to all functions written in assembly, since the rule clearly applies to more than the single function syscall.Syscall. But perhaps I am mistaken.

@beoran
Copy link

beoran commented Oct 4, 2019

I think making //go:uintptrescapes work correctly for these functions is probably the best way to go, since there really isn't a general rule, it depends on semantics of the OS call.

@mdempsky
Copy link
Member Author

mdempsky commented Oct 4, 2019

@beoran The issue here is we've never defined exactly what it means for //go:uintptrescapes to "work correctly". As far as I can tell, it's not documented anywhere (e.g., neither cmd/compile's godocs, nor src/runtime/HACKING.md).

I've generally been under the assumption that (1) rule 4 is meant to apply specifically to package syscall's functions like Syscall, RawSyscall, Proc.Call, and LazyProc.Call, and (2) //go:uintptrescapes and cmd/compile's semantics for calling assembly functions are implementation details about how we guarantee rule 4 today.

For example, I'm under the impression that while today users can write their own assembly functions that pass pointers as uintptr or //go:uintptrescapes-annotated Go functions that do the same, we do not guarantee this to work in the future and reserve the right at any point to break this (as long as package syscall's Syscall functions continue to work correctly).

However, it seems at least @ianlancetaylor had a different interpretation.

@rsc
Copy link
Contributor

rsc commented Oct 9, 2019

My thoughts, not definitive:

Q1: It seems like the answer must be yes or else those functions are impossible to use safely at all. That's the same reason we answer yes to syscall.Syscall.

Q2: It would be nice if indirect calls worked, but it seems like they probably can't be made to work without significant overhead, and it also seems like they don't work today. If both those are true, then probably we should say that indirect calls don't apply.

Q3: In general there's almost zero difference between x.M() and T.M(x), so I don't see why we'd introduce a difference here.

So I guess I'm suggesting yes/no/yes.

@mdempsky
Copy link
Member Author

mdempsky commented Oct 9, 2019

Thanks for sharing your thoughts, @rsc.

Q2: It would be nice if indirect calls worked, but it seems like they probably can't be made to work without significant overhead, and it also seems like they don't work today. If both those are true, then probably we should say that indirect calls don't apply.

It should only introduce overhead for indirect calls involving unsafe.Pointer->uintptr conversions as arguments, and I don't think those are very common. So I don't expect the overhead would be significant, but I'll measure this.

Q3: In general there's almost zero difference between x.M() and T.M(x), so I don't see why we'd introduce a difference here.

That's my tentative position as well. The two counter arguments here though are (1) historically it hasn't worked; and (2) package unsafe imposes its own set of rules, and maybe we think T.M(x) is uncommon enough to not allow it.

@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@mdempsky
Copy link
Member Author

mdempsky commented Oct 9, 2019

Across the standard library and Kubernetes, the only indirect call that involves a unsafe.Pointer->uintptr conversion is here:

// Unmap the memory and update m.
if errno := m.munmap(uintptr(unsafe.Pointer(&b[0])), uintptr(len(b))); errno != nil {
return errno
}

(And the same code appears in x/sys/unix.)

For this particular call, the overhead would amount to an extra stack slot and an instruction to populate it.

@gopherbot
Copy link

Change https://golang.org/cl/200137 mentions this issue: cmd/compile: warn about indirect calls with unsafe.Pointer->uintptr conversions

@beoran
Copy link

beoran commented Oct 9, 2019

I think //go:uintptrescapes would be nice to have to make indirect calls as in Q2 possible. It makes it easier to keep Go code that interests with the operating system well factored and easier to read.

@ianlancetaylor
Copy link
Contributor

How do you plan to fix the indirect calls?

@mdempsky
Copy link
Member Author

mdempsky commented Oct 9, 2019

How do you plan to fix the indirect calls?

Conservatively assume any indirect function calls might be to an assembly function or a function annotated with //go:uintptrescapes, and handle unsafe.Pointer->uintptr conversions passed to their uintptr-typed arguments appropriately. See CL 200137 for details.

@mdempsky
Copy link
Member Author

mdempsky commented Oct 9, 2019

I think //go:uintptrescapes would be nice to have to make indirect calls as in Q2 possible. It makes it easier to keep Go code that interests with the operating system well factored and easier to read.

Sorry, I don't understand what you're suggesting here.

In general, users shouldn't have to worry about //go:uintptrescapes. It's an implementation detail about how we make sure package syscall works.

Go doesn't support users writing their own functions that accept or return pointers using uintptr-typed parameters.

@beoran
Copy link

beoran commented Oct 9, 2019 via email

@mdempsky
Copy link
Member Author

mdempsky commented Oct 9, 2019

Notice the KeepAlive? It would be great if it wasn't needed and I could instruct the compiler that everything escapes, maybe with the mentioned pragma.

I agree it would be nice if users didn't have to manually insert runtime.KeepAlive calls when using os.File.Fd. Or if there was at least vet/lint tooling to suggest when they probably need it.

However, file descriptors are not pointers, so the unsafe.Pointer safety rules do not apply to os.File.Fd. You're suggesting entirely new functionality, whereas this issue is about clarifying corner cases of existing functionality.

I recommend filing a new feature request / proposal issue if you'd like to pursue that idea.

@rsc
Copy link
Contributor

rsc commented Oct 30, 2019

@mdempsky, it sounds like you are suggesting that the
"Conversion of a Pointer to a uintptr when calling syscall.Syscall."
section effectively be retitled to
"Conversion of a Pointer to a uintptr when calling a function."
and that after such a function call (any call at all) there would be a keepalive of the pointer inserted immediately after the return from the call. (And whatever pinning is needed during the call.)

Do I have that right?

(You didn't specifically say "any call", but it seems like if you are going to do a fixed set of specific calls as well as all indirect calls, you might as well just complete the set and do all calls. Otherwise direct calls are somehow disadvantaged compared to indirect calls.)

@mdempsky
Copy link
Member Author

@rsc I think that's close, yes.

I would say my suggestion was specifically that the conversion is safe when "calling (directly or indirectly) to a fixed set of specific functions".

The particular implementation detail today would be that we handle all indirect calls as though they might be to one of those functions (which has very few false positives), but compiler optimizations might later let us rule some of those out. E.g., calls to non-exported interface methods (outside of package syscall) can never be one of those specific functions; moving escape analysis to SSA might allow us to see that the set of possible called functions at a call-site is disjoint from the set of specific functions; or a Go JIT that does dynamic monomorphic call optimization might know the target function isn't one of those specific functions.

I think further simplifying to "when calling a function" is reasonable and has merits, but that's not specifically my suggestion. My only concern would be this might introduce more unnecessary KeepAlive calls in current code; I'll measure this.

@beoran
Copy link

beoran commented Oct 30, 2019

I am definitely in favour of simplifying this to "when calling a function", as this makes low level programming that much easier. As for unnecessary KeepAlive calls, in this case, I thought KeepAlive is always needed, only now it sometimes accidentally seems to work without the call.

@mdempsky
Copy link
Member Author

I am definitely in favour of simplifying this to "when calling a function", as this makes low level programming that much easier.

Note that even if we generalize to "when calling a function", to allow users to write their own functions that accept pointers as uintptr-typed parameters we'll still need to specify how those uintptr parameters can actually be used by the function.

Currently we side step that because only the standard library contains this code, and we can ensure it's kept in sync with the compiler's own conventions (e.g., using undocumented compiler directives).

@gopherbot
Copy link

Change https://golang.org/cl/198043 mentions this issue: cmd/compile: fix //go:uintptrescapes for basic method calls

gopherbot pushed a commit that referenced this issue Nov 5, 2019
The logic for keeping arguments alive for calls to //go:uintptrescapes
functions was only applying to direct function calls. This CL changes
it to also apply to direct method calls, which should address most
uses of Proc.Call and LazyProc.Call.

It's still an open question (#34684) whether other call forms (e.g.,
method expressions, or indirect calls via function values, method
values, or interfaces).

Fixes #34474.

Change-Id: I874f97145972b0e237a4c9e8926156298f4d6ce0
Reviewed-on: https://go-review.googlesource.com/c/go/+/198043
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
@gopherbot
Copy link

Change https://golang.org/cl/205244 mentions this issue: [release-branch.go1.13] cmd/compile: fix //go:uintptrescapes for basic method calls

@rsc
Copy link
Contributor

rsc commented Nov 6, 2019

It sounds like the "any function" rules may be fine but we are waiting on @mdempsky to confirm that there's not unexpected overhead. I'm going to tag this Proposal since it is a real (if small) change to the effective language.

@rsc rsc added the Proposal label Nov 6, 2019
@rsc rsc changed the title unsafe: clarify unsafe.Pointer rules for package syscall proposal: unsafe: clarify unsafe.Pointer rules for package syscall Nov 6, 2019
@gopherbot gopherbot removed the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Nov 6, 2019
@rsc rsc modified the milestones: Backlog, Proposal Nov 6, 2019
@rsc rsc added this to Active in Proposals (old) Nov 27, 2019
@rsc
Copy link
Contributor

rsc commented Nov 27, 2019

Putting this proposal on hold until @mdempsky has had a chance to check the overheads.
Matthew, please feel free to remove the label when you are ready. No hurry. Thanks.

@rsc rsc moved this from Active to Hold in Proposals (old) Dec 4, 2019
@petemoore
Copy link

petemoore commented Feb 10, 2020

unsafe.Pointer's rule 4 says:

Conversion of a Pointer to a uintptr when calling syscall.Syscall.
The Syscall functions in package syscall pass their uintptr arguments directly to the operating system, which then may, depending on the details of the call, reinterpret some of them as pointers. That is, the system call implementation is implicitly converting certain arguments back from uintptr to pointer.

It talks about "the Syscall functions", but on Windows, there's also Proc.Call and LazyProc.Call.

This stung me too.

I had code wrapping (*syscall.LazyProc).Call(a ...uintptr) that logged the arguments called, and the values returned. I hadn't paid close attention to unsafe.Pointer rule 4 since I was not calling syscall.Syscall but rather LazyProc.Call.

I think if the documentation had been more explicit that other functions/methods also had special treatment, I might have spotted the issue sooner.

Although this was a genuine bug in my code, I was fortunate that the former escape analysis coincidentally worked in my favour and kept the values live. It appeared as an intermittent issue when upgrading to go 1.13. From bisecting, I discovered the failure started occurring in 996a687 when the escape analysis was cleaned up.

What would be the correct / supported approach for logging values in/out of (*syscall.LazyProc).Call(a ...uintptr)?

With the existing wrapper, I'm aware I could add the //go:uintptrescapes pragma to the wrapped (*syscall.LazyProc).Call(a ...uintptr) method. However, I'm also aware due to #23045 (comment) that this is an unsupported / undocumented feature, so I'm open to suggestion. I'm not sure if runtime.KeepAlive can help here, since the wrapper method received uintptr values. To get around this, one option might be to adapt it to take interface{} values instead, and change the call sites to not convert to uintptr first. However, some inputs may require uintptr(unsafe.Pointer()) conversions, whereas others (such as uint32 values) may just require e.g. uintptr(), so implementing the conversion directly in the wrapper function seems nontrivial. Perhaps a solution exists using the reflect package, but it starts getting pretty complicated...

It feels like logging inputs/outputs of syscalls should be something doable in the langauge without requiring the use of an undocumented feature, so I wanted to check if I'm missing something obvious, or what the recommended approach should be. Many thanks!

@ianlancetaylor
Copy link
Contributor

Offhand I don't think there is a safe way to interpose (*syscall.LazyProc).Call. That's unfortunate, but it's not so bad: the restriction only applies to a couple of functions.

Using runtime.KeepAlive doesn't work in general because that only keeps the value alive. What is needed here is something more: the value must not move. And if the value is on the stack, runtime.KeepAlive doesn't guarantee that.

@petemoore
Copy link

Offhand I don't think there is a safe way to interpose (*syscall.LazyProc).Call. That's unfortunate, but it's not so bad: the restriction only applies to a couple of functions.

Many thanks for the prompt reply.

Would there be value in making //go:uintptrescapes a supported/documented feature?

@ianlancetaylor
Copy link
Contributor

One guideline for adding yet another feature is whether it is possible to write simple succinct documentation for that feature that any programmer can understand. I don't think //go:uintptrescapes meets that bar.

@beoran
Copy link

beoran commented Feb 12, 2020

I respectfully disagree. I believe not all Go programmers need to know all Go language features. The common high level features of the language need to be widely understood, yes, and need therefore need to be easy to understand and easy to explain.

However most "advanced", or low level features, like unsafe.Pointer, syscalls, //go:uintpointerescapes, etc, are not for general use by Go programmers, but for use by those of us who want to integrate Go with OS kernels, existing C libraries, work on compilers or kernels in Go language, etc. But, the problem is that C libraries, kernels, etc, are not easily explained in any succinct way, and it takes time and effort to learn about these advanced topics. The best the Go team can do here is to provide a few pointers to the information available on the internet or in books, and explain the advanced features of Go in detail so they can be learned smoothly by those of us with the special interests mentioned.

petemoore added a commit to taskcluster/taskcluster that referenced this issue Feb 3, 2022
…ntptrescapes

The unofficial documentation for this pragma can be found in the source code:
  * https://github.com/golang/go/blob/go1.17.6/src/cmd/compile/internal/noder/lex.go#L71-L81

Instead of relying on this pragma, generic-worker now calls
syscall.Syscall<x> directly. This change has been made since I suspect
that nested go:uintptrescapes functions may not work as expected, but
furthmore, since this pragma is not an official feature, it presumably
may change over time, so we probably should not assume it will always be
supported.

There have also been several issues raised against functions and methods
that rely on its behaviour, so it is not clear whether it is entirely
safe to use:

  * golang/go#16035
  * golang/go#23045
  * golang/go#34474
  * golang/go#34642
  * golang/go#34684
  * golang/go#34810
  * golang/go#42680

Note, in future an option might be to generate the function bodies using mkwinsyscall:
  * https://github.com/golang/go/blob/go1.17.6/src/internal/syscall/windows/mksyscall.go#L9
  * https://github.com/golang/go/blob/go1.17.6/src/syscall/mksyscall_windows.go#L47
  * https://pkg.go.dev/golang.org/x/sys/windows/mkwinsyscall
  * https://github.com/golang/sys/blob/master/windows/mkwinsyscall/mkwinsyscall.go

One advantage of this approach is that we can probably log the syscalls
again using:

  * https://github.com/golang/sys/blob/50617c2ba19781ae46f34bb505064996b8fa32e8/windows/mkwinsyscall/mkwinsyscall.go#L43-L44

Alternatively if tracing doesn't do what we expect, we could write our own generator.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Hold
Development

No branches or pull requests

6 participants