Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: cmd/cgo: add #cgo nocheckpointer annotation #62425

Closed
doujiang24 opened this issue Sep 2, 2023 · 15 comments
Closed

proposal: cmd/cgo: add #cgo nocheckpointer annotation #62425

doujiang24 opened this issue Sep 2, 2023 · 15 comments
Labels
Milestone

Comments

@doujiang24
Copy link
Contributor

design

#cgo nocheckpointer setheader
setheader(void *key, void *value)

After adding the nocheckpointer annotation for a C function, cgo compiler will skip generating cgocheckpointer for all of the arguments of the C function.
That means the checkpointer always will be disabled, even cgocheck=1 in the GODEBUG env.

why disable checkpointer

Background: we had built the Envoy Golang extension, that supports people using pure Golang to implement plugins to extend Envoy without recompile the Envoy binary, i.e. authorization plugins, waf plugins and more.

In the Envoy Golang extension, we have such a case:

func (c *httpCApiImpl) HttpSendLocalReply(r unsafe.Pointer, response_code int, body_text string, headers map[string]string, grpc_status int64, details string) {
	hLen := len(headers)
	strs := make([]string, 0, hLen)
	for k, v := range headers {
		strs = append(strs, k, v)
	}
	res := C.envoyGoFilterHttpSendLocalReply(r, C.int(response_code), unsafe.Pointer(&body_text), unsafe.Pointer(&strs), C.longlong(grpc_status), unsafe.Pointer(&details))
	handleCApiStatus(res)
}

We are passing a header map through a []string to C, which will cause checkpointer panic.
But, if we don't passing []string, we do need another memory allocation & copy for those strings, in Go side, before passing to C.
Now, we require people set GODEBUG=cgocheck=0 to disbale cgocheck while starting Envoy.

why not using runtime.Pinner

with runtime.Pinner, the code will be longer, 7 lines added:

func (c *httpCApiImpl) HttpSendLocalReply(r unsafe.Pointer, response_code int, body_text string, headers map[string]string, grpc_status int64, details string) {
	var pin runtime.Pinner
	defer pin.Unpin()
	hLen := len(headers)
	strs := make([]string, 0, hLen)
	for k, v := range headers {
		strs = append(strs, k, v)
		pin.Pin(unsafe.StringData(k))
		pin.Pin(unsafe.StringData(v))
	}
	pin.Pin(unsafe.SliceData(strs))
	pin.Pin(unsafe.StringData(body_text))
	pin.Pin(unsafe.StringData(details))
	res := C.envoyGoFilterHttpSendLocalReply(r, C.int(response_code), unsafe.Pointer(&body_text), unsafe.Pointer(&strs), C.longlong(grpc_status), unsafe.Pointer(&details))
	handleCApiStatus(res)
}

and worse, when a string (key or value) is not a Go pointer, i.e. a constant string, Content-Type, Pin will panic:
panic: runtime error: runtime.Pinner.Pin: argument is not a Go pointer

why not use GODEBUG=cgocheck=0

Yep, it does work fine.
But it requires people to set it, while people may forget it, i.e. envoyproxy/envoy#25178
And it takes effect globally, as an SDK provider, we hope people do not need to care about it, and people can still enable cgocheck in other their-self codes.

summary

  1. It is simple to use, no need many code additions compared to runtime.Pinner
  2. It takes effect at the C function level, with scope limited
  3. Also, it is simple to implement, similar to the noescape and nocallback annotations.

Any suggestions/feedback are welcome, thank you very much!

@gopherbot gopherbot added this to the Proposal milestone Sep 2, 2023
@rittneje
Copy link

rittneje commented Sep 2, 2023

The whole point of cgocheck and runtime.Pinner to to ensure your C code interacts properly with the Go garbage collector. Disabling or circumventing this check doesn't address the problem, it just means you are introducing undefined behavior that could cause the application to crash or misbehave. Definitely I don't think you should be recommending people use GODEBUG=cgocheck=0.

Looking through the source code, it appears Envoy incorrectly relies on the exact layout of slices and strings even though the Go documentation states this is not guaranteed.

Making matters worse, the return value of unsafe.StringData cannot be passed to Pinner.Pin which means there currently does not appear to be a way to safely avoid reallocating all the strings. It is unclear whether this was intentional or an oversight.

(By the way, your original code has a bug. You initialize strs with a capacity of len(headers) but you are adding two items per map entry so it should be 2*len(headers).)

@doujiang24
Copy link
Contributor Author

@rittneje Thanks for your reply.

it appears Envoy incorrectly relies on the exact layout of slices and strings even though the Go documentation states this is not guaranteed

Actually, cgo will generate headers for C users:

typedef struct { const char *p; ptrdiff_t n; } _GoString_;
typedef _GoString_ GoString;
typedef struct { void *data; GoInt len; GoInt cap; } GoSlice;

yep, we rely on the exact layout, but we don't think this layout will change frequently.
We are aware of this risk and are willing to bear it, and we also do not expect any guarantees from Golang.

The whole point of cgocheck and runtime.Pinner to to ensure your C code interacts properly with the Go garbage collector.

Yep, we know what cgocheck protects.
and we followed the Go garbage collector rules at the golang SDK level, upper level users do not need to care about it.

For Golang, it just provides another way to disable cgocheck at the C function-level scope, just like it provides GODEBUG=cgocheck=0 at the global level.
No new risks are exposed.

For us, Golang users, we hope to use the low-level capability for better performance(avoid memory copying), just like GODEBUG=cgocheck=0 does.
We know Golang wishes to protect bad code from users, and cgocheck is a really nice protection for users, but we just hope there is another way to use the low-level capability at a limited & smaller scope.

Making matters worse, the return value of unsafe.StringData cannot be passed to Pinner.Pin which means there currently does not appear to be a way to safely avoid reallocating all the strings. It is unclear whether this was intentional or an oversight.

Yep, we have to do memory copying without disabling cgocheck then, that's what we want to avoid.
We highly care about performance, it is our intention.

(By the way, your original code has a bug. You initialize strs with a capacity of len(headers) but you are adding two items per map entry so it should be 2*len(headers).)

Oh, thanks, it is a bug, it makes the slice grow again, we'll fix it soon.

@rittneje
Copy link

rittneje commented Sep 2, 2023

and we followed the Go garbage collector rules at the golang SDK level, upper level users do not need to care about it.

No, you didn't. The garbage collector has no idea you passed unpinned pointer to C. Hence the error from cgocheck. The garbage collector could very well free that memory while your C code is using it. Once that occurs, all bets are off.

@doujiang24
Copy link
Contributor Author

The garbage collector could very well free that memory while your C code is using it.

That depends.

  1. The garbage collector won't free the arguments(i.e. strs and all of the strings in the string slice), before the C function returns.
  2. After the C function returns, the C code should not use the arguments (strs and its strings) anymore, which will make use after free.

But these C functions only use the Go pointers before they return, since we wrote these C functions and we know the details of the C code.
That's what we said "we followed the Go garbage collector rules".

For runtime.Pinner, it's more powerful, people could use those pinned Go pointers in C, before Unpin in Go, even after the C function returns.
But we don't need this powerful capability.

@rittneje
Copy link

rittneje commented Sep 3, 2023

The garbage collector is also allowed to move values from one memory location to another, which would have equally bad results.

The short of it is, your code violates the rules of cgo and thus is broken. And adding your proposed API seems to serve no purpose other than to allow library authors to hide their broken implementations from consumers. (It would be akin to asking the runtime not to do bounds checking on arrays.)

@dominikh
Copy link
Member

dominikh commented Sep 3, 2023

To agree with @rittneje in the clearest way possible: If cgocheck=1 or cgocheck=2 complain, then your code is incorrect and violates the cgo pointer passing rules. cgocheck=0 doesn't exist to avoid warnings, but to increase performance of already correct code. Setting cgocheck=0 shouldn't be done lightly, and most certainly shouldn't be done at the code's behest.

@doujiang24
Copy link
Contributor Author

The garbage collector is also allowed to move values from one memory location to another

Yes to other languages' GC, i.e. Java.
But, nope to Golang's GC, at least in the current implementation.

The only case I know that golang may move values is goroutine stack grow or shrink(and only for the stack values).
And goroutine stack grow or shrink is disabled while running in C.

The key pointer is:

  1. Golang does allow passing Go pointers to C.
  2. cgocheck does not allow a Go pointer that contains Go pointers(when no Pin)

Please see this example:

type obj struct {
        a int
}
o := new(obj)
C.pointer(unsafe.Pointer(o)) // this is valid, since no pointers in obj.

str := "foo:" + strconv.Itoa(10)
C.pointer(unsafe.Pointer(&str)) // this is invalid, since str contains a pointer.

So, passing Go pointers to C is safe.

Then, what does cgocheck protect?
Golang garbage collector using Go pointers as reference between Go objects. Golang does not want to see those pointers modified by bad C code unexpectedly, that is what cgocheck protects.

And this is why we said: "we followed the Go garbage collector rules".

If cgocheck=1 or cgocheck=2 complain, then your code is incorrect

@dominikh Nope, I do not think so. please see the above details.

@rittneje
Copy link

rittneje commented Sep 3, 2023

@doujiang24 Again, you are NOT following the rules of cgo. You clearly decided to ignore them and instead go by the current implementation, which is irrelevant.

Instead of this proposal, which amounts to "let my library silently ignore the rules of cgo", you should instead focus on your actual problem - the inability to efficiently pass a list of strings from Go to C.

@doujiang24
Copy link
Contributor Author

you are NOT following the rules of cgo.

Yep, let's check the doc, there is the rule of cgo:

Go code may pass a Go pointer to C provided the memory to which it points does not contain any Go pointers to memory that is unpinned.

There is also an option to disable this check:

These checks may be disabled entirely using GODEBUG=cgocheck=0.

Why need the rule of cgo?

Go is a garbage collected language, and the garbage collector needs to know the location of every pointer to Go memory. Because of this, there are restrictions on passing pointers between Go and C.

So, the key pointer is: follow the rules of Go GC. This is the fundamental.
Also, as we always said: "we followed the Go garbage collector rules", not the rules of cgo.

Rules of cgo should be reasonable, and they can be changed when there is a more reasonable way.
This is the previous rule that without runtime.Pinner

Go code may pass a Go pointer to C provided the Go memory to which it points does not contain any Go pointers.

Not following the current rules of cgo, the code must be a bug then?
I don't think so.
I think the cgocheck rules are protecting against bad things happening, due to bad C code, since the C code is not trusted on the Golang side.
But if people trusted their own C code, could golang provide a better way(not a global env) to skip the cgocheck rule?
I do think so, that is the proposal.

If you are saying this is the rule, we are done, okay, I got it.

you should instead focus on your actual problem - the inability to efficiently pass a list of strings from Go to C.

Yep, totally agree.
We sincerely appreciate your help, and if you have any suggestions.

@rittneje
Copy link

rittneje commented Sep 3, 2023

Also, as we always said: "we followed the Go garbage collector rules", not the rules of cgo.

This is the heart of the problem. The rules of cgo are the garbage collector rules. That it happens to work today is not relevant. This is no different than writing code that relies on some internal implementation detail of the compiler instead of the actual language spec.

I think the cgocheck rules are protecting against bad things happening, due to bad C code, since the C code is not trusted on the Golang side.

That is not what they are protecting. It has always been legal to pass a top-level pointer from Go to C, provided that C does not retain the pointer after the function call. cgocheck does not and cannot attempt to protect against the C code breaking that rule. In other words, cgocheck has nothing to do with trusting the C code, that's on the developer.

What they are protecting is misuse in which you are passing pointers from Go to C in a way that is not visible to the garbage collector. Namely, you cannot pass un-pinned non-top-level pointers from Go to C. Even if it happens to work today, such misuse is subject to breakages in future Go releases.

We sincerely appreciate your help, and if you have any suggestions.

It is as I stated above. You essentially need to be able to pass the return value of unsafe.StringData to runtime.Pinner.Pin. I don't know why this is currently disallowed.

@doujiang24
Copy link
Contributor Author

The rules of cgo are the garbage collector rules.

Nope, I don't think so. As we said above: "Rules of cgo should be reasonable, and they can be changed when there is a more reasonable way."
If you insist on it, okay, I got it.

What they are protecting is misuse in which you are passing pointers from Go to C in a way that is not visible to the garbage collector.

Could you please explain it? Why does GC need it visible?
Actually, Go knows all pointers that pass from Go to C, that is why cgocheck can work.
That's why I said, it just protects against untrusted bad C code.

Even if it happens to work today, such misuse is subject to breakages in future Go releases.

Golang already exposes the pointer to users, and even allows passing pointers to C, so, I don't think Go GC can change to move values in the feature releases, without breaking the backward compatibility.
When feature releases decide to move values in GC, it will be a new feature without backward compatibility, "nocheckpointer" could also be broken then.

You essentially need to be able to pass the return value of unsafe.StringData to runtime.Pinner.Pin. I don't know why this is currently disallowed.

Thank you. let me explain it a bit.
It protects people from wrong usage, there is a check in Pin: the pointer should be allocated by GC, in an internal span of GC.
While the string is a const string, the string data is not allocated by GC, from .rodata instead, so the check in the Pin panic.

Actually, we could create a CL to shut down this check, it should work then.
But, I don't think runtime.Pinner is better than nocheckpointer, since runtime.Pinner is a more powerful API that people could use Go pointers in C code across multiple C API calls, while we are just using Go pointers in a single C API call.
We do think nocheckpointer could be a better performance in our case.

Okay, we can say that we have investigated the cgo for a long time, and we have made some improvements for cgo in the past year.
This proposal also came out after a deep thought and investigation.

I think the base rule that we discuss should be: what is good for Golang, and which could make Golang better.

To Golang team:

  1. This new annotation does not expose new risks to users, just provides a limited scope way that similar to GODEBUG=cgocheck=0.
  2. Disabling cgocheck could be safe if people do trust the C code, and will bear the risk in the C code.

@merykitty
Copy link

merykitty commented Sep 4, 2023

IIUC the rule that a nested pointer cannot be passed into C is because as a managed language, pointer load/store may require barriers, which means that they can only be performed by the Go compiled code, and passing nested indirection to C may violate this requirement.

Golang already exposes the pointer to users, and even allows passing pointers to C, so, I don't think Go GC can change to move values in the feature releases, without breaking the backward compatibility.

We can simply pin the referents of the pointers passed to cgo.

I think the cgocheck rules are protecting against bad things happening, due to bad C code, since the C code is not trusted on the Golang side.

This is simply not true, of course the Go code has to trust the C code to call it and expect a sensible result. The check is to protect against developer not conforming to the rules of cgo and violating the gc's assumption regarding the memory state of the program. Also, I see it really backward to annotate the trustfulness of "bad C code" from C, at least it should be declared on the Go side instead.

@rittneje
Copy link

rittneje commented Sep 4, 2023

Nope, I don't think so. As we said above: "Rules of cgo should be reasonable, and they can be changed when there is a more reasonable way."

Just because you don't like the rules does not mean you get to ignore them without consequence.

To be honest, I'm tired of having to post the exact same responses. I won't be responding anymore.

@doujiang24
Copy link
Contributor Author

pointer load/store may require barriers, which means that they can only be performed by the Go compiled code

I do think reading the Go pointers in C code is safe, before it's freed by GC.
But writing the Go pointers in C code is unsafe for Go GC, since the GC may scan such pointers during GC mark state.

of course the Go code has to trust the C code to call it and expect a sensible result

Sorry, the "bad code" that I mentioned above is only the C code that does not follow the Go GC rules, i.e. change the values of Go pointers.

at least it should be declared on the Go side instead.

Yep, I do think it's declared on the Go side, it's declared in the comments before import "C", and the cgo compiler will parse the annotation.

@ianlancetaylor
Copy link
Contributor

The background of the cgo pointer passing rules is https://go.googlesource.com/proposal/+/refs/heads/master/design/12416-cgo-pointers.md and #12416.

We are not going to add a new annotation for the purpose of making it easier to break the rules. The rules are there for a reason. Closing as infeasible.

@ianlancetaylor ianlancetaylor closed this as not planned Won't fix, can't repro, duplicate, stale Sep 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants