Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: provide a mechanism for debuggers to execute a function in the program being debugged #21678

Closed
komuw opened this issue Aug 29, 2017 · 33 comments

Comments

@komuw
Copy link
Contributor

komuw commented Aug 29, 2017

Background:

I'm new to Go. I've been going through the Go exercises over at; http://exercism.io/languages/go/about.
For some of the exercises in there, I needed a debugger so I came acrosss; https://github.com/derekparker/delve which is really nice.

However I needed to call a function - inside the debugging session- but I was unable to do so. After some google-fu, I came across this issue; https://github.com/derekparker/delve/issues/119 .
That issue ends with the conclusion that for the issue to be fixed, it would need some changes in Go itself.

And then the comment:
maybe we should file an issue in https://github.com/golang/go/issues? We had a conversation with @rsc, and he told us that Go team is going to provide a good basement for debugger at least. - https://github.com/derekparker/delve/issues/119#issuecomment-317032805

Proposal:

Go should provide the good basement for debugger that @rsc alluded to in the comment.

I'm raising this issue here because I asked in the delve issue mentioned above and they never got around to filing an issue with Go and also because, I believe, the floor is now open to discussing potentially backward incompatible things(Go2); some people want generics in Go2, as a new gopher I just want a well integrated debugger.

@gopherbot gopherbot added this to the Proposal milestone Aug 29, 2017
@dlsniper
Copy link
Contributor

This is far too wide to be a proposal (and it's not even what a proposal should be).

For the specific case at hand, the issue should be: provide a mechanism to execute functions via external debuggers. Even this would then need to be split into different sub-issues since the problem itself is pretty complex, see https://github.com/derekparker/delve/issues/119#issuecomment-317019335

As such, I think this should be closed and a separate issue should be opened for allowing function calls via debuggers.

Regarding the other aspects of debugging, I've seen a lot of CLs being merged that improve various aspects of debugging, such as better scoping information.

@komuw
Copy link
Contributor Author

komuw commented Aug 29, 2017

@dlsniper I stand corrected.
To make this proposal, I first went through some of the other proposals in this repo(some by the core team).

I honestly have nothing more to offer. I just know the way I would like things to be, as to how to get there; I'll defer to others.

@ianlancetaylor
Copy link
Contributor

Thank you for filing this. A proposal is more helpful when it provides a proposed solution (see https://github.com/golang/proposal/blob/master/README.md). At this point this is more of a bug report, and I think the bug you are reporting is that debuggers need some way to safely execute a Go function. I'll retitle this issue on that assumption.

@ianlancetaylor ianlancetaylor changed the title proposal: provide a good basement for debugger runtime: provide a mechanism for debuggers to execute a function in the program being debugged Aug 29, 2017
@ianlancetaylor ianlancetaylor modified the milestones: Unplanned, Proposal Aug 29, 2017
@komuw
Copy link
Contributor Author

komuw commented Aug 29, 2017 via email

@heschi
Copy link
Contributor

heschi commented Aug 30, 2017

Spoke with @hyangah, @derekparker, @aarzilli about this; any mistakes are my own.

To me, the most obvious way to call a function from a debugger would be:

  • Start with a stopped thread (that hit a breakpoint, presumably) and decide to call a function
  • Save all the registers of the stopped thread
  • Put the function arguments on the stack
  • Set a breakpoint at the end of the called function
  • Set the thread's PC to the beginning of the called function
  • Resume the thread
  • When it hits the end of the called function, restore its state and return control to the user.

There are a number of issues with this approach, but the most critical is that it will break the garbage collector. When a Go function calls another function, it is responsible for saving all its pointers onto the stack where the GC can see them. If you stop a function at some random instruction, there will be pointers in registers that the GC can't see. Even if you wanted to put them on the stack, you might not be able to if the compiler determined that a stack slot wasn't necessary for that particular pointer.

Disabling the GC for the duration of the function call might work, but to disable the GC you have to take locks that could be held by other threads that have been stopped by the debugger. (Though, even if you don't try to disable it, allocating memory in the called function could trigger a GC that will never finish for the same reason...)

An alternative would be to start a goroutine that calls the function. That would work from a fresh stack, so no issues with the GC. Instead there would be issues with the scheduler: there's no way to force that new goroutine to run, especially since there's no guarantee that there's a free thread. So that's not great either.

An approach that I think might actually be viable is to start a new thread (OS thread, not goroutine) that then enters the runtime as if it were a C to Go crosscall. When a non-Go thread enters Go code, a new G is allocated for it, and that G immediately runs the desired Go function. (I think that goroutine is locked to that thread for its duration, so you don't have to worry about some random goroutine using it to run, but I could be wrong). So now we'd have a new goroutine that was guaranteed to run promptly.

This could in principle be done entirely by the debugger injecting code into the debugged process, but it might be nice for the runtime to provide an actual supported hook for it.

@bcmills, @aclements, @ianlancetaylor, perhaps you have thoughts here?

@ianlancetaylor
Copy link
Contributor

My main comment is that we already support calling a Go function on a new thread, since that is exactly what we do using cgo when C code starts a thread that calls a //export Go function. So in a sense we already have the hook you describe: it's called runtime.cgocallback.

@bcmills
Copy link
Contributor

bcmills commented Aug 30, 2017

Presumably you want to be able to call those functions with arguments. Making the call from a different thread with local variables as arguments could easily violate the runtime's invariants about cross-goroutine access to the stack, so I think you'd have to restrict it to use only global or already-escaped variables — but that seems tricky to enforce.

@bcmills
Copy link
Contributor

bcmills commented Aug 30, 2017

Another option might be to only allow function calls at GC safepoints, but then I guess we'd want some way to tell the debugger to break at the next safepoint.

@bcmills
Copy link
Contributor

bcmills commented Aug 30, 2017

I think this interacts with #18597, in that the same mechanisms that would allow efficient use of callee-saved registers might also give the debugger enough insight to correctly spill them for an arbitrary function call. (But making an escaping call with non-escaping local arguments is still a problem.)

@heschi
Copy link
Contributor

heschi commented Aug 30, 2017

@ianlancetaylor: Yeah, but they'd still have to get a thread created somehow without using libc, which means some platform-specific code, right? So a little wrapper that created a thread and had it run cgocallback would be handy.

@bcmills: Hmm. I hadn't thought about that. So, concretely: I might have some variable that doesn't escape and so is stack-allocated. Then I pass it to a function as an argument. That function decides to stash a pointer to it somewhere, causing it to escape, and so it calls the write barrier, which gets mad because it wants a heap value, not a stack value. Very awkward. I don't see a solution but I don't know enough of the details here.

I suspect that only allowing calls at safepoints would be so restrictive as to make the feature useless.

@ianlancetaylor
Copy link
Contributor

We can't permit passing a pointer to a local variable to some random function. But that seems to me to be a fairly minor restriction, easily enforced by the debugger and not too troubling for the user.

@aarzilli
Copy link
Contributor

We can't permit passing a pointer to a local variable to some random function. But that seems to me to be a fairly minor restriction, easily enforced by the debugger and not too troubling for the user.

I disagree. I bet 90% of the use for this feature will be calling String methods and similar (especially for things in math/big) and I bet that almost always the receiver will be a pointer and often it will be on the stack.

I also have another concern about this mechanism. Let's call the goroutine where we make the call g_caller and let's say it's running on thread m_caller (let's say we don't allow parked goroutines to make calls). The debugger creates a new m_callee thread and executes cgocallback on it creating a new g_callee goroutine. Clearly the debugger is going to keep m_caller stopped, because the user doesn't expect g_caller to continue until after the call ends. Is this going to potentially cause a deadlock through the GC, as in: g_callee wants to allocate → gc decides it's time to stop the world → g_caller never gets to a safe point. Is this possible?

@heschi
Copy link
Contributor

heschi commented Aug 31, 2017

I didn't know whether the compiler would stack-allocate objects with optimizations off, so I looked at a math/big example. It did stack allocate many of the locals, so I'm inclined to agree that this is a really serious limitation. Asking the debugger to promote values to the heap is unrealistic, so I dunno how to deal with it.

GC deadlocks: yes, definitely a problem, but from my perspective it has little or nothing to do with which mechanism you choose. As long as there's any thread stopped by the debugger, any stop-the-world operation will lock up waiting for it. Presumably you want all goroutines stopped while stepping, so even if you were reusing the goroutine that hit the breakpoint to do the function call, the others still create this issue. I touched on this above -- the garbage collector is a particularly obvious problem, because it's the most likely STW operation, but turning it off is also STW.

Still, maybe turning off the GC is the better idea after all. As long as we're not in a mark phase we don't need the write barrier, and nothing will get mad about stack references, I think? Of course, if you call a function that stashes a pointer, segfaults will follow.

Something like:

  • stop all the goroutines in the program
  • set a breakpoint in the scheduler somewhere (at the beginning of startTheWorld, maybe?)
  • synthesize a call to runtime.SetGCPercent(-1)
  • resume all goroutines
  • wait to hit the breakpoint above, at which point all goroutines are preempted
  • stop all "other" threads
  • resume the primary thread and wait for it to finish SetGCPercent
  • make the desired call, then reset GCPercent

Pretty complicated and slow. You'd probably need a LockOSThread in there somewhere too.

@ianlancetaylor
Copy link
Contributor

That is a fair point about the String method. Unfortunately I still don't see a way that we could possibly permit this.

We can normally avoid the GC problem you describe by setting the preemptoff field in the m we create. That will permit it to keep going, and allocate memory as needed, even if the GC is trying to preempt all goroutines. A GC deadlock is still possible if you call a function that starts new goroutines and waits for them to complete while the GC is trying to preempt all goroutines.

@heschi
Copy link
Contributor

heschi commented Aug 31, 2017

@hyangah points out that disabling the gc while a goroutine is stopped at a breakpoint is impossible: that goroutine has to hit a safepoint itself. So you can't synthesize a call to runtime.SetGCPercent.

I can't think of anything that would work here short of modifying the GC. Maybe if it could be aborted without stopping the world.

@aarzilli
Copy link
Contributor

aarzilli commented Sep 1, 2017

Of course, if you call a function that stashes a pointer, segfaults will follow.

I didn't think about this problem at all, I think that even if all other problems are solved the debugger should check that it's not passing stack memory to escaping arguments.

Is there a way to tell the GC it can't scan a goroutine's stack? Could delve do this:

  1. Allocate a big chunk of memory on the inferior (say, 4MB)
  2. Unmap the first page
  3. Push the call frame for the fake call at the end of the chunk
  4. Set stackhi to the start of the chunk, set the goroutine as "don't scan"
  5. Set SP after the fake call frame, set PC to the function's entry point
  6. Resume everything

?

@heschi
Copy link
Contributor

heschi commented Sep 5, 2017

Is there a way to tell the GC it can't scan a goroutine's stack?

I'm getting pretty far out of my depth at this point, I'm afraid. Best guesses:

To be clear, you don't want to hide the whole stack from the GC, since if you do there might be some live objects left unmarked. You just want to hide the part you're using for the call. But I'm not sure you can simultaneously show the "real" stack to the GC and hide the "fake" part without also confusing the morestack() prelude.

But even if you could, I don't think that's sufficient. Fundamentally, if you want to pass stack memory to things that expect heap memory, you need to make sure the write barrier is disabled, and the only way to do that is to avoid being in a mark phase. (At one point there was discussion of an always-on write barrier; that would make this even harder.)

Let's take a quick step back. This looks like a very difficult feature, requiring a lot of expertise to evaluate and probably some actual runtime support. I think if we want this to happen we need to get it on the runtime team's formal priorities. Do we want to try to do that? I personally would lean towards no, at least not right now, but I don't have much exposure to the user community.

@dlsniper
Copy link
Contributor

dlsniper commented Sep 5, 2017

but I don't have much exposure to the user community.

While the rest of the thread is above what I can help with, at least I can provide some context for this.
If speaking from an IntelliJ / Gogland community, there's definitely interest for this:

Looking at Delve, https://github.com/derekparker/delve/issues/119 there are a few users interested as well.

Having to deal with the community quite a bit, I can tell you that better debugging support is a major issue for many companies / users that will refuse to adopt Go because of the debugging story it has compared to other languages (among other issues such as package management).

Hope it helps.

@komuw
Copy link
Contributor Author

komuw commented Sep 6, 2017 via email

@heschi
Copy link
Contributor

heschi commented Sep 6, 2017

I've spent a lot of this year working on debugging improvements of various kinds, particularly CL 41770. Others have been doing work as well. It may not be going as quickly as we'd like, but I don't think the Go team needs to be further convinced that debugging is important. My question was more around this specific feature. There's only so much time in the day, so is this more important than (e.g.) improving debugging of optimized binaries, or making the line number table more accurate?

@aarzilli
Copy link
Contributor

aarzilli commented Sep 6, 2017

There's only so much time in the day, so is this more important than (e.g.) improving debugging of optimized binaries, or making the line number table more accurate?

This is a good point, IMHO those as well as caching of non-optimized packages, better cgo support, constants in dwarf, plugin support and watchpoints are all more useful.

@heschi
Copy link
Contributor

heschi commented Sep 8, 2017

Since it's been a couple days with no further comment, I'm personally going to consider this on hold until we've got more basic features dealt with. I'll keep it in mind though.

@heschi
Copy link
Contributor

heschi commented Jan 4, 2018

I talked with @aclements about this a while back. We're going to try to investigate it for Go 1.11.

There's work planned for 1.11 that will make the stack maps much more complete, to the point where most (maybe even all) instructions are covered. That would make synthesizing a new stack frame for a function call safer. I'm not sure if it's perfect, since there could still be pointers in registers.

The major problem with allowing function calls on stack objects is the risk that a stack pointer will leak to the heap where it doesn't belong. One idea we had was for Delve to watch the execution of any function it calls and verify that it never writes a stack pointer to the heap. (I'm not sure how to do this quickly, maybe by looking at places where the write barrier is called, but at worst it can just watch all memory writes.) If the function tries to write such a pointer, Delve can stop it and reset the thread state back to before the function call began. That's still a little dangerous, since it could have mutated things, but so is allowing any function calls at all.

My concerns about the write barrier above were unfounded; it ignores stack pointers. So that's not an issue.

@aarzilli
Copy link
Contributor

aarzilli commented Jan 5, 2018

The major problem with allowing function calls on stack objects is the risk that a stack pointer will leak to the heap where it doesn't belong.

If we knew which arguments escape we could check before making the call, is there a way to do that?

@aclements
Copy link
Member

If we knew which arguments escape we could check before making the call, is there a way to do that?

That's not recorded in runtime information, but you could ask the compiler. The fast but unsupported way would be to look in the private export info in the object file for the function; however, that format is undocumented and tied to the compiler release. The slow but less unsupported way would be to run the compiler with -m and parse the diagnostic output. If Delve is compiling the binary anyway, it could just add this flag and remember the escape information for everything. If it's working with an already-built binary, I'm not sure what the answer is. Maybe the compiler could start writing this down in the DWARF info? We might still have to change the way the escape information is expressed as escape analysis improves with new compiler releases, but that would be more stable and easier to get at than the private export info.

@aarzilli
Copy link
Contributor

The slow but less unsupported way would be to run the compiler with -m and parse the diagnostic output

This would mean looking for leaking param/leaking param contents lines, right? Nothing else? I think we can do that to develop the feature and then think of a better way that also works with precompiled binaries.

@gopherbot
Copy link

Change https://golang.org/cl/109699 mentions this issue: runtime: support for debugger function calls

@pjweinb
Copy link

pjweinb commented Mar 24, 2022

var b strings.Builder
.....
return b.String()
while debugging, one would like to see the value of b.String() but delve complains
"... b.String as argument b in function strings.(*Builder).String: stack object passed to escaping pointer: b" I don't mean to reopen this particular discussion; just observing the failure in this common case ('common' because this was the first function I tried to evaluate in delve).

@findleyr
Copy link
Contributor

findleyr commented Nov 3, 2022

We just discussed this in our team meeting, and agree that this seems like a reasonable area to improve. Tentatively reopening.

@findleyr findleyr reopened this Nov 3, 2022
@aclements
Copy link
Member

@findleyr , what sorts of improvements were you thinking?

@findleyr
Copy link
Contributor

findleyr commented Nov 3, 2022

@aclements the context for this is golang/vscode-go#2522, which is essentially the same as Peter's #21678 (comment), which I suppose is related to the discussion around function calls on stack objects, which appeared unresolved when gopherbot froze the issue.

I had closed the VS Code issue as unactionable, but others on the team challenged why we couldn't do better. From the user's perspective is is surprising that strings.Builder.String cannot be watched, and I agree (without fully understanding the technical challenges).

If you think that this is fundamentally not feasible, please feel free to re-close.

@golang golang unlocked this conversation Nov 3, 2022
@aarzilli
Copy link
Contributor

aarzilli commented Nov 3, 2022

I don't think it's feasible to safely pass stack allocated things to arguments that escape, however at the moment delve is considering all arguments to escape, because we could not figure out how to determine which ones do. That would be the area of improvement for this, IMO, however issue #27007 exists, so I don't think it's necessary to reopen this one (but maybe we should see about making progress on #27007).

@findleyr
Copy link
Contributor

findleyr commented Nov 3, 2022

My apologies -- in my haste I thought that Gopherbot had closed this issue with unresolved discussion. Re-closing in favor of #27007.

@findleyr findleyr closed this as completed Nov 3, 2022
@golang golang locked and limited conversation to collaborators Nov 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

11 participants