
proposal: runtime/mainthread: add mainthread.Do for mediating access to the main thread #70089

Open
eliasnaur opened this issue Oct 29, 2024 · 57 comments

@eliasnaur
Contributor

Proposal Details

This is #64777 (comment) in proposal form. It is a reduced and compatible variant of #64777 (comment).

I propose to add a new package, mainthread, with a single function, Do, that allows Go programs to execute a function on the main thread.

// Package mainthread mediates access to the program's main thread.
//
// Most Go programs do not need to run on specific threads 
// and can ignore this package, but some C libraries, often GUI-related libraries,
// only work when invoked from the program's main thread.
//
// [Do] runs a function on the main thread. No other code can run on the main thread
// until that function returns.
//
// Each package's initialization functions always run on the main thread,
// as if by successive calls to Do(init).
//
// For compatibility with earlier versions of Go, if an init function calls [runtime.LockOSThread], 
// then package main's func main also runs on the main thread, as if by Do(main).
package mainthread // imported as "runtime/mainthread"

// Do calls f on the main thread.
// Nothing else runs on the main thread until f returns.
// If f calls Do, the nested call panics.
//
// Package initialization functions run as if by Do(init).
// If an init function calls [runtime.LockOSThread], then package main's func main
// runs as if by Do(main), until the thread is unlocked using [runtime.UnlockOSThread].
//
// Do panics if the Go runtime is not in control of the main thread, such as in build modes
// c-shared and c-archive.
func Do(f func())
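
For comparison, here is a minimal sketch of how a program can approximate Do today, assuming the program owns func main and uses the LockOSThread-during-init behaviour described in the package comment. The names mainQueue, doOnMain and serveMain are hypothetical and not part of the proposal.

```go
package main

import (
	"fmt"
	"runtime"
)

// mainQueue carries functions to the goroutine locked to the main thread.
// Hypothetical name; not part of the proposal.
var mainQueue = make(chan func())

func init() {
	// Init functions run on the main thread; locking here keeps
	// func main on that thread as well.
	runtime.LockOSThread()
}

// doOnMain is a stand-in for the proposed mainthread.Do. It hands f to
// whoever is running serveMain and returns once f has returned.
func doOnMain(f func()) {
	done := make(chan struct{})
	mainQueue <- func() {
		defer close(done)
		f()
	}
	<-done
}

// serveMain runs queued functions on the calling goroutine until
// mainQueue is closed. It is meant to be called from func main.
func serveMain() {
	for f := range mainQueue {
		f()
	}
}

func main() {
	go func() {
		defer close(mainQueue)
		doOnMain(func() { fmt.Println("running on the main thread") })
	}()
	serveMain()
}
```

The drawback, and the motivation for the proposal, is that this only works if package main cooperates by calling serveMain; a library cannot set it up on its own.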

The larger proposal (#64777 (comment)) adds Yield and Waiting to support sharing the main thread in a Go program. However, the Go runtime doesn't always have control over the main thread, most notably in c-shared or c-archive mode on platforms such as Android. In those cases, the platform facilities for mediating main thread access are strictly superior to mainthread.Do. See #64777 (comment) for a detailed analysis and assumptions.

In short, I believe it's better to accept this simpler proposal to only allow Go programs access to the main thread when the Go runtime has control over it, and let other cases be handled by platform API.

I hope this can be implemented in Go 1.24.

@gopherbot gopherbot added this to the Proposal milestone Oct 29, 2024
@gabyhelp

Related Issues and Documentation

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

@apparentlymart

(I see that earlier versions of this proposal were already discussed at length elsewhere and I did try to catch up on it first, but I apologize if I'm asking a question that's redundant from earlier discussions.)

If this function will panic when called from an environment where the Go runtime does not "own" the main thread, is it justified to also offer a function to test whether a call to this function is possible? That could, for example, allow a caller to choose to treat "I'm running in the wrong mode" as an error to be handled gracefully, rather than as an exception to be handled by panicking.

package mainthread

// CanDo returns true if and only if a subsequent call to [Do] would not panic.
func CanDo() bool

(Another variation of this would be for Do itself to return an error, but the usability of not having to worry about error handling when you know you're running in a context where this should work seems nice... this concern of detecting whether it will succeed seems specific to library developers that want their library to degrade gracefully in c-shared/c-archive/etc build modes.)
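
To illustrate the error-returning variation, here is a rough sketch of a wrapper that turns a panic raised by Do itself into an error. Everything here is hypothetical: tryDo and errNoMainThread are made-up names, and the do parameter stands in for mainthread.Do so the sketch runs without the proposed package.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoMainThread is a hypothetical sentinel error.
var errNoMainThread = errors.New("main thread is not available")

// tryDo calls do(f). If do panics before f has started, the panic is
// reported as an error. A panic raised by f itself is re-raised, because
// it is not a "wrong build mode" condition.
func tryDo(do func(func()), f func()) (err error) {
	started := false
	defer func() {
		r := recover()
		if r == nil {
			return
		}
		if started {
			panic(r)
		}
		err = fmt.Errorf("%w: %v", errNoMainThread, r)
	}()
	do(func() {
		started = true
		f()
	})
	return nil
}

func main() {
	// Stand-in for Do in c-shared mode: panics instead of calling f.
	unavailable := func(func()) { panic("runtime does not control the main thread") }
	// Stand-in for a working Do: calls f directly.
	direct := func(f func()) { f() }

	fmt.Println(tryDo(unavailable, func() {}))
	fmt.Println(tryDo(direct, func() { fmt.Println("f ran") }))
}
```

A CanDo predicate would avoid the recover entirely, which is part of why it looks attractive for library authors.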

@Jorropo
Member

Jorropo commented Oct 29, 2024

I hope this can be implemented in Go 1.24.

Just so you know, the current merge window closes on 21 November, so this would be a quick turnaround. There is the option of getting an exception, but these are rare and usually limited to low-risk changes with little community impact.

@qiulaidongfeng
Member

Does this API mean that if the main package imports a package that calls runtime.LockOSThread in init (for an event loop on the main thread), as Fyne did, calls to mainthread.Do by other packages will block permanently?

If so, that means we may need to modify existing valid code when using the mainthread package, which I don't think is backward-compatible; see #64777 (comment).
Fyne Info:
On Windows:
call LockOSThread in https://github.com/fyne-io/fyne/blob/7d813563712924b381ced18c04869c059e2cb4c6/internal/driver/glfw/loop.go#L35
event loop in https://github.com/fyne-io/fyne/blob/7d813563712924b381ced18c04869c059e2cb4c6/internal/driver/glfw/loop.go#L107

@eliasnaur
Contributor Author

@qiulaidongfeng I believe your comment is addressed by #64777 (comment). In short, LockOSThread during init does not compose automatically with mainthread.Do, because they both act on a single resource, the main thread. There is no backwards compatibility issue, however, because this proposal doesn't affect the behaviour of LockOSThread during init.

@eliasnaur
Contributor Author

@apparentlymart the original proposal says to panic in c-shared/c-archive mode, but I'm not against CanDo or the like.

@gopherbot
Contributor

Change https://go.dev/cl/628815 mentions this issue: runtime/mainthread: new package

@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Nov 20, 2024
@rsc
Contributor

rsc commented Dec 4, 2024

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals Dec 4, 2024
@aclements
Member

I think we need to re-ground this discussion in concrete use cases. I'm sure at least some of this will be me asking you to repeat what's already been said in #64777, but I think re-consolidating this information will be helpful.

Let's define main thread to mean the OS-created thread that started the process, and define startup thread to mean the thread we run Go init functions on. In typical Go binaries, these are one and the same. In c-shared and c-archive mode, the Go runtime always creates a new thread to run init functions, and exits that thread after init functions are done, so the startup thread is not the main thread. There's also a library load thread, which is the thread that first calls into the Go runtime in c-shared and c-archive mode. This may be the main thread or may be another thread, but the Go runtime relinquishes control of this thread very quickly.

What are the situations where a library needs to be called on the main thread (and not just consistently on some thread, and not just on the startup thread), and the platform doesn't provide a mechanism for calling code on the main thread? Can you give concrete examples so we have something to ground the requirements in?

How are libraries even sensitive to this? Do they behave differently from C if you link against them at build time (statically or dynamically) versus if you dlopen them at run time? (The abstract case we were able to come up with in proposal review is that a native library has global constructors/ELF initializers and is sensitive to other functions running on the same thread. But even that isn't necessarily the main thread if that library gets dlopened.)

@eliasnaur
Contributor Author

eliasnaur commented Dec 20, 2024

In c-shared and c-archive mode, the Go runtime always creates a new thread to run init functions, and exits that thread after init functions are done, so the startup thread is not the main thread.

Thank you for this nugget. I'm very surprised that the library load thread doesn't run Go init functions, which seems to imply that a c-shared/c-archive Go function may be called by C concurrently with Go init functions. This behaviour also seems inconsistent with constructors/ELF initializers which I believe complete before dlopen returns.

What are the situations where a library needs to be called on the main thread (and not just consistently on some thread, and not just on the startup thread), and the platform doesn't provide a mechanism for calling code on the main thread? Can you give concrete examples so we have something to ground the requirements in?

I know of only one example. Windows StartServiceCtrlDispatcher called by (among others) golang.org/x/sys/windows/svc.Run[0]. I don't know of a Windows facility for calling functions on the main thread.

Other issues mention Linux namespace ("container") APIs. However, from a very cursory glance (and no experience), they don't seem to require the main thread, merely some thread. @thediveo may have more information (from comment)

[0]: Incidentally, svc.Run doesn't document that it needs to run on the main thread, even though the StartServiceCtrlDispatcher documentation says:

Connects the main thread of a service process to the service control manager, which causes the thread to be the service control dispatcher thread for the calling process.

@thediveo
Contributor

thediveo commented Dec 20, 2024

Other issues mention Linux namespace ("container") APIs. However, from a very cursory glance (and no experience), they don't seem to require the main thread, merely some thread. @thediveo may have more information (from comment)

Correct. In fact, it is even better (though not strictly necessary) to do Linux namespace switching on OS-level threads (tasks) other than the main/initial thread, because in some cases you end up with throw-away threads so as not to leak namespace state. If you do this on the main/initial thread, it becomes (in Go runtime parlance) "wedged", an idle thread. You cannot simply kill this thread because then some relevant process information becomes inaccessible to the other threads of the same process.

And yes, to the Linux kernel, all threads are to some extent created equal, as the main thread representing the process is still only a thread, albeit a group leader for organizational/orchestration purposes. These tasks can share certain resources to make them look like threads of the same task, but the Linux kernel allows some highly useful things, like a thread opting out of this sharing to do some useful shenanigans.

@ianlancetaylor
Member

I'm very surprised that the library load thread doesn't run Go init functions, which seems to imply that a c-shared/c-archive Go function may be called by C concurrently with Go init functions.

When this happens, the call to the Go function blocks until the Go init functions have completed. See https://go.googlesource.com/go/+/refs/heads/master/src/runtime/cgocall.go#409 and also https://go.googlesource.com/go/+/refs/heads/master/src/runtime/cgo/gcc_libinit.c#52.
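
As a loose analogy in plain Go (the real mechanism lives in the runtime and cgo support code linked above), the gating can be pictured as calls waiting on a channel that is closed once initialization finishes. The library type and load function below are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// library models a c-shared library whose init functions run on a
// separate startup goroutine. Hypothetical type for illustration only.
type library struct {
	ready chan struct{} // closed when initialization has completed
	value int
}

// load starts initialization in the background and returns immediately,
// much like the library load thread handing off to a startup thread.
func load(initFn func(*library)) *library {
	l := &library{ready: make(chan struct{})}
	go func() {
		initFn(l)
		close(l.ready)
	}()
	return l
}

// Value models an exported function: it blocks until initialization is
// done, so callers never observe partially initialized state.
func (l *library) Value() int {
	<-l.ready
	return l.value
}

func main() {
	l := load(func(l *library) { l.value = 42 })
	fmt.Println(l.Value())
}
```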

@eliasnaur
Contributor Author

Thanks Ian for the clarification.

How are libraries even sensitive to this? Do they behave differently from C if you link against them at build time (statically or dynamically) versus if you dlopen them at run time?

As far as I know, all main thread requirements originate in the OS kernel (or OS libraries), and leak through to libraries. For this reason, I don't think build time versus runtime library loading makes a difference.

@aclements
Member

In proposal review, we're realizing that we've lost track of the motivation for this whole change.

Thanks for the example of StartServiceCtrlDispatcher.

In #64755, you mentioned:

Some APIs, most notably macOS' AppKit and iOS' UIKit require exclusive control of the startup thread.

Given that this proposal has been through a lot, are these still good driving examples to be focusing on? Are these the only examples we can find?

@eliasnaur
Contributor Author

eliasnaur commented Jan 9, 2025

Android GUI API also requires exclusive control of the main thread, but because you're forced to run in c-shared mode, mainthread.Do is not necessary (and hard to implement).

The only other APIs I can think of are the Linux(/etc.?) container APIs, but I'm no expert. Perhaps @thediveo can provide examples.

@thediveo
Contributor

thediveo commented Jan 9, 2025

The APIs of Docker, containerd and the k8s CRI are all with rock solid client-server architecture, so there are no restrictions to specific OS threads.

Podman only does so when deployed socket-activated as a server and used through its remote API socket. Using the podman grpc client is a mess, though, but that concerns unwanted namespace moves of the calling process, not a main thread restriction.

Linux per se has mainly the idiosyncrasy that certain elements in the procfs become inaccessible when the main thread dies, but Go covers this by "wedging" G0 in this case. This also happens in a similar way with other operating systems.

My understanding for this issue is that we can safely ignore all this, because we're dealing with the prominent use cases of UI libraries in particular.

@aclements
Member

Thanks. Given this, it's our understanding that this API is thus only necessary for UI toolkits and then only on macOS Catalyst and iOS.

Given this context, do we need to support Yield or Waiting? It sounds like the answer is "no", in which case this more limited API is sufficient.

The other issue was implementing this for c-shared and c-archive mode. Are those needed for macOS Catalyst or iOS? It sounds like there's currently not even a workaround in these build modes, suggesting it's somehow not a problem. @eliasnaur (or anyone else), could you provide more clarity on that?

@eliasnaur
Contributor Author

eliasnaur commented Jan 23, 2025

Thanks. Given this, it's our understanding that this API is thus only necessary for UI toolkits and then only on macOS Catalyst and iOS.

What about macOS' AppKit and Windows' StartServiceCtrlDispatcher? AppKit provides affordances to fetch and dispatch events, but that's a red herring: gestures such as drag-window-edge-to-resize are modal and will block the main thread until completed.

Given this context, do we need to support Yield or Waiting? It sounds like the answer is "no", in which case this more limited API is sufficient.

I agree.

The other issue was implementing this for c-shared and c-archive mode. Are those needed for macOS Catalyst or iOS? It sounds like there's currently not even a workaround in these build modes, suggesting it's somehow not a problem. @eliasnaur (or anyone else), could you provide more clarity on that?

By "workaround", do you mean the LockOSThread-during-init trick? There's no equivalent in c-archive or c-shared programs, but it's easy to arrange for the host environment to call into Go on the main thread[0].

It would be nice to allow a GUI Go package to work in all build modes, but the proposed panic behaviour does increase the amount of preparation to call Do. To illustrate, here is a sketch program for macOS AppKit:

//go:build darwin

package gui

/*
#cgo CFLAGS: -x objective-c
#cgo LDFLAGS: -framework Cocoa
#import <Cocoa/Cocoa.h>
#include <stdint.h>

// Implemented in Go below; declared here so the block can call it.
void callGoFunc(uintptr_t h);

static void NSApp_run(void) {
    [NSApp run];
}

// Run a function on the main thread using native API.
static void runOnMainThread(uintptr_t f) {
    dispatch_async(dispatch_get_main_queue(), ^{
        callGoFunc(f);
    });
}
*/
import "C"

import (
    "runtime/cgo"
    "runtime/mainthread" // the package proposed in this issue
    "sync"
)

//export callGoFunc
func callGoFunc(h C.uintptr_t) {
    handle := cgo.Handle(h)
    defer handle.Delete() // each handle is used exactly once
    f := handle.Value().(func())
    f()
}

var mainOnce sync.Once

func NewWindow() *Window {
    mainOnce.Do(func() {
        go func() {
            defer func() {
                if err := recover(); err != nil {
                    // Probably c-archive or c-shared mode.
                }
            }()
            // Note that C.runOnMainThread is not going to work
            // as long as the Go runtime controls the main thread.
            mainthread.Do(func() { C.NSApp_run() }) // never returns.
        }()
    })
    // Create a new window, knowing that the main thread event loop
    // is running.
    // Note that this is not using mainthread.Do, because the Go runtime
    // no longer has control over the main thread; it is blocked inside
    // [NSApp run].
    C.runOnMainThread(C.uintptr_t(cgo.NewHandle(func() {
        ...
    })))
}

The ceremony for calling mainthread.Do is quite long:

  • A sync.Once to not leak a goroutine every call to NewWindow.
  • A goroutine because either C.NSApp_run or mainthread.Do blocks forever.
  • A recover to tolerate c-* modes (the host must call [NSApp run]).

Ceremony suggests the API is not quite right. For the sake of comparison, here's a hypothetical mainthread.Loop that is precisely tailored to the use case of forever-running event loops:

package mainthread

// Loop schedules f to be called on the main thread.
// Loop returns immediately and does not wait for f to return.
// Once a function is scheduled, every subsequent call is ignored.
//
// Calls when the runtime doesn't control the main thread are
// ignored. This applies to c-shared and c-archive programs and
// programs that call [LockOSThread] during init.
func Loop(f func())

Loop eliminates all ceremony:

func NewWindow() *Window {
    mainthread.Loop(func() { C.NSApp_run() })
    ...
}
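
The documented Loop semantics (return immediately, ignore every call after the first, ignore calls when the main thread is unavailable) can be sketched on top of any Do-like function. This is only an illustration: looper is a hypothetical type, and its do field stands in for mainthread.Do so the sketch runs without the proposed package.

```go
package main

import (
	"fmt"
	"sync"
)

// looper sketches the hypothetical Loop on top of a Do-like function.
type looper struct {
	once sync.Once
	do   func(func()) // stand-in for mainthread.Do
}

// Loop schedules f once and returns without waiting for it.
// Later calls are ignored.
func (l *looper) Loop(f func()) {
	l.once.Do(func() {
		go func() {
			// Ignore a panic from do, as when the runtime does not
			// control the main thread. This sketch does not separate
			// that case from a panic raised by f itself.
			defer func() { _ = recover() }()
			l.do(f)
		}()
	})
}

func main() {
	done := make(chan struct{})
	l := &looper{do: func(f func()) { f() }}
	l.Loop(func() {
		fmt.Println("event loop started")
		close(done)
	})
	l.Loop(func() { fmt.Println("ignored: a loop is already scheduled") })
	<-done
}
```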

[0]: In fact, Gio calls main even in Android's c-shared mode to paper over the missing support for buildmode=exe programs on Android.

@andydotxyz
Contributor

On this topic Fyne is just completing a thread model migration in which we implemented a Do function to call on the main goroutine.

What we discovered in the process is that you likely need two versions - one which will wait until completed and the other is just scheduling the call and returning without waiting (likely an immediate return).
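
A minimal sketch of the two variants, assuming a single goroutine locked to one OS thread that drains a queue. This is not Fyne's implementation; threadQueue, post and call are hypothetical names chosen for the illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// threadQueue runs submitted functions on one OS thread, in order.
type threadQueue struct {
	work chan func()
}

func newThreadQueue(size int) *threadQueue {
	return &threadQueue{work: make(chan func(), size)}
}

// run pins the calling goroutine to its OS thread and executes queued
// functions until the queue is closed.
func (q *threadQueue) run() {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()
	for f := range q.work {
		f()
	}
}

// post schedules f and returns without waiting for it to run.
// It only blocks if the queue buffer is full.
func (q *threadQueue) post(f func()) {
	q.work <- f
}

// call schedules f and waits until it has returned. Calling it from a
// function that is itself running on the queue would deadlock.
func (q *threadQueue) call(f func()) {
	done := make(chan struct{})
	q.work <- func() {
		defer close(done)
		f()
	}
	<-done
}

func main() {
	q := newThreadQueue(16)
	go q.run()
	q.post(func() { fmt.Println("posted: the caller did not wait") })
	q.call(func() { fmt.Println("called: the caller waited") })
	close(q.work)
}
```

Because the queue is FIFO with a single consumer, a call that returns also guarantees that every earlier post has finished.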

Honestly we have been blown away by how fast the goroutine context switching is, having more builtin functionality to handle these would be a boost for sure.

I agree that the questions above about specificity are critical - as the "main" routine may or may not truly be what people need. If this API can tie it to a clearly defined thread in the Go ecosystem we should be best, rather than setting expectations based on OS or other system "thread". Part of me wonders if this may need to (or in the future consider) allowing insertion into a specified goroutine instead? (i.e. main vs startup vs graphics ...)

@aarzilli
Contributor

Honestly we have been blown away by how fast the goroutine context switching is, having more builtin functionality to handle these would be a boost for sure.

Not if LockOSThread is involved #21827

@andydotxyz
Contributor

Honestly we have been blown away by how fast the goroutine context switching is, having more builtin functionality to handle these would be a boost for sure.

Not if LockOSThread is involved #21827

Even with that on I was getting around 2'500'000 goroutine context changes per second which is surprising to me and more than enough for most apps.

@rsc
Contributor

rsc commented Feb 5, 2025

It sounds like the claim is that we need mainthread.Do only to run an event loop that never returns, and then at that point there is no portable way to run another Go function on the main thread. If apps know which event loop is running they are encouraged to use cgo to communicate directly with it. That seems a little unsatisfying, but perhaps it is sufficient. Is it?

It sounds more like mainthread.Take than mainthread.Do. Perhaps it should panic when f returns?

@andydotxyz
Contributor

It sounds like the claim is that we need mainthread.Do only to run an event loop that never return

That's not my take on it at all - from Fyne's point of view the Do is to allow a given function to run on the main thread. Yes it probably needs an event loop to make this possible, but the outcome is having routines join main to complete briefly rather than a standard way to block the main thread...

@ianlancetaylor
Member

@andydotxyz Can you expand on exactly when Fyne needs to run something on the main thread, and why? Thanks.

@ianlancetaylor
Member

In my mind running code on "the correct thread" is very different from running code on "the main thread." Code that has to run on the main thread is for a small, though important, set of cases. Code that has to run on a specific thread is a much larger set of cases, but fortunately is also much easier to implement. We shouldn't try to mix the complicated case of running on the main thread with the simpler case of running on a specific thread.

@andydotxyz
Contributor

Apologies if I was being too vague. "the correct thread" in my previous message meant the main thread on all platforms except macOS where it can be main or the thread which the graphics was initialised on (which is often main but does not have to be). I hope that helps.

@ianlancetaylor
Member

@andydotxyz Thanks, let's try to pin this down. In #70089 (comment) @aclements asked exactly when we need to support this. See their summary at #70089 (comment). I'm not sure that your requirements got recorded anywhere. Can you describe the exact environments in which code needs to run on the main thread? Thanks.

@andydotxyz
Contributor

My understanding is that to access the graphical context correctly you will need to call the functions from the main thread on Windows, Linux, Android and iOS. On macOS this is typically done as well by convention (When using Apple's AppKit I think that is the default) though technically it could be a different thread as long as it is consistent for the life of the app.

Perhaps I read the title and assumed a larger scope than it is being whittled down to. Given that init/LockOSThread exists I find the Do more useful than the Take - but if the idea is to remove the usage of init then I could see the value of the latter.

@ianlancetaylor
Member

ianlancetaylor commented Feb 6, 2025

I could certainly be wrong, but that is not my current understanding. My understanding is that on Linux systems, with some GUI packages, you need to consistently call the GUI on the same thread. But there is no requirement that this be the main thread of the application. It can be any thread, as long as it's consistent. Linux in general does not care about the main thread of the application at all (except that it sometimes makes a difference if the main thread exits).

@rsc
Contributor

rsc commented Feb 12, 2025

@andydotxyz, it sounds like even if those graphic frameworks do require running code on the main thread, they also require running an event loop (that they provide) on the main thread, so getting other code to run on the main thread necessarily requires coordination with the external event loop. That's not something Go can do easily, so I think @eliasnaur is saying that we should just provide a way to start the external event loop and then it's up to the client of that event loop to coordinate with it to run other Go code from time to time.

The question is whether that's (1) true and (2) the only use case for running code on the main thread.

Can anyone link to references that support answers to either of those questions?

@andydotxyz
Contributor

I don't think it's true to say that you must also run an event loop that the graphic framework provides. Though it depends on who "they" is I suppose.

Yes some code needs to listen for events and probably coordinate activities in some way - but it doesn't require a dedicated event thread nor (in most cases) executing some C based event loop from the OS / graphics framework (except macOS perhaps).

I'm not sure what makes Go unsuitable for this, but that's ok.
As I mentioned above, "starting the event loop on main" is already possible in the language, so that's fine for us. What seemed nice was a way to execute on that context later (mainthread.Do), but as the discussion seems to be focused more on replacing the init/LockOSThread pattern (mainthread.Take), it's less interesting for my usage because a solution already exists.

Aside from these discussions Fyne has already implemented a thread management system and in the upcoming release this includes a fyne.Do, so in essence we are covered. But it did require a lot of custom Go code to manage access to this special main thread.

@ianlancetaylor
Member

This proposal is focused only on code that must execute on the main thread of the process, by which we mean the first thread created by the system when starting the process.

Can you point us to the custom Go code that you had to implement? Thanks.

@andydotxyz
Contributor

andydotxyz commented Feb 13, 2025

This proposal is focused only on code that must execute on the main thread of the process, by which we mean the first thread created by the system when starting the process.

Yes indeed, though with the distinction between mainthread.Take and mainthread.Do, and the emphasis leaning towards the former, the discussion may no longer solve the problem I understood it to address from the title of the issue.

Can you point us to the custom Go code that you had to implement? Thanks.

The main loop of our driver is at:
https://github.com/fyne-io/fyne/blob/2ba364149b76beab2ad5de05041782aeeeebf421/internal/driver/glfw/loop.go#L138

This supports inserted functions called into the driver at:
https://github.com/fyne-io/fyne/blob/2ba364149b76beab2ad5de05041782aeeeebf421/internal/driver/glfw/loop.go#L40

Which is made publicly available through a helper:
https://github.com/fyne-io/fyne/blob/2ba364149b76beab2ad5de05041782aeeeebf421/thread.go#L18

The run loop is started when an app runs (i.e. after a := app.New(), calling a.Run() fires up the driver loop linked at the top).

As you can see, we use init/LockOSThread (in loop.go); with this, both the loop setup and all requested functions can run on the main thread.

(all the links above go to the develop branch as we are still testing this approach and have not yet released a final - it's available under a v2.6.0-alpha1 version at the moment.)

@eliasnaur
Contributor Author

@andydotxyz, it sounds like even if those graphic frameworks do require running code on the main thread, they also require running an event loop (that they provide) on the main thread, so getting other code to run on the main thread necessarily requires coordination with the external event loop. That's not something Go can do easily, so I think @eliasnaur is saying that we should just provide a way to start the external event loop and then it's up to the client of that event loop to coordinate with it to run other Go code from time to time.

The question is whether that's (1) true and (2) the only use case for running code on the main thread.

Can anyone link to references that support answers to either of those questions?

All of the GUI frameworks that require running code on the main thread also require their event loop to take (or have) control of that thread. This applies to macOS AppKit and Catalyst, iOS UIKit, Android. Windows and Linux GUI require some thread, but not the main thread in particular.

As for your (2), I can only offer StartServiceCtrlDispatcher. It runs an event loop on the main thread, but it is otherwise not for GUI and AFAIK there is no other API that must run on the main thread (and so there's no facility to schedule code on the main thread either). StartServiceCtrlDispatcher does fit the mainthread.Take design.

@ianlancetaylor
Member

@andydotxyz Thanks. From that description it sounds like mainthread.Do will do what you need. The library would call it to run the event loop, which would run forever.

I think it's too much to ask the Go standard library to support two different packages that must run on the main thread.

@andydotxyz
Contributor

This conversation is feeling a little circular now, the "Do" you've described sounds like it can only be executed once if the intent is for main loops. In which case the "Take" naming does make more sense.

Anyhow, I'm not sure why a function that simply replaces init/LockOSThread is worth a new package.

Maybe "os.TakeMainThread()" is more consistent with the current API and has a smaller impact by not needing the new package?

@ianlancetaylor
Member

This conversation is feeling a little circular now, the "Do" you've described sounds like it can only be executed once if the intent is for main loops. In which case the "Take" naming does make more sense.

Personally I think mainthread.Do is clear: it runs something on the main thread. If the function never returns, then indeed the main thread will never be usable by any other code. So be it. That seems consistent and clear. But if it does return, then the main thread will be available. That also seems consistent and clear. But I don't feel strongly about the name.

Anyhow, I'm not sure why a function that simply replaces init/LockOSThread is worth a new package.

Because init/LockOSThread can only be done in the main package. The goal of mainthread.Do is to permit a GUI package that requires the main thread to be standalone, without requiring people using that package to understand that they need to do a little dance in the main package of every program that uses the GUI package.

@andydotxyz
Contributor

without requiring people using that package to understand that they need to do a little dance in the main package of every program that uses the GUI package.

This feels like a strange characterisation; maybe I'm still not on the same page. When someone wants to use Fyne they import fyne.io/fyne/v2/app and call app.New. This will run the init and LockOSThread code without the developer knowing any of the details. Yes, they do have to call App.Run() before main exits, but I don't see how that changes.

@eliasnaur
Contributor Author

With mainthread.Do the call to App.Run can become optional, and you can mix GUI packages that otherwise have no relation to each other.
For Windows services, the call to svc.Run can become optional and doesn't need to take up the main goroutine.

@andydotxyz
Contributor

By "optional" I guess you mean that app developers can choose to manage the main loop on their own through use of mainthread.Do. There is no escaping the need for an event loop somewhere, right?

So in response, the libraries will all expose a "process my events" and "process draws" and any other internal details that were in the default loop before?

@eliasnaur
Contributor Author

No, I mean the GUI package that needs a running main loop can initialize it on demand, say in its NewWindow or equivalent. See sketch in #70089 (comment).

@andydotxyz
Contributor

Won't that conflict with whatever may be running in main()?

If the idea is for this to be callable from anywhere, but also for it to launch event loops, will such a loop wait for main to return, will it block the current main, or will the two run interspersed, both in the main context?

@ianlancetaylor
Member

Go does not promise that the main function runs in the main thread. There is no conflict there in general.

Except for the special case where an init function calls runtime.LockOSThread and does not unlock it. In that special case, mainthread.Do will block until the main function calls runtime.UnlockOSThread.

@andydotxyz
Contributor

Go does not promise that the main function runs in the main thread. There is no conflict there in general.

Except for the special case where an init function calls runtime.LockOSThread and does not unlock it. In that special case, mainthread.Do will block until the main function calls runtime.UnlockOSThread.

Good point, I had forgotten that. So in this case the scheduler would vacate the mainthread?

@ianlancetaylor
Member

Yes, the scheduler would preempt any goroutine running on the main thread and have it start running the mainthread.Do function.

@aclements
Member

I think I see three high-level options here:

  1. A Take(f func()) function that can be called at most once in a process's lifetime. On a subsequent call, it panics, even if f has returned. (Making subsequent calls a no-op was mentioned, but that sounds like a recipe for confusing bugs.) This could be blocking or non-blocking.
  2. A Take(f func()) function that can be called at most once at a time in a process. If Take is called again before an earlier Take has returned, it panics (or maybe returns an error). Take blocks until f returns, and after Take returns, it can be called again.
  3. A Do(f func()) call that runs f on the main thread, blocking until any pending Do calls finish and f runs and returns. We might need a non-blocking form that fails if it can't start f immediately, so you can say "if I can't start my event loop right away, something's wrong."

There's also a Do function that can start multiple goroutines all logically bound to the main thread that Go can schedule between. It sounds like we don't want this because in practice it's often used to start a platform event loop that will never get back to the Go scheduler anyway. (I'm okay with this, because this approach is also really hard to implement!)

For event loops in an application context, any of these are sufficient, though option 3 may lead to confusing failures if there are two attempts to start a forever-running event loop.

Hypothetically, in a testing context, I could see an application wanting to stop the event loop and start a new one for each test, which is possible with option 1 but certainly more natural and composable with option 2.

It seems there are two categories of event loops: platform event loops (e.g., used by Gio), which don't interact with the Go scheduler but generally provide a way to inject work; and Go event loops (e.g., used by Fyne). I think the main argument for option 3 is that it pushes the event loop into the runtime itself. This may seem more composable because different packages could submit work without coordination. But I think this is an illusion. The moment you start a platform event loop, you effectively lock up the runtime's event loop with no way to indicate that it can no longer handle short work. I think what ultimately sank #64777 was trying to integrate both a platform event loop and a runtime event loop and that led to significant complexity.

To me, this all points at option 2.

The good news is we don't need Yield or Waiting.

Maybe this should live in runtime or os instead of its own package. That's a separable question.

Build modes

There's still the problem of running in a build mode where we have no control over the main thread. I see two options here:

  1. Take always fails in these build modes.
  2. If the runtime can tell if it's running on the main thread, Take could succeed on that thread and fail if you're not on that thread. This is possible on Linux (gettid() == getpid()) and on the BSDs, including macOS. I haven't been able to figure out if this is possible on Windows.
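As an illustration of the Linux check mentioned in option 2, the comparison can be written from ordinary Go code with the syscall package. This is a Linux-only sketch; onMainThread and initOnMain are hypothetical names, and the claim that initialization runs on the main thread assumes an ordinary executable rather than c-archive or c-shared mode.

```go
//go:build linux

package main

import (
	"fmt"
	"runtime"
	"syscall"
)

// onMainThread reports whether the calling goroutine is currently executing
// on the process's main thread. On Linux the main thread's ID equals the
// process ID.
func onMainThread() bool {
	return syscall.Gettid() == syscall.Getpid()
}

// initOnMain records the result during package initialization, which the
// Go runtime performs on the main thread in an ordinary executable.
var initOnMain = onMainThread()

func main() {
	// Pin this goroutine to one thread so the answer cannot change
	// between the check and the print.
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	fmt.Println("init ran on main thread:", initOnMain)
	fmt.Println("main is on main thread: ", onMainThread())
}
```

Because init did not lock the thread here, the second line may print either value, which matches the point that main is not promised to run on the main thread.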

The advantage of option 2 is that the main binary at least has the possibility of coordinating with Go to call Take on the main thread. On the other hand, if you have that much control over the main binary, maybe you can just start the event loop from there.

@eliasnaur
Contributor Author

Thank you for the careful analysis, in particular for bringing up stopping event loops and making Take work in other build modes.

It seems to me your option 2 Take will require the ceremony listed in the sketch. I ask you to reconsider my Loop proposal (I believe your option 1 Take), perhaps amended to returning an error. That is, a Take that starts a new goroutine bound to the main thread, otherwise returns an error. If the goroutine exits, the next call to Take will start another.

The upside of a non-blocking Take is that you no longer need ceremony (once.Do, goroutine, recover).
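For readers without the earlier sketch at hand, the ceremony could look roughly like the following. This is an illustrative reconstruction under stated assumptions, not the sketch from the linked comment: mainThread.take is a stand-in for a blocking Take that panics on overlapping calls (it does not switch threads), and gui, ensureLoop and wait are hypothetical names. The wait method exists only so the example can observe the outcome.

```go
package main

import (
	"fmt"
	"sync"
)

// mainThread stands in for a blocking, multi-shot Take that panics when
// called while another call is in progress. It is not a real API.
type mainThread struct{ mu sync.Mutex }

func (m *mainThread) take(f func()) {
	if !m.mu.TryLock() {
		panic("take: main thread already taken")
	}
	defer m.mu.Unlock()
	f()
}

// gui models one GUI package that lazily starts its event loop.
type gui struct {
	once sync.Once
	wg   sync.WaitGroup
	mt   *mainThread
}

// ensureLoop is the ceremony: once.Do, a goroutine, and a recover.
func (g *gui) ensureLoop(loop func()) {
	g.once.Do(func() {
		g.wg.Add(1)
		go func() {
			defer g.wg.Done()
			// Swallow the panic from take if someone else holds the
			// thread. Note this also hides panics raised by loop itself.
			defer func() { _ = recover() }()
			g.mt.take(loop)
		}()
	})
}

// wait blocks until this package's attempt has finished.
func (g *gui) wait() { g.wg.Wait() }

func main() {
	mt := &mainThread{}
	a, b := &gui{mt: mt}, &gui{mt: mt}

	started, quit := make(chan struct{}), make(chan struct{})
	a.ensureLoop(func() { close(started); <-quit })
	<-started // a's loop now owns the stand-in main thread

	bRan := false
	b.ensureLoop(func() { bRan = true })
	b.wait() // b's attempt panicked inside take and was recovered
	fmt.Println("b's loop ran:", bRan)

	close(quit)
	a.wait()
}
```

With an error-returning Take the recover line disappears, and with a non-blocking Take the goroutine does too.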

We might need a non-blocking form that fails if it can't start f immediately, so you can say "if I can't start my event loop right away, something's wrong."

I argue that you generally won't care about the failure of Take to give you the main thread. The event loop merely has to be started by someone, but it doesn't matter who. It might have been done by the caller already or a C driver in c-archive/c-shared mode.
Even if Take fails (say in c-archive/c-shared mode) and no-one else has started the event loop, the worst that happens is that the subsequent calls to the platform's runOnMainThread will hang until something does start the loop.

Since a failing Take is normal and (usually) non-fatal, I believe any variant of Take shouldn't panic.

You probably do care about Take failing in a testing context or when you otherwise assume complete control of the environment, which is what the error return is for.

@andydotxyz
Contributor

3. A Do(f func()) call that runs f on the main thread, blocking until any pending Do calls finish and f runs and returns. We might need a non-blocking form that fails if it can't start f immediately, so you can say "if I can't start my event loop right away, something's wrong."

I agree it seems that Take vs Do are different high level ways to go about this, and it isn't totally clear which satisfies the requirements.
Just to add two cents here, Fyne fully implemented its own looping (integrated with system event loops where required) and has moved to a Do approach as outlined above. We did find the need for an async version too, and ended up with fyne.DoAndWait (sync) with fyne.Do being async. The latter is used most of the time but waiting is essential in some inner situations and also in many unit tests.

@aclements
Member

It seems to me your option 2 Take will require the ceremony listed in #70089 (comment). I ask you to reconsider my Loop proposal (I believe your option 1 Take), perhaps amended to returning an error. That is, a Take that starts a new goroutine bound to the main thread, otherwise returns an error. If the goroutine exits, the next call to Take will start another.

Thanks for connecting back to your earlier comment.

I personally don't find the "too much ceremony" argument very compelling. This is 10 lines that's going to be written in maybe half a dozen libraries? We can make it return an error instead of panicking and get it down to ~5 lines.

OTOH, the platform APIs provide ways to exit and restart the event loop. A single-shot Take takes an API that exposes (and is necessary because of) the lack of composability in the platform APIs and makes it even less composable than the platform APIs.

I argue that you generally won't care about the failure of Take to give you the main thread. The event loop merely has to be started by someone, but it doesn't matter who.

This assumes that there's a singular way to start the event loop. I don't think that's true. E.g., on Windows this could be used to start a GUI GetMessage loop or a Windows service. On Unixes, this could be used to start a Gtk event loop or a Qt event loop or an X event loop, etc. If a program tries to start incompatible event loops, I would like that not to fail silently.

One counter-argument is that this API is specifically about event loops that must run on the main thread, which almost none of my examples fall into. But if it's really the case that each platform has a singular kind of event loop with a singular implementation that must be done on the main thread, then I would argue that this whole API is at the wrong level and there should be a StartEventLoop() in x/sys/....

@eliasnaur
Contributor Author

It seems to me your option 2 Take will require the ceremony listed in #70089 (comment). I ask you to reconsider my Loop proposal (I believe your option 1 Take), perhaps amended to returning an error. That is, a Take that starts a new goroutine bound to the main thread, otherwise returns an error. If the goroutine exits, the next call to Take will start another.

Thanks for connecting back to your earlier comment.

I personally don't find the "too much ceremony" argument very compelling. This is 10 lines that's going to be written in maybe half a dozen libraries? We can make it return an error instead of panicking and get it down to ~5 lines.

OTOH, the platform APIs provide ways to exit and restart the event loop. A single-shot Take takes an API that exposes (and is necessary because of) the lack of composability in the platform APIs and makes it even less composable than the platform APIs.

I realize it wasn't clear, but I wrote the amendment that "If the goroutine exits, the next call to Take will start another." That is, I proposed a multi-use Loop.

I argue that you generally won't care about the failure of Take to give you the main thread. The event loop merely has to be started by someone, but it doesn't matter who.

This assumes that there's a singular way to start the event loop. I don't think that's true. E.g., on Windows this could be used to start a GUI GetMessage loop or a Windows service. On Unixes, this could be used to start a Gtk event loop or a Qt event loop or an X event loop, etc. If a program tries to start incompatible event loops, I would like that not to fail silently.

One counter-argument is that this API is specifically about event loops that must run on the main thread, which almost none of my examples fall into. But if it's really the case that each platform has a singular kind of event loop with a singular implementation that must be done on the main thread, then I would argue that this whole API is at the wrong level and there should be a StartEventLoop() in x/sys/....

StartEventLoop in x/sys/... sounds good to me.

@eliasnaur
Contributor Author

Note that StartEventLoop will need to take a callback similar to my Loop. The reason is that you likely want to initialize the environment (e.g. register an NSApplication delegate) before starting the event loop itself. An extra advantage is that the Go standard library and/or x/sys don't need to link and call the event loop runner; the callback can do that last.

@aclements
Member

I realize it wasn't clear, but I wrote the amendment that "If the goroutine exits, the next call to Take will start another." That is, I proposed a multi-use Loop.

Thanks, I had missed that.

Unfortunately, I think a non-blocking multi-shot Take is inherently racy. The only way (without complicating the API) to communicate that it's okay to call Take again is to signal, e.g., via a channel, that f is about to return and let the goroutine exit. But there's going to be a window between any communication from f and the goroutine actually exiting during which another call to Take would fail. (You could add a way for Take itself to communicate that it can be called again, but then you might as well just make it blocking so that signal is just that Take returns.)

Note that StartEventLoop will need to take a callback similar to my Loop.

Sorry, my StartEventLoop proposal was meant to be "If there's truly one way to do an event loop on a given platform, we should just implement that in x/sys." It was meant to be a proof by contrapositive that Take can't just ignore multiple calls (we can't have just one implementation in x/sys -> there's not just one way to do an event loop on a given platform -> Take can't assume its callback is idempotent), but I can see how that may have only been clear in my own brain. 😆

@aclements
Member

Circling back to this, it seems like one dimension is single-shot vs multi-shot:

  1. Single-shot Take can only be called once in a process' lifetime.
  2. Multi-shot Take can be called more than once, but only once at a time.

Failure in either case can be a panic or an error return. I'd lean toward panicking because misuse seems like a programmer error to me, but either way works.

Another question is blocking vs non-blocking. Single-shot Take can be either. But multi-shot Take must be blocking. A non-blocking multi-shot Take has a window, between when your callback signals that it's about to exit and when it has truly exited, during which you don't know if it's safe to call Take again.

I'm still inclined toward multi-shot Take primarily because it's about as composable as this can be. The fact that it has to be blocking does mean there needs to be a little more ceremony around invoking it, but it's only a few lines and this is only going to be done in a handful of places across all Go code.

The question of what to do with c-shared and c-archive is still open.

There's also the question of how this interacts with init. I rather like how you specified this at the very top:

// Each package's initialization functions always run on the main thread,
// as if by successive calls to Do(init).
//
// For compatibility with earlier versions of Go, if an init function calls [runtime.LockOSThread], 
// then package main's func main also runs on the main thread, as if by Do(main).

...

// Package initialization functions run as if by Do(init).
// If an init function calls [runtime.LockOSThread], then package main's func main
// runs as if by Do(main), until the thread is unlocked using [runtime.UnlockOSThread].

This particular phrasing requires a multi-shot Take. It also implies that if an init function calls Take, that must fail, and if main starts with the OS thread locked, calling Take anywhere must fail.

@eliasnaur
Contributor Author

Failure in either case can be a panic or an error return. I'd lean toward panicking because misuse seems like a programmer error to me, but either way works.

How is a failing Take a programmer error? The use-case most important to me is: two unrelated Go packages that both want to, say, show a window. To do that (on some platforms) the main event loop must be lazily started, if not already started. Say:

package gui

func NewWindow() *Window {
   // Start the main event loop if it is not already started.
   go func() {
       // Will this call panic if the event loop was already running?
       mainthread.Take(func() {
           ...
       })
   }()
   // Create the window, show it, etc.
}

My point is, NewWindow doesn't know whether it's safe to call Take before calling it; in particular, it doesn't know whether some other Go package (or the caller itself) has started the event loop. There's no way to ask whether the main loop has started (any such facility would be racy).
