|
|
Created:
14 years, 8 months ago by brainman Modified:
14 years, 7 months ago Reviewers:
CC:
rsc, golang-dev Visibility:
Public. |
Descriptionruntime: new windows stdcall implementation
Patch Set 1 #
Total comments: 6
Patch Set 2 : code review 2009045: runtime(windows): make sure scheduler runs on os stack ... #
Total comments: 14
Patch Set 3 : code review 2009045: runtime(windows): make sure scheduler runs on os stack ... #
MessagesTotal messages: 14
Seems reasonable. http://codereview.appspot.com/2009045/diff/1/2 File src/pkg/runtime/386/asm.s (right): http://codereview.appspot.com/2009045/diff/1/2#newcode60 src/pkg/runtime/386/asm.s:60: SUBL $(0xbbb), SP Can drop the parens here: 0xbbb. But you should comment what is going on here. I have no idea. I don't believe that m->osstack is necessary. It should be sufficient to use something like m->g0->stackguard - 1024 to grab space below the standard g0 stack. http://codereview.appspot.com/2009045/diff/1/4 File src/pkg/runtime/windows/386/sys.s (right): http://codereview.appspot.com/2009045/diff/1/4#newcode34 src/pkg/runtime/windows/386/sys.s:34: MOVL AX, 4(SP) // return value I don't understand this. You're storing AX into 4(SP) as if this were a Go function, but it's a C function. The return value goes in AX. http://codereview.appspot.com/2009045/diff/1/4#newcode49 src/pkg/runtime/windows/386/sys.s:49: MOVL SP, m_osstack(CX) This shouldn't be necessary, as in 386/asm.s, but there is definitely something missing here. If threadstart is going to run on an operating system stack, one important thing it must do is set g->stackbase and g->stackguard to appropriate values for the system stack that it is executing on. I am surprised that calling stackcheck didn't trigger a crash.
Sign in to reply to this message.
http://codereview.appspot.com/2009045/diff/1/2 File src/pkg/runtime/386/asm.s (right): http://codereview.appspot.com/2009045/diff/1/2#newcode60 src/pkg/runtime/386/asm.s:60: SUBL $(0xbbb), SP > Can drop the parens here: 0xbbb. Done. > But you should comment what is going on here. > I have no idea. Sorry, I should have prompted you here. I'm getting my osstack for first m here. I'm planing it to live right after m->g0 stack and occupy the remainder of os stack. When any windows program starts, os is given 2 stack parameters SizeOfStackReserve and SizeOfHeapCommit (see ld/pe.c). First one tell os how big stack is going to get, and last tell os how big initial stack should be. When first thread of process starts, os allocates SizeOfHeapCommit bytes (in 4K pages) of real memory to start with. As thread runs and stack grows, it reaches beyond of that allocation and gets "access denied" error. The error is automagically handled by os by committing new 4k page and getting thread to continue. That is how stack grows, up to a limit of SizeOfStackReserve. Os needs to know SizeOfStackReserve so it can reserve hole in address space big enough to fit, if it ever comes to stack grow that big. There is one note here. Stack will grow one page at a time, no more. So if your stack grows by little bit at the time (less then 4k), you're ok, but if you try to reach too far ahead straight away, you will get non handled "access violation" and die. Our SizeOfStackCommit is 4K, first m->g0 is 8K, so if I allocate beyond that and try use it before first 8K are "committed" I get a problem. So, what I'm doing here is "touching" every 0xbbb byte at the beginning of stack to make sure all pages are "committed". I'm sure, this is windows specific and should move, but where and how? Another alternative, we can change value of SizeOfHeapCommit in pe header, but we need to know how much. Also CreateThread function uses SizeOfStackReserve and SizeOfStackCommit values for every new thread by default (we do specify "default" in our CreateThread call), so we have to be aware of that. > I don't believe that m->osstack is necessary. > It should be sufficient to use something like > m->g0->stackguard - 1024 to grab space > below the standard g0 stack. > For main m yes, but ever new m creates it's stack on the heap, not os stack, see proc.c/matchmg() just before it calls newosproc(). See what happens when cgo is in play? It does its own things. Are you saying we should all do what cgo does and get newosproc allocate m->g0 stack on os stack? http://codereview.appspot.com/2009045/diff/1/4 File src/pkg/runtime/windows/386/sys.s (right): http://codereview.appspot.com/2009045/diff/1/4#newcode34 src/pkg/runtime/windows/386/sys.s:34: MOVL AX, 4(SP) // return value On 2010/08/27 01:34:28, rsc1 wrote: > I don't understand this. You're storing AX into 4(SP) > as if this were a Go function, but it's a C function. > The return value goes in AX. > You're, probably <g>, right. I mixed the two. I'll remove and check. http://codereview.appspot.com/2009045/diff/1/4#newcode49 src/pkg/runtime/windows/386/sys.s:49: MOVL SP, m_osstack(CX) > If threadstart is going to run on an operating > system stack, ... It is running on os stack at the start of threadstart function. I thought someone will switch to m->g0 (that was allocated on heap matchmg()) after threadstart calls whatever it is (minit() I think). > ... one important thing it must do > is set g->stackbase and g->stackguard to > appropriate values for the system stack that it > is executing on. I'm ok with that, but we already have 8k of stack in m->g0. Why? > ... I am surprised that calling > stackcheck didn't trigger a crash. > ...mmm...
Sign in to reply to this message.
> When any windows program starts, os is given 2 stack parameters > SizeOfStackReserve and SizeOfHeapCommit (see ld/pe.c). First one tell os > how big stack is going to get, and last tell os how big initial stack > should be. When first thread of process starts, os allocates > SizeOfHeapCommit bytes (in 4K pages) of real memory to start with. As > thread runs and stack grows, it reaches beyond of that allocation and > gets "access denied" error. The error is automagically handled by os by > committing new 4k page and getting thread to continue. That is how stack > grows, up to a limit of SizeOfStackReserve. Os needs to know > SizeOfStackReserve so it can reserve hole in address space big enough to > fit, if it ever comes to stack grow that big. There is one note here. > Stack will grow one page at a time, no more. So if your stack grows by > little bit at the time (less then 4k), you're ok, but if you try to > reach too far ahead straight away, you will get non handled "access > violation" and die. I think this is overkill. Go won't have such big stacks, and calling into Windows C libraries presumably they already know about this restriction. > Our SizeOfStackCommit is 4K, first m->g0 is 8K, so if I allocate beyond > that and try use it before first 8K are "committed" I get a problem. > > So, what I'm doing here is "touching" every 0xbbb byte at the beginning > of stack to make sure all pages are "committed". I'm sure, this is > windows specific and should move, but where and how? 8K is two pages. At most you have to touch the word at -0x1000(SP). That's just one instruction, which is fine to leave in for all architectures. 0xbbb is too weird. > Another alternative, we can change value of SizeOfHeapCommit in pe > header, but we need to know how much. Also CreateThread function uses > SizeOfStackReserve and SizeOfStackCommit values for every new thread by > default (we do specify "default" in our CreateThread call), so we have > to be aware of that. > >> I don't believe that m->osstack is necessary. >> It should be sufficient to use something like >> m->g0->stackguard - 1024 to grab space >> below the standard g0 stack. > > For main m yes, but ever new m creates it's stack on the heap, not os > stack, see proc.c/matchmg() just before it calls newosproc(). See what > happens when cgo is in play? It does its own things. Are you saying we > should all do what cgo does and get newosproc allocate m->g0 stack on os > stack? Yes. On the other systems, there are two modes of operation: (1) completely standalone and (2) occasional calls into the C library via cgo. In the second case it's very important that the OS allocate the stack so that it will have the right properties for the C library calls. The Windows port only has one mode of operation, but it is #2. So it should always use the OS stack. If you want to arrange to have malg called with -1 that's fine, but leaking the 8k is fine for now too. > I'm ok with that, but we already have 8k of stack in m->g0. Why? Because malg didn't get the message that you're in mode #2. Russ
Sign in to reply to this message.
> I think this is overkill. Go won't have such big stacks, > and calling into Windows C libraries presumably they > already know about this restriction. Perhaps for Go, not so sure about Windows. And if we do callbacks, things might get out of hand. > 8K is two pages. At most you have to touch the word at -0x1000(SP). > That's just one instruction, which is fine to leave in for all architectures. > 0xbbb is too weird. If you say I should put all m->g0 stacks on "os stack", then we don't have to worry about it, 4k is good enough to start and it will grow. But, let me be clear on what you're saying. You say that m->g0 scheduling stack should be shared between Go scheduler and all our Windows calls? How big should I make it (stackguard - stackbase)? > >> I don't believe that m->osstack is necessary. > >> It should be sufficient to use something like > >> m->g0->stackguard - 1024 to grab space > >> below the standard g0 stack. If everyone will use m->g0 stack, then we don't need to do anything special, just make it big enough. And, again, the size doesn't matter for Windows and won't matter for Go until we do callback. > If you want to arrange to have malg called with -1 that's fine, How do I do that? Alex
Sign in to reply to this message.
On Thu, Aug 26, 2010 at 23:31, <alex.brainman@gmail.com> wrote: >> I think this is overkill. Go won't have such big stacks, >> and calling into Windows C libraries presumably they >> already know about this restriction. > > Perhaps for Go, not so sure about Windows. And if we do callbacks, > things might get out of hand. But we're not doing callbacks yet and still things aren't working. Let's worry about making things work without callbacks. >> 8K is two pages. At most you have to touch the word at -0x1000(SP). >> That's just one instruction, which is fine to leave in for all > > architectures. >> >> 0xbbb is too weird. > > If you say I should put all m->g0 stacks on "os stack", then we don't > have to worry about it, 4k is good enough to start and it will grow. > > But, let me be clear on what you're saying. You say that m->g0 > scheduling stack should be shared between Go scheduler and all our > Windows calls? How big should I make it (stackguard - stackbase)? Yes. You should let the OS make it. When the OS allocates it it does things like arrange for automatic growth and guard pages to catch real stack overflows. I would set up g0 to have 8k at the top of the stack and then use the space below that for the OS calls. >> If you want to arrange to have malg called with -1 that's fine, > > How do I do that? I'd probably put #ifdef __WINDOWS__ enum { Windows = 1 }; #else enum { Windows = 0 }; #endif in runtime.h, and then in the call that says m->g0 = malg(8192); I'd say if(Windows) m->g0 = malg(-1); else m->g0 = malg(8192); Russ
Sign in to reply to this message.
> Yes. You should let the OS make it. When the OS allocates it > it does things like arrange for automatic growth and guard pages > to catch real stack overflows. I would set up g0 to have 8k at the > top of the stack and then use the space below that for the OS calls. I'm not sure we're on the same page here. I was planing to switch to m->g0 stack before stdcall and use whatever sp it is set to. Are you saying I should use "special" area below stackguard - 1024 or something? > I'd probably put > > #ifdef __WINDOWS__ > enum { > Windows = 1 > }; > #else > enum { > Windows = 0 > }; > #endif > > in runtime.h, and then in the call that says > > m->g0 = malg(8192); > > I'd say > > if(Windows) > m->g0 = malg(-1); > else > m->g0 = malg(8192); > It doesn't look good, but alternative of "leaking" memory goes against my nature <g>. Thank you. Alex
Sign in to reply to this message.
On Thu, Aug 26, 2010 at 23:47, <alex.brainman@gmail.com> wrote: >> Yes. You should let the OS make it. When the OS allocates it >> it does things like arrange for automatic growth and guard pages >> to catch real stack overflows. I would set up g0 to have 8k at the >> top of the stack and then use the space below that for the OS calls. > > I'm not sure we're on the same page here. I was planing to switch to > m->g0 stack before stdcall and use whatever sp it is set to. Are you > saying I should use "special" area below stackguard - 1024 or something? Either is fine. The Unix implementations switch to m->g0 directly but they don't have the same requirements that you do of trying to lay out the arguments on the stack in C before jumping into the assembly. Laying them out below the part of the stack that Go is using makes sure you'll never conflict, even if it's m->g0 that is making the stdcall. Russ
Sign in to reply to this message.
> ... Laying them out below the > part of the stack that Go is using makes sure you'll never > conflict, even if it's m->g0 that is making the stdcall. I'm with you now. A "little" bit below is all I need. Thank you. Alex
Sign in to reply to this message.
Hello rsc (cc: golang-dev@googlegroups.com), I'd like you to review this change.
Sign in to reply to this message.
http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/mkasmh.sh File src/pkg/runtime/mkasmh.sh (right): http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/mkasmh.sh#ne... src/pkg/runtime/mkasmh.sh:85: /^aggr CgoThreadStart$/ { aggr = "cgots" } Shouldn't need this anymore (see below). http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/runtime.h File src/pkg/runtime/runtime.h (right): http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/runtime.h#ne... src/pkg/runtime/runtime.h:204: struct CgoThreadStart Shouldn't need to do this anymore (see below). http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/runtime.h#ne... src/pkg/runtime/runtime.h:243: CgoThreadStart *ts; // os thread start parmeters, to be freed once m is running Just use m as the parameters. See comment later. http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/windows/386/... File src/pkg/runtime/windows/386/sys.s (right): http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/windows/386/... src/pkg/runtime/windows/386/sys.s:32: // convert parameter count into number of bytes they occupy Various cleanup: * use REP;MOVSL directly instead of bothering to call mcpy * switch g too (not just SP) * use same register for get_tls (makes code easier to follow) TEXT syscall_raw(SB),7,$4 // Copy arguments from stack. MOVL fn+0(FP), AX MOVL count+4(FP), CX // words MOVL args+8(FP), BP // Switch to m->g0 if needed. get_tls(DI) MOVL m(DI), DX MOVL g(DI), SI MOVL SI, 0(SP) // save g MOVL SP, m_gostack(DX) // save SP MOVL m_g0(DX), SI CMPL g(DI), SI JEQ 3(PC) MOVL (m_sched+gobuf_sp)(DX), SP MOVL SI, g(DI) // Copy args to new stack. SUBL $(10*4), SP // padding MOVL CX, BX SARL $2, BX SUBL BX, SP MOVL SP, DI MOVL BP, SI CLD REP; MOVSL // Call function. CALL AX // Restore original SP, g. get_tls(DI) MOVL m(DI), DX MOVL m_gostack(DX), SP // restore SP MOVL 0(SP), SI // restore g MOVL SI, g(DI) // Someday the convention will be D is always cleared. CLD RET Untested. http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/windows/386/... src/pkg/runtime/windows/386/sys.s:66: MOVL 4(SP), BX // ts make the argument "m" instead of "ts" and this will get easier. http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/windows/thre... File src/pkg/runtime/windows/thread.c (right): http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/windows/thre... src/pkg/runtime/windows/thread.c:266: ts->m->ts = ts; Instead of this, I'd suggest passing m and assuming g == m->g0 and fn == mstart. Other architectures do this already. http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/windows/thre... src/pkg/runtime/windows/thread.c:276: // for every m, but first bootstrap m. s/but/except/
Sign in to reply to this message.
Thank you for review. http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/mkasmh.sh File src/pkg/runtime/mkasmh.sh (right): http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/mkasmh.sh#ne... src/pkg/runtime/mkasmh.sh:85: /^aggr CgoThreadStart$/ { aggr = "cgots" } On 2010/09/08 15:36:54, rsc1 wrote: > Shouldn't need this anymore (see below). > Done. http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/runtime.h File src/pkg/runtime/runtime.h (right): http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/runtime.h#ne... src/pkg/runtime/runtime.h:204: struct CgoThreadStart On 2010/09/08 15:36:54, rsc1 wrote: > Shouldn't need to do this anymore (see below). > Done. http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/runtime.h#ne... src/pkg/runtime/runtime.h:243: CgoThreadStart *ts; // os thread start parmeters, to be freed once m is running On 2010/09/08 15:36:54, rsc1 wrote: > Just use m as the parameters. See comment later. > Done. http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/windows/386/... File src/pkg/runtime/windows/386/sys.s (right): http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/windows/386/... src/pkg/runtime/windows/386/sys.s:32: // convert parameter count into number of bytes they occupy > // Copy args to new stack. > SUBL $(10*4), SP // padding > MOVL CX, BX > SARL $2, BX It is SALL, not SARL. Right? http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/windows/386/... src/pkg/runtime/windows/386/sys.s:66: MOVL 4(SP), BX // ts On 2010/09/08 15:36:54, rsc1 wrote: > make the argument "m" instead of "ts" > and this will get easier. > Done. http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/windows/thre... File src/pkg/runtime/windows/thread.c (right): http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/windows/thre... src/pkg/runtime/windows/thread.c:266: ts->m->ts = ts; On 2010/09/08 15:36:54, rsc1 wrote: > Instead of this, I'd suggest passing m and assuming g == m->g0 > and fn == mstart. Other architectures do this already. > Done. http://codereview.appspot.com/2009045/diff/11001/src/pkg/runtime/windows/thre... src/pkg/runtime/windows/thread.c:276: // for every m, but first bootstrap m. On 2010/09/08 15:36:54, rsc1 wrote: > s/but/except/ > Won't need that code now.
Sign in to reply to this message.
Hello rsc (cc: golang-dev@googlegroups.com), I'd like you to review this change.
Sign in to reply to this message.
LGTM
Sign in to reply to this message.
*** Submitted as http://code.google.com/p/go/source/detail?r=4b5d59d1660c *** runtime(windows): make sure scheduler runs on os stack and new stdcall implementation R=rsc CC=golang-dev http://codereview.appspot.com/2009045 Committer: Alex Brainman <alex.brainman@gmail.com>
Sign in to reply to this message.
|