New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: difficulty handling system page aligned stacks #41008
Comments
Change https://golang.org/cl/250183 mentions this issue: |
I'm not sure I understand this requirement. How does the system ensure it? What does the system check? |
@cherrymui I assume this is associated with |
How big is the physical page size on the system? If we set in the runtime the physical page size as at least 16KB, then all mmap should be 16K aligned, including MAP_STACK. I assume the actual goroutine stack does not necessarily start at the beginning of the mmap, and it would probably be fine. |
@cherrymui - OpenBSD on Octeon runs with 16KB physical pages. Changing the runtime page size to 16KB would be one solution (however that seemed like a larger hammer than might be acceptable).
I believe this is correct. The failures are when, for example, setting an alternate signal stack and passing a stack that is not physical page aligned. |
The runtime should query the physical page size at startup and only do physical-page-aligned mmaps, regardless of runtime's internal page size. If that is not the case, we probably should fix that. Maybe we just need to round the address when calling sigaltstack? |
Two mmaps are required on OpenBSD - the first is done for allocation, the second is to mark an allocation as being stack (
I'm not sure how we would fix/avoid this situation without (2) or (3) from the original issue comment. |
Can we round the mmap in (3) to physical page size? I guess it would not be a problem if we map more? |
Hrmmm, we can likely round a stack allocation down to get physical page alignment and then round up to get a multiple of physical page size. The problem would then potentially be two back to back stack allocations, that would then have overlapping mmaps - I'll have to see what the kernel says if we do that (and particularly if we remove one of them again). |
So I do not think this is going to work - if we do an mmap (96 physical pages) we get:
We then mmap again one physical page into the previous allocation and 32 physical pages in size:
We then mmap again, 32 physical pages into the original allocation and 32 physical pages in size (resulting in a single physical page overlap with the previous allocation, as would be required in the round down/up scenario):
We then stop using the second stack and remove MAP_STACKMAP:
If the top physical page of the first stack mapped allocation is then used and a system call is made (or a signal occurs), the process will now be terminated. So we'd be back to having to over allocate in order to guarantee there was no overlap in the case of rounding down/up. I'll take another look to see if this is possible in combination. |
I think basing this off stackFromSystem is the wrong approach, I'm afraid. This approach is going to cause all goroutine stacks to start at 16 KiB on openbsd/mips64, rather than the usual 2 KiB; it's going to throw off stack memory metrics; and it's going to dramatically change scavenging behavior for stack memory. stackFromSystem is really meant for runtime debugging (I'm fine with adding the osStackAlloc on that path, since that does fix that debugging mode on openbsd, but as a separate change). If I understand the underlying problem, we need to mark MAP_STACK regions on physical page boundaries (which happens to work right now only when the physical page size is <= the runtime page size). I just discussed this with @cherrymui and @mknyszek and they came up with several ideas. I'll summarize here, but let them weigh in with more details:
|
To close this out, from an operating system perspective this appears to work, however the over allocation required is pretty substantial and there are parts of Go's runtime that seem to dislike this (I've not tracked down exactly why this is yet). |
@aclements thanks for the detailed response - I agree that stackFromSystem is not ideal, but it does allow things to work :)
This is correct.
Yes, we need to switch to libc calls (see issue #36435), although this is currently blocked on #39257 (and https://go-review.googlesource.com/c/go/+/249978). This is likely the cleanest and easiest solution, providing that the g0 stacks are always used for libc calls (and therefore system calls). The g0 stack is presumably the OS allocated stack, which will already be MAP_STACK, at which point this entire issue should be moot.
Possible, but seems like a lot of effort and a wasted register.
This is a pretty big hammer that would allow a ROP stack to be created and run from any part of the heap - while Go mitigates against buffer overflows and makes it harder to gain control flow, it does not necessarily help when cgo or unsafe are in use.
Hrmmm, that could work reasonably well. I'll have a poke at this.
That likely works but seems like more effort the doing it in allocSpan. |
Change https://golang.org/cl/266919 mentions this issue: |
This seems to work reasonably well - https://go-review.googlesource.com/c/go/+/266919 |
@cherrymui looked into this a little and determined that this won't work. She points out that the kernel patch mentions "We can also perform this check on standard synchronous traps, for instance page faults." These happen all the time while running on the g stack, so the g stack has to be |
Ah, correct - checks are performed on both syscalls and page faults, so a page fault occurring on a g stack would be problematic without it being |
The current runtime stack allocation is based on mheap, which uses its own fixed page size (4096 bytes). On some systems there is a requirement to have stacks be system page aligned (for example, 16KB alignment for OpenBSD/octeon). In this case, if mheap provides memory that is not 16KB aligned various things fail.
There are a couple of options to address this:
Make it possible to system allocate the stacks (i.e. malloc) - this is the simplest option.
Over allocate via
allocManual()
and round to system page alignment during stack allocation - this should not require changes to mheap, however would require changes totype stack
, which would need to track the actual allocation (address and size), in addition to the lo/hi pointers (this is needed in order forstackfree()
to correctly return the allocation). Two additional pointers in this struct are probably acceptable, however various assembly knows about its size and offset.Change mheap to provide system page size aligned allocations - this would be doable, but seems to require a fair amount of effort, however avoids the need to change
type stack
(and various assembly).For the time being I plan on using (1) for the openbsd/mips64 port, however we may want to consider alternatives at a later date.
The text was updated successfully, but these errors were encountered: