runtime: windows tests fail with race detector #23900
It seems like my change may be genuinely causing us to use significantly more memory on Windows. Particularly, the "fatal error: failed to allocate arena index" buried in there suggests that the machine is actually low on RAM at that point. I'll try running some of the x/benchmarks benchmarks on the gomote since they report their own memory use, and I'll make that panic dump more information about memory consumption and the layout of the heap. @alexbrainman, any suggestions of other good ways to inspect memory usage and memory layout on Windows?
Every failure above, and most (though not all) of the failures I've looked at on the dashboard, have been mapping the very first heap arena. We haven't even mapped the race shadow for the heap yet. The failing process can't possibly be using much memory itself at this point, so it's almost certainly the other things running concurrently that are using the memory. It may simply be that TSAN maps 4.5 bytes for every byte we map (MapShadow); if so, I could change how Go calls it. Ping @dvyukov for thoughts. Dmitry, last week I changed the Go memory allocator to map 64MB at a time (always 64MB aligned) and eliminated the assumption that the heap is contiguous in memory. TSAN seems fine with this on Linux, but is running into "out of memory" errors on Windows. I know there are various Go-specific changes to the TSAN runtime. Have I just violated some of its assumptions about the Go heap?
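For a rough sense of scale, assuming a 4x data shadow plus a 1/2x meta shadow (the 4.5x ratio mentioned here, with the 1/2 meta figure Dmitry gives below), mapping a single 64MB arena would pull in roughly 288MB of shadow. A back-of-the-envelope sketch, purely illustrative:

```go
// Back-of-the-envelope arithmetic only; the exact shadow layout is TSAN's.
package main

import "fmt"

func main() {
	const (
		arenaSize  = 64 << 20      // one Go heap arena: 64MB
		dataShadow = 4 * arenaSize // assumed: 4 shadow bytes per application byte
		metaShadow = arenaSize / 2 // assumed: meta shadow is 1/2 of application memory
	)
	fmt.Printf("arena %dMB -> shadow %dMB -> total %dMB\n",
		arenaSize>>20, (dataShadow+metaShadow)>>20, (arenaSize+dataShadow+metaShadow)>>20)
	// Prints: arena 64MB -> shadow 288MB -> total 352MB
}
```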
The fact that we reserve and commit 32MB of memory for the arena index may also be contributing to the problem. @alexbrainman, it would be fine to map that memory incrementally, but it would be really nice to have zero-filled memory backing most of it, even if it's read-only. Is there a way to get zeroed memory in Windows that doesn't count against the commit limit?
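One lever here is the reserve/commit split: reserved-but-uncommitted address space does not count against the commit charge, and pages are zero-filled by the OS when they are eventually committed. A minimal sketch of that pattern using golang.org/x/sys/windows (my own illustration, not the runtime's code; it shows the API shape, not a full answer to the read-only-zeros question):

```go
// Sketch only (GOOS=windows): reserve a large region up front and commit
// chunks on demand, so untouched pages never count against the commit limit.
package main

import (
	"fmt"
	"unsafe"

	"golang.org/x/sys/windows"
)

func main() {
	const (
		indexSize = 32 << 20 // e.g. the 32MB arena index discussed above
		chunk     = 64 << 10 // commit in 64KB chunks (dwAllocationGranularity)
	)
	// MEM_RESERVE only claims address space; it does not commit memory.
	base, err := windows.VirtualAlloc(0, indexSize, windows.MEM_RESERVE, windows.PAGE_READWRITE)
	if err != nil {
		panic(err)
	}
	// Commit just the first chunk before touching it; committed pages are
	// zero-filled by the OS on first access.
	if _, err := windows.VirtualAlloc(base, chunk, windows.MEM_COMMIT, windows.PAGE_READWRITE); err != nil {
		panic(err)
	}
	*(*byte)(unsafe.Pointer(base)) = 1 // OK: this page is committed
	fmt.Printf("reserved %dMB at %#x, committed first %dKB\n", indexSize>>20, base, chunk>>10)
}
```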
Sometimes I use https://docs.microsoft.com/en-us/sysinternals/downloads/vmmap and https://docs.microsoft.com/en-us/sysinternals/downloads/rammap. Unfortunately these are GUI apps and they won't work over gomote. But if you need a Windows computer to play with, you can always build one on GCE. You might be interested to watch https://www.youtube.com/watch?v=AjTl53I_qzY if you have 3 spare hours in your life. I also use https://docs.microsoft.com/en-us/sysinternals/downloads/process-explorer (not for memory problems, but for debugging in general).
I do not know. I will try to search my old records, but you can also search yourself. For example, looking at my usual hunting ground (Raymond Chen's blog) and searching for "VirtualAlloc zero", I get this list, and https://blogs.msdn.microsoft.com/oldnewthing/20170113-00/?p=95185 on that list looks interesting. But ultimately you should try your ideas and see what happens with VMMap. I hope it helps. Alex
I did. There is nothing much there: https://blogs.msdn.microsoft.com/oldnewthing/20170317-00/?p=95755 Maybe @dvyukov can also help with your question. He does know a bit about Windows, but he is shy. Alex
MapShadow assumes that the Go heap is contiguous and grows up. See the mapped_meta_end variable in MapShadow. TSAN also has pretty tight assumptions about the layout of the virtual address space. That MapShadow complexity was actually imposed by Windows. On linux/bsd we simply map 128TB of memory at start and we are done. But Windows eagerly allocates page tables (potentially even for reserved memory) and charges the process for them. So if we map 128TB and page tables take, say, 1/1000th, Windows will try to eagerly allocate 128GB, which always fails. So we had to do this lazy shadow allocation (a rough sketch of the idea follows below). There is an additional problem with the "meta" shadow, which is 1/2 of application memory, so even if an app allocates in dwAllocationGranularity chunks, the shadow is smaller than dwAllocationGranularity. I've looked at a dozen failures on the build dashboard and it seems that the broken assumption about a contiguous heap is not the problem (at least not yet); they simply fail with OOM errors:
These are allocations of the memory shadow and meta shadow, of the expected sizes at the expected addresses. Is it possible to reduce the 64MB constant under the race detector on Windows?
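To make the lazy shadow allocation above concrete, here is a very rough Go sketch of the idea; the real logic is TSAN's C++ MapShadow, and the ratio, address transform, and commit callback below are all stand-ins for illustration.

```go
// Rough sketch of lazy shadow mapping: track the high-water mark of committed
// shadow and commit only the missing tail when the application maps more
// memory. Names and the address transform are illustrative, not TSAN's.
package shadow

const shadowRatio = 4 // assumed: shadow bytes per application byte

var shadowEnd uintptr // in spirit, like mapped_meta_end in MapShadow

// appToShadow stands in for TSAN's real application-to-shadow address mapping.
func appToShadow(addr uintptr) uintptr { return addr * shadowRatio }

// extendShadow commits shadow for a newly mapped application range
// [addr, addr+size). commit is whatever actually maps memory
// (e.g. VirtualAlloc with MEM_COMMIT on Windows).
func extendShadow(addr, size uintptr, commit func(start, size uintptr)) {
	end := appToShadow(addr + size)
	if end <= shadowEnd {
		return // already covered; nothing new to commit
	}
	start := shadowEnd
	if start == 0 {
		start = appToShadow(addr)
	}
	commit(start, end-start)
	shadowEnd = end
}
```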
@alexbrainman, thanks for the pointers. I'd looked over the MSDN docs for the virtual memory functions and didn't get much out of them, but those links gave me some ideas I can at least try, though it may just not be possible to do what I want.
Thanks for pointing me to the memory map. Go will still grow the heap contiguously from 0xc000000000 unless it runs into some other mapping. If it does, it will jump to 0xc100000000 and so on. It looks like TSAN is okay with that from the memory layout standpoint, but it may violate MapShadow's assumption that the heap grows up. If Go falls all the way back to 0xe000000000, it will ask the OS for whatever it can give, but something would have to go pretty wrong to get to that point.
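A tiny illustration of that fallback sequence (the start, jump, and final addresses are from the comment above; the uniform 4GB spacing between hints is inferred from the jump to 0xc100000000 and is only an assumption):

```go
// Print the sequence of heap placement hints described above: start at
// 0xc000000000 and step by 4GB when a hint collides with another mapping.
package main

import "fmt"

func main() {
	const (
		firstHint = uint64(0xc000000000)
		lastHint  = uint64(0xe000000000) // past this, Go asks the OS for any address
		step      = uint64(0x100000000)  // assumed 4GB spacing between hints
	)
	for i, hint := 0, firstHint; i < 4 && hint <= lastHint; i, hint = i+1, hint+step {
		fmt.Printf("hint %d: %#x\n", i, hint)
	}
}
```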
Unfortunately, that would increase the size of the arena index proportionally, and it's already 32MB. That said, I'm looking at making the arena index two-level, which would let us shrink both the index and the arena size (see CL 96779 below).
No, it's just an implementation detail to deal with dwAllocationGranularity for the meta shadow.
Sounds like this can work.
Change https://golang.org/cl/96780 mentions this issue:
Change https://golang.org/cl/96779 mentions this issue:
That would be great. We can definitely guarantee 128KB alignment. In fact, it's 64MB-aligned right now, and will still be 4MB-aligned after CL 96780. Should I file a bug to track that, and should I do it here or is there a better place to file it? |
Currently, the heap arena map is a single, large array that covers every possible arena frame in the entire address space. This is practical up to about 48 bits of address space with 64 MB arenas.

However, there are two problems with this:

1. mips64, ppc64, and s390x support full 64-bit address spaces (though on Linux only s390x has kernel support for 64-bit address spaces). On these platforms, it would be good to support these larger address spaces.

2. On Windows, processes are charged for untouched memory, so for processes with small heaps, the mostly-untouched 32 MB arena map plus a 64 MB arena are significant overhead. Hence, it would be good to reduce both the arena map size and the arena size, but with a single-level arena, these are inversely proportional.

This CL adds support for a two-level arena map. Arena frame numbers are now divided into arenaL1Bits of L1 index and arenaL2Bits of L2 index. At the moment, arenaL1Bits is always 0, so we effectively have a single level map. We do a few things so that this has no cost beyond the current single-level map:

1. We embed the L2 array directly in mheap, so if there's a single entry in the L2 array, the representation is identical to the current representation and there's no extra level of indirection.

2. Hot code that accesses the arena map is structured so that it optimizes to nearly the same machine code as it does currently.

3. We make some small tweaks to hot code paths and to the inliner itself to keep some important functions inlined despite their now-larger ASTs. In particular, this is necessary for heapBitsForAddr and heapBits.next.

Possibly as a result of some of the tweaks, this actually slightly improves the performance of the x/benchmarks garbage benchmark:

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.28ms ± 1%  2.26ms ± 1%  -1.07%  (p=0.000 n=17+19)

(https://perf.golang.org/search?q=upload:20180223.2)

For #23900.

Change-Id: If5164e0961754f97eb9eca58f837f36d759505ff
Reviewed-on: https://go-review.googlesource.com/96779
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
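For readers following along, here is a simplified sketch of the two-level lookup the CL describes; the bit widths and names are illustrative rather than the runtime's exact values.

```go
// Simplified two-level arena map: an L1 array of pointers to L2 arrays of
// per-arena metadata. With arenaL1Bits == 0 there is a single L2 array and,
// as the CL notes, effectively no extra indirection.
package main

import "fmt"

const (
	arenaBits   = 22 // illustrative: total bits of arena number
	arenaL1Bits = 0  // currently 0, per the CL description
	arenaL2Bits = arenaBits - arenaL1Bits
)

type heapArena struct{ /* per-arena metadata would live here */ }

var arenas [1 << arenaL1Bits]*[1 << arenaL2Bits]*heapArena

// arenaIndex splits an arena number into its L1 and L2 indexes.
func arenaIndex(n uintptr) (l1, l2 uintptr) {
	return n >> arenaL2Bits, n & (1<<arenaL2Bits - 1)
}

// lookup returns the metadata for arena n, or nil if that part of the
// address space has never been mapped.
func lookup(n uintptr) *heapArena {
	l1, l2 := arenaIndex(n)
	if arenas[l1] == nil {
		return nil
	}
	return arenas[l1][l2]
}

func main() {
	l1, l2 := arenaIndex(12345)
	fmt.Printf("arena 12345 -> L1 %d, L2 %d, meta %v\n", l1, l2, lookup(12345))
}
```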
This tracker is good (add label:RaceDetector).
Done: #24133. |
The windows-amd64-2008 and windows-amd64-2012 builders fail with tests that use the race detector. Like:
See https://build.golang.org/log/58c395893d056c4144f33d21648db2385b55b885 for the full log. windows-amd64-race is also broken, but windows-386-2008 and windows-amd64-2016 are fine. This seems to have started with 2b41554, so paging @aclements.
Alex