proposal: Go for Windows should generate PDBs #62420

Open
randomascii opened this issue Sep 1, 2023 · 10 comments

@randomascii

Chromium uses Go for various build tools, including the up-and-coming siso build tool. Windows has a rich set of tools (better than on Linux in many ways) for debugging and profiling. All Windows executables should be built with PDB files (Windows debug-information files) so that these tools can be used on them. Go does not generate PDBs, and therefore these powerful and standard tools cannot be used with Go executables. This disconnect between Windows expectations and what Go provides makes adoption of Go on Windows more difficult, and means that some performance problems will be unnecessarily difficult to investigate.

Details: Virtually every EXE and DLL on Windows ships with a link to a PDB that contains, at the very least, the names of functions in the binary. Here is an example for notepad:

dumpbin C:\Windows\System32\notepad.exe /headers | find /i "rsds"
BDD4ADCD cv 24 0002B4E4 2A0E4 Format: RSDS, {67D551E7-B9BB-3B68-E823-F5B998BD9453}, 1, notepad.pdb

The PDB isn't actually shipped, but it can be obtained (in an automated way) from symbol servers that are published by Microsoft, Chrome, Firefox, and other companies (https://randomascii.wordpress.com/2013/03/09/symbols-the-microsoft-way/).

Go binaries, on the other hand, lack this information:

dumpbin third_party\siso\siso.exe /headers | find /i "rsds"

This means that windbg, VS, ETW, and many other amazing tools can do very little with Go programs.
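
The dumpbin check above can be approximated in Go with the standard `debug/pe` package. This is a minimal sketch, not a replacement for dumpbin: it assumes the debug directory lies inside a single section, and the helper names (`decodeDebugDirectory`, `hasCodeView`, `readDebugDirectory`) are made up for illustration.

```go
package main

import (
	"debug/pe"
	"encoding/binary"
	"fmt"
	"log"
	"os"
)

const (
	debugEntrySize    = 28 // on-disk size of one IMAGE_DEBUG_DIRECTORY
	debugTypeCodeView = 2  // IMAGE_DEBUG_TYPE_CODEVIEW, shown as "cv" by dumpbin
)

// debugEntry holds the IMAGE_DEBUG_DIRECTORY fields that dumpbin prints.
type debugEntry struct {
	TimeDateStamp    uint32
	Type             uint32
	SizeOfData       uint32
	AddressOfRawData uint32 // RVA of the payload
	PointerToRawData uint32 // file offset of the payload
}

// decodeDebugDirectory splits the raw directory bytes into entries.
func decodeDebugDirectory(raw []byte) []debugEntry {
	var out []debugEntry
	for ; len(raw) >= debugEntrySize; raw = raw[debugEntrySize:] {
		out = append(out, debugEntry{
			TimeDateStamp:    binary.LittleEndian.Uint32(raw[4:]),
			Type:             binary.LittleEndian.Uint32(raw[12:]),
			SizeOfData:       binary.LittleEndian.Uint32(raw[16:]),
			AddressOfRawData: binary.LittleEndian.Uint32(raw[20:]),
			PointerToRawData: binary.LittleEndian.Uint32(raw[24:]),
		})
	}
	return out
}

// hasCodeView reports whether any entry points at a CodeView (PDB) record.
func hasCodeView(entries []debugEntry) bool {
	for _, e := range entries {
		if e.Type == debugTypeCodeView {
			return true
		}
	}
	return false
}

// readDebugDirectory returns the raw debug directory, or nil if there is none.
func readDebugDirectory(f *pe.File) ([]byte, error) {
	var dd pe.DataDirectory
	switch oh := f.OptionalHeader.(type) {
	case *pe.OptionalHeader32:
		dd = oh.DataDirectory[pe.IMAGE_DIRECTORY_ENTRY_DEBUG]
	case *pe.OptionalHeader64:
		dd = oh.DataDirectory[pe.IMAGE_DIRECTORY_ENTRY_DEBUG]
	default:
		return nil, fmt.Errorf("file has no optional header")
	}
	if dd.Size == 0 {
		return nil, nil
	}
	for _, s := range f.Sections {
		if dd.VirtualAddress < s.VirtualAddress || dd.VirtualAddress >= s.VirtualAddress+s.VirtualSize {
			continue
		}
		data, err := s.Data()
		if err != nil {
			return nil, err
		}
		off := dd.VirtualAddress - s.VirtualAddress
		if uint64(off)+uint64(dd.Size) > uint64(len(data)) {
			return nil, fmt.Errorf("debug directory extends past section %s", s.Name)
		}
		return data[off : off+dd.Size], nil
	}
	return nil, fmt.Errorf("debug directory RVA %#x is not in any section", dd.VirtualAddress)
}

func main() {
	if len(os.Args) != 2 {
		log.Fatal("usage: pdbcheck <file.exe>")
	}
	f, err := pe.Open(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	raw, err := readDebugDirectory(f)
	if err != nil {
		log.Fatal(err)
	}
	entries := decodeDebugDirectory(raw)
	fmt.Printf("%d debug directory entries, CodeView present: %v\n", len(entries), hasCodeView(entries))
}
```

Run against notepad.exe, a tool like this would be expected to report a CodeView entry; run against a Go binary such as siso.exe, it would be expected to report none, matching the empty dumpbin output above.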

ETW profiling in particular (https://randomascii.wordpress.com/2015/09/24/etw-central/) is tremendously powerful. It allows viewing of CPU usage (both sampled and based on context switches), disk I/O, memory allocations, and many other performance-relevant attributes, all on a unified timeline with configurable ways of drilling into the available information.
ETW sampling can seamlessly straddle the user-mode/kernel-mode boundary, making kernel calls appear just like function calls in the tables. ETW's context-switch graphs allow diagnosing of where a thread is waiting and what it is waiting on.

While integrated profiling with -cpuprofile (supported by siso) is nice, it is not a substitute for ETW profiling.

Concretely, I am currently trying to understand some suboptimal performance in siso (Googler-only link - b/298409062) but siso's -cpuprofile option doesn't make this easy to do. Simultaneously I am looking at ETW profiles (from a user) that show that most of the time in Chrome's setup process is being spent in KiPageFault - this is information that no user-mode profiler could reveal.

The Go team should consider giving their build tools an option (on by default!) to generate basic PDBs that can be consumed by standard Windows tools. The lld-link linker already generates rich PDBs (with stack frame, type, and inlining information) and generating just function name information would be simple in comparison, and could reuse the expertise and code of the lld-link project.

TL;DR - Support of PDBs is expected on Windows and it would be great if Go could satisfy that expectation.

@gopherbot gopherbot added this to the Proposal milestone Sep 1, 2023
@ianlancetaylor
Contributor

CC @golang/runtime @golang/windows

@cespare
Contributor

cespare commented Sep 1, 2023

Is there some kind of spec or official documentation for PDB? The closest I can find is this now-archived Microsoft GitHub repo.

@randomascii
Author

I will ask if there is any better documentation.

@alexbrainman
Member

@randomascii go.1.21 introduced support for windbg - see #57302 for details. Make sure you build your exes with go.1.21.

I do not know anything about windbg or ETW to help with your issue, but hopefully @qmuntal knows more and will comment.

Alex

@mauri870
Member

mauri870 commented Sep 2, 2023

Not sure if it is helpful but there is also a way to convert DWARF into PDB with cv2pdb

@randomascii
Author

Converting DWARF to PDB would be fine as long as the necessary debug directory is in the PE file to point to the PDB. For instance, here is the debug directory for Chrome:

Debug Directories

    Time Type        Size      RVA  Pointer
-------- ------- -------- -------- --------
64E8F817 cv            60 0028727C   28607C    Format: RSDS, {ADFEBF32-F6DA-AEA4-4C4C-44205044422E}, 1, C:\b\s\w\ir\cache\builder\src\out\Release_x64\initialexe\chrome.exe.pdb
64E8F817 dllchar        4 002872DC   2860DC

The GUID, age (the '1'), and the PDB name are sufficient to let debuggers and profilers know when they have found the correct PDB. They also give a way of finding the PDB in a symbol server.
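
To make the GUID/age/name triple concrete, here is a minimal Go sketch that decodes the payload of a CodeView "RSDS" record and derives a symbol-server lookup path from it. The record layout (4-byte signature, 16-byte GUID, 4-byte age, NUL-terminated path) and the `<name>/<GUID digits><age>/<name>` path convention are stated here as assumptions based on common tooling behavior, not taken from this thread. `parseRSDS`, `symbolServerKey`, and `exampleRecord` are illustrative names.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"strings"
)

// rsdsInfo holds the three values a debugger matches against a PDB.
type rsdsInfo struct {
	GUID    string // formatted as dumpbin prints it, without braces
	Age     uint32
	PDBPath string
}

// parseRSDS decodes an RSDS record: "RSDS", a 16-byte GUID, a 4-byte age,
// then a NUL-terminated PDB path (layout assumed, see the text above).
func parseRSDS(b []byte) (rsdsInfo, error) {
	if len(b) < 25 || string(b[:4]) != "RSDS" {
		return rsdsInfo{}, errors.New("not an RSDS CodeView record")
	}
	g := b[4:20]
	// The first three GUID fields are little-endian integers; the rest are raw bytes.
	guid := fmt.Sprintf("%08X-%04X-%04X-%X-%X",
		binary.LittleEndian.Uint32(g[0:4]),
		binary.LittleEndian.Uint16(g[4:6]),
		binary.LittleEndian.Uint16(g[6:8]),
		g[8:10], g[10:16])
	path := b[24:]
	if i := bytes.IndexByte(path, 0); i >= 0 {
		path = path[:i]
	}
	return rsdsInfo{
		GUID:    guid,
		Age:     binary.LittleEndian.Uint32(b[20:24]),
		PDBPath: string(path),
	}, nil
}

// symbolServerKey builds the relative path under which symbol servers
// conventionally store a PDB (assumed convention).
func symbolServerKey(r rsdsInfo) string {
	name := r.PDBPath
	if i := strings.LastIndexAny(name, `\/`); i >= 0 {
		name = name[i+1:]
	}
	id := strings.ReplaceAll(r.GUID, "-", "")
	return fmt.Sprintf("%s/%s%x/%s", name, id, r.Age, name)
}

// exampleRecord rebuilds the bytes behind the chrome.exe entry shown above.
func exampleRecord() []byte {
	rec := []byte("RSDS")
	rec = append(rec, 0x32, 0xBF, 0xFE, 0xAD, 0xDA, 0xF6, 0xA4, 0xAE,
		0x4C, 0x4C, 0x44, 0x20, 0x50, 0x44, 0x42, 0x2E)
	rec = append(rec, 1, 0, 0, 0) // age
	rec = append(rec, `C:\b\s\w\ir\cache\builder\src\out\Release_x64\initialexe\chrome.exe.pdb`...)
	return append(rec, 0)
}

func main() {
	info, err := parseRSDS(exampleRecord())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("GUID:", info.GUID)
	fmt.Println("Age: ", info.Age)
	fmt.Println("PDB: ", info.PDBPath)
	fmt.Println("Key: ", symbolServerKey(info))
}
```

A debugger compares the GUID and age stored in a candidate PDB against these values before trusting its symbols, which is why a converted PDB is only usable if the PE file carries a matching debug directory entry.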

@randomascii
Author

> @randomascii go.1.21 introduced support for windbg - see #57302 for details. Make sure you build your exes with go.1.21.

Ah - interesting. Having support for stack unwinding is also important for debugging and profiling. I'm glad that this has been done.

PDB support to let tools find symbol names is probably simpler than the stack unwinding task, I would guess, and would complement the stack unwinding very nicely.

@randomascii
Author

From a lexan (clang-cl) team member:

We wrote https://llvm.org/docs/PDB/index.html back then. The llvm-pdbutil and llvm/lib/DebugInfo/{PDB,MSF} code is probably useful.

https://github.com/microsoft/microsoft-pdb was the reference code they open-sourced for us.

@rnk

rnk commented Sep 2, 2023

I would say that making PDBs is hard, but generating the CodeView debug info that goes into the object file prior to linking is very easy. If Go has a mode that produces relocatable COFF object files and the final PE link is handled by MSVC link.exe or lld-link.exe, like a mode for statically linking together Go and C++ code, that would be a significant shortcut to generating PDBs.

Getting function names, or public symbols, is probably a matter of emitting S_GPROC32_ID records into a .debug$S section, and the record format is in the cvinfo.h header. You can also check Clang's output on godbolt to see the assembly directives for comparison. It has comments to try to make the records a bit more readable.
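
As a rough illustration of what "emitting S_GPROC32_ID records" involves, the Go sketch below encodes the bytes of a single such symbol record. The field order follows my reading of `PROCSYM32` in cvinfo.h, and the record kind value 0x1147 matches LLVM's symbol definitions; both should be treated as assumptions and verified with llvm-pdbutil or cvdump. The surrounding `.debug$S` structure (section signature, subsection headers, the scope-end record) and the relocations that fill in the offset and section index are deliberately omitted. `procSym` and `encodeGProc32ID` are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// sGProc32ID is the record kind for S_GPROC32_ID (assumed value, see above).
const sGProc32ID = 0x1147

// procSym holds the only fields a name-only procedure record needs.
type procSym struct {
	Name    string
	CodeLen uint32 // size of the function's code in bytes
	Offset  uint32 // section-relative offset; left 0 when a relocation supplies it
	Segment uint16 // section index; left 0 when a relocation supplies it
}

// encodeGProc32ID returns one symbol record: a 2-byte length, the record
// kind, the fixed PROCSYM32 fields, and a NUL-terminated name.
func encodeGProc32ID(p procSym) []byte {
	var body bytes.Buffer
	le := binary.LittleEndian
	// Writes to a bytes.Buffer of fixed-size values cannot fail.
	put := func(v any) { _ = binary.Write(&body, le, v) }

	put(uint16(sGProc32ID)) // rectyp
	put(uint32(0))          // pParent
	put(uint32(0))          // pEnd
	put(uint32(0))          // pNext
	put(p.CodeLen)          // len
	put(uint32(0))          // DbgStart
	put(uint32(0))          // DbgEnd
	put(uint32(0))          // typind (no type information in this sketch)
	put(p.Offset)           // off
	put(p.Segment)          // seg
	put(uint8(0))           // flags
	body.WriteString(p.Name)
	body.WriteByte(0)

	// Pad so the whole record, including its length field, is 4-byte aligned.
	for (body.Len()+2)%4 != 0 {
		body.WriteByte(0)
	}

	// The length field counts everything after itself.
	rec := make([]byte, 2, 2+body.Len())
	le.PutUint16(rec, uint16(body.Len()))
	return append(rec, body.Bytes()...)
}

func main() {
	rec := encodeGProc32ID(procSym{Name: "main.main", CodeLen: 0x40})
	fmt.Printf("%d bytes: % x\n", len(rec), rec)
}
```

Comparing output like this against the directives Clang emits for a small C function is a practical way to catch layout mistakes before involving a linker.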

The next thing you would want is line tables, and I couldn't find any docs for that. The best reference is probably in LLVM's assembler in MCCodeView.cpp. That's not needed for profiling, so that is out of scope for this issue, but very nice to have.

You can use llvm-pdbutil dump -all foo.obj (portable LLVM tool) or cvdump.exe (Microsoft tool) to dump the records and check if you're generating the right data.

@qmuntal
Contributor

qmuntal commented Sep 3, 2023

First of all, 100% agree with this proposal. I have done some work (together with @dagood) to generate PDBs out of Go debug metadata, and it is already proving to be very handy when debugging syscalls/cgo on Windows.

> Go.1.21 introduced support for windbg - see #57302 for details. ... I do not know anything about windbg or ETW to help with your issue, but hopefully @qmuntal knows more and will comment.

Go 1.21 does contain SEH unwinding information, which is a requirement for getting meaningful stack traces in WinDbg and friends. SEH has its limitations though: it is not designed as a debugging mechanism; its primary goal is to support exception handling. For example, it contains neither function names nor information about inlined functions. On the other hand, CodeView and PDB are designed explicitly to contain debugging metadata.

> We wrote https://llvm.org/docs/PDB/index.html back then. The llvm-pdbutil and llvm/lib/DebugInfo/{PDB,MSF} code is probably useful.
>
> https://github.com/microsoft/microsoft-pdb was the reference code they open-sourced for us.

I've written a minimal PDB writer that can generate accurate stack traces for Go binaries (not open sourced yet) only using microsoft-pdb and LLVM's codebase, so yes, it is possible for Go to fully support PDB files by only looking at public documentation.

> Not sure if it is helpful but there is also a way to convert DWARF into PDB with cv2pdb

cv2pdb can only generate the subset of information that PDB and DWARF share. It would be a pity if a hypothetical Go PDB file were limited to that. For example, PDB can contain source file checksums, while DWARF can't (AFAIK), and inside Microsoft it is a requirement to generate PDBs with those checksums wherever possible.

> If Go has a mode that produces relocatable COFF object files and the final PE link is handled by MSVC link.exe or lld-link.exe, like a mode for statically linking together Go and C++ code, that would be a significant shortcut to generating PDBs.

Go already generates COFF object files when using external linking. The problem is that, on Windows, it only supports GCC-like toolchains, which don't support PDB. IMO writing PDBs is hard, but easy to maintain once you have it, as the format is almost set in stone; it doesn't change. If Go had built-in support for writing PDB files, then it could generate PDB files when using internal linking or with external linkers that don't support PDB.
