Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

debug/elf: uncompress .zdebug sections in (*Section).Open #58254

Closed
brancz opened this issue Feb 2, 2023 · 9 comments
Closed

debug/elf: uncompress .zdebug sections in (*Section).Open #58254

brancz opened this issue Feb 2, 2023 · 9 comments
Labels
help wanted NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@brancz
Copy link
Contributor

brancz commented Feb 2, 2023

I would like to propose adding an API to the debug/elf package to fully read sections (compressed and uncompressed alike). Current APIs only allow reading uncompressed sections, or in select cases may handle compressed sections, but do not allow accessing the decompressed bytes of a compressed section.

Why?

I have recently come across the dwz tool, which some Linux distributions (eg. Fedora and Debian) use to optimize debuginfos (as in debuginfos that have already been split off of a binary using objcopy --only-keep-debug $FILE $FILE.debug).

dwz deduplicates as much data as possible and subsequently splits the original debuginfos into two files, a primary debuginfo file which has a .gnu_debugaltlink section added, which links to the second file, which is also a valid elf file, where the deduplicated data lives, the primary debuginfo can then have references to point to where to read deduplicated data from in the supplementary debuginfo file. An example: The primary debuginfo DWARF DIEs, let's say a DW_AT_name could not be a string, but instead an offset into the supplementary debuginfo file's .debug_str or decompressed .zdebug_str section, directing us to read a null-terminated string at that offset.

Once the bytes of the section are available writing the code to read null-terminated strings at an offset is simple and could have been done separately. If it was just uncompressed .debug_str we could just use the debug/elf package's functionality to read the section, however, the issue arises when the section is compressed, as the unexported field compressionOffset needs to be set appropriately to read its content successfully.

For background: I happen to work on the open-source continuous-profiler, Parca, which uses the debug/elf and debuf/dwarf packages for symbolizing profiling data, and we'd like to support symbolizing these cases.

I could see a couple of different APIs for me to achieve my goal, but the ability to read compressed sections would have a minimal API surface and would allow any very special cases that the go runtime itself doesn't need, to be handled separately.

Alternatively, I could imagine an API within the debug/dwarf package that exposed a reader to read strings at a given offset, but that would be easy to misunderstand and misuse I feel.

@brancz brancz added the Proposal label Feb 2, 2023
@gopherbot gopherbot added this to the Proposal milestone Feb 2, 2023
@ianlancetaylor
Copy link
Contributor

Right now if you use the (*Section).Open method it will automatically decompress the section contents, if they are marked with the ELF standard SHF_COMPRESSED flag. Your mention of .zdebug_str makes me wonder whether dwz is generating files with the older mechanism in which the section name indicates that the contents are compressed. Right now we handle that case in the (*File).DWARF method. Perhaps we should also handle it in (*Section).Open: if the section name starts with ".z" and the section contents start with ZLIB then we automatically decompress the data.

Would that fix your problem?

@seankhliao seankhliao changed the title proposal: Add API to allow reading compressed ELF sections; affected/package: debug/elf proposal: debug/elf: add API to allow reading compressed ELF sections Feb 2, 2023
@brancz
Copy link
Contributor Author

brancz commented Feb 3, 2023

Just to reiterate you're suggesting to essentially move the offset handling done in the (*File).DWARF) function to the (*Section).Open function? That would absolutely solve my problem, and I'd be more than happy to submit a patch for it!

@ianlancetaylor
Copy link
Contributor

OK, taking this out of the proposal process.

@ianlancetaylor ianlancetaylor changed the title proposal: debug/elf: add API to allow reading compressed ELF sections debug/elf: uncompress .zdebug sections in (*Section).Open Feb 3, 2023
@ianlancetaylor ianlancetaylor added help wanted NeedsFix The path to resolution is known, but the work has not been done. and removed Proposal labels Feb 3, 2023
@ianlancetaylor ianlancetaylor modified the milestones: Proposal, Backlog Feb 3, 2023
@aarzilli
Copy link
Contributor

aarzilli commented Feb 6, 2023

This would be backwards incompatible and require a GODEBUG setting per #56986, right?

@ianlancetaylor
Copy link
Contributor

It would be backwards incompatible, but not every single backward incompatibility requires a GODEBUG setting. We only need a GODEBUG setting if the change is likely to break a reasonable number of real programs. In this case that seems to me to be unlikely. There may well be programs that are prepared to uncompress the ELF section data themselves, but I would expect those problems to work correctly if the receive uncompressed data. Do you know of examples where that would not work?

@aarzilli
Copy link
Contributor

aarzilli commented Feb 7, 2023

Do you know of examples where that would not work?

I do not, delve for example will assume the section is not compressed if it doesn't start with ZLIB: https://github.com/go-delve/delve/blob/4303ae45a8e2996b30d2318f239677a771aef9c1/pkg/dwarf/godwarf/sections.go#L88. However I don't think it would be too strange if some program existed that simply errored in that circumstance. The string ZLIB at the start of the section exists to determine the type of compression, if you didn't see ZLIB you could also assume that a different compression is being used (no other compression algorithms are allowed at the moment but someone could write that to future-proof their code).

The counterpoint to this is that moving decompression from debug/dwarf to debug/elf just provides a small convenience, users could always write the decompression themselves. Also does the auto-decompression only go in debug/elf or also debug/macho and debug/pe? Compilers other than go do not produce compressed zdebug sections in PE and Mach-O executables.

@ianlancetaylor
Copy link
Contributor

The .zdebug compression is obsolete, as all new files should be using SHF_COMPRESSED. I don't think we need to introduce a GODEBUG setting because there might be hypothetical code that would be affected.

As far as I can see the suggested change doesn't affect debug/dwarf at all. The code that would change, besides (*elf.Section).Open, is (*elf.File).DWARF.

Given that (*elf.Section).Open already handles SHF_COMPRESSED sections, it seems appropriate to me that it should also handle .zdebug sections. The current approach seems inconsistent.

mauri870 added a commit to mauri870/go that referenced this issue Jul 27, 2023
The method (*Section).Open automatically decompresses section contents
when they are marked with SHF_COMPRESSED. This is the recommended
approach in the ELF standard. Some tools use the older mechanism of
prefixing the sections with a z as in .zdebug_str for example, this case
was unhandled.

The new implementation will look for SHF_COMPRESSED as well as sniffing
for ZLIB magic bytes and for sections starting with .z.

Fixes golang#58254
mauri870 added a commit to mauri870/go that referenced this issue Jul 28, 2023
The method (*Section).Open automatically decompresses section contents
when they are marked with SHF_COMPRESSED. This is the recommended
approach in the ELF standard. Some tools use the older mechanism of
prefixing the sections with a z as in .zdebug_str for example, this case
was unhandled.

The new implementation will look for SHF_COMPRESSED as well as sniffing
for ZLIB magic bytes and for sections starting with .z.

Fixes golang#58254
@gopherbot
Copy link

Change https://go.dev/cl/513875 mentions this issue: debug/elf: uncompress .zdebug sections in Open

@ianlancetaylor
Copy link
Contributor

Already fixed by https://go.dev/cl/480895.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants