Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go, x/tools/go/packages: how to instrument cgo code? #30479

Open
josharian opened this issue Feb 28, 2019 · 10 comments
Open

cmd/go, x/tools/go/packages: how to instrument cgo code? #30479

josharian opened this issue Feb 28, 2019 · 10 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@josharian
Copy link
Contributor

go-fuzz instruments source code, writes it out to a temp dir, and then calls go/build to compile it.

go/packages appears to support this: it generates CompiledGoFiles, which have been preprocessed by cgo. However, if you instrument those files and write them back out, "go build" doesn't work.

The first issue, easily fixed, is that they don't have a ".go" suffix.

The second issue is that cmd/go complains when it finds C/C++ files and/or cgo directives in a directory with only precompiled cgo code, because the precompiled cgo code doesn't contain import "C". Sample error message: //go:cgo_ldflag "/usr/local/Cellar/re2/20190101/lib/libre2.a" only allowed in cgo-generated code.

If you add a dummy file containing just import "C" to work around that issue, then cmd/go runs cgo on that, which generates duplicate definitions within the package. Sample error message: _Cgo_ptr redeclared in this block.

As an inferior fallback, I tried to instrument the pre-cgo code instead. However, the AST and type information that go/packages generates is for the CompiledGoFiles, not the GoFiles, so that'd mean typechecking everything by hand, which means losing a bunch of the value provided by go/packages.

For now, I'm just leaving cgo code completely uninstrumented. Unfortunately, this requires adding semi-fragile hacks to find cgo code (whether the file is missing a ".go" suffix) and then reaching through the abstraction layers and copying it verbatim.

I'm not sure how best to fix this. One option would be to have a way to tell cmd/go that all cgo preprocessing has been done. I don't see an obvious fix within go/packages, but it'd be ideal to find one, since that doesn't require waiting six months to be usable.

cc @matloob @ianthehat @bcmills @alandonovan @dvyukov

@josharian josharian added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Feb 28, 2019
@matloob
Copy link
Contributor

matloob commented Feb 28, 2019

I don't completely understand, when

cmd/go complains when it finds C/C++ files and/or cgo directives in a directory with only precompiled cgo code"

are you passing the modified CompiledGoFiles to cmd/go to build them?

@josharian
Copy link
Contributor Author

are you passing the modified CompiledGoFiles to cmd/go to build them?

That was what I tried, yes. Is there something else I should try instead?

@matloob
Copy link
Contributor

matloob commented Feb 28, 2019

This is going to be really ugly, but I think the only option is to invoke the Go compiler directly with those files rather than running go build. go/packages' CompiledGoFiles returns the set of Go files that are provided to the Go compiler, so I think that's the only reliable way those files can be used.

Could that work? I wonder if I'm misunderstanding something?

@josharian
Copy link
Contributor Author

This is going to be really ugly, but I think the only option is to invoke the Go compiler directly

That might work...but then I think I'm also going to have to invoke the linker directly. (And compile the other packages as well?) And manage the cache myself. And manage other compiler flags, like -trimprefix. Which is to say: re-create a whole lot of cmd/go. Unless I'm missing something (which would be nice).

@matloob
Copy link
Contributor

matloob commented Feb 28, 2019

Unfortunately, as far as I know, there's not a great way to get in between cgo generation and the rest of the build. This is one of the things that makes cgo really frustrating. You'd need to instrument the pre-cgo "GoFiles" to invoke go build, but that has its own separate problems...

@josharian
Copy link
Contributor Author

@bcmills any thoughts about what a cmd/go fix for this might look like? I think this was one of the use cases we intended to support with go/packages.

@matloob what do you think about adding a knob to go/packages to ask it to use GoFiles instead of CompiledGoFiles for the Syntax and TypesInfo (and other?) fields. Then I could at least instrument the unprocessed source code. Kinda icky. Other suggestions welcome.

@josharian
Copy link
Contributor Author

(Oops, I missed your message before sending mine.)

@matloob
Copy link
Contributor

matloob commented Feb 28, 2019

Hm... I'd be wary of having a knob allow the usage of GoFiles for type checking because cgo files can't reliably be type-checked. And if we don't do that, that just leaves parsing Syntax trees, which isn't difficult enough that I think it would be worth adding a separate mode for that.

Though if this becomes a big issue for more folks, we should think a bit harder about this...

I'm hopeful that in the long term, we finally do the work of getting the Go compiler to understand cgo files "natively" so that the distinction between CompiledGoFiles and Go files (for the purposes of Cgo at least) disappears. Until then, there's going to be pain when cgo gets involved... like there always is when cgo is involved...

@josharian
Copy link
Contributor Author

Unfortunately, as far as I know, there's not a great way to get in between cgo generation and the rest of the build.

One way, I suppose, is to add official s2s support to cmd/go, as discussed in a meandering way in #29430.

I'd be wary of having a knob allow the usage of GoFiles for type checking

Fair enough.

I tried hacking at cmd/compile and cmd/go to get them to accept precompiled cgo sources, just as a POC, and got lost in cmd/go.

@jpap
Copy link
Contributor

jpap commented Oct 1, 2020

I'm hitting this as well. Instead of compiling the resulting instrumented Go code, I am wanting to modify the AST and rewrite the original source files in-place.

To modify the AST, I need type information, which packages will only supply when the Go files are compiled. But the AST I get back is that of the processed ("compiled") Cgo source in the go-build cache directory, not the original Go source file. The only way to link back to the original source is to notice the three line generated comment block/header beginning with // Code generated by cmd/cgo; DO NOT EDIT. and ending with //line ${PATH_TO_GO_FILE}:1:1.

Another quirk is that without the NeedCompiledGoFiles option, the Package.CompiledGoFiles field is not populated, but the Package.Syntax slice is still populated for all of the files that would normally be reported by CompiledGoFiles when requesting type information. That is, the syntax trees still correspond to the compiled Go files and not the regular GoFiles field. That meant when enumerating Syntax, and wanting to index GoFile, I ran off the end of the slice and panicked.

I am now looking to see how I can detect Cgo files, and then re-parse the original Go file with an "original" syntax tree (without types), and somehow map between the "compiled" version provided by packages (with types), to make my modifications. Any pointers for this would be appreciated.

I would love if there was a way to deal with Cgo files more easily with packages. What I'm aiming for is a more complex form of gorename coupled with code-generation. See #15793, where gorename also doesn't work with Cgo because it bails on a "DO NOT EDIT" marker: presumably for similar reasons to what we have here?

@seankhliao seankhliao added this to the Unplanned milestone Aug 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants