cmd/cover: extend coverage testing to include applications #51430

Open
thanm opened this issue Mar 2, 2022 · 73 comments
Labels: compiler/runtime (Issues related to the Go compiler and/or runtime), Proposal, Proposal-Accepted

Comments

@thanm
Contributor

thanm commented Mar 2, 2022

Proposal: extend code coverage testing to include applications

Author(s): Than McIntosh

Last updated: 2022-03-02

Detailed design document: markdown, CL 388857

Abstract

This document contains a proposal for improving/revamping the system used in Go for code coverage testing.

Background

Current support for coverage testing

The Go toolchain currently includes support for collecting and reporting
coverage data for Golang unit tests; this facility is made available via the "go
test -cover" and "go tool cover" commands.

The current workflow for collecting coverage data is baked into the "go test"
command; the assumption is that the source code of interest is a Go package
or set of packages with associated tests.

To request coverage data for a package test run, a user can invoke the test(s)
via:

  go test -coverprofile=<filename> [package target(s)]

This command will build the specified packages with coverage instrumentation,
execute the package tests, and write an output file to "filename" with the
coverage results of the run.

The resulting output file can be viewed/examined using commands such as

  go tool cover -func=<covdatafile>
  go tool cover -html=<covdatafile>

Under the hood, the implementation works by source rewriting: when "go test" is
building the specified set of package tests, it runs each package source file
of interest through a source-to-source translation tool that produces an
instrumented/augmented equivalent, with instrumentation that records which
portions of the code execute as the test runs.

A function such as

  func ABC(x int) {
    if x < 0 {
      bar()
    }
  }

is rewritten to something like

  func ABC(x int) {GoCover_0_343662613637653164643337.Count[9] = 1;
    if x < 0 {GoCover_0_343662613637653164643337.Count[10] = 1;
      bar()
    }
  }

where "GoCover_0_343662613637653164643337" is a tool-generated structure with
execution counters and source position information.
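
For reference, the tool-generated structure looks roughly like the sketch
below (the Count/Pos/NumStmt field names match the legacy cmd/cover output;
the array size and position values here are illustrative, not taken from a
real run):

  var GoCover_0_343662613637653164643337 = struct {
          Count   [11]uint32     // one execution counter per instrumented block
          Pos     [3 * 11]uint32 // packed start/end line and column per block
          NumStmt [11]uint16     // number of statements in each block
  }{
          Pos: [3 * 11]uint32{
                  // one (startLine, endLine, packedColumns) triple per block
                  1, 3, 0x2000f,
                  // ...
          },
          NumStmt: [11]uint16{1 /* , ... */},
  }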

The "go test" command also emits boilerplate code into the generated
"_testmain.go" to register each instrumented source file and unpack the coverage
data structures into something that can be easily accessed at runtime.
Finally, the modified "_testmain.go" has code to call runtime routines that
emit the coverage output file when the test completes.
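
A condensed sketch of that boilerplate appears below. It is written from
memory rather than copied from real generated output, so treat the exact
shape as an approximation: testing.RegisterCover and testing.CoverBlock are
the (pre-1.20) hooks in the standard library, while coverRegisterFile is a
generated helper.

  var (
          coverCounters = make(map[string][]uint32)
          coverBlocks   = make(map[string][]testing.CoverBlock)
  )

  func init() {
          // One registration call per instrumented source file.
          coverRegisterFile("mypkg/abc.go",
                  GoCover_0_343662613637653164643337.Count[:],
                  GoCover_0_343662613637653164643337.Pos[:],
                  GoCover_0_343662613637653164643337.NumStmt[:])
  }

  // coverRegisterFile unpacks the per-file arrays into the maps consumed
  // by the testing package.
  func coverRegisterFile(fileName string, counter []uint32, pos []uint32, numStmts []uint16) {
          coverCounters[fileName] = counter
          block := make([]testing.CoverBlock, len(counter))
          for i := range counter {
                  block[i] = testing.CoverBlock{
                          Line0: pos[3*i+0],
                          Col0:  uint16(pos[3*i+2]),
                          Line1: pos[3*i+1],
                          Col1:  uint16(pos[3*i+2] >> 16),
                          Stmts: numStmts[i],
                  }
          }
          coverBlocks[fileName] = block
  }

  func main() {
          testing.RegisterCover(testing.Cover{
                  Mode:     "set",
                  Counters: coverCounters,
                  Blocks:   coverBlocks,
          })
          // ... the usual _testmain.go code then runs the tests; the testing
          // package writes the -coverprofile output when the run completes ...
  }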

Strengths and weaknesses of what we currently provide

The current implementation is simple and easy to use, and provides a good user
experience for the use case of collecting coverage data for package unit tests.
Since "go test" is performing both the build and the invocation/execution of the
test, it can provide a nice seamless "single command" user experience.

A key weakness of the current implementation is that it does not scale well-- it
is difficult or impossible to gather coverage data for applications as opposed
to collections of packages, and for testing scenarios involving multiple
runs/executions.

For example, consider a medium-sized application such as the Go compiler ("gc").
While the various packages in the compiler source tree have unit tests, and one
can use "go test" to obtain coverage data for those tests, the unit tests by
themselves only exercise a small fraction of the code paths in the compiler that
one would get from actually running the compiler binary itself on a large
collection of Go source files.

For such applications, one would like to build a coverage-instrumented copy of
the entire application ("gc"), then run that instrumented application over many
inputs (say, all the Go source files compiled as part of a "make.bash" run for
multiple GOARCH values), producing a collection of coverage data output files,
and finally merge together the results to produce a report or provide a
visualization.

Many folks in the Golang community have run into this problem; there are large
numbers of blog posts and other pages describing the issue, and recommending
workarounds (or providing add-on tools that help); doing a web search for
"golang integration code coverage" will turn up many pages of links.

An additional weakness in the current Go toolchain offering relates to the way
in which coverage data is presented to the user by the "go tool cover"
commands. The reports produced are "flat" rather than hierarchical (e.g. a flat list of
functions, or a flat list of source files within the instrumented packages).
This way of structuring a report works well when the number of instrumented
packages is small, but becomes less attractive if there are hundreds or
thousands of source files being instrumented. For larger applications, it would make
sense to create reports with a more hierarchical structure: first a summary by module,
then package within module, then source file within package, and so on.

Finally, there are a number of long-standing problems that arise due to the use
of source-to-source rewriting used by cmd/cover and the go command, including

#23883
"cmd/go: -coverpkg=all gives different coverage value when run on a
package list vs ./..."

#23910
"cmd/go: -coverpkg packages imported by all tests, even ones that
otherwise do not use it"

#27336
"cmd/go: test coverpkg panics when defining the same flag in
multiple packages"

Most of these problems arise because of the introduction of additional imports
in the _testmain.go shim created by the Go command when carrying out a coverage
test run in combination with the "-coverpkg" option.

Proposed changes

Building for coverage

While the existing "go test" based coverage workflow will continue to be
supported, the proposal is to add coverage as a new build mode for "go build".
In the same way that users can build a race-detector instrumented executable
using "go build -race", it will be possible to build a coverage-instrumented
executable using "go build -cover".

To support this goal, the plan will be to migrate the support for coverage
instrumentation into the compiler, moving away from the source-to-source
translation approach.

Running instrumented applications

Applications are deployed and run in many different ways, ranging from very
simple (direct invocation of a single executable) to very complex (e.g. gangs of
cooperating processes involving multiple distinct executables). To allow for these more
complex execution/invocation scenarios, it doesn't make sense
to try to serialize updates to a single coverage output data file during the
run, since this would require introducing synchronization or some other
mechanism to ensure mutually exclusive access.

For non-test applications built for coverage, users will instead select an
output directory as opposed to a single file; each run of the instrumented
executable will emit data files within that directory. Example:

$ go build -o myapp.exe -cover ...
$ mkdir /tmp/mycovdata
$ export GOCOVERDIR=/tmp/mycovdata
$ <run test suite, resulting in multiple invocations of myapp.exe>
$ go tool cover -html=/tmp/mycovdata
$

For coverage runs in the context of "go test", the default will continue to be
emitting a single named output file when the test is run.

File names within the output directory will be chosen at runtime so as to
minimize the possibility of collisions, e.g. possibly something to the effect of

  covdata.<metafilehash>.<processid>.<nanotimevalue>.out

When invoked for reporting, the coverage tool itself will test its input
argument to see whether it is a file or a directory; in the latter case, it will
read and process all of the files in the specified directory.
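
A minimal sketch of that dispatch (a hypothetical helper, not the actual tool
code; assumes imports of os and path/filepath):

  // gatherInputs returns the profile files named by arg: either the single
  // file itself, or every file within the directory that arg refers to.
  func gatherInputs(arg string) ([]string, error) {
          fi, err := os.Stat(arg)
          if err != nil {
                  return nil, err
          }
          if !fi.IsDir() {
                  return []string{arg}, nil
          }
          entries, err := os.ReadDir(arg)
          if err != nil {
                  return nil, err
          }
          var files []string
          for _, e := range entries {
                  files = append(files, filepath.Join(arg, e.Name()))
          }
          return files, nil
  }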

Programs that call os.Exit(), or never terminate

With the current coverage tooling, if a Go unit test invokes os.Exit() passing a
non-zero exit status, the instrumented test binary will terminate immediately
without writing an output data file. If a test invokes os.Exit() passing a zero exit
status, this will result in a panic.

For unit tests, this is perfectly acceptable-- people writing tests generally
have no incentive or need to call os.Exit, it simply would not add anything in
terms of test functionality. Real applications routinely finish by calling os.Exit,
however, including cases where a non-zero exit status is reported.
Integration test suites nearly always include tests that ensure an application
fails properly (e.g. returns with non-zero exit status) if the application
encounters an invalid input. The Go project's all.bash test suite has many of these sorts of tests,
including test cases that are expected to cause compiler or linker errors (and
to ensure that the proper error paths in the tool are covered).

To support collecting coverage data from such programs, the Go runtime will need
to be extended to detect os.Exit calls from instrumented programs and ensure (in
some form) that coverage data is written out before the program terminates.
This could be accomplished either by introducing new hooks into the os.Exit
code, or possibly by opening and mmap'ing the coverage output file earlier in
the run, then letting writes to counter variables go directly to an mmap'd
region, which would eliminate the need to close the file on exit (credit to
Austin for this idea).

To handle server programs (which in many cases run forever and may not call
exit), APIs will be provided for writing out a coverage profile under user
control, e.g. something along the lines of

  import "<someOfficialPath>/cover"

  var coverageoutdir = flag.String(...)

  func server() {
    ...
    if *coverageoutdir != "" {
        f, err := cover.OpenCoverageOutputFile(...)
        if err != nil {
            log.Fatal("...")
        }
    }
    for {
      ...
      if <received signal to emit coverage data> {
        err := f.Emit()
        if err != nil {
            log.Fatalf("error %v emitting ...", err)
        }
      }
    }
  }

In addition to OpenCoverageOutputFile() and Emit() as above, a variant of
Emit() will be provided that accepts an io.Writer (to allow coverage profiles
to be written to a network connection or pipe, in cases where writing to a file
is not possible).
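
To give a sense of how the io.Writer variant can be used, here is a sketch
based on the API that ultimately shipped in runtime/coverage in Go 1.20 (the
names used earlier in this section are proposal placeholders;
coverage.WriteCounters is the real writer-based call, and the endpoint path
is made up):

  package main

  import (
          "net/http"
          "runtime/coverage"
  )

  func main() {
          // Stream a snapshot of the coverage counters to the caller, e.g.
          // a test harness polling a long-running, -cover-built server.
          http.HandleFunc("/debug/coverage", func(w http.ResponseWriter, r *http.Request) {
                  if err := coverage.WriteCounters(w); err != nil {
                          http.Error(w, err.Error(), http.StatusInternalServerError)
                  }
          })
          http.ListenAndServe("localhost:8080", nil)
  }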

Coverage and modules

Most modern Go programs make extensive use of dependent third-party packages;
with the advent of Go modules, we now have systems in place to explicitly
identify and track these dependencies.

When application writers add a third-party dependency, in most cases the authors
will not be interested in having that dependency's code count towards the
"percent of lines covered" metric for their application (there will definitely
be exceptions to this rule, but it should hold in most cases).

It makes sense to leverage information from the Go module system when collecting
code coverage data. Within the context of the module system, a given package feeding
into the build of an application will have one of the three following dispositions (relative to
the main module):

  • Contained: package is part of the module itself (not a dependency)
  • Dependent: package is a direct or indirect dependency of the module (appearing in go.mod)
  • Stdlib: package is part of the Go standard library / runtime

With this in mind, the proposal when building an application for coverage will
be to instrument every package that feeds into the build, but record the
disposition for each package (as above), then allow the user to select the
proper granularity or treatment of dependencies when viewing or reporting.

As an example, consider the Delve debugger
(a Go application). One entry in the Delve V1.8 go.mod file is:

    github.com/cosiner/argv v0.1.0

This package ("argv") has about 500 lines of Go code and a couple dozen Go
functions; Delve uses only a single exported function. For a developer trying to
generate a coverage report for Delve, it seems unlikely that they would want to
include "argv" as part of the coverage statistics (percent lines/functions executed),
given the secondary and very modest role that the dependency plays.

On the other hand, it's possible to imagine scenarios in which a specific
dependency plays an integral or important role for a given application, meaning
that a developer might want to include the package in the application's coverage
statistics.
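
For illustration, here is how that selection looks with the flags that
ultimately shipped (Go 1.20 extends -coverpkg to "go build"; the Delve paths
below are only illustrative):

  $ go build -cover -o dlv ./cmd/dlv                 # default: main-module packages only
  $ go build -cover -coverpkg=all -o dlv ./cmd/dlv   # everything, including deps and stdlib
  $ go build -cover -coverpkg=github.com/go-delve/delve/...,github.com/cosiner/argv \
        -o dlv ./cmd/dlv                             # the module plus one chosen dependency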

Merging coverage data output files

As part of this work, the proposal is to enhance the "go tool cover" command to
provide a profile merging facility, so that a collection of coverage data files
(emitted from multiple runs of an instrumented executable) can be merged into a
single summary output file. Example usage:

  $ go tool cover -merge -coveragedir=/tmp/mycovdata -o finalprofile.out
  $

The merge tool will be capable of writing files in the existing (legacy)
coverage output file format, if requested by the user.

In addition to a "merge" facility, it may also be interesting to support other
operations such as intersect and subtract (more on this later).
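
In the implementation that eventually shipped, this facility landed in a
separate "go tool covdata" command rather than a -merge flag on "go tool
cover"; for example (directory names hypothetical):

  $ go tool covdata merge -i=run1,run2,run3 -o=merged
  $ go tool covdata textfmt -i=merged -o=finalprofile.out   # legacy text format
  $ go tool covdata percent -i=merged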

Differential coverage

When fixing a bug in an application, it is common practice to add a new unit
test in addition to the code change that comprises the actual fix.
When using code coverage, users may want to learn how many of the changed lines
in their code are actually covered when the new test runs.

Assuming we have a set of N coverage data output files (corresponding to those
generated when running the existing set of tests for a package) and a new
coverage data file generated from a new testpoint, it would be useful to provide
a tool to "subtract" the coverage information of the first set from the
second file. This would leave just the set of new lines / regions that the new test causes to
be covered above and beyond what is already there.

This feature (profile subtraction) would make it much easier to write tooling
that would provide feedback to developers on whether newly written unit tests
are covering new code in the way that the developer intended.
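
With the shipped tooling this corresponds to the "subtract" mode of "go tool
covdata"; a sketch of the workflow described above (directory names
hypothetical):

  $ go tool covdata subtract -i=newtestdata,baselinedata -o=delta   # coverage unique to the new test
  $ go tool covdata func -i=delta                                   # per-function report on the delta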

Design details

Please see the design document for details on proposed changes to the compiler, etc.

Implementation timetable

The plan is for thanm@ to implement this in the Go 1.19 time frame.

Prerequisite Changes

N/A

Preliminary Results

No data available yet.

@thanm thanm added the Proposal label Mar 2, 2022
@thanm thanm self-assigned this Mar 2, 2022
@gopherbot gopherbot added this to the Proposal milestone Mar 2, 2022
@gopherbot

Change https://go.dev/cl/388857 mentions this issue: proposal: design document for redesigned code coverage

@mvdan
Member

mvdan commented Mar 2, 2022

#30306 is also likely relevant :)

@qmuntal
Contributor

qmuntal commented Mar 2, 2022

In the past I've needed to merge coverage profiles from the same application executed on different OSs (and therefore from different binaries) in order to report a single coverage metric and produce an OS-agnostic report.

I see no explicit reference to this use case in the design, would it be covered? (pun intended)

@thanm
Contributor Author

thanm commented Mar 2, 2022

Merging coverage profiles produced in different GOOS/GOARCH environments: yes, this will absolutely be supported.

One of the interesting (and IMHO slightly weird) aspects of the current coverage system is that its definition of "all source code in the package" is limited by what's picked up by the build tags in effect for the "go test" run.

So it's possible to do "go test -cover" and see "100%" statement coverage on linux, even if there may be a 500-line function in a file foo_windows.go in the package (with build tags to select only for GOOS=windows) that is effectively invisible to the coverage tooling.
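
With the tooling that eventually shipped, the cross-GOOS workflow sketched in
the proposal looks roughly like this (paths and directory names hypothetical):

$ GOOS=linux go build -cover -o app.linux .
$ GOOS=windows go build -cover -o app.exe .
$ mkdir lindata windata
$ GOCOVERDIR=lindata ./app.linux ...      # run on the linux host
$ GOCOVERDIR=windata app.exe ...          # run on the windows host
$ go tool covdata merge -i=lindata,windata -o=combined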

@stapelberg
Contributor

This sounds great! Can’t wait to try this out! :)

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Mar 2, 2022
@bcmills
Contributor

bcmills commented Mar 2, 2022

(#31007 is also somewhat related.)

@ChrisHines
Contributor

I really like the direction this proposal would take code coverage for Go. It looks like this proposal would lay the foundation to bring Go's code coverage story to the next level. From a developer's perspective I really liked the capabilities of the EMMA tool for Java code coverage I used to use years ago. I felt I had lost something with the code coverage tools available to Go when it became my main programming language roughly ten years ago. Consider taking a look at it as prior art if you haven't already.

In particular I would like to advocate for the "Intra-line coverage" feature mentioned in the "Possible extensions" section of the proposal. That was a feature of EMMA that I got a lot of value out of when I used it (EMMA calls it fractional line coverage).

But also, the ability to measure code coverage for a whole application and gather coverage data from integration tests are two use cases my current project at work needs a solution for. This was a discussion that came up just last week, so the timeliness of this proposal is amazing.

@adonovan
Member

adonovan commented Mar 2, 2022

Nice idea. This would bring the designs for coverage used by the go build tool and Blaze/Bazel into convergence. (The latter already implements coverage as a build mode, and supports whole-program coverage.) Moving the instrumentation step from a source-based transformation to a compiler pass should also simplify both go and Bazel, and may permit finer-grained coverage of short-circuit expressions such as a || b && c than line granularity.

@rsc rsc changed the title proposal: extend Go's code coverage testing to include applications proposal: cmd/cover: extend Go's code coverage testing to include applications Mar 2, 2022
@rsc rsc changed the title proposal: cmd/cover: extend Go's code coverage testing to include applications proposal: cmd/cover: extend coverage testing to include applications Mar 2, 2022
@rsc
Contributor

rsc commented Mar 2, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals (old) Mar 2, 2022
@ianlancetaylor
Contributor

CC @robpike

@robpike
Contributor

robpike commented Mar 3, 2022

While I understand the weaknesses of the current approach as well as anyone, moving the heavy lifting into the compiler seems like a mistake to me. There are advantages to the extant decoupling, and the ease of maintenance of the tool and tweaks to the build chain.

Wouldn't it be simpler and more flexible to keep the source-to-source translator and just work on go build? The problems of compilation and coverage can be kept separate, with all the advantages of simplicity and portability that result.

@thanm
Contributor Author

thanm commented Mar 3, 2022

Wouldn't it be simpler and more flexible to keep the source-to-source translator and just work on go build?

This is a fair question, and something I've been thinking about a good bit lately.

My initial instinct was to use an entirely compiler-based approach because it seemed to me to offer the most control/freedom in terms of the implementation, but also (to be honest) because the compiler + linker are where I have the most expertise, e.g. my "comfort zone".

I've written CLs to do things in the compiler, and while this approach seems to work well overall, there are definitely some headaches.

One headache is that if we move things into the main GC compiler, we also need something that will work with the other compilers (gccgo, gollvm).

A second problem is that the compiler at the moment doesn't capture all of the source position information that you need. For example, consider this function:

func addStr(x, y string) string {	// line 19
	return x + y 			// line 20
}					// line 21

Here's the AST dump from the compiler before the 'walk' phase:

before walk addStr
.   RETURN tc(1) # p.go:20:2
.   RETURN-Results
.   .   ADDSTR esc(h) string tc(1) # p.go:20:11
.   .   ADDSTR-List
.   .   .   NAME-p.x esc(no) Class:PPARAM Offset:0 string tc(1) # p.go:19:13
.   .   .   NAME-p.y esc(no) Class:PPARAM Offset:0 string tc(1) # p.go:19:16

The compiler doesn't capture any source position info for the open and close brackets on lines 19 and 21 (why bother, we don't need them for debugging), and if you look at the references to "x" and "y" on line 20, their source positions correspond to their definitions, not their uses.

This doesn't create any issues when reporting basic coverage metrics (e.g. percent of statements covered), but it's a real problem when generating HTML reports, since you want to be able to "paint" chunks of the code red or green (depending on whether they executed or not)-- you can't paint the code correctly if you didn't capture any source position info for it.

Of course, I could change the compiler to work in a mode where it captures/records more of this info (if building with "-cover"), but the risk there is that it might slow down the "fast path" for compilation even when the extra info collection is turned off.

Given that I've already written a compiler-based implementation, I think what I am going to do now is try prototyping a source-to-source based alternative as well (but moving to emitting the new output format). That might be a better middle ground.

@ChrisHines
Contributor

Is there any synergy between this proposal and the new go test -fuzz feature? Could fuzz testing reuse the coverage tooling from this proposal to any benefit?

@thanm
Contributor Author

thanm commented Mar 3, 2022

Is there any synergy between this proposal and the new go test -fuzz feature? Could fuzz testing reuse the coverage tooling from this proposal to any benefit?

This is a reasonable question (I might add that this also came up previously during an internal design review).

Although coverage testing and fuzzing both incorporate some notion of "coverage data", the two are sufficiently different in terms of how they use the data that it probably isn't worth trying to share implementations.

In both cases the compiler (or tool) is adding instrumentation code (counter increment or modification) on control flow edges, but beyond that things diverge in a big way.

For fuzzing there is no need to capture source position information for counters at all (there would be no point), and the values of the coverage counters are used only internally / on-line within the test, never written out or stored. The only thing the fuzzer wants to know is whether coverage "changed" as a result of a mutation (AIUI).

For coverage testing on the other hand, source position info is critical (it's arguably the key piece), and unlike fuzzing we don't just want to determine that coverage has changed, we need to be able to write it out and report on it.

So with that in mind, my guess is that there isn't going to be a lot of useful overlap. Which I think is probably OK -- the existing fuzzer support for coverage instrumentation is pretty simple, e.g.

https://go.googlesource.com/go/+/f1dce319ffd9d3663f522141abfb9c1ec9d92e04/src/cmd/compile/internal/walk/order.go#444

gopherbot pushed a commit to golang/proposal that referenced this issue Mar 3, 2022
Detailed design document for proposal golang/go#51430, revamping
code coverage testing.

Updates golang/go#51430.

Change-Id: I692c6c6f977b9e4e14a6c4cc5fbd086d53021c27
Reviewed-on: https://go-review.googlesource.com/c/proposal/+/388857
Trust: Than McIntosh <thanm@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
@komuw
Contributor

komuw commented Mar 9, 2022

problem:

  • Lets say you have a codebase and the whole test suite takes a considerable time to run.
  • And you need to make a change to a particular section of the codebase(maybe one function).
  • What I always want to know is; which particular testcase/s covers the function that I'm about to modify?
    That way, once I'm done modifying that function, I can run just the particular testcase/s that cover that function without having to run the whole test suite. Of course the whole test suite would still be run in the builders/CI, but on my laptop I just want to run the subset of tests that cover the function I was working on.

The above is a situation I always find myself in and something I had hoped go tool cover would help with.
I don't know if this proposal will make it possible to answer the question; what testcase/s cover func Bar?,
but I'm sharing my usecase just in case.

(edit) my current workaround is usually;

func Bar(){
+   debug.PrintStack()
}

and then I run the whole test suite with -v and figure out the testcases that cover Bar based on the stacktrace.

@thanm
Contributor Author

thanm commented Mar 9, 2022

which particular testcase/s covers the function that I'm about to modify?

This is covered in the detailed design document in this section:

https://go.googlesource.com/proposal/+/master/design/51430-revamp-code-coverage.md#tagging-coverage-profiles-to-support-test-origin-queries

I agree that having tools to answer these sorts of queries would be really valuable.

When we were rewriting the Go linker during the 1.15/1.16 time frame it seemed as though we were running into this situation on a daily basis (the linker has many "dark corners" that are only executed with obscure inputs and build modes, leading to many questions of the form "What test do I need to run in order to trigger the execution of this function?").

@rsc
Contributor

rsc commented Mar 30, 2022

The discussion seems to have trailed off. Is there anything left to discuss? Does anyone object to this plan?

@adonovan
Member

Rob's point about compiler complexity/portability is valid, and Than's "middle ground"---retaining source-to-source translation but standardizing the build and runtime interfaces of coverage---sounds like a good compromise.

@rsc
Contributor

rsc commented Apr 6, 2022

I am not sure we want to end up in a world where there are two different coverage mechanisms we have to maintain, so I am not sure about the middle ground of keeping both - what would use the old source-to-source translator, and why would we maintain it?

LLVM and gccgo do not support cmd/cover right now; they just use the coverage built in to those compilers. (It might be nice to have some kind of adapter to generate the files that cmd/cover builds nice HTML displays from though.)

Is there anything more we need to know from the implementation side in order to decide here? That is, do we want to wait for any more CLs?

@adonovan
Member

adonovan commented Apr 6, 2022

My understanding of Than's "middle ground" approach is that he intended to proceed with the changes to the run-time interface (the Go packages used within the target process, and the contract between those packages and the build tool regarding where files are written), but to back away from compiler-based instrumentation and keep the existing source-to-source translation algorithm.

@rsc
Contributor

rsc commented Apr 13, 2022

@thanm, what do you think the status of this proposal is? Are there any details in flux that we still need to work out?
And can you confirm @adonovan's comment that the plan is to stick with source-to-source and not do compiler instrumentation?

@thanm
Contributor Author

thanm commented Apr 13, 2022

what do you think the status of this proposal is? Are there any details in flux that we still need to work out?

hi, thanks for the "ping".

In terms of the design, I think things are mostly settled overall. There are a couple smallish items relating to command line API that need hashing out; I am in the process of adding more details in these areas to the design doc, and when I am done I will post an update here on the issue.

In terms of the implementation, I have "go build -cover" working and all.bash passing with the new stuff turned on (new design does everything that the old one did). I have not actually landed any of my CL stack yet however.

The test "origin" feature and the intra-line coverage featureare not yet implemented; I think at this point (given that the release freeze in in four weeks) I'll need to postpone features until after Go 1.19.

can you confirm @adonovan's comment that the plan is to stick with source-to-source and not do compiler instrumentation?

Confirmed, there will still be source-to-source rewriting, then there will be a small amount of additional "special sauce" done by the compiler when building the rewritten source. Doing things this way (IMO) provides the best overall solution.

Thanks.

@thediveo

thediveo commented Oct 4, 2022

The error messages in https://go.dev/cl/438503 still use the old "clunky" function names.

@thanm
Contributor Author

thanm commented Oct 4, 2022

@thediveo Thanks, I'll fix that up.

@mszostok

mszostok commented Oct 5, 2022

On the other hand, it's possible to imagine scenarios in which a specific
dependency plays an integral or important role for a given application, meaning
that a developer might want to include the package in the application's coverage
statistics.

FYI: I'm waiting for exactly this option. My use case is simple: I develop a versioning Go library that, as you may expect, relies on the build process. So I have a bunch of e2e tests with example "programs" that use this lib and are built in different ways. I would be happy to have a clear way to see which lib functionality is already well tested. Currently, I'm a bit blind here.

Hope that this will be shipped too 🤞

gopherbot pushed a commit that referenced this issue Oct 5, 2022
Use shorter more Go-like names for the new APIs being added in the
runtime/coverage package for writing coverage data under user control
from server programs. Old names were a bit too clunky/verbose.

Updates #51430.

Change-Id: Ifdd5b882a88613c7c4342b40ed93b58547483c77
Reviewed-on: https://go-review.googlesource.com/c/go/+/438503
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@CarlJi

CarlJi commented Nov 27, 2022

Great proposal, this is definitely a big step forward for the Go coverage toolchain; looking forward to it being release ready.

In addition, I have a question: does the new solution support getting coverage data remotely?

The problem is that in some complex scenarios, like system or e2e testing, the main entity under test is usually a complete remote cluster composed of many applications running on many different machines or containers. So it would be convenient for users to obtain the coverage data of remote services locally.

With the current (or old) coverage support in Go this didn't seem to be possible, so we built a workaround named goc to address it. Based on my understanding, this proposal should solve the problem natively, but after going through the latest code I didn't find such a facility. Please help.

@thediveo

From the existing description I would say that you might run a bunch of tests and upload the coverage data directory. But I expect this to be very basic so you'll have to add your more specific handling of the generated data anyway.

@gopherbot

Change https://go.dev/cl/455496 mentions this issue: _content: add section on integration test code coverage

gopherbot pushed a commit to golang/website that referenced this issue Dec 7, 2022
Add a landing page with material on how to gather coverage
profiles from integration tests.

Updates golang/go#51430.

Change-Id: I79aed14a3a2f87ed15ed8c603f26dd2b1a1fb508
Reviewed-on: https://go-review.googlesource.com/c/website/+/455496
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
@JeanChristopheMorinPerso

Hi, I just tried this in Go 1.20rc1 and it looks really great. It really improves a lot of things.

I have a question though. How can we merge coverage data generated by integration tests with coverage data generated by unit tests? So far, I haven't found a way to tell go test to output coverage data in the new format. I can surely convert the integration test coverage into the old format using go1.20rc1 tool covdata textfmt ... and then merge (manually) the two coverage reports together.

I also see cases where someone would want to run unit tests on multiple platforms and simply merge the coverage to get a single coverage report. Right now that doesn't seem to be natively possible?

Thanks!

@thanm
Contributor Author

thanm commented Dec 9, 2022

@JeanChristopheMorinPerso this is a feature that didn't quite make it into the release. It is on my short list of follow-on features to add, however. I have a CL 456595 with a prototype implementation, but it didn't make it in before the freeze.

There is one hack/workaround you can use in the meantime, which is to build a test executable and then run it with an arg to pass in the GOCOVERDIR value. Example:

$ rm -f t.exe
$ go test -cover -c -o t.exe .
$ mkdir somedata
$ ./t.exe -test.count=1 -test.gocoverdir=somedata
...
$ ls somedata
covcounters.5e4ee7d09d99e0afcf93eafc6af2dcfd.2577901.1670614336696882176
covmeta.5e4ee7d09d99e0afcf93eafc6af2dcfd
$ go tool covdata percent -i=somedata
	cov.example/p	coverage: 63.8% of statements
$

At the moment -test.gocoverdir is an unexported test flag, but in theory you can hijack it for this purpose.

@JeanChristopheMorinPerso

JeanChristopheMorinPerso commented Dec 9, 2022

Great, thanks @thanm . It looks like -test.gocoverdir does exactly what I was looking for!

go1.20rc1 test -cover ./... -args -test.gocoverdir=mydir

I'm really excited about these improvements. Thanks a lot @thanm, and great work!

@rbroggi

rbroggi commented Feb 15, 2023

Hello all,

I would like to raise a discussion about a potential issue in how the go tool covdata command displays the coverage coming from the main file:

For the sake of this discussion, I created a reproducer repository also hosting a CI execution that displays my point.

It seems that, unlike all the other files which get prefixed by their respective package/module path, the main.go file enters the report prefixed with the absolute path to it:

Here you can see that after merging the results of unit-test coverage reports and binary coverage reports using go tool covdata, you get something like this:

$ cat coverage/merged/coverage_percent.out
mode: set
/usr/src/myapp/main.go:10.13,12.2 1 1
github.com/rbroggi/binarycov/greetings/greetings.go:3.23,5.2 1 1

The first line of this report breaks tools like gocover-cobertura, which is the default way to allow repositories hosted on GitLab to feature MR coverage visualizations.

Was there a reason to prefix main.go with the absolute path instead of the package path used for all the other files?

@thanm
Contributor Author

thanm commented Feb 15, 2023

Thanks for the feedback. Just as a meta-remark, it might make sense to have this conversation in the context of another (new) issue, since this issue is really more about the decision to extend the coverage tooling as opposed to the nitty gritty on how the tools work.

What's happening here has to do with how the Go command classifies packages; you can actually see similar behavior in 1.19 prior to the changes in this proposal. Here is an example:

$ cat go.mod
module repro.test
go 1.19
$ cat main/main.go
package main
func main() {
	println("himom")
}
$ cat main/main_test.go
package main

import "testing"

func TestMumble(t *testing.T) {
	if 42 != 42 {
		t.Errorf("bad")
	}
}
$ go test -coverprofile=firstcover.txt repro.test/main
ok  	repro.test/main	0.014s	coverage: 0.0% of statements
$ cat firstcover.txt
mode: set
repro.test/main/main.go:3.13,5.2 1 0
$ cd main
$ go test -coverprofile=secondcover.txt ./main.go ./main_test.go
ok  	command-line-arguments	0.014s	coverage: 0.0% of statements
$ cat secondcover.txt
mode: set
/tmp/repro/main/main.go:3.13,5.2 1 0
$

The convention is that for "local" imports of this sort (e.g. where the package is selected via a relative path) we use an absolute path for the filename so as to make HTML reporting easier -- see this code in the Go command.

This is what's triggering the behavior you are seeing in your example. To work around it, try changing your Makefile from

build_coverage: coverage_dir
	@go build -cover  -v -o $(BINARY_NAME) .

to

build_coverage: coverage_dir
	@go build -cover  -v -o $(BINARY_NAME) github.com/rbroggi/binarycov

That should give you a regular import path instead of a local full path.

There are arguments to be made both ways in terms of whether the Go command should be doing this, but for the moment it's the convention.

@rbroggi

rbroggi commented Feb 15, 2023

Dear @thanm, thank you for the spot-on observation. I suspected there was a good reason for that behavior. As far as I'm concerned, being able to get either output format depending on how the inputs are provided is satisfactory, and it solves the issue I was facing.

@thediveo

At the moment -test.gocoverdir is an unexported test flag, but in theory you can hijack it for this purpose.

Any news on making -test.gocoverdir more prominently visible? As it is, I use it to run my unit tests once as root and once as non-root, and as of Go 1.20 I don't need an external merging tool anymore, as the Go toolchain now supports this out of the box -- at least with the unexported flag.

What's in for -test.gocoverdir= in the near future?

@thanm
Contributor Author

thanm commented Mar 27, 2023

What's in for -test.gocoverdir= in the near future?

I will be working on this during the 1.21 timeframe (this sort of change is not the kind of thing we back-port to minor releases, so no changes planned for 1.20.X).

@kokes

kokes commented May 9, 2023

I'm wondering if these docs should be extended to at least briefly mention runtime/coverage functions, which are quite handy for long running binaries. I only discovered these functions by introspecting the commit history here. I got it working without these by killing my process and collecting the profiles (which was fun since I was running these as PID 1), but I suppose this programmatic way is a bit more ergonomic.

@thanm
Contributor Author

thanm commented May 9, 2023

@kokes I agree, this makes sense. This is something that I've considered but hasn't made it to the top of my "to do" list (too many other pots on the stove).

@KumanekoSakura

I have a question/suggestion for https://go.dev/doc/build-cover#panicprof .

I want to collect coverage data of a server application. And I want coverage data be written without terminating the server process so that I can determine what request should I try for not yet covered statements.

Is something like below possible (are coverage write functions goroutine safe)?

package main

import (
	"net/http"
	"os"
	"runtime/coverage"
)

func main() {
	if dir := os.Getenv(`GOCOVERDIR`); dir != `` && coverage.WriteMetaDir(dir) == nil {
		http.HandleFunc(`/save-coverage`, func(w http.ResponseWriter, r *http.Request) {
			if err := coverage.WriteCountersDir(dir); err != nil {
				w.WriteHeader(http.StatusInternalServerError)
				w.Write([]byte(err.Error()))
			} else {
				w.WriteHeader(http.StatusNoContent)
			}
		})
	}
	http.ListenAndServe(`0.0.0.0:1080`, nil)
}

If yes, please consider appending something like "But you can save profile data using coverage.WriteMetaDir() and coverage.WriteCountersDir() as needed, in order to minimize amount of profile data lost by crash." after "profile data from statements executed during the run will be lost." .

@thanm
Contributor Author

thanm commented Dec 15, 2023

are coverage write functions goroutine safe

Not sure quite what you mean by "goroutine safe" in this context.

Your code looks as though it ought to run in theory, have you run it and tried it out? By the way, the coverage.WriteMetaDir() call in your code is unnecessary; if a program is built with "-cover" and GOCOVERDIR is set, meta-data will be written on program startup.

please consider appending something like

I'm not sure that is really a good idea; most applications just don't panic or crash often enough to justify that sort of extra hackery and additional overhead.

@KumanekoSakura

Oops, I missed "Programs that call os.Exit(), or never terminate" already answered my question.

  • To handle server programs (which in many cases run forever and may not call exit), APIs will be provided for writing out a coverage profile under user control

Anyway, replying to your comment.

Not sure quite what you mean by "goroutine safe" in this context.

src/runtime/coverage/emit.go says

  • This file contains functions that support the writing of data files emitted at the end of code coverage testing runs, from instrumented executables.

which might imply that it is not safe to call functions in emit.go outside of process termination event. But WriteCountersDir() says

  • The counter data written will be a snapshot taken at the point of the call.

which might imply that it is safe to call WriteCountersDir() outside of process termination event.

Your code looks as though it ought to run in theory, have you run it and tried it out?

I wanted to know if we are allowed to call WriteCountersDir() outside of process termination event.

By the way, the coverage.WriteMetaDir() call in your code is unnecessary; if a program is built with "-cover" and GOCOVERDIR is set, meta-data will be written on program startup.

OK, I see.

I'm not sure that is really a good idea; most applications just don't panic or crash often enough to justify that sort of extra hackery and additional overhead.

My case was

  • Can I collect coverage data of a server application which won't terminate?

and the answer is

  • Yes, call coverage.WriteCountersDir(os.Getenv(`GOCOVERDIR`)) to collect a snapshot of coverage data.

Thank you.

@thanm
Contributor Author

thanm commented Dec 18, 2023

I wanted to know if we are allowed to call WriteCountersDir() outside of process termination event.

Yes, this is allowed/supported.
