Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: testing: store test artifacts #71287

Open
neild opened this issue Jan 15, 2025 · 47 comments
Open

proposal: testing: store test artifacts #71287

neild opened this issue Jan 15, 2025 · 47 comments
Labels
LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool Proposal Proposal-FinalCommentPeriod
Milestone

Comments

@neild
Copy link
Contributor

neild commented Jan 15, 2025

This is an offshoot of #43936.

Some tests produce output files which the user may want to examine. For example, a test might produce files which are compared to some reference. If the comparison fails, the user will want to examine the generated files. Or a test might produce a packet capture or similar trace which can be used to debug failures.

We call these output files "test artifacts".

This is a proposal to add support for storing test artifacts to the testing package.

We add a new method to testing.TB:

package testing

// OutputDir returns a directory for the test to store output files in.
// When the -outputdir flag is provided, this directory will be located
// under that directory. Otherwise, OutputDir returns a temporary directory
// which is removed after the test completes.
//
// Each test or subtest has a unique artifact directory.
// Repeated calls to OutputDir in the same test or subtest return the same directory.
// Subtest outputs are not located under the parent test's output directory.
func (t *testing.T) OutputDir() string

The -outputdir flag already exists, and is currently used to specify a location to put output files from profiling. We're adding an additional meaning to it here: It's now where all your saved test outputs go.

When -outputdir is specified, the first call to OutputDir in a test or subtest will emit a line to the test output consisting of "=== ARTIFACTS ", the test name, and the test artifact directory, separated by spaces:

=== ARTIFACTS TestName/subtest_name /path/to/root/artifact/dir/TestName__subtest_name

When -json is specified, this will appear as an entry with an Action of "artifacts", the usual Time, Package, and Test keys, and a "Path" key containing the artifact directory:

{"Time":"2025-01-15T13:39:27.75235-08:00","Action":"artifacts","Package":"path","Test":"TestName","Path":"/path/to/root/artifact/dir/TestName"}

That's the proposal.

A few points on the design:

  • I'm reusing the existing -outputdir flag, on the theory that output files from profiling are just another test artifact. If we don't like that reuse, then we could add a new -artifactdir flag and rename TB.OutputDir to TB.ArtifactDir for consistency.

  • The test output uses the word "ARTIFACTS" because the JSON already has "output" events.

  • TB.OutputDir returns a directory, same as TB.TempDir. This seems simpler than asking the testing package to copy files around.

  • TB.OutputDir returns a directory even if we aren't saving artifacts so test behavior doesn't change depending on the presence or absence of the -outputdir flag.

In simple interactive use, users can pass -outputdir to store test artifacts when debugging a failing test.

Test frameworks that collect artifacts can arrange to pass -outputdir to the tests they run and collect any artifacts after the fact.

As a concrete use case, within Google our testing infrastructure sets an environment variable to the location of a directory. Tests can write files into this directory, and those files will be stored and associated with the test run. If we implement this proposal, we can arrange for the test infrastructure to also pass this directory as an -outputdir flag, and any test using TB.OutputDir will automatically use the right location.

@mknyszek
Copy link
Contributor

mknyszek commented Jan 15, 2025

Speaking from the perspective of Go's CI infrastructure, it would be helpful to have some standard convention to indicate output artifacts at the standard library level, so I'm in support of this proposal (or something like it).

We share the tooling for processing Go tests with several other teams, and without a standard for this sort of thing, we would be stuck having to define our own amongst ourselves. That may turn out to be OK in practice, but having a single way to do it would save us all the time of trying to define our own convention, since it wouldn't even be a question. As far as test metadata goes, I think this would be one of the most valuable forms, for the Go team anyway, and IMO more valuable than Attr on its own due the well-defined convention. I can think of several use-cases for exactly this:

  • Execution trace tests and pprof tests would emit artifacts when they encounter broken traces produced by the runtime, for debugging.
  • Crashes in CI would result in core dumps that can be easily downloaded and dissected. (So many more issues would be possible to debug!)
  • Integration tests for binary format serialization/deserialization could dump intermediate artifacts when things go wrong.

I don't have many thoughts on the specifics of how this should look in the output or in the testing package, but from my perspective this doesn't seem like much of an overreach into framework territory. It's a fairly common and simple piece of functionality, and again, essentially just defining a convention.

@gabyhelp gabyhelp added the LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool label Jan 15, 2025
@seankhliao
Copy link
Member

How would this interact with fuzzing which currently stores new inputs that cause crashes in testdata/fuzz/ ?

@neild
Copy link
Contributor Author

neild commented Jan 15, 2025

No changes to fuzzing, unless I'm missing something. Is there some interaction that should happen there?

@seankhliao
Copy link
Member

I was thinking they're similar in that they generate test artifacts, and if you're running in some remote CI system, you may want a (relatively) easy way to extract the new additions from the other fuzz inputs, such as redirecting them to a specified directory as in this proposal.

@neild
Copy link
Contributor Author

neild commented Jan 16, 2025

I don't think we can have -outputdir stop us from writing fuzz crashes to testdata/fuzz; anyone using fuzzing now is presumably depending on the current behavior and won't want it changed by an orthogonal feature.

Perhaps we could write fuzz crashes to both testdata/fuzz and -outputdir. But I think simpler would be to have a separate -fuzzoutputdir flag to set where new fuzz crashes get written. If you want everything in the same place, set both -fuzzoutputdir and -outputdir.

I think a -fuzzoutputdir flag is probably a separate proposal, though.

@earthboundkid
Copy link
Contributor

Instead of returning a string, could it return an os.Root?

@neild
Copy link
Contributor Author

neild commented Jan 16, 2025

Instead of returning a string, could it return an os.Root?

It could, but that'd be a gratuitous divergence from TB.TempDir. If you want a Root, it's easy enough to open the directory as one. And sometimes you do want just the directory name (when passing it as a flag to some other process, say).

@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Jan 16, 2025
@willfaught
Copy link
Contributor

willfaught commented Jan 16, 2025

If I run such tests for an unfamiliar package with just go test, will my system accrue these output artifacts permanently without my knowing? How do I clean these, or prevent them entirely?

Where will the paths likely be? Under /tmp, under the package, under GOPATH?

Do tests get unique paths over multiple test runs, or does one test run overwrite the artifacts of another for the same test? If overwrite, are all previous artifacts deleted first?

I was thinking they're similar in that they generate test artifacts, and if you're running in some remote CI system, you may want a (relatively) easy way to extract the new additions from the other fuzz inputs, such as redirecting them to a specified directory as in this proposal.

Perhaps fuzz artifacts should be thought of as generated inputs, not outputs, so they shouldn't be included in this.

@neild
Copy link
Contributor Author

neild commented Jan 16, 2025

If I run such tests for an unfamiliar package with just go test, will my system accrue these output artifacts permanently without my knowing? How do I clean these, or prevent them entirely?

If you don't pass the --outputdir flag to "go test", artifacts get written to a temporary directory (exactly the same as if they were written to a t.TempDir) and are deleted after the test finishes.

If you do pass --outputdir, artifacts get written to that directory and you can decide what to do with them from there. Persistent artifacts are only produced when you ask for them, in the location you specify.

Do tests get unique paths over multiple test runs, or does one test run overwrite the artifacts of another for the same test? If overwrite, are all previous artifacts deleted first?

That is an excellent question. I'm going to have to think about it.

@neild
Copy link
Contributor Author

neild commented Jan 21, 2025

When a test using T.OutputDir is run multiple times (-count=N), I propose that each run gets a separate, empty output directory. We store all the outputs from all runs.

I see three possible behaviors here:

  1. Each run gets the same output directory. Nothing is cleaned up between runs, so a test may start with files in its output directory. I don't like this option at all. Test runs should be as independent of each other as we can make them, and leaving around old outputs works against that.
  2. Each run gets a clean output directory. We delete most of them and preserve only a few. (Maybe the outputs from the first run to succeed and the first to fail?) Keeps the total size of the outputs down, but we need to decide on which runs to keep. And what if someone wants all those outputs?
  3. Store everything. Simple. Easy to describe. Outputs might get large, but disks are big and the user can run go test multiple times with a smaller -count if they want to limit the total output size per invocation.

@aclements
Copy link
Member

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— aclements for the proposal review group

@aclements aclements moved this from Incoming to Active in Proposals Jan 23, 2025
@aclements
Copy link
Member

I'm reusing the existing -outputdir flag, on the theory that output files from profiling are just another test artifact.

This is currently documented as

	-outputdir directory
	    Place output files from profiling in the specified directory,
	    by default the directory in which "go test" is running.

This will muddy the "default" value. I think under this proposal, if -outputdir isn't specified, profiles would still go to the current directory, but artifacts would go to a temporary directory.

There's also an implementation issue with using -outputdir. Currently, go test always passes -test.outputdir if any test profiling flags are passed. It has to do this because go test can change the current directory before running the test binary and needs to communicate the "original" output directory.

One option would be for go test to rewrite the paths to the profiling flags to always include the full path, and pass -test.outputdir if and only if the user passed -outputdir. Then the test would be able to tell if the user set -outputdir.

When a test using T.OutputDir is run multiple times (-count=N), I propose that each run gets a separate, empty output directory. We store all the outputs from all runs.

This sounds reasonable. If we give each iteration the same output directory, there's bound to be a confusing mix of fixed file names that get overwritten each time and generated file names that accumulate.

@neild
Copy link
Contributor Author

neild commented Jan 27, 2025

If muddying the default value for -outputdir is a concern, we can add a separate -artifactdir flag instead. If we do that, I think the method should be T.ArtifactDir for consistency.

Another option might be to add -profiledir and -artifactdir flags, and say that -outputdir sets the default for both.

@rsc
Copy link
Contributor

rsc commented Feb 5, 2025

For profiling, there are two separate concerns: "enable profiling" and "where do profiles get written" (outputdir).
The current artifact proposal tries to merge them into one concern "where do artifacts get written" (unfortunately also called outputdir).

It seems reasonable to have one outputdir and just have a flag for "write artifacts" (default off). Then we can still keep the new OutputDir method, which may be useful regardless, but also have a "Artifacts() bool" method or something like that. Maybe it's not a method but is a top-level function like testing.Short. So you'd do

go test -artifacts

to drop them in the current directory, and by default you wouldn't get them at all.

Although I'm not sure why it's one flag to enable all possible artifacts. It seems like you might want each artifact writer to define its own flag, or at least have some kind of sub-selection. In the explanation:

Some tests produce output files which the user may want to examine. For example, a test might produce files which are compared to some reference. If the comparison fails, the user will want to examine the generated files. Or a test might produce a packet capture or similar trace which can be used to debug failures.

If one test is failing, it doesn't seem like you want to save all the artifacts from all the tests. Maybe -artifacts takes a test pattern? Or something else?

artifacts is also such a long string to type. It would be nice if we had a shorter name for these.

@neild
Copy link
Contributor Author

neild commented Feb 5, 2025

If you don't want artifacts from all tests, you can just run the tests that you want artifacts for, no? "Run these tests, but only save the artifacts from those tests" seems overly complicated.

A separate "write artifacts (to the output dir, which might be the current dir)" flag sounds reasonable to me.

I don't really like an Artifacts() bool method. That implies that a test running in save-artifacts mode behaves differently than one running in the default configuration. A test might fail in one configuration and not the other, and there will be extra test complexity around deciding whether and where to put files.

For example, one place I might use artifact storage is in a test for a code generator, like the protobuf compiler. These tests often compare the output of the generator to a golden reference. Storing the current output in the test artifacts is useful when debugging changes. With the current proposal, that's a simple matter of putting the files in t.OutputDir rather than t.TempDir directory. With an Artifacts() bool method it gets more complicated; do you copy the files after the test runs, do you change which directory you use based on the bool?

@rsc
Copy link
Contributor

rsc commented Feb 12, 2025

If we were starting from scratch and didn't already pass -test.outputdir to existing test invocations, @neild's previous comment would be spot on. Given that we have to add a different knob than outputdir (or else all tests will write artifacts to the current directory just because you're profiling), it feels like unnecessary expense to write files to a temporary directory that will be deleted.

It seems to come down to:

  1. Is there a way for tests to find out that the artifacts are going to be deleted and skip writing them?
  2. Do tests call ArtifactDir and use that directory, or do they call Artifacts and use the OutputDir?

For (1), especially if artifacts can be big, it seems very wrong to me to write them 100% of the time when 99.9% of the time the go command will just delete them right away.

For (2), -artifactdir doesn't seem much worse than -artifacts, and maybe people will want to set -artifactdir separately from -outputdir, so probably ArtifactDir is better. And it can return an empty string if artifacts aren't being kept?

@neild
Copy link
Contributor Author

neild commented Feb 12, 2025

This is a bit of a philosophical question, I think. My opinion is that tests should, by default, endeavor to always have consistent behavior. In the common case, there's no real cost to writing artifacts and then deleting them. There are also several benefits:

  • Storing artifacts doesn't perturb the test. Turning on artifact storing won't make a test start passing (and confound debugging).
  • You don't need to test another configuration of the test. If a test behaves differently with and without artifact storage, you need to exercise both paths to make sure they work.
  • Test code is simpler.

In the event that artifacts are expensive to generate, a test can define its own flag to control generation. I don't think that's the common case, though.

@rogpeppe
Copy link
Contributor

Perhaps this is implied by the existing proposal and I've missed it, but what happens if I do this?

go test -outputdir /tmp/something ./...

Does each package being tested get its own output directory within /tmp/something ? It didn't look like that from the sample ARTIFACTS line quoted above, but this seems like a common enough use case that it would be nice if it worked.

The other thing I'm thinking here is that sometimes (usually?) we only care about test artifacts from failing tests. Both the use cases mentioned in the initial proposal seem to come into that category.

How about this: if we haven't printed the name of the artifact directory anywhere (i.e. the test has failed and we're not running in verbose mode) the artifacts for that test are removed after the test completes?

When there are thousands of tests, this would vastly improve the signal-to-noise ratio of the artifact directory contents, and
it would mean that in the usual case (all tests pass) nothing is left behind on disk.

A flag could be added to request that all artifacts are written regardless (or just running tests -json or -v mode would also be sufficient).

@neild
Copy link
Contributor Author

neild commented Feb 20, 2025

Does each package being tested get its own output directory within /tmp/something ?

This is a good point: When testing multiple packages, tests with the same name in different packages must be given unique artifact directories. The simplest way to achieve this is probably to give each package a unique artifact directory and put the tests under that.

sometimes (usually?) we only care about test artifacts from failing tests

An option to store only artifacts from failing tests seems pretty reasonable, but it's another flag for users to think about. I don't have a strong opinion on whether it's worth it.

@rsc proposes (#71287 (comment)) an -artifacts flag to enable artifact storage. Perhaps this could be something like -artifacts=all or -artifacts=failing. Or we could have separate -artifacts and -allartifacts flags.

@AndrewHarrisSPU
Copy link

An option to store only artifacts from failing tests seems pretty reasonable, but it's another flag for users to think about. I don't have a strong opinion on whether it's worth it.

Thinking about published versus unpublished artifacts is helping me make sense of this proposal. I wonder if users have the right headroom, the necessary hooks from testing to implement pass/fail publishing decisions independently. Maybe users would inevitably want to explore more fine-grained logic than the "publish-on-pass/fail" flag, or make publishing decisions on other factors.

A first pass at how I could make headway on optionally publishing artifacts on pass/fail logic:

  1. I can start with an assumption that any artifact is ephemeral, like the contents of a t.TempDir. Sometimes an artifact is literally a file existing in a t.TempDir, or sometimes it's something in-memory that could be written to a file later.
  2. For each artifact I might be interested in publishing, I can (in a number of ways...) programmatically associate it with testing.T.
  3. On observing a failure, I can elect to publish the artifact, for persistence outside the invocation of go test. At this point it's copying existing files or writing in-memory resources to the artifact directory (and choosing names, etc.)

From prior experimentation I know that the implementation details for step 2 can get a bit weedy, and there can be some rough edges. For example, I'm not sure how well this approach helps for T.Parallel or when the count flag is set. But there's some headroom here, and it could be extended. I'm not sure it'll be my opinion later, but in the moment ... it'd be nicer to give users what they need to implement the publishing logic.

Along these lines there might be a "worse-is-better" appeal to just having a flag that, when present, says what directory artifacts are written to, and when not present, everything is ephemeral.

@neild
Copy link
Contributor Author

neild commented Feb 23, 2025

I hypothesize that:

  1. Artifacts are usually not expensive to generate. This matches my experience with Google tests (where the test infrastructure generally stores all artifacts by default). The examples of artifacts mentioned so far in this issue are not expensive.
  2. Artifacts are usually not expensive to store. Disks are huge.
  3. We already have mechanisms to handle expensive tests: The -short flag can be used to disable expensive tests, tests can define their own flags if -short isn't sufficient, and tests can be split into multiple subtests. We don't need a new mechanism to disable artifact storage for a test; users can just not run the test using the existing mechanisms.

So I think the right thing to do is to generate and store artifacts for all tests in a user-defined output directory.

If users want some more complicated behavior (store only artifacts for failing tests, say), they can use a tool that parses the test JSON output and does whatever is desired. We don't need to do anything more complicated in the go tool, at least not to start with.

@aclements
Copy link
Member

It sounds like we're converging on adding an -artifacts flag that enables saving artifacts in the output directory (the value of -outputdir, or the current directory if that is not specified), with an API like

package testing

// ArtifactDir returns a directory for the test to store output files in.
// When the -artifacts flag is provided, this directory will be located
// under the output directory. Otherwise, ArtifactDir returns a temporary directory
// which is removed after the test completes.
//
// Each test or subtest within each test package has a unique artifact directory.
// Repeated calls to ArtifactDir in the same test or subtest return the same directory.
// Subtest outputs are not located under the parent test's output directory.
func (t *testing.T) ArtifactDir() string

Under the output directory, we create a separate directory for each test package, and within that we create a separate directory for each call of a test function or subtest (presumably named after the test plus some uniquifier if necessary).

@aclements
Copy link
Member

Should this be just a concrete method on T or an interface method of TB? In general, we prefer to add things to TB and I think that's the right answer here, too. This definitely has less utility in benchmarks, but as long as a benchmark is careful to write artifacts outside of the timed loop, I could see it being valuable. I'm not sure in fuzz tests, but I don't see any active harm in a fuzz test and it's fine if it's there and people just don't use it.

@aclements
Copy link
Member

Another question is what to do if an artifact directory already exists. My inclination is to never delete anything, and lean on the path uniquifier we need anyway in case a test runs multiple times. That also has the benefit of making go test -count=N behave the same as running go test N times.

@aclements
Copy link
Member

Here's what I've got:

The proposal is to add an API and command line flags for saving test artifacts. The following method would be added to testing.TB and implemented in T, B, and F:

package testing

type TB interface {
    ...

    // ArtifactDir returns a directory for the test to store output files in.
    // When the -artifacts flag is provided, this directory will be located
    // under the output directory. Otherwise, ArtifactDir returns a temporary directory
    // which is removed after the test completes.
    //
    // Each test or subtest within each test package has a unique artifact directory.
    // Repeated calls to ArtifactDir in the same test or subtest return the same directory.
    // Subtest outputs are not located under the parent test's output directory.
    func (t *testing.T) ArtifactDir() string
}

A new boolean -artifacts flag to go test and corresponding -test.artifacts flag to test binaries would enable saving of artifacts to the output directory configured by existing the -outputdir flag (which defaults to the working directory go test or the test binary).

The artifact directory would be named <test package>/<test name>(-<N>)? where each component would be mangled to make it path-safe if necessary. The <test package> would be the Go package name of the test, like strings or runtime_test, and the -N suffix would be present if necessary to distinguish multiple runs of the same test or subtest. Existing artifact directories would not be overwritten or deleted; they would just receive a unique suffix.

When artifacts are enabled, the first call to ArtifactDir in a test or subtest will emit a line to the test output consisting of "=== ARTIFACTS ", the test name, and the test artifact directory, separated by spaces:

=== ARTIFACTS TestName/subtest_name /path/to/test/artifacts

When -json is specified, this will appear as an entry with an Action of "artifacts", the usual Time, Package, and Test keys, and a "Path" key containing the artifact directory:

{"Time":"2025-01-15T13:39:27.75235-08:00","Action":"artifacts","Package":"path","Test":"TestName/subtest_name","Path":"/path/to/test/artifacts"}

@apparentlymart
Copy link

Is the "path uniquifier" essentially just a call to os.MkdirTemp with a pattern somehow derived from the test name?

That would then presumably make them be guaranteed unique across runs while still containing some information to help figure out which subdirectory belonged to which test, in which case I'd find that a fine compromise for allowing -artifacts to refer to a directory that already exists and is non-empty.

@seankhliao
Copy link
Member

note that test names can be unreasonably long, as in #71742

@neild
Copy link
Contributor Author

neild commented Mar 6, 2025

API in #71287 (comment) LGTM.

For the specific directory name, I propose:

<test package>/<test name>/<random>

<test package> is the complete Go package name passed through filepath.Localize.

<test name> is the test name, truncated if it is too long. (Let's say 200 characters max.) If truncation puts multiple tests in the same directory, so be it.

<random> is created by os.CreateTemp.

We document none of this and make no guarantees that it won't change. (Although Hyrum's Law probably means we can't change any of it.)

The use of CreateTemp both disambiguates multiple test runs and makes it a bit harder to depend on the exact path being stable. The supported way to find the artifact directory for a test is to parse the === ARTIFACTS line or the JSON equivalent. In the simple case of running a test once with an empty output directory, it's still easy to find the artifacts for a test.

@aclements
Copy link
Member

Is the "path uniquifier" essentially just a call to os.MkdirTemp with a pattern somehow derived from the test name?

I'd been imagining it would be an actual ascending decimal number (maybe zero-padded?), so it's clear what order the test instances ran in, but whatever.

@rogpeppe
Copy link
Contributor

rogpeppe commented Mar 7, 2025

A note of caution: test names can not only be very long, but can also contain arbitrary text which might interact poorly with the filesystem (e.g. https://go.dev/play/p/P0LbnZN5V0e)
I know that I've certainly got some tests with names that derived from an externally provided test table containing full sentences for names. It would be a shame if such a test were to fail to store its artifact data because of its name.

@AndrewHarrisSPU
Copy link

If test names seem tricky, a quick thought on another scheme:

<test package>/<test file name>/<pattern-random>
  • os.CreateTemp already takes a pattern argument that is prefixed before the unique element
  • the 'pattern' would need to be plumbed in by func (t *testing.T) ArtifactDir(prefix string) string ... I'm guessing the ArtifactDir calls occur in convenient enough locations that a helpful prefix is probably at hand? Also, the prefix seems useful if more than one kind of artifact is generated by a test.
  • test file name (stripped of _test.go, sanitized) is not as nice as a test name, but it's still a clue.

@apparentlymart
Copy link

When I said "somehow derived from" I was being intentionally vague about it but assuming that it would involve stripping out or replacing all but a safe set of characters, and then assuming that any resulting collisions would get taken care of by MkdirTemp selecting unique suffixes.

I expect that in practice those who are writing tests that generate artifacts will choose their test names based on what remains intelligible after that transform, and the names of any preexisting tests that don't generate artifacts don't matter for our purposes here.

The length limit is a good point I admittedly hadn't considered, but I think all those same assumptions still apply: there are not yet any tests making use of this new feature, and those who choose to make use of it will presumably choose test names that work well with how the paths get generated.

(I don't mean to say that it's okay to generate invalid path names, but rather that it seems okay to turn unreasonable test names into less useful directory names, and assume that those who care about having readable directory names will pick more reasonable test names.)

@greg-dennis
Copy link

The proposal is to add an API and command line flags for saving test artifacts. The following method would be added to testing.TB and implemented in T, B, and F:

Should it also be implemented in M, to support setup/teardown artifacts?

@aclements
Copy link
Member

aclements commented Mar 12, 2025

Running with @apparentlymart 's observation, I think we can leave the exact details of the path-cleaning transformation unspecified and subject to change and specifically document that if people want predictable artifact paths, they should chose simple test names. I think we need to specify that [a-zA-Z0-9_] are passed through and how / and = are escaped since those appear in subtest names, but leave anything beyond that unspecified. Is there a length we can guarantee, like anything up to 64 characters won't get truncated?

Or we tell people to just use the === ARTIFACTS, but there's no way everyone would actually follow that advice. 😄

@aclements
Copy link
Member

Here's a concrete proposal for the directory naming. I don't think we need to pin down the exact details to move the proposal forward, just the general shape.

The artifact directory would be named <test package>/<test name>/<random> where each component would be mangled to make it path-safe if necessary. The <test package> would be the Go package name of the test, like strings or runtime_test. Existing artifact directories would never be overwritten or deleted. The mangling would be left unspecified, but would be designed such that "nice" test names would map to "nice" directory names, meaning that [a-zA-Z0-9] are left as is, / and = are escaped in a predictable way, and long test names are truncated in a way that is highly likely to be unique (e.g., using a hash of the trimmed part).

@aclements
Copy link
Member

Have all remaining concerns about this proposal been addressed?

The proposal is to add an API and command line flags for saving test artifacts. The following method would be added to testing.TB and implemented in T, B, and F:

package testing

type TB interface {
    ...

    // ArtifactDir returns a directory for the test to store output files in.
    // When the -artifacts flag is provided, this directory will be located
    // under the output directory. Otherwise, ArtifactDir returns a temporary directory
    // which is removed after the test completes.
    //
    // Each test or subtest within each test package has a unique artifact directory.
    // Repeated calls to ArtifactDir in the same test or subtest return the same directory.
    // Subtest outputs are not located under the parent test's output directory.
    func (t *testing.T) ArtifactDir() string
}

A new boolean -artifacts flag to go test and corresponding -test.artifacts flag to test binaries would enable saving of artifacts to the output directory configured by existing the -outputdir flag (which defaults to the working directory go test or the test binary).

The artifact directory would be named <test package>/<test name>/<random> where each component would be mangled to make it path-safe if necessary. The <test package> would be the Go package name of the test, like strings or runtime_test. Existing artifact directories would never be overwritten or deleted. The mangling would be left unspecified, but would be designed such that "nice" test names would map to "nice" directory names, meaning that [a-zA-Z0-9] are left as is, / and = are escaped in a predictable way, and long test names are truncated in a way that is highly likely to be unique (e.g., using a hash of the trimmed part).

When artifacts are enabled, the first call to ArtifactDir in a test or subtest will emit a line to the test output consisting of "=== ARTIFACTS ", the test name, and the test artifact directory, separated by spaces:

=== ARTIFACTS TestName/subtest_name /path/to/test/artifacts

When -json is specified, this will appear as an entry with an Action of "artifacts", the usual Time, Package, and Test keys, and a "Path" key containing the artifact directory:

{"Time":"2025-01-15T13:39:27.75235-08:00","Action":"artifacts","Package":"path","Test":"TestName/subtest_name","Path":"/path/to/test/artifacts"}

@willfaught
Copy link
Contributor

If there are, or can be, multiple packages with the same name in your project, then wouldn’t you need to use a unique artifact path per package? If so, having the package name in the artifact path seems redundant.

@ianlancetaylor
Copy link
Member

On the other hand, package names are often distinct, and using the package name will separate out different tests and make them easier to see and find. Since there is a random string in the directory name, uniqueness is not required.

Or, we could use the package path rather than the package name. The package path must be unique.

@apparentlymart
Copy link

I'd agree that it seems quite rare for a single Go module to contain multiple packages with the same name, even though the toolchain technically allows that. AFAICR go test doesn't allow running tests across packages in multiple modules at once, so packages from other modules are presumably not relevant.

Again I find myself returning to the idea that this artifact feature is new and so no existing tests could possibly be using it, and when people start using it they are unlikely to intentionally use it in ways cause a confusing directory layout under the artifact directory, so to me it seems safe to assume that package names are "unique enough", and let the unique part of the path deal with the (presumed) rare case where there are two colliding package names.

Given that all of the packages in a particular module tend to share a prefix that is much longer than the package name itself, I think it would be annoying to use the entire package path as the directory name. Though if folks disagree with my assumption that package names are unlikely to collide within a single module, perhaps we could compromise and use whatever is left of the package path after trimming the module path prefix from the front of it. 🤔

@AndrewHarrisSPU
Copy link

A thin argument for package path over package name: It gives canonical locations relative to module root. Suppose all CI big or small will have opinions about what to do with artifacts, but they don't converge, the upper bound seems like staging things for further automation. Canonical paths relative to module root seems nicer in that sense.

@willfaught
Copy link
Contributor

I imagine in many repos, especially monorepos, package names like “models”, “services”, “auth”, “users”, etc. could be common.

It sounds like there’s a desire to enable users to find recent artifacts. Using the package path instead of the name would be more helpful with that.

Also, using an incrementing number or a Unix timestamp instead of a random string in the artifact path would make it easier to find the latest run’s artifacts.

I imagine for a monorepo with a single module at the root, in CI, you’d just want to do “go test -artifacts $ARTIFACTMOUNT ./…”, and let the go command deal with each package’s artifact path automatically.

@aclements
Copy link
Member

I think there are two use cases with conflicting needs here.

  1. I just want to test the package I'm working on. I really don't want to have to cd down half a dozen directories just to get to my artifacts. The path is also going to look like .../x/y/z/w/x/y/z/w, which seems like a violation of Go's "avoid stutter" naming guidelines, and leads to long paths that may cause file system issues. This can be avoided by also specifying an -output-dir that's outside the package path, but that's one more speed bump in something that should be convenient to use interactively.

  2. This is a CI system, or I want to test a whole module or subtree. I'm probably going to run go test -artifacts ./... (or all) from the root of the module/subtree. In that case, I agree you want the whole package path to avoid conflicts. But also there won't be stutter if you're running it from the root of the subtree you're testing.

This suggests a way to satisfy both cases: collect all of the package paths on the command line, plus the package path of "." (if in a module), trim the common leading prefix elements, and use what remains as the artifact directory. This can trim off the entire package path, so I propose adding _artifacts at the beginning, which will also avoid creating any directories that can look like packages to cmd/go. There are several possible variations of this idea, which I'll give below, but this is the most stable and predictable one I could think of.

Here are some examples of how this would play out. Suppose we have a module example.com/x with packages y/z and y/w in it.

Cur. package go test -artifacts XXX Result
example.com/x/y/z . _artifacts/<test name>
example.com/x/y/z ../w _artifacts/w/<test name>
example.com/x ./y/z _artifacts/y/z/<test name>
example.com/x ./... _artifacts/{y/z,y/w}/<test name>
example.com/x runtime _artifacts/runtime/<test name>
example.com/x runtime ./y/z _artifacts/{runtime,example.com/x/y/z}/<test name>
- runtime _artifacts/<test name>
- runtime cmd/compile _artifacts/{runtime,cmd/compile}/<test name>

Some alternatives I don't like as much:

  • Strip the common prefix without including the "." package. This is a bit unpredictable. In the example above, if you run go test ./... from example.com/x, it would strip off the y just because there doesn't happen to be another package in the root of the module.
  • Use something "relative" to the "." package. We don't have a strong notion of relative paths to packages, and it's unclear how you deal with ".." components if a package is "above" ".". Maybe you just drop the ".." elements, but now the results aren't necessary unique any more.
  • Behave differently depending on whether a 1) single, non-pattern package is given, or 2) either multiple packages are given or a package pattern. This is predictable, but also has a weird discontinuity if you add a second package to the command line.

@neild
Copy link
Contributor Author

neild commented Mar 26, 2025

I don't have strong opinions here, but my preference is for something simple. I'd personally rather have a long but predictable path than one which varies depending on whether I'm testing one package or two, or what my current working directory is.

Package path minus the module path seems like a good balance. For example, golang.org/x/net/dns/dnsmessage would be "_artifacts/dns/dnsmessage".

@aclements
Copy link
Member

Have all remaining concerns about this proposal been addressed?

The proposal is to add an API and command line flags for saving test artifacts. The following method would be added to testing.TB and implemented in T, B, and F:

package testing

type TB interface {
    ...

    // ArtifactDir returns a directory for the test to store output files in.
    // When the -artifacts flag is provided, this directory will be located
    // under the output directory. Otherwise, ArtifactDir returns a temporary directory
    // which is removed after the test completes.
    //
    // Each test or subtest within each test package has a unique artifact directory.
    // Repeated calls to ArtifactDir in the same test or subtest return the same directory.
    // Subtest outputs are not located under the parent test's output directory.
    func (t *testing.T) ArtifactDir() string
}

A new boolean -artifacts flag to go test and corresponding -test.artifacts flag to test binaries would enable saving of artifacts to the output directory configured by existing the -outputdir flag (which defaults to the working directory go test or the test binary).

The artifact directory would be named <test package>/<test name>/<random> where each component would be mangled to make it path-safe if necessary. The <test package> would be the Go package name of the test, like strings or runtime_test. Existing artifact directories would never be overwritten or deleted. The mangling would be left unspecified, but would be designed such that "nice" test names would map to "nice" directory names, meaning that [a-zA-Z0-9] are left as is, / and = are escaped in a predictable way, and long test names are truncated in a way that is highly likely to be unique (e.g., using a hash of the trimmed part).

When artifacts are enabled, the first call to ArtifactDir in a test or subtest will emit a line to the test output consisting of "=== ARTIFACTS ", the test name, and the test artifact directory, separated by spaces:

=== ARTIFACTS TestName/subtest_name /path/to/test/artifacts

When -json is specified, this will appear as an entry with an Action of "artifacts", the usual Time, Package, and Test keys, and a "Path" key containing the artifact directory:

{"Time":"2025-01-15T13:39:27.75235-08:00","Action":"artifacts","Package":"path","Test":"TestName/subtest_name","Path":"/path/to/test/artifacts"}

@nightlyone
Copy link
Contributor

I believe the cleanup part is underrepresented. There are 4 different behaviours necessary

  1. Your test fails and you want to inspect why using artifacts
  2. Your test succeeds and you want to keep the good artifact for reference only in that case
  3. You always want to have the artifact stored
  4. You want to never keep test artifacts around but control where they get stored.

Point 1 and 2 are relevant for this proposal but not addressed.

Point 4 is addressed by testing.TB.Tempdir and the matching environment variables for the system.

Point 3 is addressed spot on by the current proposal.

Especially addressing point 1 would provide alternatives for when using testing.TB.Tempdir would fit but isn't possible, because we can't keep things around on failure.

@aclements aclements moved this from Active to Likely Accept in Proposals Apr 2, 2025
@aclements
Copy link
Member

Based on the discussion above, this proposal seems like a likely accept.
— aclements for the proposal review group

The proposal is to add an API and command line flags for saving test artifacts. The following method would be added to testing.TB and implemented in T, B, and F:

package testing

type TB interface {
    ...

    // ArtifactDir returns a directory for the test to store output files in.
    // When the -artifacts flag is provided, this directory will be located
    // under the output directory. Otherwise, ArtifactDir returns a temporary directory
    // which is removed after the test completes.
    //
    // Each test or subtest within each test package has a unique artifact directory.
    // Repeated calls to ArtifactDir in the same test or subtest return the same directory.
    // Subtest outputs are not located under the parent test's output directory.
    func (t *testing.T) ArtifactDir() string
}

A new boolean -artifacts flag to go test and corresponding -test.artifacts flag to test binaries would enable saving of artifacts to the output directory configured by existing the -outputdir flag (which defaults to the working directory go test or the test binary).

The artifact directory would be named <test package>/<test name>/<random> where each component would be mangled to make it path-safe if necessary. The <test package> would be the Go package name of the test, like strings or runtime_test. Existing artifact directories would never be overwritten or deleted. The mangling would be left unspecified, but would be designed such that "nice" test names would map to "nice" directory names, meaning that [a-zA-Z0-9] are left as is, / and = are escaped in a predictable way, and long test names are truncated in a way that is highly likely to be unique (e.g., using a hash of the trimmed part).

When artifacts are enabled, the first call to ArtifactDir in a test or subtest will emit a line to the test output consisting of "=== ARTIFACTS ", the test name, and the test artifact directory, separated by spaces:

=== ARTIFACTS TestName/subtest_name /path/to/test/artifacts

When -json is specified, this will appear as an entry with an Action of "artifacts", the usual Time, Package, and Test keys, and a "Path" key containing the artifact directory:

{"Time":"2025-01-15T13:39:27.75235-08:00","Action":"artifacts","Package":"path","Test":"TestName/subtest_name","Path":"/path/to/test/artifacts"}

@neild
Copy link
Contributor Author

neild commented Apr 2, 2025

Your test succeeds and you want to keep the good artifact for reference only in that case

I don't think this case is interesting. The normal outcome is for a test to succeed. If you want to store artifacts when the test succeeds, I don't know why you wouldn't want to when it fails and presumably needs debugging. In the worst case, you have an artifact you'll need to delete.

Your test fails and you want to inspect why using artifacts

I think this case actually applies to more than just artifacts.

It's not uncommon that when I'm debugging a test failure, I find that I want to examine its temporary files. This is a bit annoying because t.TempDir directories are always deleted after a test runs.

With the artifacts proposal, I could change the test to use t.ArtifactDir instead of t.TempDir and run the test with -artifacts. I might get artifacts from tests I don't care about, but that's a minor problem.

This might imply that if a test creates temporary files that might be useful in debugging failures, it should always use t.ArtifactDir rather than t.TempDir. I don't know if that's a problem, but it'd be unfortunate if we saw people pushed to change existing usages of t.TempDir to ArtifactDir.

Maybe we should have a test flag that disables deleting temporary directories when a test fails. This would apply to both t.TempDir and, when -artifacts is not set, t.ArtifactDir.

I think that's a separate proposal than this one, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool Proposal Proposal-FinalCommentPeriod
Projects
Status: Likely Accept
Development

No branches or pull requests