Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Go 2: prohibit in-package tests from extending a package's API #29258

Open
rogpeppe opened this issue Dec 14, 2018 · 31 comments
Open
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Proposal v2 A language change or incompatible library change
Milestone

Comments

@rogpeppe
Copy link
Contributor

Currently in-package tests are free to add exported identifiers to the package that they are testing.
Specifically an in-package test:

  • can define new variables
  • can define new types
  • can implement methods on non-test types

This makes tooling harder, because any given package can have several possible variants,
depending on which package's tests are being run.

Consider the following example:

-- go.mod --
module example

-- a/a.go --
package a

import (
	"fmt"

	"example/b"
)

var A fmt.Stringer = b.B{}

-- b/b.go --
package b

type B struct{}

-- b/b1_test.go --
package b

func (B) String() string { return "b" }

-- b/b2_test.go --
package b_test

import (
	"testing"

	"example/a"
)

func Test(t *testing.T) {
	if got, want := a.A.String(), "b"; got != want {
		t.Fatalf("got %q want %q", got, want)
	}
}

Running go test example/b succeeds because when b is being tested, the internal tests extended the B type so it implements fmt.Stringer. But example/a won't actually compile on its own because B doesn't implement fmt.Stringer without the test code added by the internal tests in example/b!

It's as if tests exist in a closely related but different version of the universe in which every package might not be quite the same as its usual version. And that version is potentially different for every package.

This "parallel universe" property of testing makes Go types harder to reason about and the Go tooling less efficient, because it's not possible to type-check a package once and for all in the general case.

Where does this problem come from?

Currently in Go, test code can be internal to a package, extending that package for the duration of the test and allowing the test code access to unexported identifiers within the package, or it can be external, with access to exported identifiers only. A hybrid approach is also possible, where tests are largely external, but some package-internal test code, often in an export_test.go file, is used to provide access to selected parts of the internal API.

It's the final case that is problematic, because external test code is free to depend on other packages which themselves depend on the package being tested which has been extended with extra test code.

All three approaches are common in existing Go code.

I believe that any solution to this issue should continue to support almost all existing code (including the possibility of automatic code migration with gofix), otherwise substantial parts of the ecosystem will not be able to move forward with new Go features.

Proposed solution

I propose that it should not be possible to write tests in the same package as that being tested. That is, all tests must be in the external test package (e.g. foo_test to test package foo).

A test file can use a special form of import statement to request access to unexported identifiers from the package being tested. If they use this special form, code within that file (only) may access unexported identifiers within the package without compiler errors.

A possible (straw man) spelling for the syntax is:

import "."

That is, import the package in the current directory.

This form is currently invalid because relative import paths are not allowed, and "." is not a valid package name by itself, so wouldn't overlap with existing import path syntax. As there is only one package that can be imported in this way, there is no ambiguity problem. It also satisfies issue 29036, as the imported package name can be automatically derived from the current file's package identifier by stripping the _test suffix.

Another possibility might be to add an extra keyword, recognized only within an import block; for example:

import "mypackagepath" internal

Whatever the form for the special import syntax, this solution seems to solve the initial problem. It allows both all-internal tests (always use import "."), all-external tests (never use import ".") and hybrid tests ("internal" test code can define selected functionality to be used by external tests, similarly to how export_test.go files are used today). It also reduces the overall complexity of the way that tests work in Go.

I believe that it should also be possible to automatically convert almost all existing code to the new rules. Some care would need to be taken to rename symbols that appear in both internal and external test code, but that's not an hard issue to solve. Perhaps that issue is rare enough that manual intervention could be required. More investigation is needed to see how common it is.

As an example, some test code that exposes selected functionality to external tests might look like this:

-- b/b.go --
package b

type B struct {
	internalField string
}

func GetBValue() B {
	return B{
		internalField: "b",
	}
}
-- b/b1_test.go --
package b_test
import "."		// allow access to unexported identifiers within this file.

func BInternalField(x b.B) string {
	return x.internalField
}
-- b/b2_test.go --
package b_test

import (
	"testing"

	"example/b"
)

func Test(t *testing.T) {
	x := b.GetBValue()
	if got, want := BInternalField(x), "b"; got != want {
		t.Fatalf("got %q want %q", got, want)
	}
}

Other possible solutions

We could disallow all tests that rely on direct access to unexported identifiers i.e. allow external tests only. This is an attractive option for some, but the change would be very disruptive for much code. I do not believe that it would be possible to migrate existing internal tests automatically, so I think this possibility is excluded.

We could continue to allow both internal and external tests, but treat internal test code as being in its own package, with access allowed to the tested package's unexported identifiers (and all symbols from the package available in global scope), but otherwise separate from it. External tests could use some kind of special import syntax (for example import "thispackage@test") to access symbols defined in the internal tests. This was my original thought, but the model seems more complex than necessary - we have an extra pseudo-package for tests and a special import path syntax.

We could prohibit internal test code from defining methods on types defined by the testing package. This solves some of the more obvious problems with the current situation, but the "parallel universe" issue is still present, and tooling probably would not become significantly simpler.

@gopherbot gopherbot added this to the Proposal milestone Dec 14, 2018
@rogpeppe rogpeppe added the v2 A language change or incompatible library change label Dec 14, 2018
@Merovius
Copy link
Contributor

How are tests any different from build tags in general here? If this is a problem, shouldn't we also disallow having different APIs under different sets of build tags? How would syscall work under those assumptions? And how would you check for this (AIUI this would require solving SAT ;-) )?

i.e. I agree that you can solve the test-portion of this, but I don't think you ever can get to a point where you have any guarantees of this kind for tooling.

@rogpeppe
Copy link
Contributor Author

How are tests any different from build tags in general here?

Tests are different from build tags in general because there are no build tags for tests (deliberately).

With a given set of build tags, anything might change, but that's inevitable, given the way that build tags work. The test thing just adds another dimension to that. Even if you do know the set of build tags, you can't precompile a given package.

Consider this: with a given build environment (including tags), it is currently not possible to have a simple map from each package to type-checked information on that package.

With this proposal, that would be possible, making it easier to reason about and write tools that use that information.

@Merovius
Copy link
Contributor

Tests are different from build tags in general because there are no build tags for tests (deliberately).

I was referring to the lined out problems of the current situations - tooling doesn't know what is and isn't part of the API because it depends on how you call the Go tool. If your tool relies on that knowledge, and that knowledge is principally unavailable because of build tags, how does this proposal actually solve anything? Your tool still won't be able to provide any better correctness guarantees.

it is currently not possible to have a simple map from each package to type-checked information on that package. With this proposal, that would be possible, making it easier to reason about and write tools that use that information.

APIs defined in test-files can not be used by importers (as you point out, test is not a build tag). Why would you need such a map? And how is a map keyed by "import path + test(yes/no) + <build_tags>" significantly worse than a map keyed by "import path + <build_tags>" for that purpose?

I honestly don't understand the problem you are trying to solve here. Or rather: I understand the problem and I suffered from it before, but I don't see why tests specifically should be addressed, while I consider the build-tag problems to be far more severe.

(BTW: A way to address the build-tag problem would be to disallow exported declarations in files guarded by build tags. But I'm not totally sure you'd want to)

@natefinch
Copy link
Contributor

I'd definitely be in favor of disallowing exported identifiers in test files.

For build tags, I wish that were possible, because it seems like a bad design to have different APIs for different platforms, but at the same time, then you have APIs that essentially lie about what they support for some platforms.

I wish there were a solution for godoc that could merge all APIs for a package and indicate which are platform-specific.

@natefinch
Copy link
Contributor

I do like the idea of keeping a package sealed off from testing code, so that the package is the package, and there's none of this kind of wacky thing where when you test, now the package has different code in it. Adding code to the code that you're testing at test time seems like a fantastically bad idea in general. By definition, then you're not actually testing the production code.

At the same time, I find internal tests to be immensely useful in my own work, providing fast, reliable, focused assurances that the code does what I expect it to do without having to change the API to do that testing.

Like Axel, I'm not sure about the real benefit here. It sounds like it's intended to make tooling faster and simpler, which I am all for, but are tests really a problem?

I'd like to hear more about the problem this is solving and what the benefits are.

@rogpeppe
Copy link
Contributor Author

rogpeppe commented Dec 14, 2018

At the same time, I find internal tests to be immensely useful in my own work, providing fast, reliable, focused assurances that the code does what I expect it to do without having to change the API to do that testing.

This proposal still allows you to write tests that access internal package details. I'm not sure what you're driving at here.

@Merovius
Copy link
Contributor

Additions:

APIs defined in test-files can not be used by importers

There is one exception to this: foo_test can import foo and then use exported APIs created in _test.go files of foo - which is the currently recommended way to access unexported APIs from test-packages.

Adding code to the code that you're testing at test time seems like a fantastically bad idea in general. By definition, then you're not actually testing the production code.

FWIW, a case where I actually added exported methods in tests to existing types is convenience helpers for constructing nested data structures. So, for example, I have a binary tree implementation and implement fmt.Stringer and fmt.Scanner to work with ((a b) c) or the like for readability in tests. The methods have to be exported to be callable for fmt, but I don't want to commit to the format and make it part of my actual API.

Lastly: If we don't want to reduce the utility of testing (i.e. still allow accessing internal identifiers), ISTM that this would require a language change - the prohibition of accessing unexported identifiers of foreign packages is part of the language and the compiler would need to both generate the necessary symbols when compiling the package-under-test and know to allow accessing unexported identifiers in certain situations. This would also mean, that we'd have to think about how this affect third-party build tools (e.g. bazel, gb…). It definitely makes this not much of a simplification (quite the opposite). Note, that the current (tooling-only) mechanism to access unexported identifiers from a _test-package is exactly what this proposal wants to forbid.

@natefinch
Copy link
Contributor

This proposal still allows you to write tests that access internal package details. I'm not sure what you're driving at here.

I wanted to forestall any confusion in case someone might think that I wanted to remove internal testing entirely.

@rogpeppe
Copy link
Contributor Author

APIs defined in test-files can not be used by importers (as you point out, test is not a build tag). Why would you need such a map? And how is a map keyed by "import path + test(yes/no) + <build_tags>" significantly worse than a map keyed by "import path + <build_tags>" for that purpose?

Such a map can't be keyed just by (importPath, isTest, buildTags) tuple, because test isn't a bool - it's a vector. It's more like (importPath, packageUnderTest, buildTags) - i.e. each package can have an indefinite number of versions, not just two.

To demonstrate, consider the following module:

-- go.mod --
module example
-- a/a.go --
package a

import (
	"fmt"
	"reflect"

	"example/b"
	"example/c"
)

var (
	stringerType = reflect.TypeOf(new(fmt.Stringer)).Elem()
	BIsStringer  = reflect.TypeOf(b.B{}).Implements(stringerType)
	CIsStringer  = reflect.TypeOf(c.C{}).Implements(stringerType)
)
-- b/b.go --
package b

type B struct{}

-- b/b1_test.go --
package b

func (B) String() string { return "b" }

-- b/b2_test.go --
package b_test

import (
	"testing"

	"example/a"
)

func Test(t *testing.T) {
	if got, want := a.BIsStringer, true; got != want {
		t.Fatalf("B: got %v want %v", got, want)
	}
	if got, want := a.CIsStringer, false; got != want {
		t.Fatalf("C: got %v want %v", got, want)
	}
}
-- c/c.go --
package c

type C struct{}
-- c/c1_test.go --
package c

func (C) String() string { return "c" }
-- c/c2_test.go --
package c_test

import (
	"testing"

	"example/a"
)

func Test(t *testing.T) {
	if got, want := a.BIsStringer, false; got != want {
		t.Fatalf("B: got %v want %v", got, want)
	}
	if got, want := a.CIsStringer, true; got != want {
		t.Fatalf("C: got %v want %v", got, want)
	}
}

Note that Test in c2_test.go and Test in b2_test.go both expect different results.
Now let's see how this looks under the go/packages API. Here's some code to print some info about all the packages within the example module:

package main

import (
	"fmt"
	"log"
	"strings"

	"golang.org/x/tools/go/packages"
)

func main() {
	pkgs, err := packages.Load(&packages.Config{
		Mode:  packages.LoadTypes,
		Dir:   ".",
		Tests: true,
	}, "./...")
	if err != nil {
		log.Fatalf("cannot load packages: %v", err)
	}
	packages.Visit(pkgs, func(pkg *packages.Package) bool {
		if strings.HasPrefix(pkg.PkgPath, "example/") {
			fmt.Printf("%s\n", pkg.ID)
		}
		return true
	}, nil)
}

When run in the example module directory, this prints:

example/b
example/c
example/a
example/b [example/b.test]
example/b_test [example/b.test]
example/a [example/b.test]
example/b.test
example/c [example/c.test]
example/c_test [example/c.test]
example/a [example/c.test]
example/c.test

Note that there are three versions of example/a, one for each package that depends on an internal module that's under test and has internal test code. In a larger code base where we're testing a lower level package with some much higher level testing infrastructure that depends on it, the number of extra packages could be considerable.

If I was writing some vet-like tool that relied on type information, this complicates matters; for example, which versions of example/a do I report errors in? When compiling Go packages, you may have seen duplicate error message being printed. This is the reason that happens.

@rogpeppe
Copy link
Contributor Author

AFAICS if this proposal was implemented, there would be no need for any of the example/a [example/b.test]-style package identifiers - that's a complexity introduced entirely by the current system.

@Merovius
Copy link
Contributor

That's an interesting example. Effectively a circular import after all. Weird, but makes sense :)

If I was writing some vet-like tool that relied on type information, this complicates matters; for example, which versions of example/a do I report errors in?

I would argue: The one that is the root of the analysis. I.e. running go test in example/b will only encounter one version of example/b and thus only one version of example/a. It's fair though, that you can have multiple roots and that causes ambiguity.

Alright, I see how this would simplify one use case for tools :) Still am of the opinion it's not worth the downside of having to either limit testing to exported APIs or having to change the language.

@rogpeppe
Copy link
Contributor Author

rogpeppe commented Dec 14, 2018

Lastly: If we don't want to reduce the utility of testing (i.e. still allow accessing internal identifiers), ISTM that this would require a language change - the prohibition of accessing unexported identifiers of foreign packages is part of the language and the compiler would need to both generate the necessary symbols when compiling the package-under-test and know to allow accessing unexported identifiers in certain situations.

Yes, the compiler would need to generate unexported symbols when the package has tests that use the import "." syntax, but gc at least already generates those symbols; I don't know if gccgo elides unused unexported identifiers before link time or not.

And, yes, the compiler would need to know that unexported symbols could be accessed in this particular case, and that is indeed a language change. Personally I think that's better than the current somewhat hacky (and problematic in the larger view of things) situation, but yours is definitely a valid perspective - there is a trade-off here.

This would also mean, that we'd have to think about how this affect third-party build tools (e.g. bazel, gb…). It definitely makes this not much of a simplification (quite the opposite).

I'm not sure. It might well end up simplifying things in those tools too. I wouldn't rush to conclusions here.

@rogpeppe
Copy link
Contributor Author

Alright, I see how this would simplify one use case for tools :)

Note that it isn't just one use case. It's the way that the whole model works.

Anyone that wants to use the go/packages tooling really should have some understanding of this. It was certainly quite surprising to me when I discovered how this works, and I suspect it would also be surprising to almost everyone else too.

@kardianos
Copy link
Contributor

@rogpeppe This proposal seems to have a few problems:

  1. This would break every piece of testing code in the wild.
  2. Functionally, I have never had this problem break my code, or heard of it ever being a pain point.
  3. This is not unique to tests. For instance "syscall" type packages will have different APIs on purpose, and it is somewhat common for windows build tags to have different symbols, some perhaps exported, then other platforms.
  4. Right now go/packages as far as I can tell, still requires a specific GOOS/GOARCH/tests/tags to process files. I view this as a problem (I won't go into details, others disagreed). I think this is a much more fundamental issue. I think this proposal is an attempt to paper over this design issue. I think a go/packages needs to expose all go files in a single query, with some helpers to get a build-able set for tools that need to resolve symbols.

Tools that work on code (API compatibility checkers, symbol rename) must be go tag (including test tag) aware.

Or you will run into further "problems" like this down the line. Yes, it will be more complicated, and will probably involve significant changes to a parser lexer to handle for many tools. But I think it would be great to handle Rename Symbol across multiple mutually exclusive build tags.

@bcmills
Copy link
Contributor

bcmills commented Dec 14, 2018

@kardianos

This would break every piece of testing code in the wild.

Could you give some concrete examples? I can't think of a single test I've ever interacted with that would break.

This is not unique to tests.

It is unique to tests in that the API of the package can change without changing the build configuration — that is, without changing the GOOS, GOARCH, build tags, or experiment flags — just by looking at the package from the perspective of some other package in the same build.

@kardianos
Copy link
Contributor

kardianos commented Dec 14, 2018

@bcmills I must be misunderstanding something then, or there is some background context I'm missing.

The first sentence under "Proposed Solution" says:

I propose that it should not be possible to write tests in the same package as that being tested. That is, all tests must be in the external test package (e.g. foo_test to test package foo).

I regularly have tests in package foo (file foo_test.go), not package foo_test. If I understand this proposal, that would no longer be possible (without and added import "."). Maybe the implication was to use gofix to include import "." in all such current situations?

It is unique to tests in that the API of the package can change without changing the build configuration

This sounds like a definition thing. But in my mind, testing is just another build configuration.

However, the crux of my issue is I think the following is a poor solution:

Consider this: with a given build environment (including tags), it is currently not possible to have a simple map from each package to type-checked information on that package.

This shouldn't be necessary for tools to setup in the first place. Secondly, it is possible by defining a test flag as a part of the build environment, though granted it does at least double the number of configurations needed, but certainly possible.

From the sound of it, I'm probably missing a bunch of context. But the premises of this need continues to concern me.

@dgryski
Copy link
Contributor

dgryski commented Dec 14, 2018

Related: #6204

@rogpeppe
Copy link
Contributor Author

rogpeppe commented Dec 15, 2018

Like Axel, I'm not sure about the real benefit here. It sounds like it's intended to make tooling faster and simpler, which I am all for, but are tests really a problem?

To get an idea of the extent of this issue in a real situation, I looked at a portion of an existing code base to see what the impact of the current scheme is. I looked at the apiserver subdirectory inside github.com/juju/juju.

Inside that directory, there are 121 packages comprising 159317 lines of code in total, of which 54% is testing code. It takes 6m55s for me to run go test -run '^$' github.com/juju/juju/apiserver/....

I ran the Load/Visit/print code listed above on that directory. This was the result (after sorting): http://paste.ubuntu.com/p/3PGzcTYM3v/

The duplication count due to internal test packages is quite high. Here's the list of packages and their duplication counts (how many versions of that package are required when building tests): http://paste.ubuntu.com/p/Pg5sggZpTp/

In total there are 172 extra packages. If we multiply the number of lines of code in each package by its duplication count (a crude measure, but perhaps a plausible proxy measure for the extra work done in the compiler), we see that an extra 263730 lines of code are used. If we include that as part of the original count, we get 423047 lines, or 2.65 times the original line count.

This it seems to me that the current scheme may be responsible for a significant amount of extra compilation time on larger projects.

More measurement would be needed to see how much time and/or memory this might actually save, but perhaps it might go some way towards reducing the overall overhead of running tests.

PS I tried the same thing on the whole juju code base. Some of the packages have duplication counts of up to 35 when the whole module is considered, and the total extra package count is 2382 (there are 621 packages in the code base itself).

@rogpeppe
Copy link
Contributor Author

rogpeppe commented Dec 15, 2018

@kardianos

  1. This would break every piece of testing code in the wild.

Not every piece of testing code, but yes, it would require that all internal test code be changed so that identifiers become appropriately package-qualified. With a few relatively rare exceptions (for example the trick that @Merovius uses above), I believe this could be done automatically. It might arguably lead to a net-positive gain in code readability afterwards.

There are solutions that wouldn't require so much churn (for example the second suggestion under "Other possible solutions" in the proposal would only need changes to code that accesses internal test identifiers from external test code), but the model becomes more complex when it could become simpler.

  1. Functionally, I have never had this problem break my code, or heard of it ever being a pain point.

The proposal isn't about code breakage. It's about overall simplification of the Go compilation model.

Tools that work on code (API compatibility checkers, symbol rename) must be go tag (including test tag) aware.

There's no such thing as a "test tag". Whether you're testing is orthogonal to the current set of tags.

This shouldn't be necessary for tools to setup in the first place. Secondly, it is possible by defining a test flag as a part of the build environment, though granted it does at least double the number of configurations needed, but certainly possible.

As I hope I've made clear in my replies above, you can't just use a single test flag - the number of configurations can be indefinitely high (which is why we have up to 15 copies of a given package in the real-world example above).

@Merovius
Copy link
Contributor

FWIW for duplication during tests the issue isn't only defining new APIs in _test.go files, but even just the concept of conditional compilation during tests. That is:

And, yes, the compiler would need to know that unexported symbols could be accessed in this particular case

Even if foo_test.go does not declare exported identifiers, it will result in a different compiled artifact that the compiler would conservatively have to recompile. i.e. for all of this to work, go test foo and go build foo would actually have to end up creating the same artifact containing the same code.

It's not 100% clear, but I assume you want to forbid _test.go files that contain a non-_test package declaration altogether, which would address this. But it does still mean that the compiler shouldn't do anything different when compiling a package-under-test.


FWIW, I'm not saying that we should do this, but another way to address all of these issues is to make test an actual build-tag. That has, of course, other downsides. But just with regular build tags, you would then get the property that they are specific to the execution of the go tool, instead of specific to a package. Again, not saying we should do this, just trying to open up the solution space :)

@rogpeppe
Copy link
Contributor Author

It's not 100% clear, but I assume you want to forbid _test.go files that contain a non-_test package declaration altogether, which would address this.

Yes, that's the proposal. I thought that should have been clear from: "I propose that it should not be possible to write tests in the same package as that being tested. That is, all tests must be in the external test package (e.g. foo_test to test package foo)."

But it does still mean that the compiler shouldn't do anything different when compiling a package-under-test.

Yup. Why would it need to?

FWIW, I'm not saying that we should do this, but another way to address all of these issues is to make test an actual build-tag. That has, of course, other downsides. But just with regular build tags, you would then get the property that they are specific to the execution of the go tool, instead of specific to a package.

That's definitely a possibility, but then with the test tag on, all internal test APIs would be exposed to all packages. That makes them part of the public API, which is probably not desirable.

@Merovius
Copy link
Contributor

Yup. Why would it need to?

I understood that to be implied by "the compiler needs to know that unexported identifiers are accessed in this case" (this case being "having a test that accesses unexported identifiers"). Either way, at least it's clarified now :)

mvdan added a commit to mvdan/gogrep that referenced this issue Dec 16, 2018
In go/packages, this currently means multiplying the loading and
typechecking work most of the time. Not only because of the _test.go
files and their extra dependencies, but also because we need to load and
typecheck internal test packages.

However, most queries are geared towards non-test code, so it's best not
to assume that the user wants to include test code. This way, the tool
is simpler and faster by default. It's also worth noting that most
static analysis tools already skip tests by default.

Reduces 'go test' time from ~2.4s to ~0.4s on my laptop.

We might want to reconsider the performance cost if
golang/go#29258 is implemented, or if gogrep
starts being used with a language server.
@kardianos
Copy link
Contributor

@rogpeppe

There's no such thing as a "test tag". Whether you're testing is orthogonal to the current set of tags.

I agree you are technically correct, there is not "test tag". But currently for the purposes of build caching, the test flag is an input into the cache key. I understand your primary motivation is to change this so tests compile independently of the rest of the package.

Have you done analysis on how allowing a import "." would impact complication of internal symbols? Today such symbols could be compiled away/in-lined entirely. But with this change I suspect any package level variable would be required to remain. Would this impact package symbols carried from imported packages that are only referenced from internal symbols?

This shouldn't be necessary for tools to setup in the first place. Secondly, it is possible by defining a test flag as a part of the build environment, though granted it does at least double the number of configurations needed, but certainly possible.

As I hope I've made clear in my replies above, you can't just use a single test flag - the number of configurations can be indefinitely high (which is why we have up to 15 copies of a given package in the real-world example above).

That's exactly my point. Thank you.

Currently, the number of build configurations can be indefinitely high in the current go/packages design. Adding -test to the build configuration will just make that higher. I agree.

So is the problem the various combinations that build tags and the test flag can be combined to get different symbols and -test is the most common offender, or is the problem that the current lexer/parser used for analysis doesn't handle multiple symbol versions for analysis? (I understand modifying the lexer/parser for analysis would be hard, but it would not contain pathological edge cases of explosion of build input combinations.)

I intend to not argue further on this point, I think my points been made. Thank you for listening, I hope I wasn't too long winded.

@Merovius
Copy link
Contributor

Merovius commented Dec 16, 2018

So is the problem the various combinations that build tags and the test flag can be combined to get different symbols and -test is the most common offender

I guess the problem is that a build tag applies to a complete call of a tool (you have to provide them in packages.Config), whereas the test-"flag" applies only to specific packages. If you start an analysis (or go test run…) at more than one package at the same time, you can have multiple sets of configs in the same run - but still only ever one set of build tags.

That's why I mentioned above that making test a full build flag (fully aware of the downsides) also solves this problem.


One thing that IMO makes this somewhat more confusing is the semantics of the packages-API. Load returns a slice of packages. If you range over that list and call Visit with each one in turn¹, every package will appear in the tree at most once - that is, Visiting each the *Packages individually provides you with an import-tree rooted at that package. So a tool that does that, when provided with a pattern (or list of packages), will behave the same as if you'd run it for each matching pattern individually (but be more efficient). However, the API is suggestive that you should pass the returned slice to Visit directly, which will then deduplicate packages and "lose" the information about what the root of the analysis was (instead just iterating over the transitive closure).

So, ISTM that we could at least document explicitly, that if you want the "every passed package its own analysis" semantics, you should range over the returned slice, whereas if you want the "anything in the transitive closure of dependencies" semantics, you should pass the slice to Visit (and be aware of this problem).


[1] Modifying Rog's example:
func main() {
	pkgs, err := packages.Load(&packages.Config{
		Mode:  packages.LoadTypes,
		Dir:   ".",
		Tests: true,
	}, "example/...")
	if err != nil {
		log.Fatalf("cannot load packages: %v", err)
	}
	for _, pkg := range pkgs {
		if !strings.HasPrefix(pkg.ID, "example/") {
			continue
		}
		fmt.Printf("%s\n", pkg.ID)
		packages.Visit([]*packages.Package{pkg}, func(pkg *packages.Package) bool {
			if strings.HasPrefix(pkg.PkgPath, "example/") {
				fmt.Printf("\t%s\n", pkg.ID)
			}
			return true
		}, nil)
	}
}

Output:


```example/b                            
	example/b
example/c
	example/c
example/a
	example/a
	example/b
	example/c
example/b [example/b.test]
	example/b [example/b.test]
example/b_test [example/b.test]
	example/b_test [example/b.test]
	example/a [example/b.test]
	example/b [example/b.test]
	example/c
example/b.test
	example/b.test
	example/b [example/b.test]
	example/b_test [example/b.test]
	example/a [example/b.test]
	example/c
example/c [example/c.test]
	example/c [example/c.test]
example/c_test [example/c.test]
	example/c_test [example/c.test]
	example/a [example/c.test]
	example/b
	example/c [example/c.test]
example/c.test
	example/c.test
	example/c [example/c.test]
	example/c_test [example/c.test]
	example/a [example/c.test]
	example/b

@rogpeppe
Copy link
Contributor Author

rogpeppe commented Dec 17, 2018

Another thing I just thought of.

In the example I gave above, just building all the test binaries without running any tests took a long time (almost 7 minutes). I could imagine that as an optimisation, the Go compiler could potentially build several packages' tests into a single binary to speed up overall build overhead. This would be hard currently with many versions of the same package, but much more straightforward with the proposed scheme, where there's only one package for any given set of build tags.

As a very rough estimation of some potential savings from this approach, we can estimate that the current time-per-test-binary-build is about 14s (6m55s for 121 packages parallelized over 4 processors). if we crudely assume that a single binary takes 12% longer to build (it adds 121 more packages to the 995 packages used by all packages being tested), then the overall build time would drop from 6m55s to ~15s.

I'm sure the saving wouldn't be as great as that, but i suspect it could be considerable

@UFOXD

This comment has been minimized.

@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jan 8, 2019
@ianlancetaylor ianlancetaylor changed the title proposal: prohibit in-package tests from extending a package's API proposal: Go 2: prohibit in-package tests from extending a package's API Jan 8, 2019
mvdan added a commit to mvdan/unparam that referenced this issue Jan 24, 2019
First, this makes the tool way faster, as most developers write tests
within the same package. This greatly slows down package loading, as we
must load each package twice.

If golang/go#29258 is ever accepted and
implemented, that cost might be reduced.

Second, most warnings within test files are not very useful. Even if
most aren't false positives, it's common for developers to not care as
much about test code quality.

Finally, loading the tests can lead to different results, as test files
can also use the functions which may or may not give warnings. More
often than not, the developer cares about how non-test functions are
used from within non-test code.

For all these reasons, disable test loading by default. It can still be
enabled manually.
@ainar-g
Copy link
Contributor

ainar-g commented Jan 27, 2019

testing/quick defines and interface type, Generator, which I use in my personal projects. Under this proposal, this interface is either impossible, or becomes a part of the package's public API, or forces the programmer to write unnecessary wrapper types. And this is not unique to testing/quick. Any test helper package that uses interfaces will face the same problem.

@Merovius
Copy link
Contributor

Merovius commented Jan 27, 2019

@ainar-g You can still implement Generator, you just can't do it for the package-defined types. But the _test package can define its own type with an exported method. So, this would work fine:

package foo_test

type TestFoo struct { foo.Foo }

func (TestFoo) Generate(rand *rand.Rand, size int) reflect.Value {
    foo := generateFoo(rand, size)
    return TestFoo{foo}
}

func CheckFoo(f TestFoo) bool {
    return isOkay(f)
}

(disclaimer: I don't use testing/quick and might be misusing the API)

@ainar-g
Copy link
Contributor

ainar-g commented Jan 27, 2019

@Merovius This is exactly the kind of “unnecessary wrapper types” that I've mentioned in my original message.

@Merovius
Copy link
Contributor

@ainar-g Well, it's not unnecessary, is it? :) But I get your point, I apparently read over that sentence fragment.

@nemo-cpt
Copy link

nemo-cpt commented Jan 28, 2022

I'd like to upvote this proposal. With huge CI builds of 1000s of go modules under test at once the overhead of rebuilding and caching each module in 100s configurations depending on the test it built for becomes quite significant. We use proprietary cross-language build system that supports go, but the complexity we face may be demonstrated by this Google's bazel code:

https://github.com/bazelbuild/rules_go/blob/master/go/private/rules/test.bzl#L445

Basically for every dependency of every test target we should decide whether it has transitive path to package under test and if so rebuild it (and cache). We cannot rebuild only immediate dependants. Like Googlers say

Libraries imported by the external test that transitively import the
library under test must be recompiled too, or the linker will complain that
export data they were compiled with doesn't match the export data they
are linked with.

One may say that we ought to use native go toolchain, but it wouldn't solve a problem. There is no magic in native build: all the same variants of all the same packages are to be built and cached.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Proposal v2 A language change or incompatible library change
Projects
None yet
Development

No branches or pull requests