Skip to content

proposal: cmd/vet: warn about cross package struct conversion #45051

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
carnott-snap opened this issue Mar 16, 2021 · 23 comments
Closed

proposal: cmd/vet: warn about cross package struct conversion #45051

carnott-snap opened this issue Mar 16, 2021 · 23 comments

Comments

@carnott-snap
Copy link

As a concrete solution to #44893, as discussed in golang-nuts, I would like to see a "vet check that would warn about attempts to convert a struct defined in a different package to a locally defined struct". This would mirror an existing "vet check for composite literals that verifies that if a composite literal is used with a struct defined in a different package, then the fields are explicitly named". (wording credit goes to @ianlancetaylor)

@gopherbot gopherbot added this to the Proposal milestone Mar 16, 2021
@mdempsky
Copy link
Contributor

This seems a bit silly to me, but I'll agree with @ianlancetaylor that if cmd/vet already has a warning for cross-package unkeyed struct literals, that seems like precedent for a similar warning here.

I'd suggest instead of warning about "converting a struct defined in a different package to a locally defined struct" to instead warning about "converting from struct type T to struct type U where T and U's underlying struct type literals appeared in different packages."

In particular, this would cause a warning when converting from pkgA.StructType to pkgB.StructType, not just from pkgA.StructType to MyLocalStructType.

Also, it would exempt conversions of pkgA.StructType to types declared like type MyStructTypeCopy pkgA.StructType, since in this case it will always have identical field layout.

@carnott-snap
Copy link
Author

I'd suggest instead of warning about "converting a struct defined in a different package to a locally defined struct" to instead warning about "converting from struct type T to struct type U where T and U's underlying struct type literals appeared in different packages."

I prefer this wording too.

Also, it would exempt conversions of pkgA.StructType to types declared like type MyStructTypeCopy pkgA.StructType, since in this case it will always have identical field layout.

Will this be problematic to detect based on the AST?

@mdempsky
Copy link
Contributor

Will this be problematic to detect based on the AST?

No. Given the *types.Struct for types T and U, the analyzer just needs to check that Field(0).Pkg() evaluates to the same package for both of them.

This won't detect empty structs, but I don't think that's worth worrying about.

@ericlagergren
Copy link
Contributor

ericlagergren commented Mar 16, 2021

I don't see why vet should complain about perfectly valid code. The unkeyed literal check can be easily fixed, but this proposed check cannot.

IOW, if I want to augment my local T2 with new struct tags or methods (perhaps for serialization purposes) and then this vet check is added, I can effectively no longer do so.

I don't quite see the difference between this proposal and a proposal that makes vet complain if you convert otherpkg.T1 to local T2 where T1 and T2 are both (e.g.) int32. In theory, otherpkg could change the type of T1 to int64, breaking your conversion.

@carnott-snap
Copy link
Author

These two do not seem comparable; adding fields is explicitly called out as allowed in go1compat:

Struct literals. For the addition of features in later point releases, it may be necessary to add fields to exported structs in the API. Code that uses unkeyed struct literals (such as pkg.T{3, "x"}) to create values of these types would fail to compile after such a change. However, code that uses keyed literals (pkg.T{A: 3, B: "x"}) will continue to compile after such a change. We will update such data structures in a way that allows keyed struct literals to remain compatible, although unkeyed literals may fail to compile. (There are also more intricate cases involving nested data structures or interfaces, but they have the same resolution.) We therefore recommend that composite literals whose type is defined in a separate package should use the keyed notation.

Changing the underlying type of a custom type is explicitly called out as a breaking change in apidiff and implicitly by go1compat:

It is intended that programs written to the Go 1 specification will continue to compile and run correctly, unchanged, over the lifetime of that specification.

@ericlagergren
Copy link
Contributor

Sure. But circling back to my primary point: it's a perfectly valid use of the language to convert T1 to pkg.T2 where both T1 and pkg.T2 are structs. And unless it's optional, a go vet check would make that language feature mostly unusable. And unlike the keyed literal check, this can't be fixed "for free". I.e., without manually copying each field of pkg.T2 to T1.

@carnott-snap
Copy link
Author

As long as both types share an underlying type, conversions are fine. Structs are just the one underlying type where changing the definition (adding a new field) is defined as compatible. IMO, this seems like an oversight, because this can and will break if you do not control both types.

I agree, it sucks that you cannot do this trick for marshaling, because the only alternative is implementing XxxMarshal and then forward EVERY field. (See golang-nuts thread.) Are there other use cases that this will impair?

@ericlagergren
Copy link
Contributor

There are any number of reasons outside of marshaling. Perhaps you want to change the type's method set. Or mask off methods. Or perhaps you control both packages and this is a non-issue. Or perhaps you've got two packages that generate overlapping an overlapping set of types with cgo godefs.

Anyway, it seems like the burden is on go vet here, not the valid yet brittle code—right?

@timothy-king
Copy link
Contributor

I agree that there is a precedent for adding this check from x/tools/go/analysis/passes/composite. This and the precedent are slightly debatable for satisfying the correctness requirement in https://golang.org/src/cmd/vet/README.

Vet's checks are about correctness, not style. A vet check must identify real or
potential bugs that could cause incorrect compilation or execution. A check that
only identifies stylistic points or alternative correct approaches to a situation
is not acceptable.

I lean more towards this is a maintenance over time/'identifying an alternative correct approach' issue and less of a correctness issue. I can understand that others might come to a different conclusion though. I am also not yet convinced this will satisfy the vet Precision requirement either. (In fairness, I also have some doubts that the precedent composite satisfies the precision criteria too, but it is on me to go check this out).

I would be okay with a user centric view of correctness and precision for identifying true positives: does someone act on this if it is reported instead of ignoring it. I would like to see some data about how frequently users act on such warnings.

We currently have different levels of checkers with vet: those run by go test, those run by go vet, and those disabled by default. The latter can have more false positives than the first two. Note there is nothing stopping this Analyzer from being developed outside of vet.

"converting from struct type T to struct type U where T and U's underlying struct type literals appeared in different packages."

For p.T and q.U, I am worried that this can be considered a false positive when p and q are maintained together. Say they are myvanity.com/projectFoo/{p,q} and are in the same module. Ownership is not a concept directly present in the code or module graph, but it does change my expectation about how helpful such a warning is. I doubt we are helping anybody by warning about in the p and q example, and I really doubt we would be helping someone on if we changed one of the packages to myvanity.com/projectFoo/internal/r. As long as the packages in the module compile, the compiler keeps this enforced over time. Different modules seems like a plausible boundary. (Note: Module info is not yet supported in the x/tools/go/analysis framework.) Is it worth warning me about if there are 2 packages in 2 different modules if I maintain both of them? Maybe? I can see someone saying either "yes, please warn me" or "no, it's fine".

@tmthrgd
Copy link
Contributor

tmthrgd commented Mar 17, 2021

For p.T and q.U, I am worried that this can be considered a false positive when p and q are maintained together. Say they are myvanity.com/projectFoo/{p,q} and are in the same module.

It might be relevant to consider this change alongside #43864.

@zikaeroh
Copy link
Contributor

I have a project which has a config struct in one package, but then redefines the struct in the main package so it can add struct tags for flag parsing to offer a CLI. A vet check like this would frustrate me.

I fully understand the composite check for struct literals; you have no warning that your code may have broken. But in order to do a struct conversion, you must have written out the full struct type already with all types matching (only with tags differing), so the conversion failing at compile time is the warning that says "this doesn't work anymore". I don't think there needs to be anything stronger than that.

@carnott-snap
Copy link
Author

I have a project which has a config struct in one package, but then redefines the struct in the main package so it can add struct tags for flag parsing to offer a CLI. A vet check like this would frustrate me.

Not sure how the composite literals check is implemented today, but imo, if it is behind an internal package, vet should let you do whatever you want.

I think there is something to be said about module level checks, but that is impossible today, and affects composite literals too.

I fully understand the composite check for struct literals; you have no warning that your code may have broken. But in order to do a struct conversion, you must have written out the full struct type already with all types matching (only with tags differing), so the conversion failing at compile time is the warning that says "this doesn't work anymore". I don't think there needs to be anything stronger than that.

This check is not to protect two types controlled in the same module, but between authors or module versions. There is no guarantee that the owner of a struct will or will not add a field, and a simple minor upgrade could break your build. Plus this type of upgrade could be forced by a dependency that imports the newer version taking any " choice out of your hands.

@zikaeroh
Copy link
Contributor

Not sure how the composite literals check is implemented today, but imo, if it is behind an internal package, vet should let you do whatever you want.

I think there is something to be said about module level checks, but that is impossible today, and affects composite literals too.

Consider the case of a project that's both a library and CLI; one definition is exported (not internal), the other is in main or some other package used by main. This check would fail under that circumstance; internal isn't the right tool to silence these errors.

This check is not to protect two types controlled in the same module, but between authors or module versions. There is no guarantee that the owner of a struct will or will not add a field, and a simple minor upgrade could break your build. Plus this type of upgrade could be forced by a dependency that imports the newer version taking any " choice out of your hands.

To be fair, the title of the issue and its content say "cross package check". 🙂

I am sympathetic to this check helping prevent problems when modules are updated and one dep has done this to another. A cross-module check is more reasonable, but I'm still very wary about existing uses where the intent was just to be switching around tags. I'd be interested in some "query all of github" checks to find cases where this might help and to categorize reasons why people use these types of conversions.

(I also assume this would not meet the bar to be a vet check run in go test; if that happens, any code doing this is dead in the water.)

@carnott-snap
Copy link
Author

Consider the case of a project that's both a library and CLI; one definition is exported (not internal), the other is in main or some other package used by main. This check would fail under that circumstance; internal isn't the right tool to silence these errors.

You can actually use internal to ensure that both symbols and the conversions always happen in one package, thus pass the vet check, if you use a func and reexport the public symbol.

package internal

type Exported struct{ ... }

type Unexported struct{ ... }

// To could also be defined as a method.
func To(e Exported) Marshalable { return Marshalable(e) }
package main

func main() {
        json.Marshal(internal.To(internal.Exported{}))
}
package export

type Struct = internal.Exported

To be fair, the title of the issue and its content say "cross package check". 🙂

I am sympathetic to this check helping prevent problems when modules are updated and one dep has done this to another. A cross-module check is more reasonable, but I'm still very wary about existing uses where the intent was just to be switching around tags. I'd be interested in some "query all of github" checks to find cases where this might help and to categorize reasons why people use these types of conversions.

I say module because that is the way I think about the problem space logically, but existing vet tooling cannot support any of this nuance. If we decide to wait, I understand, but I want to stem the current bleeding.

(I also assume this would not meet the bar to be a vet check run in go test; if that happens, any code doing this is dead in the water.)

I do not have a strong opinion, and was not aware that vets could break tests, but it sounds like popular sentiment is opposed.

@zikaeroh
Copy link
Contributor

zikaeroh commented Mar 17, 2021

I can't say I'd be thrilled to have to define my types in that way to avoid go vet being mad at me, sorry. (Nor does it address my "CLI wants to add tags to a config struct in another package" use case.)

I do not have a strong opinion, and was not aware that vets could break tests, but it sounds like popular sentiment is opposed.

go test runs a subset of go vet checks that are known to be "very accurate". An example is the recently added "unbuffered channel used with signal.Notify" checks that will error when you run go vet, but not say anything on go test. go vet can certainly have more checks than go test (but the bar is still high, just not as high as go test's subset).

@carnott-snap
Copy link
Author

carnott-snap commented Mar 17, 2021

I can't say I'd be thrilled to have to define my types in that way to avoid go vet being mad at me, sorry.

It is not required that you use vet, as long as this is not a go test lint, and you can always disable the analysis.Analyzer.

Also this level of compartmentalisation seems to make your types more encapsulated, as opposed to striping over (at least) two packages.

@timothy-king
Copy link
Contributor

On some level, the proposed check is warning the consumer of a package that they are exposing themselves to a broken compilation if a field is added by the creator of a package.

apidiff targets the creator of a package and chooses not to warn when an exported field is added. From apidiff:

The apidiff tool in this directory determines whether two versions of the same package are compatible. The goal is to help the developer make an informed choice of semantic version after they have changed the code of their module.
...
But apidiff ignores five ways in which code may fail to compile after a change.
...
Embedding and Shadowing
Adding an exported field to a struct can break code that embeds that struct, because the newly added field may conflict with an identically named field at the same struct depth.

It also states:

For a tool to be useful, its notion of compatibility must be relaxed enough to allow reasonable changes, like adding a field to a struct,

My takeaway is that apidiff chose not to warn producers this is a breaking change. It could still make sense to warn consumers given some additional context about the client code.

but it sounds like popular sentiment is opposed.

Data could really help persuade people. A small scale experiment to establish how frequently these warnings are considered helpful would be a good start. After this a good next step would be to collect data on how often this pattern arises.

@rsc
Copy link
Contributor

rsc commented Apr 7, 2021

How often does this come up? And when it does come up, do you want it flagged?
If I saw this in code I would think "that can't possibly be an accident, they must mean to keep those aligned."

The keyed literal check is quite different. It's enforcing the specific suggestion in https://golang.org/doc/go1compat, which if we had it to do over again we would probably force in the language. I'm less sure about disallowing a struct conversion like this.

Note also that the struct conversion is only possible at all when all the fields are exported. Again, how often does this come up? We would need to see some proof that the answer is not "approximately never".

@rsc
Copy link
Contributor

rsc commented Apr 7, 2021

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@urandom2
Copy link
Contributor

urandom2 commented Apr 7, 2021

What kind of evidence would you be looking for? I can say it has happened, and I think the marshaling use case is compelling.

@Merovius
Copy link
Contributor

Merovius commented Apr 7, 2021

@arnottcr You could implement the check you intend and look at the Go corpus by @rsc to find instances of this happening and evaluating what amount of those are false-positives. Probably taking into account the exceptions @mdempsky mentioned here. I would also be interested in how many of these cases a) are in the same module or b) are only changing tags (specifically json tags and similar).

FWIW, my opinion on this is the same as I expressed in that thread: I don't think it is reasonable to consider it universally deprecated to convert struct types this way, even if it technically makes adding a field a breaking change. Technically, every publicly visible API change is a breaking change, so just using just "can this change break code" as a metric to determine this is not practical. So, in practice, the decision is far more nuanced and I don't think it is easily detectable by machines, if something does or does not constitute a "reasonably compatible" change.

@rsc
Copy link
Contributor

rsc commented Apr 14, 2021

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Apr 21, 2021

No change in consensus, so declined.
— rsc for the proposal review group

@rsc rsc closed this as completed Apr 21, 2021
@golang golang locked and limited conversation to collaborators Apr 21, 2022
@rsc rsc moved this to Declined in Proposals Aug 10, 2022
@rsc rsc added this to Proposals Aug 10, 2022
@rsc rsc removed this from Proposals Oct 19, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

10 participants