Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Go 2: Interface and []byte types become package scoped #32283

Closed
itsjamie opened this issue May 28, 2019 · 10 comments
Closed

proposal: Go 2: Interface and []byte types become package scoped #32283

itsjamie opened this issue May 28, 2019 · 10 comments
Labels
FrozenDueToAge Proposal v2 A language change or incompatible library change
Milestone

Comments

@itsjamie
Copy link
Contributor

Background

Go currently identifies types that are created and exported by a package and prefixes their usage in external packages by the package name. One detractor of modules has been the possibility of two packages communicating through an io.Reader or []byte memory area causing issues between two versions of a package being used on the reader and writer leading to unexpected behaviour.

This has been presented as the reasoning behind why a SAT solver, and therefore singular version builds lead to more predictable and less error-prone during runtime.

Abstract

I propose these related changes to the type system in Go.

  • Exported variable types include their exporting packages SIV version during compile-time, as in mypkgv2.[]byte.
  • Exported interface types automatically alias their exporting packages SIV version during compile-time, as in io.Reader is aliased to mypkgv2.Reader.

This information is used during compile time at the package boundaries to ensure the consumer is importing the expected version.

Importers would hold the responsibility of re-exporting the automatically generated alias if it was used by a tertiary importer, otherwise, it would appear as the original type to reduce the virality by default.

Rationale

This is intended to fix the issue of untyped boundaries across packages. Allowing multiple versions of the same package is great, and for the majority of exported types, allows for clear compile time errors that the consumer hasn't been updated to match the importer, but there are issues at package edges with the above, and this was one thought on how this could be approached in a semantic import world that allows for multiple versions to making incremental upgrading easy, a laudable goal.

@gopherbot gopherbot added this to the Proposal milestone May 28, 2019
@kardianos
Copy link
Contributor

@itsjamie I've read your proposal a few times, but I honestly don't understand what you are proposing. A package can only be included into a compile a single time. A different major version is effectively a different package. This is represented today by separate import paths.

@rsc rsc added the v2 A language change or incompatible library change label May 28, 2019
@itsjamie
Copy link
Contributor Author

itsjamie commented May 28, 2019

@kardianos, the idea behind the proposal was to understand the issue from folks who are used to working with languages that require you only use a single version of a dependant in the build. Which as you pointed out is the incorrect mental model in a module + SIV world, since each version is effectively it's own package, therefore there shouldn't be this connection made between the two.

However, if you start from the perspective that a mypkg/v1.Struct !== mypkg/v2.Struct, it would be useful during migration between the two packages for a linting message warning you of locations where []byte, and interfaces such as io.Reader , and io.Writer from mypkg/v1 are also being used in conjunction with calls to methods of the same name of mypkg/v2. Since code that previously consumed data from v1 that was delivered via []byte, or io.Writer, will likely no longer correctly work with code from mypkg/v2. So it's preparing someone for failure if they don't have unit tests that exercises all the I/O edges. Especially where the import path will oftentimes be automatically updated, so no eyes will touch the code.

My understanding of SIV + aliases was to allow for easier incremental changes by introducing major versions as stepping stone points for API changes without requiring you to rename your module, however, it's likely that a change to the output is hidden in the type-system via I/O contracts that don't include the package type.

I understand this would be a special case, and is more deserving of perhaps a linting rule rather than full-on type system support. I was looking to see if there was an approach that covered this particular aspect. Reducing the chance of this foot-gun to developers because of how folks approach versioning given previous works.

The straw man issue presented to me was a database driver that used its own scratch memory for writing information on the active query. The user had seen two versions of the same driver write to this area in slightly different ways, causing for incorrect information to be written to the database and went unfound because it was a field that recorded information for later debugging, which when it was needed, was found to be corrupted. The issue was that a []byte equivalent memory region was allocated for both of these packages, and the user was responsible for ownership of that allocation, and the usage stomped over each other.

@kardianos
Copy link
Contributor

However, if you start from the perspective that a mypkg/v1.Struct !== mypkg/v2.Struct, it would be useful during migration between the two packages for a linting message warning you of locations where []byte, and interfaces such as io.Reader , and io.Writer from mypkg/v1 are also being used in conjunction with calls to methods of the same name of mypkg/v2.

@itsjamie Nothing prevents you from creating this linter today. How far have you taken your prototype linter that do these checks? What are your blockers?

Please keep in mind the Go compilers don't do linting, so this is a firmly in the realm outside of the core Go tools.

The straw man issue presented to me was a database driver that used its own scratch memory for writing information on the active query. ...

This is not a good straw man. Who presented this straw man to you? Perhaps I could correct their understanding of how database drivers generally work and why it wouldn't be an issue (I maintain several).

@itsjamie
Copy link
Contributor Author

itsjamie commented May 29, 2019

Doing it as a lint step is not ideal and was an acknowledgement that it not necessarily need be a compile time error but could fit into the automatic rewrite flow of import paths for ‘go mod’ as a warning that I/O edges had changed.

This proposal was to add package typing to these I/O types so a user could tightly bind their usage with the importer as a first step to ensure future upgrades at least check to make sure these contracts didn’t change.

I haven’t implemented a prototype linter to identify these cases, to answer your question though I don’t see any blocker to the implementation and running it during a import path rewrite.

I believe it would be a net benefit to allow adding package identifiers to these types to encode in the type system what produced the bytes or what type of bytes a consumer can consume successfully. I hope I clarified the proposal for you!

@beoran
Copy link

beoran commented May 29, 2019

I see what you are getting at, but you can already do what you want if you don't use []byte, io.Reader, io.Writer nor, interface{}, but define more specific types instead, and change those when you move from v1 to v2. Like that, if your IO API or layout changes, you can signal this by changing the types involved. The user of your package will then get a compile error when they upgrade and still try to use your old I/O types.

@kardianos
Copy link
Contributor

@itsjamie Thank you for the clarifications.

Here are some issues currently:

  • Go type aliases (that you have proposed) won't cause a compile error as they are not different types.
  • Go compilers only compile, a subset of Go vet is run on "go test" if it is always correct, go linters have no need to be 100% correct, but they are not run automatically. The checks you propose are not compile time errors (type aliases won't cause errors), and they would not be 100% accurate in their check, so this check would be a lint.
  • Implementation details are vague and the actual benefit untested.

I highly recommend you prototype your proposal. I believe source code analysis available today is sufficient to perform the checks you propose.

@ianlancetaylor
Copy link
Contributor

What does SIV mean?

@ianlancetaylor
Copy link
Contributor

It would help to see a small example. I don't understand the suggestion here. To the extent that I do understand it, it seems to me that it would break almost all existing packages.

@itsjamie
Copy link
Contributor Author

Given that most don't understand I think it's clear: The proposal is not well-formed or doesn't address an issue that is widespread. I think it needs some additional thought as to how it could be adequately described and implemented. I think I will close this proposal and refocus on if there is a clearer way to describe this.


However, to answer your questions @ianlancetaylor.

What does SIV mean?

SIV being Semantic Import Versioning defined here https://research.swtch.com/vgo-import.

It would help to see a small example.

In the article, it's recommended that the package Moauth becomes Pocoauth. The article later goes on to discuss the possibilities for automatic API updates between a semantically versioned v1 -> v2, for the parts of the API that are similar. This is what the proposal was trying to aid, to reduce the locations where an API contract currently would state it hadn't changed, even though it had. To help aid automated fixing. Doing this by adding type information to byte structures and other I/O edges. So if someone wanted to offer a v1 -> v2 call of this API, the wrapping function would need to explicitly type convert the byte to a Pocoauth.[]byte or, Pocoauth would need to alias Moauth.[]byte explicitly signalling that the bytes were the same.

The example I put forward using the example packages from the article would be:

Moauth provides the ability in its API to read a JWE from an io.Reader, and write the resulting decrypted payload to a []byte. The idea of the API being decoding a JSON Web Token from an HTTP request body, and passing the decrypted payload to be unmarshalled into a struct by the client library which implements Moauth/Pocoauth.

func GetDecodedJWE(body io.Reader) []byte

The initial implementation only sends the claims section of the decoded JSON Web Token on. However in Pocoauth they decided to make the backwards-incompatible change to this API of instead sending the entire decoded JWT for this API. However, an automatic API update between Moauth -> Pocoauth would not know based on the types that anything had changed, and downstream processors of this []byte would fail after an API upgrade that seemingly had no changes to the type structure.

The automatic go:fix nirvana proposed holds promise. I worry about tertiary importers (Unity by Ugo in the example) which rely on the same data being exposed by the Moauth/Pocoauth hypothetical API.

@beoran
Copy link

beoran commented Jul 10, 2019

As I stated before, in your example, the easiest way to make sure GetDecodedJWE is used correctly in v2 is to change it's signature like this:

type EntireDecodedJwt struct { 
  Buffer []byte
}

func GetDecodedJWE(body io.Reader) EntireDecodedJwt

Like that, in V2, callers will get a syntax error when they try to use the result of the function, because it has changed and can't be used like a normal []byte anymore.

@golang golang locked and limited conversation to collaborators Jul 9, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge Proposal v2 A language change or incompatible library change
Projects
None yet
Development

No branches or pull requests

6 participants