Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: spec: remove struct tags #20165

Closed
bradclawsie opened this issue Apr 28, 2017 · 67 comments
Closed

proposal: spec: remove struct tags #20165

bradclawsie opened this issue Apr 28, 2017 · 67 comments
Labels
FrozenDueToAge LanguageChange Proposal v2 A language change or incompatible library change
Milestone

Comments

@bradclawsie
Copy link

bradclawsie commented Apr 28, 2017

This is a Go2 Proposal

Struct tags are one of Go's great hacks, but they are being exploited more and more to embed DSLs into Go that cannot be checked by the compiler.

Struct tags create implicit constructor-style functionality. Go does not require constructors for structs but we could use interfaces to mandate this functionality in specific scenarios (serialization/deserialization, etc).

The alternative is end-runs around Go's simplicity like the validator library, which injects a private DSL that has no compatibility guarantees into the running state of Go programs.

Also since struct tags effectively create the equivalent of eval in Go code, they also represent an attack surface.

@bradclawsie bradclawsie changed the title Remove struct tags from Go Go2 Proposal: Remove struct tags from Go Apr 28, 2017
@bradfitz bradfitz added v2 A language change or incompatible library change LanguageChange labels Apr 28, 2017
@bradfitz
Copy link
Contributor

Obviously we can't remove struct tags from Go 1.x, so I'll throw this into the Go 2 bin for consideration later.

@bradfitz
Copy link
Contributor

Ah, and you retitled it at the same time as my label change.

@griesemer griesemer changed the title Go2 Proposal: Remove struct tags from Go Proposal: Remove struct tags from Go Apr 28, 2017
@gopherbot gopherbot added this to the Proposal milestone Apr 28, 2017
@ianlancetaylor
Copy link
Contributor

It would help to suggest some alternatives to struct tags for the things they are currently used for, such as the uses in encoding/json.

@bradclawsie
Copy link
Author

bradclawsie commented Apr 28, 2017

The functionality exists already with interfaces if we choose to use them more broadly. Removing struct tags does not require other language changes to Go in order to meet the functionality.

Any type that can be serialized or deserialized (not just json, but also yaml, toml, csv, etc) could be required to satisfy an interface available in the standard library. For example:

type Serializable interface {
   Construct...
}

Hopefully the intent is clear. The standard library has a general lack of useful interfaces and this is but one example.

When it becomes very common to inject runtime DSL functionality into a program through a string, it seems fairly obvious that this is alerting us to a deficiency in the language design. We need to figure out how to bring this back into the core language.

@cespare
Copy link
Contributor

cespare commented Apr 28, 2017

@bradclawsie I don't follow how using an interface is a replacement for the specific functionality that struct tags provide for JSON, XML, etc. (specifying field names, omitempty, ...)

@bradclawsie
Copy link
Author

bradclawsie commented Apr 28, 2017

@cespare Everything about serialization and deserialization can be moved to real Go functions, including any decisions about fields to omit. Struct tags are just an out-of-language shortcut that permits ad-hoc DSLs to be inserted into Go code.

@urandom
Copy link

urandom commented Apr 28, 2017

currently, struct tags are of great help when you want to quickly do things like (un)marshal stuff, among other things. while there are negatives as you listed, relying on interface implementations would introduce quite a bit of strain to developers. imho, a better solution would be to replace them with proper annotations that the compiler can actually check (and maybe such annotations can be expanded to more things than merely struct tags).

E.g.:

type foo struct {
     Bar string `json:",omitempty"`
}

would become:

type foo struct {
     Bar string @json.OmitEmpty
}

At the very least, the compiler can at least see whether an annotation exists or not, and it can come with better documentation as well.

@bradclawsie
Copy link
Author

@urandom I really like your suggestion. The usability of the current approach is retained, but because the annotations you propose would actually be Go syntax, the compiler can inspect them (they are no longer ad-hoc DSL vectors with an attack surface).

There is also the potential for developers to be able to implement user-defined annotation support.

There would have to be a broader proposal to integrate annotations in general in to the language.

Thanks!

@faiface
Copy link

faiface commented Apr 29, 2017

@bradclawsie I feel like what you're suggesting is that reflection should be removed from the language, since you want to implement reflection stuff through interfaces. Am I correct?

@bradclawsie
Copy link
Author

@faiface I am not asking for reflection to be removed from Go. I am proposing removing struct tags (whose contents are opaque and "not Go") and moving their functionality into the language itself.

@faiface
Copy link

faiface commented Apr 29, 2017

@bradclawsie Ok, I see that you liked the idea of "tags with syntax". I'm not against that too. Getting back to your original proposal, I still can't see how you'd possibly replace struct tags with interfaces.

@bradclawsie
Copy link
Author

bradclawsie commented Apr 29, 2017

@faiface Tags create implicit functionality when serializing or deserializing. There's no reason why you couldn't just require the user to encode these in Go. You would enforce that with an interface. Go has this half-done in certain cases with Marshal and Unmarshal but doesn't formalize the requirement with an interface.

That said, @urandom has a better suggestion...it provides a similar functionality while preserving the conciseness of tags.

@faiface
Copy link

faiface commented Apr 29, 2017

@bradclawsie The thing about moving this to interfaces is that it would remove majority of the convenience of reflection. Reflection is so cool in Go, because users don't have to do almost anything to use functionality which uses reflection. All I have to do to encode a struct is to create it (and maybe type a few tags). With this, I'd have to implement a few methods, well, a lot more code, a lot less convenient.

@bradclawsie
Copy link
Author

@faiface I agree my suggestion is less convenient, but Go already is focused on explicitness over cleverness everywhere else in the language. A more convenient alternative is @urandom 's annotations suggestion.

@faiface
Copy link

faiface commented Apr 29, 2017

@bradclawsie Well, we have reflection in Go. I think it's one of it's greatest strengths. Removing tags altogether would make reflection a lot weaker. However I agree, that making them more integrated to the language makes sense.

@ianlancetaylor
Copy link
Contributor

@urandom 's suggestion is interesting but is incomplete. We must not require the compiler to actually know what @json.OmitEmpty means. That does not scale, and would mean that new packages would not be able to take advantage of this technique. So json.OmitEmpty must be defined in some way in the encoding/json package, so that when that package is imported the annotation is permitted in the importing package. How does that work?

@as
Copy link
Contributor

as commented Apr 30, 2017

Also since struct tags effectively create the equivalent of eval in Go code, they also represent an attack surface.

Can we see a practical example of how a struct tag increases attack surface?

@bradclawsie
Copy link
Author

@ianlancetaylor ; @urandom may have a better idea, but presumably there would have to be a new mechanism for user defined attributes. Otherwise it could be done with interfaces, but that carries with it the cost of boilerplate.

@bradclawsie
Copy link
Author

@as struct tag functionality is invoked automatically (on serialization/deserializing) and is hidden behind an opaque interface (the DSL of the tag text, which can be anything). The Go compiler will make no attempt to protect the developer.

To begin to address the potential for mayhem, you would need to specify the language of tag strings at a minimum.

@as
Copy link
Contributor

as commented Apr 30, 2017

Yes, but is there any basis to expect vulnerabilities in struct tags? They are not input fields from clients of the process, only the developer has control over their contents. Am I missing something?

@bradclawsie
Copy link
Author

We should expect and defend against malicious code constantly. The expectation is no different than that for regular Go libraries, but the ability to defend is absent: the format for struct tags is ad hoc

@as
Copy link
Contributor

as commented May 1, 2017

We should consider formalizing the definition of security before changing the language to achieve it. The proposal does not mention tangible security threats. Security is frequently a trade off for efficiency, simplicity, and ease-of-use. I am simply asking how an adversary can exploit struct tags to achieve something considered a security breach by some common criteria.

@bradclawsie
Copy link
Author

We're getting off track. I mentioned attack surface as an implication of tags, but the primary thrust of this proposal is to change the Go language so that the functionality embedded inside tags can be made part of the language itself. I would prefer we focus on that.

@as
Copy link
Contributor

as commented May 1, 2017

It is not my intention to hijack the proposal, however requirements​ preceeed design. What are the other advantages of removing tags if not security?

@creker
Copy link

creker commented May 2, 2017

@garethwarry

a tag is simply a key-value pair of strings

Yes and that's exactly the problem. Any typo or simply wrong tag will be detected only at runtime when your code doesn't work. And look again at C# attributes. They resemble the same key-value structure but compiler can actually catch errors. Values can be of any type and that provides safety. Complexity here actually solves a problem.

overloaded methods for validating values

Where did you get that? C# probably allows that (never done or saw that myself) but, again, that was just an example.

By convention

Tags are just happens to be key-value pairs. The spec doesn't limit them to it. Even that is not specified. Not to mention everything else. The spec actually permits any string in tags any string is permitted as a tag

@spiritedsnowcat
Copy link

@creker, Do your own error validation. Adding language complexity and reducing flexibility just for the sake of daft programmers needing validation done for them implicitly is a horrible idea. We don't need cookie-cutter solutions from Microsoft for C# programmers who don't know how to error check without try-catches and type validation everywhere. We need a language that allows real programmers to be expressive and get work done. Key-value string pairs is exactly what the languages needs for convenience. Again, if your application requires error validation for tags, then implement it. Otherwise, adding such an unnecessary feature just breaks compatibility and adds complexity. This is a "NO" vote from me.

@mdempsky
Copy link
Member

mdempsky commented May 2, 2017

Yes and that's exactly the problem. Any typo or simply wrong tag will be detected only at runtime when your code doesn't work.

If that's the only problem, then we could more easily address it by extending​ cmd/vet or some other linter to warn when tags don't follow convention.

@spiritedsnowcat
Copy link

@mdempsky, golint already does this. It recommends exactly what I quoted above. Any literal string that doesn't follow the key-value pair format is rejected.

@bradclawsie
Copy link
Author

bradclawsie commented May 2, 2017

Golint is a style recommender for Go1 that has nothing to do with a discussion on changing the Go specification for Go2. You keep bringing up "convention"...the point of this proposal is to change the spec, not merely a recommendation. Go's simplicity starts with the basic fact that you can write a Go compiler by reading the spec.

@spiritedsnowcat
Copy link

spiritedsnowcat commented May 2, 2017

@bradclawsie, practice is just as important as syntax. When you fail to consider how good practice will come from syntax, then you fail to design a functional language. Look at Duff's device, for example. It compiles, but it's probably the worst use of language I've ever seen from C. Golint recommends good practice from Go, and should not be blown off as off-topic (as you've blown off multiple very good points being made). It recommends how tags are used - which is key-value just as the reflect package also documents.

@bradclawsie
Copy link
Author

I use golint and think it has some value, but it has nothing to do with this discussion. This is a Go2 proposal and has nothing to do with third-party tools for Go1.

@spiritedsnowcat
Copy link

spiritedsnowcat commented May 2, 2017

@bradclawsie,Your bloody topic is about struct tags, and you're saying we shouldn't look at best practices and golang documented uses for struct tags and should instead look at C# as a comparison? Are you high?

@bradclawsie
Copy link
Author

My proposal has nothing to do with Go1 best practices. It is about changing the Go2 spec. Please keep the discussion civil.

@creker
Copy link

creker commented May 2, 2017

@mdempsky

If that's the only problem, then we could more easily address it by extending​ cmd/vet or some other linter to warn when tags don't follow convention.

And how does it solve the problem when there's a typo in key or value? How does it solve plain errors in syntax that library expects? No tool can check that. A good way to start with solving that would be to make key-value pairs actually be the spec and not some convention in a library and allow values to be of any type, not just strings. That way at least some verification could be done at compile time and would make tags easier to work with as they wouldn't require parsing at runtime.

@creker
Copy link

creker commented May 2, 2017

@garethwarry, calm down. The proposed tools has nothing to do with the topic and don't solve the problem it highlights.

@creker
Copy link

creker commented May 2, 2017

@garethwarry

Do your own error validation. Adding language complexity and reducing flexibility just for the sake of daft programmers needing validation done for them implicitly is a horrible idea. We don't need cookie-cutter solutions from Microsoft for C# programmers who don't know how to error check without try-catches and type validation everywhere. We need a language that allows real programmers to be expressive and get work done. Key-value string pairs is exactly what the languages needs for convenience. Again, if your application requires error validation for tags, then implement it. Otherwise, adding such an unnecessary feature just breaks compatibility and adds complexity. This is a "NO" vote from me.

Again, calm down. You're not helping with that kind of unconstructive rant. You could have just downvoted instead of this. You're still missing the point and now just insulting other programmers. If compile-time validation is a cookie-cutter solution, then we should just ditch Go altogether and return to interpreted languages where everything is checked at runtime. There's your expressiveness. But no, for some reason we are talking about very strict statically typed language. And proposal is about extending that strictness to struct tags.

@as
Copy link
Contributor

as commented May 2, 2017

The C# annotation example is hideous and a good example of the cruft Go is trying to avoid. Take note of the plan9-like dialstrings in net instead of the monotonous protocol functions in other languages: sometimes safety isn't as important as simplicity. There doesn't seem to be a single practical example of how struct tags can cause faults in a program. How many bugs are encountered as a result of mistyped tag strings in stdlib or other GitHub packages? Without these answers I would deem this a dubious feature to add to the language unless a significant number of users are burned by the unchecked nature of struct tags.

@urandom
Copy link

urandom commented May 2, 2017

@as
The C# example annotations look the same as the current struct tags, expect that they can be applied to a lot of things, and the compiler can prove that they exist. You might not like that they are surrounded by '[]', but that's just the syntax they've chosen.

@spiritedsnowcat
Copy link

spiritedsnowcat commented May 3, 2017

@urandom, I don't get where that came from.
And I completely agree with @as.

@bradclawsie
Copy link
Author

@as My original point in this proposal wasn't to address a rash of errors people are complaining about, but to have the Go language as it is commonly used to be fully specified. I understand people may have a good experience with tags as they are commonly employed, but I have yet to hear a compelling argument as to why they should be unspecified. I believe that most points brought up here concede the point that the functionality is useful...the issue is that the compiler doesn't care what is in these strings.

Note also that this is a Go2 proposal so I don't think moving people out of their comfort zone should be a concern...if Go2 ever becomes a reality (which I doubt), you should expect many more jarring changes than this.

In any case I believe the next step is concrete proposals as others have pointed out. I will work on that but I'm not in any rush...Go2 is not coming any time soon.

@mier85
Copy link

mier85 commented May 12, 2017

To summarize a few things that I've read:

  • struct tags do not really pose a security thread.
  • dynamic code can be injected by the devs anywhere inside the code if that is what they want to do.
  • in case we have defined annotations we might get the compiler to validate the key of a struct tag, but it doesn't help you at all with the value ( you misspelled json? compiler says: please fix it. <=> you misspelled the name of the field that is expected as json name? you are alone.) What happens more often?
  • reflection can also be done on the names of fields and if you really want that, you can even abuse that.

As a conclusion: Struct tags can be abused, but mostly they are just convenient. ( as every language feature could be abused one way or another). Struct tags may not be pretty, but it seems to be not an easy task to make them prettier without losing their capabilities.

Whenever you parse dynamic content, you won't have compiler guarantees, no matter how explicit your code is. So does it worth it?

@bradclawsie
Copy link
Author

bradclawsie commented May 12, 2017

@mier85 Agree with most of your points but the value of the language spec is not to catch and reduce typos. It is to assist tool writers. Even if people tend not to mis-spell json strings, tags are still opaque to the compiler. Indeed by the current spec, a compiler could simply ignore tags entirely and still satisfy the Go1 compatibility guarantee, which by extension means that much of the behavior of serialization/deserialization is also not specified...

@ianlancetaylor
Copy link
Contributor

I know that was hyperbole, but I don't think the compiler could ignore tags entirely: that would break programs that use reflect.StructField.Tag, which is explicitly referenced in the language spec.

@bradclawsie
Copy link
Author

@ianlancetaylor Right, I think what this is boiling down to is the adequacy of "By convention" in https://golang.org/pkg/reflect/#StructTag

So getting back to my original claim, a part of Go this important should be specified...and if not specified, removed and have the functionality provided by features in Go that are specified.

@ianlancetaylor
Copy link
Contributor

Frankly, requiring every type that you want to serialize into JSON to define an interface to do the serialization is a non-starter. That would require a ridiculous amount of duplicate code compared to how it works today. Go is, above all else, a pragmatic language. We are not going to remove struct tags unless we have something else that works better. I have not yet seen that better thing in this issue.

@bradclawsie
Copy link
Author

Agree, the answer is not to remove struct tags but to specify them.

@rsc rsc changed the title Proposal: Remove struct tags from Go proposal: spec: remove struct tags Jun 16, 2017
@swizzley
Copy link

I'd like to be able to strip/sanitize tags with a method rather than removing tags from the language. Simply because it's problematic to read from an api that requires tags for type conversion, and those tags are then carried over when writing the data to another api where the types are real.

@ianlancetaylor
Copy link
Contributor

@swizzley I don't understand what that would look like. Struct tags are a feature of a type defined at compile time, and there isn't any way to dynamically change a type.

@ianlancetaylor
Copy link
Contributor

This proposal is incomplete. Struct tags are useful and meaningful and pervasive in existing code. While their ad hoc nature is perhaps unfortunate, we can't get rid of them without some clear idea of what we are going to replace them with. Closing as incomplete.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge LanguageChange Proposal v2 A language change or incompatible library change
Projects
None yet
Development

No branches or pull requests