Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go/parser: ParseExpr returns AST with bogus Pos information which tricks go/printer into generating source code which doesn't compile #26986

Closed
MarkNahabedian opened this issue Aug 14, 2018 · 5 comments

Comments

@MarkNahabedian
Copy link

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.10.2 windows/amd64

Does this issue reproduce with the latest release?

Is reproduces in the go playground:
https://play.golang.org/p/tpzqDF3EZ_S
and appears to still be present in a source download from earlier this morning.

What operating system and processor architecture are you using (go env)?

windows/and64

What did you do?

See this Go Nuts thread for background:
https://groups.google.com/forum/#!topic/golang-nuts/FiUBZI4kinE

parser.ParseExpr is documented to return an AST with Pos information that is "undefined". What it should do is return an AST with Pos information that is invalid. Maybe that's what the author intended but what ParseExpr does is create a new FileSet, call ParseExprFrom with that FileSet and then throw the FileSet away.

This is a problem because when the result of ParseExpr is included in a larger AST which is then serialized, the resulting code might include spurious newlines that are inserted by go/printer (see the background link above for the code in go/printer which I suspect is doing this) because of the incorrect Pos information. Because such newlines are significant in Go, the output of go/printer might not compile.

If possible, provide a recipe for reproducing the error.
A complete runnable program is good.
A link on play.golang.org is best.

See https://play.golang.org/p/tpzqDF3EZ_S

What did you expect to see?

I expected ParseExpr to return an AST no node of which has valid Pos information.

What did you see instead?

ParseExpr returns an AST with valid but meaningless Pos information because that Pos information can only be interpreted with respect to a FileSet which no longer exists. This can then cause go/printer -- innocently believing that Pos information -- to insert newlines in places which prevent the resulting code from being parsed.

@andybons andybons changed the title ParseExpr returns AST with bogus Pos information which tricks go/printer into generating source code which doesn't compile go/parser: ParseExpr returns AST with bogus Pos information which tricks go/printer into generating source code which doesn't compile Aug 14, 2018
@andybons andybons added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 14, 2018
@andybons andybons added this to the Unplanned milestone Aug 14, 2018
@andybons
Copy link
Member

@griesemer

@griesemer griesemer added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Aug 14, 2018
@griesemer griesemer self-assigned this Aug 14, 2018
@griesemer
Copy link
Contributor

The documentation is clear - the position information is undefined, i.e., it can be whatever it happens to be. The implementation sets up a file set so that the rest of the code can work. It's not clear that the position information should be "cleared".

If this is important, it can always be done manually.

Leaving open for further comments, but I'm not sure that there's anything that needs to be done here.

@MarkNahabedian
Copy link
Author

The problem is that the incorrect Pos information tricks go/printer into writing source code that won't parse.

I have worked around this problem by copying the AST but excluding Pos values. I didn't see any other was to invalidate the incorrect Pos info. It was boring and tedious to write this and took the better part of a day.

In retrospect, it might have been simpler to use ParseExprFrom and include my FileSet. One might ask why PsrseExpr exists at all.

I understand that it would be difficult to change the current implementation of ParseExpr to ensure that the Pos values are invalid. I have a workaround. I just hope this issue doesn't waste anyone else's time.

I've seen wishful mention of some maybe eventual reimplementation of the AST library to simplify how comments are represented. Maybe during such a reimplementation, ParseExpr could be fixed as well. That a flaw is documented does not cause it to not be a flaw.

I won't presume to suggest how the go implementers should manage and prioritize their workloads. I hit a pitfall; it held me up for a while; I found a workaround. Feel free to close this ticket if you think that's the reasonable thing to do. You won't convince me that this is not a bug just because there's a warning in the documentation about it. I don't believe that the implementor of ParseExpr intentionally designed this behavior, but rather accepted it as a consequence of the rest of the parser it's built on. I do respect that there are probably many more important things to work on than this though

@griesemer
Copy link
Contributor

@MarkNahabedian ParsExpr exists for historic reasons - originally it was used in gofmt's rewrite implementation if I remember correctly (cmd/gofmt/rewrite.go:39) which happens not to care about position information (at least at that time). Later uses recognized that it was necessary to have control over position information, hence we added ParseExprFrom.

I'm sorry that you feel like you wasted time, but the parser interface is pretty small and documents both functions next to each other, so it does seem the documentation is readily available and also easy to find.

I am not trying to "convince you that this is not a bug" as we have found ourselves years ago (2015) in similar situations. But it is sufficient for other applications.

We do know that creating "valid" syntax trees from different AST parses is very cumbersome and fiddly. It may seem obvious in retrospect, but when we wrote the scanner/parser/printer etc. go/* libraries a decade ago, we wrote them primarily with a simple gofmt in mind, at a time when no other language was considering formatting all their source code by default automatically. We clearly didn't anticipate to which extent people would use them for programmatic code manipulation.

The compiler does use a brand-new and stand-alone package (cmd/compile/internal/syntax) which creates a syntax tree that has position information attached independent of some file set. That makes manipulating that tree much simpler. It also contains a printer that formats independent of position information, purely based on tree structure, exactly to avoid all these problems. The syntax tree doesn't track comments yet, though, and the printer doesn't do alignment a la gofmt. Eventually they should do it, and once we're sufficiently happy with the API, it might become an alternative choice to the go/* libraries (and move out from "internal"). In the meantime, you're free to copy from it.

Finally, there is a golang.org/x/tools/go/ast/astutil, a library that contains a rewrite mechanism (astutil.Apply) that might be helpful to do tree manipulations such as changing position information.

I'm going to close this as "unfortunate". We're not going to make changes to ParseExpr.

@griesemer griesemer added Unfortunate and removed NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels Aug 20, 2018
@MarkNahabedian
Copy link
Author

Ok. Thanks much for the explanation and historical context. It is extremely rare that someone is able to anticipate all the ways others might (ab)use their library.

I'll take a look at astutil.

@golang golang locked and limited conversation to collaborators Aug 23, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants