Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go/ast: no way to distinguish some For statements, like for ;; {} and for {} #44257

Closed
quasilyte opened this issue Feb 14, 2021 · 13 comments
Closed
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@quasilyte
Copy link
Contributor

quasilyte commented Feb 14, 2021

What version of Go are you using (go version)?

$ go version

go version devel +5c7748dc9d Mon Aug 10 23:44:58 2020 +0000 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE=""
GOENV=""
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH=""
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT=""
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR=""
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build608119567=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Parse a and b: they produce the same AST. There is no way to distinguish them.

// a
for ;; {}
// b
for {}

Same for the cond-only for loops, we can't distinguish them:

// a
for cond {}
// b
for ; cond; {}

With SliceExpr, we have a useful Slice3 flag that helps to distinguish ambiguous cases.

Maybe we can have something similar for the ForStmt? It can also be solved by adding semicolon positions to the ForStmt.

I know that Go AST is (ahem) AST, not a parse tree, so some information can be lost there, but it seems like this would be useful for tools that analyze the Go source code style.

In particular, gogrep tool could benefit from that.

isharipov@lispbook:go$ cat example.go
package example

func foo(cond bool) {
	for ;; {}
	for {}

	for cond {}
	for ; cond; {}
}
isharipov@lispbook:go$ gogrep -x 'for ;; {}' example.go
example.go:4:2: for { }
example.go:5:2: for { }
isharipov@lispbook:go$ gogrep -x 'for cond {}' example.go
example.go:7:2: for cond { }
example.go:8:2: for cond { }

(It also looks like go/printer has the same difficulties, this is why we get identical printed forms above.)

I'm also aware that after gofmt both a and b become identical; it's not always desirable to reformat the code being processed.

Another gogrep-related thing. for $*_ {} could read as "any for loop" while for ; $*_; {} should probably match only non-range loops. Right now it's hard to produce a different pattern for these as both of them generate identical AST.

What did you expect to see?

A way to distinguish between 2 syntax forms.

What did you see instead?

No way to distinguish between 2 syntax forms.

@quasilyte quasilyte changed the title go/ast: no way to distinguish some For statements go/ast: no way to distinguish some For statements, like for ;; {} and for {} Feb 14, 2021
@seankhliao seankhliao added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Feb 14, 2021
@networkimprov
Copy link

cc @griesemer

@griesemer griesemer assigned griesemer and findleyr and unassigned griesemer Mar 22, 2021
@griesemer griesemer added this to the Go1.17 milestone Mar 22, 2021
@griesemer
Copy link
Contributor

Marked for Go 1.17 but only so it's on the radar if we decide to address this.

@findleyr
Copy link
Contributor

Moving this to 1.18 as it's not going to be addressed before the freeze.

Note that adding information to the AST is not always a net win. For example, a significant fraction of gopls' memory use is devoted to AST, and adding fields can have a real (albeit incremental) impact on performance. While this change might have negligible impact, we have to draw the line somewhere, and for this there are some heuristics:

  • Does the information affect the correctness of the code. If so, we must capture it. This is the case for Slice3 (s[1:2:] is disallowed by the spec), but is not the case here.
  • Is the information required for gofmt, i.e. canonical formatting of the code. Again, this is not the case here.

So with these considerations, I'm inclined to say that we shouldn't do this. However, I am very sympathetic to the underlying motivation. In gopls we've had various struggles writing refactoring tools, related to the fact that we can't compute the exact input from the AST. There is probably room for a higher-level library on top of go/{scanner,ast} that provides a better API for working with raw Go syntax.

@findleyr findleyr modified the milestones: Go1.17, Go1.18 Apr 29, 2021
@ianlancetaylor
Copy link
Contributor

@findleyr This is in the Go1.18 milestone. Is it likely to happen for 1.18? Should it move to Backlog? Thanks.

@findleyr
Copy link
Contributor

Yes, unfortunately this needs to go into the backlog. With everything else, parser improvements have had to take a back seat. Also unassigning myself, as I am not actively working on this.

@findleyr findleyr modified the milestones: Go1.18, Backlog Nov 17, 2021
@findleyr findleyr removed their assignment Nov 17, 2021
@griesemer
Copy link
Contributor

@mvdan Perhaps you're interested in picking this up? But again, this is not urgent in any way, as far as I can tell.

@quasilyte
Copy link
Contributor Author

I would guess that we can store EmptyStmt in pre+post parts of the ForStmt.

This way, no new fields are added, so we'll pay extra memory in gopls only if the code actually had for ;; {}.
The go/printer could still print for ;; {} as for {} to avoid a backwards incompatible behavior (?), it'll check for pre+post parts being an empty stmt.

If Daniel is not interested enough, I can try rolling an implementation so we can see whether it's more involving than it looks right now or not.

@ALTree
Copy link
Member

ALTree commented Nov 17, 2021

How often do for ;; { and for ; true ; { appear in the wild?

Considering gofmt rewrites them into for { and for true { respectively, and the overwhelming majority of Go code is fmt'd, I would say... almost never? Only non-gofm't code can have these.

@quasilyte
Copy link
Contributor Author

quasilyte commented Nov 17, 2021

@ALTree, I would assume you have read my description above.

But just to be clear: it appears in user-defined syntax patters when you want to find these kinds of structures. Or when you want to write a structural refactoring tool yourself, so you want to have as good AST as possible. I mentioned gogrep, but there is also go-ruleguard and I'm sure there are more tools like that in the wild. Most users are suffering without saying anything out loud (tools that inspect AST and use it as inputs for AST-related rewrites, etc; we don't even need a proper syntax tree, just an AST that misses less information, encoding ; as EmptyStmt looks natural here for me, but I may be wrong). It just happens that AST package is used by a minority of people, so we have very low chances even to get issue reports from anyone.

I agree that we can't get 100% of what we want from the go/ast, but there are a few things we can fix so it fits the abovementioned niche even better. Small incremental steps that preferably don't break anything nor make it slower/consume less memory - that's what I think is a win-win situation.

If such tools are out of scope, then we can close this issue. If not, I can bring some more related issues and open them as separate tickets (or we can extend this one if that looks better).

@griesemer
Copy link
Contributor

@ALTree has a good point here. Unless there's a natural way to represent this with the go/ast, I don't know that we should be doing anything here. Storing EmptyStmt's in the pre/post parts may work, but we have to ensure that those won't change the effect of other tools, specifically go/printer output.

@ALTree
Copy link
Member

ALTree commented Nov 17, 2021

@quasilyte Thanks for the clarification. I did read the thread before replying, and now I've also read your second reply, but I think my point still stands (or possibly I've misunderstood your argument).

You pointed out that currently this: gogrep -x 'for ;; {}' example.go will also match for {, and that's undesirable; but my question is: why would you write the pattern that way? Assuming you're grepping over go-fmt'd code, you know for sure that for ;; { will never appear in the corpus (because gofmt rewrites it), so using for { as the pattern is a better choice, and it'll return every instance of the desired pattern as it actually appears in a go-fmt'd codebase.

Of course if you're grepping over non-fmt'd code, for example for a tool that reports that for ;; { can be simplified to for {, then I agree that being able to tell the difference is important. But I wonder if this is a compelling use case. A tool that reports a code style that gofmt already unconditionally rewrites doesn't seem terribly likely to be actually useful.

I'm not saying that being able to tell them apart is a bad idea. I just wanted to say that -since the difference can only be apparent in non-go-fmt'd code- any solution to this issue should probably have ~0 impact on performance and memory usage of go/ast.

@quasilyte
Copy link
Contributor Author

@ALTree, you're right, there is no big benefit in this since using this pattern is not very useful.

I think I'll find some workaround in one way or another. The issue can be closed unless anyone has any reasons to keep it. It'll stay in history, so at least we'll have some context for the future.

@griesemer
Copy link
Contributor

Closing issue per above discussion.

@golang golang locked and limited conversation to collaborators Nov 17, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

8 participants