Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go/ast: add Range token.Pos to RangeStmt #50429

Closed
quasilyte opened this issue Jan 4, 2022 · 10 comments
Closed

go/ast: add Range token.Pos to RangeStmt #50429

quasilyte opened this issue Jan 4, 2022 · 10 comments

Comments

@quasilyte
Copy link
Contributor

quasilyte commented Jan 4, 2022

The current RangeStmt:

	RangeStmt struct {
		For        token.Pos   // position of "for" keyword
		Key, Value Expr        // Key, Value may be nil
		TokPos     token.Pos   // position of Tok; invalid if Key == nil
		Tok        token.Token // ILLEGAL if Key == nil, ASSIGN, DEFINE
		X          Expr        // value to range over
		Body       *BlockStmt
	}

Proposed changes:

 	RangeStmt struct {
 		For        token.Pos   // position of "for" keyword
+		Range      token.Pos   // position of "range" keyword
 		Key, Value Expr        // Key, Value may be nil
 		TokPos     token.Pos   // position of Tok; invalid if Key == nil
 		Tok        token.Token // ILLEGAL if Key == nil, ASSIGN, DEFINE
 		X          Expr        // value to range over
 		Body       *BlockStmt
 	}

It will add extra 8 bytes to the size on 64-bit platforms. An unfortunate coincidence is that right now its size is 80 bytes, exactly as a size class. Making it 88 bytes will get it to the next size class, 96 bytes allocation (so we effectively add 16 bytes per range statement on 64-bit platforms).

Although I would assume that range statements are not that frequent as, say, *ast.Ident, so it might not be a big deal.

Why I need this: when doing a partial code search/replacements, you may want to use some structural elements as anchors and then make a span of this range. For instance, gogrep now supports range patterns like range $x, which means it'll match *ast.RangeStmt and capture only it's range expression. The bad part is that it's hard to figure out what's the starting pos of this partial syntax is.

The only way to find range position right now is to take TokPos and walk the source text manually, hoping that there are no inline /**/ comments or some other things that may be floating between := and range. It's unusual to have anything between Tok and Range, but it's still possible even for gofmt'ed code.

As a side note, it looks like gofmt does move comments there: for _, x := /*ok*/ range xs {} becomes for _, x := range /*ok*/ xs {}.

If the rationale above is not convincing, I actually need this pattern matching of standalone range clauses in the rules engine (based on go-ruleguard) that is based on gogrep. Getting a precise location (pos range) is important to apply an automatic fix.

This issue may meet the same fate as #44257 did, but I figured I'll still report the fact that there is an issue that a few people can get into while creating untrivial Go tools.

If this proposal is accepted, I would like to send a CL with implementation.

Kludge / workaround

If this proposal is never accepted, I'll embed a partial solution that may also show how awkward it is to find "range" pos right now.

var from int
if rng.TokPos.IsValid() {
	// Start from the end of the '=' or ':=' token.
	from = int(rng.TokPos + 1)
	if rng.Tok == token.DEFINE {
		from++ // ':=' is 1 byte longer that '='
	}
	// Now suppose we have 'for _, x := range xs {...}'
	// If this is true, then `xs.Pos.Offset - len(" range ")` would
	// lead us to the current 'from' value.
	// It's syntactically correct to have `:=range`, so we don't
	// unconditionally add a space here.
	if int(rng.X.Pos())-len(" range ") == from {
		// This means that there is exactly one space between Tok and "range".
		// There are some afwul cases where this might break, but let's
		// not think about them too much.
		from += len(" ")
	}
} else {
	// `for range xs {...}` form.
	// There should be at least 1 space between "for" and "range".
	from = int(rng.For) + len("for ")
}
@gopherbot gopherbot added this to the Proposal milestone Jan 4, 2022
@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Jan 5, 2022
@ianlancetaylor
Copy link
Contributor

CC @griesemer @findleyr

@go101
Copy link

go101 commented Jan 8, 2022

I also ever encountered this problem: https://groups.google.com/g/golang-tools/c/PaJBT2WjEPQ/m/HegyRyhjAwAJ

A related issue: #13590

There are two other similar cases:

  • the token.Pos of else in an if-else block
  • the token.Pos of type in a type-switch block

Currently, I need some efforts to get their positions, and I'm not sure my implementation is bug free.

@quasilyte
Copy link
Contributor Author

quasilyte commented Jan 8, 2022

To add some more context why a kludge with the manual token (described in the referenced thread + in this issue) discovery is hard: I'm also using a minified Go corpus that can be queried by structural patterns.
https://quasilyte.dev/gocorpus/
Sources are minified by minformat, so they consume ~25% less space (which is significant when it's 80mb vs 60mb) . Doing a structural search still works fine as it doesn't care about the things like spaces or explicit semicolons (it works on AST that will be the same).
This is probably not the most obvious usage of Go code (structural AST patterns + minified source code), but it's a thing.

But if #13590 is closed, I don't think anything has changed since then?

@findleyr
Copy link
Contributor

I don't have an objection to this proposal. My only concern is that right now go/ast is generally difficult to use in refactoring tools, due to missing information, and I wish we could take a step back and do a holistic evaluation of what would be required to solve all our problems (missing position information, floating comments, capturing irrelevant syntax, etc.). It feels like we really want a layer on top of go/ast that captures the full syntax of the original (unmodified) input. Should we be focusing on designing such a layer (in consideration of prior art), rather than adding additional information to go/ast that's not needed by gofmt or go/types?

But you're just asking for 16 bytes, and I don't think my philosophical concerns should be blocking.

@findleyr
Copy link
Contributor

@go101

There are two other similar cases:

Is that list comprehensive? I wonder how much it would take to have an invariant that "go/ast captures all the positions of semantically relevant tokens".

@go101
Copy link

go101 commented Jan 12, 2022

I think the list is comprehensive. The three are all the ones I found in developing go101/Golds.

@rsc rsc moved this from Incoming to Likely Accept in Proposals (old) Feb 23, 2022
@rsc
Copy link
Contributor

rsc commented Feb 23, 2022

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc rsc moved this from Likely Accept to Accepted in Proposals (old) Mar 2, 2022
@rsc
Copy link
Contributor

rsc commented Mar 2, 2022

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: go/ast: add Range token.Pos to RangeStmt go/ast: add Range token.Pos to RangeStmt Mar 2, 2022
@rsc rsc modified the milestones: Proposal, Backlog Mar 2, 2022
@gopherbot
Copy link

Change https://go.dev/cl/426091 mentions this issue: go/ast: add Range token.Pos to RangeStmt

gopherbot pushed a commit that referenced this issue Sep 5, 2022
For #50429

Change-Id: Idb027244f901d9f482c894b5b979a054d0f07de5
Reviewed-on: https://go-review.googlesource.com/c/go/+/426091
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Robert Findley <rfindley@google.com>
@adonovan
Copy link
Member

This was fixed by https://go.dev/cl/426091 .

@dmitshur dmitshur modified the milestones: Backlog, Go1.20 Jun 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Accepted
Development

No branches or pull requests

8 participants