Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go/parser: panic during object resolution: identifier already declared #64534

Closed
findleyr opened this issue Dec 4, 2023 · 10 comments
Closed
Assignees
Milestone

Comments

@findleyr
Copy link
Contributor

findleyr commented Dec 4, 2023

Discovered via Google internal bug reporting, and reproduced via fuzzing.

The following code panics in object resolution:

	_, err := parser.ParseFile(token.NewFileSet(), "", "package p; func _[A\x9d(B](){}", 0)

Output:

panic: 1:22: identifier B already declared or resolved [recovered]
	panic: 1:22: identifier B already declared or resolved

https://go.dev/play/p/SacseHmo6ha?v=gotip

This is a new regression, and does not reproduce at 1.21. Marking as a release blocker.
The only thing that changed is https://go.dev/cl/538858, though I don't immediately see the bug.

CC @griesemer @adonovan

@findleyr findleyr added this to the Go1.22 milestone Dec 4, 2023
@cuonglm
Copy link
Member

cuonglm commented Dec 4, 2023

The only thing that changed is https://go.dev/cl/538858, though I don't immediately see the bug.

I think the problem is tparams := closing == token.RBRACK does not have the same semantic with requiresName boolean in cmd/compile/internal/syntax.

There's case that we call with closing == token.RBRACK but requiresName is false.

@findleyr
Copy link
Contributor Author

findleyr commented Dec 4, 2023

Thanks @cuonglm. That sounds likely.

I can't look into this further now, but I will note that based on the rate of internal reports, it must be possible to reproduce this panic without the invalid utf-8 produced by fuzzing.

@adonovan
Copy link
Member

adonovan commented Dec 4, 2023

Here's the token stream for the TypeParams portion:

18 [ 
19 IDENT A
20 ILLEGAL �
21 ( 
22 IDENT B
23 ] 
...

and the corresponding AST:

    19  .  .  .  .  TypeParams: *ast.FieldList {
    20  .  .  .  .  .  Opening: 1:18
    21  .  .  .  .  .  List: []*ast.Field (len = 2) {
    22  .  .  .  .  .  .  0: *ast.Field {
    23  .  .  .  .  .  .  .  Doc: nil
    24  .  .  .  .  .  .  .  Names: nil
    25  .  .  .  .  .  .  .  Type: *ast.ParenExpr {
    26  .  .  .  .  .  .  .  .  Lparen: 1:21
    27  .  .  .  .  .  .  .  .  X: *ast.Ident {
    28  .  .  .  .  .  .  .  .  .  NamePos: 1:22
    29  .  .  .  .  .  .  .  .  .  Name: "B"
    30  .  .  .  .  .  .  .  .  .  Obj: nil
    31  .  .  .  .  .  .  .  .  }
    32  .  .  .  .  .  .  .  .  Rparen: 1:23
    33  .  .  .  .  .  .  .  }
    34  .  .  .  .  .  .  .  Tag: nil
    35  .  .  .  .  .  .  .  Comment: nil
    36  .  .  .  .  .  .  }
    37  .  .  .  .  .  .  1: *ast.Field {
    38  .  .  .  .  .  .  .  Doc: nil
    39  .  .  .  .  .  .  .  Names: nil
    40  .  .  .  .  .  .  .  Type: *(obj @ 25)   // <--- shared subtree
    41  .  .  .  .  .  .  .  Tag: nil
    42  .  .  .  .  .  .  .  Comment: nil
    43  .  .  .  .  .  .  }
    44  .  .  .  .  .  }
    45  .  .  .  .  .  Closing: 1:28
    46  .  .  .  .  }

Notice there are two unnamed Fields of type (*B), sharing the same subtree (a DAG), hence the redeclaration. So although this bug appears only with the parser's object resolution, it's a symptom of a real bug in the parser output.

The reason for the second field is that the parseParameterList loop over fields creates a Field (name=A, type=nil) then another with (name="" type=B), then tries to distribute the type of B across the preceding typeless fields (A).

@griesemer
Copy link
Contributor

griesemer commented Dec 5, 2023

I fail to see how this is a release blocker. First of all, features related to object resolution by the parser have been deprecated. Second, object resolution should arguably not happen when there are (hard) syntactic errors; and there are many here (consider the trace output). Third, this is a really esoteric piece of code.

Shouldn't we just turn off object resolution by default?

@griesemer
Copy link
Contributor

griesemer commented Dec 5, 2023

For the reference, this is the syntax AST for the program

package p; func _[A\x9d(B](){}
     1  *syntax.File {
     2  .  Pragma: nil
     3  .  PkgName: p @ <unknown position>[:1:9]
     4  .  DeclList: []syntax.Decl (1 entries) {
     5  .  .  0: *syntax.FuncDecl {
     6  .  .  .  Pragma: nil
     7  .  .  .  Recv: nil
     8  .  .  .  Name: _ @ <unknown position>[:1:17]
     9  .  .  .  TParamList: []*syntax.Field (1 entries) {
    10  .  .  .  .  0: *syntax.Field {
    11  .  .  .  .  .  Name: A� @ <unknown position>[:1:19]
    12  .  .  .  .  .  Type: B @ <unknown position>[:1:22]
    13  .  .  .  .  }
    14  .  .  .  }
    15  .  .  .  Type: *syntax.FuncType {
    16  .  .  .  .  ParamList: nil
    17  .  .  .  .  ResultList: nil
    18  .  .  .  }
    19  .  .  .  Body: nil
    20  .  .  }
    21  .  }
    22  .  EOF: syntax.Pos {}
    23  .  GoVersion: ""
    24  }

The difference is not due to the different meaning of requiresNames (@cuonglm), it's because parameter fields are parsed differently. As a result, there's only one field here.

Created with (syntax) test case, with tracing enabled:

func TestIssue64534(t *testing.T) {
	f, _ := Parse(nil, strings.NewReader("package p; func _[A\x9d(B](){}"), func(error) {}, nil, 0)
	Fdump(os.Stdout, f)
}

@adonovan
Copy link
Member

adonovan commented Dec 6, 2023

I fail to see how this is a release blocker.

It is a new regression and a panic, so I understand why Rob added the label. And it's also not that esoteric: any character that isn't a valid token will trigger the problem, such as $ or #. Witness: https://go.dev/play/p/RHBvS7672HB?v=gotip

First [...] Third, this is a really esoteric piece of code. Shouldn't we just turn off object resolution by default?

Deprecated or not, and as appealing as it might be, we can't just drop syntax resolution.

Unfortunately gopls must run the parser and type checker on inputs in all kinds of half-baked edit states that most users wouldn't dream of exposing to the compiler. I don't think this bug should be hard to fix, now that we understand it.

@findleyr
Copy link
Contributor Author

findleyr commented Dec 6, 2023

@griesemer I marked this as a release blocker before I fully understood the nature of the problem. But it does seem concerning that the parser is producing invalid ASTs (reusing subtrees). Furthermore, while syntax resolution may be deprecated, analysis drivers must still use it as there are several existing analyzers that rely on it (and in any case, it is part of the go/analysis contract).

We went from hardly getting any internal crash reports over the past 6 months, to getting several a week. In the grand scheme of things, this crash rate may be acceptable (if it only occurs for invalid code, as seems to be the case), but it does seem worth fixing sooner rather than later.

We can remove the release blocker label, but keep this in the 1.22 milestone. Perhaps mark it ok-after-rc1?

I can't fix it this week, but would be happy to fix it early next week.

@griesemer
Copy link
Contributor

I'll have another look at it today. The fix may be easy, but this code is subtle.

@findleyr
Copy link
Contributor Author

findleyr commented Dec 6, 2023

This code is certainly subtle! If we don't get to this before the RC, we should probably prefer a very narrow fix.

Thanks for looking into it.

@gopherbot
Copy link

Change https://go.dev/cl/548395 mentions this issue: go/parser: fix panic in object resolution for invalid type parameter list

ezz-no pushed a commit to ezz-no/go-ezzno that referenced this issue Feb 18, 2024
…list

This change restores the original logic in parseParameterList to what
it was before CL 538858 (which caused the issue), not in exact wording
but in identical semantic meaning, and thus restores this function to
a state that we know was working fine.

However, the change keeps the improved error reporting introduced by
CL 538858. To keep the code changes somewhat minimal as we are close
to RC1, the improved error handling exists twice for now even though
it could be factored out.

Fixes golang#64534.

Change-Id: I0b7bbf74d28811e8aae74f838f2d424f78af1f38
Reviewed-on: https://go-review.googlesource.com/c/go/+/548395
Reviewed-by: Alan Donovan <adonovan@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

No branches or pull requests

5 participants