encoding/json: does not recognise semicolon as a valid field name #39189

Closed
kolatat opened this issue May 21, 2020 · 9 comments
Labels: FrozenDueToAge, NeedsInvestigation

@kolatat

kolatat commented May 21, 2020

What version of Go are you using (go version)?

go1.14 windows/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

windows amd64

What did you do?

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	encoded := []byte(`{";": "World!"}`)
	type MyObject struct {
		Hello string `json:";"`
	}
	var decoded MyObject
	if err := json.Unmarshal(encoded, &decoded); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%+v", decoded)
}

What did you expect to see?

{Hello:World!}

What did you see instead?

{Hello:}
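A possible workaround on releases where the `json:";"` tag is not honoured is to decode through a map and copy the value over by hand. This is only a sketch: `decodeViaMap` is a hypothetical helper name, and the struct deliberately carries no tag so the result does not depend on which characters the tag parser accepts.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MyObject mirrors the struct from the report, but without a ";" tag.
type MyObject struct {
	Hello string
}

// decodeViaMap (hypothetical helper) reads the ";" key through a map,
// bypassing struct tag parsing entirely.
func decodeViaMap(data []byte) (MyObject, error) {
	var raw map[string]string
	if err := json.Unmarshal(data, &raw); err != nil {
		return MyObject{}, err
	}
	return MyObject{Hello: raw[";"]}, nil
}

func main() {
	obj, err := decodeViaMap([]byte(`{";": "World!"}`))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%+v\n", obj)
}
```

The cost is that every field has to be copied manually, so this only makes sense for a small number of awkward keys.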

@natebwangsut

@gopherbot

Change https://golang.org/cl/234818 mentions this issue: encoding/json: allow semicolon in field key / struct tag

cagedmantis added the NeedsInvestigation label May 21, 2020
cagedmantis added this to the Backlog milestone May 21, 2020
@cagedmantis
Contributor

/cc @rsc @dsnet @bradfitz @mvdan

@mvdan
Member

mvdan commented May 21, 2020

I don't see a reason why not, but where do we draw the line on which characters are OK and which are not? And why is the semicolon part of the first group?

I'm not familiar with the history of that piece of code, so I really am asking. I think that needs an answer before we review a change.

@natebwangsut

We checked the JSON spec (RFC 7159) to confirm that our "bug" is valid, and it seems to us that the spec treats a semicolon as an ordinary character.

https://tools.ietf.org/html/rfc7159
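The claim about the spec can be checked against the package itself: `json.Valid` and a map-based decode both accept punctuation-only object keys. The sketch below assumes a hypothetical helper `objectKeys`; only the struct-tag path is affected by the restriction discussed in this issue.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sort"
)

// objectKeys (hypothetical helper) returns the sorted keys of a JSON object.
func objectKeys(data []byte) ([]string, error) {
	if !json.Valid(data) {
		return nil, errors.New("input is not valid JSON")
	}
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(data, &obj); err != nil {
		return nil, err
	}
	keys := make([]string, 0, len(obj))
	for k := range obj {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys, nil
}

func main() {
	// Both ";" and "'" are accepted as object keys.
	keys, err := objectKeys([]byte(`{";": 1, "'": 2}`))
	fmt.Println(keys, err)
}
```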

@dsnet
Member

dsnet commented May 22, 2020

> I don't see a reason why not, but where do we draw the line on which characters are OK and which are not?

Back in February of 2011, the entirety of the struct tag was used as the JSON key name. It seems that the name syntax was restricted (https://golang.org/cl/4173061) so that the tags could in theory be used for other purposes like protocol buffers (#1520).

Later in June of 2011, a well-defined grammar for application-specific struct tags was defined and formally implemented in the reflect package (https://golang.org/cl/4645069).
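To illustrate the point about the reflect grammar: `reflect.StructTag.Lookup` returns the value for the `json` key untouched, so a `;` survives at that layer and any rejection happens later, inside encoding/json. A minimal sketch, where `jsonTag` is a hypothetical helper name:

```go
package main

import (
	"fmt"
	"reflect"
)

// MyObject is the struct from the original report.
type MyObject struct {
	Hello string `json:";"`
}

// jsonTag (hypothetical helper) returns the raw json tag value of a field.
func jsonTag(v interface{}, field string) (string, bool) {
	t := reflect.TypeOf(v)
	if t == nil || t.Kind() != reflect.Struct {
		return "", false
	}
	f, ok := t.FieldByName(field)
	if !ok {
		return "", false
	}
	// Lookup applies only the key:"value" grammar; it does not
	// interpret the value in any package-specific way.
	return f.Tag.Lookup("json")
}

func main() {
	tag, ok := jsonTag(MyObject{}, "Hello")
	fmt.Printf("%q %v\n", tag, ok)
}
```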

It seems to me that the restricted set of valid characters is an artifact from a previous era to work around a limitation that no longer applies today.

The only restriction I can imagine for the character set would be a comma (,), since it is used to delimit the set of extra tag attributes that come after the name. In theory we could define a more complex grammar where someone could put a quoted string to encode any arbitrary name that is valid UTF-8.
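The comma restriction is easy to see in a sketch of how a tag value breaks into a name and its attributes. `splitTag` below is a hypothetical helper written for illustration, not the package's own code; it assumes the name is everything before the first comma.

```go
package main

import (
	"fmt"
	"strings"
)

// splitTag (hypothetical helper) splits a json tag value into the field
// name and the comma-separated attributes that follow it.
func splitTag(tag string) (name string, opts []string) {
	parts := strings.Split(tag, ",")
	return parts[0], parts[1:]
}

func main() {
	// A semicolon in the name is unambiguous.
	name, opts := splitTag(";,omitempty")
	fmt.Printf("%q %q\n", name, opts)

	// A comma in the name is not: "a,b" cannot be told apart from
	// the name "a" followed by an attribute "b".
	name, opts = splitTag("a,b,omitempty")
	fmt.Printf("%q %q\n", name, opts)
}
```

Supporting a comma inside a name would therefore need some form of quoting or escaping in the tag grammar.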

If the grammar is opened up to other characters, we'll need to consider how the equalFold logic is supposed to operate.
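For context on why folding matters: `json.Unmarshal` prefers an exact key match but also accepts a case-insensitive one, so any wider name grammar has to say what "case-insensitive" means for the new characters. The sketch below only demonstrates the documented behaviour with a plain ASCII name; `Greeting` and `decodeGreeting` are hypothetical names, and the package's internal folding logic is not shown.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Greeting uses a lower-case tag name to show case-insensitive matching.
type Greeting struct {
	Hello string `json:"hello"`
}

// decodeGreeting (hypothetical helper) decodes data into a Greeting.
func decodeGreeting(data []byte) (Greeting, error) {
	var g Greeting
	err := json.Unmarshal(data, &g)
	return g, err
}

func main() {
	// The key "HELLO" differs from the tag "hello" only by case,
	// so the decoder still fills the field.
	g, err := decodeGreeting([]byte(`{"HELLO": "World!"}`))
	fmt.Printf("%+v %v\n", g, err)
}
```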

@seankhliao
Member

Relaxing the restrictions would also fix #22518 and #35287.

@gopherbot

Change https://golang.org/cl/247059 mentions this issue: encoding/json: allow add quotes in field key / struct tag

@zaneChou1
Contributor

zaneChou1 commented Aug 6, 2020

> Change https://golang.org/cl/247059 mentions this issue: encoding/json: allow add quotes in field key / struct tag

I read the JSON standard (RFC 7159) and found that the single quote ('), which the Unicode character table classifies as ASCII punctuation, should be allowed as a valid field name in JSON.
I submitted a change: https://go-review.googlesource.com/c/go/+/247059/

@mvdan
Member

mvdan commented Sep 17, 2020

Thanks @dsnet. It seems like incremental steps like allowing semicolon characters should be safe and trivial, so I'm approving that CL.

If anyone wants to work on other characters, please file a separate issue. But a better solution would be a generic one, rather than adding more and more exceptions. I think we should use #22518 for the generic solution. If anyone wants to work on that, be aware of Joe's comment above in this issue (#39189).
