proposal: encoding/xml: Collect metadata, like order and line numbers when parsing XML #67038

Open
cague opened this issue Apr 25, 2024 · 5 comments

@cague

cague commented Apr 25, 2024

Proposal Details

When parsing or unmarshalling XML elements, collect each element's order and line number.

Use field tags, like below:

type Vehicle struct {
    Make      string  `xml:"make"`
    Model     string  `xml:"model"`
    Wheelbase float32 `xml:"wheelbase"`
    ProblemsXML
}

type ProblemsXML struct {
    XMLName  Name
    UnkElems []XMLElement    `xml:",any"`
    UnkAttrs []Attr          `xml:",any,attr"`
    Ooois    OutOfOrderItems `xml:",ooorder"`
    Order    int             `xml:",order"`
    Line     int             `xml:",line"`
}

Example:
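line 1  <Vehicle>
line 2    <make>Chevrolet</make>
line 3    <wheelbase>107.2</wheelbase>
line 4    <model>Corvette</model>
line 5  </Vehicle>
line 6  <Vehicle>
line 7    <make>Chevrolet</make>
line 8    <wheelbase>107.2</wheelbase>
line 9    <model>Corvette</model>
line 10 </Vehicle>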

Vehicle[0].Order == 1
Vehicle[0].Line == 1
Vehicle[1].Order == 2
Vehicle[1].Line == 6

I have the code; it is not very complicated.
Is the next step to wait for approval before opening a pull request?

@cague cague added the Proposal label Apr 25, 2024
@gopherbot gopherbot added this to the Proposal milestone Apr 25, 2024
@ianlancetaylor
Member

Please describe the new API you are suggesting. Thanks.

@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Apr 25, 2024
@cague
Author

cague commented Apr 25, 2024

There isn't a new API in the sense of new functions. The proposal adds new struct field tags:

xml:",order" // Stores the order of the Element within the Element's parent Element
xml:",line" // Stores the line number for the start of the Element

So when Go's regular Unmarshal function is used, it fills in those fields. See the ProblemsXML struct in the original post for the field tags in use.

@ianlancetaylor
Member

OK, can you write out the new documentation that would be added to the encoding/xml package? Thanks.

@cague
Author

cague commented Apr 25, 2024

The current documentation for Unmarshal is here:
https://pkg.go.dev/encoding/xml#Unmarshal
It contains a list of bullet points describing struct field tags.
Here's one of the existing bullet points:

  • If the XML element contains a sub-element that hasn't matched any of the above rules and the struct has a field with tag ",any", unmarshal maps the sub-element to that struct field.

The new doc would add more bullet points:

  • A struct field with tag ",line" receives the line number on which the element's start tag appears.
  • A struct field with tag ",order" receives the element's position (its order) within its parent element.

And then maybe also have a field tag for catching "out of order" elements.

  • A struct field with tag ",ooorder" records elements that appear in a different order than the fields of the struct passed to Unmarshal. The struct field must be of type OutOfOrderItems.

type OutOfOrderItems []OutOfOrderItem

type OutOfOrderItem struct {
    ElementOOO    string // Out-of-order child element that appears after ElementMarker in the parsed XML
    ElementMarker string // Element that appears before ElementOOO in the parsed XML but is defined after ElementOOO in the struct in which Unmarshal stores the parsed results
}

This is similar to the XML Schema xs:sequence element and can be used to help enforce order.
e.g.
msg = fmt.Sprintf("out of order: Element \"%v\" must be before Element \"%v\"", oooi.ElementOOO, oooi.ElementMarker)

@ianlancetaylor
Member

Thanks.
