Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/xml: EncodeToken doesn't indent Comment #56340

Open
hlcfan opened this issue Oct 20, 2022 · 7 comments · May be fixed by #65737
Open

encoding/xml: EncodeToken doesn't indent Comment #56340

hlcfan opened this issue Oct 20, 2022 · 7 comments · May be fixed by #65737
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@hlcfan
Copy link

hlcfan commented Oct 20, 2022

What version of Go are you using (go version)?

$ go version
1.19.2

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/***/Library/Caches/go-build"
GOENV="/Users/***/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/***/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/***/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.19.2"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/***/workspace/go/src/go.mod"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/4d/n7jcw1x95zxc_cchskx53f4c0000gn/T/go-build2258919332=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
)

func main() {
	input := `<book> <author>Fred</author><!-- <price>20</price><currency>USD</currency> --><isbn>23456</isbn> </book>`
	decoder := xml.NewDecoder(bytes.NewReader([]byte(input)))
	buf := new(bytes.Buffer)
	encoder := xml.NewEncoder(buf)
	encoder.Indent("", " ")

tokenize:
	for {
		token, err := decoder.Token()

		switch {
		case err == io.EOF:
			encoder.Flush()
			break tokenize
		case err != nil:
			fmt.Printf("failed to tokenize xml, error: %s", err)
		default:
			err := encoder.EncodeToken(token)
			if err != nil {
				fmt.Printf("failed to encode xml, error: %s\n", err)
			}

		}
	}

	fmt.Println(buf.String())
}

What did you expect to see?

Comment is put as a new line.

<book> 
 <author>Fred</author>
 <!-- <price>20</price><currency>USD</currency> -->
 <isbn>23456</isbn> 
</book>

What did you see instead?

Comment is put at the same line as previous element.

<book> 
 <author>Fred</author><!-- <price>20</price><currency>USD</currency> -->
 <isbn>23456</isbn> 
</book>

What to change?

I think we can add p.writeIndent(1) and p.writeIndent(-1) before and after writing the comment element, https://github.com/golang/go/blob/go1.19.2/src/encoding/xml/marshal.go#L220.

@dr2chase dr2chase added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 20, 2022
@dr2chase
Copy link
Contributor

@rsc

@seankhliao seankhliao added this to the Unplanned milestone Nov 19, 2022
@Faetu
Copy link

Faetu commented Feb 15, 2024

Nothing happens since nov 2022. But I need this fix. Is there a workaround for this?

@Faetu Faetu linked a pull request Feb 15, 2024 that will close this issue
@gopherbot
Copy link

Change https://go.dev/cl/564575 mentions this issue: encoding/xml: add indent to xml.Comment

@ianlancetaylor
Copy link
Contributor

Experience tells us that almost every change to the XML package breaks some existing code. Why is this change important?

@Faetu
Copy link

Faetu commented Feb 16, 2024

@ianlancetaylor
In certain cases, such as mine, comments need to be on their own lines with proper indentation. With this adjustment, you can maintain that formatting preference while also retaining the option to include comments inline. My modification doesn't remove any existing functionality; it simply introduces a new option.

I've tested this change in my local codebase, and it functions as expected. If you already have code that utilizes the encoding/xml package, the only difference you may notice is that when using the Indent function, comments will appear on separate lines. No other behavior is affected.

@ianlancetaylor
Copy link
Contributor

@Faetu Thanks, but that doesn't quite answer my question. Why do comments need to be on their own lines? XML is not a lined-oriented format.

My modification doesn't remove any existing functionality; it simply introduces a new option.

I'm not sure what you mean by "a new option." It seems to me that https://go.dev/cl/564575 will change all XML code that includes comments. It's true that the change shouldn't matter. But then, if the change doesn't matter, why make it?

@Faetu
Copy link

Faetu commented Feb 19, 2024

Hi @ianlancetaylor

sry for my late response.

Why do comments need to be on their own lines?

In my case, I need to present the generated XML to a human with comments in their own lines. However, there are instances where I must process provided XML and manage comments relative to the elements. To achieve the necessary handling, comments must be formatted in their own lines.

In other scenarios where Python is utilized, comments are typically generated in this manner by default.

I'm not sure what you mean by "a new option." It seems to me that https://go.dev/cl/564575 will change all XML code that includes comments.

I referred to the 'new option' as the capability to insert comments in their own lines with proper indentation. With this modification, I can now not only achieve this formatting but also retain the previous behavior, where comments are placed immediately after EndElement or between StartElement and EndElement.

For example:

if err := xmlEncoder.EncodeToken(xml.StartElement{Name: xml.Name{Local: "MyFirstElem"}}); err != nil {
return err
}
if err := xmlEncoder.EncodeToken(xml.EndElement{Name: xml.Name{Local: "MyFirstElem"}}); err != nil {
return err
}
if err := xmlEncoder.EncodeToken(xml.Comment("Comment")); err != nil {
return err
}
if err := xmlEncoder.EncodeToken(xml.StartElement{Name: xml.Name{Local: "MySecElem"}}); err != nil {
return err
}
if err := xmlEncoder.EncodeToken(xml.EndElement{Name: xml.Name{Local: "MySecElem"}}); err != nil {
return err
}

result (with this change):

<MyFirstElem></MyFirstElem>
<!-- Comment -->
<MySecElem></MySecElem>

For the old behaviour:

if err := xmlEncoder.EncodeToken(xml.StartElement{Name: xml.Name{Local: "MyFirstElem"}}); err != nil {
return err
}
if err := xmlEncoder.EncodeToken(xml.EndElement{Name: xml.Name{Local: "MyFirstElem"}}); err != nil {
return err
}
// disable indent
xmlEncoder.Indent("","")
if err := xmlEncoder.EncodeToken(xml.Comment("Comment")); err != nil {
return err
}
// enable indent again
xmlEncoder.Indent("","    ")
if err := xmlEncoder.EncodeToken(xml.StartElement{Name: xml.Name{Local: "MySecElem"}}); err != nil {
return err
}
if err := xmlEncoder.EncodeToken(xml.EndElement{Name: xml.Name{Local: "MySecElem"}}); err != nil {
return err
}

Result:

<MyFirstElem></MyFirstElem><!-- Comment -->
<MySecElem></MySecElem>

It's true that the change shouldn't matter. But then, if the change doesn't matter, why make it?

Yes, it shouldn't matter for most cases. However, it does matter in my case, as well as in the case of @hlcfan and perhaps others. Additionally, in Python, for example, the default formatting ensures that comments are placed on a new line. With this change, we now have a convenient and straightforward method to add XML comments on separate lines, while still retaining the ability to maintain the old behavior.

best regards

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants