Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

html/template: template.HTML strings escaped (Go 1.9 regression) #21844

Closed
bep opened this issue Sep 12, 2017 · 39 comments
Closed

html/template: template.HTML strings escaped (Go 1.9 regression) #21844

bep opened this issue Sep 12, 2017 · 39 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@bep
Copy link
Contributor

bep commented Sep 12, 2017

This has been reported by several users in the wild after we released Hugo 0.27 built on Go 1.9.

I have tried to make a simple standalone and failing test case, not yet successful.

I have, however, a failing test inside Hugo:
https://travis-ci.org/gohugoio/hugo/builds/274496313?utm_source=github_status&utm_medium=notification

It passes fine on Go 1.8.3, fails on Go 1.9 and tip.

See gohugoio/hugo#3878

The gist of it seems to be:

  • HTML strings put into the template context, i.e. data fetched from external resources such as Twitter (Tweets) and Instagram.
  • These are wrapped as template.HTML types to mark them as safe.

Note that we have had plenty of similar and passing tests in Hugo, so there is a corner case here that I don't fully understand.

I will go back to using Go 1.8.3 (where I can).

@mvdan mvdan added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 12, 2017
@mattn
Copy link
Member

mattn commented Sep 12, 2017

https://play.golang.org/p/dWGICIIIbn

As far as I can see, template.HTML works as intented. I'm not sure, is this {{< instagram xxx>}} replaced to {{ something }} which can be parsed by html/template?

@bep
Copy link
Contributor Author

bep commented Sep 12, 2017

@mattn I will have a look on this later; your conclusion is, however, a little simplistic.

@moorereason
Copy link

I'm assuming this is related to #20842.

Cc: @stjj89

@ianlancetaylor ianlancetaylor added this to the Go1.9.1 milestone Sep 12, 2017
@ianlancetaylor
Copy link
Contributor

Tentatively marking as 1.9.1 issue. Let's see what the problem and fix are.

@bep
Copy link
Contributor Author

bep commented Sep 12, 2017

A template parse tree example in this situation is:

HTMLShort: <blockquote class="htmlShort">{{.Site.Data.mydata.other | safeHTML | _html_template_htmlescaper | _html_template_htmlescaper}}</blockquote>

Which fits good with the description in #20842. I may find time to extract a repro tomorrow.

@stjj89
Copy link
Contributor

stjj89 commented Sep 12, 2017

This looks similar to #20842, where executing one template within another template confusing the autoescaper. Perhaps there was some edge case that I didn't catch?

I'm not too familiar with the internals of Hugo, so if you could come up with a repro, I would appreciate it.

@bep
Copy link
Contributor Author

bep commented Sep 13, 2017

I have added a fix (or a workaround) in Hugo:

gohugoio/hugo@2d613dd#diff-640125fcf55eaedde41a99d7d2f3ad96R308

clone := texttemplate.Must(templ.Clone())
tt.AddParseTree(withoutExt, clone.Tree)

Don't ask why we use AddParseTree, but doing a clone before adding it fixes the issue for us. The failing test case is fairly simple, but I have not been able to extract it into a main type of standalone repro. And as this now is OK for me, I will not continue trying, but I still consider this to be a bug and a regression in Go 1.9.

@mattn
Copy link
Member

mattn commented Sep 13, 2017

I'm not sure, I don't read code of hugo well.

s, err := d.Tmpl.Lookup("shortcodes/myShort.html").ExecuteToString(data)
assert.NoError(err)
assert.Contains(s, "<h1>Hi!</h1>")

s, err = d.Tmpl.Lookup("shortcodes/myShort").ExecuteToString(data)
clone := template.Must(templ.Clone())
tt.AddParseTree(withoutExt, clone.Tree)

If template is re-used to Parse() after Execute(), (AFAIK), it should be cloned.

@bep
Copy link
Contributor Author

bep commented Sep 13, 2017

@mattn you are reading the code wrong. But let us not spend time on the Hugo source code, that could take a while.

@mattn
Copy link
Member

mattn commented Sep 13, 2017

https://play.golang.org/p/JpKWWNv_mu

This is minimal code to reproduce.

@stjj89 @rsc Is this right behavior? or right usage?

@mattn
Copy link
Member

mattn commented Sep 13, 2017

diff --git a/src/html/template/template.go b/src/html/template/template.go
index 6a661bf..1be8869 100644
--- a/src/html/template/template.go
+++ b/src/html/template/template.go
@@ -217,6 +217,10 @@ func (t *Template) AddParseTree(name string, tree *parse.Tree) (*Template, error
 		return nil, err
 	}
 
+	if t.Tree == tree {
+		t.set[name] = t
+		return t, nil
+	}
 	t.nameSpace.mu.Lock()
 	defer t.nameSpace.mu.Unlock()
 	text, err := t.text.AddParseTree(name, tree)

fixes this. But I can't bet this is best.

@stjj89
Copy link
Contributor

stjj89 commented Sep 13, 2017

Thanks for the repro, @mattn.

The problem here is that two different templates ("A" and "B") that are associated in the same set share the same underlying parse tree, while having different escaper states. When A is escaped, it modifies that underlying parse tree, then marks itself as being escaped. When we get to B, we don't realize that the underlying parse tree has already been modified since its escaper state is not "escaped". B proceeds to modify the parse tree again, leading to over-escaping.

I'm not sure if this is valid use of AddParseTree. The function was added to html/template to mirror text/template functionality, where it was added to allow a template to be addressed from multiple associations:

By construction, a template may reside in only one association. If it's necessary to have a template addressable from multiple associations, the template definition must be parsed multiple times to create distinct *Template values, or must be copied with the Clone or AddParseTree method.

I think we should probably disallow aliasing of the same underlying text/template/parse.Tree via the html/template API. The way to do this would be to add a check in AddParseTree that ensures that the new tree pointer does not already exist in the template set, i.e. something like:

func (t *Template) AddParseTree(name string, tree *parse.Tree) (*Template, error) {
  for _, tmpl := range t.set {
    if tmpl.Tree == tree {
      return nil, fmt.Errorf("tree already exists in template set")
    }
  }
  ...
}

@bep
Copy link
Contributor Author

bep commented Sep 13, 2017

@stjj89 That sounds reasonable to me -- it would still be a potentially breaking change vs 1.8.3, but it will be a loud one, which is always good.

@mattn
Copy link
Member

mattn commented Sep 14, 2017

godoc says:
https://golang.org/pkg/text/template/

By construction, a template may reside in only one association. If it's necessary to have a template addressable from multiple associations, the template definition must be parsed multiple times to create distinct *Template values, or must be copied with the Clone or AddParseTree method.

So I wonder it is breaking change.

@bep
Copy link
Contributor Author

bep commented Sep 14, 2017

or must be copied with the Clone or AddParseTree method.

The above sentence does imply that the AddParseTree handles the copy for you. So, either it is

  1. a breaking change because the GoDoc is misleading.
  2. a breaking change because the implementation has been changed in a really subtle way without any warning or notes in the release notes.

I spent 5 hours debugging and fixing one application that got hit by this (because something broke). I'm in general very happy and impressed by Go and its quality, but let us call a spade for a spade: This is a breaking change. You may fix it in the documentation, but please don't close it with no action.

@mattn
Copy link
Member

mattn commented Sep 14, 2017

As @stjj89 mentioned, we can not make sure if this use of AddParseTree is valid. We can change to return error here, but I think everyone doesn't want to make text/template slow. Also I understand your frustration, and it is regrettable that it happened. So I suggest to fix doc to avoid another people will take many time to debug again.

@stjj89
Copy link
Contributor

stjj89 commented Sep 18, 2017

or must be copied with the Clone or AddParseTree method.

The above sentence does imply that the AddParseTree handles the copy for you.

That copying in the quoted statement seems to refer to "*Template values", not parse Trees, so I think it is actually accurate.

We can change to return error here, but I think everyone doesn't want to make text/template slow.

My proposed change will only affect html/template, not text/template. I don't expect AddParseTree to be used very often, so maybe the performance cost of searching through all associated templates won't be that bad.

+@robpike who originally wrote Add/AddParseTree, added the documentation above, and approved the html/template version of AddParseTree.

Rob, do you have an opinion on this? Should we prevent html/template users from shooting themselves in the foot like this with AddParseTree?

@robpike
Copy link
Contributor

robpike commented Sep 19, 2017

It sounds like a bug to me, and easy to fix, but I am not an expert on html/template. I work on text/template and do what's needed in html/template to get the tests to pass. If someone can create a simple test for html/template that catches the problem, we can see if the proposed fix works.

@gopherbot
Copy link

Change https://golang.org/cl/64770 mentions this issue: html/template: prevent aliasing of parse Trees via AddParseTree

@stjj89
Copy link
Contributor

stjj89 commented Sep 19, 2017

I've come up with a minimal proof-of-concept at https://play.golang.org/p/j6NtnrQSPq. I have uploaded one possible fix at https://golang.org/cl/64770.

I discussed this with @mikesamuel, the author of html/template, and we think that AddParseTree doesn't really make sense in html/template's API. This method was originally introduce specifically to allow html/template modify the internals of the underlying parse tree. It's not clear why a html/template user would need to directly modify the underlying parse Tree, given that html/template was designed to auto-magically transform it.

My guess is that AddParseTree is used to merge template sets, or to allow a single template to be addressable from multiple sets. Since we're considering breaking changes, perhaps we could completely replace AddParseTree with:

// AddTemplate associates the given template with t.
//
// It returns an error if t, the given template, or any associated
// template has already been executed.
func (t *Template) AddTemplate(tmpl *Template) (*Template, error)

While we're at it, we could also consider un-exporting Template's Tree field.

These changes will ensure that only html/template, not the user, gets to modify the underlying parse trees, while still allowing users to add templates to different sets. @bep, does this work for your current use case?

@rsc rsc modified the milestones: Go1.9.1, Go1.9.2 Oct 4, 2017
@rsc rsc reopened this Oct 13, 2017
@bep
Copy link
Contributor Author

bep commented Oct 13, 2017

@bep, does this work for your current use case?

Yes.

@ianlancetaylor ianlancetaylor removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 13, 2017
@rsc
Copy link
Contributor

rsc commented Oct 13, 2017

@bep, the CL just reports an error where there wasn't one before. Why does that make your Hugo users happy?

@rsc
Copy link
Contributor

rsc commented Oct 13, 2017

Now I see that @bep has a workaround already.

I don't understand why this would be considered a "fix" for Go 1.9.2:

f$ go1.8 run x.go
#1: HTMLShort: <blockquote class='htmlShort'><foo/></blockquote>
#2: HTMLShort: <blockquote class='htmlShort'><foo/></blockquote>
f$ go1.9 run x.go
#1: HTMLShort: <blockquote class='htmlShort'><foo/></blockquote>
#2: HTMLShort: <blockquote class='htmlShort'>&lt;foo/&gt;</blockquote>
f$ go run x.go
AddParseTree: html/template: cannot add parse tree that template "foo.html" already references
#1: HTMLShort: <blockquote class='htmlShort'><foo/></blockquote>
f$ 
f$ 
f$ cat x.go
package main

import (
	"fmt"
	"html/template"
	"os"
)

func main() {
	tpl := template.New("foo.html").Funcs(template.FuncMap{
		"safeHTML": func(s interface{}) template.HTML {
			return template.HTML(fmt.Sprint(s))
		},
	})

	tpl.Parse("HTMLShort: <blockquote class='htmlShort'>{{ . | safeHTML }}</blockquote>\n")
	if _, err := tpl.AddParseTree("foo", tpl.Tree); err != nil {
		fmt.Printf("AddParseTree: %v\n", err)
	}
	
	fmt.Printf("#1: ")
	tpl.Lookup("foo.html").Execute(os.Stdout, "<foo/>")

	if t := tpl.Lookup("foo"); t != nil {
		fmt.Printf("#2: ")
		if err := t.Execute(os.Stdout, "<foo/>"); err != nil {
			fmt.Printf("EXEC: %v\n", err)
		}
	}
}

The fix is not restoring the old Go 1.8 behavior, so I don't see a reason to put it in Go 1.9.2, especially since apparently Hugo may have been the only program doing this, and Hugo has a different workaround.

@bep
Copy link
Contributor Author

bep commented Oct 13, 2017

@rsc Hugo is fine. We don't need this fix. As I understand the fix, we would still need to apply the current workaround. Which is fine.

Whether this qualifies as a 1.9.2 fix is a judgement you have to take: There may be unknown bugs in the wild from the Go 1.9 release because of this which would get an Error (better than not knowing) with this patch.

@ianlancetaylor
Copy link
Contributor

As I understand it, which I don't, the argument in favor of putting this into 1.9.2 is that code that used to do something relatively harmless now does something horrible. The fix (CL 64770) would change it to return an error rather than do something horrible, so that might overall be worth doing.

@ianlancetaylor ianlancetaylor added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Oct 13, 2017
@bep
Copy link
Contributor Author

bep commented Oct 13, 2017

the argument in favor of putting this into 1.9.2 is that code that used to do something relatively harmless now does something horrible.

I'm not sure about the "horrible" part, but in general you are right. And this is in an area likely not covered by tests. Hugo didn't know before end users started to shout.

@rsc
Copy link
Contributor

rsc commented Oct 15, 2017

I'm sorry, but I am still a bit confused. Why did we decide to disallow the old uses of AddParseTree? If they were working before, wouldn't the right fix (especially for Go 1.9.2) to be to keep them working? So far I've heard basically "we didn't mean to allow that", but in general that's been true of many things about the Go libraries, and our default position is to accept what we did and preserve the behavior. Is the old behavior somehow unsafe?

@rsc
Copy link
Contributor

rsc commented Oct 17, 2017

I'm going to kick this back into Go 1.10. I think we still need to discuss whether the new error is correct or if we should make the old behavior keep working.

@rsc rsc modified the milestones: Go1.9.2, Go1.10 Oct 17, 2017
@rsc
Copy link
Contributor

rsc commented Oct 23, 2017

Ping @stjj89

@stjj89
Copy link
Contributor

stjj89 commented Oct 23, 2017

CL 64770 really just stops html/template users from shooting themselves in the foot by aliasing parse trees. Since the existing html/template docs do not say anything about the relationship between parse trees and html/template Templates, the existing behavior is technically correct, albeit unintuitive. I'm ok with reverting the change and leaving things as they are.

@rsc
Copy link
Contributor

rsc commented Oct 24, 2017

@stjj89 yes, I understand what CL 64770 does. My question is why the behavior changed from Go 1.8 to Go 1.9 and whether we should be working to move back to the Go 1.8 behavior. (See top of shell transcript in #21844 (comment).)

Was the change in Go 1.9 intentional? If not, do we know what caused it and how to revert to the old Go 1.8 behavior?

@stjj89
Copy link
Contributor

stjj89 commented Oct 24, 2017

@rsc sorry, I missed that part of your earlier comment. It turns out that this change in behavior from 1.8 to 1.9 is a side effect of CL 37880. Specifically, it removed a section of ensurePipelineContains that prevents duplicate escaping commands from being inserted into a pipeline.

This change was made with the assumption that duplicate escapers only ever occur when a predefined escapers (e.g. html, urlquery) are present at the end of a pipeline. This assumption seems to hold, since html/template prevents already-escaped templates in a template set from being escaped again. This particular usage of AddParseTree violates this assumption by bypassing the mechanisms that html/template uses to prevent its parse trees from being re-escaped.

I can restore the 1.8 behavior by changing the logic in ensurePipelineContains to account for duplicate escapers anywhere in the pipeline. Does that sound like the right thing to do?

@rsc
Copy link
Contributor

rsc commented Oct 25, 2017

@stjj89 Can html/template track whether a particular piece of syntax has been escaped or not? It seems like that would just be an extra field in the Template struct or a map[something]bool on the side. In any event, then AddParseTree should be able to tell whether a particular piece of syntax needs escaping (because it's new) or not (because it came out of an existing html/template's tree).

P.S. Thanks for linkifying CL references, but note that you have to put https:// at the beginning of URLs in markdown. :-)

@stjj89
Copy link
Contributor

stjj89 commented Oct 25, 2017

@stjj89 Can html/template track whether a particular piece of syntax has been escaped or not? It seems like that would just be an extra field in the Template struct or a map[something]bool on the side. In any event, then AddParseTree should be able to tell whether a particular piece of syntax needs escaping (because it's new) or not (because it came out of an existing html/template's tree).

By "piece of syntax", I assume you mean a template action (i.e. something bound by delimiters, like {{.X}}). We might be able to do this if we use the position of each ActionNode as a unique identifier for the action it represents, and keep a map of booleans like you described.

Per-action book-keeping might be overkill, though. Each Template and its parse tree are either completely escaped or not. We could just key the map with parse tree pointers and use the value of that entry to decide whether or not to escape the parse tree.

P.S. Thanks for linkifying CL references, but note that you have to put https:// at the beginning of URLs in markdown. :-)

Ah, thanks. I didn't realize. I've fixed the URLs in my two previous posts.

@stjj89
Copy link
Contributor

stjj89 commented Nov 10, 2017

Per-action book-keeping might be overkill, though. Each Template and its parse tree are either completely escaped or not. We could just key the map with parse tree pointers and use the value of that entry to decide whether or not to escape the parse tree.

It just occurred to me that any kind of template-level book-keeping will not work if the underlying parse tree is aliased by two different Template values. See https://play.golang.org/p/uHNbVnTa7T.

It seems to me that the only way to solve this problem in general is to attach a flag to the parse tree itself that indicates whether that tree has already been escaped. However, such a html/template-specific flag doesn't seem to belong to the text/template/parse API. Perhaps we could add an exported interface{} field to the parse Tree that html/template can use to store this flag?

@rsc rsc changed the title template/html: template.HTML strings escaped (Go 1.9 regression) html/template: template.HTML strings escaped (Go 1.9 regression) Dec 13, 2017
@rsc
Copy link
Contributor

rsc commented Dec 13, 2017

I'm pretty frustrated that we didn't manage to understand this well enough to restore the Go 1.8 behavior, but at this point I'm willing to just cut our losses and move on. We need to find a real owner for this package at some point.

@rsc rsc closed this as completed Dec 13, 2017
@gopherbot
Copy link

Change https://golang.org/cl/83920 mentions this issue: html/template: check for duplicates when inserting escapers

@stjj89
Copy link
Contributor

stjj89 commented Dec 14, 2017

Sorry, this issue got swept under the rug. I'm giving this one more shot by reverting my AddParseTree change and adding a new fix that restores Go1.8 behavior by checking for duplicate escapers when rewriting pipelines. I hope it's not too late to consider this fix.

@rsc
Copy link
Contributor

rsc commented Jan 9, 2018

Thanks very much.

@rsc rsc reopened this Jan 9, 2018
@golang golang locked and limited conversation to collaborators Jan 10, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

9 participants