|
|
Created:
12 years, 7 months ago by MikeSamuel Modified:
12 years, 7 months ago Reviewers:
CC:
nigeltao, golang-dev Visibility:
Public. |
Descriptionexp/template/html: Implement grammar for JS.
This transitions into a JS state when entering any attribute whose
name starts with "on".
It does not yet enter a JS on entry into a <script> element as script
element handling is introduced in another CL.
Patch Set 1 #Patch Set 2 : diff -r 4189b98b96cf https://go.googlecode.com/hg/ #Patch Set 3 : diff -r 4189b98b96cf https://go.googlecode.com/hg/ #Patch Set 4 : diff -r 4189b98b96cf https://go.googlecode.com/hg/ #
Total comments: 65
Patch Set 5 : diff -r ad10be38bd25 https://go.googlecode.com/hg/ #
Total comments: 32
Patch Set 6 : diff -r 070b7cc84e48 https://go.googlecode.com/hg/ #
Total comments: 4
Patch Set 7 : diff -r 47d429aad39c https://go.googlecode.com/hg/ #
Total comments: 6
Patch Set 8 : diff -r f845253df880 https://go.googlecode.com/hg/ #
MessagesTotal messages: 9
http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/con... File src/pkg/exp/template/html/context.go (right): http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/con... src/pkg/exp/template/html/context.go:60: // stateJSRegex occurs inside a JavaScript regex literal. Go spells "regexp" with a "p". http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/con... src/pkg/exp/template/html/context.go:64: // stateJSLineCmt occurs inside a JavaScript /* line comment */. "Line comment" should be // not /**/. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/con... src/pkg/exp/template/html/context.go:156: type jsCtx bool I'd s/bool/uint8/. The storage size is the same, but I would prefer the consistency with state, delim and urlPart. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... File src/pkg/exp/template/html/escape.go (right): http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:65: sanitizers := arr[:1] The function body may read better with a shorter variable name: // s is the list of sanitizer functions to end the pipeline. s := make([]string, 0, 2) and replace sanitizers[0] = "exp_template_html_urlfilter" with s = append(s, "exp_template_html_urlfilter") http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:86: sanitizers = sanitizers[0:2] You can drop the "0". Alternatively: sanitizers = append(sanitizers, "html") http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:102: // If the pipe already has some of the sanitizers, do not interfere. Can you expand on the "do not interfere"? For example, IIUC, if the sanitizers are ["a1", "b1", "c1"] and the pipeline has commands ["c0", "c1"] then the merged result is ["c0", "a1", "b1", "c1"]. A comment to that effect would be helpful. The code below is pretty tricky. Can you could extract a function that took a []string of sanitizer function names and fixed up a pipe's commands to end with those sanitizers (and wrote some tests for that extracted function)? http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:296: } else if len(canonAttrName) >= 2 && "on" == canonAttrName[:2] { strings.HasPrefix(canonAttrName, "on"). http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:360: default: I think it'd be clearer if this was `case '/':` and there was a default case whose body was `panic("unreachable")`. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:380: quoteAndEsc := "\"\\" quoteAndEsc := `"\` http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:395: errStr: fmt.Sprintf("Broken escape sequence in JS string: %s", s), s/Broken/broken/. I'd also s/%s/%q/. Also, can we get a test case with a broken escape sequence? Similarly at line 430. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:407: var blockCommentEnd = []byte("*/") I'd move this var to be closer to func tJSBlockCmt. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:441: return c, nil Is it an error if inCharset is true here? http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:467: return c, s[i+1:] "+1" is not right: "\u2028" is more than one byte. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... File src/pkg/exp/template/html/escape_test.go (right): http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape_test.go:20: Z *interface{} In Go, it's really unusual to see a pointer to an interface. I think a *int would work just as well here. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.go File src/pkg/exp/template/html/js.go (right): http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:1: package html Missing copyright header. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:20: // fail on any known useful programs. It is based on the draft Single space after a full stop. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:22: // http://www.mozilla.org/js/language/js20-2000-07/rationale/syntax.html That page redirects to a 404 for me. Do you know of an updated link? http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:26: // Trim space from the right. This could be i := len(bytes.TrimRight(s, "\t\n\f\r ")) http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:44: // ++ and -- are not Are not what? http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:50: if ((i - start) & 1) == 1 { Outer parens are unnecessary. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:185: for k, r := range s { Nit: I'd s/k/j/. Similarly below at line 251. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:234: // expression literal. /foo{{x}}bar/ matches the Unfinished comment. Also, single spaces after a full stop. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:235: func jsRegexEscaper(args ...interface{}) string { It looks to me like jsStrEscaper and jsRegexEscaper could be refactored to use common code with different replacement tables, but I'm not sure if that'll be an improvement. I may take a crack at it in a follow-up CL. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:324: // isJSIdentPart is true if the given rune is a Java Unfinished comment. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... File src/pkg/exp/template/html/js_test.go (right): http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:1: //-*- mode: go-mode; indent-tabs-mode: t; tab-width: 8; -*- Drop this; add the missing copyright header. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:15: Delete the blank line. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:16: // Statement terminators precede regexs. s/regexs/regexps/. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:75: for _, test := range tests { escape_test.go uses the names "tc" and "testCases" instead of "test" and "tests". Please make the two files consistent. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:77: t.Errorf("Expected %s given %q", test.jsCtx, test.s) The usual Go language is something like if got := nextJsCtx(etc); got != test.jsCtx { t.Errorf("%q: want %s got %s", test.s, test.jsCtx, got) } http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:156: agolden := "[" + test.js[i:j] + "]" This could be agolden := "[" + strings.TrimSpace(test.js) + "]" I'd also rename s/agolden/want/. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:214: func TestJSRegexEscaper(t *testing.T) { s/Regex/Regexp/. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:259: jsStrEscaper("The quick, brown fox jumps over the lazy dog.") Benchmarks should loop from 0 up to b.N. See "godoc testing" for an example.
Sign in to reply to this message.
Incremental diffs at http://codereview.appspot.com/4968052/diff2/5001:9001/src/pkg/exp/template/ht... http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/con... File src/pkg/exp/template/html/context.go (right): http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/con... src/pkg/exp/template/html/context.go:60: // stateJSRegex occurs inside a JavaScript regex literal. On 2011/08/30 06:37:31, nigeltao wrote: > Go spells "regexp" with a "p". Done. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/con... src/pkg/exp/template/html/context.go:64: // stateJSLineCmt occurs inside a JavaScript /* line comment */. On 2011/08/30 06:37:31, nigeltao wrote: > "Line comment" should be // not /**/. Done. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/con... src/pkg/exp/template/html/context.go:156: type jsCtx bool On 2011/08/30 06:37:31, nigeltao wrote: > I'd s/bool/uint8/. The storage size is the same, but I would prefer the > consistency with state, delim and urlPart. Done. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... File src/pkg/exp/template/html/escape.go (right): http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:65: sanitizers := arr[:1] On 2011/08/30 06:37:31, nigeltao wrote: > The function body may read better with a shorter variable name: > // s is the list of sanitizer functions to end the pipeline. > s := make([]string, 0, 2) > > and replace > sanitizers[0] = "exp_template_html_urlfilter" > with > s = append(s, "exp_template_html_urlfilter") Done. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:86: sanitizers = sanitizers[0:2] On 2011/08/30 06:37:31, nigeltao wrote: > You can drop the "0". > > Alternatively: > sanitizers = append(sanitizers, "html") Done with other append changes. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:102: // If the pipe already has some of the sanitizers, do not interfere. On 2011/08/30 06:37:31, nigeltao wrote: > Can you expand on the "do not interfere"? For example, IIUC, if the sanitizers > are ["a1", "b1", "c1"] and the pipeline has commands ["c0", "c1"] then the > merged result is ["c0", "a1", "b1", "c1"]. A comment to that effect would be > helpful. > > The code below is pretty tricky. Can you could extract a function that took a > []string of sanitizer function names and fixed up a pipe's commands to end with > those sanitizers (and wrote some tests for that extracted function)? Done. Extracted out into requirePipelineTail. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:296: } else if len(canonAttrName) >= 2 && "on" == canonAttrName[:2] { On 2011/08/30 06:37:31, nigeltao wrote: > strings.HasPrefix(canonAttrName, "on"). Done. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:360: default: On 2011/08/30 06:37:31, nigeltao wrote: > I think it'd be clearer if this was `case '/':` and there was a default case > whose body was `panic("unreachable")`. Done. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:380: quoteAndEsc := "\"\\" On 2011/08/30 06:37:31, nigeltao wrote: > quoteAndEsc := `"\` Done in two places. Flipped the order of characters because otherwise it breaks emacs syntax highlighting horribly. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:395: errStr: fmt.Sprintf("Broken escape sequence in JS string: %s", s), On 2011/08/30 06:37:31, nigeltao wrote: > s/Broken/broken/. I'd also s/%s/%q/. > > Also, can we get a test case with a broken escape sequence? > > Similarly at line 430. Done in 2 places and added tests to TestErrors. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:407: var blockCommentEnd = []byte("*/") On 2011/08/30 06:37:31, nigeltao wrote: > I'd move this var to be closer to func tJSBlockCmt. Done. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:441: return c, nil On 2011/08/30 06:37:31, nigeltao wrote: > Is it an error if inCharset is true here? Yes. I could add another context field to track that, but I think that is diminishing returns. Added a return check here and testcase. The error message here and in the string transition function above now use the original []byte to provide more context. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:467: return c, s[i+1:] On 2011/08/30 06:37:31, nigeltao wrote: > "+1" is not right: "\u2028" is more than one byte. Fixed to not treat the line terminator as part of the comment which flushed out a bug with space handling in nextJSCtx. This is more in accord with the section 7.4 language "However, the LineTerminator at the end of the line is not considered to be part of the single-line comment; it is recognised separately by the lexical grammar and becomes part of the stream of input elements for the syntactic grammar." which accounts for the fact that alert(function () { return // 42 }()) alerts undefined since a return statement is a restricted production so cannot be followed by an expression on a different logical line. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... File src/pkg/exp/template/html/escape_test.go (right): http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape_test.go:20: Z *interface{} On 2011/08/30 06:37:31, nigeltao wrote: > In Go, it's really unusual to see a pointer to an interface. I think a *int > would work just as well here. Done. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.go File src/pkg/exp/template/html/js.go (right): http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:1: package html On 2011/08/30 06:37:31, nigeltao wrote: > Missing copyright header. Done here and in js_test.go. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:20: // fail on any known useful programs. It is based on the draft On 2011/08/30 06:37:31, nigeltao wrote: > Single space after a full stop. Fixed in 2 places. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:22: // http://www.mozilla.org/js/language/js20-2000-07/rationale/syntax.html On 2011/08/30 06:37:31, nigeltao wrote: > That page redirects to a 404 for me. Do you know of an updated link? Hmm. http://web.archive.org/web/20070717142515/http://www.mozilla.org/js/language/... is what I was referring to: "Syntactic Resynchronization Syntactic resynchronization occurs when the lexer needs to find the end of a block (the matching }) in order to skip a portion of a program written in a future version of JavaScript. Ordinarily this would not be a problem, but regular expressions complicate matters because they make lexing dependent on parsing. The rules for recognizing regular expression literals must be changed for those portions of the program. The rule below might work, or a simplified parse might be performed on the input to determine the locations of regular expressions. This is an area that needs further work. During syntax resynchronization JavaScript 2.0 determines whether a / starts a regular expression or is a division (or /=) operator solely based on the previous token: ... " http://webcache.googleusercontent.com/search?q=cache:o5DC7Evn7-UJ:www.mozilla... works as well. Those links are now dead because Mozilla's abandoned its JavaScript 2.0 effort after Adobe pulled out of EcmaScript 4 and the EcmaScript 3.1 committee formed and demonstrated progress on selective spec fixes that resulted in ES5 as a way to unblock the process. Giving EcmaScript a regular lexical grammar was not seen as high priority by 3.1 so that part not adopted. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:26: // Trim space from the right. On 2011/08/30 06:37:31, nigeltao wrote: > This could be > i := len(bytes.TrimRight(s, "\t\n\f\r ")) Done. I went through the godoc for bytes and strings and I think I have a better handle on not reinventing the wheel. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:44: // ++ and -- are not On 2011/08/30 06:37:31, nigeltao wrote: > Are not what? Fixed. // ++ and -- are not regexp preceders, but + and - are whether // they are used as infix or prefix operators. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:50: if ((i - start) & 1) == 1 { On 2011/08/30 06:37:31, nigeltao wrote: > Outer parens are unnecessary. Ah, I assumed Go used C++ operator precedence for & and ==. Fixed. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:178: i := strings.IndexAny(s, "\u0000\t\n\f\r\"&'+/<>\\\u2028\u2029") This style of escaper seems to benchmark well, but this bit seems brittle, so I added a test to js_test that tries escaping all lower-7 codepoints and a few selected others first in one go, and then codepoint by codepoint and makes sure it gets the same results for each escaping function. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:185: for k, r := range s { On 2011/08/30 06:37:31, nigeltao wrote: > Nit: I'd s/k/j/. Similarly below at line 251. Done. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:234: // expression literal. /foo{{x}}bar/ matches the On 2011/08/30 06:37:31, nigeltao wrote: > Unfinished comment. Also, single spaces after a full stop. Now reads // ... /foo{{.X}}bar/ matches the string "foo" followed by // the literal text of {{.X}} followed by the string "bar". http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:324: // isJSIdentPart is true if the given rune is a Java On 2011/08/30 06:37:31, nigeltao wrote: > Unfinished comment. Done. // isJSIdentPart is true if the given rune is a JS identifier part. // It does not handle all the non-Latin letters, joiners, and combining marks, // but it does handle every codepoint that can occur in a numeric literal or // a keyword. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... File src/pkg/exp/template/html/js_test.go (right): http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:1: //-*- mode: go-mode; indent-tabs-mode: t; tab-width: 8; -*- On 2011/08/30 06:37:31, nigeltao wrote: > Drop this; add the missing copyright header. Done. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:16: // Statement terminators precede regexs. On 2011/08/30 06:37:31, nigeltao wrote: > s/regexs/regexps/. Done in many places. $ egrep -in 'regex([^p]|$)' *.go produces no matches. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:75: for _, test := range tests { On 2011/08/30 06:37:31, nigeltao wrote: > escape_test.go uses the names "tc" and "testCases" instead of "test" and > "tests". Please make the two files consistent. Done. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:77: t.Errorf("Expected %s given %q", test.jsCtx, test.s) On 2011/08/30 06:37:31, nigeltao wrote: > The usual Go language is something like > if got := nextJsCtx(etc); got != test.jsCtx { > t.Errorf("%q: want %s got %s", test.s, test.jsCtx, got) > } Done in both test files. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:156: agolden := "[" + test.js[i:j] + "]" On 2011/08/30 06:37:31, nigeltao wrote: > This could be > agolden := "[" + strings.TrimSpace(test.js) + "]" > > I'd also rename s/agolden/want/. Done. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:214: func TestJSRegexEscaper(t *testing.T) { On 2011/08/30 06:37:31, nigeltao wrote: > s/Regex/Regexp/. Done. http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:259: jsStrEscaper("The quick, brown fox jumps over the lazy dog.") On 2011/08/30 06:37:31, nigeltao wrote: > Benchmarks should loop from 0 up to b.N. See "godoc testing" for an example. Fixed. That showed that the nospecials case was taking longer than the regular, so I removed the bytes.IndexAny from those and sped them up significantly, and removed the brittle bit I was worried about.
Sign in to reply to this message.
This is getting pretty close... http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... File src/pkg/exp/template/html/escape.go (right): http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:380: quoteAndEsc := "\"\\" On 2011/08/30 20:08:36, MikeSamuel wrote: > it breaks emacs syntax highlighting horribly. Get a real editor. :-) http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... File src/pkg/exp/template/html/escape.go (right): http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:65: if c.state == stateURL { This if-else chain should probably be a "switch c.state" block. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:124: s = s[:len(s)-i] Maybe exit early after this, to avoid an allocation and copy: if len(s) == 0 { return } http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:426: i := bytes.IndexAny(b, "/[]\\") This could also be a `` raw string. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:459: errStr: fmt.Sprintf("broken JS regexp charset: %q", s), Would s/broken/unfinished/ be better? Not just here but throughout this file?? http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:491: return c, s[i:] Please add: // The line terminator is not part of the comment, and so we do not return s[i+1:] or s[i+2:]. above the return. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... File src/pkg/exp/template/html/escape_test.go (right): http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape_test.go:587: } Please at least two more tests (I'll give the input and tail values below): "{{.X | html | urlquery}}" []string{"html"} and "{{.X | urlquery}}" []string{"html", "urlquery"} http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape_test.go:589: tmpl := template.New("test") You could write tmpl := template.Must(template.New("test").Parse(test.input)) here and above. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js.go File src/pkg/exp/template/html/js.go (right): http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:17: // operator : (/ or /=). I would delete the space before the colon, and also delete the parentheses. IIUC, the colon is there as part of an English sentence, and not as a JavaScript token. Similarly, removing the parens removes any possibility of confusing them for JavaScript tokens. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:27: func nextJSCtx(s []byte, prec jsCtx) jsCtx { Since prec is used only once, I'd spell it in full. I'm guessing that it stands for "preceding", but I'm not sure. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:102: var regexPrecederKeywords = map[string]bool{ s/regexP/regexpP/. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js_... File src/pkg/exp/template/html/js_test.go (right): http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:264: toEscape := ("\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f" + I'd name the variable "input". I'd also drop the parens. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:312: if escaped := test.escaper(toEscape); escaped != test.escaped { s/escaped/s/ would be shorter and IMO just as clear. Similarly a dozen-ish lines below. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:320: for i := 0; i < len(toEscape); { Ranging over a string gives you Unicode codepoints (not bytes), and string(65) is "A". Thus: for _, c := range toEscape { buf.WriteString(test.escaper(string(c)) }
Sign in to reply to this message.
http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... File src/pkg/exp/template/html/escape.go (right): http://codereview.appspot.com/4968052/diff/5001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:380: quoteAndEsc := "\"\\" On 2011/08/30 23:45:31, nigeltao wrote: > On 2011/08/30 20:08:36, MikeSamuel wrote: > > it breaks emacs syntax highlighting horribly. > > Get a real editor. :-) s/editor/operating system/ http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... File src/pkg/exp/template/html/escape.go (right): http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:65: if c.state == stateURL { On 2011/08/30 23:45:32, nigeltao wrote: > This if-else chain should probably be a "switch c.state" block. Done. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:124: s = s[:len(s)-i] On 2011/08/30 23:45:32, nigeltao wrote: > Maybe exit early after this, to avoid an allocation and copy: > if len(s) == 0 { > return > } Done. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:426: i := bytes.IndexAny(b, "/[]\\") On 2011/08/30 23:45:32, nigeltao wrote: > This could also be a `` raw string. Done. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:459: errStr: fmt.Sprintf("broken JS regexp charset: %q", s), On 2011/08/30 23:45:32, nigeltao wrote: > Would s/broken/unfinished/ be better? Not just here but throughout this file?? Done. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape.go:491: return c, s[i:] On 2011/08/30 23:45:32, nigeltao wrote: > Please add: > // The line terminator is not part of the comment, and so we do not return > s[i+1:] or s[i+2:]. > above the return. // Per section 7.4 of EcmaScript 5 : http://es5.github.com/#x7.4 // "However, the LineTerminator at the end of the line is not // considered to be part of the single-line comment; it is recognised // separately by the lexical grammar and becomes part of the stream of // input elements for the syntactic grammar." http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... File src/pkg/exp/template/html/escape_test.go (right): http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape_test.go:587: } On 2011/08/30 23:45:32, nigeltao wrote: > Please at least two more tests (I'll give the input and tail values below): > > "{{.X | html | urlquery}}" > []string{"html"} > > and > > > "{{.X | urlquery}}" > []string{"html", "urlquery"} I reworked requirePipelineTail to just make sure that the identifier tail contains the elements in the order specified and to make the minimal changes necessary to make this so. This does introduce a hazard where we might insert sanitizers early and then later be trumped by an unescaping pipeline element, requirePipelineContains( parseActionNode(`{{.X | html | htmlunescape}}`), []string{ "jsvalue", "html" }) produces {{.X | jsvalue | html | htmlunecape}} but I think this hazard is far outweighed by us not introducing bugs when a security conscious maintainer adds a pipeline stage to do extra sanitization or to get telemetry on a problem (heisenbug). requirePipelineContains( parseActionNode(`{{.X | html | panicIfEvil}}`), []string { "jsvalue", "html" }) should produce {{.X | jsvalue | html | panicIfEvil}} and not produce {{.X | html | panicIfEvil | jsvalue | html}} http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/esc... src/pkg/exp/template/html/escape_test.go:589: tmpl := template.New("test") On 2011/08/30 23:45:32, nigeltao wrote: > You could write > tmpl := template.Must(template.New("test").Parse(test.input)) > here and above. Done in 3 places.
Sign in to reply to this message.
http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js.go File src/pkg/exp/template/html/js.go (right): http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:17: // operator : (/ or /=). On 2011/08/30 23:45:32, nigeltao wrote: > I would delete the space before the colon, and also delete the parentheses. > IIUC, the colon is there as part of an English sentence, and not as a JavaScript > token. Similarly, removing the parens removes any possibility of confusing them > for JavaScript tokens. Ping. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:27: func nextJSCtx(s []byte, prec jsCtx) jsCtx { On 2011/08/30 23:45:32, nigeltao wrote: > Since prec is used only once, I'd spell it in full. I'm guessing that it stands > for "preceding", but I'm not sure. Ping. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:102: var regexPrecederKeywords = map[string]bool{ On 2011/08/30 23:45:32, nigeltao wrote: > s/regexP/regexpP/. Ping. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js_... File src/pkg/exp/template/html/js_test.go (right): http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:264: toEscape := ("\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f" + On 2011/08/30 23:45:32, nigeltao wrote: > I'd name the variable "input". I'd also drop the parens. Ping. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:312: if escaped := test.escaper(toEscape); escaped != test.escaped { On 2011/08/30 23:45:32, nigeltao wrote: > s/escaped/s/ would be shorter and IMO just as clear. Similarly a dozen-ish lines > below. Ping. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:320: for i := 0; i < len(toEscape); { On 2011/08/30 23:45:32, nigeltao wrote: > Ranging over a string gives you Unicode codepoints (not bytes), and string(65) > is "A". Thus: > > for _, c := range toEscape { > buf.WriteString(test.escaper(string(c)) > } Ping. http://codereview.appspot.com/4968052/diff/11002/src/pkg/exp/template/html/es... File src/pkg/exp/template/html/escape.go (right): http://codereview.appspot.com/4968052/diff/11002/src/pkg/exp/template/html/es... src/pkg/exp/template/html/escape.go:96: errStr: fmt.Sprintf("%s appears inside a comment"), The "%s" needs a matching argument. Can we have a TestErrors case that tickles this? http://codereview.appspot.com/4968052/diff/11002/src/pkg/exp/template/html/es... src/pkg/exp/template/html/escape.go:140: if name == ident { The indentation is getting pretty deep. One small thing that might help by one notch is to replace ---- if name == ident { lotsOfCode } ---- with ---- if name != ident { continue } lotsOfCode ---- Furthermore, it might help if you can pull out lines 139-150 (the "for i, name := range s" loop) into its own function. I'm not sure how easy that is, though.
Sign in to reply to this message.
I don't know how I missed those comments before. Incremental diffs at http://codereview.appspot.com/4968052/diff2/11002:13007/src/pkg/exp/template/... http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js.go File src/pkg/exp/template/html/js.go (right): http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:17: // operator : (/ or /=). On 2011/08/30 23:45:32, nigeltao wrote: > I would delete the space before the colon, and also delete the parentheses. > IIUC, the colon is there as part of an English sentence, and not as a JavaScript > token. Similarly, removing the parens removes any possibility of confusing them > for JavaScript tokens. Done. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:27: func nextJSCtx(s []byte, prec jsCtx) jsCtx { On 2011/08/31 08:51:02, nigeltao wrote: > On 2011/08/30 23:45:32, nigeltao wrote: > > Since prec is used only once, I'd spell it in full. I'm guessing that it > stands > > for "preceding", but I'm not sure. > > Ping. Done. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js.... src/pkg/exp/template/html/js.go:102: var regexPrecederKeywords = map[string]bool{ On 2011/08/30 23:45:32, nigeltao wrote: > s/regexP/regexpP/. Done. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js_... File src/pkg/exp/template/html/js_test.go (right): http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:264: toEscape := ("\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f" + On 2011/08/30 23:45:32, nigeltao wrote: > I'd name the variable "input". I'd also drop the parens. Done. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:312: if escaped := test.escaper(toEscape); escaped != test.escaped { On 2011/08/30 23:45:32, nigeltao wrote: > s/escaped/s/ would be shorter and IMO just as clear. Similarly a dozen-ish lines > below. Done. http://codereview.appspot.com/4968052/diff/9001/src/pkg/exp/template/html/js_... src/pkg/exp/template/html/js_test.go:320: for i := 0; i < len(toEscape); { On 2011/08/30 23:45:32, nigeltao wrote: > Ranging over a string gives you Unicode codepoints (not bytes), and string(65) > is "A". Thus: > > for _, c := range toEscape { > buf.WriteString(test.escaper(string(c)) > } Done. http://codereview.appspot.com/4968052/diff/11002/src/pkg/exp/template/html/es... File src/pkg/exp/template/html/escape.go (right): http://codereview.appspot.com/4968052/diff/11002/src/pkg/exp/template/html/es... src/pkg/exp/template/html/escape.go:96: errStr: fmt.Sprintf("%s appears inside a comment"), On 2011/08/31 08:51:02, nigeltao wrote: > The "%s" needs a matching argument. > > Can we have a TestErrors case that tickles this? Done. Added 2, one for each type of comment. http://codereview.appspot.com/4968052/diff/11002/src/pkg/exp/template/html/es... src/pkg/exp/template/html/escape.go:140: if name == ident { On 2011/08/31 08:51:02, nigeltao wrote: > The indentation is getting pretty deep. One small thing that might help by one > notch is to replace > ---- > if name == ident { > lotsOfCode > } > ---- > with > ---- > if name != ident { > continue > } > lotsOfCode > ---- > > Furthermore, it might help if you can pull out lines 139-150 (the "for i, name > := range s" loop) into its own function. I'm not sure how easy that is, though. I pulled the loop index finding code out into indexOfStr and the new command making code which was duplicated into newCmd.
Sign in to reply to this message.
LGTM. Just a few nits (yes, I'm retentive) that I'll fix when I submit. http://codereview.appspot.com/4968052/diff/13007/src/pkg/exp/template/html/co... File src/pkg/exp/template/html/context.go (right): http://codereview.appspot.com/4968052/diff/13007/src/pkg/exp/template/html/co... src/pkg/exp/template/html/context.go:159: // jsCtxRegexp occurs where a '/' would start a RegExp literal. s/RegExp/regexp/, I think. http://codereview.appspot.com/4968052/diff/13007/src/pkg/exp/template/html/es... File src/pkg/exp/template/html/escape.go (right): http://codereview.appspot.com/4968052/diff/13007/src/pkg/exp/template/html/es... src/pkg/exp/template/html/escape.go:105: // requirePipelineContains ensures that the pipeline has commands with s/require/ensure/ would match the language you use for the doc comment. http://codereview.appspot.com/4968052/diff/13007/src/pkg/exp/template/html/js... File src/pkg/exp/template/html/js_test.go (right): http://codereview.appspot.com/4968052/diff/13007/src/pkg/exp/template/html/js... src/pkg/exp/template/html/js_test.go:324: t.Errorf("%s char-wise: want\n\t%q\ngot\n\t%q", test.name, test.escaped, s) s/char/rune/.
Sign in to reply to this message.
Incremental diffs at http://codereview.appspot.com/4968052/diff2/13007:19001/src/pkg/exp/template/... http://codereview.appspot.com/4968052/diff/13007/src/pkg/exp/template/html/co... File src/pkg/exp/template/html/context.go (right): http://codereview.appspot.com/4968052/diff/13007/src/pkg/exp/template/html/co... src/pkg/exp/template/html/context.go:159: // jsCtxRegexp occurs where a '/' would start a RegExp literal. On 2011/09/01 01:59:32, nigeltao wrote: > s/RegExp/regexp/, I think. Done. http://codereview.appspot.com/4968052/diff/13007/src/pkg/exp/template/html/es... File src/pkg/exp/template/html/escape.go (right): http://codereview.appspot.com/4968052/diff/13007/src/pkg/exp/template/html/es... src/pkg/exp/template/html/escape.go:105: // requirePipelineContains ensures that the pipeline has commands with On 2011/09/01 01:59:32, nigeltao wrote: > s/require/ensure/ would match the language you use for the doc comment. Changed here, in signature, at call-sites, and in unittest name. http://codereview.appspot.com/4968052/diff/13007/src/pkg/exp/template/html/js... File src/pkg/exp/template/html/js_test.go (right): http://codereview.appspot.com/4968052/diff/13007/src/pkg/exp/template/html/js... src/pkg/exp/template/html/js_test.go:324: t.Errorf("%s char-wise: want\n\t%q\ngot\n\t%q", test.name, test.escaped, s) On 2011/09/01 01:59:32, nigeltao wrote: > s/char/rune/. Done.
Sign in to reply to this message.
*** Submitted as http://code.google.com/p/go/source/detail?r=68c83c6fed16 *** exp/template/html: Implement grammar for JS. This transitions into a JS state when entering any attribute whose name starts with "on". It does not yet enter a JS on entry into a <script> element as script element handling is introduced in another CL. R=nigeltao CC=golang-dev http://codereview.appspot.com/4968052 Committer: Nigel Tao <nigeltao@golang.org>
Sign in to reply to this message.
|