|
|
Descriptiongo/format, cmd/gofmt: fix issues with partial Go code with indent
Fixes issue 5551.
Fixes issue 4449.
Adds tests for both issues.
Note that the two issues occur only when formatting partial Go code
with indent.
The best way to understand the change is as follows: I took the code
of cmd/gofmt and go/format, combined it into one unified code that
does not suffer from either 4449 nor 5551, and then applied that code
to both cmd/gofmt and go/format.
As a result, there is now much more identical code between the two
packages, making future code deduplication easier (it was not possible
to do that now without adding public APIs, which I was advised not to
do at this time).
More specifically, I took the parse() of cmd/gofmt which correctly
preserves comments (issue 5551) and modified it to fix issue where
it would sometimes modify literal values (issue 4449).
I ended up removing the matchSpace() function because it no longer
needed to do some of its work (insert indent), and a part of its work
had to be done in advance (determining the indentation of first code
line), because that calculation is required for cfg.Fprint() to run.
adjustIndent is used to adjust the indent of cfg.Fprint() to compensate
for the body of wrapper func being indented by one level. This allows
to get rid of the bytes.Replace text manipulation of inner content,
which was problematic and sometimes altered raw string literals (issue
4449). This means that sometimes the value of cfg.Indent is negative,
but that works as expected.
So now the algorithm for formatting partial Go code is:
1. Determine and prepend leading space of original source.
2. Determine and prepend indentation of first code line.
3. Format and write partial Go code (with all of its leading &
trailing space trimmed).
4. Determine and append trailing space of original source.
Patch Set 1 #Patch Set 2 : diff -r 932fe22207465e6c4bcdae29f5c519ba069f8927 https://code.google.com/p/go #Patch Set 3 : diff -r 932fe22207465e6c4bcdae29f5c519ba069f8927 https://code.google.com/p/go #
Total comments: 18
Patch Set 4 : diff -r 48355ec55578b156b58c7df974acf132ab2e9d73 https://code.google.com/p/go #Patch Set 5 : diff -r 48355ec55578b156b58c7df974acf132ab2e9d73 https://code.google.com/p/go #
Total comments: 6
Patch Set 6 : diff -r 1e2b51e32781fe475a580ffc8a7776507dfec1d5 https://code.google.com/p/go #Patch Set 7 : diff -r 1e2b51e32781fe475a580ffc8a7776507dfec1d5 https://code.google.com/p/go #Patch Set 8 : diff -r 032fdf145bf989b31b6c7d369c36c4b54066f55a https://code.google.com/p/go #Patch Set 9 : diff -r 032fdf145bf989b31b6c7d369c36c4b54066f55a https://code.google.com/p/go #
Total comments: 30
Patch Set 10 : diff -r 8f36a32a9d036f86fb956c8538142eb384bd1a4a https://code.google.com/p/go #Patch Set 11 : diff -r 8f36a32a9d036f86fb956c8538142eb384bd1a4a https://code.google.com/p/go #
MessagesTotal messages: 23
Hello golang-codereviews@googlegroups.com, I'd like you to review this change to https://code.google.com/p/go
Sign in to reply to this message.
R=gri On Mon, Sep 22, 2014 at 12:13 AM, <shurcooL@gmail.com> wrote: > Reviewers: golang-codereviews, > > Message: > Hello golang-codereviews@googlegroups.com, > > I'd like you to review this change to > https://code.google.com/p/go > > > Description: > go/format, cmd/gofmt: fix issues with partial Go code with indent > > Fixes issue 5551. > Fixes issue 4449. > > Adds tests for both issues. > > Note that the two issues occur only when formatting partial Go code > with indent. > > Apologies for a rather large CL, but the two issues are very related > and the current code required that many changes. > > The best way to understand the change is as follows: I took the code > of cmd/gofmt and go/format, combined it into one unified code that > does not suffer from either 4449 nor 5551, and then applied that code > to both cmd/gofmt and go/format. > > As a result, there is now much more identical code between the two > packages, making future code deduplication easier (it was not possible > to do that now without adding public APIs, which I was advised not to > do at this time). > > More specifically, I took the parse() of cmd/gofmt which correctly > preserves comments (issue 5551) and modified it to fix issue where > it would sometimes modify literal values (issue 4449). > > I ended up removing the matchSpace() function because it no longer > needed to do some of its work (insert indent), and a part of its work > had to be done in advance (determining the indentation of first code > line), because that calculation is required for cfg.Fprint() to run. > > So now the high level formula for formatting partial Go code is: > > 1. Determine and prepend leading space of original source. > 2. Determine and prepend indentation of first code line. > 3. Format and write partial Go code (with all of its leading & > trailing space trimmed). > 4. Determine and append trailing space of original source. > > If it helps with review, the individual steps I took to get to this > final result CL can be seen at: > > https://github.com/shurcooL/go/compare/0f4891c007a238c98a3ba1c73883a7 > 74298a086f...b6d9c498e14d7890b013f258965d7cb83b612c44 > > Please review this at https://codereview.appspot.com/142360043/ > > Affected files (+190, -91 lines): > M src/cmd/gofmt/gofmt.go > A src/cmd/gofmt/testdata/stdin6.golden > A src/cmd/gofmt/testdata/stdin6.input > M src/go/format/format.go > M src/go/format/format_test.go > > > -- > You received this message because you are subscribed to the Google Groups > "golang-codereviews" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to golang-codereviews+unsubscribe@googlegroups.com. > For more options, visit https://groups.google.com/d/optout. >
Sign in to reply to this message.
> As a result, there is now much more identical code between the two > packages, making future code deduplication easier (it was not possible > to do that now without adding public APIs, which I was advised not to > do at this time). Actually, I just realized that it may be possible now that: - golang.org/s/go14nopkg is done - golang.org/s/go14internal may be done for GOROOT? (I'm not sure if it is.) If so and if I understand correctly, we can create internal/go/format package that would contain the common code between cmd/gofmt and go/format.
Sign in to reply to this message.
Some initial comments, more to come. This looks pretty good. Please don't yet factor out the shared parts, we should do this perhaps in a 2nd separate CL. https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go File src/cmd/gofmt/gofmt.go (right): https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:117: remove this empty line https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:123: i, j := 0, 0 i := 0 https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:124: for j < len(src) && isSpace(src[j]) { for j := 0; j < len(src) && isSpace(src[j]); j++ { https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:126: i = j + 1 // index of last line in leading space this is not really a line index isn't it https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:128: j++ get rid of this https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:361: func cutSpace(b []byte) (middle []byte) { use bytes.TrimSpace instead? https://codereview.appspot.com/142360043/diff/40001/src/go/format/format.go File src/go/format/format.go (right): https://codereview.appspot.com/142360043/diff/40001/src/go/format/format.go#n... src/go/format/format.go:21: parserMode = parser.ParseComments make this a const https://codereview.appspot.com/142360043/diff/40001/src/go/format/format.go#n... src/go/format/format.go:241: func cutSpace(b []byte) (middle []byte) { I think you should be able to replace this function now with bytes.TrimSpace.
Sign in to reply to this message.
https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go File src/cmd/gofmt/gofmt.go (right): https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:124: for j < len(src) && isSpace(src[j]) { On 2014/09/22 18:53:51, gri wrote: > for j := 0; j < len(src) && isSpace(src[j]); j++ { Just want to double check, you are aware that this is a copy of original code from go/format/format.go, right? I can still apply these refactors to both files (this code section is identical between format.go and gofmt.go) of course, just wanted to confirm. https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:361: func cutSpace(b []byte) (middle []byte) { On 2014/09/22 18:53:51, gri wrote: > use bytes.TrimSpace instead? Sure, it can be used instead.
Sign in to reply to this message.
https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go File src/cmd/gofmt/gofmt.go (right): https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:124: for j < len(src) && isSpace(src[j]) { On 2014/09/22 19:07:25, shurcooL wrote: > On 2014/09/22 18:53:51, gri wrote: > > for j := 0; j < len(src) && isSpace(src[j]); j++ { > > Just want to double check, you are aware that this is a copy of original code > from go/format/format.go, right? > > I can still apply these refactors to both files (this code section is identical > between format.go and gofmt.go) of course, just wanted to confirm. I wasn't aware at that time but it's ok to clean up if semantically equivalent and done on both sides.
Sign in to reply to this message.
> the two issues are very related By the way, I noticed earlier that issues 5551 and 4449 add up to a round 10000. Pretty neat coincidence hehe. https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go File src/cmd/gofmt/gofmt.go (right): https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:123: i, j := 0, 0 On 2014/09/22 18:53:51, gri wrote: > i := 0 Can't do this; both `i` and `j` are used in the "// Determine and prepend indentation of first code line." block below. https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:126: i = j + 1 // index of last line in leading space On 2014/09/22 18:53:51, gri wrote: > this is not really a line index isn't it After the for loop is complete, `i` is the byte index of the beginning of the last line in leading space. `j` is the byte index of the end of of leading space (same line). So `src[i:j]` is the indentation of the first code line. https://codereview.appspot.com/142360043/diff/40001/src/go/format/format.go File src/go/format/format.go (right): https://codereview.appspot.com/142360043/diff/40001/src/go/format/format.go#n... src/go/format/format.go:21: parserMode = parser.ParseComments On 2014/09/22 18:53:51, gri wrote: > make this a const Done.
Sign in to reply to this message.
Hello golang-codereviews@googlegroups.com, bradfitz@golang.org, gri@golang.org (cc: golang-codereviews@googlegroups.com), Please take another look.
Sign in to reply to this message.
Some nitpicks, and some questions. Also, please add some more test cases per my comment below. Looks good otherwise. https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go File src/cmd/gofmt/gofmt.go (right): https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:123: i, j := 0, 0 On 2014/09/23 01:51:35, shurcooL wrote: > On 2014/09/22 18:53:51, gri wrote: > > i := 0 > > Can't do this; both `i` and `j` are used in the "// Determine and prepend > indentation of first code line." block below. Acknowledged, I missed that. https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:126: i = j + 1 // index of last line in leading space On 2014/09/23 01:51:35, shurcooL wrote: > On 2014/09/22 18:53:51, gri wrote: > > this is not really a line index isn't it > > After the for loop is complete, `i` is the byte index of the beginning of the > last line in leading space. > > `j` is the byte index of the end of of leading space (same line). > > So `src[i:j]` is the indentation of the first code line. ok, but it's not a line index, its a byte index (or actually offset from the start) - I am aware this was like this before let's change it to: // byte index of last line in leading space https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:128: j++ On 2014/09/22 18:53:51, gri wrote: > get rid of this let's put it back ... https://codereview.appspot.com/142360043/diff/80001/src/cmd/gofmt/gofmt.go File src/cmd/gofmt/gofmt.go (right): https://codereview.appspot.com/142360043/diff/80001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:123: for ; j < len(src) && isSpace(src[j]); j++ { since j cannot be local, let's leave this symmetric as before for j < len(src) && isSpace(src[j]) { ... and move the j++ at the end again (apologies for the noise) please adjust both files https://codereview.appspot.com/142360043/diff/80001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:341: src = src[2*indent+len("package p\n\nfunc _() {"):] I don't understand why this is 2*indent. Shouldn't it be just indent? Otherwise, it should be commented more clearly. https://codereview.appspot.com/142360043/diff/80001/src/cmd/gofmt/testdata/st... File src/cmd/gofmt/testdata/stdin6.input (right): https://codereview.appspot.com/142360043/diff/80001/src/cmd/gofmt/testdata/st... src/cmd/gofmt/testdata/stdin6.input:8: } Please add a couple of cases with raw strings and misformatted code around it so that we can see the effect of gofmt. Then take all those cases and replicate them 2x or so, with different indentation levels so that we can see that the indentation logic works correctly.
Sign in to reply to this message.
https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go File src/cmd/gofmt/gofmt.go (right): https://codereview.appspot.com/142360043/diff/40001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:126: i = j + 1 // index of last line in leading space On 2014/09/23 21:48:26, gri wrote: > On 2014/09/23 01:51:35, shurcooL wrote: > > On 2014/09/22 18:53:51, gri wrote: > > > this is not really a line index isn't it > > > > After the for loop is complete, `i` is the byte index of the beginning of the > > last line in leading space. > > > > `j` is the byte index of the end of of leading space (same line). > > > > So `src[i:j]` is the indentation of the first code line. > > ok, but it's not a line index, its a byte index (or actually offset from the > start) - I am aware this was like this before > > let's change it to: > > // byte index of last line in leading space > You're absolutely right! I saw gorename earlier today, and instantly realized that "byte offset" is the right term that should be used here. I think byte offset is clearer than byte index, so I'll use that. https://codereview.appspot.com/142360043/diff/80001/src/cmd/gofmt/gofmt.go File src/cmd/gofmt/gofmt.go (right): https://codereview.appspot.com/142360043/diff/80001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:123: for ; j < len(src) && isSpace(src[j]); j++ { On 2014/09/23 21:48:26, gri wrote: > since j cannot be local, let's leave this symmetric as before > > for j < len(src) && isSpace(src[j]) { ... > > and move the j++ at the end again > > (apologies for the noise) > > please adjust both files Sure, and no problem about noise - I appreciate and share the desire to make Go as good as possible. https://codereview.appspot.com/142360043/diff/80001/src/cmd/gofmt/gofmt.go#ne... src/cmd/gofmt/gofmt.go:341: src = src[2*indent+len("package p\n\nfunc _() {"):] On 2014/09/23 21:48:26, gri wrote: > I don't understand why this is 2*indent. Shouldn't it be just indent? > > Otherwise, it should be commented more clearly. Will add a comment to clarify. It's 2*indent because there are 2 non-blank lines, each will have indent added to it that needs to be adjusted away. https://codereview.appspot.com/142360043/diff/80001/src/cmd/gofmt/testdata/st... File src/cmd/gofmt/testdata/stdin6.input (right): https://codereview.appspot.com/142360043/diff/80001/src/cmd/gofmt/testdata/st... src/cmd/gofmt/testdata/stdin6.input:8: } On 2014/09/23 21:48:26, gri wrote: > Please add a couple of cases with raw strings and misformatted code around it so > that we can see the effect of gofmt. > > Then take all those cases and replicate them 2x or so, with different > indentation levels so that we can see that the indentation logic works > correctly. I'm unsure if you're aware, but there's quite good test coverage of raw strings and various indentation in the go/format tests (go/format/format_test.go - see "// indentation, leading and trailing space" section). I hope that the tests of gofmt and go/format can also be unified in the future when this code is deduplicated into a single package. With that in mind, did you want me to simply port over the go/format tests to cmd/gofmt? Or should I try to create something new? Actually, I just realized the format_test.go tests don't test that gofmt corrects misformatted code, so I can improve on that.
Sign in to reply to this message.
On 2014/09/23 21:48:26, gri wrote: > Please add a couple of cases with raw strings and misformatted code around it > so that we can see the effect of gofmt. That was a really good idea. I added a very thorough test with as many "features" as possible, and it helped uncover a bug. Namely, 4449 is not completely resolved. The culprit is this line in the adjust func: // Gofmt has also indented the function body one level. // Remove that indent. src = bytes.Replace(src, []byte("\n\t"), []byte("\n"), -1) The cfg.Fprint does indent the content of main func by an extra level, but it doesn't indent raw string literals. However, doing that bytes.Replace() does affect raw string literals, so it actually takes away one level of indent from raw string literals. I will upload a revised patch set with all of the feedback above addressed and the new test, but the test will be failing. After that I'll work on a fix.
Sign in to reply to this message.
Sounds good. Thanks for the update. Regarding the 2*indent: I don't think the go/printed will write indentation on empty lines. But more comprehensive tests will show if the code is working correctly, so we will see. Thanks! - gri On Tue, Sep 23, 2014 at 7:21 PM, <shurcooL@gmail.com> wrote: > On 2014/09/23 21:48:26, gri wrote: > >> Please add a couple of cases with raw strings and misformatted code >> > around it > >> so that we can see the effect of gofmt. >> > > That was a really good idea. I added a very thorough test with as many > "features" as possible, and it helped uncover a bug. > > Namely, 4449 is not completely resolved. > > The culprit is this line in the adjust func: > > // Gofmt has also indented the function body one level. > // Remove that indent. > src = bytes.Replace(src, []byte("\n\t"), []byte("\n"), -1) > > The cfg.Fprint does indent the content of main func by an extra level, > but it doesn't indent raw string literals. > > However, doing that bytes.Replace() does affect raw string literals, so > it actually takes away one level of indent from raw string literals. > > I will upload a revised patch set with all of the feedback above > addressed and the new test, but the test will be failing. After that > I'll work on a fix. > > https://codereview.appspot.com/142360043/ >
Sign in to reply to this message.
> Regarding the 2*indent: I don't think the go/printed will write indentation on empty lines. You are correct, go/printer will not write indentation on empty lines. But that's why it's 2*indent. If it wrote on empty lines, it would be 3*indent. If you print "package p\n\nfunc _() {" with newlines expanded: package p func _() { So when indent is, for example, 10, it would be: *{{10 tabs}}*package p *{{0 tabs because empty line}}* *{{10 tabs}}*func _() { I've investigated this part very carefully (with lots of Printf %q's) and it resulted in this commit: https://github.com/shurcooL/go/commit/47380c82f7468e5ebba5b060d99eb5ba05c931f9 Note that for the suffix both 0*indent and 1*indent work (because of TrimSpace followed by prefix/suffix matching of original source). But the leading indent has to be 2*indent, tests will fail for any other value given a large enough indent. --- About the current issue with TrimSpace messing up raw string literals, I am still thinking but so far I haven't come up with an idea of how to fix it. On Tue, Sep 23, 2014 at 9:33 PM, Robert Griesemer <gri@golang.org> wrote: > Sounds good. Thanks for the update. > > Regarding the 2*indent: I don't think the go/printed will write > indentation on empty lines. But more comprehensive tests will show if the > code is working correctly, so we will see. > > Thanks! > - gri > > On Tue, Sep 23, 2014 at 7:21 PM, <shurcooL@gmail.com> wrote: > >> On 2014/09/23 21:48:26, gri wrote: >> >>> Please add a couple of cases with raw strings and misformatted code >>> >> around it >> >>> so that we can see the effect of gofmt. >>> >> >> That was a really good idea. I added a very thorough test with as many >> "features" as possible, and it helped uncover a bug. >> >> Namely, 4449 is not completely resolved. >> >> The culprit is this line in the adjust func: >> >> // Gofmt has also indented the function body one level. >> // Remove that indent. >> src = bytes.Replace(src, []byte("\n\t"), []byte("\n"), -1) >> >> The cfg.Fprint does indent the content of main func by an extra level, >> but it doesn't indent raw string literals. >> >> However, doing that bytes.Replace() does affect raw string literals, so >> it actually takes away one level of indent from raw string literals. >> >> I will upload a revised patch set with all of the feedback above >> addressed and the new test, but the test will be failing. After that >> I'll work on a fix. >> >> https://codereview.appspot.com/142360043/ >> > >
Sign in to reply to this message.
> About the current issue with TrimSpace messing up raw string literals, I am still thinking but so far I haven't come up with an idea of how to fix it. I was about to give up on this approach because I saw no way to get it to work reliably due to the text manipulation, and say we should get back to working on a correct future solution (i.e. using a CommentedNode)... But I did find a way to make it work (as in, fix these 2 bugs so the CL can be accepted, until the CommentedNode work is ready to replace it in the future)! I'll submit it this evening with details. On Tue, Sep 23, 2014 at 9:42 PM, Dmitri Shuralyov <shurcool@gmail.com> wrote: > > Regarding the 2*indent: I don't think the go/printed will write > indentation on empty lines. > > You are correct, go/printer will not write indentation on empty lines. But > that's why it's 2*indent. If it wrote on empty lines, it would be 3*indent. > > If you print "package p\n\nfunc _() {" with newlines expanded: > > package p > > func _() { > > So when indent is, for example, 10, it would be: > > *{{10 tabs}}*package p > *{{0 tabs because empty line}}* > *{{10 tabs}}*func _() { > > I've investigated this part very carefully (with lots of Printf %q's) and > it resulted in this commit: > > > https://github.com/shurcooL/go/commit/47380c82f7468e5ebba5b060d99eb5ba05c931f9 > > Note that for the suffix both 0*indent and 1*indent work (because of > TrimSpace followed by prefix/suffix matching of original source). But the > leading indent has to be 2*indent, tests will fail for any other value > given a large enough indent. > > --- > > About the current issue with TrimSpace messing up raw string literals, I > am still thinking but so far I haven't come up with an idea of how to fix > it. > > On Tue, Sep 23, 2014 at 9:33 PM, Robert Griesemer <gri@golang.org> wrote: > >> Sounds good. Thanks for the update. >> >> Regarding the 2*indent: I don't think the go/printed will write >> indentation on empty lines. But more comprehensive tests will show if the >> code is working correctly, so we will see. >> >> Thanks! >> - gri >> >> On Tue, Sep 23, 2014 at 7:21 PM, <shurcooL@gmail.com> wrote: >> >>> On 2014/09/23 21:48:26, gri wrote: >>> >>>> Please add a couple of cases with raw strings and misformatted code >>>> >>> around it >>> >>>> so that we can see the effect of gofmt. >>>> >>> >>> That was a really good idea. I added a very thorough test with as many >>> "features" as possible, and it helped uncover a bug. >>> >>> Namely, 4449 is not completely resolved. >>> >>> The culprit is this line in the adjust func: >>> >>> // Gofmt has also indented the function body one level. >>> // Remove that indent. >>> src = bytes.Replace(src, []byte("\n\t"), []byte("\n"), -1) >>> >>> The cfg.Fprint does indent the content of main func by an extra level, >>> but it doesn't indent raw string literals. >>> >>> However, doing that bytes.Replace() does affect raw string literals, so >>> it actually takes away one level of indent from raw string literals. >>> >>> I will upload a revised patch set with all of the feedback above >>> addressed and the new test, but the test will be failing. After that >>> I'll work on a fix. >>> >>> https://codereview.appspot.com/142360043/ >>> >> >> >
Sign in to reply to this message.
PTAL. Tests now pass, and I believe this should work, but it needs review. I've added a small test to src/go/format/format_test.go as well. Here's the solution that I found to remove the problematic text manipulation: // Gofmt has also indented the function body one level. // Remove that indent. src = bytes.Replace(src, []byte("\n\t"), []byte("\n"), -1) If the original indent is, for example, 4, then de-indenting the function body by one level can be done implicitly by decrementing the indent used for cfg.Fprint(). That clearly works for original indent 1 or greater, but what if the original indent was 0? Well, it turns out (and this part needs review/confirmation from someone more familiar with go/printer package and its future) that go/printer works with negative values of Ident in a way you would expect: Indent int // default: 0 (all code is indented at least by this much) For example, if you parse this small program: package main func main() { if 2 > 1 { println("okay") } } Then set Ident to -1 and Fprint() it, you will get: package main func main() { if 2 > 1 { println("okay") } } Knowing that, it was pretty easy to adjust the Indent before cfg.Fprint() is done, take its (potentially negative) value into consideration in adjust func, and use it to remove the need for text manipulation via bytes.Replace. This does work and all tests do pass, but I'm not 100% sure if negative values of Indent are officially supported or just happen to work now but may stop working in the future (the documentation neither confirms nor denies it). So confirming this would be good. Also, let me know if you have any feedback about the small implementation details (like choosing to return an additional value, instead of creating an adjust struct with 2 fields). I went with the simplest solution for now but I can change those minor implementation details upon your review. On 2014/09/24 19:05:27, shurcool_gmail.com wrote: > > About the current issue with TrimSpace messing up raw string literals, I > am still thinking but so far I haven't come up with an idea of how to fix > it. > > I was about to give up on this approach because I saw no way to get it to > work reliably due to the text manipulation, and say we should get back to > working on a correct future solution (i.e. using a CommentedNode)... > > But I did find a way to make it work (as in, fix these 2 bugs so the CL can > be accepted, until the CommentedNode work is ready to replace it in the > future)! > > I'll submit it this evening with details. > > On Tue, Sep 23, 2014 at 9:42 PM, Dmitri Shuralyov <mailto:shurcool@gmail.com> > wrote: > > > > Regarding the 2*indent: I don't think the go/printed will write > > indentation on empty lines. > > > > You are correct, go/printer will not write indentation on empty lines. But > > that's why it's 2*indent. If it wrote on empty lines, it would be 3*indent. > > > > If you print "package p\n\nfunc _() {" with newlines expanded: > > > > package p > > > > func _() { > > > > So when indent is, for example, 10, it would be: > > > > *{{10 tabs}}*package p > > *{{0 tabs because empty line}}* > > *{{10 tabs}}*func _() { > > > > I've investigated this part very carefully (with lots of Printf %q's) and > > it resulted in this commit: > > > > > > > https://github.com/shurcooL/go/commit/47380c82f7468e5ebba5b060d99eb5ba05c931f9 > > > > Note that for the suffix both 0*indent and 1*indent work (because of > > TrimSpace followed by prefix/suffix matching of original source). But the > > leading indent has to be 2*indent, tests will fail for any other value > > given a large enough indent. > > > > --- > > > > About the current issue with TrimSpace messing up raw string literals, I > > am still thinking but so far I haven't come up with an idea of how to fix > > it. > > > > On Tue, Sep 23, 2014 at 9:33 PM, Robert Griesemer <mailto:gri@golang.org> wrote: > > > >> Sounds good. Thanks for the update. > >> > >> Regarding the 2*indent: I don't think the go/printed will write > >> indentation on empty lines. But more comprehensive tests will show if the > >> code is working correctly, so we will see. > >> > >> Thanks! > >> - gri > >> > >> On Tue, Sep 23, 2014 at 7:21 PM, <mailto:shurcooL@gmail.com> wrote: > >> > >>> On 2014/09/23 21:48:26, gri wrote: > >>> > >>>> Please add a couple of cases with raw strings and misformatted code > >>>> > >>> around it > >>> > >>>> so that we can see the effect of gofmt. > >>>> > >>> > >>> That was a really good idea. I added a very thorough test with as many > >>> "features" as possible, and it helped uncover a bug. > >>> > >>> Namely, 4449 is not completely resolved. > >>> > >>> The culprit is this line in the adjust func: > >>> > >>> // Gofmt has also indented the function body one level. > >>> // Remove that indent. > >>> src = bytes.Replace(src, []byte("\n\t"), []byte("\n"), -1) > >>> > >>> The cfg.Fprint does indent the content of main func by an extra level, > >>> but it doesn't indent raw string literals. > >>> > >>> However, doing that bytes.Replace() does affect raw string literals, so > >>> it actually takes away one level of indent from raw string literals. > >>> > >>> I will upload a revised patch set with all of the feedback above > >>> addressed and the new test, but the test will be failing. After that > >>> I'll work on a fix. > >>> > >>> https://codereview.appspot.com/142360043/ > >>> > >> > >> > >
Sign in to reply to this message.
Negative values do work, and "underflow" is just treated as 0. But I believe I put in a debug option that causes underflow to panic. But that's fine, I can adjust the code if need be. Will review thoroughly, hopefully tomorrow. Thanks! - gri On Wed, Sep 24, 2014 at 9:35 PM, <shurcooL@gmail.com> wrote: > PTAL. > > Tests now pass, and I believe this should work, but it needs review. > I've added a small test to src/go/format/format_test.go as well. > > Here's the solution that I found to remove the problematic text > manipulation: > > // Gofmt has also indented the function body one level. > // Remove that indent. > src = bytes.Replace(src, []byte("\n\t"), []byte("\n"), -1) > > If the original indent is, for example, 4, then de-indenting the > function body by one level can be done implicitly by decrementing the > indent used for cfg.Fprint(). That clearly works for original indent 1 > or greater, but what if the original indent was 0? > > Well, it turns out (and this part needs review/confirmation from someone > more familiar with go/printer <https://goto.google.com/printer> package > and its future) that go/printer <https://goto.google.com/printer> > works with negative values of Ident in a way you would expect: > > Indent int // default: 0 (all code is indented at least by this > much) > > For example, if you parse this small program: > > package main > > func main() { > if 2 > 1 { > println("okay") > } > } > > Then set Ident to -1 and Fprint() it, you will get: > > package main > > func main() { > if 2 > 1 { > println("okay") > } > } > > Knowing that, it was pretty easy to adjust the Indent before > cfg.Fprint() is done, take its (potentially negative) value into > consideration in adjust func, and use it to remove the need for text > manipulation via bytes.Replace. > > This does work and all tests do pass, but I'm not 100% sure if negative > values of Indent are officially supported or just happen to work now but > may stop working in the future (the documentation neither confirms nor > denies it). So confirming this would be good. > > Also, let me know if you have any feedback about the small > implementation details (like choosing to return an additional value, > instead of creating an adjust struct with 2 fields). I went with the > simplest solution for now but I can change those minor implementation > details upon your review. > > > On 2014/09/24 19:05:27, shurcool_gmail.com wrote: > >> > About the current issue with TrimSpace messing up raw string >> > literals, I > >> am still thinking but so far I haven't come up with an idea of how to >> > fix > >> it. >> > > I was about to give up on this approach because I saw no way to get it >> > to > >> work reliably due to the text manipulation, and say we should get back >> > to > >> working on a correct future solution (i.e. using a CommentedNode)... >> > > But I did find a way to make it work (as in, fix these 2 bugs so the >> > CL can > >> be accepted, until the CommentedNode work is ready to replace it in >> > the > >> future)! >> > > I'll submit it this evening with details. >> > > On Tue, Sep 23, 2014 at 9:42 PM, Dmitri Shuralyov >> > <mailto:shurcool@gmail.com> > >> wrote: >> > > > > Regarding the 2*indent: I don't think the go/printed >> <https://goto.google.com/printed> will write >> > indentation on empty lines. >> > >> > You are correct, go/printer <https://goto.google.com/printer> will not >> write indentation on empty >> > lines. But > >> > that's why it's 2*indent. If it wrote on empty lines, it would be >> > 3*indent. > >> > >> > If you print "package p\n\nfunc _() {" with newlines expanded: >> > >> > package p >> > >> > func _() { >> > >> > So when indent is, for example, 10, it would be: >> > >> > *{{10 tabs}}*package p >> > *{{0 tabs because empty line}}* >> > *{{10 tabs}}*func _() { >> > >> > I've investigated this part very carefully (with lots of Printf >> > %q's) and > >> > it resulted in this commit: >> > >> > >> > >> > > https://github.com/shurcooL/go/commit/47380c82f7468e5ebba5b060d99eb5 > ba05c931f9 > >> > >> > Note that for the suffix both 0*indent and 1*indent work (because of >> > TrimSpace followed by prefix/suffix matching of original source). >> > But the > >> > leading indent has to be 2*indent, tests will fail for any other >> > value > >> > given a large enough indent. >> > >> > --- >> > >> > About the current issue with TrimSpace messing up raw string >> > literals, I > >> > am still thinking but so far I haven't come up with an idea of how >> > to fix > >> > it. >> > >> > On Tue, Sep 23, 2014 at 9:33 PM, Robert Griesemer >> > <mailto:gri@golang.org> wrote: > >> > >> >> Sounds good. Thanks for the update. >> >> >> >> Regarding the 2*indent: I don't think the go/printed >> <https://goto.google.com/printed> will write >> >> indentation on empty lines. But more comprehensive tests will show >> > if the > >> >> code is working correctly, so we will see. >> >> >> >> Thanks! >> >> - gri >> >> >> >> On Tue, Sep 23, 2014 at 7:21 PM, <mailto:shurcooL@gmail.com> wrote: >> >> >> >>> On 2014/09/23 21:48:26, gri wrote: >> >>> >> >>>> Please add a couple of cases with raw strings and misformatted >> > code > >> >>>> >> >>> around it >> >>> >> >>>> so that we can see the effect of gofmt. >> >>>> >> >>> >> >>> That was a really good idea. I added a very thorough test with as >> > many > >> >>> "features" as possible, and it helped uncover a bug. >> >>> >> >>> Namely, 4449 is not completely resolved. >> >>> >> >>> The culprit is this line in the adjust func: >> >>> >> >>> // Gofmt has also indented the function body one level. >> >>> // Remove that indent. >> >>> src = bytes.Replace(src, []byte("\n\t"), []byte("\n"), -1) >> >>> >> >>> The cfg.Fprint does indent the content of main func by an extra >> > level, > >> >>> but it doesn't indent raw string literals. >> >>> >> >>> However, doing that bytes.Replace() does affect raw string >> > literals, so > >> >>> it actually takes away one level of indent from raw string >> > literals. > >> >>> >> >>> I will upload a revised patch set with all of the feedback above >> >>> addressed and the new test, but the test will be failing. After >> > that > >> >>> I'll work on a fix. >> >>> >> >>> https://codereview.appspot.com/142360043/ >> >>> >> >> >> >> >> > >> > > > https://codereview.appspot.com/142360043/ >
Sign in to reply to this message.
I think this looks good; and the negative indent adjustment is fine (also for go/printer). The detailed list of comments below is not related to the semantics, it's suggestions for cleanups and a bit of reorg. Thanks for doing this! - gri https://codereview.appspot.com/142360043/diff/160001/src/cmd/gofmt/gofmt.go File src/cmd/gofmt/gofmt.go (right): https://codereview.appspot.com/142360043/diff/160001/src/cmd/gofmt/gofmt.go#n... src/cmd/gofmt/gofmt.go:109: var res []byte Please apply the same changes here as in format. Since there I removed the else in this if-else, factor the code below into a separate function, both here and in format.go. Perhaps the function can be called format. That way we have parse... format... which are identical (I think) in both these files, and we can then (in separate) CL start thinking about the correct way to export this functionality. https://codereview.appspot.com/142360043/diff/160001/src/cmd/gofmt/gofmt.go#n... src/cmd/gofmt/gofmt.go:296: func parse(fset *token.FileSet, filename string, src []byte, stdin bool) (*ast.File, func(src []byte, indent int) []byte, int, error) { please apply the same changes as in format https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go File src/go/format/format.go (right): https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:85: file, adjust, adjustIndent, err := parse(fset, "", src, true) file, sourceAdj, indentAdj, err := https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:90: var res []byte remove this https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:99: res = buf.Bytes() return buf.Bytes(), nil https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:101: } else { no need for else anymore - outdent entire block below https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:136: cfg.Indent = indent + adjustIndent s/adjustIndent/indentAdj/ https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:142: res = append(res, adjust(buf.Bytes(), indent+adjustIndent)...) s/indent + adjustIndent/cfg.Indent/ https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:149: res = append(res, src[i:]...) return append(res, src[i:]...), nil https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:152: return res, nil remove https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:175: func parse(fset *token.FileSet, filename string, src []byte, stdin bool) (*ast.File, func(src []byte, indent int) []byte, int, error) { There are too many result parameters now to leave them unnamed: func parse(fset *token.FileSet, filename string, src []byte, fragmentOk bool) ( file *ast.File, sourceAdj func(src []byte, indent int) []byte, indentAdj int, err error, ) { ... Note also that stdin is confusing here - there's no stdin. Use fragmentOk. https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:177: file, err := parser.ParseFile(fset, filename, src, parserMode) s/:=/=/ https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:178: if err == nil { I think you can combine this one with the next if https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:181: // If the error is that the source file didn't begin with a If there's no error, or the error is that ... https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:182: // package line and this is standard input, fall through to ...and source fragments are ok, fall through... https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:184: if !stdin || !strings.Contains(err.Error(), "expected 'package'") { if err == nil || !fragmentOk || ... { https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:185: return nil, nil, 0, err just return https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:195: adjust := func(src []byte, indent int) []byte { sourceAdj = https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:201: adjustIndent := 0 no need to set https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:202: return file, adjust, adjustIndent, nil just return https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:208: return nil, nil, 0, err just return https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:219: adjust := func(src []byte, indent int) []byte { sourceAdj = https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:232: // Remove that indent by returning adjustIndent value of -1. // Adjust that with indentAdj. (no need to repeat the actual value in the comment - if for some reason the value changes in the future and we forget to update the comment, it's going to be wrong) https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:233: adjustIndent := -1 indentAdj = -1 https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:234: return file, adjust, adjustIndent, nil remove, just fall through https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:237: // Failed, and out of options. change to // Succeeded, or out of options. https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:238: return nil, nil, 0, err just return
Sign in to reply to this message.
https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go File src/go/format/format.go (right): https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:90: var res []byte On 2014/09/25 17:36:05, gri wrote: > remove this It's possible to do this here in format.go because it's shorter, but the same change can't be done in gofmt.go. It needs the result of res in the "if !bytes.Equal(src, res) {" and "if !*list && !*write && !*doDiff {" blocks. I would prefer to keep this currently nearly identical section (aside from actual logic differences related to sorting imports) as close to parity between the two files to make it easier to eventually merge them. All the other changes are okay to apply to both files. What do you think?
Sign in to reply to this message.
https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go File src/go/format/format.go (right): https://codereview.appspot.com/142360043/diff/160001/src/go/format/format.go#... src/go/format/format.go:181: // If the error is that the source file didn't begin with a On 2014/09/25 17:36:05, gri wrote: > If there's no error, or the error is that ... This comment change doesn't sound right. // If there's no error, or the error is that the source file didn't begin with a // package line and source fragments are ok, fall through to // try as a source fragment. Stop and return on any other error. "If there's no error ... fall through to try as a source fragment." But what's actually happening is more of: // If there's no error, return. If the error is that the source file didn't begin with a // package line and source fragments are ok, fall through to // try as a source fragment. Stop and return on any other error.
Sign in to reply to this message.
(Sorry for noise.) https://codereview.appspot.com/142360043/diff/160001/src/cmd/gofmt/gofmt.go File src/cmd/gofmt/gofmt.go (right): https://codereview.appspot.com/142360043/diff/160001/src/cmd/gofmt/gofmt.go#n... src/cmd/gofmt/gofmt.go:109: var res []byte On 2014/09/25 17:36:05, gri wrote: > Please apply the same changes here as in format. Since there I removed the else > in this if-else, factor the code below into a separate function, both here and > in format.go. > > Perhaps the function can be called format. That way we have > > parse... > format... > > which are identical (I think) in both these files, and we can then (in separate) > CL start thinking about the correct way to export this functionality. Oh, sorry, I didn't see this comment earlier (I only saw the first sentence). Please disregard what I said about not being able to remove `var res []byte` in format.go because of this. Let me try to do this then. Factoring out this code into a smaller format func makes sense.
Sign in to reply to this message.
PTAL. Ok, I've factored out the common sections into format() funcs. They are almost identical, but they continue to have a few differences as before: -format() in go/format does ast.SortImports() only for complete source. but gofmt does it earlier (outside of format() ) regardless if its complete or fragment. I think it's a good idea to try to consolidate the behavior, they should both do the same thing. this can be done in future CL. -minor difference in the printer.Config value being used. > The detailed list of comments below is not related to the semantics, it's suggestions for cleanups and a bit of reorg. With that, I've applied all of those suggestions (with a small change to one of the comment texts, see my comment earlier). Thank you for the detailed review! > Thanks for doing this! I'm happy to be able to contribute to Go and help make it a little better. On 2014/09/26 07:22:55, shurcooL wrote: > (Sorry for noise.) > > https://codereview.appspot.com/142360043/diff/160001/src/cmd/gofmt/gofmt.go > File src/cmd/gofmt/gofmt.go (right): > > https://codereview.appspot.com/142360043/diff/160001/src/cmd/gofmt/gofmt.go#n... > src/cmd/gofmt/gofmt.go:109: var res []byte > On 2014/09/25 17:36:05, gri wrote: > > Please apply the same changes here as in format. Since there I removed the > else > > in this if-else, factor the code below into a separate function, both here and > > in format.go. > > > > Perhaps the function can be called format. That way we have > > > > parse... > > format... > > > > which are identical (I think) in both these files, and we can then (in > separate) > > CL start thinking about the correct way to export this functionality. > > Oh, sorry, I didn't see this comment earlier (I only saw the first sentence). > Please disregard what I said about not being able to remove `var res []byte` in > format.go because of this. > > Let me try to do this then. Factoring out this code into a smaller format func > makes sense.
Sign in to reply to this message.
LGTM This is good enough for now. Thanks.
Sign in to reply to this message.
*** Submitted as https://code.google.com/p/go/source/detail?r=70456f183225 *** go/format, cmd/gofmt: fix issues with partial Go code with indent Fixes issue 5551. Fixes issue 4449. Adds tests for both issues. Note that the two issues occur only when formatting partial Go code with indent. The best way to understand the change is as follows: I took the code of cmd/gofmt and go/format, combined it into one unified code that does not suffer from either 4449 nor 5551, and then applied that code to both cmd/gofmt and go/format. As a result, there is now much more identical code between the two packages, making future code deduplication easier (it was not possible to do that now without adding public APIs, which I was advised not to do at this time). More specifically, I took the parse() of cmd/gofmt which correctly preserves comments (issue 5551) and modified it to fix issue where it would sometimes modify literal values (issue 4449). I ended up removing the matchSpace() function because it no longer needed to do some of its work (insert indent), and a part of its work had to be done in advance (determining the indentation of first code line), because that calculation is required for cfg.Fprint() to run. adjustIndent is used to adjust the indent of cfg.Fprint() to compensate for the body of wrapper func being indented by one level. This allows to get rid of the bytes.Replace text manipulation of inner content, which was problematic and sometimes altered raw string literals (issue 4449). This means that sometimes the value of cfg.Indent is negative, but that works as expected. So now the algorithm for formatting partial Go code is: 1. Determine and prepend leading space of original source. 2. Determine and prepend indentation of first code line. 3. Format and write partial Go code (with all of its leading & trailing space trimmed). 4. Determine and append trailing space of original source. LGTM=gri R=golang-codereviews, bradfitz, gri CC=golang-codereviews https://codereview.appspot.com/142360043 Committer: Robert Griesemer <gri@golang.org>
Sign in to reply to this message.
|