|
|
Created:
12 years, 5 months ago by dave Modified:
12 years, 4 months ago Reviewers:
CC:
remyoudompheng, rsc, golang-dev Visibility:
Public. |
Descriptioncmd/5g: enable xtramodes optimisation
xtramodes' C_PBIT optimisation transforms:
MOVW 0(R3),R1
ADD $4,R3,R3
into:
MOVW.P 4(R3),R1
and the AADD optimisation tranforms:
ADD R0,R1
MOVBU 0(R1),R0
into:
MOVBU R0<<0(R1),R0
5g does not appear to generate sequences that
can be transformed by xtramodes' AMOVW.
Patch Set 1 #Patch Set 2 : diff -r f781c447da6e https://code.google.com/p/go #Patch Set 3 : diff -r f781c447da6e https://code.google.com/p/go #Patch Set 4 : diff -r f781c447da6e https://go.googlecode.com/hg/ #
Total comments: 18
Patch Set 5 : diff -r 1369d7bb329d https://code.google.com/p/go #Patch Set 6 : diff -r 1369d7bb329d https://code.google.com/p/go #Patch Set 7 : diff -r 97b8514db41b https://go.googlecode.com/hg/ #MessagesTotal messages: 13
Hello golang-dev@googlegroups.com, I'd like you to review this change to https://go.googlecode.com/hg/
Sign in to reply to this message.
igen is supposed to be there to avoid generating sequences that can be optimized like this. Did you find places where any of these optimizations is triggered?
Sign in to reply to this message.
Heaps The pbit optimisation kicks in for the inc at the bottom of every range loop AADD is everywhere. I observe their behaviour by adding a warn above the excise the inspecting the objdump for the progs referenced. I have some micro benchmarks which demonstrate it, but they are waiting on another review. Dave On 6 Nov 2012 03:07, <remyoudompheng@gmail.com> wrote: > igen is supposed to be there to avoid generating sequences that can be > optimized like this. Did you find places where any of these > optimizations is triggered? > > http://codereview.appspot.com/**6817085/<http://codereview.appspot.com/6817085/> >
Sign in to reply to this message.
On 2012/11/5 Dave Cheney <dave@cheney.net> wrote: > Heaps > > The pbit optimisation kicks in for the inc at the bottom of every range loop Indeed that's a good use case. > AADD is everywhere. > I observe their behaviour by adding a warn above the excise the inspecting > the objdump for the progs referenced. Usually I do the following: go build -gcflags -S -a cmd/go | sed 's,.*/src/,,' with the appropriate GOARCH and do a diff before and after. Rémy.
Sign in to reply to this message.
Yes, I also use -S once I figure out the prog that is being transformed. I will post some samples of the AADD transform. On 06/11/2012, at 8:02, Rémy Oudompheng <remyoudompheng@gmail.com> wrote: > On 2012/11/5 Dave Cheney <dave@cheney.net> wrote: >> Heaps >> >> The pbit optimisation kicks in for the inc at the bottom of every range loop > > Indeed that's a good use case. > >> AADD is everywhere. > >> I observe their behaviour by adding a warn above the excise the inspecting >> the objdump for the progs referenced. > > Usually I do the following: > go build -gcflags -S -a cmd/go | sed 's,.*/src/,,' > with the appropriate GOARCH and do a diff before and after. > > Rémy.
Sign in to reply to this message.
I think it's an opportunity to make the intention more explicit. My comments suggestions may not be accurate. Feel free to make them correct. http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c File src/cmd/5g/peep.c (right): http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode265 src/cmd/5g/peep.c:265: Reg* // uniqp returns a "unique" predecessor to instruction r. // If the instruction is the first one or has multiple // predecessors due to jump, R is returned. http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode738 src/cmd/5g/peep.c:738: findpre(Reg *r, Adr *v) // findpre returns the last instruction mentioning v // before r. It must be a set, and there must be // a unique path from that instruction to r. http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode758 src/cmd/5g/peep.c:758: findinc(Reg *r, Reg *r2, Adr *v) findinc doesn't find an increment as the name suggests. At least not only increments by 1. http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode847 src/cmd/5g/peep.c:847: int please adda comment explaining what this function is doing. http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode855 src/cmd/5g/peep.c:855: if(debug['h'] && p->as == AMOVB && p->from.type == D_OREG) /* byte load */ I don't understand what the h flag is. http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode864 src/cmd/5g/peep.c:864: case AADD: p1 might have a scond and I don't see a test for this. You don't want to elide a ADD.S if the carry bit is needed by a subsequent instruction. http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode867 src/cmd/5g/peep.c:867: (p->as != AMOVB || (a == &p->from && (p1->from.offset&~0xf) == 0))) || these if's are absolutely atrocious.
Sign in to reply to this message.
i'm happy once remy is https://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c File src/cmd/5g/peep.c (right): https://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode855 src/cmd/5g/peep.c:855: if(debug['h'] && p->as == AMOVB && p->from.type == D_OREG) /* byte load */ On 2012/11/05 22:13:30, remyoudompheng wrote: > I don't understand what the h flag is. This is mostly picked up from 5c, and it isn't new code in this CL, so I wouldn't rearrange it too much. I don't see any meaning assigned to -h in either case, though, so I'd be inclined to just delete this line. Early versions of ARM did not have byte load instructions, so it is possible that the -h flag controlled whether those got used, in the original 5c. But it is gone even from there. https://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode864 src/cmd/5g/peep.c:864: case AADD: On 2012/11/05 22:13:30, remyoudompheng wrote: > p1 might have a scond and I don't see a test for this. You don't want to elide a > ADD.S if the carry bit is needed by a subsequent instruction. Does the compiler generate ADD.S? 5c never did, so this code does not expect it. It is not trying to be an optimizer for general ARM code, just an optimizer for what the compiler generates. https://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode867 src/cmd/5g/peep.c:867: (p->as != AMOVB || (a == &p->from && (p1->from.offset&~0xf) == 0))) || On 2012/11/05 22:13:30, remyoudompheng wrote: > these if's are absolutely atrocious. Agreed but they match 5c.
Sign in to reply to this message.
http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c File src/cmd/5g/peep.c (right): http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode864 src/cmd/5g/peep.c:864: case AADD: 5g generates ADD.S/ADC for addition of int64's. Although it looks unlikely that they becomes eligible for this, it might provoke subtle bugs.
Sign in to reply to this message.
On 07/11/2012, at 18:00, remyoudompheng@gmail.com wrote: > > http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c > File src/cmd/5g/peep.c (right): > > http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode864 > src/cmd/5g/peep.c:864: case AADD: > 5g generates ADD.S/ADC for addition of int64's. Although it looks > unlikely that they becomes eligible for this, it might provoke subtle > bugs. Right. I will address this as I want to revisit kabi's notes in cgen64.c and I don't want to stumble into this. > > http://codereview.appspot.com/6817085/
Sign in to reply to this message.
Thank you for your comments, I have tried to address as many of them as possible. Please take another look. http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c File src/cmd/5g/peep.c (right): http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode265 src/cmd/5g/peep.c:265: Reg* On 2012/11/05 22:13:30, remyoudompheng wrote: > // uniqp returns a "unique" predecessor to instruction r. > // If the instruction is the first one or has multiple > // predecessors due to jump, R is returned. Done. Thank you for your suggestion. I have modified it to respect the existing comment style. http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode738 src/cmd/5g/peep.c:738: findpre(Reg *r, Adr *v) On 2012/11/05 22:13:30, remyoudompheng wrote: > // findpre returns the last instruction mentioning v > // before r. It must be a set, and there must be > // a unique path from that instruction to r. Done. http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode758 src/cmd/5g/peep.c:758: findinc(Reg *r, Reg *r2, Adr *v) On 2012/11/05 22:13:30, remyoudompheng wrote: > findinc doesn't find an increment as the name suggests. At least not only > increments by 1. I've tried to document what I think findinc does, I may not be correct. http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode847 src/cmd/5g/peep.c:847: int On 2012/11/05 22:13:30, remyoudompheng wrote: > please adda comment explaining what this function is doing. Done. http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode855 src/cmd/5g/peep.c:855: if(debug['h'] && p->as == AMOVB && p->from.type == D_OREG) /* byte load */ Deleted. http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode864 src/cmd/5g/peep.c:864: case AADD: On 2012/11/07 07:00:44, remyoudompheng wrote: > 5g generates ADD.S/ADC for addition of int64's. Although it looks unlikely that > they becomes eligible for this, it might provoke subtle bugs. I've added a guard to check for this. FWIW, there are 0 occurrences of this in the stdlib. http://codereview.appspot.com/6817085/diff/6001/src/cmd/5g/peep.c#newcode867 src/cmd/5g/peep.c:867: (p->as != AMOVB || (a == &p->from && (p1->from.offset&~0xf) == 0))) || On 2012/11/06 19:03:30, rsc wrote: > On 2012/11/05 22:13:30, remyoudompheng wrote: > > these if's are absolutely atrocious. > > Agreed but they match 5c. I'd prefer to leave them untouched.
Sign in to reply to this message.
Hello remyoudompheng@gmail.com, rsc@golang.org (cc: golang-dev@googlegroups.com), I'd like you to review this change to https://code.google.com/p/go
Sign in to reply to this message.
LGTM
Sign in to reply to this message.
*** Submitted as http://code.google.com/p/go/source/detail?r=422c00e8e022 *** cmd/5g: enable xtramodes optimisation xtramodes' C_PBIT optimisation transforms: MOVW 0(R3),R1 ADD $4,R3,R3 into: MOVW.P 4(R3),R1 and the AADD optimisation tranforms: ADD R0,R1 MOVBU 0(R1),R0 into: MOVBU R0<<0(R1),R0 5g does not appear to generate sequences that can be transformed by xtramodes' AMOVW. R=remyoudompheng, rsc CC=golang-dev http://codereview.appspot.com/6817085
Sign in to reply to this message.
|