Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: make rulegen produce less verbose code #33644

Closed
mvdan opened this issue Aug 14, 2019 · 13 comments
Closed

cmd/compile: make rulegen produce less verbose code #33644

mvdan opened this issue Aug 14, 2019 · 13 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@mvdan
Copy link
Member

mvdan commented Aug 14, 2019

This is a continuation to #30810, once the rewrite has been merged. A number of ideas will be proposed below. The purpose of this is to:

  • Produce less lines of code, reducing the size of diffs and files on disk
  • Produce simpler code, making it more readable to humans
  • Hopefully reduce compile time of the generated code slightly

Here are my ideas so far:

  1. Simplifying of boolean expressions. For example, turn !(d == 32-c) into d != 32-c.

  2. Joining of contiguous break conditions, such as:

// verbose
if v_0.Op != OpAMD64SHLLconst {
        break
}
if v_0.AuxInt != 3 {
        break
}

// simpler
if v_0.Op != OpAMD64SHLLconst || v_0.AuxInt != 3 {
        break
}
  1. Removing of uninteresting lines, such as the cond line here:
// match: (ADDL x (SHLLconst [3] y))
// cond:
// result: (LEAL8 x y)
  1. Move declarations closer to their uses, reducing the number of variables in scope in some cases, such as:
for {
        carry := v.Args[2]
        x := v.Args[0] // Not actually used until the very end!
        v_1 := v.Args[1]
        if v_1.Op != OpAMD64MOVQconst {
                break
        }
        c := v_1.AuxInt
        if !(is32Bit(c)) {
                break
        }
        v.reset(OpAMD64ADCQconst)
        v.AuxInt = c
        v.AddArg(x)
        v.AddArg(carry)
        return true
}
  1. Reorder slice accesses to reduce the amount of manual bounds checks speed-ups:
// verbose
_ = v.Args[1]
x := v.Args[0]
v_1 := v.Args[1]

// simpler
v_1 := v.Args[1]
x := v.Args[0]

These are the ideas I have so far. If we think any of them would reduce readability, they can be dropped. I'm also open to more ideas and suggestions, of course. I'd send these as separate CLs so we can measure their impact on the generated code separately.

/cc @josharian @randall77 @laboger @mundaym @cherrymui

@mvdan mvdan added this to the Go1.14 milestone Aug 14, 2019
@mvdan mvdan added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Aug 14, 2019
@mvdan mvdan self-assigned this Aug 14, 2019
@mvdan
Copy link
Member Author

mvdan commented Aug 14, 2019

I also wonder if some lines, such as b.Aux = nil, could be recognised as being unnecessary.

@laboger
Copy link
Contributor

laboger commented Aug 14, 2019

I would like to see a way to indicate that a rule should not be expanded to generate all possible operand orderings even though the opcode is marked commutative. The main example where this is a problem is in the rules to combine loads and stores which currently exist for many GOARCH targets. Those rules consist of multiple ORs, the statement to be matched can really only occur with a few of the possible operand orders, and yet all combinations are generated for all possible orders. This is especially costly for the 8 byte cases.

This would reduce the size of the rewrite files, the compile binary, and compile time.

@mvdan
Copy link
Member Author

mvdan commented Aug 14, 2019

That's an interesting idea, thanks. Could this be done automatically, or would the rules syntax need extra information to know when or not to do this expansion?

@laboger
Copy link
Contributor

laboger commented Aug 14, 2019

I'm not sure this could be done automatically, so I was thinking of some syntax to add to the rule, like maybe a modifier to the opcode or a prefix with a special name that could be added to the outermost level to restrict the number of combinations generated. In the case I've mentioned with combined loads and stores, we would probably want to allow just 2 combinations to be generated: the operands in the order they appear in the rule, and the reverse order. That way it could still match either of these for the 4 byte case:

x := b[0] | b[1]<<8 | b[2]<<16 | b[3]<<24
y := b[3]<<24 | b[2]<<16 | b[1]<<8 | b[0]

It is highly unlikely that this statement would be written with the operands in a different order from these, yet many combinations are generated and matches are attempted.

I'd be willing to try out something to see how it helps.

We currently do not have any of the combined load and store rules for ppc64 big endian, and we are missing some for the 4 byte cases on ppc64le, but I've avoided adding them because of how much they will increase the size of rewritePPC64.go.

@mundaym
Copy link
Member

mundaym commented Aug 14, 2019

I agree with Lynn that reducing the amount of code generated for commutative operations would be really great and a big code size win. I tried writing generic rules for unaligned loads and the 64-bit load generated megabytes of code due to the 7 commutative ops. My attempt to use generic rules to match variable rotates also ran into this combinatorial explosion.

As well as the approach suggested we could also look at swapping arguments to commutative operations so that they are put into a canonical order (I think there was a CL where Ian mentioned this is how GCC deals with this issue).

Another approach we could investigate is swapping argument values in the generated code. For example, I think in a lot of cases we could generate code that looked like this (assuming v is commutative):

...
v1 := v.Args[0]
v2 := v.Args[1]
if v1.Op != OpBlah {
    v1, v2 = v2, v1
}
if v1.Op != OpBlah {
    break
}
...

This only works when we know that the arguments to a value have different opcodes though (i.e. if in the above example both v1 and v2 were OpBlah values we would need to traverse further through the graph before deciding whether to swap them).

@laboger
Copy link
Contributor

laboger commented Aug 22, 2019

I did try a change to add something to a rule to indicate that it should not commute its operands. It reduced the size of the rewritePPC64.go file by about 1/3. However I ran into 2 cases that no longer match because the order of operands did not match the order of operands from the source statement, so some kind of reordering would be needed. I do recall that @ianlancetaylor made a comment about putting arguments in canonical order for this situation but I'm not sure on the details of how to implement it. (Force them in the order that they appear in the original source code? Or some other ordering.)

@ianlancetaylor
Copy link
Contributor

Not the order of the original source code, no. The issue arises when trying to match sequences like (plus (plus a b) c). The idea is to pick some canonical order that you use when writing the rules, like, the deeper tree is always first. Then when comparing against the rules, you check the values, and if the deeper tree is second, you swap them before comparing. Of course this can only be used for commutative operators.

@laboger
Copy link
Contributor

laboger commented Aug 26, 2019

OK, got it. That's what Mike was probably trying to explain above but I didn't get it.

@mvdan mvdan added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels Sep 1, 2019
@gopherbot
Copy link

Change https://golang.org/cl/196498 mentions this issue: cmd/compile: reduce rulegen's output by 200 KiB

gopherbot pushed a commit that referenced this issue Sep 23, 2019
First, renove unnecessary "// cond:" lines from the generated files.
This shaves off about ~7k lines.

Second, join "if cond { break }" statements via "||", which allows us to
deduplicate a large number of them. This shaves off another ~25k lines.

This change is not for readability or simplicity; but rather, to avoid
unnecessary verbosity that makes the generated files larger. All in all,
git reports that the generated files overall weigh ~200KiB less, or
about 2.7% less.

While at it, add a -trace flag to rulegen.

Updates #33644.

Change-Id: I3fac0290a6066070cc62400bf970a4ae0929470a
Reviewed-on: https://go-review.googlesource.com/c/go/+/196498
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
@gopherbot
Copy link

Change https://golang.org/cl/196803 mentions this issue: cmd/compile: minor simplifications in rulegen

gopherbot pushed a commit that referenced this issue Sep 27, 2019
First, be consistent about declaring typ as &b.Func.Config.Types and
not &config.Types. Not particularly better, and it barely changes the
output, but we're more consistent now.

Second, remove a bit of duplication when handling the typ, auxint, and
aux variables.

Third and last, remove a stray canFail assignment; we ended up setting
that in add, not breakf, so it's not necessary to set it manually if we
don't use breakf.

Updates #33644.

Change-Id: I75999cb223a201969266fbfeae043599fa27fac5
Reviewed-on: https://go-review.googlesource.com/c/go/+/196803
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@mvdan
Copy link
Member Author

mvdan commented Nov 5, 2019

I don't intend to do any more work here for now, so I'm unassigning myself.

@laboger @mundaym do you want to reuse this issue for your idea above, or do you want to create a new issue? If you want to use a new issue, please close this one.

@mvdan mvdan removed their assignment Nov 5, 2019
@laboger
Copy link
Contributor

laboger commented Nov 6, 2019

I think we can open a new issue if we find a way to make the idea work. Unless @mundaym wants to keep this one open.

@mundaym
Copy link
Member

mundaym commented Nov 7, 2019

Happy to close this.

@mundaym mundaym closed this as completed Nov 7, 2019
@golang golang locked and limited conversation to collaborators Nov 6, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

6 participants