-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: SSA sometimes generates far more write barriers #14869
Comments
For cmd/internal/obj/ppc64.buildop at least, we're using writebarrierptr instead of typedmemmove for a slice assignment. |
You're right. Second column is diff in typedmemmove calls:
Looks like SSA is doing much better than the old backend in general. The only weird one is decomposeBuiltIn, where SSA generating two writebarrierptrs for each mapaccess/append/mapassign line instead of one. |
Looks like decomposeBuiltIn is generating a write barrier for a write to the stack. I'll see if I can repro in an independent test case. |
I did a comparison of the write barriers produced when compiling cmd/compile with and without SSA. Sometimes it generates dramatically more write barriers than the old backend. Here are the top movers:
+235 cmd/internal/obj/ppc64.buildop(SB)
+72 cmd/internal/obj/mips.buildop(SB)
+55 cmd/internal/obj/arm.buildop(SB)
+14 cmd/compile/internal/amd64.ssaGenBlock(SB)
+10 cmd/compile/internal/ssa.rewriteValuegeneric_OpStructSelect(SB)
+9 cmd/compile/internal/ssa.decomposeBuiltIn(SB)
+8 cmd/compile/internal/ssa.nilcheckelim(SB)
+6 runtime.getitab(SB)
+5 runtime.(_TypeAssertionError).Error(SB)
+4 runtime.assertI2T(SB)
+4 reflect.Value.MapKeys(SB)
+4 fmt.(_pp).printReflectValue(SB)
+4 cmd/compile/internal/ssa.(_Value).LongString(SB)
+4 cmd/compile/internal/ssa.(_edgeState).setup(SB)
+3 runtime.assertE2T(SB)
+3 reflect.(*structType).FieldByNameFunc(SB)
+3 cmd/internal/obj.Linknew(SB)
+3 cmd/compile/internal/gc.typecheck1(SB)
+3 cmd/compile/internal/gc.Main(SB)
I looked into a few of these and it looks like the old backend is generating one call to typedmemmove in cases where SSA is generating a huge number of calls to writebarrierptr. In the smaller movers like runtime.getitab this may actually be the right thing; if there are just a few pointers, writebarrierptr is better. But 235 more write barriers is probably not the right thing.
This was generated with the following script:
/cc @khr @dr2chase
The text was updated successfully, but these errors were encountered: