cmd/compile: assigning large values does not use memmove #10362
Comments
This benchmark shows the cliff when values pass the upper limit of DUFFCOPY: http://paste.ubuntu.com/10762232/
|
6g and 8g use REP with MOVSL/MOVSQ, which I believe @randall77 determined to be faster around that threshold. I would believe that the other architectures could benefit from a call to memmove or something similar. (This is a place where NEON should shine.) |
[Please don't use { } syntax in bug headings. It doesn't sort well.] This may apply to some subset of the non-x86 systems. |
We need a memmove that takes its arguments from registers rather than
from the stack (i.e. duffcopy style); otherwise the optimization
won't work.
|
538->745 is hardly a "cliff". I'm surprised it is so close given the lack of anyone tuning this mechanism on arm. (Or did I miss someone doing that?) minux is right, the moves generated here are sometimes used to marshal arguments to a function, so we can't call a function to do the marshaling. For other situations like your a=b example you could call memmove. It might take some work to distinguish those two cases, however. At the move generation point the marshaling has already been turned into a=b assignments. |
Consider this piece of code
The assignments of a = b or b = a, where the size of a or b is above the DUFFCOPY limit of 128 words, produce some very simplistic code.
Should sgen/stackcopy take the opportunity to set up a call to runtime.memmove for values larger than 128 words?
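A minimal sketch of the kind of assignment in question, under stated assumptions: the type name big, the size of 129 words, and the helpers assign and copyBuiltin are hypothetical and chosen only to sit just above the 128-word DUFFCOPY limit mentioned above. It contrasts a plain assignment with the builtin copy over slices of the same arrays, which the compiler typically lowers to a runtime.memmove call; whether that holds for a given toolchain and architecture should be checked in the generated assembly.

```go
package main

import "fmt"

// words is a hypothetical size, one word above the 128-word DUFFCOPY
// limit discussed in this issue.
const words = 129

// big is a hypothetical value type occupying `words` machine words.
type big [words]uintptr

// assign copies by plain assignment (the a = b case from the issue).
//
//go:noinline
func assign(dst, src *big) { *dst = *src }

// copyBuiltin copies the same bytes through the builtin copy on slices
// of the arrays. This is an illustrative alternative, not the compiler
// change the issue proposes.
//
//go:noinline
func copyBuiltin(dst, src *big) { copy(dst[:], src[:]) }

func main() {
	var a, b, c big
	for i := range b {
		b[i] = uintptr(i)
	}

	assign(&a, &b)
	copyBuiltin(&c, &b)

	// Arrays of comparable elements can be compared with ==.
	fmt.Println("assignment matches source:", a == b)
	fmt.Println("builtin copy matches source:", c == b)
}
```

To see which instruction sequence each helper compiles to on a given architecture, the assembly can be printed with go build -gcflags=-S.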