-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: large number of NOP instructions in crypto/md5.blockGeneric #31545
Comments
Dup of #29571 ? |
Yes, this is a dup of #29571. I had hoped CL 170445 would fix instances like this, but it looks like it hasn't. |
The reason CL 170445 didn't work is that these nops really are needed for correct tracebacks. We could give the single instruction the line number of the parent, which would then obviate the need for the nop. I'm not sure what a principled method for making that happen would be, though. At the time of inlining, it isn't obvious that the inlined body will reduce to a single instruction (and it won't, on architectures that can't do unaligned loads). |
In DWARF a single instruction can represent a stack of inlined function calls. So a profiling sample at that instruction would carry both the location in |
The Go inline markers can only represent a single frame. |
I'm sure I'm misunderstanding this, but if each inline mark represents a different call site, then doesn't that imply that the inlined call was completely eliminated? I don't see a reason to keep an instruction around to represent a call that was optimized away. |
The body of the inlined call compiles to a single instruction. That instruction has a pc/line of the body of the inlined call. That instruction has a "parent" pointer to a nop instruction, which has a pc/line of the call site. |
It seems to me that if we could attach a call stack to an instruction, then the single instruction of the inlined call would describe both the location of the inlined function and the location of the call to the inlined function, and the separate nop instruction would not be required. |
True, one way to fix this would be to allow multiple locations to be attached to a single instruction. That's currently not possible. |
There seem to be a large number of contiguous NOP instructions inserted into the main loop inside
md5.blockGeneric
. Looks like they are markers for inlinedbinary.LittleEndian.UInt32
calls. I haven't checked the performance implications (probably doesn't make a huge difference) but it would still be good to remove these if we can:/cc @randall77
The text was updated successfully, but these errors were encountered: