runtime: optimize memmove for 1-16 MB overlapping case on AMD64 #49058
Labels: compiler/runtime, NeedsInvestigation, Performance
What version of Go are you using (go version)?

Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?

What did you do?
I ran some test cases that call 'runtime.memmove' with overlapping source and destination addresses, for data sizes over 1 MB and below 16 MB.
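A minimal sketch of such a test case, not the original test code: it relies on the built-in copy for byte slices, which as far as I know is lowered to 'runtime.memmove'. The helper names (overlapCopy, timeOverlapCopy), the 64-byte shift, and the iteration count are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// overlapCopy moves the first n bytes of buf forward by shift bytes.
// Source and destination overlap whenever shift < n. The built-in copy
// has memmove semantics, so the result is correct despite the overlap.
func overlapCopy(buf []byte, n, shift int) int {
	return copy(buf[shift:shift+n], buf[:n])
}

// timeOverlapCopy returns the average duration of one overlapping copy
// of n bytes, measured over iters iterations.
func timeOverlapCopy(n, shift, iters int) time.Duration {
	buf := make([]byte, n+shift)
	for i := range buf {
		buf[i] = byte(i)
	}
	start := time.Now()
	for i := 0; i < iters; i++ {
		overlapCopy(buf, n, shift)
	}
	return time.Since(start) / time.Duration(iters)
}

func main() {
	const shift = 64 // hypothetical distance between source and destination
	for _, sizeMB := range []int{1, 2, 4, 8, 16} {
		n := sizeMB << 20
		d := timeOverlapCopy(n, shift, 50)
		fmt.Printf("%2d MB overlapping copy: %v per call\n", sizeMB, d)
	}
}
```

For stable numbers, a benchmark written with the standard testing package (go test -bench) would be a better fit than this wall-clock loop.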
What did you expect to see?
'runtime.memmove' chooses the more efficient of the two ways (non-temporal stores or temporal stores) to copy the data.
What did you see instead?
When the source and destination addresses overlap and the size is over 1 MB and below 16 MB, copying with non-temporal stores is slower than copying with temporal stores, but 'runtime.memmove' still chooses non-temporal stores.
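The affected case can be written down as a predicate. This is only a sketch of the condition described above, not the runtime's actual selection code: the names regionsOverlap and slowCaseReported are hypothetical, and the bounds are the 1 MB and 16 MB figures from this report.

```go
package main

import "fmt"

const (
	mb = 1 << 20
	// Bounds taken from this report: over 1 MB and below 16 MB.
	lowerBound = 1 * mb
	upperBound = 16 * mb
)

// regionsOverlap reports whether [dst, dst+n) and [src, src+n) share a byte.
func regionsOverlap(dst, src, n uintptr) bool {
	if dst > src {
		return dst-src < n
	}
	return src-dst < n
}

// slowCaseReported reports whether a copy falls into the case described in
// this issue: overlapping regions with a size over 1 MB and below 16 MB.
func slowCaseReported(dst, src, n uintptr) bool {
	return n > lowerBound && n < upperBound && regionsOverlap(dst, src, n)
}

func main() {
	// The addresses below are made-up values for illustration.
	fmt.Println(slowCaseReported(0x1000, 0x1040, 4*mb))      // overlapping, 4 MB
	fmt.Println(slowCaseReported(0x1000, 0x1000+8*mb, 4*mb)) // disjoint, 4 MB
	fmt.Println(slowCaseReported(0x1000, 0x1040, 32*mb))     // overlapping, 32 MB
}
```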