Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/crypto/chacha20poly1305: unconditional use of PSHUFB instruction on amd64 #63871

Open
rsc opened this issue Nov 1, 2023 · 4 comments
Open
Assignees
Labels
NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@rsc
Copy link
Contributor

rsc commented Nov 1, 2023

It looks like golang.org/x/crypto/chacha20poly1305/chacha20poly1305_amd64.s uses PSHUFB instructions unconditionally, even when built for GOAMD64=v1. PSHUFB is part of SSSE3 which is only v2+. In my version of similar code for chacha8rand I didn't want the overhead of two copies of the code and a runtime switch, so I just did

// ROL16 rotates the uint32s in register R left by 16, using temporary T if needed.
#ifdef GOAMD64_v2
#define ROL16(R, T) PSHUFB ·rol16<>(SB), R
#else
#define ROL16(R, T) ROL(16, R, T)
#endif

// ROL8 rotates the uint32s in register R left by 8, using temporary T if needed.
#ifdef GOAMD64_v2
#define ROL8(R, T) PSHUFB ·rol8<>(SB), R
#else
#define ROL8(R, T) ROL(8, R, T)
#endif

That may be fine for this code too, since newer x86 chips are going to use the AVX code path anyway.

@rsc rsc added the NeedsFix The path to resolution is known, but the work has not been done. label Nov 1, 2023
@rsc rsc added this to the Backlog milestone Nov 1, 2023
mauri870 added a commit to mauri870/crypto that referenced this issue Nov 1, 2023
The PSHUFP instruction is part of SSE which is only v2+ but it is being
used without the GOAMD64_v2 guard.

The ROL routines were copied from Go's internal/chacha20poly1305
package.

Fixes golang/go#63871
@gopherbot
Copy link

Change https://go.dev/cl/538786 mentions this issue: chacha20poly1305: guard PSHUFP usage with GOAMD64_v2

@randall77
Copy link
Contributor

Does this need backporting to 1.20 and 1.21?
I think it does. Reopening to make this decision.

@randall77 randall77 reopened this Nov 7, 2023
@rfjakob
Copy link

rfjakob commented Mar 3, 2024

Hi, I have a user reporting a 15% performance regression in gocryptfs due to this change ( rfjakob/gocryptfs#828 ).

The CPU is cpu: Intel(R) Core(TM) i3-2100 CPU @ 3.10GHz so I guess too old for AVX but it does have PSHUFB.

Can PSHUFB be detected at runtime instead?

@mauri870
Copy link
Member

mauri870 commented Mar 7, 2024

Hi @rfjakob, thanks for reporting this issue. I think having a feature detection for PSHUFB was not considered, it is simply grouped with other amd64v2 features. Do you mind opening a fresh new issue to discuss this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants