internal/cpu: MIPS[64] feature detection #26538

Closed
smasher164 opened this issue Jul 22, 2018 · 20 comments


@smasher164
Member

This might already be in the works, but certain mathematical functions can be sped up by taking advantage of built-in MIPS instructions. In my case, an FMA intrinsic (#25819) can use MADDF.D fd, fs, ft, which is available to a subset of mips64r6. The cp0.Config0, cp0.Config1, and cp1.FIR registers contain the necessary information to detect this feature. Unfortunately, reading from cp0 requires the privileged instruction mfc0.

That means that the following code to detect FMA support on mips64 will receive a SIGILL on the marked line:

TEXT ·isFMASupported(SB),NOSPLIT,$0
#ifndef GOMIPS64_softfloat
    // CP0.Config0[14:13] == 0b10 <==> Release 6
    MOVW M16, R8 // <-- mfc0 causes a SIGILL
    SRL $13, R8
    AND $3, R8
    MOVW $2, R9
    BNE R9, R8, nosupport
    // CP0.Config1[1:0] == 1 <==> FPU enabled
    // mfc0 8, 16, 1
    WORD $0x40088001
    AND $1, R8
    BEQ R0, R8, nosupport // test the FP bit just masked into R8 (R9 still holds the constant 2)
    // CP1.FIR[18:17] == 1 <==> Double-precision operations supported
    // CFC1 $0, R8
    MOVW FCR0, R8
    SRL $17, R8
    AND $1, R8
    MOVB R8, ret(FP)
nosupport:
#endif
    RET

If internal/cpu exports the required Config registers as described in sections 9.42 and 9.43 in https://s3-eu-west-1.amazonaws.com/downloads-mips/documents/MD00091-2B-MIPS64PRA-AFP-05.04.pdf, runtime specialization of MIPS code will become much simpler.

@randall77
Contributor

I'm a bit confused as to how internal/cpu is going to run the privileged instruction successfully. It runs at the same privilege as your example code does.

@smasher164
Member Author

I should clarify that internal/cpu would use the HWCAP bits to provide this info.

@randall77
Contributor

What package(s) are you intending to use this in?
The reason I ask is because if it's only a single package, it might be better for this code to live in that package. Only if multiple packages want it would it be better off in internal/cpu.

@smasher164
Member Author

The use-case is for the math package, where I'm working on an FMA intrinsic. There's a lot of room for specialization though, since almost none of the functions use a built-in instruction. crypto can benefit as well, since SIMD also needs to be detected by accessing cp0.

@martisch
Contributor

martisch commented Jul 23, 2018

I think all CPU feature detection code should live in internal/cpu. It provides a single place to turn features off, so it can be used to benchmark and test different code paths, and it consolidates all feature detection code in the standard library and runtime. Starting in internal/cpu also avoids having to move code later, or missing that the detection is already implemented elsewhere. Some features in internal/cpu, e.g. HasFMA, are already used only in math. Some architectures, e.g. arm, arm64 and ppc64, use hwcap in internal/cpu to detect features, so mips and mips64 support can be added to internal/cpu in the same manner once math needs it. However, only features that are used by the runtime, the compiler, or the standard library should be added.

@smasher164
Member Author

Here is a working set that can be covered by AT_HWCAP, the FIR register, and some heuristics.

// From AT_HWCAP
IsRelease6 (more portably detected via https://github.com/v8mips/v8mips/issues/97).
HasSIMD
HasCRC32

// From FIR
Has3DASE
HasPairedSingle
HasDoublePrecision
HasSinglePrecision
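
The FIR-derived flags listed above amount to a plain bit decode once the register value is available. The sketch below assumes the bit positions S = 16, D = 17, PS = 18 and 3D = 19 from the MIPS architecture manuals; the type `fpFeatures` and the function `decodeFIR` are hypothetical names, and how the raw FIR value is obtained (e.g. a `CFC1` from assembly) is left out.

```go
package main

import "fmt"

// FIR bit positions (assumed from the MIPS architecture manuals).
const (
	firS  = 1 << 16 // single-precision operations implemented
	firD  = 1 << 17 // double-precision operations implemented
	firPS = 1 << 18 // paired-single operations implemented
	fir3D = 1 << 19 // MIPS-3D ASE implemented
)

// fpFeatures mirrors the flag names proposed in the thread.
type fpFeatures struct {
	HasSinglePrecision bool
	HasDoublePrecision bool
	HasPairedSingle    bool
	Has3DASE           bool
}

// decodeFIR extracts the feature bits from a raw FIR value.
func decodeFIR(fir uint32) fpFeatures {
	return fpFeatures{
		HasSinglePrecision: fir&firS != 0,
		HasDoublePrecision: fir&firD != 0,
		HasPairedSingle:    fir&firPS != 0,
		Has3DASE:           fir&fir3D != 0,
	}
}

func main() {
	// A made-up FIR value with only the S and D bits set.
	fmt.Printf("%+v\n", decodeFIR(0x00030000))
}
```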

What should we return for users who have enabled an environment variable, i.e. softfloat? Does internal/cpu accommodate envvars or should standard library functions that use internal/cpu accommodate envvars?

That raises the question:

HasFP <-- Is this necessary since we assume that FP operations work?

@martisch
Contributor

martisch commented Jul 23, 2018

Up until now, internal/cpu has tried to name features after what Linux reports for CPU flags (e.g. in /proc/cpuinfo) or after the hwcap names.

e.g. for mips and hwcap I find only these:
https://github.com/torvalds/linux/blob/master/arch/mips/include/uapi/asm/hwcap.h

There is only one env var that is relevant to internal/cpu at the moment and that is GODEBUGCPU to mask cpu feature detection for debugging and benchmarking. f045ddc

I don't think any other setting should affect internal/cpu feature variables at the moment. If a setting does not come from HWCAP or from the CPU's feature flag registers, it is likely not something that should be in internal/cpu (e.g. whether the user has enabled softfloat as a compilation option).
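
To make the masking idea concrete, here is a small sketch of how a debug variable could switch detected features off without ever switching undetected ones on. The comma-separated `name=off` format, the `features` map, and the `applyMask` helper are assumptions for illustration; this is not the actual GODEBUGCPU implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// features is a hypothetical stand-in for the detected feature variables.
type features map[string]bool

// applyMask returns a copy of detected with every feature named in a
// "name=off,name=off" string turned off. Unknown names and values other
// than "off" are ignored, so the mask can only disable features.
func applyMask(detected features, env string) features {
	out := make(features, len(detected))
	for name, on := range detected {
		out[name] = on
	}
	for _, field := range strings.Split(env, ",") {
		name, value, ok := strings.Cut(field, "=")
		if !ok || value != "off" {
			continue
		}
		if _, known := out[name]; known {
			out[name] = false
		}
	}
	return out
}

func main() {
	detected := features{"r6": true, "msa": true}
	fmt.Println(applyMask(detected, "msa=off"))
}
```

Keeping the mask separate from detection means a compile-time choice such as softfloat never has to be modeled here, which matches the point made above.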

Note that some feature variables are assumed to always be true, but we still have to check them in case we start on a CPU that does not support, e.g., FP. You can assume that they are always true in code, but we should have a test that makes sure HasFP is indeed detected as true if, e.g., HWCAP supplies that feature.

Could you please clarify what you meant by:
"should standard library functions that use internal/cpu accommodate envvars?"

@smasher164
Member Author

Although HasFP is not in HWCAP, it is in the cp0.Config register that can only be read from a privileged instruction. Setting HasFP would need a different method, possibly:

  • Issuing a read instruction to FIR and catching a SIGILL.
  • Parsing the output of /proc/cpuinfo.

I meant to ask whether stdlib assembly functions should continue to check for envvars, but the question doesn't make much sense, since the macros are compile-time detection.

@martisch
Contributor

I think neither parsing /proc/cpuinfo nor running into SIGILL is a good fit for internal/cpu, given the early stage at which internal/cpu is initialized during runtime startup. So far, the only features that internal/cpu seems able to detect reliably and without privileged instructions are the ones mentioned for HWCAP.

@smasher164
Member Author

Fair enough, although given the alternative way to determine that a CPU is MIPS Release 6, we can probably hold off on MIPS feature detection. The following code should do the trick of detecting FMA on big-endian systems:

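TEXT ·isFMASupported(SB),NOSPLIT,$0
	MOVV R0, R2 // probe register for the Release 6 check
	MOVV R0, R3 // result; stays 0 unless every check passes
#ifndef GOMIPS64_softfloat
	// Detect Release 6. ADDI < R6 == BOVC on R6.
	// See https://github.com/v8mips/v8mips/issues/97#issue-44761752
	// Before R6 this word is ADDI R2, R2, 1, so R2 becomes 1.
	WORD $0x20420001
	BNE R0, R2, nosupport
	// Detect double-precision: CP1.FIR bit 17.
	MOVW FCR0, R3
	SRL $17, R3, R3
	AND $1, R3, R3
nosupport:
#endif
	// The result lives in R3 so the pre-R6 path does not return the probe's 1.
	MOVV R3, ret+0(FP)
	RET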

@martisch
Contributor

If the above isFMASupported code works in a normal Go program and on every MIPS CPU, then the two features mentioned in it should be added to and used from internal/cpu. Release 6 detection seems to be available through HWCAP. If FCR0 is readable by a normal Go program, then the features detected through it can also be added to internal/cpu once needed.

@smasher164
Member Author

smasher164 commented Jul 26, 2018

Cool! I can work on a CL for mips[64][le] that incorporates HWCAP and FCR0. I'll open a similar issue for ARM as well.

@gopherbot

Change https://golang.org/cl/126657 mentions this issue: internal/cpu: expose mips[le][64] feature flags for FMA

@milanknezevic
Contributor

@smasher164 This kind of detection can be a little overkill on mips[64]r6, if not impossible. R6 is not a backward-compatible release, so binaries built for earlier releases can't run on R6. There are plans to make mips[64]r6 support available through the GOMIPS[64] environment variable. You can read more about this here.

@smasher164
Member Author

@milanknezevic Looking at the latest MIPS ISA, there is a large overlap between R6 and previous releases, even if there is backwards-incompatibility. The only instruction-based detection that is done for now is to read a value from the floating-point implementation register, which has existed since release 1. If we want to specialize mips code without sub-arch flags like in the thread you mentioned, runtime detection is necessary.

That said, given the scarcity of Release 6 processors, holding off on R6-specific code isn't a bad option.

@smasher164
Member Author

CL 200579 from @mengzhuo adds preliminary support for MIPS64x feature detection.

Also, as noted in #35008 (comment), detecting FMA support has effectively been solved since MIPSIII via the floating-point implementation register (fcr0).

@gopherbot

Change https://golang.org/cl/280353 mentions this issue: internal/cpu: add HasR6 and HasF64 for mips64 and mips64le

@mengzhuo
Contributor

mengzhuo commented Jan 5, 2021

@smasher164

https://s3-eu-west-1.amazonaws.com/downloads-mips/documents/MD00926-2B-MSA-WHT-01.03.pdf

There is a Multiply-Add (vector) intrinsic that requires only MSA.

AFAIK the MSA vector registers are shared with the FPU registers, so maybe we could try that instead of adding an F64 feature flag.

@smasher164
Member Author

smasher164 commented Jan 5, 2021

Good idea. I'll take a look at this.
Question: Does this mean that MIPS has multiple FMA instructions? R6 w/ F64 uses MADDF and MSA has FMADD_DF.

Edit: Abandoned the above change since we already check for MSA. I will try to get in an FMA intrinsic for MIPS in the 1.17 cycle based on HasMSA.

@smasher164
Member Author

I will also close this issue, since FMA-related feature detection has technically already been done. Future feature detection work can get its own issues.

@golang golang locked and limited conversation to collaborators Jan 7, 2022