proposal: runtime: add support for the Arm’s ArmV8.5-A Memory Tagging Extension (MTE) in Go #59090

Open
zhangfannie opened this issue Mar 17, 2023 · 10 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Proposal Proposal-Hold

@zhangfannie
Contributor

zhangfannie commented Mar 17, 2023

What is Memory Tagging Extension (MTE)?

MTE was first introduced as part of the Armv8.5-A instruction set in August 2019 and was built into the first Armv9-compliant CPUs, announced in May 2021. It is a security architecture feature that detects and prevents memory safety vulnerabilities both before and after deployment.

Why is MTE needed for Go?

Go has structs whose memory layout is visible to developers. It also provides pointers for accessing memory directly. This gives developers a lot of flexibility and makes interaction with C code easier, but it also opens the door to mistakes in how memory is used. So it is necessary to enable a memory error detector in Go.

We have deployed ASan, a software memory error detector that consists of a compiler instrumentation module and a runtime library. Its disadvantage is a heavy performance cost (it is roughly 2x slower), which makes it unsuitable for widespread deployment. MTE reduces this overhead (estimated at 5% in asynchronous mode and 25% in synchronous mode) while still providing a mechanism to detect out-of-bounds and use-after-free bugs in production code with no instrumentation.

How does MTE work?

At a high level, MTE tags each memory allocation/deallocation with additional metadata. It allocates a tag to a memory location, which then can be associated with pointers that reference that memory location. At runtime the CPU checks that the pointer and the metadata tags match on each load and store.

There are two types of tagging. Address Tagging stores a four-bit tag in the low four bits of the top byte of every pointer. Memory Tagging associates a four-bit tag with every aligned 16-byte region in the application’s memory space. To hold the address tag without requiring larger pointers, MTE relies on the Top Byte Ignore (TBI) feature of the Arm 64-bit architecture, so it only works with 64-bit applications.
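The bit layout described above can be sketched in plain Go arithmetic. This is only an illustration that treats addresses as uint64 values; the helper names (withTag, tagOf, stripTag, granuleIndex) are hypothetical and not part of the proposal or the runtime.

```go
package main

import "fmt"

const (
	tagShift    = 56                      // the address tag sits in bits 56..59
	tagMask     = uint64(0xf) << tagShift // 0x0f00000000000000
	granuleSize = 16                      // one memory tag per aligned 16-byte granule
)

// withTag returns addr with its address tag replaced by the low 4 bits of tag.
func withTag(addr uint64, tag uint8) uint64 {
	return (addr &^ tagMask) | uint64(tag&0xf)<<tagShift
}

// tagOf extracts the 4-bit address tag from addr.
func tagOf(addr uint64) uint8 {
	return uint8((addr & tagMask) >> tagShift)
}

// stripTag clears the address tag; it is equivalent to addr & 0xf0ffffffffffffff.
func stripTag(addr uint64) uint64 {
	return addr &^ tagMask
}

// granuleIndex returns which 16-byte granule the untagged address falls in.
func granuleIndex(addr uint64) uint64 {
	return stripTag(addr) / granuleSize
}

func main() {
	p := withTag(0x400000123450, 0xA) // arbitrary example address and tag
	fmt.Printf("tagged=%#x tag=%#x untagged=%#x granule=%#x\n",
		p, tagOf(p), stripTag(p), granuleIndex(p))
}
```

On real hardware the CPU compares the address tag with the memory tag of the granule being accessed; the sketch only shows where those four bits live.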

MTE offers a tuneable level of performance, protection, and precision at runtime. It has four operating modes: OFF, Asynchronous, Synchronous, and Asymmetric (added in MTE3; for more information, see chapter D9 in the Arm Architecture Reference Manual for A-profile architecture).

The details are explained in Armv8.5-A Memory Tagging Extension.

Prototype heap tagging in Go.

glibc 2.33 supports MTE on arm64, but only for userspace heap tagging; see https://elixir.bootlin.com/glibc/glibc-2.33/source/sysdeps/aarch64. glibc enables MTE through the memory tunable glibc.mem.tagging, which takes a value between 0 and 255 and acts as a bitmask that enables various capabilities. See Memory Related Tunables for details.

Following the glibc implementation, we plan to enable MTE in Go with an environment variable as well, named GOMTE, which has the same values and capabilities as glibc’s memory tunable. GOMTE behaves as follows:

  • The default value is 0, which disables all memory tagging.
  • The Go runtime will enable memory tagging support if GOMTE has any non-zero value.
  • It only controls how Go code uses the MTE feature.
  • It is only supported on AArch64 systems with the MTE feature; it is ignored for all other systems.
  • It cannot be used with other sanitizers (-asan, -msan or -race).
  • For cgo, the user needs to set glibc.mem.tagging for C code and GOMTE for Go code separately.
  • If GOMTE is inconsistent with glibc.mem.tagging, the Go runtime will print a warning message and MTE will still be enabled.

Note: Given the cgo situation, both glibc and the Go runtime may configure the MTE protection level for threads. If the configurations are inconsistent, some threads may be protected at a different level than the developer expected. To avoid confusion, we prefer to keep the GOMTE definition aligned with the glibc.mem.tagging definition. Because glibc tunables are not guaranteed to be stable, the GOMTE definition may also change in the future to match the behavior of glibc.
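A minimal sketch of the behavior listed above, assuming GOMTE is a decimal value in the range 0 to 255 and that glibc.mem.tagging is passed through the GLIBC_TUNABLES environment variable as a colon-separated name=value list. The functions parseTagging and glibcTagging are hypothetical names, and the sketch checks only the architecture, not the hardware MTE feature.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// parseTagging parses a tagging value in the range 0..255.
// An empty string means the default, 0 (tagging disabled).
func parseTagging(s string) (uint8, error) {
	if s == "" {
		return 0, nil
	}
	v, err := strconv.ParseUint(s, 10, 8)
	if err != nil {
		return 0, fmt.Errorf("invalid tagging value %q: %w", s, err)
	}
	return uint8(v), nil
}

// glibcTagging looks up glibc.mem.tagging in a GLIBC_TUNABLES string.
func glibcTagging(tunables string) (string, bool) {
	const key = "glibc.mem.tagging="
	for _, kv := range strings.Split(tunables, ":") {
		if strings.HasPrefix(kv, key) {
			return strings.TrimPrefix(kv, key), true
		}
	}
	return "", false
}

func main() {
	if runtime.GOARCH != "arm64" {
		fmt.Println("GOMTE ignored: not an AArch64 system")
		return
	}
	gomte, err := parseTagging(os.Getenv("GOMTE"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	// Warn on a mismatch with glibc, but keep going as the proposal describes.
	if v, ok := glibcTagging(os.Getenv("GLIBC_TUNABLES")); ok {
		if g, err := parseTagging(v); err == nil && g != gomte {
			fmt.Fprintf(os.Stderr, "warning: GOMTE=%d differs from glibc.mem.tagging=%d\n", gomte, g)
		}
	}
	fmt.Println("memory tagging requested:", gomte != 0)
}
```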

The implementation:

  • Add CPU feature detection to check whether the hardware supports MTE.
  • Add an environment variable GOMTE.
  • Enable memory tagging and set the tag check fault mode based on the value of GOMTE.
  • Modify the allocator when tagging is needed.
    • Allocate a 16-byte aligned pointer.
    • Tag the pointer and the memory.
  • Modify the deallocator when tagging is needed.
    • Re-tag all freed memory areas with zero tagging.

Any feedback on the current design is welcome. Thank you.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Mar 17, 2023
@seankhliao seankhliao changed the title runtime: add support for the Arm’s ArmV8.5-A Memory Tagging Extension (MTE) in Go proposal: runtime: add support for the Arm’s ArmV8.5-A Memory Tagging Extension (MTE) in Go Mar 17, 2023
@gopherbot gopherbot added this to the Proposal milestone Mar 17, 2023
@ianlancetaylor
Contributor

I'm having a hard time seeing how this can work with the garbage collector. The garbage collector must obviously be able to examine any memory location. Is the intent that the garbage collector record the tag for each page, and use that tag to construct a pointer when examining that page?

Alternatively, since Go is already a memory-safe language, should we simply reserve a memory tag for all Go memory allocations? That would let C code check its use of Go pointers, and let Go code check its use of C pointers.

@zhangfannie
Contributor Author

@ianlancetaylor Yes, you are right. When MTE is enabled, the garbage collector needs special handling of pointers (removing and re-adding the address tag). For example, a pointer's address tag should be removed when the GC checks whether it points to a heap-allocated object, and the tag should be added back before an underlying pointer is loaded through it.

As for the implementation, p & 0xf0ffffffffffffff removes the address tag of a pointer p. To add the address tag to a pointer p, we have the following three options:

  1. Load the memory tag from the memory location referenced by p, and add it to p.
  2. Do not generate the address tag for allocations randomly; instead, calculate it from a hash of the start address of a span. We can then get the start address of the span containing p and calculate the address tag for p.
  3. Just reserve a single memory/address tag for all Go memory allocations.

These three options offer different levels of performance, protection, and precision. Option 1 would be incorrect if the pointer p is invalid. Option 2 has higher precision but weaker performance. Option 3 has strong performance but weaker protection. We are still considering which option is best. Thank you.

@rsc
Contributor

rsc commented May 10, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc
Contributor

rsc commented May 17, 2023

The discussion above is mostly about mechanics. That's about how we would add MTE.

The unanswered question is why we would add MTE. It seems like there are three parts to that question:

  1. What must Go do to play nicely with glibc as far as cgo uses are concerned?
  2. What opportunities do we have to use MTE to better isolate Go memory from C memory?
  3. What opportunities do we have to use MTE in the Go runtime to our own benefit? (We could do what is described above for allocation, but what benefits would that give the Go runtime? In C the benefit is catching dangling pointers. Go programs have a GC, so they almost never have dangling pointers.)

Does anyone want to take a stab at answering any of these?

@zhangfannie
Contributor Author

zhangfannie commented May 25, 2023

@rsc Sorry for the late reply.

  1. What must Go do to play nicely with glibc as far as cgo uses are concerned?

In terms of implementation, we only need to modify the allocator and deallocator of Go. For cgo, no special handling is required. See the heap tagging implementation described in the proposal above.

As for the MTE user interface, both glibc and the Go runtime may configure the MTE protection level for threads. If the configurations are inconsistent, some threads may be protected at a different level than the developer expected. To avoid confusion, we prefer to keep the GOMTE definition aligned with the glibc.mem.tagging definition. Because glibc tunables are not guaranteed to be stable, the GOMTE definition may also change in the future to match the behavior of glibc.

  2. What opportunities do we have to use MTE to better isolate Go memory from C memory?

With MTE, memory is naturally isolated, but because the tag bits are random, the isolation does not follow the boundary between Go and C. Still, if MTE is enabled and C code accesses Go memory out of bounds, the tag bits will very likely differ, which causes MTE to report an error.

  3. What opportunities do we have to use MTE in the Go runtime to our own benefit? (We could do what is described above for allocation, but what benefits would that give the Go runtime? In C the benefit is catching dangling pointers. Go programs have a GC, so they almost never have dangling pointers.)

Go pointers should be abused less often. But we found three cases of misuse in the top 20 GitHub Go projects just by running their stock tests with the -asan option. The section Why is MTE needed for Go? in the proposal above mentions some benefits.

As for whether the Go runtime can use MTE for optimization, one idea is that during GC, the address tag could be used to determine whether a pointer was allocated on the heap.

Thank you.

@ianlancetaylor
Contributor

In terms of implementation, we only need to modify the allocator and deallocator of Go.

Can you be more specific as to exactly how the allocator and deallocator should be modified? And are you referring to the allocator that allocates a single Go object, or are you referring to the allocator that allocates a set of memory pages, or what? Thanks.

@zhangfannie
Contributor Author

In terms of implementation, we only need to modify the allocator and deallocator of Go.

Can you be more specific as to exactly how the allocator and deallocator should be modified? And are you referring to the allocator that allocates a single Go object, or are you referring to the allocator that allocates a set of memory pages, or what?

  1. The allocator still allocates a single Go object.

  2. The allocator and deallocator change as follows:

  • Allocate a 16-byte aligned object.
    The Memory Tagging Extension (MTE) is built on top of the ARMv8.0 virtual address tagging feature, TBI (Top Byte Ignore), and lets software access a 4-bit allocation tag for each 16-byte granule in the physical address space. Current allocations are not 16-byte aligned, so we need to change that.

  • Tag the pointer and memory.
    These are the two types of tagging mentioned above. Armv8.5-A has extra instructions to manipulate the tag bits in pointers. IRG chooses a random tag and inserts it into an address. STG sets tags on memory locations using the tag in the address. The details are explained in
    https://developer.arm.com/-/media/Arm%20Developer%20Community/PDF/Arm_Memory_Tagging_Extension_Whitepaper.pdf .

  • Re-tag all freed memory areas with zero tagging.

Thank you.

@rsc
Contributor

rsc commented May 31, 2023

@zhangfannie, can you provide details about the kinds of problems you found with -asan?

There are various pieces of the runtime that clearly need to be aware of MTE, especially if we are coordinating with C.

What's not clear to me is whether we need GOMTE as a separate variable. If MTE is good, shouldn't we just turn it on any time it is available? Why would we give users control over this or initiate the use of MTE? (If we expect bugs we could always add a GODEBUG=mte=off to let users override it.)

It sounds like from the runtime side we don't really know how big the changes are, and we'd like to hold the decision for understanding how invasive this all is. Do you have a prototype CL or any sense of what the changes involve and how invasive they are?

Perhaps it would make sense to put this on hold until we have a prototype CL?

@zhangfannie
Contributor Author

zhangfannie commented Jun 13, 2023

@rsc Sorry for the late reply.

@zhangfannie, can you provide details about the kinds of problems you found with -asan?

We found problems in three different projects, but we are not familiar with these projects and did not spend much time on the details. We filed the issues we found; see the issues below for details.
pingcap/tidb#32952
grafana/grafana#46395
go-delve/delve#2919

What's not clear to me is whether we need GOMTE as a separate variable. If MTE is good, shouldn't we just turn it on any time it is available? Why would we give users control over this or initiate the use of MTE? (If we expect bugs we could always add a GODEBUG=mte=off to let users override it.)

I do not know whether turning MTE on all the time is good behavior, because MTE has overhead. Tags add about 3% memory overhead. As for performance, a document introducing MTE mentions a speculative expectation of less than 5% overhead on average in asynchronous mode, but this needs to be measured once MTE is enabled.

We designed an environment variable to control MTE following the glibc implementation. Because Go interacts with C, it is best to be consistent.

As for the user interface of MTE in Go, this can be discussed.

It sounds like from the runtime side we don't really know how big the changes are, and we'd like to hold the decision for understanding how invasive this all is. Do you have a prototype CL or any sense of what the changes involve and how invasive they are?

We found four main places where the runtime needs invasive changes.

The first is the garbage collector. When MTE is enabled, the garbage collector needs special handling of pointers (removing or re-adding the address tag). For example:

  • A heap pointer's address tag should be removed when the GC checks whether it is a heap-allocated object by calling spanOf(), at line 682 of mheap.go.
func spanOf(p uintptr) *mspan {
	p &= 0xf0ffffffffffffff // Proposed addition: remove the address tag from this pointer.
	...
}
  • The pointer's address tag should be re-added before an underlying pointer is loaded through it. See scanobject(), at line 1321 of mgcmark.go.
func scanobject(b uintptr, gcw *gcWork) {
	...
	addr = reAddTag(addr) // Proposed addition: re-add the address tag to this pointer.
	obj := *(*uintptr)(unsafe.Pointer(addr))
	...
}

There are many places in the GC process that need to add such handling.

The second is lfstack packing and unpacking.
The current pack process uses the top 16 bits of a 64-bit address, which conflicts with the 4-bit MTE tag. We should remove the address tag during packing, and then re-add it during unpacking when the node is heap allocated.
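The conflict can be seen in a simplified model of the packing scheme. This sketch assumes 48-bit addresses and 8-byte-aligned nodes, and uses plain uint64 values instead of real pointers; pack and unpack are illustrative and are not the runtime's actual functions.

```go
package main

import "fmt"

const (
	addrBits = 48                // assumed number of usable address bits
	cntBits  = 64 - addrBits + 3 // low 3 bits of an 8-byte-aligned address are free
)

// pack shifts the address into the upper bits and keeps a counter in the
// lower bits. The shift discards the top 16 bits, including any MTE tag.
func pack(addr, cnt uint64) uint64 {
	return addr<<(64-addrBits) | cnt&(1<<cntBits-1)
}

// unpack recovers the 48-bit address; a tag that was present is lost.
func unpack(v uint64) uint64 {
	return v >> cntBits << 3
}

func main() {
	untagged := uint64(0x400000123458)
	tagged := untagged | 0xA<<56 // same address with an example tag of 0xA

	got := unpack(pack(tagged, 7))
	fmt.Printf("packed %#x, unpacked %#x\n", tagged, got)
	fmt.Println("round trip preserved the tag:", got == tagged)
	fmt.Println("round trip matches the untagged address:", got == untagged)
}
```

In this model the tag silently disappears during the round trip, so the tag would have to be stripped before packing and restored after unpacking, as the comment suggests.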

The third is that some assembly functions intentionally access memory out of bounds. MTE treats these accesses as illegal and reports a segmentation fault.
For example, the indexbyte() function, implemented in assembly, intentionally reads out of bounds so that it can load data at 32-byte alignment when the data is not 32-byte aligned. The relevant code is as follows.

BIC    $0x1f, R0, R3                              // R0: data   
...
VLD1.P (R3),   [V1.B16, V2.B16]           //  this is out-of-bounds access. 

To be compatible with MTE, we need to modify the implementation of the indexbyte function. We can refer to this implementation:
strrchr-mte.S
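To see why the aligned load is a problem, the address arithmetic can be checked in Go. This sketch assumes the data is its own allocation covering whole 16-byte granules; alignedLoadEscapes is a hypothetical helper that mirrors the BIC $0x1f rounding and the 32-byte load shown above.

```go
package main

import "fmt"

// alignedLoadEscapes reports whether a 32-byte load at the address rounded
// down to 32-byte alignment touches a granule outside the allocation that
// holds [base, base+size).
func alignedLoadEscapes(base, size uint64) bool {
	lo := base &^ 0x1f                  // BIC $0x1f: round down to 32-byte alignment
	hi := lo + 32                       // the vector load reads 32 bytes
	allocLo := base &^ 15               // first granule of the allocation
	allocHi := (base + size + 15) &^ 15 // end of the last granule
	return lo < allocLo || hi > allocHi
}

func main() {
	fmt.Println(alignedLoadEscapes(0x1010, 16)) // load starts in the previous granule
	fmt.Println(alignedLoadEscapes(0x1000, 16)) // load runs into the next granule
	fmt.Println(alignedLoadEscapes(0x1000, 32)) // load stays inside the allocation
}
```

Whenever the function returns true, the load reads a granule that may carry a different memory tag, which is the case MTE would report as a fault.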

The last is that some assembly functions subtract two addresses to get an offset. When MTE is turned on and the tags of the two addresses differ, the resulting offset is incorrect.
For example, the memmove() function, implemented in assembly, does this at line 148 of memmove_arm64.s. The relevant code is as follows; we would need extra code to handle this case.

backward_check:
	// Use backward copy if there is an overlap.
	SUB	R1, R0, R14                                // R0: dst, R1: src, the offset is incorrect when the tags of src and dst are different.
	CBZ	R14, copy0
	CMP	R2, R14
	BCC	copy_long_backward

The above are the changes we found during implementation. Among them, the GC changes are the most difficult: the GC performs many operations on addresses and memory, so we cannot guarantee that every place has been considered. The GC implementation also keeps changing, and future changes would need to take this special handling into account.

Perhaps it would make sense to put this on hold until we have a prototype CL?

It is ok to put this on hold until we have a prototype CL.

Thank you.

@rsc
Contributor

rsc commented Jun 14, 2023

Placed on hold.
— rsc for the proposal review group
