Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: go:wasmimport allows pointers to pointers #59156

Closed
johanbrandhorst opened this issue Mar 20, 2023 · 11 comments
Closed

cmd/compile: go:wasmimport allows pointers to pointers #59156

johanbrandhorst opened this issue Mar 20, 2023 · 11 comments
Labels
arch-wasm WebAssembly issues compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@johanbrandhorst
Copy link
Member

johanbrandhorst commented Mar 20, 2023

Background

The go:wasmimport directive was added in #38248 and has since been used to implement the prototype implementation of the WASI port. It is currently limited to use in the runtime, syscall and syscall/js standard library packages. With #59149 we want to allow any Go user to write their own go:wasmimport statements to access functions implemented on the WASI host.

Wasm today uses 32 bit pointers, while internally in the Go Wasm implementation we use 64 bit pointers. When a function is decorated with the go:wasmimport directive, the compiler translates the arguments (including pointers) to Wasm types. However, pointers to structs containing pointers cannot have their pointer fields converted in the same way since the conversion would require copying the objects to new memory locations, including the conversion of pointer fields to 32 bits, and copying the data back into the objects when the call returns. The mechanism would have to recursively walk the objects, dealing with pointer cycles, and likely introduce a significant overhead when calling go:wasmimport functions.

What is the behaviour today?

The compiler does not error when compiling functions with arguments that point to pointers. At runtime, there is an ABI mismatch between the Go program and the Wasm host modules, the latter expecting pointers to be 32 bits, causing misalignment of fields in struct types passed by pointer to go:wasmimport functions.

What did I expect to see?

I expected the compiler to disallow functions using arguments that point to pointers so we prevent users of go:wasmimport from constructing programs with a memory layout that is incompatible with the Wasm host.

Discussion

The following should be allowed (no internal struct pointers):

//go:wasmimport module F1
func F1(p *byte)

//go:wasmimport module F2
func F2(p unsafe.Pointer)

type buffer struct {
  ptr uint32
  len uint32
}

//go:wasmimport module F3
func F3(b *buffer)

The following should not be allowed (types contain pointers to pointers):

type buffer struct {
  ptr *byte
  len uint32
}

//go:wasmimport module F4
func F4(b *buffer)

//go:wasmimport module F5
func F5(p **byte)
@johanbrandhorst johanbrandhorst added the arch-wasm WebAssembly issues label Mar 20, 2023
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Mar 20, 2023
@johanbrandhorst
Copy link
Member Author

CC @golang/wasm

@heschi heschi added the NeedsFix The path to resolution is known, but the work has not been done. label Mar 21, 2023
@heschi
Copy link
Contributor

heschi commented Mar 21, 2023

cc @golang/compiler

@heschi heschi added this to the Backlog milestone Mar 21, 2023
@cherrymui
Copy link
Member

Is it only an issue for pointer to pointer? What about other (pointer to) data types (e.g. a struct), even if it doesn't contain pointers, how do we ensure the underlying data type at the other side (JS or other languages) has compatible memory layout?

Maybe we permit only unsafe.Pointer but not concrete pointers? And let the user or a higher level package to do the conversion, which would make sure the data type is compatible? Or even narrower, permit just the scalar types? This is somewhat similar to syscall.Syscall which only takes uintptr arguments, with higher level code handling the conversion.

@mknyszek mknyszek added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. and removed NeedsFix The path to resolution is known, but the work has not been done. labels Mar 29, 2023
@achille-roussel
Copy link
Contributor

Is it only an issue for pointer to pointer? What about other (pointer to) data types (e.g. a struct)

This is an issue for pointer-to-pointer, and also for pointers in inner fields of structs (see our second example).

even if it doesn't contain pointers, how do we ensure the underlying data type at the other side (JS or other languages) has compatible memory layout?

There is no way to do this with the current WebAssembly tooling. ABIs are conventions that must be agreed upon by the WebAssembly hosts and guest modules. This is somewhat the root cause of this issue, GOARCH=wasm uses a memory layout that is different from what the other compilers have implemented (Go has 64 bits pointers where the rest of the WebAssembly ecosystem uses 32 bits pointers).

Maybe we permit only unsafe.Pointer but not concrete pointers? And let the user or a higher level package to do the conversion, which would make sure the data type is compatible? Or even narrower, permit just the scalar types? This is somewhat similar to syscall.Syscall which only takes uintptr arguments, with higher level code handling the conversion.

Allowing only scalar type is pretty close to what we are suggesting here I believe. We also expect the use of //go:wasmimport to always be done in internal library code. A problem with allowing uintptr is it has the size of a pointer, which is 64 bits in GOARCH=wasm, and therefore has the same problems than regular pointers.

From the experience of developing #58141, bugs caused by mistakenly having pointers or uintptr in struct types were really difficult to track down.

I hope these answers are useful!

@cherrymui
Copy link
Member

Thanks. So it sounds like this is an issue for all data types (other than scalars). While the immediate issue is pointers, I think we also don't have guarantees to ensure other data types are laid out exactly same (e.g. even for a struct with all scalar fields, do they have the same padding rules? Maybe they do, but I'm not sure we want to guarantee that). So, I would think we allow only scalar types that directly correspond to a Wasm type, i32, i64, f32, f64 (perhaps unsafe.Pointer also?) and leave the conversion to higher level code. (This means it will also not allow types like int, or pointer to scalar structs like *struct{ x int32 }).

@achille-roussel
Copy link
Contributor

So it sounds like this is an issue for all data types (other than scalars). While the immediate issue is pointers, I think we also don't have guarantees to ensure other data types are laid out exactly same (e.g. even for a struct with all scalar fields, do they have the same padding rules? Maybe they do, but I'm not sure we want to guarantee that).

Yes 👍

So, I would think we allow only scalar types that directly correspond to a Wasm type, i32, i64, f32, f64 (perhaps unsafe.Pointer also?) and leave the conversion to higher level code. (This means it will also not allow types like int, or pointer to scalar structs like *struct{ x int32 })

Generally the alignment rules seem to work well enough that we haven't had issues with these, the rules used by Go appear to match what is expected by WASI-compatible runtimes, tho it's possible that this is only accidental and may be architecture dependent?

Starting with the 4 scalar types + unsafe.Pointer is probably a good balance, it would be more restrictive than what we allow internally today but likely will lead to fewer problems (e.g. the presence of unsafe tends to raise attention and discourage the use of imports in exported APIs).

@johanbrandhorst @evanphx What do you think?

@evanphx
Copy link
Contributor

evanphx commented Apr 4, 2023

TL,DR: We should instead prioritize switching to 32bit pointers so that Go works the same as other Webassembly compilers.

Hi all,

The UX of only allowing that set of types will make public usage of wasmimport really awkward. If this was only for our usage within runtime/syscall, it would probably be fine. But forcing every integration to hand write a set of wrappers (which are pretty hard to write) is not going to serve us very well.

It would be the equiv of cgo only allowing those types and not containing of the automapping code.

Given this, I think we should go the other direction entirely, which is to prioritize switch to 32bit pointers so that this issue goes away entirely. The only other issue would be alignment, which @achille-roussel has indicated hasn't been an issue, mostly because Go's alignment is mostly the same as LLVM/etc.

This, I'll say, is probably the worst specified part of Webassembly. There is mostly a handshake agreement on ABI currently, but we can't ignore that because doing see means that compiling Go to Webassembly means our users can't use that Go code with existing Webassembly infrastructure (which is the whole point of doing this).

@johanbrandhorst
Copy link
Member Author

Just noting here that we intend to submit a proposal detailing the move to 32 bit, we're not trying to make this issue discussion the forum for that discussion 😅.

@cherrymui
Copy link
Member

Thanks.

The UX of only allowing that set of types will make public usage of wasmimport really awkward.

The syscall.Syscall function just takes uintptr arguments. One can build functions using more appropriate types and hide that detail in libraries. Could we do the same for here?

It would be the equiv of cgo only allowing those types and not containing of the automapping code.

For cgo, on the Go side it usually uses explicit C types like C.int, C.struct_XXX, etc. instead of directly using Go types and hoping the ABI is the same. Also, (at least with the current implementation) under the hood there are wrappers generated by the cgo command automatically. Do we plan to do either for Wasm?

Maybe we want to introduce a WasmPtr type, and expect to use that type instead of Go pointer type?

Given this, I think we should go the other direction entirely, which is to prioritize switch to 32bit pointers so that this issue goes away entirely.

I don't think the issue really goes away entirely. Maybe it doesn't cause any immediate problem now. But I don't think we want to guarantee in the language spec that Go's struct layout etc. has to match Wasm's. One possibility is that we document that the user of wasmimport is responsible for making sure the layout agrees on both sides. (But it might still change from time to time?)

Just noting here that we intend to submit a proposal detailing the move to 32 bit, we're not trying to make this issue discussion the forum for that discussion

Thanks. (Not sure this is the best place to discuss) A few questions:

  • Does Wasm still plan to support 64-bit address space? Is it going to happen sometime soon-ish, or it is still some indefinite future?
  • Changing the pointer size would be a big breaking change. Should we introduce GOARCH=wasm32? The implementation could probably share a lot of code.

@johanbrandhorst
Copy link
Member Author

johanbrandhorst commented Apr 7, 2023

Thanks for your comments. I think we'd prefer to fix this bug with your suggested reduced set of allowed types in the parameters. That is, only allow the parameters of a function using go:wasmimport to be one of:

Go type Wasm type
int32, uint32 i32
int64, uint64 i64
float32 f32
float64 f64
unsafe.Pointer i32

We'll prepare to submit a CL with this change shortly.

Re: 32 bit port, I think our current thinking is a new port, wasm32, yeah, since it would be a big breaking change. Thanks for those thought provoking questions, we'll make sure to answer them in the proposal.

EDIT: removed 8 and 16 bit integer types for now.

@gopherbot
Copy link

Change https://go.dev/cl/483415 mentions this issue: wasm: restrict supported types in go:wasmimport function signatures

@golang golang locked and limited conversation to collaborators Apr 12, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-wasm WebAssembly issues compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

7 participants