-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: jsoniter stack corruption with 32-bit arches, potential GC issues with 64-bit arches #71408
Comments
I can reproduce. |
I suspect that https://github.com/modern-go/reflect2, which is imported by https://github.com/json-iterator/go, can not handle the switch to swissmaps. It reaches into internal details of the runtime with unsafe/linkname/etc. In fact I'm surprised it mostly works. |
I kept the |
I opened a new issue against Not sure what if anything the Go project should do about this. I'll leave this issue open in case anyone has some suggestions, but I'm inclined to say "tough luck". But |
Thanks for the investigation and quick response. FWIW I was aware that jsoniter is unmaintained, but I didn’t realise it was this problematic. A little research shows that Kubernetes decided years ago to move off (although work remains). |
Thinking about potential for issues with GC, here are the overlaps on amd64:
(These structures are actually the same size. https://go.dev/cl/625275 added one more field to hiter, but that isn't reflected in reflect2, nor was it ever released.) The pointer/non-pointer mismatches are: Recorded as pointer, but not actually a pointer:
For these, the GC may throw if these look like pointers that aren't allocated. Also if they are non-zero but look too small to be a valid pointer [1]. The latter is more likely, all of these values will generally be quite small. Recorded as non-pointer, but actually a pointer:
For these, if they aren't otherwise reachable then they could be freed early.
[1] I think? But now I can't find the code. It looks like we could potentially work around this problem by rearranging the fields. Move tab and group up to align with a pointer. Move entryOffset, dirOffset, clearSeq, globalDepth off of pointers. The only problem is that we have 5 non-pointer words, but hiter only has 4. But globalDepth is small, so if we align it the top byte of a pointer, then that pointer will always look like a kernel address, which is probably OK. |
Release blocker until we make a decision. |
I count 6 in |
Here's what I've got so far:
We could consider this acceptable. With all of the pointers in Iter covered, we won't get memory corruption. If |
Radically different approach:
We could add an extra layer of indirection. This is unfortunate because it requires an extra allocation, and extra indirection in mapiternext, which will hurt performance. On the other hand, it is much simpler and arguably more future-proof [1], as we can freely modify [1] On the other hand, if packages like reflect2 copy this new 3-word Iter type, then we can never grow this struct beyond 3 words, which would be very painful. |
We could also mix these ideas, and make |
I also propose we add a new function with an open linkname:
This allocates (and initializes) a new iteration type. This would give projects using To be clear, they should stop that as well, but this migration would be simpler and would avoid future issues with the iter type definition. The changes to the internals of the map itself haven't been an issue because maps are allocated with |
(Sorry for using this issue as brainstorming board :P)
We could make |
The problem with the wrapper, or with |
Actually, maybe reflect2 causes an allocation of an |
It does escape (source). I think we'd need quite clever escape analysis to avoid an escape there, even with inlining. It it turns out this function doesn't inline anyway, as it has cost 83... Regardless, I am more concerned with correctness than performance of these unsafe packages. |
In my opinion we should not introduce an extra allocation for all programs in order to support the reflect2 package. That is carrying compatibility too far. |
Sure, so am I. I just want to make sure that we don't add an extra allocation in any new solution that needs adopting, because with that new allocation they won't adopt it. |
I am currently prototyping #71408 (comment). In this approach, normal Go programs will use newly introduced
Personally, I think that is a good compromise. No impact on normal Go programs, maintains correctness for users of linkname, and doesn't require any migration. It's true that linkname users will take a small performance hit. I think that is an acceptable trade-off for maintainability. It's true that there is a question of whether these users will try to work around this compatibility layer. It will be more difficult though, as |
The wrapper in #71408 (comment) seems like a good solution that would not slow down normal code (though of course has the cost of needing to be implemented by Michael or whoever). I think I recall at least one or maybe more of the linkname-related CLs that just redirected the unsafe linkname usage into the normal path, which meant it would function correctly but at "normal" speed (that is, no speed benefit from the linkname). I hunted for a couple of minutes just now to see if I could find an example, but didn't find one right away, but that's my recollection. |
Change https://go.dev/cl/643898 mentions this issue: |
Change https://go.dev/cl/643899 mentions this issue: |
mapiterinit allows external linkname. These users must allocate their own iter struct for initialization by mapiterinit. Since the type is unexported, they also must define the struct themselves. As a result, they of course define the struct matching the old hiter definition (in map_noswiss.go). The old definition is smaller on 32-bit platforms. On those platforms, mapiternext will clobber memory outside of the caller's allocation. On all platforms, the pointer layout between the old hiter and new maps.Iter does not match. Thus the GC may miss pointers and free reachable objects early, or it may see non-pointers that look like heap pointers and throw due to invalid references to free objects. To avoid these issues, we must keep mapiterinit and mapiternext with the old hiter definition. The most straightforward way to do this is to use mapiterinit and mapiternext as a compatibility layer between the old and new iter types. The first step to that is to move normal map use off of these functions, which is what this CL does. Introduce new mapIterStart and mapIterNext functions that replace the former functions everywhere in the toolchain. These have the same behavior as the old functions. This CL temporarily makes the old functions throw to ensure we don't have hidden dependencies on them. We cannot remove them entirely because GOEXPERIMENT=noswissmap still uses the old names, and internal/goobj requires all builtins to exist regardless of GOEXPERIMENT. The next CL will introduce the compatibility layer. I want to avoid using linkname between runtime and reflect, as that would also allow external linknames. So mapIterStart and mapIterNext are duplicated in reflect, which can be done trivially, as it imports internal/runtime/maps. For #71408. Change-Id: I6a6a636c6d4bd1392618c67ca648d3f061afe669 Reviewed-on: https://go-review.googlesource.com/c/go/+/643898 Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
FYI Prometheus unit tests pass on all architectures with 1.24RC3. Thanks! |
Go version
go1.24rc2 linux/amd64
Output of
go env
in your module/workspace:What did you do?
When GOARCH was set to 386, and trying out the release candidates of Go 1.24, a unit test failed in the project Prometheus: https://github.com/prometheus/prometheus/actions/runs/12932314085/job/36070245239?pr=15707
I have simplified to just one use of the 'jsoniter' library.
https://go.dev/play/p/gtfdrAhV48r?v=gotip
(That link gets a different error:
could not parse netrc (GOAUTH=netrc): $HOME is not defined
, so I'll supply files too.)go.mod
go.sum
jsoniter_test.go
What did you see happen?
What did you expect to see?
The resulting JSON should be about 57 bytes long, with two items in it.
The test passes with GOARCH=amd64 and Go 1.24rc2, also with GOARCH=386 on Go 1.23.
The text was updated successfully, but these errors were encountered: