encoding/json: revert "reuse values when decoding map elements" #39149
I was going to concede this (perhaps grudgingly), but my argument for it convinced me the other way again. When describing how decode works with various types, the documentation mostly avoids saying "stores". For instance:
This doesn't actually explicitly state what happens when the keys match the keys used by Marshal, but we can reasonably infer that it stores them. Of the specific types, only the map says that Unmarshal "stores" keys. And you draw a reasonable distinction between "store" and "merge". But wait! What else, if anything, uses the word "stores"?
Which is to say: The behavior currently described for each of these types is what Unmarshal means by "stores". If we are "storing" a value into a map, with Unmarshal, we should be doing to it the same thing we would have done had it been the top-level parameter to Unmarshal. Which is to say that, for a pointer to a struct, we should do what we would do with that normally: Store the value into the thing-pointed-to (or allocate a new zero-valued one if the pointer is currently nil), and then follow the rules given for storing into a struct -- which overwrites explicitly provided fields, but won't zero out previously-existing non-zero fields. So I think that if existing users expected "store" to mean "overwrite deleting the old contents" for a pointer-to-a-struct, they were wrong, and if they expected it to mean that only if the value was in a map, they were... still probably wrong, because unmarshal is recursive, so "storing" is still governed by the definition the function provides for what it means to "store" json values in existing values.
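To make the recursive reading of "store" concrete, here is a minimal sketch (the `Server` type and its values are invented for illustration); the output comments contrast the two behaviors under discussion:

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Server struct {
	Addr string
	Port int
}

func main() {
	m := map[string]*Server{
		"web": {Addr: "10.0.0.1", Port: 8080},
	}
	// The JSON object sets Addr but says nothing about Port.
	if err := json.Unmarshal([]byte(`{"web": {"Addr": "10.0.0.2"}}`), &m); err != nil {
		panic(err)
	}
	fmt.Println(m["web"].Addr, m["web"].Port)
	// Recursive "store" (CL 179337): 10.0.0.2 8080 -- the existing
	// *Server is written through, just as a pointer-to-struct field
	// would be, so the unmentioned field survives.
	// go1.14 and earlier: 10.0.0.2 0 -- a fresh element replaces the
	// old one, zeroing Port.
}
```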
Interesting line of reasoning, to derive a different definition of "store" than one would typically expect, since it's already used elsewhere in the documentation to mean some form of merging. I'm inclined to agree with you and would prefer consistent behavior for structs and maps. However, this doesn't change the fact that there's a decent amount of code that depends on the old behavior. There's a pretty high bar for what changes we permit that may break user code. I've been involved with Go release regression testing for about 4.5 years, and this change is up there for the most breakages caused (that I can remember). The only other instance I can remember that caused more regression failures was the monotonic time change, which broke many targets that relied on the previous `time.Time` behavior.

The arguments made for the monotonic time change don't quite apply here. While it is nice to unify the behavior of structs and maps, and this change benefits some users, it's not a critical feature nor a critical bug fix. Secondly, the fact that there's even confusion and debate about the proper behavior means that it's unfair to tell affected users that they were holding it wrong for go1.14 and below, and thus break them.
Thanks @dsnet for bringing this up. Could you clarify what those 50 cases look like? Why are they relying on json to delete data, when the package almost never does that? I agree with @seebs that I've never read "stores" as "overwrites" or "replaces"; I personally think it's pretty clear from this line at the top of the API, like he said:
If we conclude that "store" is ambiguous, then we should conclude that most "merge" versus "replace" behavior is an implementation detail and not properly documented, which means we basically need to modify the docs to reflect Hyrum's law :) I never imagined so many internal Google targets would break with this change, though. I think they reasonably fall under "made incorrect assumptions from the docs", so I do think they should be fixed. Whether or not 50 is too many is a bit subjective. For example, out of how many targets using json in total? You mention your experience over the past five years of releases, but I imagine the number of targets has also been increasing.
This is the reason I created #31924. We were pre-filling a large struct with default values and then calling `Unmarshal` on top of it. I suspect this was the first bug report because this particular issue is difficult to pin on Go itself. When you first encounter it, you tend to think it's a mistake somewhere in your own code, especially when you look at the behavior of struct pointers directly. I ran into it twice before really investigating and finding out it was an inconsistency in Go itself.
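A small sketch of the inconsistency #31924 describes may help (the `Config` type and its defaults are hypothetical): the same JSON merges when decoded directly into a struct, but on go1.14 it wipes defaults when the struct sits inside a map:

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Config struct {
	Host    string
	Timeout int
}

func main() {
	// Decoding directly into a pre-filled struct merges on every Go
	// version: the unmentioned Timeout keeps its default.
	cfg := Config{Host: "localhost", Timeout: 30}
	_ = json.Unmarshal([]byte(`{"Host": "example.com"}`), &cfg)
	fmt.Println(cfg.Timeout) // 30

	// The same struct as a map element is replaced wholesale on
	// go1.14 and earlier: the default Timeout silently becomes 0.
	m := map[string]Config{"main": {Host: "localhost", Timeout: 30}}
	_ = json.Unmarshal([]byte(`{"main": {"Host": "example.com"}}`), &m)
	fmt.Println(m["main"].Timeout) // go1.14: 0; with CL 179337: 30
}
```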
Change https://golang.org/cl/234559 mentions this issue.
I've created a revert CL, just for the sake of being ready to move quickly before the beta is out. If we decide to go ahead with the revert, I think two follow-up steps should be done:
The user code is calling `Unmarshal` into maps that already contain populated elements. As far as I can tell, no one seems to be depending on merge semantics due to duplicate keys in an object. If that were the case, then there would be greater justification to break them, since RFC 7159 says that "when the names within an object are not unique, the behavior of software that receives such an object is unpredictable".
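For reference, a duplicate-key object is where the two behaviors can diverge within a single `Unmarshal` call; a sketch with invented data:

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	var m map[string]map[string]int
	// The object repeats the key "a", a case RFC 7159 deliberately
	// leaves unpredictable.
	_ = json.Unmarshal([]byte(`{"a": {"x": 1}, "a": {"y": 2}}`), &m)
	fmt.Println(m["a"])
	// Merge behavior (CL 179337): map[x:1 y:2] -- the second "a"
	// decodes into the element left behind by the first.
	// Replace behavior (go1.14):  map[y:2] -- the second "a" starts
	// from a fresh zero value.
}
```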
Definitely agree it's subjective, hence this issue to debate it :) I'm not at liberty to reveal exactly how many usages of `encoding/json` there are. For some reference, I gave a talk back in 2017 about the regression testing that we do. Some things to note:
+1, would love to see more consistent behavior in some hypothetical future v2
I think it's a shame that 50 internal failures can justify solidifying such an inconsistency in the API. I propose that we somehow evaluate whether external open-source code relies on this behavior, because that would give us perspective on whether this is the result of a ubiquitous misunderstanding that many users have come to rely on, or just the product of a few internal users at Google writing code in the same way.
If you watched my talk, this has nothing to do with Google vs outside Google. The Go project has a high standard of trying not to break people's code (regardless of what company it belongs to). It is intended to be a language platform that people can depend on, and we need to judiciously make informed decisions that evaluate the costs of making a change when any breakages are known (which is definitely the case here). Sometimes you break people if it is justified; sometimes you don't, and bite your own tongue.

Google's suite of tests happens to be a decent representation of all the Go code out there in the world. We know it's not completely representative, but it's far better than nothing and pure speculation. Google funds the Go project, and it's a blessing to have the resources to even perform such regression testing. However, that regression testing is unsurprisingly geared towards Google's internal codebases, since we're leveraging existing platforms for practical reasons.
If you have access to a massive suite of open-source projects and can run the tip toolchain on millions of packages, I'd love to see that data. Otherwise, it's not helpful to object to concrete data that has been presented by proposing something that's really nice in theory but currently infeasible in practice.
The position of consistency is always easier to defend than that of correctness, especially in a language that wants to be backwards-compatible. However, I believe that baking in compatibility based on very loose assumptions should be justified on stronger grounds, especially when the behavior is one that many users see as a bug and have presented good arguments for fixing. We are making a decision based on the consistency of the entire ecosystem, but I don't know what that looks like, because I can't see the packages that failed, or understand why they did in the first place. If some other organization complained about 50 test failures, we certainly would want to see the code that led us to that point. Which is why I believe this actually is a case where we should see the failing code.
I disagree that correctness is well-defined here. I certainly agree that the change is more consistent, and I have stated support for it multiple times already in this issue. However, consistency != correctness, especially when the documentation has not been clear regarding this.
It's evident that you don't trust Google; can you provide evidence that this isn't a problem in open source? My stance on this is independent of where the failures are occurring (Google or not). I've already stated before that I'm more concerned about the overall Go ecosystem.
You're preaching to the choir. I maintain many widely used packages, and it would be incredibly convenient if I could arbitrarily break users for the sake of a better future, especially if they are depending on undefined behavior. However, sometimes your hands are tied when a sufficient number of users do things contrary to what you desire.
Yes. What users do with a library should (for better or worse) have an effect on what changes are made by the authors of that library. Certainly you may have a different stance, but the Go project holds to a high standard of not breaking users that are not doing something clearly wrong. If you don't care what your users do with your library, then release it at v0 and make changes as you see fit. In fact, the Go standard library did that for many years, from 2009 to 2012. We're in v1 now, and we need to take backwards compatibility seriously.
This is not what I was trying to convey, but based on this I feel the discussion has become unproductive, so I will leave it to others to discuss the topic instead.
I don't think correctness is clearly defined, but I think that the interpretation of store-as-recursive is clearly better. It's not really a question of whether it's Google, specifically; the problem is that while I am now aware that there exist 50 packages which are affected, I don't know the denominator for that numerator, and I don't have even one example I can look at. So I don't know what kind of breakage this is. Maybe it could be fixed by changing the maps to be maps of pointers; maybe not.

It seems like it would help a lot with evaluating things like this if at least some of the test cases were available to look at. Ideally, I'd want to talk to someone whose code was affected and find out whether it was intentional, whether they expected the behavior, etcetera.

If nothing else, perhaps this would be a good place to put in a deprecation note. Go has previously done release notes with "we're going to change X for consistency" type things, and that seems to have worked okay.
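For code that genuinely wants the old replacement semantics, one hypothetical shim (the package name, function, and `Server` type are all invented here) would behave the same under either implementation:

```go
package jsonutil

import "encoding/json"

type Server struct {
	Addr string
	Port int
}

// decodeReplacing mimics go1.14's map semantics regardless of how
// Unmarshal treats existing elements: keys present in the JSON get
// brand-new values, and keys absent from the JSON are left untouched.
func decodeReplacing(data []byte, m map[string]Server) error {
	fresh := make(map[string]Server)
	if err := json.Unmarshal(data, &fresh); err != nil {
		return err
	}
	for k, v := range fresh {
		m[k] = v // overwrite the whole element rather than merging fields
	}
	return nil
}
```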
Hello! This is one of the few remaining issues blocking the Beta release of Go 1.15. We'll need to make a decision on this in the next week in order to keep our release on schedule.
I share what @seebs said in #39149 (comment). It's hard for me to properly understand what kind of regressions we're seeing if I can't see what the code was doing or why. Having said that, like @dsnet says, we just don't have a large corpus of third-party Go tests using `encoding/json`.

I think at best we're evenly split on whether or not this breakage is OK given the Go1 compatibility guarantee. Plus, the beta is just a few days away. So I think we should revert, even if that feels like wasted effort on my part.

The revert CL is ready, so if @dsnet wants to give it a review, it can be merged very soon. Once merged, I'll do the two follow-up steps outlined in #39149 (comment). Those will still help improve the json package in the long term, at least.
As a note, reverting this change does not mean this discussion has to end. Thanks for handling the revert and follow-up items, @mvdan.
Indeed, we could try again for 1.16, though I don't think that's a good idea unless such a "corpus of json tests" is constructed.
A couple of days have elapsed, and there are only two working days left until the beta, so I'll go ahead and merge. Thanks @andybons for the review. If someone feels strongly about giving this another try in 1.16, please file a new issue.
We've looked over this issue in a release meeting. Based on the discussion above, though it was a close call in either direction, the decision we're making is to proceed with the rollback, which is implemented in CL 234559. I'll update the issue state to NeedsFix to reflect that. Thank you for your diligent work on this, @mvdan, and for the original report, @dsnet.
Huh, comment race condition :)
Regression testing showed that ~50 targets would break with https://golang.org/cl/179337, which was submitted to address #31924.
Personally, I think https://golang.org/cl/179337 brings consistency between the handling of structs and maps, and I appreciate that fact. However, the old behavior is technically specified, as the documentation for maps says:
The documentation uses the term "stores", which matches the behavior of go1.14 and before. On the other hand, https://golang.org/cl/179337 (for go1.15) has functionally changed the behavior of `Unmarshal` such that it "merges key-value pairs from the JSON object into the map".

While ~50 targets isn't a terribly large number in the grand scheme of things, it is definitely on the higher end of what is usually acceptable for changes. If the existing user code were incorrectly using `encoding/json`, then there would be more justification for the Go standard library to break it. However, that does not seem to be the case here, and I think we're bound by the Go1 compatibility agreement to preserve the old behavior.

/cc @dmitshur @mvdan @ianlancetaylor
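As a hypothetical illustration of the kind of code that can break (the actual failing targets are not public), consider a map entry refreshed via `Unmarshal`, where the caller relies on fields omitted from the payload being reset:

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Entry struct {
	Value   string
	Expired bool
}

func main() {
	cache := map[string]Entry{
		"k": {Value: "old", Expired: true},
	}
	// The refresh payload omits Expired, assuming it resets to false.
	_ = json.Unmarshal([]byte(`{"k": {"Value": "new"}}`), &cache)
	fmt.Println(cache["k"].Expired)
	// go1.14 ("stores"):    false -- the assumption holds.
	// CL 179337 ("merges"): true  -- the stale flag survives, and any
	// test asserting on it starts failing.
}
```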