String iteration with range should work with bytes, not code points #1185
Whether one behavior is more interesting than the other is a very subjective argument; it really depends on what one is trying to achieve. What is not subjective is that the two versions above look like perfectly reasonable alternatives to each other, and that because of this the inconsistent behavior is error prone.
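For readers who do not have the two versions at hand, here is a minimal sketch of the kind of inconsistency being discussed. It assumes the contrast is between an index loop and a range loop over the same string; the helper names `byteSteps` and `rangeSteps` are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// byteSteps counts iterations of a plain index loop.
// s[i] is a single byte, so the loop runs len(s) times.
func byteSteps(s string) int {
	n := 0
	for i := 0; i < len(s); i++ {
		_ = s[i]
		n++
	}
	return n
}

// rangeSteps counts iterations of a range loop.
// range decodes UTF-8, so the loop runs once per code point.
func rangeSteps(s string) int {
	n := 0
	for range s {
		n++
	}
	return n
}

func main() {
	s := "héllo" // 'é' occupies two bytes in UTF-8
	fmt.Println(byteSteps(s), rangeSteps(s)) // prints "6 5"
}
```

The two loops agree on pure ASCII input and diverge as soon as the string contains a multi-byte character, which is where the surprise tends to come from.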
Fango pointed out on the mailing list the issue of the extra space consumed by []int(s). To solve this, we can easily introduce a function in the utf8 package to help with space-efficient iteration when going through UTF-8 code points is desired:

    for i := 0; i != len(s); {
        var rune int
        rune, i = utf8.NextRune(s, i)
        ...
    }

Also, an additional issue spotted in the specification: 'A "for" statement with a "range" clause iterates through all entries of an array, slice, string or map, or values received on a channel.' Today it doesn't really iterate through all entries of the string, unless we decide that a string is made not of bytes but of code points.
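As a point of comparison, utf8.NextRune is only a proposed name and not a function in the standard library. The same space-efficient iteration can be sketched with `utf8.DecodeRuneInString` from `unicode/utf8`, which returns the decoded rune and its width in bytes. The helper name `eachRune` below is hypothetical.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// eachRune walks s one code point at a time without first converting
// the whole string to a slice. visit receives the byte offset at which
// each rune starts, plus the rune itself.
func eachRune(s string, visit func(offset int, r rune)) {
	for i := 0; i < len(s); {
		// size is at least 1 for a non-empty input, even on invalid
		// UTF-8 (where r is utf8.RuneError), so the loop always advances.
		r, size := utf8.DecodeRuneInString(s[i:])
		visit(i, r)
		i += size
	}
}

func main() {
	eachRune("aé€", func(offset int, r rune) {
		fmt.Printf("%d: %c\n", offset, r)
	})
}
```

Slicing with `s[i:]` does not copy the string data, so this loop uses no extra space proportional to the length of s.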
Feel free to discuss more on the mailing list. As you might imagine we spent a long time on the design of this, so the claim that it "doesn't really make any sense" doesn't ring true to us. Either way, the issue tracker is the wrong place for long discussions.
Status changed to WorkingAsIntended.
It probably doesn't ring true precisely because you've spent a long time on the design and implementation of this behavior. For someone looking at the two iterations above, deprived of any further insight into the choice made, it really doesn't make sense, if you'll forgive my frankness. I can certainly see the pragmatic reason why it works this way, but it feels like a language design wart which could be avoided (and with it the surprise and future bugs) by either putting the feature in the library, or by optimizing the compilation of "range []int(...)" to do what range over a string does right now. Either way, I've already presented the argument here and on the mailing list (hopefully in a clear way). Unless there's further interest from the designers in seeing this fixed or changed, it probably won't help much to continue the discussion.
This issue was closed.