-
Notifications
You must be signed in to change notification settings - Fork 18k
x/net/idna: recent change adds ~200KB of machine code to any program using x/net/http2 #18582
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
It will be hard to reduce the size of these tables. Go 1.8 includes some of these tables, probably adding about 60-80k. I estimate the the full implementation adds about 90k of data plus code. There are two options to reduce this
I don't think keeping the old implementation and the new on as an option is a good idea. The old implementation differs in hard-to-document ways. Also, it pulls in different tables, meaning that someone using the new implementation will get an additional penalty making it even more expensive to "do the right thing". I don't have a good idea on how to make the additional accuracy optional, though. Probably something along the lines of having precise and imprecise Profiles in idna and then making the profile an option in Transport. The idna package would then be designed such that tables only associated with profiles that are not used are dropped by the linker. |
Correction, I was assuming you were comparing to the current implementation in core. In 1.8, core http wraps idna and adds a few tables to make it more correct. Compared to this, the new implementation adds about 50k. I ran some tests. Very roughly, the breakdown of bulky stuff used in the new implementation is: So norm seems to be the culprit. The bad news is that norm is rather important. The good news is that norm is already used in 1.8 so that effectively this implementation adds only about 50k (it eliminates the width table). There does seem to be one possibility: the idna tables already contain some of the normalization information. It may be possible to interleave some of the norm information into these tables and then reimplement the normalization algorithm for this purpose. A very rough estimate is that this will save about 60-80k. This is a very complicated optimization, though. It also adds up to 60k of data to applications that are already using norm somewhere anyway. I'm skeptical this is worth it. The bidirule could be made optional. As it only saves 15% of the total tables used here it seems not worth the trouble. But if people think so anyway it is worth considering. |
Please take a look at https://go-review.googlesource.com/#/c/36512/. I've now adopted the semantics that a zero Profile only does Punycode and that any validation is optional. A few pre-made profiles are provided for Lookup and Registration cases of the IDNA RFC. One could argue that at least a normalization check should at least always be in there, but that's also the largest table. If validation is not used, tables are not linked in. The default ToUnicode and ToASCII are now wrappers for idna.Punycode.{ToUnicode|ToASCII}. A bit odd, but it is justifiable to take the zero profile as the default. This means the interpretation of these functions is virtually left unchanged with this elaborate update while still providing all the UTS 46 goodness. |
CL https://golang.org/cl/36512 mentions this issue. |
ToASCII and ToUnicode now closely mimic the current definitions in x/net/idna: doing a bare Punycode conversion. This may be odd, but preserves backwards compatibility. The changes also ensure that tables used by unused validations are dropped. The new semantics is that the zero Profile does a bare Punycode transformation. The user can define new Profiles with additional restrictions. Two high-level options are provided to mimic the Registration and Lookup use cases defined in the RFC. validateAndMap is identical to the code that was removed from process. Updates golang/go#18567 Updates golang/go#18582 Change-Id: If9e1a1037f968758c1b3f150860773770d032f4e Reviewed-on: https://go-review.googlesource.com/36512 Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
CL https://golang.org/cl/37050 mentions this issue. |
Change #34770 (golang/net@67957fd) adds a bunch of stuff to
x/net/idna
and also adds a couple dependencies ongolang.org/x/text/secure/bidirule
andgolang.org/x/text/unicode/norm
, which in turn are pulling ingolang.org/x/text/transform
andgolang.org/x/text/unicode/bidi
.This totals around 20k SLOC (excluding tests etc), which translates into an additional 230749 bytes of machine code (for linux/386, and 172896 bytes when compiled with
-ldflags="-s -w"
). For linux/amd64 it's an extra 269317 bytes (197664 bytes when compiled with-ldflags="-s -w"
).I'm all for the IDNA code to be complete and correct, and I understand that this stuff is more complicated than it seems, but that seems like a hefty tax to add to all the binaries that transitively depend on this code, many of which are backend binaries that don't actually need this stuff.
I'm not sure what's the best way to trim the fat here but @bradfitz suggested to file a bug to continue the discussion, so here we are.
cc @mpvl @nigeltao
The text was updated successfully, but these errors were encountered: