Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(266)

Issue 6999045: code review 6999045: go.net/publicsuffix: tighten the encoding from 8 bytes ... (Closed)

Can't Edit
Can't Publish+Mail
Start Review
Created:
12 years, 3 months ago by nigeltao
Modified:
12 years, 3 months ago
Reviewers:
CC:
volker.dobler, golang-dev
Visibility:
Public.

Description

go.net/publicsuffix: tighten the encoding from 8 bytes per node to 4. On the full list (running gen.go with -subset=false): Before, there were 6086 nodes (at 8 bytes per node) before. After, there were 6086 nodes (at 4 bytes per node) plus 354 children entries (at 4 bytes per node). The difference is 22928 bytes. In comparison, the (crushed) text is 21082 bytes, and for the curious, the longest label is 36 bytes: "xn--correios-e-telecomunicaes-ghc29a". All 32 bits in the nodes table are used, but there's wiggle room to accomodate future changes to effective_tld_names.dat: The largest children index is 353 (in 9 bits, so max is 511). The largest node type is 2 (in 2 bits, so max is 3). The largest text offset is 21080 (in 15 bits, so max is 32767). The largest text length is 36 (in 6 bits, so max is 63). benchmark old ns/op new ns/op delta BenchmarkPublicSuffix 19948 19744 -1.02%

Patch Set 1 #

Patch Set 2 : diff -r 3a318dc9be38 https://code.google.com/p/go.net #

Patch Set 3 : diff -r 3a318dc9be38 https://code.google.com/p/go.net #

Total comments: 12

Patch Set 4 : diff -r 3a318dc9be38 https://code.google.com/p/go.net #

Patch Set 5 : diff -r b0dd3b602c14 https://code.google.com/p/go.net #

Unified diffs Side-by-side diffs Delta from patch set Stats (+293 lines, -161 lines) Patch
M publicsuffix/gen.go View 1 2 3 4 9 chunks +132 lines, -43 lines 0 comments Download
M publicsuffix/list.go View 1 2 chunks +14 lines, -10 lines 0 comments Download
M publicsuffix/list_test.go View 1 1 chunk +8 lines, -0 lines 0 comments Download
M publicsuffix/table.go View 1 2 3 2 chunks +139 lines, -108 lines 0 comments Download

Messages

Total messages: 4
nigeltao
Hello dr.volker.dobler@gmail.com (cc: golang-dev@googlegroups.com), I'd like you to review this change to https://code.google.com/p/go.net
12 years, 3 months ago (2012-12-21 10:03:35 UTC) #1
volker.dobler
LGTM https://codereview.appspot.com/6999045/diff/3/publicsuffix/gen.go File publicsuffix/gen.go (right): https://codereview.appspot.com/6999045/diff/3/publicsuffix/gen.go#newcode271 publicsuffix/gen.go:271: // Nodes is the list of nodes. Each ...
12 years, 3 months ago (2012-12-21 22:36:54 UTC) #2
nigeltao
Submitting... https://codereview.appspot.com/6999045/diff/3/publicsuffix/gen.go File publicsuffix/gen.go (right): https://codereview.appspot.com/6999045/diff/3/publicsuffix/gen.go#newcode271 publicsuffix/gen.go:271: // Nodes is the list of nodes. Each ...
12 years, 3 months ago (2012-12-22 01:01:02 UTC) #3
nigeltao
12 years, 3 months ago (2012-12-22 01:11:01 UTC) #4
*** Submitted as
https://code.google.com/p/go/source/detail?r=3d9483b02eb3&repo=net ***

go.net/publicsuffix: tighten the encoding from 8 bytes per node to 4.

On the full list (running gen.go with -subset=false):

Before, there were 6086 nodes (at 8 bytes per node) before. After,
there were 6086 nodes (at 4 bytes per node) plus 354 children entries
(at 4 bytes per node). The difference is 22928 bytes.

In comparison, the (crushed) text is 21082 bytes, and for the curious,
the longest label is 36 bytes: "xn--correios-e-telecomunicaes-ghc29a".

All 32 bits in the nodes table are used, but there's wiggle room to
accomodate future changes to effective_tld_names.dat:

The largest children index is 353 (in 9 bits, so max is 511).
The largest node type is 2 (in 2 bits, so max is 3).
The largest text offset is 21080 (in 15 bits, so max is 32767).
The largest text length is 36 (in 6 bits, so max is 63).

benchmark                old ns/op    new ns/op    delta
BenchmarkPublicSuffix        19948        19744   -1.02%

R=dr.volker.dobler
CC=golang-dev
https://codereview.appspot.com/6999045
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b