-
Notifications
You must be signed in to change notification settings - Fork 18k
cmd/compile: use -64 instead of 0 for "different file" position encoding in binary exporter #20080
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I was wondering about that when I was thinking about the encoding. Do you have concrete numbers? |
I instrumented bexport's p.pos, and ran 'go build -a std cmd'. Of the calls to pos, here's everything at >1%:
Results on the compilebench set (requires CL 41053):
So looks like it shrinks export data by 1-5% and object files sizes by a tiny amount. |
If we do this, another thing to consider at the same time is switching from using commonPrefixLen for filenames to just writing dir and file separately. A quick hack (diff below) shows another minor win from this, due entirely to the elimination of the need to encode of the length of the common prefix.
diff --git a/src/cmd/compile/internal/gc/bexport.go b/src/cmd/compile/internal/gc/bexport.go
index 3637804a12..7adcbdde28 100644
--- a/src/cmd/compile/internal/gc/bexport.go
+++ b/src/cmd/compile/internal/gc/bexport.go
@@ -118,6 +118,7 @@ import (
"encoding/binary"
"fmt"
"math/big"
+ "path/filepath"
"sort"
"strings"
)
@@ -532,17 +533,12 @@ func (p *exporter) pos(n *Node) {
} else {
// different file
p.int(0)
- // Encode filename as length of common prefix with previous
- // filename, followed by (possibly empty) suffix. Filenames
- // frequently share path prefixes, so this can save a lot
- // of space and make export data size less dependent on file
- // path length. The suffix is unlikely to be empty because
- // file names tend to end in ".go".
- n := commonPrefixLen(p.prevFile, file)
- p.int(n) // n >= 0
- p.string(file[n:]) // write suffix only
+ p.int(line) // line >= 0
+ // Encode filenames as dir+path, so that common dirs can be shared.
+ // TODO: many dirs are strict prefixes of others. Can we use this?
+ d, f := filepath.Split(file)
+ p.string(d)
+ p.string(f)
p.prevFile = file
- p.int(line)
}
p.prevLine = line
}
diff --git a/src/cmd/compile/internal/gc/bimport.go b/src/cmd/compile/internal/gc/bimport.go
index 7766a5617a..c24e216956 100644
--- a/src/cmd/compile/internal/gc/bimport.go
+++ b/src/cmd/compile/internal/gc/bimport.go
@@ -387,9 +387,9 @@ func (p *importer) pos() src.XPos {
line += delta
} else if n := p.int(); n >= 0 {
// file changed
- file = p.prevFile[:n] + p.string()
+ file = p.string() + p.string()
+ line = n
p.prevFile = file
- line = p.int()
p.posBase = src.NewFileBase(file, file)
}
p.prevLine = line |
CL https://golang.org/cl/41619 mentions this issue. |
The int 0 is used in the binary import/export format to indicate "different file". If the files are the same, an offset of 0 lines is represented by adding a second int of -1. See exporter.pos.
However, over 50% of same-file-different-line position deltas have a line delta of 0, so this representation is inefficient. A simple fix is to use the magic value -64 for "different file". That line offset is rare. -64 is the smallest value that fits in a single byte with varint encoding.
The implementation is trivial--it consists of changing 0 to -64 in a few places.
The savings are small, though, so it's not worth doing unless we are changing the export format for some other reason as well--thus an issue instead of a CL.
cc @mdempsky @griesemer
The text was updated successfully, but these errors were encountered: