cmd/compile: use less memory for large []byte literal #6643

Open
rsc opened this issue Oct 23, 2013 · 6 comments
Labels
help wanted NeedsFix The path to resolution is known, but the work has not been done. ToolSpeed
Comments

@rsc
Contributor

rsc commented Oct 23, 2013

[]byte literals take up a lot of memory inside the compiler, because each byte in the
literal is a separate syntax Node and, worse, each byte is represented by a
multiprecision integer constant.

Probably a trick is required during parsing to turn []byte{...} into an actual byte
array holding the constant values, plus a list of (index, value) pairs for the non-constant elements.
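
To make the proposed representation concrete, here is a minimal sketch in Go. It is not compiler code: the types `elem` and `dynamicElem` and the function `compact` are hypothetical names invented for illustration, and a non-constant expression is stood in for by its source text.

```go
package main

import "fmt"

// elem is one element of a hypothetical []byte literal: either a constant
// byte, or a non-constant expression represented here by its source text.
type elem struct {
	constant bool
	value    byte   // meaningful only when constant is true
	expr     string // meaningful only when constant is false
}

// dynamicElem records the position of a non-constant element.
type dynamicElem struct {
	index int
	expr  string
}

// compact splits a literal into a dense byte array holding the constant
// values and a list of index/expression pairs for everything else.
// Slots belonging to non-constant elements are left as zero.
func compact(lit []elem) ([]byte, []dynamicElem) {
	data := make([]byte, len(lit))
	var dyn []dynamicElem
	for i, e := range lit {
		if e.constant {
			data[i] = e.value
		} else {
			dyn = append(dyn, dynamicElem{index: i, expr: e.expr})
		}
	}
	return data, dyn
}

func main() {
	// Stands in for a literal such as []byte{0x47, 0x6f, x, 0x21}.
	lit := []elem{
		{constant: true, value: 0x47},
		{constant: true, value: 0x6f},
		{expr: "x"},
		{constant: true, value: 0x21},
	}
	data, dyn := compact(lit)
	fmt.Println(data, dyn)
}
```

With this layout the cost per constant element is one byte rather than one syntax node plus one multiprecision constant; only the non-constant elements need a richer representation.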
@rsc
Contributor Author

rsc commented Dec 4, 2013

Comment 1:

Labels changed: added release-none, removed go1.3maybe.

@rsc
Contributor Author

rsc commented Dec 4, 2013

Comment 2:

Labels changed: added repo-main.

@odeke-em
Member

@mdempsky might you be interested in this?

@mdempsky mdempsky modified the milestones: Go1.10, Go1.11 Nov 29, 2017
@bradfitz bradfitz added help wanted NeedsFix The path to resolution is known, but the work has not been done. labels May 29, 2018
@bradfitz bradfitz modified the milestones: Go1.11, Go1.12 May 29, 2018
@Kingwl

Kingwl commented Aug 27, 2018

I'd like to try this as my first PR,
though I may need some help :)

@josharian
Contributor

Help is always welcome. The first step would be to reproduce the issue and convince ourselves that it is worth fixing. This issue was originally filed in 2013, and a lot has changed in the compiler since then. A good way to demonstrate this would be to write some realistic code (presumably autogenerated, maybe from go-bindata, cc @kevinburke) demonstrating that the byte slices are in fact still a major memory factor. And if it turns out that they aren't, that's also really useful to know. Thanks!
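
One possible starting point for a reproduction is a small generator that emits a Go source file containing a `[]byte` literal of a chosen size. This is only a sketch: the function name `genByteLiteral` and the output layout are assumptions, and the output is synthetic rather than real go-bindata output.

```go
package main

import (
	"fmt"
	"strings"
)

// genByteLiteral returns Go source declaring a package-level []byte
// variable with n constant elements, 16 per line.
func genByteLiteral(pkg, name string, n int) string {
	var b strings.Builder
	fmt.Fprintf(&b, "package %s\n\nvar %s = []byte{", pkg, name)
	for i := 0; i < n; i++ {
		if i%16 == 0 {
			b.WriteString("\n\t")
		} else {
			b.WriteString(" ")
		}
		// byte(i) wraps around at 256, so the values cycle 0x00..0xff.
		fmt.Fprintf(&b, "0x%02x,", byte(i))
	}
	b.WriteString("\n}\n")
	return b.String()
}

func main() {
	// A tiny literal for display; a real test would use a much larger n
	// and redirect the output into a .go file.
	fmt.Print(genByteLiteral("p", "data", 32))
}
```

Compiling the generated file for several values of n, while recording the compiler's peak memory, would show whether the cost per element is still significant.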

@pacew

pacew commented Nov 5, 2018

Hi. I'm new here, and not ready to dig into the compiler sources yet, but I can report on some data I collected relevant to the question of "what is the current situation for compiling big bytes literals?".

The results in short: on Ubuntu 18.04 with go version 1.11.2 linux/amd64, []byte literals of up to 128k elements need no appreciable extra compiler memory, and neither do string literals of at least 2 million elements. Larger []byte literals require about 865 extra bytes of compiler memory for every additional element, so a 2 million element literal needs 1.8 gigabytes of compiler memory.

Here's a graph: https://github.com/pacew/goissue6643/blob/master/goissue6643.png

Perhaps this will give the compiler stewards the information needed to decide whether to pursue making bytes literals as efficient as strings.

My approach is to use the setrlimit(2) system call to squeeze down the data segment size for the compiler until it fails on a test file with a literal of a given size, then plot the results for "string" and "bytes" literals.
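
The search itself can be sketched as a binary search over the limit. The code below is an illustration in Go, not the C helper from the repository: `minLimit` is a hypothetical name, and the predicate is a stand-in. In a real run it would start the compiler in a child process with the data segment limit set and report whether compilation succeeded.

```go
package main

import "fmt"

// minLimit returns the smallest limit in [lo, hi] for which ok reports
// true, assuming ok is monotonic: false below some threshold and true at
// or above it. If ok(hi) is false, it returns hi+1 to signal "not found".
func minLimit(lo, hi uint64, ok func(limit uint64) bool) uint64 {
	if !ok(hi) {
		return hi + 1
	}
	// Invariant: ok(hi) is true, and every limit below lo fails.
	for lo < hi {
		mid := lo + (hi-lo)/2
		if ok(mid) {
			hi = mid
		} else {
			lo = mid + 1
		}
	}
	return lo
}

func main() {
	// Stand-in predicate: pretend the compiler needs exactly 200 MiB.
	// This number is illustrative, not a measurement.
	need := uint64(200 << 20)
	compiles := func(limit uint64) bool { return limit >= need }

	fmt.Println(minLimit(0, 4<<30, compiles))
}
```

Each probe costs one full compiler run, so the logarithmic number of probes matters when the test files are large.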

It appears the compiler requires a data segment size of 200 megabytes to do anything, and this stays pretty much constant for string literals up to at least 2 megabytes.

Using []byte literals with up to 128k elements doesn't change the compiler's data requirement. At 256k elements, the compiler requires 306 megabytes; beyond that the growth is linear, with about 865 bytes of compiler data needed for each additional element of the literal. I tested as far as a 2 million element literal, which needed 1.8 gigabytes of compiler RAM.
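
These figures can be checked with a small linear model. The sketch below assumes that "256k" means 256×1024 elements, that "megabytes" are decimal, and that "2 million" means 2,000,000; none of these is stated explicitly above. The function name `estimateCompilerBytes` is made up for the example, and the region between 128k and 256k elements is left undefined because no data is reported for it.

```go
package main

import "fmt"

const (
	baselineBytes = 200000000  // flat requirement up to 128k elements
	flatElems     = 128 * 1024 // largest size reported with no extra cost
	kneeElems     = 256 * 1024 // first size reported with extra cost
	kneeBytes     = 306000000  // requirement measured at kneeElems
	bytesPerElem  = 865        // slope reported beyond the knee
)

// estimateCompilerBytes returns the modelled data segment size for a
// []byte literal of n elements. The second result is false where the
// reported measurements do not cover n.
func estimateCompilerBytes(n int64) (int64, bool) {
	switch {
	case n <= flatElems:
		return baselineBytes, true
	case n >= kneeElems:
		return kneeBytes + bytesPerElem*(n-kneeElems), true
	default:
		return 0, false
	}
}

func main() {
	est, _ := estimateCompilerBytes(2000000)
	fmt.Printf("%.2f GB\n", float64(est)/1e9)
}
```

Under these assumptions the model gives 1,809,245,440 bytes for a 2,000,000 element literal, which agrees with the reported 1.8 gigabytes.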

The repository https://github.com/pacew/goissue6643 contains the helper programs I wrote (mainly a C program that does a binary search on ulimits), along with a graph of the results.
