Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compress/flate, archive/zip: large memory allocations reading flate zip archive with many files #59774

Closed
subtle-byte opened this issue Apr 22, 2023 · 3 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance

Comments

@subtle-byte
Copy link
Contributor

subtle-byte commented Apr 22, 2023

What version of Go are you using (go version)?

$ go version
go version go1.20.3 linux/amd64

Does this issue reproduce with the latest release?

Yes

What did you do? What did you see instead?

I noticed large memory allocations reading a flate-endoded zip archive with many files. Looks like opening each file 4KB buffer is allocated (if I read the files sequentially it is not needed - GC does a lot of work). So if I open a zip archive with 50000 files the memory allocations will be about 200 MB in sum just for the buffers.

See the test in the prepared PR for a code example.

What did you expect to see?

Less memory allocations.

@gopherbot
Copy link

Change https://go.dev/cl/487675 mentions this issue: compress/flate, archive/zip: reduce memory allocations

@seankhliao seankhliao added Performance NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Apr 22, 2023
@seankhliao
Copy link
Member

cc @dsnet

related #32371 ?

@subtle-byte subtle-byte changed the title compress/flate, archive/zip: large memory allocations reading zip archive with many files compress/flate, archive/zip: large memory allocations reading flate zip archive with many files Apr 22, 2023
@subtle-byte
Copy link
Contributor Author

subtle-byte commented Apr 22, 2023

@seankhliao I think it is unrelated, the mentioned issue does not say about allocations in Reset method, but mine is actually about this (zip reader calls Reset for each file in the archive). Let me show an example:

func BenchmarkFlateReset(b *testing.B) {
	r := &struct{ io.Reader }{strings.NewReader("abc")} // so r does not implement io.ByteReader
	fr := flate.NewReader(r) // 4KB buffer is allocated
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		for i := 0; i < 1000; i++ {
			fr.(flate.Resetter).Reset(r, nil) // Should not allocate 4KB buffer again
		}
	}
}

In the latest Go I get the following results:

BenchmarkFlateReset-8   	    1821	    695933 ns/op	 4192019 B/op	    2000 allocs/op

So it is allocated about (unsafe.Sizeof(bufio.Reader{})+<default bufio.Reader buffer size>)*1000=(88+4096)*1000. But 96 bytes are allocated instead of 88 because it is first larger allocation size class. So (96+4096)*1000=4_192_000 bytes.
After my PR it looks better:

BenchmarkFlateReset-8   	   10000	    115474 ns/op	       0 B/op	       0 allocs/op

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants