
compress/gzip: pooling flate.Writer #65114

Closed

proost opened this issue Jan 16, 2024 · 6 comments
Labels: NeedsInvestigation (someone must examine and confirm this is a valid issue and not a duplicate of an existing one), Performance

Comments

proost commented Jan 16, 2024

Proposal Details

Could we pool the flate.Writer that is allocated below?

		if z.compressor == nil {
			z.compressor, _ = flate.NewWriter(z.w, z.level)
		}
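
For illustration, a sketch of what such a change might look like inside compress/gzip. This is hypothetical code, not the actual implementation: flateWriterPools and poolIndex are invented names here, and a real change would need one pool per compression level, since flate.Writer.Reset keeps the writer's existing level.

// Hypothetical fragment inside compress/gzip/gzip.go, not real code.
// One pool per level, because flate.Writer.Reset reuses the old level.
var flateWriterPools [12]sync.Pool // levels -2 (HuffmanOnly) through 9

func poolIndex(level int) int { return level + 2 }

// In (*Writer).Write, reuse a pooled compressor when one is available:
if z.compressor == nil {
	if fw, _ := flateWriterPools[poolIndex(z.level)].Get().(*flate.Writer); fw != nil {
		fw.Reset(z.w)
		z.compressor = fw
	} else {
		z.compressor, _ = flate.NewWriter(z.w, z.level)
	}
}

// In (*Writer).Close, after flushing, return the compressor to the pool:
// flateWriterPools[poolIndex(z.level)].Put(z.compressor)
// z.compressor = nil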

Before:

goos: darwin
goarch: arm64
pkg: compress/gzip
Benchmark_gzip
Benchmark_gzip-10         19827         65751 ns/op        807084 B/op        16 allocs/op

After:

goos: darwin
goarch: arm64
pkg: compress/gzip
Benchmark_gzip
Benchmark_gzip-10    	   79310	     13082 ns/op	     288 B/op	       3 allocs/op

I know that most code using gzip.Writer already keeps a pool of gzip.Writer values, but pooling the flate.Writer as well can reduce memory pressure slightly more.

@gopherbot gopherbot added this to the Proposal milestone Jan 16, 2024
@mauri870 mauri870 added Performance NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jan 16, 2024
mauri870 (Member) commented Jan 16, 2024

Taking this out of the proposal process as it does not require API changes.

I think adding a pool for flate is valid given the improvements in performance.

cc @dsnet per https://dev.golang.org/owners

@mauri870 mauri870 changed the title proposal: compress/gzip: pooling flate.Writer compress/gzip: pooling flate.Writer Jan 16, 2024
@mauri870 mauri870 removed the Proposal label Jan 16, 2024
proost (Author) commented Jan 17, 2024

Can I contribute a fix for this issue?

AlexanderYastrebov (Contributor) commented Jan 17, 2024

> can reduce memory pressure slightly more.

How much difference would there be between pooling the gzip.Writer versus pooling the flate.Writer compressor?

> Benchmark_gzip

This looks like a custom benchmark; could you provide the code, please?

proost (Author) commented Jan 18, 2024

@AlexanderYastrebov Sure, absolutely!

I found that the results vary depending on the compression level, so I have expanded the benchmark to cover each level:

// These benchmarks live in the compress/gzip package itself,
// so NewWriterLevel and the level constants are unqualified.
package gzip

import (
	"bytes"
	"testing"
)

func Benchmark_gzipHuffmanOnly(b *testing.B) {
	b.ReportAllocs()
	b.ResetTimer()

	benchmarkGzipLevel(b, HuffmanOnly)
}

func Benchmark_gzipDefaultCompression(b *testing.B) {
	b.ReportAllocs()
	b.ResetTimer()

	benchmarkGzipLevel(b, DefaultCompression)
}

func Benchmark_gzip0(b *testing.B) {
	b.ReportAllocs()
	b.ResetTimer()

	benchmarkGzipLevel(b, NoCompression)
}

func Benchmark_gzip1(b *testing.B) {
	b.ReportAllocs()
	b.ResetTimer()

	benchmarkGzipLevel(b, 1)
}

func Benchmark_gzip2(b *testing.B) {
	b.ReportAllocs()
	b.ResetTimer()

	benchmarkGzipLevel(b, 2)
}

func Benchmark_gzip3(b *testing.B) {
	b.ReportAllocs()
	b.ResetTimer()

	benchmarkGzipLevel(b, 3)
}

func Benchmark_gzip4(b *testing.B) {
	b.ReportAllocs()
	b.ResetTimer()

	benchmarkGzipLevel(b, 4)
}

func Benchmark_gzip5(b *testing.B) {
	b.ReportAllocs()
	b.ResetTimer()

	benchmarkGzipLevel(b, 5)
}

func Benchmark_gzip6(b *testing.B) {
	b.ReportAllocs()
	b.ResetTimer()

	benchmarkGzipLevel(b, 6)
}

func Benchmark_gzip7(b *testing.B) {
	b.ReportAllocs()
	b.ResetTimer()

	benchmarkGzipLevel(b, 7)
}

func Benchmark_gzip8(b *testing.B) {
	b.ReportAllocs()
	b.ResetTimer()

	benchmarkGzipLevel(b, 8)
}

func Benchmark_gzip9(b *testing.B) {
	b.ReportAllocs()
	b.ResetTimer()

	benchmarkGzipLevel(b, 9)
}

func benchmarkGzipLevel(b *testing.B, level int) {
	for i := 0; i < b.N; i++ {
		var buf bytes.Buffer
		w, _ := NewWriterLevel(&buf, level)
		w.Write([]byte("hello world"))
		w.Flush()
		w.Close()
	}
}
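
From a Go source checkout, the full set can be run with something like the following (the exact invocation may vary with your setup):

	go test -run='^$' -bench='Benchmark_gzip' compress/gzip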
All results: goos: darwin, goarch: arm64, pkg: compress/gzip. Columns are iterations, ns/op, B/op, allocs/op.

Level              | pooling                 | non-pooling
HuffmanOnly        | 788176, 1490, 328, 4    | 19839, 56348, 737974, 18
DefaultCompression | 73125, 15166, 315, 4    | 18640, 59237, 814007, 20
0 (NoCompression)  | 4879906, 238.9, 394, 4  | 26554, 44113, 733363, 16
1                  | 5075827, 237.3, 410, 4  | 16958, 71365, 1200313, 19
2                  | 80785, 14882, 324, 4    | 17382, 65414, 814008, 20
3                  | 77919, 14819, 304, 4    | 18144, 64379, 814008, 20
4                  | 76875, 14987, 325, 4    | 18896, 63697, 814008, 20
5                  | 74888, 14885, 304, 4    | 18529, 66656, 814008, 20
6                  | 79525, 14897, 324, 4    | 18442, 63420, 814008, 20
7                  | 76423, 14833, 314, 4    | 18171, 64918, 814008, 20
8                  | 77782, 14842, 325, 4    | 17658, 65884, 814009, 20
9                  | 78577, 14919, 324, 4    | 16924, 67334, 814009, 20

dsnet (Member) commented Jan 18, 2024

I don't think it's wise to add pooling in situations where sync.Pool.Get and the corresponding sync.Pool.Put are outside the control of the package itself.

In this situation, the call to sync.Pool.Get would presumably occur in gzip.Writer.Write, while the associated call to sync.Pool.Put would presumably happen in gzip.Writer.Close. Whether these are ever called is outside of gzip's control. This has at least two issues:

  • It assumes that users always call gzip.Writer.Close; otherwise we leak the pooled resource, which can affect global performance. Unlike os.File, where not closing the file leaks a file descriptor, there has never been a requirement for gzip.Writer.Close to be called. In many code paths where an error may occur, there is no deferred call to gzip.Writer.Close.
  • It will hurt use cases that already rely on gzip.Writer.Reset. When gzip.Writer.Close is called, it cannot distinguish whether the user intends to manually reuse the writer or whether the writer will shortly be GC-able. Always putting the resource back in the pool and grabbing it back out again later may have pathological effects if the pool is often drained (perhaps due to the previous effect). A sync.Pool is more expensive than a regular allocation when empty, since it needs to do O(n) work checking whether other P's have a resource available.

With the current API, pooling should be the responsibility of the user.
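
For reference, the user-side pattern looks roughly like this (a minimal sketch; the gzipPool and compress names are illustrative, not from this issue):

package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"sync"
)

// gzipPool reuses gzip.Writers across calls; Reset retargets a pooled
// writer (and its internal flate.Writer) without reallocating its state.
var gzipPool = sync.Pool{
	New: func() any { return gzip.NewWriter(nil) },
}

func compress(dst *bytes.Buffer, p []byte) error {
	zw := gzipPool.Get().(*gzip.Writer)
	defer gzipPool.Put(zw)
	zw.Reset(dst)
	if _, err := zw.Write(p); err != nil {
		return err
	}
	return zw.Close() // flushes and writes the gzip footer
}

func main() {
	var buf bytes.Buffer
	if err := compress(&buf, []byte("hello world")); err != nil {
		panic(err)
	}
	fmt.Printf("%d compressed bytes\n", buf.Len())
}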

proost (Author) commented Jan 19, 2024

OK, understood. Thank you for answering!

@proost proost closed this as completed Jan 19, 2024