Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

container/list: insert and remove is re-used but contains inefficiencies #27747

Closed
ValarDragon opened this issue Sep 19, 2018 · 7 comments
Closed
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done. Performance
Milestone

Comments

@ValarDragon
Copy link

ValarDragon commented Sep 19, 2018

Within list, currently MoveToFront is implemented as follows:

func (l *List) MoveToFront(e *Element) {
	if e.list != l || l.root.next == e {
		return
	}
	// see comment in List.Remove about initialization of l
	l.insert(l.remove(e), &l.root)
}

Because we're inserting the node back in immediately, the following 4 lines in remove are unnecessary, as their effect is undone by insert:

	e.next = nil // avoid memory leaks
	e.prev = nil // avoid memory leaks
	e.list = nil
  l.len--

(similarly the lines in insert that set e.list and increase l.len are also unnecessary)

This pattern of l.insert(l.remove(e), node) is used a couple of times in this function. The efficiency could be improved by making a single combined function to do both remove from one location and insert elsewhere, which doesn't perform these unnecessary lines.

This would increase efficiency by reducing the number of statements, and by reducing function call overhead.

I would be happy to write a benchmark for this, and submit a CL for this if it seems valuable.

@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 19, 2018
@bcmills bcmills added this to the Unplanned milestone Sep 19, 2018
@bcmills
Copy link
Contributor

bcmills commented Sep 19, 2018

I would be somewhat surprised if those lines have any performance effect at all: the compiler should be able to inline both insert and remove, and in that case can clearly see that the stores are redundant.

(Number of statements and number of function calls do not necessarily correlate to performance.)

@bcmills
Copy link
Contributor

bcmills commented Sep 19, 2018

Beyond that, I would expect that list usage in practice is pretty much completely dominated by cache effects, and redundant stores have very little impact on the cache hierarchy.

Unoptimized dead stores may make a big difference in microbenchmarks of very small lists, but I would expect that they're in the noise in the vast majority of real programs.

@ValarDragon
Copy link
Author

ValarDragon commented Sep 21, 2018

This did actually make some a performance impact on my system:

Before:

BenchmarkListMoveToFront/size_16-8         	    100000000	        12.6 ns/op	       0 B/op	       0 allocs/op
BenchmarkListMoveToFront/size_64-8         	    100000000	        12.5 ns/op	       0 B/op	       0 allocs/op
BenchmarkListMoveToFront/size_256-8         	  100000000	        13.2 ns/op	       0 B/op	       0 allocs/op
BenchmarkListMoveToFront/size_1024-8        	  100000000	        13.2 ns/op	       0 B/op	       0 allocs/op
BenchmarkListMoveToFront/size_4096-8        	  100000000	        13.1 ns/op	       0 B/op	       0 allocs/op
BenchmarkListMoveToFront/size_16384-8        	  100000000	        13.1 ns/op	       0 B/op	       0 allocs/op
BenchmarkListMoveToFront/size_65536-8        	  100000000	        13.1 ns/op	       0 B/op	       0 allocs/op
BenchmarkListMoveToFront/size_262144-8        	100000000	        14.2 ns/op	       0 B/op	       0 allocs/op

After:

BenchmarkListMoveToFront/size_16-8         	    200000000	         9.31 ns/op	       0 B/op	       0 allocs/op
BenchmarkListMoveToFront/size_64-8         	    200000000	         9.29 ns/op	       0 B/op	       0 allocs/op
BenchmarkListMoveToFront/size_256-8         	  200000000	         9.38 ns/op	       0 B/op	       0 allocs/op
BenchmarkListMoveToFront/size_1024-8        	  200000000	         9.58 ns/op	       0 B/op	       0 allocs/op
BenchmarkListMoveToFront/size_4096-8        	  200000000	         9.79 ns/op	       0 B/op	       0 allocs/op
BenchmarkListMoveToFront/size_16384-8        	  200000000	         9.83 ns/op	       0 B/op	       0 allocs/op
BenchmarkListMoveToFront/size_65536-8        	  100000000	        10.2 ns/op	       0 B/op	       0 allocs/op
BenchmarkListMoveToFront/size_262144-8        	100000000	        10.9 ns/op	       0 B/op	       0 allocs/op

This is suffciently simple to implement that I think we should include it. Your right that the performance of this probably doesn't make a difference in most applications. Though it may matter for LRU's.

@ghost
Copy link

ghost commented Sep 21, 2018

@ValarDragon you can use benchstat to get a better comparison
https://godoc.org/golang.org/x/perf/cmd/benchstat

@ValarDragon
Copy link
Author

ValarDragon commented Sep 21, 2018

Sure:

ListMoveToFront/size_16-8        12.9ns ± 0%     9.4ns ± 0%  -27.11%  (p=0.008 n=5+5)
ListMoveToFront/size_64-8        13.1ns ± 1%     9.4ns ± 1%  -28.12%  (p=0.008 n=5+5)
ListMoveToFront/size_256-8       13.3ns ± 2%     9.5ns ± 1%  -28.73%  (p=0.008 n=5+5)
ListMoveToFront/size_1024-8      13.5ns ± 2%     9.7ns ± 1%  -28.73%  (p=0.008 n=5+5)
ListMoveToFront/size_4096-8      13.7ns ± 4%     9.8ns ± 1%  -28.21%  (p=0.008 n=5+5)
ListMoveToFront/size_16384-8     13.6ns ± 4%     9.9ns ± 2%  -26.96%  (p=0.008 n=5+5)
ListMoveToFront/size_65536-8     14.0ns ± 5%    10.3ns ± 1%  -26.57%  (p=0.008 n=5+5)
ListMoveToFront/size_262144-8    14.7ns ± 3%    11.1ns ± 0%  -24.69%  (p=0.000 n=5+4)
ListMoveToFront/size_524288-8    14.3ns ± 1%    11.0ns ± 1%  -22.91%  (p=0.008 n=5+5)

@ianlancetaylor ianlancetaylor added help wanted NeedsFix The path to resolution is known, but the work has not been done. labels Sep 24, 2018
@gopherbot gopherbot removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 24, 2018
jamdagni86 added a commit to jamdagni86/go that referenced this issue Oct 26, 2018
…lements within a list.

Methods MoveToFront, MoveToBack, MoveBefore and MoveAfter remove the given element from the list and later add the same element at a desired position. This has resulted in some inefficiencies, with remove method making 4 unncessary operations and insert method making 2. The commit merges these two methods into a method called move.

Fixes golang#27747
@gopherbot
Copy link

Change https://golang.org/cl/144998 mentions this issue: container/list: combining insert and remove operations while moving elements within a list.

@jamdagni86
Copy link
Contributor

Benchmark results:

name                     old time/op  new time/op  delta
MoveToFront/len_16-8     6.65ns ± 0%  5.75ns ± 0%  -13.53%  (p=1.000 n=1+1)
MoveToFront/len_32-8     6.78ns ± 0%  5.71ns ± 0%  -15.78%  (p=1.000 n=1+1)
MoveToFront/len_64-8     6.68ns ± 0%  5.84ns ± 0%  -12.57%  (p=1.000 n=1+1)
MoveToFront/len_128-8    6.98ns ± 0%  5.86ns ± 0%  -16.05%  (p=1.000 n=1+1)
MoveToFront/len_256-8    6.73ns ± 0%  5.84ns ± 0%  -13.22%  (p=1.000 n=1+1)
MoveToFront/len_512-8    6.86ns ± 0%  5.78ns ± 0%  -15.74%  (p=1.000 n=1+1)
MoveToFront/len_1024-8   7.02ns ± 0%  5.88ns ± 0%  -16.24%  (p=1.000 n=1+1)
MoveToFront/len_2048-8   6.83ns ± 0%  5.82ns ± 0%  -14.79%  (p=1.000 n=1+1)
MoveToFront/len_4096-8   6.80ns ± 0%  5.91ns ± 0%  -13.09%  (p=1.000 n=1+1)
MoveToFront/len_8192-8   6.84ns ± 0%  5.84ns ± 0%  -14.62%  (p=1.000 n=1+1)
MoveToFront/len_16384-8  6.89ns ± 0%  5.86ns ± 0%  -14.95%  (p=1.000 n=1+1)
MoveToFront/len_32768-8  6.89ns ± 0%  5.90ns ± 0%  -14.37%  (p=1.000 n=1+1)
MoveToFront/len_65536-8  6.84ns ± 0%  5.87ns ± 0%  -14.18%  (p=1.000 n=1+1)
MoveToBack/len_16-8      6.76ns ± 0%  5.10ns ± 0%  -24.56%  (p=1.000 n=1+1)
MoveToBack/len_32-8      6.85ns ± 0%  5.10ns ± 0%  -25.55%  (p=1.000 n=1+1)
MoveToBack/len_64-8      7.45ns ± 0%  5.13ns ± 0%  -31.14%  (p=1.000 n=1+1)
MoveToBack/len_128-8     7.39ns ± 0%  5.11ns ± 0%  -30.85%  (p=1.000 n=1+1)
MoveToBack/len_256-8     7.38ns ± 0%  5.17ns ± 0%  -29.95%  (p=1.000 n=1+1)
MoveToBack/len_512-8     7.54ns ± 0%  5.17ns ± 0%  -31.43%  (p=1.000 n=1+1)
MoveToBack/len_1024-8    8.55ns ± 0%  5.18ns ± 0%  -39.42%  (p=1.000 n=1+1)
MoveToBack/len_2048-8    7.34ns ± 0%  5.21ns ± 0%  -29.02%  (p=1.000 n=1+1)
MoveToBack/len_4096-8    7.92ns ± 0%  5.32ns ± 0%  -32.83%  (p=1.000 n=1+1)
MoveToBack/len_8192-8    8.02ns ± 0%  5.42ns ± 0%  -32.42%  (p=1.000 n=1+1)
MoveToBack/len_16384-8   8.15ns ± 0%  5.42ns ± 0%  -33.50%  (p=1.000 n=1+1)
MoveToBack/len_32768-8   7.99ns ± 0%  5.52ns ± 0%  -30.91%  (p=1.000 n=1+1)
MoveToBack/len_65536-8   8.69ns ± 0%  5.49ns ± 0%  -36.82%  (p=1.000 n=1+1)
MoveBefore/len_16-8      7.25ns ± 0%  4.77ns ± 0%  -34.21%  (p=1.000 n=1+1)
MoveBefore/len_32-8      7.19ns ± 0%  4.83ns ± 0%  -32.82%  (p=1.000 n=1+1)
MoveBefore/len_64-8      7.36ns ± 0%  4.86ns ± 0%  -33.97%  (p=1.000 n=1+1)
MoveBefore/len_128-8     7.66ns ± 0%  4.86ns ± 0%  -36.55%  (p=1.000 n=1+1)
MoveBefore/len_256-8     7.53ns ± 0%  4.85ns ± 0%  -35.59%  (p=1.000 n=1+1)
MoveBefore/len_512-8     7.52ns ± 0%  4.93ns ± 0%  -34.44%  (p=1.000 n=1+1)
MoveBefore/len_1024-8    7.69ns ± 0%  4.86ns ± 0%  -36.80%  (p=1.000 n=1+1)
MoveBefore/len_2048-8    7.84ns ± 0%  4.95ns ± 0%  -36.86%  (p=1.000 n=1+1)
MoveBefore/len_4096-8    7.94ns ± 0%  5.01ns ± 0%  -36.90%  (p=1.000 n=1+1)
MoveBefore/len_8192-8    8.21ns ± 0%  5.12ns ± 0%  -37.64%  (p=1.000 n=1+1)
MoveBefore/len_16384-8   9.01ns ± 0%  5.11ns ± 0%  -43.29%  (p=1.000 n=1+1)
MoveBefore/len_32768-8   8.25ns ± 0%  5.21ns ± 0%  -36.85%  (p=1.000 n=1+1)
MoveBefore/len_65536-8   8.39ns ± 0%  5.20ns ± 0%  -38.02%  (p=1.000 n=1+1)
MoveAfter/len_16-8       7.50ns ± 0%  5.04ns ± 0%  -32.80%  (p=1.000 n=1+1)
MoveAfter/len_32-8       7.80ns ± 0%  5.05ns ± 0%  -35.26%  (p=1.000 n=1+1)
MoveAfter/len_64-8       7.50ns ± 0%  5.13ns ± 0%  -31.60%  (p=1.000 n=1+1)
MoveAfter/len_128-8      7.62ns ± 0%  5.04ns ± 0%  -33.86%  (p=1.000 n=1+1)
MoveAfter/len_256-8      7.50ns ± 0%  5.14ns ± 0%  -31.47%  (p=1.000 n=1+1)
MoveAfter/len_512-8      7.47ns ± 0%  5.37ns ± 0%  -28.11%  (p=1.000 n=1+1)
MoveAfter/len_1024-8     7.45ns ± 0%  5.34ns ± 0%  -28.32%  (p=1.000 n=1+1)
MoveAfter/len_2048-8     7.59ns ± 0%  5.14ns ± 0%  -32.28%  (p=1.000 n=1+1)
MoveAfter/len_4096-8     7.55ns ± 0%  5.30ns ± 0%  -29.80%  (p=1.000 n=1+1)
MoveAfter/len_8192-8     7.69ns ± 0%  5.38ns ± 0%  -30.04%  (p=1.000 n=1+1)
MoveAfter/len_16384-8    7.74ns ± 0%  5.35ns ± 0%  -30.88%  (p=1.000 n=1+1)
MoveAfter/len_32768-8    7.73ns ± 0%  5.32ns ± 0%  -31.18%  (p=1.000 n=1+1)

@golang golang locked and limited conversation to collaborators Oct 26, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done. Performance
Projects
None yet
Development

No branches or pull requests

5 participants