Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: bounds checking for unrolled loops is weird #30945

Open
seebs opened this issue Mar 20, 2019 · 7 comments
Open

cmd/compile: bounds checking for unrolled loops is weird #30945

seebs opened this issue Mar 20, 2019 · 7 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@seebs
Copy link
Contributor

seebs commented Mar 20, 2019

What version of Go are you using (go version)?

$ go version
1.12 (but also 1.11)

Does this issue reproduce with the latest release?

seems to

What operating system and processor architecture are you using (go env)?

amd64, but N/A-ish?

go env Output
$ go env

What did you do?

Tried to unroll a loop, then tried to reduce bounds checking. Here's the godbolt page with the assembly I was looking at.

https://godbolt.org/z/GQuZaf

What did you expect to see?

A lot less bounds checking.

What did you see instead?

It's obvious to me that i := 0; i < 1024; i += 4 produces no values of i such that i+3 is is outside a :1024 slice.

However, there's an even weirder thing: If I do that loop, I get three bounds checks, one for each of the three lines that uses a value computed from i.

But if I change it to i < 1024-3, which does not actually change what values i can ever take, it adds a bounds check for the plain i+0 line! That can't possibly be right. If I know the loop's limit is definitely smaller than the size I've already checked the slice to have, that should allow eliminating some bounds checks. (Interestingly, if I set the limit to 1020, then all the bounds checks go away. I suppose I could just unroll the last four entries separately.)

@randall77
Copy link
Contributor

It's obvious to me that i := 0; i < 1024; i += 4 produces no values of i such that i+3 is is outside a :1024 slice.

It's obvious to me also. But it isn't obvious to the compiler. Someone would need to write code to teach the compiler about strided inequalities like this.
Note that it's not trivial. For instance, your claim is untrue for i := 0; i < 1023; i += 4 and a :1023 slice.

However, there's an even weirder thing: If I do that loop, I get three bounds checks, one for each of the three lines that uses a value computed from i.
But if I change it to i < 1024-3, which does not actually change what values i can ever take, it adds a bounds check for the plain i+0 line!

I believe what is happening here is that when using for i := 0; i < 1024; i+= 4, the compiler at least realizes that i never wraps around and becomes negative. That allows it to remove the first bounds check. When you do for i := 0; i < 1021; i+= 4, the logic for wraparound detection fails and the compiler must assume that i can wrap around and become negative. So the first bounds check must remain.

I think we could improve the analysis here to fix both of these issues. I'll cc some bounds check / induction variable people.
@brtzsnr @rasky @dr2chase

@mikioh mikioh changed the title bounds checking for unrolled loops is weird cmd/compile: bounds checking for unrolled loops is weird Mar 20, 2019
@go101
Copy link

go101 commented Mar 20, 2019

Looks for i := 0; i <= N-4; i += 4 { can avoid bounds checking.

@seebs
Copy link
Contributor Author

seebs commented Mar 20, 2019

I was originally thinking strided inequalities would be a cool feature. The 1021 case was an accident I ran into while trying to add a hint so the compiler would know that i could never take a value such that i+3 >= 1024, and it added a panic check.

Interestingly: At 1020, there's no panic checks at all, but at 1019 there's all four again. So I think this is probably related to stride analysis in some way.

... And you're right, <= N-4 works. It is very strange that <= N-4 works and < N-3 fails.

Oh, and for extra fun: <= N-4 doesn't work if the index type is uint.

@dr2chase
Copy link
Contributor

There's a CL that's a first step to improving this.
https://go-review.googlesource.com/c/go/+/136375

I don't think it quite works for your case.
Because bounds checking CLs are so tricky, I'd prefer to keep them as small as possible, and rather than pile improvements into this one, just build on it.

@seebs
Copy link
Contributor Author

seebs commented Mar 20, 2019

Yeah, that makes sense. This is some tricky code.

@randall77 randall77 added this to the Unplanned milestone Mar 20, 2019
@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 28, 2019
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022
@agnivade
Copy link
Contributor

agnivade commented Sep 3, 2022

@seebs - This seems like it's fixed in tip now with https://go-review.googlesource.com/c/go/+/415937. Could you confirm?

@go101
Copy link

go101 commented Jan 11, 2023

Yes, it is fixed.

But, if N is not a constant, there are still bound checks:

func foo(s []byte) {
	for i := 0; i < len(s) - 3; i += 4 {
		_ = s[i+3] // Found IsInBounds
		_ = s[i+2] // Found IsInBounds
		_ = s[i+1] // Found IsInBounds
		_ = s[i]
	}
}

func bar(s []byte) {
	for i := 0; i <= len(s) - 4; i += 4 {
		_ = s[i+3] // Found IsInBounds
		_ = s[i+2] // Found IsInBounds
		_ = s[i+1] // Found IsInBounds
		_ = s[i]
	}
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

7 participants