Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: TestSmhasherWindowed is still flaky on linux-386-longtest #43130

Closed
bcmills opened this issue Dec 11, 2020 · 17 comments
Closed

runtime: TestSmhasherWindowed is still flaky on linux-386-longtest #43130

bcmills opened this issue Dec 11, 2020 · 17 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker Testing An issue that has been verified to require only test changes, not just a test failure.
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Dec 11, 2020

2020-12-10T22:58:49-6d3d3fb/linux-386-longtest

##### GOMAXPROCS=2 runtime -cpu=1,2,4 -quick
--- FAIL: TestSmhasherWindowed (27.87s)
    hash_test.go:566: 32 bit keys
    hash_test.go:161: unexpected number of collisions: got=512 mean=0.499992 stddev=0.707101 threshold=156.565200
    hash_test.go:568: 64 bit keys
    hash_test.go:570: string keys
--- FAIL: TestSmhasherWindowed (26.69s)
    hash_test.go:566: 32 bit keys
    hash_test.go:161: unexpected number of collisions: got=512 mean=0.499992 stddev=0.707101 threshold=156.565200
    hash_test.go:568: 64 bit keys
    hash_test.go:570: string keys
--- FAIL: TestSmhasherWindowed (26.69s)
    hash_test.go:566: 32 bit keys
    hash_test.go:161: unexpected number of collisions: got=512 mean=0.499992 stddev=0.707101 threshold=156.565200
    hash_test.go:568: 64 bit keys
    hash_test.go:570: string keys
FAIL
FAIL	runtime	598.996s

2020-10-26T21:47:49-36c5edd/linux-386-longtest

--- FAIL: TestSmhasherWindowed (26.23s)
    hash_test.go:566: 32 bit keys
    hash_test.go:161: unexpected number of collisions: got=512 mean=0.499992 stddev=0.707101 threshold=156.565200
    hash_test.go:568: 64 bit keys
    hash_test.go:161: unexpected number of collisions: got=512 mean=0.499992 stddev=0.707101 threshold=156.565200
    hash_test.go:570: string keys
FAIL
FAIL	runtime	169.624s

See previously #39352.

I would guess that it's flaky on 32-bit ARM too, but we don't have a longtest builder for that configuration.

CC @randall77

@bcmills bcmills added Testing An issue that has been verified to require only test changes, not just a test failure. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Dec 11, 2020
@bcmills bcmills added this to the Backlog milestone Dec 11, 2020
@bcmills
Copy link
Contributor Author

bcmills commented Jan 25, 2021

2021-01-22T23:03:58-25c39e4/linux-386-longtest

--- FAIL: TestSmhasherWindowed (26.54s)
    hash_test.go:566: 32 bit keys
    hash_test.go:568: 64 bit keys
    hash_test.go:570: string keys
    hash_test.go:161: unexpected number of collisions: got=514 mean=0.499992 stddev=0.707101 threshold=156.565200
    hash_test.go:161: unexpected number of collisions: got=256 mean=0.499992 stddev=0.707101 threshold=156.565200
FAIL
FAIL	runtime	182.398s

@gopherbot
Copy link

Change https://golang.org/cl/280372 mentions this issue: runtime: using wyhash for memhashFallback on 32bit platform

gopherbot pushed a commit that referenced this issue Mar 16, 2021
wyhash is a general hash function that:

1. Default hash function of Zig, Nim
2. Passed Smhasher, BigCrush and PractRand
3. Less code
4. 3~26% faster than internal hashmap

name                  old time/op    new time/op    delta
Hash5                   67.8ns ± 0%    65.4ns ± 0%   -3.45%  (p=0.000 n=7+10)
Hash16                  82.5ns ± 0%    74.2ns ± 0%  -10.12%  (p=0.000 n=6+8)
Hash64                   121ns ± 0%     102ns ± 0%  -15.82%  (p=0.000 n=7+10)
Hash1024                1.13µs ± 0%    0.89µs ± 0%  -20.58%  (p=0.000 n=10+9)
Hash65536               68.9µs ± 0%    54.4µs ± 0%  -21.04%  (p=0.000 n=10+7)
HashStringSpeed          103ns ± 2%      93ns ± 3%  -10.24%  (p=0.000 n=9+10)
HashBytesSpeed           191ns ± 2%     180ns ± 1%   -5.40%  (p=0.000 n=10+8)
HashInt32Speed          59.0ns ± 2%    59.1ns ± 1%     ~     (p=0.655 n=9+8)
HashInt64Speed          72.7ns ± 3%    66.1ns ± 5%   -9.04%  (p=0.000 n=10+10)
HashStringArraySpeed     270ns ± 1%     222ns ± 2%  -17.91%  (p=0.000 n=10+10)
FastrandHashiter         108ns ± 0%     109ns ± 1%   +0.96%  (p=0.002 n=10+10)

name                  old speed      new speed      delta
Hash5                 73.8MB/s ± 0%  76.4MB/s ± 0%   +3.58%  (p=0.000 n=7+10)
Hash16                 194MB/s ± 0%   216MB/s ± 0%  +11.25%  (p=0.000 n=10+8)
Hash64                 530MB/s ± 0%   630MB/s ± 0%  +18.74%  (p=0.000 n=8+10)
Hash1024               910MB/s ± 0%  1145MB/s ± 0%  +25.88%  (p=0.000 n=10+9)
Hash65536              951MB/s ± 0%  1204MB/s ± 0%  +26.64%  (p=0.000 n=10+7)

Update #43130

Change-Id: Id00c54b116a2411fcf675e95896fffb85f0e25b6
Reviewed-on: https://go-review.googlesource.com/c/go/+/280372
Trust: Meng Zhuo <mzh@golangcn.org>
Run-TryBot: Meng Zhuo <mzh@golangcn.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
@mengzhuo
Copy link
Contributor

mengzhuo commented Jan 6, 2022

@bcmills Unfortunately even after 483533d is merged It still failed.

Good news is it failed at the rate 1 failure per month and 256 collisions instead of 512.

I'll check whether my implement same with original wyhash :-(

@mengzhuo
Copy link
Contributor

mengzhuo commented Jan 7, 2022

Updates: I've run 32 bits SmhasherWindow for 4770 times (~24h) without failures...

@bcmills
Copy link
Contributor Author

bcmills commented Jan 20, 2022

Indeed, still failing about once a month (and that's including the lower rate of test runs due to the code freeze). The more recent failures are back to 512 collisions each.

greplogs --dashboard -md -l -e '(?ms)FAIL: TestSmhasherWindowed.*unexpected number of collisions' --since=2021-09-24

2022-01-19T20:54:49-1efc581/linux-386-longtest
2021-12-19T20:16:45-87b2a54/linux-386-longtest
2021-11-15T21:22:17-e08aae2/linux-386-longtest
2021-10-23T12:44:47-b0f7eb6/linux-386-longtest

@mengzhuo
Copy link
Contributor

Indeed, still failing about once a month (and that's including the lower rate of test runs due to the code freeze). The more recent failures are back to 512 collisions each.

Recently a new version of Smhasher is released with the "bad seed" testcase, which wyhash is failed ( caused by multiply 0).
Maybe we can skip "bad seed" on 32 bit platforms and regain a "good seed" in the alginit procedure.

@bcmills
Copy link
Contributor Author

bcmills commented Feb 3, 2022

@mengzhuo, at the very least would it be useful to have the test log the seed(s) that produced the unexpected collision rate? That would help to verify that this test failure is due to the expected bad seeds.

@bcmills bcmills modified the milestones: Backlog, Go1.19 Feb 3, 2022
@bcmills
Copy link
Contributor Author

bcmills commented Feb 3, 2022

Marking as release-blocker for Go 1.19: linux/386 is a first-class port, but it's probably too late in the 1.18 cycle to fully fix this without introducing too much risk.

@mengzhuo
Copy link
Contributor

mengzhuo commented Feb 8, 2022

My mistake: 386 using AES asm part do the hashing, I will take a look.

------Original comment-------------------------------
@randall77 Keith,

This issue can be reproduced within 6000 times run on 64 bit keys SmhasherWindow test.

After I deleted these 4 lines, no more flaky (12K runs):

go/src/runtime/alg.go

Lines 325 to 328 in 69e1711

hashkey[0] |= 1 // make sure these numbers are odd
hashkey[1] |= 1
hashkey[2] |= 1
hashkey[3] |= 1

I think the OR itself cancel the randomness of last bit (always 1)

But when I tried XOR

hashkey[0] ^= 1

there are flaky failures.

@gopherbot
Copy link

Change https://go.dev/cl/384075 mentions this issue: runtime: AES maphash scramble 3 times on 386

@mengzhuo
Copy link
Contributor

Unfortunatly, it still flaky within 1/40000 and the key and seed seems random enough to me.

    hash_test.go:119: aeskey 7c3fcff1e80c0a0a9a13ce82b08c0ed80192f93e919b702a728c7bee2b6be56b86146ea040eee7a85d17efc794474ef83465567cd588569ca13945feaec28df3
    hash_test.go:120: unexpected number of collisions: got=256 mean=0.499992 stddev=0.707101 threshold=156.565200

According to the smhasher, some hashing algorithm based on AES failed the test.
Is it the AES itself is not qualified as a hashing algorithm?

@bcmills
Copy link
Contributor Author

bcmills commented Mar 30, 2022

greplogs --dashboard -md -l -e 'FAIL: TestSmhasherWindowed' --since=2022-02-08

2022-03-21T18:58:42-79103fa/linux-386-longtest

@bcmills bcmills reopened this Mar 30, 2022
@bcmills
Copy link
Contributor Author

bcmills commented Apr 12, 2022

greplogs --dashboard -md -l -e 'FAIL: TestSmhasherWindowed' --since=2022-03-30

2022-04-11T22:03:44-0605bf6/linux-386-longtest

@bcmills
Copy link
Contributor Author

bcmills commented Apr 22, 2022

greplogs --dashboard -md -l -e 'FAIL: TestSmhasherWindowed' --since=2022-04-12

2022-04-21T23:45:36-67d6be1/linux-386-longtest

@bcmills
Copy link
Contributor Author

bcmills commented Apr 22, 2022

@golang/runtime: this is a recurring failure on linux/386, which is a first class port. Could someone take a look at this and either fix it (if feasible) or add a test skip (if this turns out not to be a priority)?

@gopherbot
Copy link

Change https://go.dev/cl/402456 mentions this issue: runtime: disable windowed Smhasher test on 32-bit systems

@golang golang locked and limited conversation to collaborators Apr 27, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

No branches or pull requests

3 participants