Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hash/maphash: first call of the Reset method doesn't reset internal state #37315

Closed
ALTree opened this issue Feb 20, 2020 · 3 comments
Closed
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Milestone

Comments

@ALTree
Copy link
Member

ALTree commented Feb 20, 2020

The new hash/maphash package has a Reset method documented as:

Reset discards all bytes added to h. (The seed remains the same.)

Moreover, the Sum64 doc says:

Sum64 returns h's current 64-bit value, which depends on h's seed and the sequence of bytes added to h since the last call to Reset

So it appears it would be reasonable to expect this to print two times the same value:

package main

import (
	"fmt"
	"hash/maphash"
)

func main() {
	h := new(maphash.Hash)

	h.WriteString("hello")
	fmt.Printf("%#x\n", h.Sum64())

	h.Reset()

	h.WriteString("hello")
	fmt.Printf("%#x\n", h.Sum64())
}

But it doesn't:

$ gotip run hash.go
0xdf1aabcc36e56272
0x5a04b54a5a2174d6

This is because even if the seed remains the same and "all the previously added bytes are discarded", the internal state of the Hash is maintained, even after the h.Reset() call.

Is this the desired behaviour? If it is, the documentation should probably be amended to make it clear what to expect after a call to the Reset method.

cc @alandonovan @randall77

@ALTree ALTree added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Feb 20, 2020
@ALTree ALTree added this to the Go1.14 milestone Feb 20, 2020
@ALTree ALTree added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. and removed NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels Feb 20, 2020
@vovapi
Copy link
Contributor

vovapi commented Feb 20, 2020

There's seems to be a bug in a Hash because if you reset and hash again you'll get the same result as the second one:

package main

import (
    "fmt"
    "hash/maphash"
)

func main() {
    h := new(maphash.Hash)

    h.WriteString("hello")
    fmt.Printf("%#x\n", h.Sum64()) // 0x48e0352e23362dd2

    h.Reset()

    h.WriteString("hello")
    fmt.Printf("%#x\n", h.Sum64()) // 0xf9395b015acc3ffb

    h.Reset()

    h.WriteString("hello")
    fmt.Printf("%#x\n", h.Sum64()) // 0xf9395b015acc3ffb
}

The first hash sum seems to be ignoring all the data that was put through it and returning sum as it has an empty state:

package main

import (
    "fmt"
    "hash/maphash"
)

func main() {
    h1 := new(maphash.Hash)

    h1.WriteString("hello")
    sum1 := h1.Sum64() // first call on initSeed

    h1.Reset()

    fmt.Println(sum1 == h1.Sum64()) // true

    h2 := new(maphash.Hash)
    h2.SetSeed(h1.Seed())
    sum2 := h2.Sum64()

    fmt.Println(sum1 == sum2) // true
}

I think the issue is that the first Sum64 calls initSeed, which initializes the random seed using SetSeed. SetSeed discards previously written data.

vovapi added a commit to vovapi/go that referenced this issue Feb 20, 2020
Hash initializes seed on the first usage of seed or state with initSeed.
initSeed uses SetSeed which discards accumulated data.
This causes hash to return different sums for the same data in the first use
and after reset.
This CL fixes this issue by separating the seed set from data discard.

Fixes golang#37315
@gopherbot
Copy link

Change https://golang.org/cl/220259 mentions this issue: hash/maphash: don't discard data on random seed init

@ALTree ALTree changed the title hash/maphash: confusing Reset documentation hash/maphash: first call of the Reset method doesn't reset internal state Feb 21, 2020
@ALTree ALTree added NeedsFix The path to resolution is known, but the work has not been done. and removed Documentation NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Feb 21, 2020
@gopherbot
Copy link

Change https://golang.org/cl/220617 mentions this issue: [release-branch.go1.14] hash/maphash: don't discard data on random seed init

gopherbot pushed a commit that referenced this issue Feb 24, 2020
…ed init

Hash initializes seed on the first usage of seed or state with initSeed.
initSeed uses SetSeed which discards accumulated data.
This causes hash to return different sums for the same data in the first use
and after reset.
This CL fixes this issue by separating the seed set from data discard.

Updates #37315

Change-Id: Ic7020702c2ce822eb700af462e37efab12f72054
GitHub-Last-Rev: 48b2f96
GitHub-Pull-Request: #37328
Reviewed-on: https://go-review.googlesource.com/c/go/+/220259
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
(cherry picked from commit 638df87)
Reviewed-on: https://go-review.googlesource.com/c/go/+/220617
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
@golang golang locked and limited conversation to collaborators Feb 21, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants