Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sync: Map.Store() crash with fatal error: sync: unlock of unlocked mutex #26480

Closed
amezghal opened this issue Jul 19, 2018 · 15 comments
Closed
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

Comments

@amezghal
Copy link

amezghal commented Jul 19, 2018

Hi,

For some reason our service crashes with this unrecoverable error:

 fatal error: sync: unlock of unlocked mutex
goroutine 866557 [running]: 
runtime.throw(0xc0236c, 0x1e)
	/opt/go1.10.3.linux-amd64/src/runtime/panic.go:616 +0x81 fp=0xc44a9d65c0 sp=0xc44a9d65a0 pc=0x42b191 
sync.throw(0xc0236c, 0x1e)
	/opt/go1.10.3.linux-amd64/src/runtime/panic.go:605 +0x35 fp=0xc44a9d65e0 sp=0xc44a9d65c0 pc=0x42b105 
sync.(*Mutex).Unlock(0xc42522c050)
	/opt/go1.10.3.linux-amd64/src/sync/mutex.go:184 +0xc2 fp=0xc44a9d6608 sp=0xc44a9d65e0 pc=0x46c652 
sync.(*Map).Store(0xc42522c050, 0xabd0c0, 0xc4374e15a0, 0xbabe00, 0xc4305738c0) 
	/opt/go1.10.3.linux-amd64/src/sync/map.go:162

we are using sync.Map as intended (filling the map from multiple go routines).

it's very random, and i can't force the crash, but it happens many times per day.

Thanks

@randall77
Copy link
Contributor

I can't see how that line could be unlocking an unlocked map. It is dominated by the lock call.

Any chance you're copying the Map? That might explain your crash.

A Map must not be copied after first use.

Is there any way you can share the code so we can reproduce this crash?

@amezghal
Copy link
Author

amezghal commented Jul 19, 2018

Hi, @randall77

i am aware of the copying map issue, and i confirm we are not copying the map.

i will provide a mock code

package shared

import "sync"

type Stats struct {
	items sync.Map
}

type item struct {
	Id    string  `json:"id"`
	Value float64 `json:"value"`
}

func NewStats() *Stats {
	instance := Stats{}
	instance.items = sync.Map{}
	return &Stats{}
}

func (ss *Stats) Add(key string, value item) {
	ss.items.Store(key, value)
}

func (ss *Stats) Flush() {
	ss.items = sync.Map{}
}

what we do is during start up, we create an instance and make it available globally

global.StatsManager := NewStats()

and we have a ticker that is called every 10minutes (we collect data and adapat/store in elasticSearch)
Once done, we call the Flush Method the re initiate.

we do not copy the instance,

@amezghal
Copy link
Author

Unless the Flush() method is the issue ?

what are doing is trying to restart over once data is on elastic search,

@randall77
Copy link
Contributor

Yes, I think Flush is the issue. If there are outstanding map operations currently in progress on global.StatsManager.items, you can't do global.StatsManager.items = sync.Map{}. Although technically that doesn't violate the rule "don't copy a map before first use" because the map you're copying hasn't been used yet. But you're copying onto an in-use Map, which is also bad. It may be worth adding something to the doc, although it seems pretty obvious that's unsafe if you think about it long enough. @bcmills

@randall77
Copy link
Contributor

Speaking of which, the race detector should have caught this bug. Have you tried running with the race detector on?

@agnivade agnivade changed the title sync.Map.Store crash with fatal error: sync: unlock of unlocked mutex sync: Map.Store() crash with fatal error: sync: unlock of unlocked mutex Jul 19, 2018
@agnivade agnivade added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jul 19, 2018
@bcmills
Copy link
Contributor

bcmills commented Jul 19, 2018

It may be worth adding something to the doc, although it seems pretty obvious that's unsafe if you think about it long enough.

We have a test that sync.Map triggers the vet copylock check. If go vet didn't catch the assignment, that's arguably a bug in vet.

func BadFunc2(sync.Map) {} // ERROR "BadFunc2 passes lock by value: sync.Map contains sync.Mutex"

@amezghal
Copy link
Author

When i run a go vet nothing is reported,

And this is very random, and very hard to reproduce, but in production it happens many times a day.

i will try to improve the Flush() method and see how this can help.

@randall77
Copy link
Contributor

@bcmills I think the vet copylock check isn't triggered when the RHS is a composite literal. Presumably because copying a zero lock is allowed.
It's the overwriting that's the problem here, not the copying itself.

@agnivade
Copy link
Contributor

@amezghal - Do you get any errors under the race detector ?

@amezghal
Copy link
Author

Once i have updated the Refresh method i no longer get this error,

@randall77
Copy link
Contributor

Ok, I'll close this issue then.
(Is Refresh the same as Flush?)

@amezghal
Copy link
Author

amezghal commented Jul 26, 2018

Yes Flush
re init the sync.Map while it was being excessively updated without synchronisation from other go-routines was the issue.

@davecheney
Copy link
Contributor

davecheney commented Jul 26, 2018 via email

@amezghal
Copy link
Author

@davecheney
I restored the old code and run it with the -race,

yes the racedetector is reporting the data race,

thks

@davecheney
Copy link
Contributor

davecheney commented Jul 26, 2018 via email

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

6 participants