
proposal: sync: Allow (*sync.Map).LoadOrStore uses closure to create new value like sync.Pool #44159

Closed
ethe opened this issue Feb 8, 2021 · 10 comments

Comments

@ethe

ethe commented Feb 8, 2021

Currently, LoadOrStore only accepts a value directly. However, consider this case:

var smap sync.Map

// Creating the value requires a factory function,
// possibly one with heavy initialization overhead.
// It always runs, even if the map below already holds a value for the key.
bar := NewValue()
smap.LoadOrStore("foo", bar)

If I want to avoid this initialization every time, I have to either give up using LoadOrStore or combine it with sync.Pool, implementing lazy initialization for the case where no existing value is loaded:

var smap sync.Map

bar := valuePool.Get().(*Bar)
actual, loaded := smap.LoadOrStore("foo", bar)
if !loaded {
    bar.Init()
} else {
    valuePool.Put(bar)
    bar = actual.(*Bar)
}

If there were a way to pass a New closure to LoadOrStore, perhaps something like:

func (m *Map) LoadOrStoreFrom(key interface{}, newValue func() interface{}) (actual interface{}, loaded bool)

it could reduce a lot of overhead in this case, and I think this case is a common one.
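
For illustration, a call site under the proposed signature might look like the sketch below (LoadOrStoreFrom does not exist today; smap, Bar, and NewValue are the names from the example above):

var smap sync.Map

v, _ := smap.LoadOrStoreFrom("foo", func() interface{} {
    return NewValue() // only called when "foo" has no value yet
})
bar := v.(*Bar)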

@ethe ethe changed the title Allow (*sync.Map).LoadOrStore uses closure to create new value like sync.Pool Proposal: sync: Allow (*sync.Map).LoadOrStore uses closure to create new value like sync.Pool Feb 8, 2021
@gopherbot gopherbot added this to the Proposal milestone Feb 8, 2021
@ethe ethe changed the title Proposal: sync: Allow (*sync.Map).LoadOrStore uses closure to create new value like sync.Pool proposal: sync: Allow (*sync.Map).LoadOrStore uses closure to create new value like sync.Pool Feb 8, 2021
@bcmills
Contributor

bcmills commented Feb 8, 2021

See previously #20884 (particularly #20884 (comment)) and #21199.

I think the same concerns apply here — in particular, the newValue callback makes the map prone to lock-ordering bugs, and may prevent more important optimizations in the sync.Map implementation.

@bcmills
Contributor

bcmills commented Feb 8, 2021

sync.Map is optimized for stable, long-lived keys, whereas sync.Pool helps with repeated, short-lived allocations. What leads you to believe that combining the two would be a significant benefit?

Generally the pattern we use for allocation-sensitive sync.Map is to start with a Load for the fast-path and only fall back to LoadOrStore for the slow path (as in encoding/xml.getTypeInfo and (*reflect.rtype).ptrTo).

If the initialization is expensive enough, you can guard it with a sync.Mutex protecting stores into the whole map (since stores should be rare compared to loads), or use per-value synchronization such as a sync.Once or a sync.WaitGroup embedded in or closed over by the value (as in encoding/json.typeEncoder).
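
A minimal sketch of the per-value approach, reusing the Bar and NewValue names from the example above (get and entry are invented names; encoding/json.typeEncoder does something similar with a sync.WaitGroup instead of a sync.Once):

type entry struct {
    once sync.Once
    val  *Bar
}

func get(m *sync.Map, key string) *Bar {
    // Store only a cheap placeholder; the expensive work happens below.
    e, _ := m.LoadOrStore(key, new(entry))
    en := e.(*entry)
    // Exactly one caller runs the expensive initialization;
    // concurrent callers block in Do until it has finished.
    en.once.Do(func() {
        en.val = NewValue()
    })
    return en.val
}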

Which of those strategies is best depends on several properties of the access pattern — the ratio of loads to stores, tolerance for redundant work, cost of initialization, etc. I don't think we should bake assumptions about those properties into sync.Map itself.

@ethe
Author

ethe commented Feb 9, 2021

My point is that sync.Map could have behavior similar to sync.Pool: get from the map, or create from a closure. I am not proposing to "combine" sync.Map and sync.Pool; that was just a case showing that if I want to avoid the initialization overhead, I have to create the object before calling LoadOrStore even if it would not be used.

Yes, I agree it could make the map prone to lock-ordering bugs. Currently, LoadOrStore is much more like a CAS operation; users who are going to write lock-ordering bugs would still write them outside of LoadOrStore. The only difference is whether LoadOrStore shows up in the bug's stack trace.

And I agree that a fast-path Load with a fallback to LoadOrStore is a possible way to avoid the cost of initialization, if the Go team considers it easy enough to use and recommends it as the way to use sync.Map. I just think this code is verbose to write in the general case, and it is not free: you have to perform the lookup twice whenever the fallback is hit.

@hugbubby

hugbubby commented Feb 11, 2021

@bcmills

sync.Map is optimized for stable, long-lived keys, whereas sync.Pool helps with repeated, short-lived allocations. What leads you to believe that combining the two would be a significant benefit?

He's not trying to combine the two. He's just asking for the ability to lazily evaluate the corresponding value for a given key. If Go were as good a language as Haskell, this would be the default implementation of sync.Map's LoadOrStore method and there wouldn't even be a need for a separate function.

Generally the pattern we use for allocation-sensitive sync.Map is to start with a Load for the fast-path and only fall back to LoadOrStore for the slow path (as in encoding/xml.getTypeInfo and (*reflect.rtype).ptrTo).

Unfortunately this is more verbose than the proposal, and also slower. In extreme cases, or where constructing the value takes a very long time rather than just a while, you might force more than one thread to create a value for the same key, only for all but one to be thrown away on the later LoadOrStore call.

More importantly, it's a pattern that many users of sync.Map might not come up with, because concurrency is hard. Libraries should feel comfortable designing for 110-IQ programmers when doing so doesn't sacrifice the code of 130-IQ programmers. The Load -> LoadOrStore trick slipped my mind until I read the docs further, because I didn't have a good model of what sync.Map provided me in terms of thread safety.

If the initialization is expensive enough, you can guard it with a sync.Mutex

If we're bringing out sync.Mutex, why not have users just design their own sync.Map implementation? Most of the utility of sync.Map is that it keeps me from having to use sync.Mutex!

Which of those strategies is best depends on several properties of the access pattern — the ratio of loads to stores, tolerance for redundant work, cost of initialization, etc. I don't think we should bake assumptions about those properties into sync.Map itself.

You don't have to assume anything about the current users of sync.Map; you're just extending its viability for providing thread-safe key/value access to situations where allocating the value is resource intensive. I have run into this case twice and came here to make an issue, but it appears I've been beaten to it.

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Feb 17, 2021
@rsc
Contributor

rsc commented Feb 17, 2021

LoadOrStore may run a few times while racing against other goroutines before a load or store succeeds.
So it could not possibly guarantee that the closure would be called if and only if the result got stored.
Given that, why not just call Load, then if it fails call the closure, and then call LoadOrStore?

val, ok := m.Load(key)
if !ok {
    val = newValue()
    val, _ = m.LoadOrStore(key, val)
}
// Now val is either the loaded value or the one returned from newValue,
// but even if newValue was called, val may not be its result.

That's the best any implementation is going to be able to do, and you can do it in a few lines of code in your own program.
And seeing that code will make the semantics clearer than hidden in a function in the sync.Map API.

@rsc
Contributor

rsc commented Feb 17, 2021

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals (old) Feb 17, 2021
@hugbubby

hugbubby commented Feb 18, 2021

@rsc

That's the best any implementation is going to be able to do.

I don't think this is actually the case. Why can't the closure-based LoadOrStore call just internally hold a lock until it's done creating the object, so that any further LoadOrStore calls block until it's finished?

Edit: Here's a loose implementation:

// Read-lock key here.
if val, ok := m[key]; ok {
    return val
}
// Upgrade to a write lock on key here.
if val, ok := m[key]; ok { // check _again_ under the write lock
    return val
}
val := newValue()
m[key] = val
return val

@bcmills
Contributor

bcmills commented Feb 18, 2021

Why can't the closure-based LoadOrStore call just internally hold a lock until it's done creating the object, so that any further LoadOrStore calls block until the first one is done running the closure?

That seems like it would require either per-key locks or a lock shared among many keys.

Since the purpose of sync.Map is to avoid (both cache and lock) contention, a shared lock would be extremely counterproductive.

That leaves per-key locks, which do not seem worth the increase in memory footprint.
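
To make the footprint point concrete: a per-key scheme would have to carry a lock alongside every stored value, along the lines of this hypothetical sketch (lockedEntry is an invented name, not part of sync.Map):

// Every entry pays for its own mutex, adding memory overhead per key
// even when that key's initialization is trivial.
type lockedEntry struct {
    mu  sync.Mutex
    val interface{}
}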

@rsc
Contributor

rsc commented Feb 24, 2021

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

@rsc rsc moved this from Active to Likely Decline in Proposals (old) Feb 24, 2021
@rsc
Contributor

rsc commented Mar 10, 2021

No change in consensus, so declined.
— rsc for the proposal review group
