Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: when LookupIP is timed out, all duplicate lookups wait #10117

Closed
fraenkel opened this issue Mar 8, 2015 · 8 comments
Closed

net: when LookupIP is timed out, all duplicate lookups wait #10117

fraenkel opened this issue Mar 8, 2015 · 8 comments

Comments

@fraenkel
Copy link
Contributor

fraenkel commented Mar 8, 2015

It seems that if you have many goroutines that do lookups, they are tied to one routine that does the actual DNS resolve. If it succeeds, the answer is provided to everyone and all is happy.

When it times out, rather than releasing everyone, all the goroutines must also wait their time out too. It would make more sense to also release the other routines with the errTimeout immediately.

@minux
Copy link
Member

minux commented Mar 8, 2015 via email

@fraenkel
Copy link
Contributor Author

fraenkel commented Mar 8, 2015

It won't matter. The code says to forget that host lookup, so the current ones are stranded. Only new calls after the "Forget" will cause a new Lookup.

@fraenkel
Copy link
Contributor Author

fraenkel commented Mar 8, 2015

@minux I see what you are saying....
What is odd though is that my Stacks are showing a bunch of routines waiting for an answer from a lookup but no routine is doing the lookup. While it could be a timing issue, I don't think I was that lucky.

I also noticed that the c.dups in the return is racy since it isn't guarded by a lock.

@adg adg changed the title When LookupIP is timed out, all duplicate lookups wait net: when LookupIP is timed out, all duplicate lookups wait Mar 8, 2015
@fraenkel
Copy link
Contributor Author

fraenkel commented Mar 8, 2015

Here is the stack showing all the routines stuck in a lookup with nothing actually doing a lookup.
The dial timeout is 1s.

https://gist.github.com/fraenkel/5222c65dac27f5bb3ebc

@ianlancetaylor
Copy link
Contributor

Which version of Go are you using? The file names suggest you are using a version before Go 1.4.

I expect that this was fixed in Go 1.4 by the fix for issue #8602.

@fraenkel
Copy link
Contributor Author

fraenkel commented Mar 9, 2015

This is Go 1.3. You are correct that it looks like with the Forget in there, this problem should go away.

But the c.dups on

return c.val, c.err, c.dups > 0
seems racy. Are we relying on all the locking done doCall to make it have a current value?

@fraenkel
Copy link
Contributor Author

fraenkel commented Mar 9, 2015

Will close this out since its fixed in Go 1.4 and beyond.

@fraenkel fraenkel closed this as completed Mar 9, 2015
@ianlancetaylor
Copy link
Contributor

The reference to c.dups on line 62 is safe because it happens after doCall returns. After doCall returns the entry has been removed from the g.m map, which means that it will never again be modified. All modifications to the entry happen with the lock held, so the lock in doCall provides the necessary happens-before relation.

@mikioh mikioh added this to the Go1.4.1 milestone Mar 10, 2015
@golang golang locked and limited conversation to collaborators Jun 25, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants