Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: new channel primitive, drain(c) #26343

Closed
jimmyfrasche opened this issue Jul 12, 2018 · 9 comments
Closed

proposal: new channel primitive, drain(c) #26343

jimmyfrasche opened this issue Jul 12, 2018 · 9 comments
Labels
FrozenDueToAge LanguageChange Proposal v2 A language change or incompatible library change
Milestone

Comments

@jimmyfrasche
Copy link
Member

Forked from #26282.

drain(c) would be roughly equivalent to the current idiom of

go func() {
  for range c {
  }
}()

Unlike the above code, drain(c) does not create a new goroutine. Sends to c are discarded by the scheduler.

After a call to drain(c), another attempt to receive from c panics.

This is similar to close(c) but for readers. It would only be safe to use drain when there was a single reader the same way it's only safe to close a chan when there's a single writer.

Open questions:

Would select need to prioritize non-drained channels?

Should it be possible for a sender to detect when a channel has been drained? Doing so would allow a sender to shutdown, but as @neild pointed out "Channels currently pass data in only one direction. Being able to detect a drained channel would make that bidirectional."

@jimmyfrasche jimmyfrasche added LanguageChange v2 A language change or incompatible library change Proposal labels Jul 12, 2018
@gopherbot gopherbot added this to the Proposal milestone Jul 12, 2018
@uluyol
Copy link
Contributor

uluyol commented Jul 12, 2018

You haven't done anything to convince anyone that this is a good thing to have. All you have done is describe the mechanics of drain. Please describe some use cases and explain why having drain enables new functionality or why not having it leads to difficult code today.

If this was already written in the other thread, please copy into your above post.

@jimmyfrasche
Copy link
Member Author

It arose from a discussion about primitives for channel operations that avoided creating goroutines just to shuffle data around. The splice and clone operations from the original issue are more general and more generally useful. drain would allow "undoing" a clone that you are no longer interested in efficiently.

Its use is a single reader that is no longer interested in what the writer has to say:

for v := range c {
  if v == sentinel {
   drain(c)
   break
  }
  // body
}

Without clone that's less compelling, just more concise and efficient than creating a goroutine whose only purpose is to discard.

If it were more like the reverse of close and the writer could test whether the reader is still responding, it would have greater use on its own. In the example above the goroutine writing to c would be able to tell that c is being ignored and shutdown, without having to close a separate shutdown channel. That could simplify a lot of single-reader code. It would mean making the send statement into an expression returning a boolean, which is a greater ask than a new builtin.

@DeedleFake
Copy link

Can't this be handled automatically by the garbage collector, especially as a corollary to clone? If a reader's no longer interested, can't they just drop the reference?

@jimmyfrasche
Copy link
Member Author

@DeedleFake
I suppose that is possible, but that would make cloned channels behave very differently from regular channels.

Let's say the original channel is c and the cloned channel is cc. Under those rules dropping a reference to cc has no effect on the goroutine writing to c but dropping a reference to c in the reader does. The reader would have to know that the chan it is reading from a clone. That's a fragile assumption. It could also cause unwanted additional blocking in the scheduler until the GC runs depending on how much buffer is given is to the cloned channel.

@DeedleFake
Copy link

DeedleFake commented Jul 12, 2018

Hmmm... So, let's say I do the following:

c := make(chan int)
cc := clone(c)

There'll presumably be three places these are then going to be used:

  • A writer sending data to c.
  • Something reading from c.
  • Something reading from cc.

The issue you're worried about is that, because of the buffering, if the reading reference to c is dropped, the garbage collector can't know to free the buffer because there's still a reference to c from the writing end, right? Yeah, that's definitely a problem. I can think of three solutions that would let the garbage collector handle this:

  • Remove the buffering to cloned channels. This would fix it pretty easily since there would, obviously, be nothing to free. The problem is that this would probably require every reader and the writer to block until they're all reading and writing simultaneously, which would severely limit the usefulness of clone.
  • Overhaul channel creation mechanics so that creating a new channel creates a write-only channel, and clone then creates readers of that channel. In this case, it might make sense to add infinitely buffered channels and then have this apply only to them.
  • Cloning a channel invalidates the original reference. Calling clone() on a writable channel returns a write-only channel and a read-only channel. Calling clone() on a read-only channel returns two read-only channels. Or, alternatively, clone() could take two arguments, so it would be something like clone(<-chan T, <-chan T | chan<- T). If the second argument is write-only, then the first argument becomes a read-only cloned output of it. If the second's read-only, the first becomes a clone of the write-end of the second and the first, if it's not a clone, becomes a clone as well internally. Passing a two-way channel as the second argument is not allowed. Edit: Or, more simply, only allow one-way channels to be passed to clone. The same mechanics as before apply, but it returns a new channel instead of requiring you to create one. In other words, in cc := clone(c), c must be either read-only or write-only. If it's write-only, do a normal clone. If it's read-only, change the reference internally into a cloned channel so that it's the same as what it returns.

I just don't like the idea of requiring a manual drain() in a adamantly garbage collected language. I know that a lot of things often have a Close() or a Release() method or something similar for manually freeing stuff, but it feels really awkward if a built-in mechanism needs it.

@jimmyfrasche
Copy link
Member Author

Reading from c and reading from cc should be identical, they're interchangeable for receives. The only part that's special is you can only send on c.

If you leave it up to gc that means that any reader can be dropped without changing behavior as long as at least one is still being read from. If there are 99 clones of a channel that means you could safely let any 99 of the 100 readers go, but if you let that random last reader go suddenly things behave differently. It's very fragile.

@DeedleFake
Copy link

Reading from c and reading from cc should be identical, they're interchangeable for receives. The only part that's special is you can only send on c.

Right, but the problem would be that the garbage collector can't tell how the original two-way reference is used. If I have something running that writes to it, there's no way to be sure that some other code path won't start reading from it, too. Because of that, that original reference has to be buffered the same way that the clones are, but, unlike the clones, if that reference is dropped from the code that's reading from it, the buffer can't be freed because there's still a reference to it from the writing end, meaning that every write to it will just grow an unfreeable buffer, resulting in a memory leak every single time clone is used.

For example:

c := make(chan int)
go func() {
	for i := 0; true; i++ {
		c <- i
	}
}()

go func() {
	for i := range(c) {
		fmt.Printf("Original: %v\n", i)

		if i > 5 {
			break
		}
	}

	// At this point, no more code is reading from the original c, but
	// there's no way for the garbage collector to know that because the
	// reference is still in scope elsewhere, so the buffer has to
	// continuously grow in case something decides to read from it.
}()

go func() {
	cc := clone(c)

	for i := range(cc) {
		fmt.Printf("Clone: %v\n", i)

		if i > 5 {
			break
		}
	}

	// At this point, the only reference to cc goes out of scope, so the
	// garbage collector can clean up its buffer.
}()

@bcmills
Copy link
Contributor

bcmills commented Aug 21, 2018

Unlike the above code, drain(c) does not create a new goroutine. Sends to c are discarded by the scheduler.

After a call to drain(c), another attempt to receive from c panics.

I would not name that operator “drain”. If I drain a bathtub, I can refill it again.

But, honestly, I'm worried that this primitive would encourage programs to perform a lot of needless work. With a cancellation channel, the sender goroutine stops being able to send, and knows to avoid computing any future results. With a “drained” channel, the sender gets no such signal and potentially wastes a lot of CPU computing results that nobody wants.

At the very least, I would want to see some more compelling examples of where this would be useful.

@ianlancetaylor
Copy link
Contributor

The purpose of this function seems to be to permit a goroutine to write a series of values to a channel without having to worry whether any other goroutine is still reading from it. But you still need some mechanism to stop the goroutine. Otherwise, calling drain would cause the goroutine to run forever, writing values that are discarded. That would be bad.

Since there needs to be some way to stop the goroutine anyhow, when you stop the goroutine, the channel will garbage collected. The only purpose for drain seems to be for a goroutine that writes a fixed number of values only some of which will be read. And even in that case the goroutine could be stopped. This seems too special purpose for a language level addition. It's easy enough to write the drain function for any specific channel, and if we are able to add generics to the language then it will be easy enough to write the drain function, once, for any channel type.

@golang golang locked and limited conversation to collaborators Oct 2, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge LanguageChange Proposal v2 A language change or incompatible library change
Projects
None yet
Development

No branches or pull requests

6 participants