x/net/http2: one slow stream may block all other streams in the same connection #16510
Go is following the spec. There is both connection-level and stream-level flow control, and Go is respecting both. Your proposed patch just ignores connection-level flow control entirely. That isn't correct. When I asked you to file the bug, the question was about whether our default values for stream- and connection-level flow should be different. I forget what they are, and I don't know offhand what other servers are using by default. It's possible we need to make our connection-level ones be some factor larger than our stream-level ones, so it would take N blocked streams to block the whole connection. But we're not going to ignore connection-level flow control and infinitely buffer all blocked streams' data in memory.
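As a worked illustration of that "factor of N" idea (65,535 bytes is RFC 7540's default initial window; the 4x multiplier is purely an assumption for the example, not anything the package does):

```go
package main

import "fmt"

func main() {
	const streamWindow = 65535          // RFC 7540 default SETTINGS_INITIAL_WINDOW_SIZE
	const connWindow = 4 * streamWindow // assumed: connection window 4x the stream window

	// Each stalled stream can pin at most streamWindow unread bytes, so
	// it takes connWindow/streamWindow fully blocked streams to exhaust
	// the connection-level window and block everyone else.
	fmt.Println(connWindow / streamWindow) // 4
}
```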
I think that is the correct way to respect both connection-level and stream-level flow control. The values for streams and the connection can be the same or different, but I think there is an implementation logic bug. When data is received by the connection, the connection-level window should be reopened for other streams; it should not wait until the data is read by the handler. Whether or not the handler reads the data is not a connection-level concern; it is a stream-level issue. Please comment on what potential issues you can think of with that patch.

Regarding the default values, the connection-level window size should be based on the bandwidth-delay product. If it is too small, it will underutilize the bandwidth (by idling while waiting for the window to open after an RTT). If it is too big, data will be buffered by TCP, which may delay or block control frames or other streams. Ideally we would start with a reasonable default (64K?) and adjust it based on the measured RTT and estimated bandwidth; I am not sure we want to be that optimized. Otherwise we can just set a reasonable default. It may be helpful to expose a function for the user to control it, with the RTT and estimated bandwidth available to the user. The stream-level default should be the same as the connection level (equal to the bandwidth-delay product), if memory is not an issue. However, if we do want to limit server memory, we could limit the stream-level window size and the maximum number of streams, at the cost of each stream not fully utilizing the whole bandwidth.

I think this is not an issue of the server's default stream and connection flow-control values; it is a flow-control bug, so I changed the title back.
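A sketch of the bandwidth-delay-product sizing suggested above; the 100 Mbit/s bandwidth and 50 ms RTT are made-up example inputs, not measured values:

```go
package main

import (
	"fmt"
	"time"
)

// bdpWindow returns a flow-control window sized to the bandwidth-delay
// product: the number of bytes that must be in flight to keep the link
// busy for one round trip.
func bdpWindow(bandwidthBitsPerSec int64, rtt time.Duration) int64 {
	return bandwidthBitsPerSec / 8 * int64(rtt) / int64(time.Second)
}

func main() {
	// Example: a 100 Mbit/s link with a 50 ms RTT.
	w := bdpWindow(100_000_000, 50*time.Millisecond)
	fmt.Println(w) // 625000 bytes, i.e. ~610 KiB
}
```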
The problem with immediately returning all conn-level flow control is that it then means you can buffer up to SETTINGS_MAX_CONCURRENT_STREAMS * SETTINGS_INITIAL_WINDOW_SIZE bytes, rather than the value of the connection-level flow control. In your view, what is the purpose of the connection-level flow control if you don't think there should be any pushback on it? As implemented, if handlers don't read data, it is buffered up to the flow-control limits and WINDOW_UPDATE isn't refreshed, creating pushback on the peer to stop sending. With your patch, you ignore all connection-level flow control and have no pushback there, meaning that only stream-level flow control is enforced. The reason I said it's a question of defaults is because we likely need a larger connection-level flow window. Since you want to retitle this bug, I opened #16512 for investigating the flow control values.
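To put rough numbers on that worst case: 65,535 bytes is RFC 7540's default SETTINGS_INITIAL_WINDOW_SIZE, while the 250-stream limit is an assumed advertised value for the example (the RFC itself leaves SETTINGS_MAX_CONCURRENT_STREAMS unlimited by default):

```go
package main

import "fmt"

func main() {
	const initialWindow = 65535 // RFC 7540 default SETTINGS_INITIAL_WINDOW_SIZE
	const maxStreams = 250      // assumed SETTINGS_MAX_CONCURRENT_STREAMS for the example

	// If the connection-level window is refunded immediately on receipt,
	// the only remaining bound on buffered data is the per-stream window
	// summed over every allowed stream:
	fmt.Println(maxStreams * initialWindow) // 16383750 bytes, ~15.6 MiB
}
```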
Thanks for your comments; now I understand that conn-level flow control is used to limit the total memory per connection, which could be smaller than max_streams * stream_window_size. In that case, it makes sense to have a larger conn-level flow window. It might be helpful to add an option, since different users may have different limitations.
Closing in favor of #16512
Ref: #16481
h2test.tar.gz
Here is a test (attached above) that reproduces the bug; you can change the bad handler (h2bad) into a slow consumer instead of one that stalls completely.
From RFC 7540, Section 5.2 (Flow Control):
"A flow-control scheme ensures that streams on the same connection do not destructively interfere with each other."
This is a real-world case: a stream with a slow consumer (for example, a handler that can't consume data fast enough) can block all other streams in the same connection, even if the connection is wide open and has lots of available bandwidth.
Here is a suggested fix:
The problem is that the current implementation sends WINDOW_UPDATE for both the connection level and the stream level only when data is read by the handler. The correct behavior would be to send the stream-level WINDOW_UPDATE when data is read by the handler, but send the connection-level WINDOW_UPDATE as soon as a DATA frame is received, regardless of whether it is delivered or discarded.
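A minimal, self-contained sketch of that proposal, using toy types (the real x/net/http2 internals are more involved; all names and numbers here are made up for illustration). Note this is the change the maintainer argues against above, since it removes connection-level pushback:

```go
package main

import "fmt"

// A toy model of HTTP/2's two flow-control levels.
type conn struct {
	connWindow int // bytes the peer may still send on the whole connection
}

type stream struct {
	streamWindow int    // bytes the peer may still send on this stream
	buf          []byte // data buffered until the handler reads it
}

// onDataFrame models receipt of a DATA frame. Both windows are charged,
// but under the proposed fix the connection-level window is refunded
// (via WINDOW_UPDATE) immediately, whether or not the handler ever
// reads, so one stalled stream cannot starve the whole connection.
func (c *conn) onDataFrame(st *stream, data []byte) {
	c.connWindow -= len(data)
	st.streamWindow -= len(data)
	st.buf = append(st.buf, data...)
	c.connWindow += len(data) // refund connection level now
	// st.streamWindow is deliberately NOT refunded here.
}

// onHandlerRead models the handler consuming n buffered bytes. Only the
// stream-level window is refunded, so an unread stream still pushes
// back on the sender, but only on itself.
func (st *stream) onHandlerRead(n int) {
	st.buf = st.buf[n:]
	st.streamWindow += n // refund stream level now
}

func main() {
	c := &conn{connWindow: 65535}
	st := &stream{streamWindow: 65535}
	c.onDataFrame(st, make([]byte, 1024))
	fmt.Println(c.connWindow, st.streamWindow) // 65535 64511: conn refunded, stream not
	st.onHandlerRead(1024)
	fmt.Println(c.connWindow, st.streamWindow) // 65535 65535
}
```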