net/http: http.Get unnecessary data transfer #36242
Comments
Did you try a Range header? https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Range PS: You'll get faster help with questions by posting to golang-nuts, reddit, etc.
@networkimprov This issue isn't a question; it's a report of the inefficient behavior of the http round-trippers: they read a lot of data before anything is read from the response body. The playground code only demonstrates the behavior at a high level. I'm not using that code in production; I created it just to illustrate the issue.
@johnfitzz Sorry that I dared to look into this too. Can you please specify which version of nettop you used? Also, as a side note, I'm not sure "inefficient behavior" is the correct term. The connection first fetches a chunk of data, and only then do we parse where the headers end. It's rather a question of what prefetch size we think is best and whether we can control it. Since you use GET instead of HEAD, it is normal to expect that you want the body, so prefetching part of it doesn't look wrong. In that sense, the LimitedReader approach you describe looks like the proper way to alter the normal behavior.
@RodionGork In HTTP/2, the prefetched chunk is about 50% of the whole body, which consumes network and memory resources. To me "inefficiency" is the correct term here: the package hands us a reader, but it prefetches a lot of data without asking us, and it's uncontrollable. You never know how many bytes are transferred when you use the http package. Another problem is that this behavior is undocumented; nothing warns you that it prefetches a large amount of body data. It's a black box. I've debugged and analyzed the round-tripping code several times: it fires up goroutines and continuously fetches data from the server before you read a single byte from the body reader. It may be normal to receive some body data (whatever the server sends unprompted), but the http package itself prefetches megabytes of data (about 50% of the body, especially over HTTP/2), and to me this is not "efficient". It's implicit, inefficient, and almost uncontrollable behavior. The LimitedReader approach doesn't help because the round-trippers have already buffered a lot of data before I read. I'm on OS X 10.15.1; the nettop version depends on the OS X version. I validated the same issue using various tools (OS X's Activity Monitor, Wireshark, etc.). For Ubuntu, check this out. This issue is not tied to a specific tool; it's about the http package. Also make sure that you're fetching over HTTP/2 to see the full extent of the problem.
You didn't comment on my question about the HTTP Range header above. |
@networkimprov Say you're going to download the whole body, but you want to interrupt the download if you encounter unwanted bytes partway through the stream. If you can't predict where those bytes occur in the body, you can't use a Range header. With the current http package you can't do this efficiently, because it has already prefetched a lot of data. As I said, this issue is not about the HTTP Range header; it's about the behavior of the http package itself.
Do a series of GETs with consecutive ranges; check for unwanted bytes after each. http.Client can't divine the optimal segment size for your download. I wouldn't assume the behavior of this widely used API is wrong for most use cases, so if I'd noticed this, I'd have posted a Q on golang-nuts. |
@networkimprov As I said before, I'm not looking for solutions to my specific case, thank you. Rather, I'm pointing out that the http package does unnecessary data transfer, and that this behavior is uncontrollable and undocumented.
http round-trippers transfer an unnecessary amount of data even if nothing is read from the response body.
In my production code (not shown here) I read only the first 10-15 bytes of a large file over HTTP, yet the http round-trippers transfer far more data than that. In the real code I use io.LimitedReader and/or io.CopyN to limit what I read. However, this issue is not about how to read data efficiently; it's about the round-trippers' inefficient transfer behavior.
What version of Go are you using (go version)?
What version of OS X are you using?
What did you do?
See the code on playground.
Note: I created this code only to demonstrate the issue.
What did you expect to see?
The 10 KiB figure is arbitrary; I simply expected to see far less data being transferred.
What did you see instead?
This is what I saw: it transferred an unnecessary amount of data:
Investigation:
transport.go#L530-L536: here, when it selects the alternative transport (pconn.alt.RoundTrip(req)), it buffers up a lot of data even before I read from response.Body.