New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net/http: document Transport errors more, types of errors returned, how to check canceled, etc #15935
Comments
@erikdubbelboer what would you do if |
@davecheney we use http for some internal communication. It's around 5000 requests per second with a 100ms timeout before we cancel the request. At the moment we log all errors but would like to ignore errors that aren't important to us. If I'm not sure which errors you would call temporary. For example we already ignore connection refused errors using: if uerr, ok := err.(*url.Error); ok {
if noerr, ok := uerr.Err.(*net.OpError); ok {
if scerr, ok := noerr.Err.(*os.SyscallError); ok {
if scerr.Err == syscall.ECONNREFUSED {
ignore = true
}
}
}
} |
Don't you already know the request was canceled since your code canceled the request, or does cancellation happen across an API boundary? Also, I think multiple different errors can be returned depending on when the request is cancelled (currently errRequestCanceled or errRequestCanceledConn). |
Yes it's across an API boundary. Both sides are internal code so we could fix this internally but I thought it would also be nice for other people to be able to detect this error this way. Shouldn't in theory all error types the Go API returns be exported so a user can always detect them reliably without resorting to string comparisons? |
I don't entirely disagree. I have some ugly code that detects net/http's errTimeout by inspecting the error string. It would be nice if that error was exported. The counter-argument is that this forces the API to have a mostly-stable list of possible errors, which can be difficult. How do you decide which errors to export? If your errors are too abstract, the error messages are less useful. If your errors are too low-level, there can be heavy churn as the code changes and old errors are deprecated (e.g., see issue #15150). Strictly speaking, I don't think all errors should be exported, but maybe there's a small set of stable errors to export? ErrRequestCanceled and ErrResponseHeadersTimedOut are both good candidates since they relate directly to behaviors controlled by the net/http API. |
@nightlyone might have suggestions for what to do here, per his #15192 |
@tombergan having a type satisfying https://godoc.org/github.com/nightlyone/errorclass#Canceled for errRequestCanceled might help this effort. So the user of the http package will know what actions to take from these error classifications. It seems the errTimeout is already correctly classified and Temporary() will will return true there. It will even report that it is a timeout. The exact error is usually not needed to decide what to do once the application is running, since complete error enumerations is an endless, never complete task. But what is useful, is to support the decision, whether I should retry (Temporary() == true), ignore it (in the cancel case) or report it. |
If only it were that simple! Before seeing this issue, you might have reasonably concluded that Temporary() was "good enough" to make programmatic decisions regarding HTTP errors. Then this bug added another case, which is to ignore canceled errors. Your errorclass seems to handle this case. Then a future bug wants to know if the request timed out because Transport.ResponseHeaderTimeout was exceeded, or because Transport.TLSHandshakeTimeout was exceeded, or because the req.Context exceeded its deadline, or because of a DNS timeout, or because of a TCP connect timeout, etc. You cannot answer these questions with a simple IsTimeout() method on the error. For example: I maintain a large proxy service that uses net/http. Rather than logging net/http errors as strings, we convert net/http errors into a custom enum error space (mostly because it's easy to verify that an enum doesn't contain PII, compared to an error string, which can contain arbitrary info; also, enums are easier to process than strings at large scales). Part of that service is ~70 lines of code that tries to convert the error returned by RoundTrip() or Body.Read() into an error in our custom enum space. This code actually isn't too bad. The standard library already exports types like net.OpError, net.DNSError, and os.SyscallError. However, we check for ResponseHeaderTimeout errors by checking for error strings that contain "timeout awaiting response headers". This introspection would be unnecessary if net/http's errTimeout was exported. I agree with your comment that "complete error enumerations is an endless, never complete task", but that comment is also overly pessimistic. Here's a simple rule: if an API exports N knobs, then it should also export N error types or vars, where each error means that the operation failed due to the corresponding knob. For net/http.RoundTrip, I think three errors could be exported following this rule:
Note that (Footnote: I think you could make a case that RoundTrip should return Request.Context().Err() instead of errRequestCanceled when Request.Context() completes, to preserve that original error.) |
@ianlancetaylor recently submitted 0b5f2f0 which does that. Maybe that's enough for this bug? |
Yep, I think that would satisfy @erikdubbelboer's specific request. WDYT about also exporting errTimeout and tlsHandshakeTimeoutError because of this suggested rule:
|
Both of those already have a I don't want the error package to be a minefield of error types and variables obscuring the useful parts. I also don't want the implementation to be restricted to returning specific error variables over time, constraining the ability to refactor things. I'd rather promise properties of the errors, like that its Actually, it's not. The only mention of url.Error at https://golang.org/pkg/net/http/ is this one appearance:
But nothing else talks about *url.Error. Before we start ad-hoc fixing stuff like exporting single variables, I'd rather have a concerted effort to document the status quo where one already exists, make inconsistent things consistent, then document them, and write new tests locking in behavior for the documentation. Maybe we have some of those tests already, and some (but not enough) docs. But I don't think the answer is more error variables right now. Witness this error variable:
It's in the public docs and was only documented to be no longer used after it was discovered that it was vestigial (#15150). We don't want more of those. Do you want to specifically know when a TLS handshake timeout happens? Can you not already tell between httptrace hooks and the |
I think that's right -- these questions can be answered via the httptrace package. So I think you can close this bug.
For the record (not important; you can close this bug), I don't think this is a problem for the rule proposed above because an error is exported only if a knob exists in the public API. You're constrained to support that error for as long as you're constrained to support that API knob. |
@tombergan, I'll just slightly repurpose this bug instead, changing its title. |
As described in #18272, it would also be nice if you could easily tell if an error returned by http.Transport.RoundTrip came from communicating with the target server or from reading request.Body (or a third place I'm not considering). |
I've started some notes in https://docs.google.com/a/golang.org/document/d/15J1ypbcK7OJ2IJ6TEgB3TitNPTCpo0kAwBbrZu2tzPo/edit?usp=sharing about what Transport actually does today. Hopefully people don't depend on the current behavior too much. I seem to recall other bugs though where people have made deep assumptions about specific error types returned by RoundTrip, even though nothing is promised. |
CL https://golang.org/cl/34381 mentions this issue. |
The client code was using time.Now() (wall time) to determine whether the cause of a non-nil error meant that a timeout had occured. But on Windows, the clock used for timers (time.After, time.Sleep, etc) is much more accurate than the time.Now clock, which doesn't update often. But it turns out that as of the recent https://golang.org/cl/32478 we already have the answer available easily. It just wasn't in scope. Instead of passing this information along by decorating the errors (risky this late in Go 1.8, especially with #15935 unresolved), just passing along the "didTimeout" func internally for now. We can remove that later in Go 1.9 if we overhaul Transport errors. Fixes #18287 (I hope) Change-Id: Icbbfceaf02de6c7ed04fe37afa4ca16374b58f3c Reviewed-on: https://go-review.googlesource.com/34381 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
Well, I looked into this, but it's too much to document for Go 1.8, not that I think we'd even want to document it in its current state. (See earlier comment) |
CL https://golang.org/cl/37495 mentions this issue. |
If I put a 10 millisecond sleep at testHookWaitResLoop, before the big select in (*persistConn).roundTrip, two flakes immediately started happening, TestTransportBodyReadError (#19231) and TestTransportPersistConnReadLoopEOF. The problem was that there are many ways for a RoundTrip call to fail (errors reading from Request.Body while writing the response, errors writing the response, errors reading the response due to server closes, errors due to servers sending malformed responses, cancelations, timeouts, etc.), and many of those failures then tear down the TCP connection, causing more failures, since there are always at least three goroutines involved (reading, writing, RoundTripping). Because the errors were communicated over buffered channels to a giant select, the error returned to the caller was a function of which random select case was called, which was why a 10ms delay before the select brought out so many bugs. (several fixed in my previous CLs the past few days). Instead, track the error explicitly in the transportRequest, guarded by a mutex. In addition, this CL now: * differentiates between the two ways writing a request can fail: the io.Copy reading from the Request.Body or the io.Copy writing to the network. A new io.Reader type notes read errors from the Request.Body. The read-from-body vs write-to-network errors are now prioritized differently. * unifies the two mapRoundTripErrorFromXXX methods into one mapRoundTripError method since their logic is now the same. * adds a (*Request).WithT(*testing.T) method in export_test.go, usable by tests, to call t.Logf at points during RoundTrip. This is disabled behind a constant except when debugging. * documents and deflakes TestClientRedirectContext I've tested this CL with high -count values, with/without -race, with/without delays before the select, etc. So far it seems robust. Fixes #19231 (TestTransportBodyReadError flake) Updates #14203 (source of errors unclear; they're now tracked more) Updates #15935 (document Transport errors more; at least understood more now) Change-Id: I3cccc3607f369724b5344763e35ad2b7ea415738 Reviewed-on: https://go-review.googlesource.com/37495 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Duplicate of #9424. (Not sure why I never noticed earlier) |
At the moment the only wat to know if an
http.Request
was canceled is to check the error string. Ifhttp.errRequestCanceled
is exported this becomes much easier.I can submit a change to gerrit if needed?
The text was updated successfully, but these errors were encountered: