Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http/httputil: replace Director with Rewrite #53002

Closed
neild opened this issue May 19, 2022 · 17 comments
Closed

net/http/httputil: replace Director with Rewrite #53002

neild opened this issue May 19, 2022 · 17 comments

Comments

@neild
Copy link
Contributor

neild commented May 19, 2022

This proposal seeks to address #50580. The quick summary for the motivation is that the ReverseProxy request-rewriting hook needs to make a clear distinction between the request received by the proxy and the forwarded request sent by the proxy. The existing Director hook does not make this distinction, resulting in unsafe behavior. See #50580 for details.

I propose adding a Rewrite hook to httputil.ReverseProxy, superseding the existing Director hook. Rewrite will take a *httputil.Rewriter parameter. The Rewriter will provide the inbound and outbound requests, as well as convenience methods for modifying the outbound request.

A ReverseProxy with a Rewrite function will default to making minimal modifications to the outbound request. In particular, it will not add X-Forwarded-* headers by default. Methods on Rewriter will provide an easy way to add these headers if desired: Rewriter.SetXForwarded to add headers (replacing any sent by the client) or Rewriter.AppendXForwarded (appending to ones sent by the client, the current default behavior). This imposes a small additional requirement on users of ReverseProxy, but makes the proxy behavior explicit and easier to change in the future.

Proposed API additions:

type ReverseProxy struct {
        // Rewrite is be a function which modifies  
        // the request into a new request to be sent
        // using Transport. Its response is then copied
        // back to the original client unmodified.
        // Rewrite must not access the provided Rewriter
        // or its contents after returning.
        //
        // The X-Forwarded-For header is not automatically set when
        // Rewrite is used. Use the Rewriter.SetXForwarded method
        // to set it.
        //
        // At most one of Rewrite or Director may be set.
        Rewrite func(*Rewriter)
}

// A Rewriter contains a request to be rewritten by a ReverseProxy.
type Rewriter struct {                                  
        // InReq is the request received by the proxy.
        // The Rewrite function must not modify InReq.
        InReq *http.Request                                        
                                                                 
        // OutReq is the request which will be sent by the proxy.
        // The Rewrite function may modify this request.       
        // Hop-by-hop headers are removed from this request   
        // before Rewrite is called.   
        OutReq *http.Request                                     
}            
                                       
// SetURL routes the outbound request to the scheme, host, and base path
// provided in target. If the target's path is "/base" and the incoming
// request was for "/dir", the target request will be for "/base/dir".
// SetURL does not rewrite the Host header.                           
func (r *Rewriter) SetURL(target *url.URL) {}

// SetXForwarded sets the X-Forwarded-For, X-Forwarded-Host, and     
// X-Forwarded-Proto headers of the outbound request.        
//
//   - The X-Forwarded-For header is set to the client IP address.
//   - The X-Forwarded-Host header is set to the host name requested
//     by the client.
//   - The X-Forwarded-Proto header is set to "http" or "https", depending
//     on whether the inbound request was made on a TLS-enabled connection.
func (r *Rewriter) SetXForwarded() {}

// SetXForwarded sets the X-Forwarded-For, X-Forwarded-Host, and
// X-Forwarded-Proto headers of the outbound request. It behaves
// like SetXForwarded, but appends the client IP address to the
// X-Forwarded-For header in the inbound request (if present).
func (r *Rewriter) AppendXForwarded() {}

Example usage:

// This is the equivalent of httputil.NewSingleHostReverseProxy(someURL).
proxy := httputil.ReverseProxy{
        Rewrite: func(r *httputil.Rewriter) {
                r.SetXForwarded()
                r.SetURL(someURL)
        }
}

A draft implementation is in https://go.dev/cl/407214.

@neild neild added the Proposal label May 19, 2022
@gopherbot gopherbot added this to the Proposal milestone May 19, 2022
@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) May 19, 2022
@rsc
Copy link
Contributor

rsc commented May 25, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals (old) May 25, 2022
@urandom2
Copy link
Contributor

I am bike shedding, but httputil.Rewriter looks like it should be an interface.

I do not see discussion in #50580, so I will ask here: what is the benefit to the func(*struct{}) and (void?) methods, over having httputil.ReverseProxy.Rewrite be a struct with fields to be configured? It seems more complex, which I think is a bit of a double edged sword, but I have not seen this pattern elsewhere in the stdlib. My alternative would look like this in your example:

// This is the equivalent of httputil.NewSingleHostReverseProxy(someURL).
proxy := httputil.ReverseProxy{
        Rewrite: httputil.Rewrite{
                SetXForwarded: true,
                SetURL: someURL
        }
}

@neild
Copy link
Contributor Author

neild commented May 25, 2022

Rewriter definitely isn't an interface. There's only one implementation of it, and we want to be able to add methods to it in the future.

The proposed Rewrite hook, like the Director hook that it would supersede, is a user-provided function that can perform any amount of modification of a proxied request before it is forwarded. It can't be replaced by a fixed list of operations.

@urandom2
Copy link
Contributor

Rewriter definitely isn't an interface. There's only one implementation of it, and we want to be able to add methods to it in the future.

Should have been more clear: pkg.Doer, to me, looks like an interface because it ends in -er. We may want to consider a different name.

The proposed Rewrite hook, like the Director hook that it would supersede, is a user-provided function that can perform any amount of modification of a proxied request before it is forwarded. It can't be replaced by a fixed list of operations.

Yep, sorry the example pigenholed my thinking; that makes sense.

That said, why then would we not make it Rewrite: func(*http.Request) *http.Request? Having the methods for setURL or forward headers as utility functions could be useful outside this usecase.

You could also have two inputs if void is desirable: func(in, out *http.Request). It could also be worth making in an http.Request, no pointer, to suggest/enforce this immutability requirement.

@neild
Copy link
Contributor Author

neild commented May 26, 2022

func(*http.Request) *http.Request doesn't address #50580. If the input request contains hop-by-hop headers, we can't tell if a header in the output was added by the rewriter; if the input doesn't contain hop-by-hop headers, the rewriter has no way to look for ones it may be interested in. We need to pass both the inbound and outbound requests to the user hook.

Wrapping both requests up in a Rewriter type gives us a good place to hang additional functions such as SetURL. While the ones in this proposal could also be stand-alone functions, making them methods of a type passed to Rewrite is good for discoverability, especially given the overly-broad API surface in httputil. It also gives us the option of adding methods in the future that customize the proxy behavior in other ways.

@urandom2
Copy link
Contributor

func(*http.Request) *http.Request doesn't address #50580. If the input request contains hop-by-hop headers, we can't tell if a header in the output was added by the rewriter; if the input doesn't contain hop-by-hop headers, the rewriter has no way to look for ones it may be interested in. We need to pass both the inbound and outbound requests to the user hook.

No that makes sense, there has already been some preprocessing done to create a populated Rewriter.OutReq, thus we would need the two parameter form:

// This is the equivalent of httputil.NewSingleHostReverseProxy(someURL).
proxy := httputil.ReverseProxy{
        Rewrite: func(in, out *http.Request) { ...}
}

Wrapping both requests up in a Rewriter type gives us a good place to hang additional functions such as SetURL. While the ones in this proposal could also be stand-alone functions, making them methods of a type passed to Rewrite is good for discoverability, especially given the overly-broad API surface in httputil. It also gives us the option of adding methods in the future that customize the proxy behavior in other ways.

Yeah, if we see a clear need to add new struct fields or methods that include magic [recte internal apis], I can see this being useful; it just feels like a novel design pattern akin to grpc input structs, which feel decidedly non go-like. I just prefer the obvious nature of my explicit format.

@rsc
Copy link
Contributor

rsc commented Jun 8, 2022

Have we resolved the objections to Rewriter?
Should we tentatively try to land it early in Go 1.20?
What other open issues would we be able to close if we add Rewriter?

@neild
Copy link
Contributor Author

neild commented Jun 8, 2022

Perhaps there's a better name than Rewriter, but in the absence of a concrete idea for something I think it's fine. (RewriteState? RewriteContext?)

Landing this early in 1.20 would be nice.

Looking through issues that mention ReverseProxy:

Primary motivation for this proposal (and sufficient to justify it, I believe, since this is a security concern):

Issues related to forwarding headers (X-Forwarded-For, etc.). Making the addition of X-Forwarded headers explicit makes it easier to adjust this behavior in a backwards-compatible fashion.

Issues related to NewSingleHostReverseProxy preserving the inbound request's Host header. We could make the Rewriter.SetURL method clear the inbound Host. Even if we preserve the current behavior of preserving the host, Rewrite makes it much simpler for the user to customize the proxy behavior, since it doesn't require wrapping the Director function created by NewSingleHostReverseProxy.

@neild
Copy link
Contributor Author

neild commented Jun 9, 2022

Possibly better name than Rewriter: ProxyRequest.

@gopherbot
Copy link

Change https://go.dev/cl/407214 mentions this issue: net/http/httputil: add ReverseProxy.Rewrite

@neild
Copy link
Contributor Author

neild commented Jun 11, 2022

CL 407214 contains an implementation of this proposal, with a few changes from the original post:

  • Renamed Rewriter to ProxyRequest.
  • When using a Rewrite func,
    • The default behavior is to send no X-Forwarded-* headers (stripping ones from the inbound request).
    • ProxyRequest.SetXForwarded will set X-Forwarded-*.
    • To append to the client's X-Forwarded-For header, copy the inbound header to the outbound request and call ProxyRequest.SetXForwarded.
  • ProxyRequest.SetURL sets the outbound request's Host header.

The X-Forwarded-* changes preference more secure behavior (not to trust the client's headers).

The SetURL change addresses #28168, while leaving it simple to get the current behavior of NewSingleHostReverseProxy back if desired.

@seankhliao
Copy link
Member

If we're clearing X-Forwarded-*, then should Forwarded also be preemptively cleared?

@neild
Copy link
Contributor Author

neild commented Jun 11, 2022

Yes, I'm actually already clearing Forwarded in the CL but neglected to mention it specifically.

@rsc
Copy link
Contributor

rsc commented Jun 15, 2022

FWIW when I said "objections to Rewriter" I meant this proposal generally, not the name. We should use whatever name you think is clearest; Rewriter is fine if so.

Does anyone object to trying this in Go 1.20?

@rsc rsc moved this from Active to Likely Accept in Proposals (old) Jun 22, 2022
@rsc
Copy link
Contributor

rsc commented Jun 22, 2022

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc rsc moved this from Likely Accept to Accepted in Proposals (old) Jul 1, 2022
@rsc
Copy link
Contributor

rsc commented Jul 1, 2022

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: net/http/httputil: replace Director with Rewrite net/http/httputil: replace Director with Rewrite Jul 1, 2022
@rsc rsc modified the milestones: Proposal, Backlog Jul 1, 2022
@dmitshur dmitshur removed this from the Backlog milestone Aug 16, 2022
@dmitshur dmitshur added this to the Go1.20 milestone Aug 16, 2022
@gopherbot
Copy link

Change https://go.dev/cl/450515 mentions this issue: doc/go1.20: add release notes for net/http and net/http/httputil

gopherbot pushed a commit that referenced this issue Nov 15, 2022
For #41773
For #41773
For #50465
For #51914
For #53002
For #53896
For #53960
For #54136
For #54299

Change-Id: I729d5eafc1940d5706f980882a08ece1f69bb42c
Reviewed-on: https://go-review.googlesource.com/c/go/+/450515
Auto-Submit: Damien Neil <dneil@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Damien Neil <dneil@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
@golang golang locked and limited conversation to collaborators Nov 15, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
No open projects
Development

No branches or pull requests

6 participants