Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/url: Go incorrectly unescapes URL and breaks HTTP Redirects #3433

Closed
gopherbot opened this issue Mar 29, 2012 · 7 comments
Closed

net/url: Go incorrectly unescapes URL and breaks HTTP Redirects #3433

gopherbot opened this issue Mar 29, 2012 · 7 comments

Comments

@gopherbot
Copy link

by stephane.travostino:

(Using Go 1)

Consider the following URL:

http://www.hulkshare.com/dl/y9g6b0xkgg37/01%20Rack%20City%20(Remix).mp3

When downloading the following file with "curl", this is the interaction:

1) GET http://www.hulkshare.com/dl/y9g6b0xkgg37/01%20Rack%20City%20(Remix).mp3
    <- 302 Found, http://trckr.hulkshare.com/hulkdl/y9g6b0xkgg37/01_Rack_City_%28Remix%29.mp3?z=1

2) GET http://trckr.hulkshare.com/hulkdl/y9g6b0xkgg37/01_Rack_City_%28Remix%29.mp3?z=1
    <- 302 Found, http://cdn01.hulkshare.com/dev1/0/003/400/0003400309.fid/01_Rack_City_%28Remix%29.mp3?key=800ab0ba09002ead53b1f2b4fed24b32&;dl=1

3) GET
http://cdn01.hulkshare.com/dev1/0/003/400/0003400309.fid/01_Rack_City_%28Remix%29.mp3?key=800ab0ba09002ead53b1f2b4fed24b32&;dl=1
    <- 200 OK

Using Go, the following happens:

1) GET http://www.hulkshare.com/dl/y9g6b0xkgg37/01%20Rack%20City%20(Remix).mp3
    <- 302 Found, http://trckr.hulkshare.com/hulkdl/y9g6b0xkgg37/01_Rack_City_%28Remix%29.mp3?z=1

2) GET http://trckr.hulkshare.com/hulkdl/y9g6b0xkgg37/01_Rack_City_(Remix).mp3?z=1
    <- 302 Found, http://cdn04.hulkshare.com/dev4/0/003/400/0003400309.fid/01_Rack_City_%28Remix%29.mp3?key=51eb16a7b50640caabc8224df081e86c&;dl=1

3) GET
http://cdn04.hulkshare.com/dev4/0/003/400/0003400309.fid/01_Rack_City_(Remix).mp3?key=51eb16a7b50640caabc8224df081e86c&;dl=1
    <- 412 Precondition Failed

You can see that the first 302 returned an URL which has been unescaped by Go before
executing GET #2, same for #3 and for some reason the web server returns an error.

The bug seems to be in net/url Parse function which unescapes the string by default,
breaking this mechanism.
@gopherbot
Copy link
Author

Comment 1 by stephane.travostino:

The net/url parsing function is not broken in all cases. I'm attaching a simple test
case, which converts a string to *URL, and from *URL to string.
The output is:
Correct:
Before: http://www.hulkshare.com/dl/y9g6b0xkgg37/01%20Rack%20City%20(Remix).mp3
After: http://www.hulkshare.com/dl/y9g6b0xkgg37/01%20Rack%20City%20(Remix).mp3
Incorrect:
Before:
http://cdn04.hulkshare.com/dev4/0/003/400/0003400309.fid/01_Rack_City_%28Remix%29.mp3?key=33f088e7a37cd0b4d555f2700be39c1f&dl=1
After:
http://cdn04.hulkshare.com/dev4/0/003/400/0003400309.fid/01_Rack_City_(Remix).mp3?key=33f088e7a37cd0b4d555f2700be39c1f&dl=1

Attachments:

  1. issue3433.go (518 bytes)

@gopherbot
Copy link
Author

Comment 2 by stephane.travostino:

In http://golang.org/src/pkg/net/url/url.go, line 71, a TODO comment reminds the author
to update the reserved characters as specified in RFC 3986.
In that RFC, the '(' and ')' character, for example, are reserved, and as such the two
"After" strings in the output above are invalid.
I'll try to write a patch for this.

@gopherbot
Copy link
Author

Comment 3 by stephane.travostino:

With this patch, supporting RFC 3986, I no longer get the 412 HTTP error.

Attachments:

  1. support-for-rfc3986.patch (1522 bytes)

@minux
Copy link
Member

minux commented Mar 30, 2012

Comment 4:

stephane.travostino, you can follow the instruction detailed here to contribute the
patch.
http://golang.org/doc/contribute.html#Code_review

@dsymonds
Copy link
Contributor

dsymonds commented Apr 4, 2012

Comment 5:

Labels changed: added priority-later, packagebug, removed priority-triage.

Status changed to Started.

@rsc
Copy link
Contributor

rsc commented Apr 5, 2012

Comment 6:

This issue was closed by revision 56024fa.

Status changed to Fixed.

@rsc
Copy link
Contributor

rsc commented Apr 25, 2012

Comment 7:

This issue was closed by revision 4b0694f834af.

rsc pushed a commit that referenced this issue May 11, 2015
««« backport 6b46fb967ca4
net/url: Correctly escape URL as per RFC 3986

The shouldEscape function did not correctly escape the reserved characters listed in RFC 3986 §2.2, breaking some strict web servers.
Fixes #3433.

R=rsc
CC=golang-dev
https://golang.org/cl/5970050

»»»
@golang golang locked and limited conversation to collaborators Jun 24, 2016
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants