Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

archive/tar: mention changes in go1.6 docs #13647

Closed
bradfitz opened this issue Dec 17, 2015 · 6 comments
Closed

archive/tar: mention changes in go1.6 docs #13647

bradfitz opened this issue Dec 17, 2015 · 6 comments

Comments

@bradfitz
Copy link
Contributor

http://tip.golang.org/doc/go1.6 does not mention archive/tar at all, but tons of work went into in this cycle.

@dsnet, can you help summarize the notable parts?

@bradfitz bradfitz added this to the Go1.6 milestone Dec 17, 2015
@rsc
Copy link
Contributor

rsc commented Dec 17, 2015

It looks like "bugs fixed in arcane corner cases", which usually doesn't go into release notes. The most visible change is spelling license correctly.

Maybe e3b615f (detect truncated files) is worth mentioning, but maybe not.

$ git log go1.5..origin/master src/archive/tar
commit 2ae895c0ce36ffb607442ff053bd19cb5fcd6fd6
Author:     Joe Tsai <joetsai@digital-static.net>
AuthorDate: Wed Dec 16 11:26:26 2015 -0800
Commit:     Russ Cox <rsc@golang.org>
CommitDate: Thu Dec 17 02:46:41 2015 +0000

    archive/tar: spell license correctly in example

    Change-Id: Ice85d161f026a991953bd63ecc6ec80f8d06dfbd
    Reviewed-on: https://go-review.googlesource.com/17901
    Run-TryBot: Joe Tsai <joetsai@digital-static.net>
    Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

commit 5ebb374f96da366d2c44b2fae7d9b3d0af7825dd
Author:     Joe Tsai <joetsai@digital-static.net>
AuthorDate: Wed Dec 2 15:41:44 2015 -0800
Commit:     Brad Fitzpatrick <bradfitz@golang.org>
CommitDate: Fri Dec 4 22:38:15 2015 +0000

    archive/tar: properly parse GNU base-256 encoding

    Motivation:
    * Previous implementation did not detect integer overflow when
    parsing a base-256 encoded field.
    * Previous implementation did not treat the integer as a two's
    complement value as specified by GNU.

    The relevant GNU specification says:
    <<<
    GNU format uses two's-complement base-256 notation to store values
    that do not fit into standard ustar range.
    >>>

    Fixes #12435

    Change-Id: I4639bcffac8d12e1cb040b76bd05c9d7bc6c23a8
    Reviewed-on: https://go-review.googlesource.com/17424
    Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
    TryBot-Result: Gobot Gobot <gobot@golang.org>

commit 481eedce62015f9f7ed390bb01b1afc63300b0e6
Author:     Joe Tsai <joetsai@digital-static.net>
AuthorDate: Wed Dec 2 15:48:06 2015 -0800
Commit:     Brad Fitzpatrick <bradfitz@golang.org>
CommitDate: Fri Dec 4 21:38:38 2015 +0000

    archive/tar: properly format GNU base-256 encoding

    Motivation:
    * Previous implementation silently failed when an integer overflow
    occurred. Now, we report an ErrFieldTooLong.
    * Previous implementation did not encode in two's complement format and was
    unable to encode negative numbers.

    The relevant GNU specification says:
    <<<
    GNU format uses two's-complement base-256 notation to store values
    that do not fit into standard ustar range.
    >>>

    Fixes #12436

    Change-Id: I09c20602eabf8ae3a7e0db35b79440a64bfaf807
    Reviewed-on: https://go-review.googlesource.com/17425
    Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
    TryBot-Result: Gobot Gobot <gobot@golang.org>

commit b8a12928a65a15ecce5c9332da9feef0cb221389
Author:     Joe Tsai <joetsai@digital-static.net>
AuthorDate: Wed Sep 16 00:58:56 2015 -0700
Commit:     Brad Fitzpatrick <bradfitz@golang.org>
CommitDate: Wed Dec 2 02:27:27 2015 +0000

    archive/tar: convert Reader.Next to be loop based

    Motivation for change:
    * Recursive logic is hard to follow, since it tends to apply
    things in reverse. On the other hand, the tar formats tend to
    describe meta headers as affecting the next entry.
    * Recursion also applies changes in the wrong order. Two test
    files are attached that use multiple headers. The previous Go
    behavior differs from what GNU and BSD tar do.

    Change-Id: Ic1557256fc1363c5cb26570e5d0b9f65a9e57341
    Reviewed-on: https://go-review.googlesource.com/14624
    Run-TryBot: Joe Tsai <joetsai@digital-static.net>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

commit 38f8e4407c3626a9509760089b82eecb4d0f2d48
Author:     Joe Tsai <joetsai@digital-static.net>
AuthorDate: Mon Sep 28 13:49:35 2015 -0700
Commit:     Brad Fitzpatrick <bradfitz@golang.org>
CommitDate: Wed Dec 2 00:29:33 2015 +0000

    archive/tar: move parse/format methods to standalone receiver

    Motivations for this change:
    * It allows these functions to be used outside of Reader/Writer.
    * It allows these functions to be more easily unit tested.

    Change-Id: Iebe2b70bdb8744371c9ffa87c24316cbbf025b59
    Reviewed-on: https://go-review.googlesource.com/15113
    Reviewed-by: Russ Cox <rsc@golang.org>
    Run-TryBot: Joe Tsai <joetsai@digital-static.net>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

commit 7823197e5d6a024a4fe2f4f9ca414cb5244eb10f
Author:     Joe Tsai <joetsai@digital-static.net>
AuthorDate: Thu Oct 1 01:35:15 2015 -0700
Commit:     Brad Fitzpatrick <bradfitz@golang.org>
CommitDate: Tue Dec 1 20:22:38 2015 +0000

    archive/tar: fix issues with readGNUSparseMap1x0

    Motivations:
    * Use of strconv.ParseInt does not properly treat integers as 64bit,
    preventing this function from working properly on 32bit machines.
    * Use of io.ReadFull does not properly detect truncated streams
    when the file suddenly ends on a block boundary.
    * The function blindly trusts user input for numEntries and allocates
    memory accordingly.
    * The function does not validate that numEntries is not negative,
    allowing a malicious sparse file to cause a panic during make.

    In general, this function was overly complicated for what it was
    accomplishing and it was hard to reason that it was free from
    bounds errors. Instead, it has been rewritten and relies on
    bytes.Buffer.ReadString to do the main work. So long as invariants
    about the number of '\n' in the buffer are maintained, it is much
    easier to see why this approach is correct.

    Change-Id: Ibb12c4126c26e0ea460ea063cd17af68e3cf609e
    Reviewed-on: https://go-review.googlesource.com/15174
    Reviewed-by: Russ Cox <rsc@golang.org>
    Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
    TryBot-Result: Gobot Gobot <gobot@golang.org>

commit dd5e14a7511465d20c6e95bf54c9b8f999abbbf6
Author:     Joe Tsai <joetsai@digital-static.net>
AuthorDate: Tue Nov 3 18:12:31 2015 -0800
Commit:     Brad Fitzpatrick <bradfitz@golang.org>
CommitDate: Tue Dec 1 20:16:26 2015 +0000

    archive/tar: properly handle header-only "files" in Reader

    Certain special type-flags, specifically 1, 2, 3, 4, 5, 6,
    do not have a data section. Thus, regardless of what the size field
    says, we should not attempt to read any data for these special types.

    The relevant PAX and USTAR specification says:
    <<<
    If the typeflag field is set to specify a file to be of type 1 (a link)
    or 2 (a symbolic link), the size field shall be specified as zero.
    If the typeflag field is set to specify a file of type 5 (directory),
    the size field shall be interpreted as described under the definition
    of that record type. No data logical records are stored for types 1, 2, or 5.
    If the typeflag field is set to 3 (character special file),
    4 (block special file), or 6 (FIFO), the meaning of the size field is
    unspecified by this volume of POSIX.1-2008, and no data logical records shall
    be stored on the medium.
    Additionally, for type 6, the size field shall be ignored when reading.
    If the typeflag field is set to any other value, the number of logical
    records written following the header shall be (size+511)/512, ignoring
    any fraction in the result of the division.
    >>>

    Contrary to the specification, we do not assert that the size field
    is zero for type 1 and 2 since we liberally accept non-conforming formats.

    Change-Id: I666b601597cb9d7a50caa081813d90ca9cfc52ed
    Reviewed-on: https://go-review.googlesource.com/16614
    Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
    TryBot-Result: Gobot Gobot <gobot@golang.org>

commit 3a3049897c0c62907da398819f61d77199df52ad
Author:     Matt Layher <mdlayher@gmail.com>
AuthorDate: Thu Aug 27 14:52:06 2015 -0400
Commit:     Russ Cox <rsc@golang.org>
CommitDate: Fri Nov 13 02:02:32 2015 +0000

    archive/tar: make output deterministic

    Replaces PID in PaxHeaders with 0.  Sorts PAX header keys before writing
    them to the archive.

    Fixes #12358

    Change-Id: If239f89c85f1c9d9895a253fb06a47ad44960124
    Reviewed-on: https://go-review.googlesource.com/13975
    Reviewed-by: Russ Cox <rsc@golang.org>
    Reviewed-by: Joe Tsai <joetsai@digital-static.net>

commit e3b615fd6c633a05a5d4d46cc0345fdfb82c28e6
Author:     Joe Tsai <joetsai@digital-static.net>
AuthorDate: Thu Oct 1 02:30:29 2015 -0700
Commit:     Ian Lance Taylor <iant@golang.org>
CommitDate: Fri Nov 6 04:31:26 2015 +0000

    archive/tar: detect truncated files

    Motivation:
    * Reader.skipUnread never reports io.ErrUnexpectedEOF. This is strange
    given that io.ErrUnexpectedEOF is given through Reader.Read if the
    user manually reads the file.
    * Reader.skipUnread fails to detect truncated files since io.Seeker
    is lazy about reporting errors. Thus, the behavior of Reader differs
    whether the input io.Reader also satisfies io.Seeker or not.

    To solve this, we seek to one before the end of the data section and
    always rely on at least one call to io.CopyN. If the tr.r satisfies
    io.Seeker, this is guarunteed to never read more than blockSize.

    Fixes #12557

    Change-Id: I0ddddfc6bed0d74465cb7e7a02b26f1de7a7a279
    Reviewed-on: https://go-review.googlesource.com/15175
    Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
    TryBot-Result: Gobot Gobot <gobot@golang.org>

commit e4add8d569d3152a461dbdf6e086dd60c8ca6c27
Author:     Joe Tsai <joetsai@digital-static.net>
AuthorDate: Thu Oct 1 01:04:24 2015 -0700
Commit:     Brad Fitzpatrick <bradfitz@golang.org>
CommitDate: Tue Oct 6 17:49:05 2015 +0000

    archive/tar: fix numeric overflow issues in readGNUSparseMap0x1

    Motivation:
    * The logic to verify the numEntries can overflow and incorrectly
    pass, allowing a malicious file to allocate arbitrary memory.
    * The use of strconv.ParseInt does not set the integer precision
    to 64bit, causing this code to work incorrectly on 32bit machines.

    Change-Id: I1b1571a750a84f2dde97cc329ed04fe2342aaa60
    Reviewed-on: https://go-review.googlesource.com/15173
    Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
    TryBot-Result: Gobot Gobot <gobot@golang.org>

commit 281eabe46f638139b8d85d87a359880dc0f8ea81
Author:     Joe Tsai <joetsai@digital-static.net>
AuthorDate: Tue Oct 6 01:04:18 2015 -0700
Commit:     Brad Fitzpatrick <bradfitz@golang.org>
CommitDate: Tue Oct 6 17:13:11 2015 +0000

    archive/tar: add missing error checks to Reader.Next

    A recursive call to Reader.Next did not check the error before
    trying to use the result, leading to a nil pointer panic.
    This specific CL addresses the immediate issue, which is the panic,
    but does not solve the root issue, which is due to an integer
    overflow in the base-256 parser.

    Updates #12435

    Change-Id: Ia908671f0f411a409a35e24f2ebf740d46734072
    Reviewed-on: https://go-review.googlesource.com/15437
    Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
    Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    TryBot-Result: Gobot Gobot <gobot@golang.org>

commit cfb116d4538dcf2455ca59e786396c779019c065
Author:     Joe Tsai <joetsai@digital-static.net>
AuthorDate: Thu Oct 1 02:59:49 2015 -0700
Commit:     Andrew Gerrand <adg@golang.org>
CommitDate: Tue Oct 6 05:06:58 2015 +0000

    archive/tar: expand abilities of TestReader

    Motivation:
    * There are an increasing number of "one-off" corrupt files added
    to make sure that package does not succeed or crash on them.
    Instead, allow for the test to specify the error that is expected
    to occur (if any).
    * Also, fold in the logic to check the MD5 checksum into this
    function.

    The following tests are being removed:
    * TestIncrementalRead: Done by TestReader by using io.CopyBuffer
    with a buffer of 8. This achieves the same behavior as this test.
    * TestSparseEndToEnd: Since TestReader checks the MD5 checksums
    if the input corpus provides them, then this is redundant.
    * TestSparseIncrementalRead: Redundant for the same reasons that
    TestIncrementalRead is now redundant
    * TestNegativeHdrSize: Added to TestReader corpus
    * TestIssue10968: Added to TestReader corpus
    * TestIssue11169: Added to TestReader corpus

    With this change, code coverage did not change: 85.3%

    Change-Id: I8550d48657d4dbb8f47dfc3dc280758ef73b47ec
    Reviewed-on: https://go-review.googlesource.com/15176
    Reviewed-by: Andrew Gerrand <adg@golang.org>

commit 02d2db18a79ad2b143fe6501dba22feae5260778
Author:     Joe Tsai <joetsai@digital-static.net>
AuthorDate: Thu Oct 1 03:08:18 2015 -0700
Commit:     Brad Fitzpatrick <bradfitz@golang.org>
CommitDate: Thu Oct 1 22:33:33 2015 +0000

    archive/tar: make Reader.Read errors persistent

    If the stream is in an inconsistent state, it does not make sense
    that Reader.Read can be called and possibly succeed.

    Change-Id: I9d1c5a1300b2c2b45232188aa7999e350809dcf2
    Reviewed-on: https://go-review.googlesource.com/15177
    Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>

commit 79480ca07a1515223d49031c59ae37b662f45b5e
Author:     Joe Tsai <joetsai@digital-static.net>
AuthorDate: Mon Sep 28 16:38:16 2015 -0700
Commit:     Andrew Gerrand <adg@golang.org>
CommitDate: Thu Oct 1 00:51:15 2015 +0000

    archive/tar: fix bugs with sparseFileReader

    The sparseFileReader is prone to two different forms of
    denial-of-service attacks:
    * A malicious tar file can cause an infinite loop
    * A malicious tar file can cause arbitrary panics

    This results because of poor error checking/handling, which this
    CL fixes. While we are at it, add a plethora of unit tests to
    test for possible malicious inputs.

    Change-Id: I2f9446539d189f3c1738a1608b0ad4859c1be929
    Reviewed-on: https://go-review.googlesource.com/15115
    Reviewed-by: Andrew Gerrand <adg@golang.org>
    Run-TryBot: Andrew Gerrand <adg@golang.org>
    TryBot-Result: Gobot Gobot <gobot@golang.org>

commit d1b1487a64be2e3bcd882d03f909c4617403c43d
Author:     Joe Tsai <joetsai@digital-static.net>
AuthorDate: Thu Sep 17 16:07:38 2015 -0700
Commit:     David Symonds <dsymonds@golang.org>
CommitDate: Wed Sep 23 23:55:13 2015 +0000

    archive/tar: remove dead code with USTAR path splitting

    Convert splitUSTARPath to return a bool rather than an error since
    the caller never ever uses the error other than to check if it is
    nil. Thus, we can remove errNameTooLong as well.

    Also, fold the checking of the length <= fileNameSize and whether
    the string is ASCII into the split function itself.

    Lastly, remove logic to set the MAGIC since that's already done on
    L200. Thus, setting the magic is redundant.

    There is no overall logic change.

    Updates #12638

    Change-Id: I26b6992578199abad723c2a2af7f4fc078af9c17
    Reviewed-on: https://go-review.googlesource.com/14723
    Reviewed-by: David Symonds <dsymonds@golang.org>
    Run-TryBot: David Symonds <dsymonds@golang.org>

@bradfitz
Copy link
Contributor Author

I think in aggregate it's worth calling out, even if not each one. If people notice problems with tar files in Go 1.5 vs Go 1.6, it'd be nice to have something in the release notes saying suggesting "yeah, a bunch of work went on there, maybe we fixed something, maybe we introduced a new bug".

@dsnet
Copy link
Member

dsnet commented Dec 17, 2015

IMHO, this is the most user visible change:

  • dd5e14a archive/tar: properly handle header-only "files" in Reader
    • This is an API change where no data may be read for certain Header.Typeflag types (regardless of what the Header.Size says). This fixed b/25418768.

These changes allow tar Reader to detect certain bad files. Previously, it would possibly return junk (but no error), but now it actually errors with an error.

  • e3b615f archive/tar: detect truncated files
  • 281eabe archive/tar: add missing error checks to Reader.Next
  • 02d2db1 archive/tar: make Reader.Read errors persistent
  • 481eedc archive/tar: properly format GNU base-256 encoding
  • 5ebb374 archive/tar: properly parse GNU base-256 encoding

These changes harden the Reader against files that would be obviously wrong (maliciously crafted), so I don't anticipate it failing to parse legitimate tar files. Previously, the Reader would panic, spin forever, or allocate large amounts of memory.

  • 79480ca archive/tar: fix bugs with sparseFileReader
  • e4add8d archive/tar: fix numeric overflow issues in readGNUSparseMap0x1
  • 7823197 archive/tar: fix issues with readGNUSparseMap1x0

These changes clean up and prepare the code for future changes (not worth mentioning).

  • b8a1292 archive/tar: convert Reader.Next to be loop based
  • 38f8e44 archive/tar: move parse/format methods to standalone receiver
  • cfb116d archive/tar: expand abilities of TestReader
  • d1b1487 archive/tar: remove dead code with USTAR path splitting
  • 2ae895c archive/tar: spell license correctly in example

@dsnet
Copy link
Member

dsnet commented Dec 17, 2015

I realized it would be fitting to mention this behavioral change in the documentation of Reader.Read for header-only files. I sent out CL/17913.

Also, I only fixed header-only issues on the Reader side since a bug was filed for it (b/25418768). But the Writer still allows users to generate invalid tar files (by putting data in a header-only "file"). I have a fix for it, but was planning to address it next cycle. Since the Reader and Writer are now asymmetrical in this regard, should I submit a CL for fixing header-only files for Writer for go1.6?

@gopherbot
Copy link

CL https://golang.org/cl/17913 mentions this issue.

rsc pushed a commit that referenced this issue Dec 17, 2015
Commit dd5e14a ensured that no data
could be read for header-only files regardless of what the Header.Size
said. We should document this fact in Reader.Read.

Updates #13647

Change-Id: I4df9a2892bc66b49e0279693d08454bf696cfa31
Reviewed-on: https://go-review.googlesource.com/17913
Reviewed-by: Russ Cox <rsc@golang.org>
@gopherbot
Copy link

CL https://golang.org/cl/18642 mentions this issue.

@golang golang locked and limited conversation to collaborators Jan 13, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants