Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

archive/tar: overlong linkname and USTAR header, go-1.10 behavioral change/regression #24821

Closed
lucab opened this issue Apr 12, 2018 · 5 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@lucab
Copy link
Contributor

lucab commented Apr 12, 2018

Go version

go version go1.10.1 linux/amd64
(current latest stable release)
GOARCH="amd64"
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"

Any older toolchain (e.g. go1.9) is not affected by this bug.

Bug output/reproducer

Reproducer:

go get github.com/appc/docker2aci
docker2aci -debug docker://docker.elastic.co/kibana/kibana:6.1.3

Error:

Error: conversion error: error generating ACI: archive/tar: cannot encode header:
Format specifies USTAR; and USTAR cannot encode Linkname="rootfs/var/lib/yum/yumdb/a/6f8ba1fc41ecc5f60aac5b4b7aa28d2566c55481-audit-libs-2.7.6-3.el7-x86_64/from_repo"

Additional info

It looks like the archive/tar rework in go-1.10 introduced a regression in header format when an overlong filename/linkname is present.
This affects docker2aci, as rebuilding the same sourcecode with a newer toolchain results in a runtime error. Please note that the consuming code doesn't try to enforce USTAR, and just relies on archive/tar heuristics.

This can be in theory easily workarounded on our side (see https://github.com/appc/docker2aci/pull/260/files), but this behavioral change likely affects other projects as well and addressing it will introduce toolchain-specific codepaths.

I think it should be investigated and possibly addressed in archive/tar itself.

/cc @dsnet @zhsj

lucab added a commit to lucab/coreos-overlay that referenced this issue Apr 12, 2018
This downgrade the golang toolchain used to build rkt, keeping it at
go1.9.
This is a temporary workaround for a go-1.10 regression in `archive/tar`.

Ref: coreos/bugs#2402
Ref: appc/docker2aci#259
Ref: golang/go#24821
lucab added a commit to lucab/coreos-overlay that referenced this issue Apr 12, 2018
This downgrades the golang toolchain used to build rkt, keeping it at
go1.9.
This is a temporary workaround for a go-1.10 regression in `archive/tar`.

Ref: coreos/bugs#2402
Ref: appc/docker2aci#259
Ref: golang/go#24821
lucab added a commit to lucab/coreos-overlay that referenced this issue Apr 12, 2018
This downgrades the golang toolchain used to build rkt, keeping it at
go1.9.
This is a temporary workaround for a go-1.10 regression in `archive/tar`.

Ref: coreos/bugs#2402
Ref: appc/docker2aci#259
Ref: golang/go#24821
@andybons andybons added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Apr 12, 2018
@andybons andybons added this to the Go1.11 milestone Apr 12, 2018
@dsnet
Copy link
Member

dsnet commented Apr 12, 2018

Hi, thanks for the report.

The heuristics that archive/tar used to choose which format has always been problematic. Someone may want PAX (and must never use GNU), someone else wants GNU (and must never use PAX). The inability to specify the exact format was the source of other bugs.

The error you are getting seems seems to show that this is working as intended. The header was read in (as USTAR presumably), modified in some way, and written back out. Only the modification happened in such a way that it went beyond the limitations of USTAR, and so the Writer rightly complains that it can't use USTAR anymore.

As a general rule of thumb, the read-modify-write cycle on a tar.Header is not going to be forwards compatible as a result of different fields that may be added. Starting in Go1.10, we started to document on tar.Header:

For forward compatibility, users that retrieve a Header from Reader.Next, mutate it in some ways, and then pass it back to Writer.WriteHeader should do so by creating a new Header and copying the fields that they are interested in preserving.

However, this observation has held even prior to G1.10 and was documented to encourage more forward compatible uses of tar.

zhsj added a commit to zhsj/docker2aci that referenced this issue Apr 12, 2018
After manipulating the Name and Linkname, the orignal Header format may
be invalid.

Start from Go 1.10, the Header format is stored in the Header struct.

>For forward compatibility, users that retrieve a Header from Reader.Next,
>mutate it in some ways, and then pass it back to Writer.WriteHeader should
>do so by creating a new Header and copying the fields that they are
>interested in preserving.

The copied fields are taken from all the fields presented in Go 1.6.

Ref: golang/go#24821
Closes: appc#259
Signed-off-by: Shengjing Zhu <i@zhsj.me>
@lucab
Copy link
Contributor Author

lucab commented Apr 12, 2018

Thanks for the quick response. So the outcome seems to be that this breaking regression is a unfortunate by-product of the package rework, but it is intended to stay and consumers should deploy their own toolchain-dependent fixes, right?

As much as I appreciate the technical rework and I am fine with fixing the consumers, I'd like to point out that this explicitly goes against the Go 1 promise:
"It is intended that programs written to the Go 1 specification will continue to compile and run correctly, unchanged, over the lifetime of that specification".

If archive/tar is supposed to be iterated on further, perhaps it would be wiser to move it under some x/-repo?

zhsj added a commit to zhsj/docker2aci that referenced this issue Apr 12, 2018
After manipulating the Name and Linkname, the orignal Header format may
be invalid.

Starting from Go 1.10, the Header format is stored in the Header struct.

>For forward compatibility, users that retrieve a Header from Reader.Next,
>mutate it in some ways, and then pass it back to Writer.WriteHeader should
>do so by creating a new Header and copying the fields that they are
>interested in preserving.

The copied fields are taken from all the fields presented in Go 1.6.

Ref: golang/go#24821
Closes: appc#259
Signed-off-by: Shengjing Zhu <i@zhsj.me>
@dsnet
Copy link
Member

dsnet commented Apr 12, 2018

It's here to stay unless there's a clean fix that addresses the problem, but I don't see one.

I don't think it's fair to only quote that part of the Go1 compatibility promise. Below that statement is list of conditions:

  • Unspecified behavior. The Go specification tries to be explicit about most properties of the language [and libraries], but there are some aspects that are undefined. Programs that depend on such unspecified behavior may break in future releases.
  • Bugs. If a compiler or library has a bug that violates the specification, a program that depends on the buggy behavior may break if the bug is fixed. We reserve the right to fix such bugs.

The general issue regarding tar output format has been straddling the lines between unspecified behavior and downright buggy behavior. As a result, the inability to specify the desired format has been the source of more bugs. The tar package is going to focus on making sure read-only and write-only works as bug-free as possible. This is the first time I've heard of a read-modify-write use-case in practice, which has never really quite worked properly either in all edge-cases. It be nice to have read-modify-write be as backwards compatible too, but not at the expense of compromising writer-only correctness.

@lucab
Copy link
Contributor Author

lucab commented Apr 12, 2018

That makes sense. Sorry if my comment sounded harsh/unfair, it wasn't meant to. I'm going ahead closing this bug and fixing the consuming code. Thanks for your work and feedback!

@lucab lucab closed this as completed Apr 12, 2018
@dsnet
Copy link
Member

dsnet commented Apr 12, 2018

That makes sense. Sorry if my comment sounded harsh/unfair, it wasn't meant to.

Sounded respectful to me. Thanks for raising the issue. Knowing that a read-modify-write use-case exists helps guide design decisions in the future.

lucab added a commit to lucab/coreos-overlay that referenced this issue Apr 13, 2018
This downgrades the golang toolchain used to build rkt, keeping it at
go1.9.
This is a temporary workaround for a go-1.10 regression in `archive/tar`.

Ref: coreos/bugs#2402
Ref: appc/docker2aci#259
Ref: golang/go#24821
@golang golang locked and limited conversation to collaborators Apr 12, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants