Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

archive/tar: Header.FileInfo.Mode.IsRegular reports true for non-regular files #22903

Open
dsnet opened this issue Nov 28, 2017 · 3 comments
Open
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@dsnet
Copy link
Member

dsnet commented Nov 28, 2017

Background

A tar file is a format for archiving a filesystem and is able to represent anything including regular files, directories, block and character devices, symlinks, and hardlinks.

Over the years, many incompatible extensions to tar have become common practice. Specifically, one them is the use of pseudo-files to encode additional information about a file or a set of files. A pseudo-file is encoded in the same way as a normal file, but the Header.Typeflag is set to some special value and the contents of the file are parsed according to some format determined by the typeflag. Both the GNU and PAX formats are heavy users of this approach.

The Go implementation of tar provides automatic parsing of the metadata contained in these pseudo-files for the real file it describes. For example, Go1.1 added support for parsing long filenames for the GNU format; Go1.3 added support for parsing local PAX headers; and Go1.3 added support for parsing GNU sparse headers.

The problem

Parsing the metadata for a single file is easy to represent in the API since we can just represent the additional information in the Header struct.

However, the PAX format adds a feature called "global PAX headers" which encodes metadata intended to modify all subsequent files (#22748 indicates that this feature is actually used; e.g., git archive --format=tgz produces them).

In order to provide the metadata contained within a global PAX header, Reader.Next returns a single Header representing the pseudo-file (in go1.10, we actually parse the PAX records into Header.PAXRecords, but that is orthogonal to this issue).

The problem with global headers (and really any typeflag that the Go implementation does not recognize) is that the following reports true:

Header{Typeflag: TypeXGlobalHeader}.FileInfo().Mode().IsRegular() // true

This is the behavior on all versions of Go thus far.

The problem with IsRegular reporting true is:

  1. It lies. This is not a regular file.
  2. A user relying only on Header.FileInfo alone is unable to handle pseudo-files specially (e.g., ignoring them).

However, it is not trivial to have Header.FileInfo.Mode.IsRegular report false since the logic for os.FileMode.IsRegular only reports false for a very narrow set of types, none of which are an obvious choice for "special" files like metadata or unrecognized typeflags.

Thus, we need to make decision whether to keep or change the current behavior of Header.FileInfo.Mode.IsRegular for special files like TypeXGlobalHeader. If we change it, how do we make it report false? The only approach I see is to define a new constant (e.g., ModeSpecial) in the os package.

Thoughts?

\cc @rasky @bradfitz

@dsnet dsnet added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Nov 28, 2017
@dsnet dsnet added this to the Go1.11 milestone Nov 28, 2017
@bradfitz
Copy link
Contributor

Hm, bummer.

I'd be fine with an os.ModeSpecial defined vaguely.

We could also just document it in archive/tar, that those headers are there.

I'd actually be in favor of skipping such things by default in the tar.Reader and making even iterating over them opt-in. Might be too late to do that now. Adding an opt-out is kinda useless because if they're even aware of the tar format (I wasn't), people can do their own Header.Typeflag check and continue. But I assume most people don't know about these and also don't want to see them, so I'd be fine with trying an opt-in switch. We'd need to grep GitHub and google and see if the nobody-cares theory is correct.

@rasky
Copy link
Member

rasky commented Nov 29, 2017

My vote goes for os.ModeSpecial as well.

I think the opt-in approach is probably a good idea as well, but I think it's orthogonal to ModeSpecial. I think there's still a value in letting the user do file processing using os.FileMode rather than forcing them to switch to Header.Typeflag unless strictly required.

@dsnet dsnet self-assigned this Nov 30, 2017
@gopherbot
Copy link

Change https://golang.org/cl/94856 mentions this issue: os: add ModeUnknown flag

gopherbot pushed a commit that referenced this issue Mar 29, 2018
There is currently no way for os.FileMode.IsRegular to report false
without being one of the following types:
	ModeDir | ModeSymlink | ModeNamedPipe | ModeSocket | ModeDevice

This makes it difficult for custom implementations of os.FileInfo to return
a Mode that is explicitly not regular without resorting to setting one
of the types listed above. However, every one of the aforementioned types
are ill-suited as a general-purpose "not regular" file type.

Thus, add a ModeIrregular to serve exactly for that purpose.
The ModeIrregular type carries no information other than the fact that the
file is not regular.

Updates #22903
Fixes #23878

Change-Id: I4f34d88f960bcb014816d8e7b5de8b1035077948
Reviewed-on: https://go-review.googlesource.com/94856
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@dsnet dsnet modified the milestones: Go1.11, Unplanned May 3, 2018
@rsc rsc unassigned dsnet Jun 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

4 participants