Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

archive/zip: FileHeader.Extra API is problematic #22520

Open
dsnet opened this issue Nov 1, 2017 · 3 comments
Open

archive/zip: FileHeader.Extra API is problematic #22520

dsnet opened this issue Nov 1, 2017 · 3 comments
Labels
NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@dsnet
Copy link
Member

dsnet commented Nov 1, 2017

The FileHeader.Extra field is used by the Writer to write the "extra" field for the local file header and the central-directory file header. This is problematic because the Go implementation assumes that the extra bytes used in the two headers are the same. While is this is often the case, it is not always true.

See http://mdfs.net/Docs/Comp/Archiving/Zip/ExtraField and you will notice that it frequently describes a "Local-header version" and a "Central-header version", where the formats sometimes differ.

The Reader does not have this problem because it entirely ignores the local headers.

I haven't thought much about what the right action is moving forward, whether to deprecate this field or add new API. I just want to file this issue, so I remember to address it later.

@dsnet dsnet self-assigned this Nov 1, 2017
@dsnet dsnet added the NeedsFix The path to resolution is known, but the work has not been done. label Nov 1, 2017
@dsnet dsnet added this to the Go1.11 milestone Nov 1, 2017
@ianlancetaylor
Copy link
Contributor

@dsnet This is currently set for the 1.11 milestone. Please update as appropriate. Thanks.

@rsc
Copy link
Contributor

rsc commented Oct 6, 2020

I suggest we add ExtraCentral []byte, where a writer treats nil to mean "same as Extra".

@martin-sucha
Copy link
Contributor

martin-sucha commented Mar 14, 2021

My use case is to read the Extra field from local file headers, as httrack only writes data to the local file headers and the extra data in central directory record is empty. See also #40354 (comment)

How would adding ExtraCentral field work with readers? Would ExtraCentral be always nil when reading from the archive?

Currently when reading, we read the central directory only. Reading the local file header in addition to it is expensive as we'd need to seek to read the data for each file and most of the time users don't care about the differences in local file header/central directory. So I think we don't want to read the local file headers in Reader.init.

However, if we used ExtraCentral also for reading, then we'd probably need to change Extra to contain data from local file header instead of central directory record. That could break some existing code if an archive contains different data in local file header and central directory record.

If we add ExtraLocal field instead of ExtraCentral, we won't have this problem and the semantics when reading would be more clear (Extra is from central directory, ExtraLocal is nil because we didn't read the local file header).

@rsc rsc unassigned dsnet Jun 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

4 participants