cmd/go: allow extraction of urls used to download dependencies #35922
/cc @bcmills @jayconrod |
A starting point could be A better approach might be I'm sure there could be better ways to handle this, though. For example, if you just want to build a subset of the module, you probably don't need to download all of the modules required directly or indirectly by the main module. |
@mvdan My thought is to create a cache, e.g. file://${DISTDIR}/go-cache which could be pointed to by GOPROXY so that when the package manager attempts to build the main module it will not need to download from the network. Is this the best way to handle this? Also, how are the paths in the cache created? |
@mvdan neither of those commands ( Given a
The package manager tracks those RHS filenames, and repopulates the expected directory structure for the FYI The package manager also captures & verifies checksums on the URLs. The trivial case for well-versioned modules I can see how to produce per the goproxy REST API, but it's the corner cases that I don't follow. E.g., this line from
That version doesn't appear in the list endpoint. |
Are there any specific ASCII characters that are NOT permitted to occur in module strings or version strings? |
Yes, I realise that. Please read the rest of my comment above. I meant these as examples to point you in the right direction, not as your perfect solution. |
@robbat2 Is it not possible to use the cache in the (I wonder if there is any magic flag in The details of the proxy protocol, including encoding, are at https://golang.org/cmd/go/#hdr-Module_proxy_protocol. The currently accepted characters and encoding rules are described in https://godoc.org/golang.org/x/mod/module#hdr-Unicode_Restrictions |
Just to confirm what @mvdan and @hyangah have said: Running You can control the location of the module cache by setting @williamh @robbat2 One thing I was a little unclear on: is there a restriction against using To make a list of URLs for that, you could run
|
You shouldn't need to cache the list or latest endpoints. Those are needed to find new versions of modules, but if |
Additionally, in the proxy protocol and within the module cache, module paths are case encoded so that the cache can be stored on a case-insensitive file system without conflict. Sorry the documentation is not in great shape right now. I'm working on a module reference specification that will include all this for Go 1.14 (#33637). |
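To illustrate the case encoding mentioned above: each uppercase ASCII letter is replaced by an exclamation mark followed by the lowercase letter. The sketch below shows only that substitution; `escapePath` is a hypothetical helper, and unlike `module.EscapePath` in golang.org/x/mod it does not validate the path.

```go
package main

import (
	"fmt"
	"strings"
)

// escapePath applies the case encoding used in the proxy protocol and
// module cache: 'A'..'Z' becomes '!' followed by the lowercase letter.
// It performs no validation of the module path (sketch only).
func escapePath(path string) string {
	var b strings.Builder
	for _, r := range path {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			r += 'a' - 'A'
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Println(escapePath("github.com/Azure/azure-sdk-for-go"))
	// prints "github.com/!azure/azure-sdk-for-go"
}
```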
|
Yes, I understood that much already, please see further below. The package manager tooling will re-create the layout of the cache for all modules specifically declared to the package manager (generated by the package maintainer based on go.mod).
Correct, the cache would be pre-populated by the package manager, in the correct layout.
Yes, that example works, but still requires network connectivity. My ask was for a trivial modification of Package maintainer steps:
User steps
Related question here. I was reviewing the |
Thanks.
Thanks, as a tidbit there: it tries to describe part of the rules:
Yes, I caught that part already. |
We can't provide a general solution for this. If there are multiple sources in the Also,
That's true: we only hash module contents, not the archives themselves. There's no promise that module zip files have stable hashes over time; for example file order or compression could change. We ignore metadata when creating and extracting zip files. (IMO, it would have been better to hash the zip files themselves, but that ship has sailed).
Maybe we can tighten that up without breaking anyone. It's technically possible to have a module path that isn't a domain name if it's only served from a proxy server (i.e., there's no need to look up the origin repository). There is code that checks that dots do not appear at the beginning or end of a path element, or consecutively. I don't think |
I think it would be helpful to step up a level so that we can understand the higher-level problem that you want to solve. Specifically, I would like to understand the need to download Downloading on the maintainer side of the workflow also seems like it would provide the required checksum stability: if the maintainer, rather than the user, downloads the files, then the maintainer can compute the package manager's checksum based on that specific instance of those files rather than relying on a specific Go proxy to serve a zipfile with exactly the same bytes. |
@bcmills Consider this situation.
Some see this as a big maintenance cost. |
@bcmills sure, looking at a higher-level is good. Problem Statement: Constraints That's probably too high-level ;-). In other languages, e.g. Perl, Python, C, the common route is to have all of the (build or runtime) dependencies installed on the host system, and then the package just uses them at runtime (interpreted or compiled against dynamic libraries) and/or build-time (compiled against static binaries). For Go, the closest representation here is the build-time model. Go has the additional complexity that packages may use differing versions of dependencies. This needs to include sharing Go module source between packages, and taking advantage of the Gentoo mirroring system (if two different Go-based packages in Gentoo both require the same version of a Go module, the files for that module should be shared). For content not in any public goproxy, the Gentoo package maintainer needs to generate the module files ( I think I have enough figured out to do the rest on the Gentoo side here. Is there easy tooling that can at least convert the full
|
@williamh, if the maintainer has to regenerate the list of dependencies to fetch anyway, it seems like the only significant difference is the need to re-download the resulting tarball. But that seems like a detail for the packaging system: it's also quite common for a bugfix in a project to change only one source file within the project, and won't the user have to re-download all of the source files for the project anyway? |
@robbat2, if you intend to share module source between packages, then it seems like you fundamentally need one of a handful of approaches. The key decision, I think, is whether you want to use the upstream Specifically, you could consider:
|
@bcmills @robbat2 First off, I hope your holidays went well. I looked around and came up with a script which can make a tarball that can be unpacked and pointed to by having the package manager set GOPATH or whatever variable comes out of #33637 during the build. The difference between my script and your option 3 above is my script uses @robbat2 What is your status? |
Thinking about this some more... you're going to need some way to inject the downloaded URL contents back into the And if you're implementing a |
@bcmills Hi! We're making implementation progress on this, but have a few followup questions:
|
After You should find that every (However, if the module author has not run |
Yes, with the caveat that if the |
Any invocation of the The |
Run |
I'll clarify the point of my recent questions: I'm trying to identify the minimal possible set of files to pre-download for any given Go package, such that it can be built offline. Ideally down to ONE file per dependency package. I'm hoping I can get away with this logic:
Given this as an example: The minimal set of files to provide to the offline environment:
Generate the following files based on the above files:
|
The proposed Gentoo eclass & sample ebuilds for building Go based on at least downloading eclass: |
I think that should be sufficient, but of course it is possible that I have missed something. |
Can you point to this existing Golang code? I tried to find it, but came up blank, wondering if it's in some other codebase. |
go/src/cmd/go/internal/modfetch/coderepo.go Lines 824 to 866 in 363bcd0
|
Note that the logic for locating the |
Go 1.18's workspaces may provide a simpler solution here, especially in conjunction with Go 1.17's module graph pruning and lazy module loading. If you construct a workspace containing all needed dependencies (at |
Hello,
I am the go package maintainer on Gentoo Linux, and I maintain several packages written in Go as well.
Our package manager does not allow network access during the build process after downloading the source for a package, so it needs to be able to download the .zip files for the modules a package needs in advance.
I believe I can download the .zip files to a path, which I will call DISTDIR, then during the build, set GOPROXY="file://${DISTDIR}" and avoid network access.
To do that, I need a way to extract all of the URLs for the .zip files for the dependencies of a package so I can put them in a list for the package manager to download.
Is there a way to do this?
Thanks much,
William
I am going to tag @robbat2 on this report also to include him since he was part of my discussion on our IRC channel.