x/pkgsite: invalid links for internal v2+ github enterprise modules #61404
Labels
NeedsInvestigation
pkgsite
What is the URL of the page with the issue?
This is a bug report about a v2+ version of a non-public repository on a self-hosted pkgsite. The URL would be of this form:
https://my-pkgsite.mycompany.internal/github.mycompany.internal/myorg/myrepo/v2
What is your user agent?
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36
Screenshot
As this is an internal code-base, we would prefer to not share a screenshot.
The primary bug is also not in the rendering of the page; it is in the link to the source code associated with a symbol.
For example, with the hypothetical repo view above:
The pages would include a link to https://github.mycompany.internal/myorg/myrepo.git/blob/v2.0.0/v2/mypkg/myfile.go
Note specifically the .git and /v2/ in this link. The latter is included unconditionally, regardless of whether it uses a major-branch or a major subdirectory pattern.
What did you do?
Running a local instance of pkgsite, I requested indexing of a v2 package on a private github repository of the form github.mycompany.internal/myorg/myrepo/v2 that follows the major branch layout. The links to the code were broken, following a pattern such as https://github.mycompany.internal/myorg/myrepo.git/blob/v2.0.0/v2/mypkg/myfile.go, which has some problems:
adjustVersionedModuleDirectory treats any error as if it were a 404.
In our internal deployment, pkgsite pulls Go modules from an internal deployment of Athens and does not have direct access to the internal git repositories. We would be happy to give it this access, but that does not appear to be an option today.
What did you expect to see?
FileURL should link to https://github.mycompany.internal/myorg/myrepo/blob/v2.0.0/mypkg/myfile.go
What did you see instead?
https://github.mycompany.internal/myorg/myrepo.git/blob/v2.0.0/v2/mypkg/myfile.go
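The expected and observed links differ in two decisions: whether the .git suffix is stripped from the repository URL, and whether the major-version directory is prepended to the file path. The following is a minimal sketch of that difference only; the layout type, the fileURL function, and its parameters are hypothetical names for illustration and are not taken from pkgsite.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// layout is a hypothetical type describing where a v2+ module lives in its repository.
type layout int

const (
	majorBranch       layout = iota // go.mod at the repository root on a v2 branch or tag
	majorSubdirectory               // go.mod under a v2/ directory
)

// fileURL builds a GitHub-style blob link (illustrative only, not pkgsite's code).
func fileURL(repoURL, tag, majorDir string, l layout, file string) string {
	// The expected link in this report has no .git suffix on the repository.
	repo := strings.TrimSuffix(repoURL, ".git")
	p := file
	if l == majorSubdirectory {
		// Only the major subdirectory layout places files under v2/.
		p = path.Join(majorDir, file)
	}
	return repo + "/blob/" + tag + "/" + p
}

func main() {
	repo := "https://github.mycompany.internal/myorg/myrepo.git"
	fmt.Println(fileURL(repo, "v2.0.0", "v2", majorBranch, "mypkg/myfile.go"))
	fmt.Println(fileURL(repo, "v2.0.0", "v2", majorSubdirectory, "mypkg/myfile.go"))
}
```

With the major branch layout this yields the link from "What did you expect to see?"; the link actually produced additionally keeps the .git suffix and adds v2/ even though the repository does not use the subdirectory layout.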
Analysis
It appears the regular expressions intended to match internal GitHub (and GitLab?) instances do not match module paths that have a major version >= 2. Therefore the repository metadata is fetched dynamically, and the repository URL is not stripped of its VCS suffix (.git).
ModuleInfo then calls adjustVersionedModuleDirectory to perform a HEAD request on the go.mod file and considers any 200 response successful, even if it is a login page (after a redirect). The repository layout (major branch vs. major subdirectory) question must be resolved by querying the repository itself. For private repositories, there does not seem to be a means to configure authentication so that pkgsite can accurately derive these answers.
Proposed solutions
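The failure mode that any fix has to address can be reproduced in isolation. The sketch below is not pkgsite code: newFakeHost and probe are hypothetical helpers, and the test server merely stands in for a private code host that redirects unauthenticated requests to a login page. It shows that a HEAD request issued with Go's default client follows the redirect and reports 200, while the same request without redirect following reports the 302.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newFakeHost returns a test server standing in for a private code host:
// every unauthenticated request is redirected to a login page that answers 200.
func newFakeHost() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/login", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/html")
		fmt.Fprint(w, "<html>Sign in</html>")
	})
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/login", http.StatusFound)
	})
	return httptest.NewServer(mux)
}

// headStatus issues a HEAD request and reports the status code of the final response.
func headStatus(c *http.Client, url string) (int, error) {
	resp, err := c.Head(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// probe returns the status seen when following redirects and when not following them.
func probe(url string) (following, direct int, err error) {
	if following, err = headStatus(http.DefaultClient, url); err != nil {
		return 0, 0, err
	}
	noFollow := &http.Client{
		// ErrUseLastResponse makes the client return the redirect response itself.
		CheckRedirect: func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	direct, err = headStatus(noFollow, url)
	return following, direct, err
}

func main() {
	srv := newFakeHost()
	defer srv.Close()
	following, direct, err := probe(srv.URL + "/myorg/myrepo/blob/v2.0.0/v2/go.mod")
	if err != nil {
		panic(err)
	}
	fmt.Println(following, direct) // 200 when following redirects, 302 otherwise
}
```

A 200 obtained this way says nothing about whether v2/go.mod exists, which is why the status code alone cannot settle the repository layout question for a private host.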
There are many ways to proceed here, one of which is to permit pkgsite to be configured with specific "code hosts."
In this case, we would configure a "code host" for github.mycompany.internal, and its configuration would supersede the regular expression matching. This "code host" could be configured with its type (e.g. GitHub Enterprise) and API credentials, which would let pkgsite query the API directly (to see if the file v2/go.mod exists) rather than relying on a plain HTTP request. Some code hosts may offer a standard means of serving a single file.
Another approach would be to use the existing RawURL, if available, so that pkgsite can affirmatively parse the go.mod file (after redirects) to ensure it is a valid go.mod for the package being fetched. However, this would still require solving the authentication issue.
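To make the second approach concrete, here is a minimal sketch of what "affirmatively parse the go.mod file" could mean. The declaresModule function is a hypothetical, deliberately simplified check written for this illustration; a real implementation would more likely rely on a proper parser such as golang.org/x/mod/modfile. It only looks for a module directive naming the expected path, which is enough to tell a go.mod apart from an HTML login page.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// declaresModule reports whether body contains a module directive naming wantPath.
// Simplified on purpose: it handles line comments and quoted paths only.
func declaresModule(body, wantPath string) bool {
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "//") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) >= 2 && fields[0] == "module" {
			// The path may be written bare, in double quotes, or in backquotes.
			return strings.Trim(fields[1], "\"`") == wantPath
		}
	}
	return false // no module directive found, e.g. an HTML login page
}

func main() {
	goMod := "module github.mycompany.internal/myorg/myrepo/v2\n\ngo 1.20\n"
	login := "<html><body>Sign in to continue</body></html>"
	want := "github.mycompany.internal/myorg/myrepo/v2"

	fmt.Println(declaresModule(goMod, want)) // true
	fmt.Println(declaresModule(login, want)) // false
}
```

A check like this would reject the login page that currently passes as a successful response, but as noted above it does not by itself give pkgsite access to the private repository.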