Add ability to get full commit hash using go mod #34745

Closed
pranayCodes opened this issue Oct 7, 2019 · 12 comments
Labels
FeatureRequest FrozenDueToAge modules WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

@pranayCodes

What version of Go are you using (go version)? - 1.13

$ go version
go version go1.13 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

amd64

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/mankad/Library/Caches/go-build"
GOENV="/Users/mankad/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/mankad/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/mankad/go/athens/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/gr/y6wkhscx6mb8nnkvqp48g83h0000gn/T/go-build496226794=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

This is not an issue -- this is an enhancement request

What did you expect to see?

github.com/xi2/xz 48954b6210f8d154cb5f8484d3a3e1f83489309e

What did you see instead?

github.com/xi2/xz v0.0.0-20171230120015-48954b6210f8

@bcmills
Contributor

bcmills commented Oct 7, 2019

What's the use-case? (Why is this important enough to build into the go command, rather than a third-party tool?)

CC @jayconrod

@bcmills bcmills added FeatureRequest modules WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Oct 7, 2019
@bcmills bcmills added this to the Unplanned milestone Oct 7, 2019
@pranayCodes
Author

The use case is identifying commits by the same hashes that upstream sources like GitHub show.

Previous package managers used with Go (Vndr, Deps) identified and downloaded specific checkouts by full commit hash, so it would be useful for Go modules to offer the same option.

For a personal use case, it helps reduce the chances of a collision during identification, working with multiple projects using different package managers.

@bcmills
Contributor

bcmills commented Oct 7, 2019

> For a personal use case, it helps reduce the chances of a collision during identification, working with multiple projects using different package managers.

The pseudo-version encodes a 12-digit hash prefix and a timestamp.

The inclusion of the timestamp makes an accidental hash collision much less likely. And if your dependencies are malicious enough to publish an intentional hash collision, you probably don't want to be depending on them in the first place.

@pranayCodes
Author

Fortunately, they aren't made to collide intentionally. We maintain an inventory for static code analysis that records these projects as Git repos, identified by commit hash. We have been cataloguing Go projects since 2015, and chose to identify repos without proper tagging by their full commit hashes.

  • From an analytics perspective, a full commit hash helps increase consistency across the inventory we have.
  • For static code analysis, it enables us to map the full hash to vulnerabilities or known risks from feeds, and brings uniformity in things as simple as information retrieval.
  • This feature would give us and others in the community an easy way to identify the correct checkout.

@bcmills
Contributor

bcmills commented Oct 8, 2019

If you're not worried about collisions, why does this need to be in the go tool? It seems like you could easily write a third-party tool that scrapes the output of go list -m, identifies the underlying repos (if they're in supported version control systems), and resolves the versions to commit hashes.

@bcmills bcmills added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Oct 8, 2019
@taikuukaits

We run "go mod graph" to generate the dependency list, but each entry carries only the pseudo-version. Finding the underlying repo for each one seems like a non-trivial task, particularly when that information is already known, just not available.

I'm curious as to what would be expected in CVE, would you expect the pseudo version or the full commit hash? I looked for an example and this CVE appears to reference a full commit hash. If that is the expected case, then it would make sense to use or at least somehow yield full hashes.

@bcmills
Contributor

bcmills commented Oct 11, 2019

@taikuukaits, I would expect a CVE to include ranges of affected releases, dates, and commits.

Given that most users should be consuming released versions, the range of releases should make it nearly trivial to identify whether a given user is affected, and users consuming pseudo-versions could start by checking the date range and then verifying the specific commit if they are near the boundary.

@rsc
Contributor

rsc commented Oct 21, 2019

The general argument "previous Go package managers did X so Go modules must too" would lead to modules being a union of all possible features. Instead we are aiming for a simplified set that will be easy to understand and work with moving forward.

There is basically zero chance of an accidental collision in a 12-hex-digit hash in typical repo sizes. And there is basically 100% chance of a malicious collision with a 40-hex-digit hash, since SHA1 is broken. So expanding to 40 digits from 12 would not accomplish anything except making the files harder to read. And of course tags like v1.2.3 record 0 hex digits.

The answer for a real guarantee about avoiding collisions is go.sum, which uses a more secure hash (SHA256) that is VCS-independent and that we can therefore update quickly as needed. The hash also applies to both pseudo-versions and tagged versions.

Given the commit hash prefix/version tag you can avoid collisions (if for some reason you think they are likely) by looking up the prefix/tag, fetching the code, calculating the go.sum checksum, and checking against the go.sum file. Or you can let the go command do this for you, which it does all the time. :-)
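
For readers curious what "calculating the go.sum checksum" involves, here is a minimal sketch of an `h1:`-style digest as I understand the scheme: a SHA-256 over a sorted list of per-file SHA-256 lines, base64-encoded. The function name `hash1`, the in-memory file map, and the sample file names are assumptions for illustration; the go command's own implementation should be treated as the reference.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"sort"
)

// hash1 computes an "h1:"-style digest over files, a map from file name
// (for example "mod@version/file.go") to file content. Names are sorted so
// the result does not depend on map iteration order.
func hash1(files map[string][]byte) string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names)

	summary := sha256.New()
	for _, name := range names {
		sum := sha256.Sum256(files[name])
		// One line per file: hex digest, two spaces, file name.
		fmt.Fprintf(summary, "%x  %s\n", sum[:], name)
	}
	return "h1:" + base64.StdEncoding.EncodeToString(summary.Sum(nil))
}

func main() {
	// Made-up module contents, for illustration only.
	files := map[string][]byte{
		"example.com/m@v1.0.0/go.mod": []byte("module example.com/m\n"),
		"example.com/m@v1.0.0/m.go":   []byte("package m\n"),
	}
	fmt.Println(hash1(files))
}
```

Because the digest is computed from file contents rather than from VCS metadata, it applies equally to tagged versions and pseudo-versions, and it does not depend on the strength of the commit hash.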

@taikuukaits

I would agree that just because it used to do it is not sufficient reason. I think it's less about the collisions themselves and more about having to translate a go-specific format into the more widely used full commit hash.

My argument would simply be that full hashes are what is present on the source repository, and full hashes are what appear in a CVE, so being able to get them out of go.mod would be helpful. The lookup has to occur somewhere, and if the information is already present and go already knows how to do it, it seems reasonable for it to occur there. I wouldn't say full hashes need to be used everywhere; even the ability to add a flag to graph would be sufficient.

For example, we could pass a proposed "--long" flag to go mod graph:
go mod graph --long

@gopherbot

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@ORESoftware

ORESoftware commented Apr 13, 2020

I am looking at this and trying to figure out what the commit is on Github for the repo highlighted in orange:

[Image: Screenshot from 2020-04-13 12-31-02]

how can I figure it out? All I want is to copy the commit hash and go to github and load it, but I can't?

@magical
Contributor

magical commented Apr 13, 2020

@ORESoftware The commit hash is 6d1c4477e6b9; however, that repo does not seem to exist.

Unlike many projects, the Go project does not use GitHub Issues for general discussion or asking questions. For questions about using Go, see https://github.com/golang/go/wiki/Questions.

This issue is closed and closed issues are not monitored. If you are still having an issue with Go, please open a new issue.

@golang golang locked and limited conversation to collaborators Apr 13, 2021

7 participants