Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/build/cmd/gopherbot: -2 any CLs with large binary blobs unless declared in commit message #10658

Open
josharian opened this issue May 1, 2015 · 21 comments
Labels
Builders x/build issues (builders, bots, dashboards) FeatureRequest help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@josharian
Copy link
Contributor

josharian commented May 1, 2015

See CL 9560 for context.

There will be cases in which it is legit to add a binary file. In such cases, we could do a force commit. Other suggestions welcomed.

Edit by @dmitshur: This also happened in CL 149604, see #28899.

@josharian josharian added this to the Unreleased milestone May 1, 2015
@bradfitz
Copy link
Contributor

bradfitz commented May 1, 2015

Instead of a force commit, we can say that adding a large binary files requires an explicit magic phrase in the commit message, as acknowledgment of intent.

@nightlyone
Copy link
Contributor

A simple thing would be, to list the to be added binary files in the commit message, to make it visible and communicate intent.

@mwhudson
Copy link
Contributor

mwhudson commented May 3, 2015

I don't know if git-codereview is the best place to enforce this, as its use is not mandatory (I don't use it any more, for example), Gerrit would be better. Codereview would be better than nothing, mind.

@josharian
Copy link
Contributor Author

Slightly off-topic, but why not, @mwhudson? What do you do instead?

If this were added, it would be a pre-commit hook, so if you use the hooks, it'd be picked up.

I agree that we should enforce some simple size rules on the gerrit side, but doing it here too lets us put in more subtle and more restrictive detection logic, which the user can intentionally overrule. I like @nightlyone's suggestion for that. Overruling a gerrit check is way beyond a reviewer sanity-check.

@mwhudson
Copy link
Contributor

mwhudson commented May 3, 2015

On 4 May 2015 at 05:26, Josh Bleecher Snyder notifications@github.com
wrote:

Slightly off-topic, but why not, @mwhudson https://github.com/mwhudson?
What do you do instead?

I think it's because I'm a medium skill git/gerrit user. If I was an
expert, I'd understand what codereview is doing in git terms, if I was a
novice I'd probably just use codereview, but at my level, I find it
confusing -- especially 'git mail' and especially when I was working with
that stack of 15 changes for shared library support. Instead, well, I just
use git (git mail becomes "git push origin HEAD:refs/for/master" and
somehow I find that easier to understand).

If this were added, it would be a pre-commit hook, so if you use the
hooks, it'd be picked up.

Ah yes I still use the hooks (although given I have "(add-hook
'before-save-hook 'gofmt-before-save)" in .emacs, all it does over the
standard gerrit hook is tell me I can't commit on master from time to time).

@adg adg changed the title x/codereview/git-codereview: don't allow binary files to be added x/review/git-codereview: don't allow binary files to be added Aug 25, 2016
@bradfitz
Copy link
Contributor

@josharian,

If this were added, it would be a pre-commit hook, so if you use the hooks, it'd be picked up.

We should do this server-side (in gopherbot-ish things) because we accept GitHub PRs too now. We should do #18548 instead.

/cc @andybons

@josharian
Copy link
Contributor Author

SGTM. #18548 is pretty busy. Let’s leave this open, so it doesn’t get lost in the noise of that issue. M

@bradfitz
Copy link
Contributor

I agree we should leave this open. This is a feature that should be implemented as part of that system.

@bradfitz bradfitz changed the title x/review/git-codereview: don't allow binary files to be added x/build/cmd/reviewbot: don't allow binary files to be added Feb 21, 2018
@gopherbot gopherbot added the Builders x/build issues (builders, bots, dashboards) label Feb 21, 2018
@bradfitz bradfitz added help wanted NeedsFix The path to resolution is known, but the work has not been done. FeatureRequest and removed Builders x/build issues (builders, bots, dashboards) labels Feb 21, 2018
@bradfitz bradfitz changed the title x/build/cmd/reviewbot: don't allow binary files to be added x/build/cmd/gopherbot: -2 any CLs with large binary blobs unless declared in commit message Nov 26, 2018
@bradfitz bradfitz self-assigned this Nov 26, 2018
@bradfitz
Copy link
Contributor

I've retitled this to reflect my new short-term implementation plan in light of all the recent accidents.

We can do this pretty easily and quickly in gopherbot by making it vote -2 on any Gerrit CL it seems with large files if they're not declared in the commit message.

/cc @dmitshur @andybons @bcmills

@bradfitz
Copy link
Contributor

Looking at the data in maintner for https://go-review.googlesource.com/c/go/+/151318 I see the diff summary:

--- PASS: TestBigCLs (15.82s)
    godata_test.go:281: file: {File:src/cmd/compile/internal/gc/noder.go Added:12 Deleted:1 Binary:false}
    godata_test.go:281: file: {File:src/internal/syscall/unix/empty.s Added:0 Deleted:5 Binary:false}
    godata_test.go:281: file: {File:src/runtime/testdata/testprog/empty.s Added:0 Deleted:5 Binary:false}
    godata_test.go:281: file: {File:test/fixedbugs/issue23311.dir/issue23311.dir Added:0 Deleted:0 Binary:true}
    godata_test.go:281: file: {File:test/fixedbugs/issue23311.dir/main.go Added:14 Deleted:0 Binary:false}
    godata_test.go:281: file: {File:test/fixedbugs/issue23311.go Added:7 Deleted:0 Binary:false}
PASS

Note that for issue2331.dir we just see:

    godata_test.go:281: file: {File:test/fixedbugs/issue23311.dir/issue23311.dir Added:0 Deleted:0 Binary:true}

No number for the size of the blob. We need to add that to maintner first. Or have gopherbot just git fetch the thing and run a local git cat-file -s.

@josharian
Copy link
Contributor Author

josharian commented Nov 28, 2018

Seems to me we should just require explicit call out for all binary files regardless of size. They’re rare and are more likely to have license issues anyway. (Unless and until this becomes annoying.)

@bradfitz
Copy link
Contributor

@josharian, worth investigating. I'll collect some stats to see how often that would've fired.

@bradfitz
Copy link
Contributor

bradfitz commented Oct 4, 2019

Another: #34688

@dmitshur dmitshur added the Builders x/build issues (builders, bots, dashboards) label Oct 4, 2019
@bradfitz
Copy link
Contributor

@bradfitz bradfitz removed their assignment Apr 24, 2020
@josharian
Copy link
Contributor Author

@dmitshur who else should be cc’d on this?

@aclements
Copy link
Member

I noticed Gerrit now has a "Checks" tab. I don't really know what it does, but it sounds like something we could use for this.

@dmitshur
Copy link
Contributor

@dmitshur who else should be cc’d on this?

Most people who should be cc'ed here already are, but @toothrot @cagedmantis @FiloSottile should also be aware of this issue.

I noticed Gerrit now has a "Checks" tab. I don't really know what it does, but it sounds like something we could use for this.

It's a feature that was added to Gerrit quite recently (sometime in 2019 I believe). It does seem like it could be a good option to consider. It would need to be investigated whether we can and what it takes. I think it's https://gerrit.googlesource.com/plugins/checks, but I have a hard time finding more documentation about it. There was one example of a Gerrit CL I found that has 3 checks.

@dmitshur dmitshur added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. and removed NeedsFix The path to resolution is known, but the work has not been done. labels Apr 24, 2020
@ianlancetaylor
Copy link
Contributor

Another case: #40740.

@kevinburke
Copy link
Contributor

kevinburke commented Aug 12, 2020

At the very least, if you had the bot post a comment ("hi you have a very large file in here, please confirm that you really wanted to do this"), it would probably get the human reviewers to spend an extra second or two reviewing the patch contents before submit, and that would be easier to add than some sort of automated commit message parser.

@ianlancetaylor
Copy link
Contributor

I see that there are many implementation ideas here already. That said, here is another one. Something somewhere prevents me from mailing a CL that has a crypto key, unless I use -nokeycheck. As far as I can tell that is on the Gerrit server side, not part of the git-codereview command. So, can we have -nobinarycheck?

@dmitshur
Copy link
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Builders x/build issues (builders, bots, dashboards) FeatureRequest help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

9 participants