Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/build/maintner: GitHub issue events are not consistent with API values #29396

Open
andybons opened this issue Dec 22, 2018 · 27 comments
Open
Labels
Builders x/build issues (builders, bots, dashboards) help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@andybons
Copy link
Member

Moved from #28226

There are some issues that have events missing within the maintner corpus. This makes it impossible to create an accurate milestone burndown chart where you want to query for the state of an issue at a particular time window.

A few examples of issues in maintner that have incomplete event lists:

=== Issue events for golang.org/issues/28559
             labeled	milestone:         	label:Testing
             labeled	milestone:         	label:help wanted
             labeled	milestone:         	label:OS-OpenBSD
             labeled	milestone:         	label:Builders
             labeled	milestone:         	label:NeedsInvestigation
          milestoned	milestone:   Go1.12	label:

It does not record the final “closed” event: https://api.github.com/repos/golang/go/issues/28559/events

=== Issue events for golang.org/issues/28306
           mentioned	milestone:         	label:
          subscribed	milestone:         	label:
           mentioned	milestone:         	label:
          subscribed	milestone:         	label:
            assigned	milestone:         	label:
             labeled	milestone:         	label:Documentation
             labeled	milestone:         	label:NeedsInvestigation
          milestoned	milestone:   Go1.12	label:
             renamed	milestone:         	label:

The above event log is missing a few milestone-related events: https://api.github.com/repos/golang/go/issues/28306/events

@andybons andybons added help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Dec 22, 2018
@andybons andybons added this to the Unreleased milestone Dec 22, 2018
@gopherbot gopherbot added the Builders x/build issues (builders, bots, dashboards) label Dec 22, 2018
@Skarlso
Copy link
Contributor

Skarlso commented Dec 30, 2018

@andybons Hi! I worked on the maintner previously. I'm happy to pick this up for investigation.

@andybons
Copy link
Member Author

@Skarlso go for it 👍

@Skarlso
Copy link
Contributor

Skarlso commented Dec 31, 2018

Coolness. 👍🙂

@Skarlso
Copy link
Contributor

Skarlso commented Jan 6, 2019

@andybons Hi, sorry, I was insanely busy but picking this up now. 👍

Would you be so kind as to point me into the right direction here? I think I know where to start digging.

First off, I want to write a test to reproduce this bug. Ideally a unit test something like:

func TestParseMultipleGithubEvents(t *testing.T) {

}

I think the right code part here is

func parseGithubEvents(r io.Reader) ([]*GitHubIssueEvent, error) {

... if I see that correctly. Somewhere here is the bug buried. Am I poking in the right direction? The manual way of seeing the missing event? I fired up the maintserver and located said issue, but can't find the missing event as the comment is there and the issue is closed. I'm guessing it's somewhere else?

Thanks!

@andybons
Copy link
Member Author

andybons commented Jan 7, 2019

I fired up the maintserver and located said issue, but can't find the missing event as the comment is there and the issue is closed. I'm guessing it's somewhere else?

Is the closed event there? That's different than the issue being closed. Which example are you referring to?

@Skarlso
Copy link
Contributor

Skarlso commented Jan 7, 2019

I can't see "events", I can only see that it's closed. That's why I'm asking, that I must be looking at something else? :) This is my view and the ticket in question:
screenshot 2019-01-07 at 20 41 57

@andybons
Copy link
Member Author

andybons commented Jan 7, 2019

@Skarlso take a look at https://api.github.com/repos/golang/go/issues/28559/events

Notice how there is a "closed" event. This doesn't show up in the list of events when enumerating via https://godoc.org/golang.org/x/build/maintner#GitHubIssue.ForeachEvent

@Skarlso
Copy link
Contributor

Skarlso commented Jan 7, 2019

Oh I see, so this wouldn't show up in the UI but rather through a small code. Something like:

package main

import (
	"context"
	"fmt"
	"log"

	"golang.org/x/build/maintner/godata"
)

func main() {
	corpus, err := godata.Get(context.Background())
	if err != nil {
		log.Fatalln(err)
	}
	proj := corpus.Gerrit().Project("go.googlesource.com", "go")
	cl := proj.CL(101036)
        // enumerate events here.
}

Just not the CL probably, but the Issue. Gonna check that out.

@Skarlso
Copy link
Contributor

Skarlso commented Jan 7, 2019

	corpus.GitHub().Repo("golang", "go").Issue(28559).ForeachEvent(...)

@Skarlso
Copy link
Contributor

Skarlso commented Jan 7, 2019

Alright! Reproduced with a single, small script:

Script:

package main

import (
	"context"
	"fmt"
	"log"

	"golang.org/x/build/maintner"
	"golang.org/x/build/maintner/godata"
)

func main() {
	corpus, err := godata.Get(context.Background())
	if err != nil {
		log.Fatalln(err)
	}
	corpus.GitHub().Repo("golang", "go").Issue(28559).ForeachEvent(func(e *maintner.GitHubIssueEvent) error {
		fmt.Println(e.Type)
		return nil
	})
}

Output:

❯ go run main.go
2019/01/07 21:38:10 Loading data from log *maintner.netMutSource ...
2019/01/07 21:38:11 Downloading 554570 bytes of https://maintner.golang.org/logs/44 ...
2019/01/07 21:38:12 wrote /Users/hannibal/Library/Caches/golang-maintner/0044.growing.mutlog
2019/01/07 21:38:21 Reloaded data from log *maintner.netMutSource.
labeled
labeled
labeled
labeled
labeled
milestoned

Close event is indeed missing.

@Skarlso
Copy link
Contributor

Skarlso commented Jan 7, 2019

The problem is probably further up the chain but going to write a unit test for ForeachEvent anyways...

@Skarlso
Copy link
Contributor

Skarlso commented Jan 9, 2019

So... I have a hunch...

if res.StatusCode != http.StatusOK {
    log.Printf("Fetching %s: %v: %+v", u, res.Status, res.Header)
    // TODO: rate limiting, etc.
    return nil, nil, fmt.Errorf("%s: %v", u, res.Status)
}

Maybe. :)

I would have to see the logs. Can I access some logs somewhere? @andybons

@andybons
Copy link
Member Author

@Skarlso can you provide more context as to what your hunch is? I can’t surmise what you’re trying to figure out. Whether we’re being rate limited? If we were, then a lot of other requests would also fail subsequently and we’d see a lot more issues.

@Skarlso
Copy link
Contributor

Skarlso commented Jan 10, 2019

@andybons I see. I've begun writing unit tests around some of the functions.. Admittedly it's a bit slow going because the setup isn't trivial, but I'm getting there.

I'm tracing back where the events are coming from. The ForeachEvent function works fine. The events it gets are already not containing the missing events... So going back from there, we eventually end up here: https://github.com/golang/build/blob/master/maintner/github.go#L2023. (This is also interesting: https://github.com/golang/build/blob/master/maintner/github.go#L1990. And this: https://github.com/golang/build/blob/master/maintner/github.go#L2005)

And this is the main function that gets the events in the first place. And there are a lot of error possibilities here and this is largely untested code. So my hunch was that there is an error at some point in the function not necessarily because this function might be wrong. But.... I'm writing the unit test for it which either way sheds some light on the matter. Hopefully.

EDIT: The other hunch is my first hunch that I'm following. That parseGithubEvents is incorrectly parsing something. I begun writing a test for that. :D It's just tedious. :)

@Skarlso
Copy link
Contributor

Skarlso commented Jan 10, 2019

Alright first test is good: golang/build@master...Skarlso:issue_29396_github_events

@Skarlso
Copy link
Contributor

Skarlso commented Jan 10, 2019

@Skarlso reminder I should actually chuck the full failing event stream in there. Also, use a fixture as it gets really really large.

@gopherbot
Copy link

Change https://golang.org/cl/157437 mentions this issue: x/build/maintner: GitHub issue events are not consistent with API values

@Skarlso
Copy link
Contributor

Skarlso commented Jan 12, 2019

@andybons Alright. So... In the attached CL I added a bunch of tests around every sync event that happens that parses an incoming github event directly parsed from the API.

If you are aware of any part of the code any where else this could still fail, please tell me.

These tests all pass. What I also added is a retry logic around rate limiting.

I think right now the CL is solid as in adds a bit of logging and a lot of test cases for the area of the code that was not tested before. As is I think it's ready to merge, even though it does not fix the problem yet. ( unless the rate limiting does kick in from time to time, even though you are saying then other stuff should fail. ).

I have not stopped investigating, I just merely proved that THIS part of the code DOES work and correctly parses event logs.

@Skarlso
Copy link
Contributor

Skarlso commented Jan 27, 2019

@andybons Hi. :) Any thoughts on the above? :)

@andybons
Copy link
Member Author

I will let the owners of the code weigh in on the review.

@Skarlso
Copy link
Contributor

Skarlso commented Jan 28, 2019

Cool, thanks!

@Skarlso
Copy link
Contributor

Skarlso commented Jan 30, 2019

EDIT: Nope. :( Red Herring.

OMG, I think I reproduced it:

M before:  owner:"foowner" repo:"bar" number:1001 event:<id:1943301510 event_type:"labeled" actor_id:5200974 created:<seconds:1541174039 > label:<name:"Testing" > > event:<id:1943301512 event_type:"labeled" actor_id:5200974 created:<seconds:1541174039 > label:<name:"help wanted" > > event:<id:1943301514 event_type:"labeled" actor_id:5200974 created:<seconds:1541174039 > label:<name:"OS-OpenBSD" > > event:<id:1943301515 event_type:"labeled" actor_id:5200974 created:<seconds:1541174039 > label:<name:"Builders" > > event:<id:1943301516 event_type:"labeled" actor_id:5200974 created:<seconds:1541174039 > label:<name:"NeedsInvestigation" > > event:<id:1943301518 event_type:"milestoned" actor_id:5200974 created:<seconds:1541174039 > milestone:<title:"Go1.12" > > event:<id:1943494698 event_type:"closed" actor_id:3194333 created:<seconds:1541179673 > > event_status:<server_date:<seconds:1548829177 > >
CORPUS after:  &{0x196ae88 <nil> false  false {{1 0} 0 0 -1073741824 0} false false map[Testing:Testing help wanted:help wanted OS-OpenBSD:OS-OpenBSD Builders:Builders NeedsInvestigation:NeedsInvestigation Go1.12:Go1.12] map[] 0xc0000b6800 <nil> [] [] <nil> {0 0 <nil>} [] map[] map[] map[] map[] map[]}

In the corpus the closed event from above is missing!!!!!

greatsuccess

The bug looks like is in processGithubIssueMutation. Next step, writing a unit test to prove, then fix. 👍

@Skarlso
Copy link
Contributor

Skarlso commented Jan 30, 2019

Disregard the above. :/ Those are labels. The last close event didn't have labels. Continuing on..

@Skarlso
Copy link
Contributor

Skarlso commented Feb 7, 2019

@andybons Hi. I opened a separate issue with the test coverage increase and the github throttle handling.

I searched for a long time, I believe it's either a race condition in sync logic, a timing issue, or github throwing some kind of error somewhere which is not being handled.

I created an issue which fixes the throttling which might fix this issue, but I ran out of ideas. Without logs that I could investigate on what's happening this is an insanely hard task with my code base knowledge.

Sorry, I couldn't help out. :/

@dmitshur
Copy link
Contributor

dmitshur commented Feb 8, 2019

Without logs that I could investigate on what's happening this is an insanely hard task with my code base knowledge.

Sorry, I couldn't help out.

Understandable, and no problem.

On the contrary, it is very helpful that you've investigated this far and shared your findings. This information will make it easier to make further progress towards identifying the root cause. Thank you very much @Skarlso, your efforts are greatly appreciated!

@Skarlso
Copy link
Contributor

Skarlso commented Feb 8, 2019

@dmitshur Thank you! Happy that I could contribute something. 😊

@gopherbot
Copy link

Change https://golang.org/cl/199638 mentions this issue: cloudfns: add issueswebhook to store issue events in GCS

gopherbot pushed a commit to golang/build that referenced this issue Oct 10, 2019
To do historical analysis, it is helpful to have a corpus of
events data from GitHub. Store all GitHub events in a GCS
bucket until golang/go#29396 is fixed.

Updates golang/go#29396

Change-Id: Ia0172db93e019a8d6ab4ccf6e195f3a3e15583bd
Reviewed-on: https://go-review.googlesource.com/c/build/+/199638
Run-TryBot: Andrew Bonventre <andybons@golang.org>
Reviewed-by: Alexander Rakoczy <alex@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
codebien pushed a commit to codebien/build that referenced this issue Nov 13, 2019
To do historical analysis, it is helpful to have a corpus of
events data from GitHub. Store all GitHub events in a GCS
bucket until golang/go#29396 is fixed.

Updates golang/go#29396

Change-Id: Ia0172db93e019a8d6ab4ccf6e195f3a3e15583bd
Reviewed-on: https://go-review.googlesource.com/c/build/+/199638
Run-TryBot: Andrew Bonventre <andybons@golang.org>
Reviewed-by: Alexander Rakoczy <alex@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Builders x/build issues (builders, bots, dashboards) help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants