sum.golang.org: /latest is slow to return updated tree state #66267

Closed
mjl- opened this issue Mar 12, 2024 · 8 comments
Labels: NeedsInvestigation, proxy.golang.org

@mjl-

mjl- commented Mar 12, 2024

Go version

n/a

Output of go env in your module/workspace:

n/a

What did you do?

Request https://sum.golang.org/latest periodically to stay up to date with the sumdb tree state.

Some context:
I wrote a tool that tracks sum.golang.org, lets you subscribe to module paths/versions, and sends email notifications when they appear. See https://www.gopherwatch.org/ for the online service, and https://github.com/mjl-/gopherwatch for the code.
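
For readers unfamiliar with the polling described above, here is a minimal sketch of it in Go. It is not gopherwatch's actual code. It assumes the /latest response is a signed note whose first three lines are the text "go.sum database tree", the tree size, and the base64 root hash. The names `treeHead`, `parseTreeHead`, `fetchLatest`, and `pollLatest` are made up for illustration, and the sketch does not verify the note's signature, which a real monitor would have to do (for example with golang.org/x/mod/sumdb/note).

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// treeHead is the unverified tree state taken from a /latest response.
type treeHead struct {
	N    int64  // number of records in the log
	Hash string // base64-encoded root hash
}

// parseTreeHead reads the first three lines of a tree note. It assumes the
// layout "go.sum database tree", tree size, root hash, and it ignores the
// signature lines that follow the blank line.
func parseTreeHead(note string) (treeHead, error) {
	lines := strings.SplitN(note, "\n", 4)
	if len(lines) < 3 || lines[0] != "go.sum database tree" {
		return treeHead{}, errors.New("not a go.sum database tree note")
	}
	n, err := strconv.ParseInt(lines[1], 10, 64)
	if err != nil {
		return treeHead{}, fmt.Errorf("bad tree size: %w", err)
	}
	return treeHead{N: n, Hash: lines[2]}, nil
}

// fetchLatest requests /latest once and parses the response body.
func fetchLatest() (treeHead, error) {
	resp, err := http.Get("https://sum.golang.org/latest")
	if err != nil {
		return treeHead{}, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return treeHead{}, fmt.Errorf("unexpected status %s", resp.Status)
	}
	body, err := io.ReadAll(io.LimitReader(resp.Body, 1<<16))
	if err != nil {
		return treeHead{}, err
	}
	return parseTreeHead(string(body))
}

// pollLatest calls fetch up to rounds times, sleeping interval between
// calls, and reports a tree head only when the tree has grown.
func pollLatest(fetch func() (treeHead, error), interval time.Duration, rounds int, onNew func(treeHead)) {
	var last treeHead
	for i := 0; i < rounds; i++ {
		if i > 0 {
			time.Sleep(interval)
		}
		th, err := fetch()
		if err != nil {
			fmt.Println("fetch failed:", err)
			continue
		}
		if th.N > last.N {
			last = th
			onNew(th)
		}
	}
}

func main() {
	// Three rounds keep the demo finite; a long-running monitor would loop
	// indefinitely. The 10-minute interval mirrors the issue description.
	pollLatest(fetchLatest, 10*time.Minute, 3, func(th treeHead) {
		fmt.Printf("new tree state: size=%d hash=%s\n", th.N, th.Hash)
	})
}
```

With a cached /latest, several consecutive rounds return the same tree size, so `onNew` fires less often than the polling interval suggests. That is the behaviour this issue reports.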

What did you see happen?

https://sum.golang.org/latest is returning the same tree state for a while, typically around 10 to 20 minutes. It seems heavily cached.

What did you expect to see?

I hoped to get updates through /latest more quickly. Another endpoint to get newly added records from could also work (perhaps even streaming, though that's more than my use case needs).

I have taken the cachedness of responses from /latest as a signal that you don't want the internet to make requests too often. So gopherwatch is currently requesting /latest every 10 to 15 minutes.

Gopherwatch does have an endpoint at /forward that requests the latest additions from index.golang.org, looks up the most recent module version on sum.golang.org to get its tree state, and then forwards the log. So it's possible to get updates more quickly, but it involves more requests. It's only done when manually requested.

CC @FiloSottile per thread on slack

@FiloSottile
Contributor

It sounds reasonable to me to have https://sum.golang.org/latest cached similarly to index.golang.org responses, if at all. Forcing monitors (who are the only ones interested in /latest) to use index.golang.org and /lookup is roundabout and inefficient.

@hyangah
Contributor

hyangah commented Mar 13, 2024

What is wrong with using index.golang.org/index if the goal is to discover newly added module versions? The index service and its internal implementation are built exactly for that purpose.

If it is for efficient module checksum verification, /lookup endpoint also includes a signed, encoded tree description that contains the entry too, so in practice, /latest doesn't need to be too perfectly fresh to support the primary use case (i.e. go command). (https://go.dev/ref/mod#checksum-database)

@FiloSottile
Contributor

A sumdb monitor that wants to verify the sumdb will generally not trust index.golang.org/index to return every module version (otherwise why have the Merkle tree at all). Using index.golang.org is just a hack to get a fresh tree head by fetching the index with a recent timestamp, taking the most recent module to /lookup, and extracting a tree head from there. It's a pretty roundabout way to do what /latest is supposed to do.

The go command uses /latest? For what? I would have said that sumdb monitoring is its primary (only?) use case.

@hyangah
Contributor

hyangah commented Mar 13, 2024

> The go command uses /latest? For what?

I mean this rather longer ttl for /latest works for our primary use case (go command) since go command doesn't rely on /latest.

Why does the sumdb monitor need to retrieve a fresh tree head in near real time? If that's really necessary for the monitor, is it that bad for it to send two queries (index.golang.org/index + /lookup)?

I am sorry to tell you this, but this is not a change that we can make easily at this moment. (Hint: we need to ensure we don't create any hotspot in our database and backends.)

@FiloSottile
Contributor

> I mean this rather longer ttl for /latest works for our primary use case (go command) since go command doesn't rely on /latest.

Well by that argument anything works for the primary use case, including not having /latest at all.

I'm a bit confused how uncached index.golang.org/index + /lookup could be less of a hotspot than uncached /latest, but I don't know enough about the backend.

The delay is a UX issue, not a dealbreaker, AFAIK.

@hyangah
Contributor

hyangah commented Mar 13, 2024

> I'm a bit confused how uncached index.golang.org/index + /lookup could be less of a hotspot than uncached /latest, but I don't know enough about the backend.

Hmm, you are right. I imagined index.golang.org/index would have a more efficient serving path since it's already being used by many other services, but it's nothing more than a secondary index lookup. @suzmue said it may be possible to shorten the TTL. We will investigate it.

@cherrymui added the NeedsInvestigation label Mar 14, 2024
@cherrymui added this to the Unreleased milestone Mar 14, 2024
@hyangah
Contributor

hyangah commented Apr 4, 2024

TTL shortened. (Thanks @suzmue )

@hyangah closed this as completed Apr 4, 2024
@mjl-
Author

mjl- commented Apr 4, 2024

Thanks! I seem to be getting a new tree state every five minutes now (noticed it a few days ago), certainly an improvement.

5 participants