Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/metrics: use explicit histogram boundaries (API fix) #43443

Closed
mpx opened this issue Dec 31, 2020 · 5 comments
Closed

runtime/metrics: use explicit histogram boundaries (API fix) #43443

mpx opened this issue Dec 31, 2020 · 5 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. okay-after-beta1 Used by release team to mark a release-blocker issue as okay to resolve either before or after beta1 release-blocker
Milestone

Comments

@mpx
Copy link
Contributor

mpx commented Dec 31, 2020

This issue refers the current implementation that includes post go1.16beta1 fixes:

go version devel +95ce805d14 Thu Dec 31 02:24:55 2020 +0000 linux/amd64

Proposal: #37112
API Audit: #43407

I've found that using implicit -inf/+inf boundaries with the runtime/metrics.Float64Histogram API is a bit awkward (compared to using explicit boundaries). Implicit boundaries:

  • Require extra logic for the caller since the start/end boundaries need to be interpreted differently
  • The first bucket (-inf) is usually nonsensical/wasted and ignored (negative allocation size, negative duration).

I think it would be much better making the boundaries explicit:

  • Much simpler for API callers to interpret since there is only a single rule and no need to synthesize bucket ranges
  • Metrics can choose the boundaries that make the most sense (math.Inf(-1) is a valid floating point number). Eg, it is impossible to allocate -10 bytes - using [1, 9) or [0, 9) for the first bucket makes more sense.

In practice, the total number of float64s could be the same since the first bucket can usually be dropped. Eg:

Assuming the following counts / buckets:

  • [0, 10): 1
  • [10, 20): 2
  • [20, +inf): 3

Current implicit encoding:

  • Counts: 0, 1, 2, 3
  • Buckets: 0, 10, 20

Proposed explicit encoding:

  • Counts: 1, 2, 3
  • Buckets: 0, 10, 20, +inf

Even if there is a reason to keep the unused -inf bucket, making the boundaries explicit provides more future flexibility and makes it easier for callers.

For comparison, the current API is documented as:

// count[0] is the weight of the range (-inf, bucket[0])
// count[n] is the weight of the range [bucket[n], bucket[n+1]), for 0 < n < N-1
// count[N-1] is the weight of the range [bucket[N-1], inf)

Note: this is actually incorrect, it should be [bucket[n-1], bucket[n]), for 0 < n < N-1.

The explicit boundary API would be:

// count[n] is the weight of the range [bucket[n], bucket[n+1]), for 0 <= n < N

Simplifying the documentation crudely indicates using explicit boundaries provides a cleaner API.

Cc @mknyszek

Sorry about the report late in the cycle, but I think this would be a much better API for the future and it's a simple fix now.

@mknyszek
Copy link
Contributor

mknyszek commented Jan 5, 2021

Thank you so much for taking such a close look at this!

I think I agree with having totally explicit bucket boundaries. Reasoning about indicies is more straightforward for the user this way. I suspect I did this at first to ensure all histograms were sufficiently general, but you're right that some things just really don't make sense (e.g. negative bytes).

I do think there is a reason to keep -inf for time-based histograms (unfortunately negative durations can and do arise in practice even in places you might not expect, as I found recently... time is hard) but as you say that can just be explicit.

CC @prattmic @aclements

@toothrot toothrot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jan 5, 2021
@mknyszek mknyszek self-assigned this Jan 5, 2021
@toothrot toothrot added this to the Backlog milestone Jan 5, 2021
@mknyszek mknyszek added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jan 5, 2021
@mknyszek mknyszek modified the milestones: Backlog, Go1.16 Jan 5, 2021
@mknyszek mknyszek added okay-after-beta1 Used by release team to mark a release-blocker issue as okay to resolve either before or after beta1 release-blocker labels Jan 5, 2021
@aclements
Copy link
Member

Making all the boundaries explicit also seems pretty reasonable to me. And I certainly like how it simplifies the documentation, especially since I remember spending a decent amount of time staring that those exact sentences and deciding they were right!

@prattmic
Copy link
Member

prattmic commented Jan 6, 2021

I played with this suggestion a bit by writing some sample code to print out a histogram using three different APIs:

  1. The existing API at HEAD.
  2. The API proposed above by @mpx.
  3. An API where Buckets provides a complete [lower, upper) pair for each bucket (Buckets [][2]float64).

See the code https://gist.github.com/prattmic/c635fe88225fb0995078fa3444456ce6.

IMO, (2) and (3) simplify the code quite nicely and I think I'd prefer writing to that simpler API. Before writing any code, I thought (3) might be even nicer, but in practice it doesn't seem to be any improvement over (2).

@mknyszek
Copy link
Contributor

mknyszek commented Jan 6, 2021

OK, I think we're going to do this. I'm writing up a CL for this today.

@gopherbot
Copy link

Change https://golang.org/cl/281238 mentions this issue: runtime,runtime/metrics: use explicit histogram boundaries

@golang golang locked and limited conversation to collaborators Jan 8, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. okay-after-beta1 Used by release team to mark a release-blocker issue as okay to resolve either before or after beta1 release-blocker
Projects
None yet
Development

No branches or pull requests

6 participants