Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

math/big: Rat: add FloatPrec() (int, bool) #50489

Closed
mitar opened this issue Jan 7, 2022 · 39 comments
Closed

math/big: Rat: add FloatPrec() (int, bool) #50489

mitar opened this issue Jan 7, 2022 · 39 comments

Comments

@mitar
Copy link
Contributor

mitar commented Jan 7, 2022

Update, Nov 13, 2023: Current API is at #50489 (comment). -gri


Currently if I parse string like 123.34 with SetString() and want to print it back to full precision, one has to manually pick the right precision for FloatString() function to get that. I propose that instead FloatString() should accept negative prec parameter. The full semantics of prec parameter would then be:

  • For positive prec, it formats with prec digits after the radix point, padding with 0 on the right if necessary. The last digit is rounded to nearest, with halves rounded away from zero, if rounding is necessary.
  • When prec is 0, there are no digits after the radix point.
  • If prec is less than 0, then:
    • If there is a finite number of digits after the radix point to precisely represent the number, all these digits are appended after the radix point.
    • If the number of digits is infinite, then abs(prec) number of digits is used after the radix point, rounded the last digit to nearest, with halves rounded away from zero.

So FloatString(3) for the number above returns 123.340, but FloatString(-3) return 123.34.

big.NewRat(1, 3).FloatString(3) returns 0.333 and big.NewRat(1, 3).FloatString(-3) would return 0.333.

@mitar mitar added the Proposal label Jan 7, 2022
@gopherbot gopherbot added this to the Proposal milestone Jan 7, 2022
@robpike
Copy link
Contributor

robpike commented Jan 7, 2022

I wrote a very long and complicated answer to this, but perhaps it's better just to state that rather than change FloatString in such a peculiar and possibly incompatible way, a better answer is to add a function that tells you how many bits of precision you'd need. The problem with that is that there is no simple answer in general, as you may have a repeating decimal, but the function could tell you that:

func (r *Rat) FloatSize() (digitsLeftOfDecimal, digitsRightOfDecimal int, repeats bool)

That solves the problem but I am far from convinced the problem is significant enough to require such a messy fix, either mine or yours. If you want to be clever, you can compute (10 gcd denominator) and get some of the answer yourself, since infinite-length representation only happens when the factors of the denominator are not all factors of the printing base.

@mitar
Copy link
Contributor Author

mitar commented Jan 7, 2022

That is a good point and I like the idea of FloatSize.

I am far from convinced the problem is significant enough to require such a messy fix

With messy fix you mean the negative precision proposal or FloatSize?

@robpike
Copy link
Contributor

robpike commented Jan 7, 2022

Both. As I said, "mine or yours".

@mitar
Copy link
Contributor Author

mitar commented Jan 7, 2022

I see.

The problem I have is that when marshaling and unmarshaling JSON with large decimal numbers as strings it is useful to have a way to parse them and then marshal them back into the same precision as they came in. Because they are parsed from a fixed length strings I know that I can format them back to one as well (so that there will be no repeats or infinite digits). The issue is only that there is no easy way to do that back conversion. On the other hand I do not want to lose precision if it is not necessary.

So having FloatSize would address that for me.

@ianlancetaylor ianlancetaylor changed the title proposal: math/big.Rat: Negative precision for full precision string representation proposal: math/big: Rat: negative precision for full precision string representation Jan 7, 2022
@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Jan 7, 2022
@mitar
Copy link
Contributor Author

mitar commented Jan 13, 2022

I made an implementation with signature func RatPrecision(rat *big.Rat) (int, int), where the first int is the number of non-repeating digits after decimal dot, and the second the number of repeating (cycling) digits which follow. Seems to work pretty well, maybe it could be included into stdlib.

@rsc
Copy link
Contributor

rsc commented Feb 2, 2022

This seems like an awkward use of big.Rat. Why use big.Rat in this case at all? Why not use json.Number as the round-trippable representation?

@mitar
Copy link
Contributor Author

mitar commented Feb 2, 2022

Oh, sorry if that was unclear from the proposal. The roundtrip is the sanity check motivation, if I have a data structure and I read it from the JSON into numbers with big.Rat as fields, I would at least like to be able to generate back the original JSON with some ease. And currently that is hard. Of course, the main use is that after parsing JSON number info a struct with big.Rat field, one would take that and do further processing on the struct. Potentially later on try to save it to JSON, but at that point values might not be representable without losing some precision. That is fine, but I would at least like to be able to save it to JSON without losing precision when that is not necessary (e.g., when there is a finite number of digits). But currently it is not possible using stdlib to render the big.Rat to a number knowing that you are not losing precision (when that is possible).

@rsc
Copy link
Contributor

rsc commented Feb 9, 2022

Do we care about the repeat size or the digits on the left?
The repeat size of N/D in particular can be D and therefore requires a big.Int to represent.
And the digits on the left is irrelevant to using FloatPrec.

The finite digits on the right is guaranteed to be 10 log D, which will fit in an int whenever D fits in memory.
So perhaps we should have

func (r *Rat) FloatPrec() (digits int, ok bool)

that returns 0, false for rats with non-finite float representations?

@mitar
Copy link
Contributor Author

mitar commented Feb 9, 2022

Yes, I agree that digits on the left are not important. The implementation I made here has the following signature:

func (r *Rat) FloatPrec() (digits int, repeating int)

So repeating == 0 when your ok == true.

But maybe my implementation is not as elegant as it could be.

But I would be OK with func (r *Rat) FloatPrec() (digits int, ok bool) as well.

In practice, using the signature from my implementation, I generally do l+j number of digits, so all non-repeating digits followed by one cycle of repeating digits.

@rsc
Copy link
Contributor

rsc commented Feb 16, 2022

@mitar the problem is that there can be >2^64 repeating digits.

@rsc rsc moved this from Incoming to Active in Proposals (old) Feb 16, 2022
@rsc
Copy link
Contributor

rsc commented Feb 16, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@mitar
Copy link
Contributor Author

mitar commented Feb 16, 2022

the problem is that there can be >2^64 repeating digits.

Sure, the caller of this function would then decide what limits to apply before passing digits further. Or are you saying that int return value type is not enough? I think yes, we could flag somehow if that would overflow or something.

@rsc
Copy link
Contributor

rsc commented Feb 23, 2022

Yes, the problem is the result does not fit in int, or even int64.
Is the repeat count really necessary? I would argue not, in which case we should do just

func (r *Rat) FloatPrec() (digits int, ok bool)

@mitar
Copy link
Contributor Author

mitar commented Feb 23, 2022

I am fine with that.

@rsc rsc changed the title proposal: math/big: Rat: negative precision for full precision string representation proposal: math/big: Rat: add FloatPrec() (int, bool) Mar 2, 2022
@rsc
Copy link
Contributor

rsc commented Mar 2, 2022

For clarity:

 (1/10).FloatPrec() = 1, true
 (10/100).FloatPrec() = 1, true
 (3/100).FloatPrec() = 2, true
 (1/3).FloatPrec() = 0, false
 (10).FloatPrec() = 0, true

@rsc
Copy link
Contributor

rsc commented Mar 2, 2022

Does anyone object to FloatPrec as described in #50489 (comment)?

@mitar
Copy link
Contributor Author

mitar commented Mar 2, 2022

Few more examples:

(1/3).FloatPrec() = 0, false
(1/6).FloatPrec() = 1, false
(1/7).FloatPrec() = 0, false
(1/9).FloatPrec() = 0, false
(1/28).FloatPrec() = 2, false
(1/67).FloatPrec() = 0, false
(1/81).FloatPrec() = 0, false
(1/96).FloatPrec() = 5, false
(8/13).FloatPrec() = 0, false
(2/14).FloatPrec() = 0, false
(3/30).FloatPrec() = 1, true
(2/3).FloatPrec() = 0, false
(9/11).FloatPrec() = 0, false
(7/12).FloatPrec() = 2, false
(22/7).FloatPrec() = 0, false

@rsc
Copy link
Contributor

rsc commented Mar 9, 2022

I disagree about the above examples. If the bool is false, the count is always zero.
The bool distinguishes only 0, false (some infinite eventually repeating decimal)
from 0, true (an integer with no digits after the decimal point).

The number of digits before the repetition begins may not fit in an int64 (I'm not sure)
but also seems like not a useful number to have. I also don't know how to compute it efficiently.

If there is some trivial computation of the non-repeated prefix and it is guaranteed to fit in int64 and it's useful, then maybe we could think about adding it. But none of those three seem to be true.

@rsc
Copy link
Contributor

rsc commented Mar 9, 2022

With that clarification, does anyone object to adding FloatPrec as in #50489 (comment)?

@robpike
Copy link
Contributor

robpike commented Mar 9, 2022

I am still not sure this problem is worth solving at all due to its rarity in practice.

@rsc
Copy link
Contributor

rsc commented Mar 16, 2022

The original message said this was the use case:

Currently if I parse string like 123.34 with SetString() and want to print it back to full precision

The roundtrip is the sanity check motivation, if I have a data structure and I read it from the JSON into numbers with big.Rat as fields, I would at least like to be able to generate back the original JSON with some ease.

I am sympathetic to that and can see it arising in a variety of programs.

In fact, it is a bit annoying to me in Ivy that 0.1 + 0.1 = 1/5 instead of 0.2.
This would give Ivy a way to print shorter decimals when that's still precise.

@rsc rsc moved this from Active to Likely Accept in Proposals (old) Mar 23, 2022
@rsc
Copy link
Contributor

rsc commented Mar 23, 2022

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@gopherbot
Copy link

Change https://go.dev/cl/521235 mentions this issue: math/big: Rat: add FloatPrec() (int, bool)

@panjf2000 panjf2000 self-assigned this Aug 20, 2023
@panjf2000 panjf2000 modified the milestones: Backlog, Go1.22 Aug 20, 2023
@gopherbot
Copy link

Change https://go.dev/cl/539299 mentions this issue: math/big: implement Rat.FloatPrec

@panjf2000 panjf2000 removed their assignment Nov 3, 2023
@griesemer
Copy link
Contributor

To be clear about what we accepted here, I believe the boolean result should state whether the rational number can be represented exactly as a decimal number with a fractional part of the returned length, or not. If the boolean is false, a decimal representation of the number will be rounded.
Specifically, I propose the following signature:

// FloatPrec returns the number n of non-repeating digits immediately
// following the decimal point of the decimal representation of x.
// The boolean result indicates whether a decimal representation of x
// with that many fractional digits is exact or rounded.
//
// Examples:
//
//	x      n    exact    decimal representation n fractional digits
//	0      0    true     0
//	1      0    true     1
//	1/2    1    true     0.5
//	1/3    0    false    0       (0.333... rounded)
//	1/4    2    true     0.25
//	1/6    1    false    0.2     (0.166... rounded)
func (x *Rat) FloatPrec() (n int, exact bool)

This is subtly different from what has been discussed so far. Per this earlier comment:

I disagree about the above examples. If the bool is false, the count is always zero.
The bool distinguishes only 0, false (some infinite eventually repeating decimal)
from 0, true (an integer with no digits after the decimal point).

For instance, the boolean result for 1/6 would need to be true (if it were false, the count n would have to be 0 which is incorrect for 1/6). By changing the boolean result as proposed, it always has a meaning (true means exact decimal representation is possible), instead of only being meaningful to identify integers.

@rsc
Copy link
Contributor

rsc commented Nov 4, 2023

Hmm. Above I had written:

The number of digits before the repetition begins may not fit in an int64 (I'm not sure)
but also seems like not a useful number to have. I also don't know how to compute it efficiently.

If there is some trivial computation of the non-repeated prefix and it is guaranteed to fit in int64 and it's useful, then maybe we could think about adding it. But none of those three seem to be true.

Do you have an algorithm for computing this value efficiently? And does it always fit in an int64?

@rsc
Copy link
Contributor

rsc commented Nov 4, 2023

https://math.stackexchange.com/questions/726969/how-do-you-calculate-how-many-decimal-places-there-are-before-the-repeating-digi has a lot to say about this problem but I haven't figured out where the answer is.

@mitar
Copy link
Contributor Author

mitar commented Nov 4, 2023

So the function I implemented returns both the number of digits in the non-repeated prefix and the number of repeating digits which cyclicly follow. I am not sure if that is efficient implementation for you though.

@griesemer
Copy link
Contributor

@rsc A better link might be https://en.wikipedia.org/wiki/Repeating_decimal (section on "Reciprocals of integers not coprime to 10").

The length returned by FloatPrec is the length of the "initial transient" of the decimal representation of the fraction. This length is surprisingly simple to compute and only depends on the denominator: it is the max of the the number of factors 2 and 5 appearing in the denominator. That number cannot be longer than the denominator length in bits. The method Int.BitLen() int returns an int, so returning an int here is fine, too.

Here's an implementation: CL 539299. Determining the number of 2 factors is very fast (counting the trailing zero bits), determining the number of 5 factors can be sped up if need be by dividing by higher powers of 5 as needed.

Regarding your comment here: I think the number of digits before the repetition is the most useful number: if a fraction can be represented exactly, this is the precision needed; if the fraction cannot be represented exactly, this is the number of digits we may want to print before rounding.

@rsc
Copy link
Contributor

rsc commented Nov 4, 2023

Interesting. As long as it works and isn't too slow, I'm fine with the new definition of FloatPrec.

For the 5s, we know the answer is less than the number of bits B consumed by D, and we divide by 5 that many times, and each division is O(B), so the overall complexity of FloatPrec is O(B^2)?

The interesting benchmark would be something like (1/5^1e6).FloatPrec(). I wondered if there is some repeated-squaring kind of thing where you divide by 5, and if that works move on to 5^2, and then 5^4, and then come back down once you run out. But that probably ends up being O(B^2) too.

There's at least an easy factor of 10+ by dividing out 5^13 (<2^32) or 5^27 (<2^64) first.

@griesemer
Copy link
Contributor

griesemer commented Nov 6, 2023

Since multiplication (squaring) is much faster then division, a simpler approach is to repeatedly square an initial factor (5^13) until it is larger than the denominator (after the factors of 2 were removed), keep all the intermediate results in a table, and then repeatedly divide using the table entries. For small numbers this doesn't cost extra because no table is built, but for large numbers it can reduce the cost a lot if there's many multiples of 5. On the other hand, if there are no multiples of 5 at all, building the table is where all the cost goes. On the other hand, a decent-size table could be pre-computed once, or perhaps even be statically allocated if so desired. Lots of options here.

Using this approach, for 1/(5 ^ 1e6), using just factors of 5 takes about ~120s (3.2 GHz 6-Core Intel Core i7), using the factor 5^13 reduces this to about ~9s, and using the full table we end up at ~0.1s (!). This is an improvement of ~1200x from slowest to fastest.

But for 1/(5 ^ 1e6 + 1), which doesn't have any factors of 5, the corresponding numbers are ~0.3ms, ~0.6ms, and ~0.15s. This is a slowdown of ~500x from fastest to slowest.

Great for special cases, but possibly horrible in other cases.

Using a repeated squaring and divide approach looks more promising.

For 1/(5 ^ 1e6), using the full table takes ~3s which is 3x better than the ~9s for the easy gain (just using factor 5^13) approach.
For 1/(5 ^ 1e6 + 1), the time is ~0.6ms (same as before for factor 5^13, because that's the start factor), so this case doesn't get worse which is what we want.

@zephyrtronium
Copy link
Contributor

I think the number of digits before the repetition is the most useful number: ... if the fraction cannot be represented exactly, this is the number of digits we may want to print before rounding.

This feels wrong to me. 5/9 = 0.555... has zero non-repeating digits, but rounding after zero digits (whether to 1 or 0.5, whichever is the interpretation) seems misleading. Maybe I'm misunderstanding?

@griesemer
Copy link
Contributor

griesemer commented Nov 7, 2023

You're free to round to n+1 if the number cannot be represented exactly. For 5/9, n = 0, exact = false, and if you round to n+1 digits you get 0.6 (or 0.5 dep. on rounding mode).

But I think the way FloatPrec is defined, you have the information needed to make the rounding decisions (whichever way that decision goes) - or use the info for something else not related to rounding.

@robpike
Copy link
Contributor

robpike commented Nov 10, 2023

Best issue thread ever.

@gopherbot
Copy link

Change https://go.dev/cl/542115 mentions this issue: math/big: update comment in the implementation of FloatPrec

@mitar
Copy link
Contributor Author

mitar commented Nov 14, 2023

Awesome! Really fast implementation!

gopherbot pushed a commit that referenced this issue Nov 14, 2023
Follow-up on CL 539299: missed to incorporate the updated
comment per feedback on that CL.

For #50489.

Change-Id: Ib035400038b1d11532f62055b5cdb382ab75654c
Reviewed-on: https://go-review.googlesource.com/c/go/+/542115
Run-TryBot: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
@gopherbot
Copy link

Change https://go.dev/cl/542756 mentions this issue: math/big: faster FloatPrec implementation

gopherbot pushed a commit that referenced this issue Nov 15, 2023
Based on observations by Cherry Mui (see comments in CL 539299).
Add new benchmark FloatPrecMixed.

For #50489.

name                         old time/op  new time/op  delta
FloatPrecExact/1-12           129ns ± 0%   105ns ±11%  -18.51%  (p=0.008 n=5+5)
FloatPrecExact/10-12          317ns ± 2%   283ns ± 1%  -10.65%  (p=0.008 n=5+5)
FloatPrecExact/100-12        1.80µs ±15%  1.35µs ± 0%  -25.09%  (p=0.008 n=5+5)
FloatPrecExact/1000-12       9.48µs ±14%  8.32µs ± 1%  -12.25%  (p=0.008 n=5+5)
FloatPrecExact/10000-12       195µs ± 1%   191µs ± 0%   -1.73%  (p=0.008 n=5+5)
FloatPrecExact/100000-12     7.31ms ± 1%  7.24ms ± 1%   -0.99%  (p=0.032 n=5+5)
FloatPrecExact/1000000-12     301ms ± 3%   302ms ± 2%     ~     (p=0.841 n=5+5)
FloatPrecMixed/1-12           141ns ± 0%   110ns ± 3%  -21.88%  (p=0.008 n=5+5)
FloatPrecMixed/10-12          767ns ± 0%   739ns ± 5%     ~     (p=0.151 n=5+5)
FloatPrecMixed/100-12        4.93µs ± 2%  3.73µs ± 1%  -24.33%  (p=0.008 n=5+5)
FloatPrecMixed/1000-12       90.9µs ±11%  70.3µs ± 2%  -22.66%  (p=0.008 n=5+5)
FloatPrecMixed/10000-12      2.30ms ± 0%  1.92ms ± 1%  -16.41%  (p=0.008 n=5+5)
FloatPrecMixed/100000-12     87.1ms ± 1%  68.5ms ± 1%  -21.42%  (p=0.008 n=5+5)
FloatPrecMixed/1000000-12     4.09s ± 1%   3.58s ± 1%  -12.35%  (p=0.008 n=5+5)
FloatPrecInexact/1-12        92.4ns ± 0%  66.1ns ± 5%  -28.41%  (p=0.008 n=5+5)
FloatPrecInexact/10-12        118ns ± 0%    91ns ± 1%  -23.14%  (p=0.016 n=5+4)
FloatPrecInexact/100-12       310ns ±10%   244ns ± 1%  -21.32%  (p=0.008 n=5+5)
FloatPrecInexact/1000-12      952ns ± 1%   828ns ± 1%  -12.96%  (p=0.016 n=4+5)
FloatPrecInexact/10000-12    6.71µs ± 1%  6.25µs ± 3%   -6.83%  (p=0.008 n=5+5)
FloatPrecInexact/100000-12   66.1µs ± 1%  61.2µs ± 1%   -7.45%  (p=0.008 n=5+5)
FloatPrecInexact/1000000-12   635µs ± 2%   584µs ± 1%   -7.97%  (p=0.008 n=5+5)

Change-Id: I3aa67b49a042814a3286ee8306fbed36709cbb6e
Reviewed-on: https://go-review.googlesource.com/c/go/+/542756
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
@gopherbot
Copy link

Change https://go.dev/cl/546357 mentions this issue: doc: add release note for math/big.Rat.FloatPrec

gopherbot pushed a commit that referenced this issue Dec 5, 2023
For #50489.

Change-Id: I4544a24327196eb3ed62af64ae5ddb1f60441d12
Reviewed-on: https://go-review.googlesource.com/c/go/+/546357
Reviewed-by: Alan Donovan <adonovan@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Bypass: Robert Griesemer <gri@google.com>
ezz-no pushed a commit to ezz-no/go-ezzno that referenced this issue Feb 18, 2024
For golang#50489.

Change-Id: I4544a24327196eb3ed62af64ae5ddb1f60441d12
Reviewed-on: https://go-review.googlesource.com/c/go/+/546357
Reviewed-by: Alan Donovan <adonovan@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Bypass: Robert Griesemer <gri@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Accepted
Development

No branches or pull requests

7 participants