New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
strconv: inaccurate string to float64 conversion ParseFloat #36657
Comments
I can confirm. It seems like the fast path in strconv/atof.go:627, particularly the |
This behavior has existed since go 1, so it wasn't introduced in any recent release. Digging into extfloat now. |
The ParseFloat doc says:
So it appears that it doesn't unconditionally guarantee to always return the closest IEEE754 number... only if the argument is "near a valid floating-point number" (whatever that means). I believe that your number does not qualify as "near a valid floating-point number". |
Also, the CL associated with #15672 solves this. |
This has been like this forever. Removing release-blocker label. |
We didn’t cover much progress on the related CL unfortunately because folks were swamped. I shall move this to the Unplanned milestone given that the parallel issue is in the Unplanned milestone too. |
@nigeltao Given your recent work, is this still an issue? |
This is still an issue as of 1.15.3. I'm still losing precision. I have a float string of |
Hey Joshua, Go1.15.3 doesn’t have Nigel’s work based off Daniel’s
algorithm, aka it hasn’t yet been officially released. However if you write
a playground repro, I can help you run it with go tip aka what’s going to
become Go1.16, or you can git clone the Go source tree and run it locally
too.
…On Fri, Oct 23, 2020 at 11:20 AM Joshua Strobl ***@***.***> wrote:
This is still an issue as of 1.15.3. I'm still losing precision. I have a
float string of -0.63571428571428568 and it's getting truncated to
-0.6357142857142857, losing 1428568.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#36657 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABFL3V2YI6IU2WX7NJH4MMLSMHCNXANCNFSM4KJGWUDQ>
.
|
The reproducer in the first post still prints
at tip, so yes, this is still an issue.
|
Right because the input string has many more than 17 digits. I was being hopeful. |
I have basically ported the Go code more or less as-is to C++ from @nigeltao's own C port... and I don't have this issue. See https://github.com/lemire/fast_float/pull/6 The code follows the same logic, so it ought to be work. I mean... there is no big conceptual gap. |
Ok. So the problem goes away entirely if you comment out this third fast path in atof.go... // Try another fast path.
/*ext := new(extFloat)
if ok := ext.AssignDecimal(mantissa, exp, neg, trunc, &float64info); ok {
b, ovf := ext.floatBits(&float64info)
f = math.Float64frombits(b)
if ovf {
err = rangeError(fnParseFloat, s)
}
return f, n, err
}*/ Looking at // Errors (in the "numerical approximation" sense, not the "Go's error
// type" sense) in this function are measured as multiples of 1/8 of a ULP,
// so that "1/2 of a ULP" can be represented in integer arithmetic.
//
// The C++ double-conversion library also uses this 8x scaling factor:
// https://github.com/google/double-conversion/blob/f4cb2384/double-conversion/strtod.cc#L291
// but this Go implementation has a bug, where it forgets to scale other
// calculations (further below in this function) by the same number. The
// C++ implementation does not forget:
// https://github.com/google/double-conversion/blob/f4cb2384/double-conversion/strtod.cc#L366
//
// Scaling the "errors" in the "is mant_extra in the range (halfway ±
// errors)" check, but not scaling the other values, means that we return
// ok=false (and fall back to a slower atof code path) more often than we
// could. This affects performance but not correctness.
The part where it says Maybe it is helpful to have some perspective. Prior to the recent changes made by @nigeltao, this third fast path was the second fast path. I would argue that it was absolutely needed. Now you have effectively three fast paths followed by a fallback. The first and second fast paths will catch nearly everything in the real world. Like 99% of anything that has no more than 19 digits. So what the third fallback does is protect against the fallback. But that may not be useful:
I think point 4 is maybe key. Assuming that the fallback is "fast enough", and assuming that it is correct (my port of it sure appears correct), then when not go to it directly? |
Change https://golang.org/cl/264518 mentions this issue: |
Coincidentally, that's what https://golang.org/cl/264518 does, which I sent out yesterday. |
Thank you for the change and foresight @nigeltao! Thank you @lemire for the rationale in #36657 (comment) and also for the algorithms! @lemire, it would be a delight to have you as a Go contributor/reviewer, and Nigel's CL could use the rationale you gave for points 1 and 4 in his commit message, but would be nice if you could post those up and reflect in the record, and have you also approve that change :) |
@lemire said:
OTOH, separately on https://go-review.googlesource.com/c/go/+/264518/3#message-7fda4a76b2d909bb21794b84f9c3d30f4730adb5 @remyoudompheng said:
In the same thread, @remyoudompheng also mentions that https://golang.org/cl/264677 should help with the more-than-19-digits case. |
In any case, if it's through a JSON API, there might be more worrisome DoS attacks where decoding lots of empty JSON arrays or JSON objects might be more expensive (due to mallocs and garbage collection) than decoding long JSON numbers. |
Always happy to help. @nigeltao I think https://golang.org/cl/264677 is what I had in mind in an independently sent private message. As I also privately communicated, we could tighten further the second fast path if we wanted to catch even more cases (e.g., subnormals and round-to-even cases) but it is only worth doing if too many fall through. |
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Not sure. But https://play.golang.org/p/izkaZ-XBog5 reproduces the bug.
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
Here's the code: https://play.golang.org/p/izkaZ-XBog5
I'm converting string "1090544144181609348835077142190" to float64 with
strconv.ParseFloat("1090544144181609348835077142190", 64)
and it returns incorrect value.What did you expect to see?
It is expected that 1090544144181609419040633126912 should be returned as the closest value representable by float64. The difference between string value and correct float64 value is -70205555984722 (see https://www.wolframalpha.com/input/?i=1090544144181609348835077142190-1090544144181609419040633126912)
What did you see instead?
strconv.ParseFloat(...)
actually returns 1090544144181609278303144771584. The difference between string value and returned float64 value is 70531932370606 (see https://www.wolframalpha.com/input/?i=1090544144181609348835077142190-1090544144181609278303144771584)Additional info
Go compiler correctly converts float64 literals to float64 constants, as you can see in Playground example https://play.golang.org/p/izkaZ-XBog5
The text was updated successfully, but these errors were encountered: