strconv: inaccurate string to float64 conversion ParseFloat #36657

g7r · 2020-01-20T16:16:41Z

What version of Go are you using (`go version`)?

$ go version
version go1.13 linux/amd64

Does this issue reproduce with the latest release?

Not sure. But https://play.golang.org/p/izkaZ-XBog5 reproduces the bug.

What operating system and processor architecture are you using (`go env`)?

go env Output

$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"

What did you do?

Here's the code: https://play.golang.org/p/izkaZ-XBog5

I'm converting string "1090544144181609348835077142190" to float64 with strconv.ParseFloat("1090544144181609348835077142190", 64) and it returns incorrect value.

What did you expect to see?

It is expected that 1090544144181609419040633126912 should be returned as the closest value representable by float64. The difference between string value and correct float64 value is -70205555984722 (see https://www.wolframalpha.com/input/?i=1090544144181609348835077142190-1090544144181609419040633126912)

What did you see instead?

strconv.ParseFloat(...) actually returns 1090544144181609278303144771584. The difference between string value and returned float64 value is 70531932370606 (see https://www.wolframalpha.com/input/?i=1090544144181609348835077142190-1090544144181609278303144771584)

Additional info

Go compiler correctly converts float64 literals to float64 constants, as you can see in Playground example https://play.golang.org/p/izkaZ-XBog5

The text was updated successfully, but these errors were encountered:

randall77 · 2020-01-20T17:49:55Z

I can confirm. It seems like the fast path in strconv/atof.go:627, particularly the AssignDecimal call, is to blame. If I turn the optimize flag off, the bug goes away.
This has been an issue since at least 1.11.
@griesemer

smasher164 · 2020-01-22T19:01:51Z

This behavior has existed since go 1, so it wasn't introduced in any recent release. Digging into extfloat now.

ALTree · 2020-01-23T15:00:57Z

The ParseFloat doc says:

If s is well-formed and near a valid floating-point number, ParseFloat returns the nearest floating-point number rounded using IEEE754 unbiased rounding.

So it appears that it doesn't unconditionally guarantee to always return the closest IEEE754 number... only if the argument is "near a valid floating-point number" (whatever that means). I believe that your number does not qualify as "near a valid floating-point number".

smasher164 · 2020-01-24T01:26:38Z

Also, the CL associated with #15672 solves this.

andybons · 2020-04-16T18:22:06Z

This has been like this forever. Removing release-blocker label.

odeke-em · 2020-05-13T12:14:19Z

We didn’t cover much progress on the related CL unfortunately because folks were swamped. I shall move this to the Unplanned milestone given that the parallel issue is in the Unplanned milestone too.

lemire · 2020-10-23T16:23:11Z

@nigeltao Given your recent work, is this still an issue?

JoshStrobl · 2020-10-23T18:19:53Z

This is still an issue as of 1.15.3. I'm still losing precision. I have a float string of -0.63571428571428568 and it's getting truncated to -0.6357142857142857, losing 1428568.

odeke-em · 2020-10-23T18:22:57Z

Hey Joshua, Go1.15.3 doesn’t have Nigel’s work based off Daniel’s algorithm, aka it hasn’t yet been officially released. However if you write a playground repro, I can help you run it with go tip aka what’s going to become Go1.16, or you can git clone the Go source tree and run it locally too.

…

On Fri, Oct 23, 2020 at 11:20 AM Joshua Strobl ***@***.***> wrote: This is still an issue as of 1.15.3. I'm still losing precision. I have a float string of -0.63571428571428568 and it's getting truncated to -0.6357142857142857, losing 1428568. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#36657 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABFL3V2YI6IU2WX7NJH4MMLSMHCNXANCNFSM4KJGWUDQ> .

ALTree · 2020-10-23T18:26:50Z

The reproducer in the first post still prints

1.0905441441816094e+30 is not equal to 1.0905441441816093e+30
expected: 0x462b8779f2474dfb     7748782196739579p+47 1090544144181609419040633126912.000000000
actual  : 0x462b8779f2474dfa     7748782196739578p+47 1090544144181609278303144771584.000000000

at tip, so yes, this is still an issue.

go version devel +c3fe874f25 Fri Oct 23 18:02:16 2020 +0000 linux/amd64

lemire · 2020-10-23T18:31:50Z

Right because the input string has many more than 17 digits. I was being hopeful.

smasher164 · 2020-10-23T18:33:48Z

IIRC, CL 170078 (implement Ryū-like algorithm for atof) did address this issue, but it was abandoned in favor of CL 260858 (use the Eisel-Lemire ParseFloat algorithm).

lemire · 2020-10-23T18:38:29Z

I have basically ported the Go code more or less as-is to C++ from @nigeltao's own C port... and I don't have this issue. See https://github.com/lemire/fast_float/pull/6

The code follows the same logic, so it ought to be work. I mean... there is no big conceptual gap.

lemire · 2020-10-23T20:48:51Z

Ok. So the problem goes away entirely if you comment out this third fast path in atof.go...

		// Try another fast path.
		/*ext := new(extFloat)
		if ok := ext.AssignDecimal(mantissa, exp, neg, trunc, &float64info); ok {

			b, ovf := ext.floatBits(&float64info)
			f = math.Float64frombits(b)
			if ovf {
				err = rangeError(fnParseFloat, s)
			}
			return f, n, err
		}*/

Looking at ext.AssignDecimal, it says...

	// Errors (in the "numerical approximation" sense, not the "Go's error
	// type" sense) in this function are measured as multiples of 1/8 of a ULP,
	// so that "1/2 of a ULP" can be represented in integer arithmetic.
	//
	// The C++ double-conversion library also uses this 8x scaling factor:
	// https://github.com/google/double-conversion/blob/f4cb2384/double-conversion/strtod.cc#L291
	// but this Go implementation has a bug, where it forgets to scale other
	// calculations (further below in this function) by the same number. The
	// C++ implementation does not forget:
	// https://github.com/google/double-conversion/blob/f4cb2384/double-conversion/strtod.cc#L366
	//
	// Scaling the "errors" in the "is mant_extra in the range (halfway ±
	// errors)" check, but not scaling the other values, means that we return
	// ok=false (and fall back to a slower atof code path) more often than we
	// could. This affects performance but not correctness.

The part where it says we return ok=false (and fall back to a slower atof code path) more often than we could. This affects performance but not correctness might be incorrect, as illustrated by the example (1090544144181609348835077142190). In that case, it appears to me that double-conversion does reject the fast path but not the Go code?

Maybe it is helpful to have some perspective. Prior to the recent changes made by @nigeltao, this third fast path was the second fast path. I would argue that it was absolutely needed.

Now you have effectively three fast paths followed by a fallback.

The first and second fast paths will catch nearly everything in the real world. Like 99% of anything that has no more than 19 digits.

So what the third fallback does is protect against the fallback. But that may not be useful:

Nobody that cares about performance will generate floats with more than 19 digits of accuracy (mantissa). It is an awkward thing to do in the best of times.
People who do throw long mantissa at the parser, would probably prefer to get correct rounding given a choice. That is, they are probably not in the business of ingesting large CSV files filled with 100 digit numbers. That's really an edge case.
Having more layers can sometimes make things worse. It certainly increases the surface area for maintenance.
The fallback is not that slow. I have not benchmarked the Go version, but the direct C++ port runs at an acceptable speed. You may be rightfully concerned with denial of service attacks where someone throws numbers with very long strings at you... but that's not an effective attack... in fact, beyond a certain point, the parsing speeds goes up. My C++ port of the fallback runs at 2 GB/s for streams of long numbers. (Asymptotically, parsing is O(N).)

I think point 4 is maybe key. Assuming that the fallback is "fast enough", and assuming that it is correct (my port of it sure appears correct), then when not go to it directly?

gopherbot · 2020-10-23T22:34:25Z

Change https://golang.org/cl/264518 mentions this issue: strconv: remove extfloat.go atof code path

nigeltao · 2020-10-23T22:51:13Z

Ok. So the problem goes away entirely if you comment out this third fast path in atof.go...

Coincidentally, that's what https://golang.org/cl/264518 does, which I sent out yesterday.

odeke-em · 2020-10-23T22:57:24Z

Thank you for the change and foresight @nigeltao! Thank you @lemire for the rationale in #36657 (comment) and also for the algorithms!

@lemire, it would be a delight to have you as a Go contributor/reviewer, and Nigel's CL could use the rationale you gave for points 1 and 4 in his commit message, but would be nice if you could post those up and reflect in the record, and have you also approve that change :)
@nigeltao please take a look at @lemire's rationale in the various points and perhaps also include that reasoning that he raised in those points.

nigeltao · 2020-10-23T23:08:22Z

@lemire said:

Nobody that cares about performance will generate floats with more than 19 digits of accuracy (mantissa). It is an awkward thing to do in the best of times.

OTOH, separately on https://go-review.googlesource.com/c/go/+/264518/3#message-7fda4a76b2d909bb21794b84f9c3d30f4730adb5 @remyoudompheng said:

It's not for malicious inputs. It's for inputs which are too long. In my environment I have old apps which produce non-shortest floats and they are a pain.

In the same thread, @remyoudompheng also mentions that https://golang.org/cl/264677 should help with the more-than-19-digits case.

nigeltao · 2020-10-23T23:15:15Z

concerned with denial of service attacks where someone throws numbers with very long strings at you

In any case, if it's through a JSON API, there might be more worrisome DoS attacks where decoding lots of empty JSON arrays or JSON objects might be more expensive (due to mallocs and garbage collection) than decoding long JSON numbers.

lemire · 2020-10-23T23:27:04Z

Always happy to help.

@nigeltao I think https://golang.org/cl/264677 is what I had in mind in an independently sent private message. As I also privately communicated, we could tighten further the second fast path if we wanted to catch even more cases (e.g., subnormals and round-to-even cases) but it is only worth doing if too many fall through.

randall77 added this to the Go1.15 milestone Jan 20, 2020

ianlancetaylor added NeedsFix The path to resolution is known, but the work has not been done. release-blocker labels Jan 20, 2020

odeke-em changed the title ~~Inaccurate string to float64 conversion in strconv.ParseFloat~~ strconv: inaccurate string to float64 conversion ParseFloat Jan 22, 2020

andybons removed the release-blocker label Apr 16, 2020

odeke-em modified the milestones: Go1.15, Unplanned May 13, 2020

gopherbot closed this as completed in 7fe2a84 Oct 29, 2020

quetzyg mentioned this issue Mar 28, 2021

int64 id value changed while decoding google/jsonapi#202

Open

golang locked and limited conversation to collaborators Oct 29, 2021

gopherbot added the FrozenDueToAge label Oct 29, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

strconv: inaccurate string to float64 conversion ParseFloat #36657

strconv: inaccurate string to float64 conversion ParseFloat #36657

g7r commented Jan 20, 2020 •

edited

randall77 commented Jan 20, 2020

smasher164 commented Jan 22, 2020

ALTree commented Jan 23, 2020 •

edited

smasher164 commented Jan 24, 2020

andybons commented Apr 16, 2020

odeke-em commented May 13, 2020

lemire commented Oct 23, 2020

JoshStrobl commented Oct 23, 2020

odeke-em commented Oct 23, 2020 via email

ALTree commented Oct 23, 2020 •

edited

lemire commented Oct 23, 2020

smasher164 commented Oct 23, 2020

lemire commented Oct 23, 2020

lemire commented Oct 23, 2020 •

edited

gopherbot commented Oct 23, 2020

nigeltao commented Oct 23, 2020

odeke-em commented Oct 23, 2020

nigeltao commented Oct 23, 2020

nigeltao commented Oct 23, 2020

lemire commented Oct 23, 2020

strconv: inaccurate string to float64 conversion ParseFloat #36657

strconv: inaccurate string to float64 conversion ParseFloat #36657

Comments

g7r commented Jan 20, 2020 • edited

What version of Go are you using (go version)?

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using (go env)?

What did you do?

What did you expect to see?

What did you see instead?

Additional info

randall77 commented Jan 20, 2020

smasher164 commented Jan 22, 2020

ALTree commented Jan 23, 2020 • edited

smasher164 commented Jan 24, 2020

andybons commented Apr 16, 2020

odeke-em commented May 13, 2020

lemire commented Oct 23, 2020

JoshStrobl commented Oct 23, 2020

odeke-em commented Oct 23, 2020 via email

ALTree commented Oct 23, 2020 • edited

lemire commented Oct 23, 2020

smasher164 commented Oct 23, 2020

lemire commented Oct 23, 2020

lemire commented Oct 23, 2020 • edited

gopherbot commented Oct 23, 2020

nigeltao commented Oct 23, 2020

odeke-em commented Oct 23, 2020

nigeltao commented Oct 23, 2020

nigeltao commented Oct 23, 2020

lemire commented Oct 23, 2020

g7r commented Jan 20, 2020 •

edited

What version of Go are you using (`go version`)?

What operating system and processor architecture are you using (`go env`)?

ALTree commented Jan 23, 2020 •

edited

ALTree commented Oct 23, 2020 •

edited

lemire commented Oct 23, 2020 •

edited