Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: time: allow scientific notation in ParseDuration #67076

Open
spakin opened this issue Apr 27, 2024 · 12 comments
Open

proposal: time: allow scientific notation in ParseDuration #67076

spakin opened this issue Apr 27, 2024 · 12 comments
Labels
Milestone

Comments

@spakin
Copy link

spakin commented Apr 27, 2024

Proposal Details

Issue

I encountered this issue while parsing the output of an application I was working with:

  • strconv.ParseFloat accepts numbers expressed in scientific notation. strconv.ParseFloat("3.336e-6", 64), for example, works as expected.
  • time.ParseDuration does not accept numbers expressed in scientific notation. time.ParseDuration("3.336e-6s"), for example, fails with unknown unit "e-" in duration "3.336e-6s".

A Go Playground demonstration of this limitation in time.ParseDuration can be found here: https://go.dev/play/p/G-1FveHxpZ3

Proposal

I propose that time.ParseDuration be enhanced to allow durations expressed using scientific notation.

@gopherbot gopherbot added this to the Proposal milestone Apr 27, 2024
@seankhliao
Copy link
Member

What sort of application represents time duration like this?

@spakin
Copy link
Author

spakin commented Apr 27, 2024

In this case, it was MiniSat, an old but still often used C++ code. MiniSat normally doesn't output times in scientific notation, but I suspect that it uses printf("%g", …) or similar. On one particular run, it finished in a tiny fraction of a second, output its elapsed time in scientific notation, and caused my Go wrapper to crash.

@benhoyt
Copy link
Contributor

benhoyt commented Apr 28, 2024

I would be opposed to adding this complexity to ParseDuration. It already has units (a scale, kind of like an exponent, so kind of doubles up that functionality) and this change would make the values harder to read. Plus, ParseDuration is documented clearly to parse a decimal with fraction and unit but not more. It seems like it's intended to parse what Duration.String produces -- and that never produces exponents. So this would only be to parse values produced by other systems. My recommendation is if you have a custom format to parse, use a custom parser.

@jfrech
Copy link

jfrech commented Apr 28, 2024

Conceptually, time.Duration is fixed-point. Allowing scientific notation may lure one into the false believe that e. g. "1e-10s" is a representable duration.

@spakin
Copy link
Author

spakin commented Apr 28, 2024

Conceptually, time.Duration is fixed-point. Allowing scientific notation may lure one into the false believe that e. g. "1e-10s" is a representable duration.

But how is that different from "0.0000000001s", which is equally non-representable?

@jfrech
Copy link

jfrech commented Apr 28, 2024

"0.0000000001s" looks funky. "100ps" and "1e-10s" don't. The former doesn't parse.

@spakin
Copy link
Author

spakin commented Apr 28, 2024

"0.0000000001s" looks funky.

I don't follow. We're talking about parsing input, not producing output. Who cares what inputs look funky? They'll be normalized on output.

"100ps" and "1e-10s" don't. The former doesn't parse.

"0.0000000001s" parses as 0ns, as expected for an integer type with nanosecond resolution. "100ps" doesn't parse because units smaller than nanoseconds aren't recognized. This is already documented:

Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".

Despite time.ParseDuration's input accurately being documented as "a possibly signed sequence of decimal numbers, each with optional fraction, I argue that this violates the Principle of Least Surprise because why should some parts of Go (e.g., strconv.ParseFloat and literal floating-point numbers) accept scientific notation but others not?

My proposal is simply that time.ParseDuration accept the same values that strconv.ParseFloat accepts but followed by one of the preceding units. The documentation would then read,

ParseDuration parses a duration string. A duration string is a floating-point number with a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".

@jfrech
Copy link

jfrech commented Apr 30, 2024

I guess my interpretation of Principle of Least Surprise differs from yours, but would you also want time.ParseDuration to accept any of "0xap1us", "0x.dp+0h" or "+InfhNANns"?

@spakin
Copy link
Author

spakin commented May 1, 2024

I wouldn't expect it to accept "+InfhNANns" because that can't be represented by a time.Duration (int64 nanoseconds). The others, while unlikely ever to be used, ideally should be accepted because they're valid floating-point numbers. Again, why should some parts of Go inconsistently accept different floating-point numbers from other parts of Go?

That said, my main concern is scientific notation because programs in many languages, including Go, have an equivalent of C's printf("%g", …). I encountered a case in the wild of this being used to output a time, which Go can't currently parse because time.Duration processes numbers differently from strconv.ParseFloat when it in fact does not need to.

@glycerine
Copy link

glycerine commented May 4, 2024

This feels like asking to hiding alot of magic and rounding heuristics inside time.ParseDuration(), since there are alot of floating point numbers that will not have an exact int64 nanosecondsrepresentation.

What should ParseDuration() do then? It could hide the sloppy inexact conversion, returning an approximation. Or it could return an error. What are you going to do with the error? Probably fall back on the following preferred approach (below) anyway. What are you going to do with a sloppy inexact conversion? Will you even notice it? Will not noticing it lead to bugs? Some, almost surely. Will the new heuristic be acceptable for all current users of time.ParseDuration()? Unlikely, if they are depending on the current set error producing inputs.

I would much rather read Go application code that makes explicit what is going on, like first using f := strconv.ParseFloat(s, 64) on a string s that has the 's' (for seconds duration) suffix verified and removed, and then deliberately displaying the rounding/conversion to the int64 domain. Then I can readily write tests for the cases that matter to my application, and easily tweak the behaivor to what the application at hand requires.

In other words, it is pretty easy to write in user/application code already, using strconv.ParseFloat() itself, and it would be better practice to do that.

@spakin
Copy link
Author

spakin commented May 5, 2024

time.ParseDuration already uses floats (see line 1668) and already performs inexact conversions. Consider parsing "0.0000000000005h", for example: https://go.dev/play/p/gvs16-UT0N8

Once again, the core of my proposal is for time.ParseDuration to accept scientific notation so "0.0000000000005h" would be parsed the same as "5e-13h". If going through strconv.ParseFloat further reduces the precision of the conversion to int64 nanoseconds—I don't know that it would—then sacrificing its additional features may be warranted for the proposed enhancement to time.ParseDuration.

@peterGo
Copy link
Contributor

peterGo commented May 8, 2024

@spakin: An addendum to your proposal.


The time.ParseDuration function parses Go decimal numbers with an optional fraction part. The function should also parse Go decimal floating-point numbers with an exponent part. Scientific notation is widely used. JSON data interchange is very widely used. Use of the IEEE 754 Standard for Binary Floating-Point Arithmetic in hardware is near universal.

"A [Go] decimal floating-point literal consists of an integer part (decimal digits), a decimal point, a fractional part (decimal digits), and an exponent part (e or E followed by an optional sign and decimal digits). One of the integer part or the fractional part may be elided; one of the decimal point or the exponent part may be elided. An exponent value exp scales the mantissa (integer and fractional part) by 10ᵉˣᵖ."

"A [JSON] number is a sequence of decimal digits with no superfluous leading zero. It may have a preceding minus sign (U+002D). It may have a fractional part prefixed by a decimal point (U+002E). It may have an exponent, prefixed by e (U+0065) or E (U+0045) and optionally + (U+002B) or – (U+002D). The digits are the code points U+0030 through U+0039."

For example:

time.ParseDuration(s string) (d Duration, err error):

format: "%.4f"
s: "0.0000s"       d: 0ns 0s  err: <nil>
s: "0.5000s"       d: 500000000ns 500ms  err: <nil>
s: "1.0000s"       d: 1000000000ns 1s  err: <nil>
s: "-1.0000s"      d: -1000000000ns -1s  err: <nil>
s: "100.0000s"     d: 100000000000ns 1m40s  err: <nil>
s: "1000.0000s"    d: 1000000000000ns 16m40s  err: <nil>
s: "10000.0000s"   d: 10000000000000ns 2h46m40s  err: <nil>
s: "12345.0000s"   d: 12345000000000ns 3h25m45s  err: <nil>
s: "100000.0000s"  d: 100000000000000ns 27h46m40s  err: <nil>
s: "123456.0000s"  d: 123456000000000ns 34h17m36s  err: <nil>

format: "%.4e"
s: "0.0000e+00s"   d: 0ns 0s  err: <nil>
s: "5.0000e-01s"   d: 500000000ns 500ms  err: <nil>
s: "1.0000e+00s"   d: 1000000000ns 1s  err: <nil>
s: "-1.0000e+00s"  d: -1000000000ns -1s  err: <nil>
s: "1.0000e+02s"   d: 100000000000ns 1m40s  err: <nil>
s: "1.0000e-02s"   d: 10000000ns 10ms  err: <nil>
s: "1.0000e+03s"   d: 1000000000000ns 16m40s  err: <nil>
s: "1.0000e-03s"   d: 1000000ns 1ms  err: <nil>
s: "1.0000e+04s"   d: 10000000000000ns 2h46m40s  err: <nil>
s: "1.0000e-04s"   d: 100000ns 100µs  err: <nil>
s: "1.2345e+04s"   d: 12345234500000ns 3h25m45.2345s  err: <nil>
s: "1.2345e-04s"   d: 123450ns 123.45µs  err: <nil>
s: "1.0000e+05s"   d: 100000000000000ns 27h46m40s  err: <nil>
s: "1.0000e-05s"   d: 10000ns 10µs  err: <nil>
s: "1.2346e+05s"   d: 123460234600000ns 34h17m40.2346s  err: <nil>
s: "1.2346e-05s"   d: 12346ns 12.346µs  err: <nil>

format: "%.4g"
s: "0s"            d: 0ns 0s  err: <nil>
s: "0.5s"          d: 500000000ns 500ms  err: <nil>
s: "1s"            d: 1000000000ns 1s  err: <nil>
s: "-1s"           d: -1000000000ns -1s  err: <nil>
s: "100s"          d: 100000000000ns 1m40s  err: <nil>
s: "1000s"         d: 1000000000000ns 16m40s  err: <nil>
s: "1e+04s"        d: 10000000000000ns 2h46m40s  err: <nil>
s: "1e-04s"        d: 100000ns 100µs  err: <nil>
s: "1.234e+04s"    d: 12340234000000ns 3h25m40.234s  err: <nil>
s: "1.234e-04s"    d: 123400ns 123.4µs  err: <nil>
s: "1e+05s"        d: 100000000000000ns 27h46m40s  err: <nil>
s: "1e-05s"        d: 10000ns 10µs  err: <nil>
s: "1.235e+05s"    d: 123500235000000ns 34h18m20.235s  err: <nil>
s: "1.235e-05s"    d: 12350ns 12.35µs  err: <nil>

Format examples:
The GNU C Library Reference Manual
12.12.5 Floating-Point Conversions

The strconv.ParseFloat function is not a solution. The time.Duration type is an integer type.

Adapting the current time.Duration functions to detect and apply an optional exponent is simple, easy, accurate, and readable.

All current time.Duration tests pass.

Test results for decimal floating-point numbers are shown in the C library floating-point examples.

Test results for @spakin's Go Playground time.ParseDuration examples with scientific notation allowed:

First number is 12.34.
First duration is 12.34s.
Second number is 0.0001234.
Second duration is 123.4µs.

and

d: 2.777777777777778e-13h s: "0.0000000000005h"
d: 2.777777777777778e-13h s: "5e-13h"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests

7 participants