time: Parse too greedy when parsing some formats #12919

ksshannon · 2015-10-13T17:28:44Z

What version of Go are you using (go version)?

$ go version go version devel +2d823fd Tue Oct 13 01:04:42 2015 +0000 linux/amd64
What operating system and processor architecture are you using?

linux/amd64
What did you do?

Attempted to parse this time: "618 AM MDT TUE OCT 13 2015" with this format string: "304 PM MST Mon Jan 02 2006"
What did you expect to see?

No parsing error, and a time.Time representing that moment.
What did you see instead?

The parsing fails, and the error reported: parsing time "618 AM MDT TUE OCT 13 2015": hour out of range. The hour is interpreted as 61.

The format works properly with a two digit hour. It seems getnum() in time/format.go takes up to two characters whenever the second character is valid which leads to the issue.
Playground example with examples that fail and pass for the given format:

http://play.golang.org/p/mZ-m7gJlvy

Apologies if my code is messy...

The format is used as a time/version stamp for the US National Weather Service in their forecast discussion products, for a real world example, http://www.wrh.noaa.gov/total_forecast/getprod.php?wfo=boi&pil=AFD&sid=BOI&version=1

Thank you.

The text was updated successfully, but these errors were encountered:

ianlancetaylor · 2015-10-13T17:38:22Z

Basically, this means that we should change getnum to only take a single digit if taking two digits would cause the value to be out of range. Right now it always takes two digits if there are two digits, and then later checks whether they are out of range. I can't decide whether this change would be a good idea or not.

ksshannon · 2015-10-13T19:01:39Z

That feels like it would invalidly parse (or attempt to) a lot of values. I think that would cause more problems that solve. Without field width, or field separators, I don't think you can solve all cases for this format (without making assumptions that may make some invalid strings valid). Thank you.

nerdatmath · 2015-10-13T21:17:50Z

This particular format is unambiguous, since "304 PM" indicates a 1-2 digit hour followed by exactly 2 digits for minutes, then whitespace, then an AM/PM indicator. The problem is, when you're looking at a starting digit of 1 for the hour, you don't know whether you're dealing with hour 1, 10, 11, or 12. Without the AM/PM indicator (24 hour clock), we would have the same issue with starting digits of 0 or 1.

One way to resolve this would be to look at 304 as a special case, in which we gather 3-4 digits and then take the last 2 as minutes and the first 1-2 as the hour. 30405 could be treated the same way (5-6 digits). But then there's also 102, 106, 12006, 10206, 1022006, and so on. There are 5 variable width numeric fields and 5 corresponding fixed width numeric fields, plus 2 numeric year representations, so 5 * (2^4-1) * 3 + 5 * 2 = 235 unambiguous representations that we don't handle currently. Even more if you allow the variable width field to be somewhere other than at the front.

Still, I can envision code that could handle all those situations gracefully. One option: when encountering a single digit format specifier, verify that the remaining adjacent numeric fields are fixed width, and look ahead to count the number of digits available. If there is an extra digit, parse 2, otherwise parse 1.

If you have adjacent variable width numeric fields in the format, such as "34", this should probably be an "ambiguous layout" error, at least when parsing.

b1tfury · 2015-10-22T09:01:55Z

Instead of treating all ambiguities I suggest that getnum function is doing its job perfectly only problem I see is that we need to throw a different error when input layout for HH:MM is given as H:MM / HH:M.
As our program is failing in only these cases only. And also changing getnum will have multiple side effects.

ianlancetaylor changed the title ~~time.Parse() too greedy when parsing some formats~~ time: Parse too greedy when parsing some formats Oct 13, 2015

ianlancetaylor added this to the Unplanned milestone Oct 13, 2015

agnivade added the NeedsDecision label Nov 10, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

time: Parse too greedy when parsing some formats #12919

time: Parse too greedy when parsing some formats #12919

ksshannon commented Oct 13, 2015

ianlancetaylor commented Oct 13, 2015

ksshannon commented Oct 13, 2015

nerdatmath commented Oct 13, 2015

b1tfury commented Oct 22, 2015

time: Parse too greedy when parsing some formats #12919

time: Parse too greedy when parsing some formats #12919

Comments

ksshannon commented Oct 13, 2015

ianlancetaylor commented Oct 13, 2015

ksshannon commented Oct 13, 2015

nerdatmath commented Oct 13, 2015

b1tfury commented Oct 22, 2015