You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Regexp.SubexpIndex was added in a recent proposal (#32420).
This is a very helpful addition but doesn't address the use case where regexps are made in a top-level var block using MustCompile. The issue I have with the current implementation is that SubexpIndex can't be used alongside MustCompile without introducing the risk of deferred runtime slice bounds panics, as it returns -1 on error instead of panicking on startup.
To be used safely in a global context, you either need to bring a helper function along every time (thus negating the value of the original proposal, which was to save carrying around a helper function and encourage better practices via the stdlib), or capture the indexes in an init() and error check each one, which doesn't scan well and degrades the readability benefits of colocating the pattern compilation with the subexp index extraction.
When used naively, without checks or helpers, pattern compilation and subexp index extraction can go out of sync without being noticed, turning into a runtime panic possibly a long time after program startup:
I concede this example may look contrived when viewed in its distilled form, but this situation is very real and has occurred to me in more complex patterns when groups have moved around, or typos were introduced. Regexps are easy to mangle and can be tricky to format in Go when they capture more than a couple of small groups.
The helper function version I am currently using:
funcmustSubexpIndex(p*regexp.Regexp, namestring) int {
idx:=p.SubexpIndex(name)
ifidx<0 {
panic(fmt.Errorf("regexp %q does not have named capture %q", p, name))
}
returnidx
}
var (
pattern=regexp.MustCompile(`(?P<foo>bork)`)
patternIndex=mustSubexpIndex(pattern, "bar") // Fails at startup
)
This works, of course, and it's what I'm doing currently, but does not save us any real effort over the examples that justified the original proposal as we still need to carry a function around with us wherever we go.
The init version:
var (
pattern=regexp.MustCompile(`(?P<foo>bork)`)
patternIndex=pattern.SubexpIndex("bar")
)
funcinit() {
ifpatternIndex<0 {
panic("pattern missing named capture")
}
}
This also works, but requires us to remember to keep it all perfectly in sync and things start to get further and further away if the pattern and capturing gets more complex.
So my proposal would be to add the following API to the regexp package:
// MustSubexpIndex is like SubexpIndex but panics if SubexpIndex would
// return -1. It simplifies safe initialization of global variables holding
// subexpression indexes.
func (*Regexp) MustSubexpIndex(name string) int
I think I read somewhere a while back that there's a preference to avoid more of the Must idiom (but I couldn't find the specific reference), but in this case I think it's warranted as SubexpIndex can't be changed to panic without compromising backward compatibility. It also pairs well with MustCompile and serves part of the same use case.
The text was updated successfully, but these errors were encountered:
In general we try not to have must of every possible API, only the ones that are very likely to be used with initializers.
I understand that your use case is with an initializer, but I am not sure that is common enough to warrant even more methods on Regexp.
The same argument that justifies MustSubexpIndex would justify a Must form of essentially every regexp method.
This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group
Regexp.SubexpIndex was added in a recent proposal (#32420).
This is a very helpful addition but doesn't address the use case where regexps are made in a top-level var block using
MustCompile
. The issue I have with the current implementation is that SubexpIndex can't be used alongside MustCompile without introducing the risk of deferred runtime slice bounds panics, as it returns -1 on error instead of panicking on startup.To be used safely in a global context, you either need to bring a helper function along every time (thus negating the value of the original proposal, which was to save carrying around a helper function and encourage better practices via the stdlib), or capture the indexes in an
init()
and error check each one, which doesn't scan well and degrades the readability benefits of colocating the pattern compilation with the subexp index extraction.When used naively, without checks or helpers, pattern compilation and subexp index extraction can go out of sync without being noticed, turning into a runtime panic possibly a long time after program startup:
I concede this example may look contrived when viewed in its distilled form, but this situation is very real and has occurred to me in more complex patterns when groups have moved around, or typos were introduced. Regexps are easy to mangle and can be tricky to format in Go when they capture more than a couple of small groups.
The helper function version I am currently using:
This works, of course, and it's what I'm doing currently, but does not save us any real effort over the examples that justified the original proposal as we still need to carry a function around with us wherever we go.
The init version:
This also works, but requires us to remember to keep it all perfectly in sync and things start to get further and further away if the pattern and capturing gets more complex.
So my proposal would be to add the following API to the
regexp
package:I think I read somewhere a while back that there's a preference to avoid more of the
Must
idiom (but I couldn't find the specific reference), but in this case I think it's warranted asSubexpIndex
can't be changed to panic without compromising backward compatibility. It also pairs well withMustCompile
and serves part of the same use case.The text was updated successfully, but these errors were encountered: