Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regexp: wrong match found without capturing parens #13812

Closed
rsc opened this issue Jan 4, 2016 · 2 comments
Closed

regexp: wrong match found without capturing parens #13812

rsc opened this issue Jan 4, 2016 · 2 comments
Milestone

Comments

@rsc
Copy link
Contributor

rsc commented Jan 4, 2016

http://play.golang.org/p/n5XbC7kTDt

package main

import (
    "fmt"
    "regexp"
)

func main() {
    s := "abc"

    rePass := regexp.MustCompile("(a.*?c)|a.*?b")
    reFail := regexp.MustCompile("a.*?c|a.*?b")

    fmt.Printf("%v\n", rePass.FindStringIndex(s))
    // Writes "[0 3]"
    fmt.Printf("%v\n", reFail.FindStringIndex(s))
    // Writes "[0 2]"
}

The overall match found should not depend on capturing parens. In this case, the a.*?c branch of the alternative should take priority over the a.*?b branch in both expressions, so both should match [0 3] not [0 2].

This happens in C++ RE2 as well.

@rsc rsc added this to the Go1.7 milestone Jan 4, 2016
@rsc rsc changed the title regexp: wrong match found without regexp: wrong match found without capturing parens Jan 4, 2016
@junyer
Copy link
Contributor

junyer commented Jan 6, 2016

Summarised from commentary elsewhere:

The problem is caused by factoring of common prefixes in alternations. Complex subexpressions (e.g. involving quantifiers) aren't safe to factor because that collapses their distinct paths through the automaton.

The fix that I've proposed for RE2 is very simple. I'll port it to Go with @rsc's blessing.

@gopherbot
Copy link

CL https://golang.org/cl/18357 mentions this issue.

@rsc rsc closed this as completed in 5ccaf02 Jan 8, 2016
@golang golang locked and limited conversation to collaborators Jan 7, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants