regexp: CompilePOSIX does not appear to match leftmost-longest #9684

Closed
rogpeppe opened this issue Jan 25, 2015 · 5 comments

Comments

@rogpeppe
Contributor

Run the following program:

package main

import (
    "fmt"
    "regexp"
)

func main() {
    pat := regexp.MustCompilePOSIX(`^([^:=]*)(:|:=)(.*)$`)
    m := pat.FindStringSubmatch("x:=y")
    fmt.Printf("%q\n", m)
}

It prints:

["x:=y" "x" ":" "=y"]

I would expect it to print

["x:=y" "x" ":=" "y"]

choosing the longer of the two alternatives, as for example
acme's regexp engine does.

@cznic
Contributor

cznic commented Jan 25, 2015

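FWIW, a "normalized" form of the pattern (close to, though not strictly equivalent to, the original: :?= matches "=" or ":=", whereas :|:= matches ":" or ":=")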

package main

import (
    "fmt"
    "regexp"
)

func main() {
    pat := regexp.MustCompilePOSIX(`^([^:=]*)(:?=)(.*)$`)
    m := pat.FindStringSubmatch("x:=y")
    fmt.Printf("%q\n", m)
}

prints

["x:=y" "x" ":=" "y"]

(http://play.golang.org/p/2ogIBqdnpd)

@mattn
Member

mattn commented Jan 26, 2015

AFAIK, the alternatives in a match group are tried in the order specified, not longest first. At least, Perl seems to use the specified order for ^([^:=]*)(:|:=)(.*)$

@rogpeppe
Contributor Author

My understanding of POSIX leftmost-longest is that it should pick the longest possible alternative. Switching the alternatives in the example from this issue does make it pick a different alternative, which doesn't seem right.
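The effect of switching the alternatives can be reproduced with a small sketch like the one below. The helper name separator is hypothetical; it returns the second capture group, which is the separator in the pattern from this issue.

```go
package main

import (
	"fmt"
	"regexp"
)

// separator is a hypothetical helper: it compiles pattern in POSIX mode
// and returns the second capture group of the match against input.
func separator(pattern, input string) string {
	m := regexp.MustCompilePOSIX(pattern).FindStringSubmatch(input)
	if m == nil {
		return ""
	}
	return m[2]
}

func main() {
	// Same set of alternatives, listed in different order.
	fmt.Printf("%q\n", separator(`^([^:=]*)(:|:=)(.*)$`, "x:=y")) // ":"
	fmt.Printf("%q\n", separator(`^([^:=]*)(:=|:)(.*)$`, "x:=y")) // ":="
}
```

Both patterns match the whole input, so the overall match has the same length either way; only the submatch boundaries differ.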

@mattn
Member

mattn commented Jan 26, 2015

Vim's regexp engine also tries alternatives in the order given.

@rsc
Contributor

rsc commented Apr 10, 2015

Yes, leftmost-longest only applies to the overall match. Within the match, submatches follow the "first match" semantics of Perl. When POSIX mandated the submatch leftmost-longest rule they had no idea how to implement it without exponential time. It's not trivial and not worth it.

(It's possible but no one does except maybe Haskell.)

@rsc rsc closed this as completed Apr 10, 2015
@golang golang locked and limited conversation to collaborators Jun 25, 2016