regexp: backreference to capturing group breaks if followed by underscore #39594
Comments
This may be counter-intuitive, but if I interpret the documentation correctly, I think this is the way it is supposed to work:
So, the template in the example "$2_$1" is the same as "${2_}${1}", not "${2}_${1}".
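A minimal sketch of that parsing rule, reusing the 'foo,bar' input from the examples in this thread. The `swap` helper is hypothetical and exists only for illustration. The `Regexp.Expand` documentation states that in the `$name` form the name is taken to be as long as possible, so `$2_` is read as a reference to a group named `2_`, which does not exist and expands to the empty string.

```go
package main

import (
	"fmt"
	"regexp"
)

// swap is a hypothetical helper: it applies the given replacement
// template to the fixed input "foo,bar".
func swap(template string) string {
	re := regexp.MustCompile(`(\w+),(\w+)`)
	return re.ReplaceAllString("foo,bar", template)
}

func main() {
	// "$2_" is parsed as ${2_}; no such group exists, so it expands
	// to "", and only $1 contributes to the result.
	fmt.Println(swap("$2_$1")) // foo

	// Braces end the name explicitly, so the underscore stays literal.
	fmt.Println(swap("${2}_$1")) // bar_foo
}
```

Writing the braces is the form the package already supports for this case.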
JavaScript:
console.log('foo,bar'.replace(/(\w+),(\w+)/, '$2_$1'));
Result is bar_foo

Perl:
my $a = 'foo,bar';
$a =~ s/(\w+),(\w+)/\2_\1/;
warn $a;
Result is bar_foo

Ruby:
puts 'foo,bar'.sub(/(\w+),(\w+)/, '\2_\1')
Result is bar_foo

So, I propose to fix the behavior of Go.

diff --git a/src/regexp/all_test.go b/src/regexp/all_test.go
index be7a2e7111..7d944d4844 100644
--- a/src/regexp/all_test.go
+++ b/src/regexp/all_test.go
@@ -227,6 +227,7 @@ var replaceTests = []ReplaceTest{
 	{"(a)(((b))){0}c", ".$1.", "xacxacx", "x.a.x.a.x"},
 	{"((a(b){0}){3}){5}(h)", "y caramb$2", "say aaaaaaaaaaaaaaaah", "say ay caramba"},
 	{"((a(b){0}){3}){5}h", "y caramb$2", "say aaaaaaaaaaaaaaaah", "say ay caramba"},
+	{"(Hello)_(World)", "$2_$1", "Hello_World!", "World_Hello!"},
 }
 
 var replaceLiteralTests = []ReplaceTest{
diff --git a/src/regexp/regexp.go b/src/regexp/regexp.go
index b547a2ab97..7bab7a5d81 100644
--- a/src/regexp/regexp.go
+++ b/src/regexp/regexp.go
@@ -981,12 +981,24 @@ func extract(str string) (name string, num int, rest string, ok bool) {
 		str = str[1:]
 	}
 	i := 0
-	for i < len(str) {
-		rune, size := utf8.DecodeRuneInString(str[i:])
-		if !unicode.IsLetter(rune) && !unicode.IsDigit(rune) && rune != '_' {
-			break
+	b := str[0]
+	if !brace && '0' <= b && b <= '9' {
+		i++
+		for i < len(str) {
+			rune, size := utf8.DecodeRuneInString(str[i:])
+			if !unicode.IsLetter(rune) && !unicode.IsDigit(rune) {
+				break
+			}
+			i += size
+		}
+	} else {
+		for i < len(str) {
+			rune, size := utf8.DecodeRuneInString(str[i:])
+			if !unicode.IsLetter(rune) && !unicode.IsDigit(rune) && rune != '_' {
+				break
+			}
+			i += size
 		}
-		i += size
 	}
 	if i == 0 {
 		// empty name is not okay
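The name-scanning rule proposed in the patch can be sketched in isolation as follows. `scanName` is a hypothetical standalone helper, not a function in the regexp package, and the empty-input guard is an assumption added here so the sketch is safe to call on any string. It takes the template text that follows `$` or `${` and returns the variable name the proposed rule would pick.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// scanName is a hypothetical helper mirroring the proposed rule:
// an unbraced name that starts with a digit stops at an underscore,
// while any other name may still contain underscores.
func scanName(s string, brace bool) string {
	if s == "" {
		return "" // guard added for this sketch only
	}
	allowUnderscore := brace || s[0] < '0' || s[0] > '9'
	i := 0
	for i < len(s) {
		r, size := utf8.DecodeRuneInString(s[i:])
		isWord := unicode.IsLetter(r) || unicode.IsDigit(r)
		if !isWord && !(allowUnderscore && r == '_') {
			break
		}
		i += size
	}
	return s[:i]
}

func main() {
	fmt.Println(scanName("2_$1", false))     // 2
	fmt.Println(scanName("name_x-y", false)) // name_x
	fmt.Println(scanName("2_}", true))       // 2_
}
```

Under this rule `$1x` would still be read as `${1x}`, since only the underscore is treated differently for names that begin with a digit.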
Closing as this is consistent with how RE2 handles variables in other implementations as well.
What version of Go are you using (go version)?

Does this issue reproduce with the latest release?
Yes

What operating system and processor architecture are you using (go env)?

What did you do?
Minimal case: Play

What did you expect to see?
I expect the pattern "$2_$1" to work without needing to escape it as "${2}_$1", as in Python etc.