regexp: MustCompile does not panic on invalid UTF-8 sequences #19173

Closed
mholt opened this issue Feb 18, 2017 · 1 comment
mholt commented Feb 18, 2017

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.8 darwin/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/matt/Dev"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/86/75b471wx6sn8bjtd4jz48kbm0000gn/T/go-build041041257=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"

What did you do?

This was reported earlier and might be a bug, or at least unexpected behavior, in regexp.MustCompile:

https://play.golang.org/p/5X4yiZrUza

package main

import (
	"fmt"
	"regexp"
)

func main() {
	test_data := []byte{0x03, 0x65, 0xd0}
	pattern1 := regexp.MustCompile(`\x03\x65`)
	pattern2 := regexp.MustCompile(`\x03\x65\xd0`)

	// both patterns can be compiled and both should match
	if pattern1.Match(test_data) {
		fmt.Printf("Pattern1 matched\n")
	} else {
		fmt.Printf("Pattern1 failed\n")
	}

	if pattern2.Match(test_data) {
		fmt.Printf("Pattern2 matched\n")
	} else {
		fmt.Printf("Pattern2 failed\n")
	}
}

This might be related to #11185, but this test case is set up slightly differently.

What did you expect to see?

A panic, I guess? My encoding-fu is a bit weak but I think \x03\x65\xd0 is invalid UTF-8.

What did you see instead?

Pattern1 matched
Pattern2 failed

mholt commented Feb 18, 2017

D'oh. The patterns are written as raw string literals (backticks) rather than double-quoted strings, so the \x escapes reach the regexp parser unprocessed. Changing them to double quotes triggers the panic as expected: https://play.golang.org/p/qJMgs0xFxU

Sorry for the noise!

mholt closed this as completed Feb 18, 2017
golang locked and limited conversation to collaborators Feb 18, 2018