regexp/syntax: "single character" docs are misleading #8505

Closed
gopherbot opened this issue Aug 10, 2014 · 9 comments

@gopherbot

by steven.hartland@multiplay.co.uk:

The docs for regexp/syntax ASCII character classes should make it clear that they are
only available in POSIX mode.

Simple fix: rename the section header from:
ASCII character classes:
to
POSIX character classes:

I was scratching my head for a few minutes until I looked at the code and found they
were POSIX only.
@ianlancetaylor
Contributor

Comment 1:

As far as I can tell the character classes do work by default, even though the default
is Perl mode.  Can you show an example of a failing program?

Status changed to WaitingForReply.

@gopherbot
Author

Comment 2 by steven.hartland@multiplay.co.uk:

Here's an example:
http://play.golang.org/p/uX83Si_90p
The output is:
Match by Set (Perl): true
Match by Class (Perl): false
Match by Class (POSIX): true
So the character class, in this case [:xdigit:], only works in POSIX mode, not in Perl
mode.

@gopherbot
Author

Comment 3 by steven.hartland@multiplay.co.uk:

Here's an example:
http://play.golang.org/p/uX83Si_90p
The output is:
Match by Set (Perl): true
Match by Class (Perl): false
Match by Class (POSIX): true
So the character class, in this case [:xdigit:], only works in POSIX mode, not in Perl
mode.

@ianlancetaylor
Contributor

Comment 4:

I'm sorry, I don't understand how your test case demonstrates that.  What I'm looking
for is a test case where you use the same regexp for Compile and CompilePOSIX, but get
different results.
What your test case appears to demonstrate is that [:xdigit:] is not equivalent to
[0-9A-Fa-f], which may be a bug but is not the same as the bug you are reporting.
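A minimal sketch of the kind of test case described above: the same pattern compiled with both regexp.MustCompile (Perl mode) and regexp.MustCompilePOSIX, then matched against the same input. The helper name matchBoth and the sample input "c0ffee" are assumptions made for illustration; they do not come from the playground links in this thread.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchBoth (hypothetical helper) compiles the same pattern in Perl mode
// and in POSIX mode and reports whether each compiled regexp matches s.
func matchBoth(pattern, s string) (perl, posix bool) {
	perl = regexp.MustCompile(pattern).MatchString(s)
	posix = regexp.MustCompilePOSIX(pattern).MatchString(s)
	return perl, posix
}

func main() {
	patterns := []string{
		`^[[:xdigit:]]+$`, // named class nested inside a bracket expression
		`^[:xdigit:]+$`,   // bare form: a set of the literal characters : x d i g t
	}
	for _, p := range patterns {
		perl, posix := matchBoth(p, "c0ffee")
		fmt.Printf("%-18s Perl: %v  POSIX: %v\n", p, perl, posix)
	}
}
```

If the two modes really differed for named classes, the Perl and POSIX columns would disagree for at least one pattern.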

@gopherbot
Author

Comment 5 by steven.hartland@multiplay.co.uk:

Sorry, that was a cut and paste error, which caused my initial confusion as to the real
nature of the issue.
On further investigation it appears that the actual problem is that "named" ASCII
character classes can only be used within a character class (a bracket expression).
So if you want to match an xdigit you must use [[:xdigit:]] and not just [:xdigit:].
Example:
http://play.golang.org/p/iGziut5Vi6
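As a rough illustration of the point above (not a copy of the playground program), the sketch below contrasts the nested form with the bare form. The bare pattern compiles without error because it is parsed as an ordinary bracket expression containing the literal characters ':', 'x', 'd', 'i', 'g' and 't'. The variable and function names are assumptions chosen for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Named class nested inside a bracket expression: matches hex digits.
	nestedXdigit = regexp.MustCompile(`^[[:xdigit:]]+$`)
	// Bare form: an ordinary set of the literal characters : x d i g t.
	bareXdigit = regexp.MustCompile(`^[:xdigit:]+$`)
)

// isHex reports whether s consists only of hexadecimal digits.
func isHex(s string) bool { return nestedXdigit.MatchString(s) }

// matchesBare reports whether s matches the bare, unnested pattern.
func matchesBare(s string) bool { return bareXdigit.MatchString(s) }

func main() {
	for _, s := range []string{"ff", "dig:x"} {
		fmt.Printf("%-6q nested: %v  bare: %v\n", s, isHex(s), matchesBare(s))
	}
}
```

The input "dig:x" is included only to show what the bare pattern actually accepts: strings made of the letters in the class name plus a colon.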
The confusion comes from regexp/syntax docs where in the "Single characters:" block at
the top it lists items such as:
\d             Perl character class
[:alpha:]      ASCII character class
\d can be used directly in a regexp, but apparently named ASCII character classes such
as [:alpha:] cannot.
Additionally in the docs there is:
Perl character classes:
\d             digits (== [0-9])
...
ASCII character classes:
[:digit:]      digits (== [0-9])
Here we see the Perl character class \d, which can be used directly, documented as being
identical to the ASCII character class [:digit:], which cannot be used directly and
needs enclosing brackets.
An example of this is:
http://play.golang.org/p/240L99E6F4
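The difference can be sketched roughly as follows, assuming the default Perl mode of regexp.MustCompile. This is not the playground program itself; the function name digitForms and the sample inputs are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	perlDigit   = regexp.MustCompile(`^\d+$`)          // Perl class, usable directly
	nestedDigit = regexp.MustCompile(`^[[:digit:]]+$`) // ASCII class inside brackets
	bareDigit   = regexp.MustCompile(`^[:digit:]+$`)   // literal set : d i g t
)

// digitForms reports whether s matches each of the three spellings.
func digitForms(s string) (perl, nested, bare bool) {
	return perlDigit.MatchString(s), nestedDigit.MatchString(s), bareDigit.MatchString(s)
}

func main() {
	for _, s := range []string{"2014", "digit"} {
		perl, nested, bare := digitForms(s)
		fmt.Printf("%-8q \\d: %v  [[:digit:]]: %v  [:digit:]: %v\n", s, perl, nested, bare)
	}
}
```

Here \d and [[:digit:]] agree on both inputs, while the bare [:digit:] accepts "digit" and rejects "2014".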
In Stefan Schroeder's Golang Regex Tutorial:
https://github.com/StefanSchroeder/Golang-Regex-Tutorial/blob/master/01-chapter1.markdown
He clearly states this quirk of ASCII Character classes with: "Note that you have to
wrap an ASCII character class in []. "
Hope this clarifies the issue and sorry again for my initial inaccurate report.

@ianlancetaylor
Contributor

Comment 6:

Thanks for sorting this out.  I agree that the docs seem rather misleading by listing
[:alpha:] as a single character, when in fact it can only appear in a character class.

Labels changed: added repo-main, release-go1.4.

Status changed to Accepted.

@rsc
Contributor

rsc commented Aug 12, 2014

Comment 7:

Thanks. This is also re2 issue #116.

@gopherbot
Author

Comment 8:

CL https://golang.org/cl/155890043 mentions this issue.

@rsc
Contributor

rsc commented Oct 6, 2014

Comment 9:

This issue was closed by revision 85fd0fd.

Status changed to Fixed.

@rsc rsc added this to the Go1.4 milestone Apr 14, 2015
@rsc rsc removed the release-go1.4 label Apr 14, 2015
@golang golang locked and limited conversation to collaborators Jun 25, 2016
wheatman pushed a commit to wheatman/go-akaros that referenced this issue Jun 25, 2018
Generated using re2/doc/mksyntaxgo.

Fixes golang#8505.

LGTM=iant
R=r, iant
CC=golang-codereviews
https://golang.org/cl/155890043
This issue was closed.