regexp/syntax: improve printing of flags #57950
Comments
Why wouldn't you just keep the original? cc @rsc
How are you printing it? The existing code does exactly what you want: https://go.dev/play/p/R5sPE02FYKJ
We generate a single regular expression out of multiple smaller expressions. Here's an example:
Each line is treated as an alternation. The result would be
This process is also recursive in some cases, so the output from
import "regexp/syntax"
r, _ := syntax.Parse("(?i:ab*c|d?e)", syntax.ClassNL|syntax.PerlX)
println(r.String())
The result is
To finalize:
Change https://go.dev/cl/507015 mentions this issue.
I think this is a simple enough change that it doesn't need a proposal. I sent https://go-review.googlesource.com/c/go/+/507015. It will probably be in Go 1.22 (too late for the upcoming Go 1.21). You can copy the code if you want to use it ahead of then.
Wow, that's amazing @rsc! I wouldn't call that a "small change" but I'm not complaining 😄.
1.22 improves printing of regular expressions. See golang/go#57950 See https://go-review.googlesource.com/c/go/+/507015
Go 1.22 includes improved printing of flags. See golang/go#57950 See https://go-review.googlesource.com/c/go/+/507015
Go 1.22 includes improved printing of regular expression flags in regexp/syntax. See golang/go#57950 See https://go-review.googlesource.com/c/go/+/507015
Parsing and printing expressions with flags produces pretty bad output. Example:
(?i:ab*c|d?e) -> (?i:A)(?i:B)*(?i:C)|(?i:D)?(?i:E)
While the output is semantically correct, it causes the resulting expression to grow in size by a factor of approximately 3.
I am a core developer of https://github.com/coreruleset and we use Go to generate regular expressions that can become very large (hundreds of KB). Growth of an expression by a factor of 3 is not acceptable for us. The issue for us is mostly about the fold case flag (i), but other flags are affected as well, e.g.:
ab*c.|d?e. -> ab*c(?-s:.)|d?e(?-s:.)
Better would be (?:ab*c|d?e)(?-s:.) or (?-s)(?:ab*c|d?e).
I propose to improve printing of regular expressions, so that flags are only repeated if necessary.