-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: go/doc: ignore leading and trailing whitespace in the directive arguments #68092
Comments
Similar Issues (Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.) |
The current gc compiler requires at least one space character after the directive name. However, for directives that take arguments, the compiler consistently treats a sequence of whitespace characters (per For what it's worth the current version of gccgo is stricter: it ignores sequences of space and tab characters, but does not permit arbitrary Unicode whitespace characters. So if we made this change, one compatibility concern would be people who wrote a directive followed by a tab character. Such directives would have their expected meaning in a new version of Go, and would be ignored by an old version of Go. This could potentially lead to confusion if someone wrote code for a newer version and then compiled it with an older version. I don't think that is worth worrying about but I want to call it out. The other compatibility concern if we adopt the proposal precisely as written is that currently Unicode whitespace characters like U+00A0 are treated as space, but this proposal suggests just space and tab. So I do think that the regexp we use, if we change this, should replace |
I'm certainly fine with |
So the API is now looking like: package ast // "go/ast"
func ParseDirective(line string) (Directive, bool)
type Directive struct {
Tool string // may be "" if Name is "line", "extern", "export"
Name string
Args string // may contain whitespace
} and the question is how to further specify the syntax so that we know when to split It seems like #68092 should be folded into that discussion. (Can we go further and normalize all medial runs of whitespace to a single Can someone collect non-Go directives from the module proxy so we can see how people are defining/using directives in the real world? |
It is important that leading spaces are not permitted, as otherwise routine paragraph-fill operations on typical comment text could easily cause a comment to trigger a spurious match.
I agree, this is a little confusing. It would be nice if the regular expression's subgroups corresponded to the three components. On that topic, it says:
It should be clarified that it means a left-anchored match (so directive comments cannot be indented) but not a right-anchored one. Adding a caret would address this: There are several parts to this proposal:
Changes would be required to cmd/compile, gccgo, and ast.ParseDirective. I think the risk that @ianlancetaylor flags is real but not worth worrying about. |
Sorry for any confusion: I only ever meant leading the argument not the directive itself.
|
A related question is the argument-leading whitespace required before an argument? The current regex only defines a prefix for custom directives. Consider |
Proposal Details
#51082 and #37974 standardized Go directive comments as line comments whose content matches
^(line |extern |export |[a-z0-9]+:[a-z0-9])(.*)$
where the first group is the directive and the second group is the argument to the directive.This makes leading and trailing whitespace part of directive comments.
Trailing whitespace is almost always removed and should never be semantic so it is already de facto not part of the argument.
One space is required for the old (line/export/extern) directives and additional leading whitespace is presumably ignored [have not tested this, will update if incorrect].
Someone writing a general
//x:y
directive in their code will likely add whitespace between the directive and argument even if not syntactically required so anyone parsing directives will likely ignore at least one leading space regardless.Someone reading the directive
//x:yZ
would assume thatx:yZ
is the directive name rather than it is the directivex:y
with argumentZ
so having whitespace here would clarify.I propose changing the argument syntax from
.*$
to[ \t]*([^ \t]*)[ \t]*$
where the leading and trailing whitespace is ignored.This would allow two things
The text was updated successfully, but these errors were encountered: