Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

path,path/filepath: clarify that patterns don't conform to the POSIX Shell spec #23456

Closed
mvdan opened this issue Jan 16, 2018 · 8 comments
Closed
Labels
Documentation FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@mvdan
Copy link
Member

mvdan commented Jan 16, 2018

What version of Go are you using (go version)?

go version devel +7e054553ad Tue Jan 16 15:11:05 2018 +0000 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What did you do?

Use some shell patterns with path.Match.

https://play.golang.org/p/aTUkgwmS-Z6

What did you expect to see?

true <nil>
true <nil>
true <nil>
true <nil>

What did you see instead?

true <nil>
false syntax error in pattern
false syntax error in pattern
false <nil>

For reference, this is a program one can use with Bash to reproduce the expected output:

$ cat f3.sh
[[ '[' == [[ab] ]]; echo $?
[[ ']' == []ab] ]]; echo $?
[[ '-' == [-ab] ]]; echo $?
[[ '\' == [\\ab] ]]; echo $?
$ bash f3.sh
0
0
0
0

There are three small issues here. I'll be quoting the POSIX spec that covers bracket expressions, as seen in 9.3.5: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05

In order:

  • ] should not have special meaning if it's the first in the list. This is not done properly, as can be seen by the []ab] test case. Quoting:

The ( ']' ) shall lose its special meaning and represent itself in a bracket expression if it occurs first in the list (after an initial ( '^' ), if any). Otherwise, it shall terminate the bracket expression

  • - should not have special meaning if it's the first or last in the list. The relevant test case is [-ab]. Quoting:

The character shall be treated as itself if it occurs first (after an initial '^', if any) or last in the list, or as an ending range point in a range expression. As examples, the expressions "[-ac]" and "[ac-]" are equivalent

  • Escaping should not happen within a bracket expressions. The relevant test case is [\ab] (Bash source needs double quoting to get the correct pattern). Quoting:

The special characters '.', '*', '[', and '\' ( , , , and , respectively) shall lose their special meaning within a bracket expression.

I know that not all POSIX Shell pattern matching rules are implemented, like collating symbols and locale. However, these seem like fairly basic stuff that would be easy to fix. I am more than happy to submit patches to fix these in Go 1.11.

The first two changes would simply fix some patterns that used to error. All programs working today would continue to work.

The third change is a bit trickier. If someone is using [\ab] as a pattern nowadays, it currently means "a, or b". But with the change, it would then mean "\, or a, or b" (as in a shell pattern match). This could potentially break some programs.

I would argue that we should apply the third change anyway. The change can be carefully noted in the release notes, and the previous behavior is likely confusing users that expect pattern matching to work like in the shell. I realise that the syntax is specified in the godoc, but still.

@titanous titanous added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Jan 16, 2018
@titanous titanous added this to the Go1.11 milestone Jan 16, 2018
@mvdan
Copy link
Member Author

mvdan commented Jan 16, 2018

I forgot to mention that this also likely applies to its counterpart in filepath. Any applied changes could be mirrored.

@robpike
Copy link
Contributor

robpike commented Jan 16, 2018

The implementation and documentation of Match make no claim to be POSIX-compatible. The patterns it accepts are fully defined, and if the pattern is bad, it reports an error.

Match is working as intended, in my opinion. Moreover, changing its behavior would be a breaking change. You are free to provide an alternate implementation outside the standard library.

@mdempsky
Copy link
Member

I agree that path.Match is working as documented.

I think the godoc mentioning "the shell file name pattern" might be misleading though. I think it's understandable that users may think "the shell" would refer to their OS's shell (particularly for package path/filepath, which behaves "in a way compatible with the target operating system-defined file paths").

Perhaps replace "shell" with "glob" (ErrBadPattern refers to "globbing patterns")? Or just remove "shell" altogether and refer to them simply as "file name patterns"?

@robpike
Copy link
Contributor

robpike commented Jan 16, 2018

@mvdan
Copy link
Member Author

mvdan commented Jan 16, 2018

Yes, that CL is somewhat related. One made me come up with the other.

I thought about simply changing the godoc. But I was wondering if it was simply buggy software, given how the godoc clearly mentioned shells.

Dropping "shell" is an option, but that could make the APIs harder to find for users. Perhaps a distinction saying that this is a different syntax than POSIX/bash would be enough. That warning underneath the syntax definition in the godoc would have left it clear to me that the differences are by design.

I think glob is a bad term to settle for, though. See the CL for the reasoning.

@0xmohit
Copy link
Contributor

0xmohit commented Jan 17, 2018

This came up earlier too: #13396

@mvdan
Copy link
Member Author

mvdan commented Jan 17, 2018

Thanks, @0xmohit. It seems clear that the changes will be rejected unanimously, then.

In that case, I propose to add the clarification that I mentioned above. I'll modify the CL to include it, and repurpose the issue.

@mvdan mvdan changed the title path: Match breaks with the POSIX Shell spec in various ways path,path/filepath: clarify that patterns don't conform to the POSIX Shell spec Jan 17, 2018
@mvdan mvdan self-assigned this Jan 19, 2018
@rsc
Copy link
Contributor

rsc commented Jan 29, 2018

path doesn't even claim to be for file systems. filepath defines the pattern precisely. It doesn't need to say its not POSIX. It also doesn't say its not Windows.

@rsc rsc closed this as completed Jan 29, 2018
@golang golang locked and limited conversation to collaborators Jan 29, 2019
@rsc rsc unassigned mvdan Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Documentation FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

7 participants