-
Notifications
You must be signed in to change notification settings - Fork 18k
path/filepath: confusion about semantics of Clean vs Dot-Dot #4382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Labels
Milestone
Comments
As far as I can see this is operating as expected, according to http://golang.org/pkg/path/filepath/#Clean: "Clean returns the shortest path name equivalent to path by purely lexical processing." Meaning that it only processes the given string to make it "clean", it doesn't during any part of its processing check to see if the files/folders actually exist. IE: if you did none of the file operations listed in 1-4, or did a sequence of operations which created /tmp/y/z, you would still get the same result. |
Comment 2 by s@syam.in: >> "Clean returns the shortest path name equivalent to path by purely lexical processing." The key word is <equivalent>. The problem in this case is that the returned path is not equivalent. As a user of the function, i think it is reasonable to assume that returned path is functionally equivalent to the path provided in argument. It perhaps is a bug in code and documentation too. |
Comment 3 by eam@frap.net: Avoidance of this bug in another language's libraries is explained here: http://perldoc.perl.org/File/Spec.html#METHODS It does seem to be both an implementation and documentation bug. In fact, the documentation refers to Rob Pike, “Lexical File Names in Plan 9 or Getting Dot-Dot Right,” which more or less explains why the documented behavior is incorrect for unix systems, and why ".." cannot be collapsed when operating on those path types. Plan9 changed their filesystem architecture and redefined the lexical semantics to get around this. The documentation conflicts with its own references. |
Comment 5 by eam@frap.net: No, that is not what is described here. Clean() is a lexical parse of a path and must not interact with any particular filesystem. No one is suggesting otherwise. The summary update is incorrrect. The bug (in both the documentation and the implementation) is that Clean() is improperly processing the path. A path on a unix system cannot have the /../ sequence removed in this manner, as it results in a non-equivalent path. The feature request is not to evaluate symlinks. It is to remove step #3 from both the documentation of Clean() as well as the implementation. Step 3 applies to plan9 systems, but cannot be applied to unix systems. I took a look at lib9/cleanname.c -- it appears this code was lifted from plan9 and makes assumptions about the plan9 path handling which are invalid on unix systems. This is pretty clearly a problem. |
Comment 8 by evan@squareup.com: Thanks for the pointer to EvalSymlinks. Unfortunately this function is not a lexical cleanup of a path -- it reaches out into the filesystem and reads links, which is very different (eg, cannot clean up non-local paths; orders of magnitude performance difference). It appears to me that this is a specification error and should be correctable per Go 1 API guarantee, as the documentation first guarantees an equivalent path then later describes an algorithm that cannot return equivalent paths. I would expect developers to care more about the primary guarantee of an equivalent path than a minor detail of the implementation and whether it removes a ".." sequence in some cases. |
Comment 9 by eam@frap.net: Thanks for the pointer to EvalSymlinks. Unfortunately this function is not a lexical cleanup of a path -- it reaches out into the filesystem and reads links, which is very different (eg, cannot clean up non-local paths; orders of magnitude performance difference). It appears to me that this is a specification error and should be correctable per Go 1 API guarantee, as the documentation first guarantees an equivalent path then later describes an algorithm that does not return equivalent paths. I would expect developers to care more about the primary guarantee of an equivalent path than a minor detail of the implementation and whether it removes a ".." sequence in some cases. |
This is absolutely not a specification error as per the Go 1 API guarantee. The documentation for Clean is very clear about what it does. We cannot change it now, as existing programs may depend on the behavior. I could see a case for introducing a new version of Clean with a different name that is the same as Clean but without this step: // 3. Eliminate each inner .. path name element (the parent directory) // along with the non-.. element that precedes it. but the utility of this would be limited, as filepath's Abs, Rel, Dir, Join all use Clean internally, so we would need to provide replacements for those too. |
The code was not lifted from Plan 9, although the same authors wrote the Go and Plan 9 implementations. We are aware that removing .. can in the presence of symlinks result in a different path. That is why the docs go out of their way to say that Clean only considers lexical equivalence, not actual file system equivalence. Go is not alone in this. Bash and other tools make the assumption too. See the first page of the dot-dot paper for an example. Status changed to Unfortunate. |
This issue was closed.
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Labels
by s@syam.in:
The text was updated successfully, but these errors were encountered: