Skip to content

path/filepath: confusion about semantics of Clean vs Dot-Dot #4382

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
gopherbot opened this issue Nov 14, 2012 · 12 comments
Closed

path/filepath: confusion about semantics of Clean vs Dot-Dot #4382

gopherbot opened this issue Nov 14, 2012 · 12 comments
Labels
Documentation Issues describing a change to documentation. FrozenDueToAge Unfortunate
Milestone

Comments

@gopherbot
Copy link
Contributor

by s@syam.in:

What steps will reproduce the problem?
1. mkdir -p /tmp/x/y
2. touch /tmp/x/y/z
2. ln -s /tmp/x/y /tmp/foo
3. ls -l /tmp/foo/../y/z 
4. Run the following go program 

package main
import "path/filepath"
import "fmt"

func main() {
    fmt.Println(filepath.Clean("/tmp/foo/../y/z"));
}

5. Output is /tmp/y/z which is incorrect. (it points to a file that doesn't exist)

What is the expected output?

/tmp/foo/../y/z

What do you see instead?

/tmp/y/z

Which compiler are you using (5g, 6g, 8g, gccgo)?

gccgo

Which operating system are you using?


Linux 2.6

Which version are you using?  (run 'go version')

go version go1.0.2

Please provide any additional information below.
@cookieo9
Copy link
Contributor

Comment 1:

As far as I can see this is operating as expected, according to
http://golang.org/pkg/path/filepath/#Clean:
"Clean returns the shortest path name equivalent to path by purely lexical processing."
Meaning that it only processes the given string to make it "clean", it doesn't during
any part of its processing check to see if the files/folders actually exist. IE: if you
did none of the file operations listed in 1-4, or did a sequence of operations which
created /tmp/y/z, you would still get the same result.

@gopherbot
Copy link
Contributor Author

Comment 2 by s@syam.in:

>> "Clean returns the shortest path name equivalent to path by purely lexical
processing."
The key word is <equivalent>. The problem in this case is that the returned path
is not equivalent. As a user of the function, i think it is reasonable to assume that
returned path is functionally equivalent to the path provided in argument. 
It perhaps is a bug in code and documentation too.

@gopherbot
Copy link
Contributor Author

Comment 3 by eam@frap.net:

Avoidance of this bug in another language's libraries is explained here:
http://perldoc.perl.org/File/Spec.html#METHODS
It does seem to be both an implementation and documentation bug. In fact, the
documentation refers to Rob Pike, “Lexical File Names in Plan 9 or Getting Dot-Dot
Right,” which more or less explains why the documented behavior is incorrect for unix
systems, and why ".." cannot be collapsed when operating on those path types. Plan9
changed their filesystem architecture and redefined the lexical semantics to get around
this.
The documentation conflicts with its own references.

@remyoudompheng
Copy link
Contributor

Comment 4:

The documentation for Clean is pretty clear. You seem to want to evaluate symlinks, and
there is precisely a function EvalSymlinks for that purpose.

@gopherbot
Copy link
Contributor Author

Comment 5 by eam@frap.net:

No, that is not what is described here. Clean() is a lexical parse of a path and must
not interact with any particular filesystem. No one is suggesting otherwise. The summary
update is incorrrect.
The bug (in both the documentation and the implementation) is that Clean() is improperly
processing the path. A path on a unix system cannot have the /../ sequence removed in
this manner, as it results in a non-equivalent path.
The feature request is not to evaluate symlinks. It is to remove step #3 from both the
documentation of Clean() as well as the implementation. Step 3 applies to plan9 systems,
but cannot be applied to unix systems.
I took a look at lib9/cleanname.c -- it appears this code was lifted from plan9 and
makes assumptions about the plan9 path handling which are invalid on unix systems. This
is pretty clearly a problem.

@minux
Copy link
Member

minux commented Nov 15, 2012

Comment 6:

Clean does what it is documented to do.
If you need a way to properly handle Unix paths, you might want
to try EvalSymlinks.
Besides, even if the behavior of Clean is wrong, we can't change
that without also changing its name, as per Go 1 API guarantee.

@ianlancetaylor
Copy link
Member

Comment 7:

I agree that we should not change the function.  I also agree that it is worth adding a
note to the function doc that there are cases where the result is not equivalent to the
input.

Labels changed: added priority-later, documentation, removed priority-triage.

@gopherbot
Copy link
Contributor Author

Comment 8 by evan@squareup.com:

Thanks for the pointer to EvalSymlinks. Unfortunately this function is not a lexical
cleanup of a path -- it reaches out into the filesystem and reads links, which is very
different (eg, cannot clean up non-local paths; orders of magnitude performance
difference).
It appears to me that this is a specification error and should be correctable per Go 1
API guarantee, as the documentation first guarantees an equivalent path then later
describes an algorithm that cannot return equivalent paths.
I would expect developers to care more about the primary guarantee of an equivalent path
than a minor detail of the implementation and whether it removes a ".." sequence in some
cases.

@gopherbot
Copy link
Contributor Author

Comment 9 by eam@frap.net:

Thanks for the pointer to EvalSymlinks. Unfortunately this function is not a lexical
cleanup of a path -- it reaches out into the filesystem and reads links, which is very
different (eg, cannot clean up non-local paths; orders of magnitude performance
difference).
It appears to me that this is a specification error and should be correctable per Go 1
API guarantee, as the documentation first guarantees an equivalent path then later
describes an algorithm that does not return equivalent paths.
I would expect developers to care more about the primary guarantee of an equivalent path
than a minor detail of the implementation and whether it removes a ".." sequence in some
cases.

@rsc
Copy link
Contributor

rsc commented Dec 10, 2012

Comment 10:

Labels changed: added size-m.

@adg
Copy link
Contributor

adg commented Jan 17, 2013

Comment 11:

This is absolutely not a specification error as per the Go 1 API guarantee. The
documentation for Clean is very clear about what it does. We cannot change it now, as
existing programs may depend on the behavior.
I could see a case for introducing a new version of Clean with a different name that is
the same as Clean but without this step:
//      3. Eliminate each inner .. path name element (the parent directory)
//         along with the non-.. element that precedes it.
but the utility of this would be limited, as filepath's Abs, Rel, Dir, Join all use
Clean internally, so we would need to provide replacements for those too.

@rsc
Copy link
Contributor

rsc commented Jan 31, 2013

Comment 12:

The code was not lifted from Plan 9, although the same authors wrote the Go and Plan 9
implementations. We are aware that removing .. can in the presence of symlinks result in
a different path. That is why the docs go out of their way to say that Clean only
considers lexical equivalence, not actual file system equivalence.
Go is not alone in this. Bash and other tools make the assumption too. See the first
page of the dot-dot paper for an example.

Status changed to Unfortunate.

@gopherbot gopherbot added Unfortunate priority-later Documentation Issues describing a change to documentation. labels Jan 31, 2013
@rsc rsc added this to the Go1.1 milestone Apr 14, 2015
@rsc rsc removed the go1.1 label Apr 14, 2015
@golang golang locked and limited conversation to collaborators Jun 24, 2016
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Documentation Issues describing a change to documentation. FrozenDueToAge Unfortunate
Projects
None yet
Development

No branches or pull requests

7 participants