Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/gopls: need a theory of URls #64147

Closed
adonovan opened this issue Nov 15, 2023 · 6 comments
Closed

x/tools/gopls: need a theory of URls #64147

adonovan opened this issue Nov 15, 2023 · 6 comments
Assignees
Labels
gopls Issues related to the Go language server, gopls. Tools This label describes issues relating to any tools in the x/tools repository.
Milestone

Comments

@adonovan
Copy link
Member

adonovan commented Nov 15, 2023

Recently we finally got rid of almost everything in the golang.org/x/tools/gopls/internal/span package. Only span.URI remains, and I would like to get rid of it too, either by deleting it, moving it to a sensible location, or fusing it with something else. The question is: what does the span.URI abstraction represent? The disappointing current answer is: "just a URI"; it needn't be a "file:" URI, nor need it be non-empty. Calling URI.Filename on an invalid or non-file-scheme span.URI panics.

The protocol.DocumentURI type represents a URI of an editor document. In practice it is always a "file:" scheme URI (though this isn't required by the LSP docs). However, according to those docs, it may be URL-encoded, so colons in Windows drive letters do not appear literally, and there is evidence in the code that it may contain "file://" URIs with only 2 slashes (not 3, and no hostname). So, there is a logical need to convert DocumentURIs as they arrive in requests into some kind of processed internal form, such as span.URI, and this operation is fallible, so its errors must be handled.

This suggests that we should retain the distinction between URI types and audit the codebase (possibly with the help of a static analysis in the style of the safetoken tests) for conversions of DocumentURI to span.URI without validation. Should span.URI, renamed to something suitable like protocol.FileURI, be made stricter about the invariant that it represents a valid "file:" URL? It's a large change, and the status quo doesn't currently cause any obvious problems, but currently there is little rhyme or reason to when we validate DocumentURIs.

@findleyr

@gopherbot gopherbot added the gopls Issues related to the Go language server, gopls. label Nov 15, 2023
@findleyr
Copy link
Contributor

I think there are the following three categories:

  • An arbitrary URI, which may have non-file scheme (a protocol.URI)
  • A user-specified file URI (a protocol.DocumentURI)
  • A normalized file URI (a span.URI)

These categorize do make sense to me. However, I'm not sure how well we honor them.
One real problem with enforcing this in the type system is that it leads to a proliferation of internal types that look almost the same as the protocol types. An alternative that we've discussed in the past would be to enforce the normalization of protocol.DocumentURIs at API boundaries. As a thought experiment, suppose that inside the protocol layer we mutated DocumentURIs to be canonical (keeping a record of the mapping involved in this mutation), and then inverted this mapping inside any response objects. Assuming the mapping were bijective, this would work, and we could just assume that DocumentURIs are canonical inside the server. Is an aspect-oriented design like this achievable?

@adonovan
Copy link
Member Author

As a thought experiment, suppose that inside the protocol layer we mutated DocumentURIs to be canonical (keeping a record of the mapping involved in this mutation), and then inverted this mapping inside any response objects. Assuming the mapping were bijective, this would work, and we could just assume that DocumentURIs are canonical inside the server.

Funny, I was thinking along those lines too! You could imagine a simple reflective traversal over the params JSON object that mutated the DocumentURI fields in place to normalize %XX and file://. I don't think it's actually necessary to encode the results in the opposite way: my impression from the docs is that VS Code can handle "correct" URLs but it emits a mixture of correct and overescaped ones. (The whole matter is another clear case of VS Code externalizing their bugs.)

Then we could use DocumentURI everywhere. Let's try it out. (How do we test it?)

A few little helpers (like filepath.Dir and Base and Join) for URLs would also go a long way to avoiding the need for conversions.

@findleyr
Copy link
Contributor

A few little helpers (like filepath.Dir and Base and Join) for URLs would also go a long way to avoiding the need for conversions.

Yes, this has come up in the past (I thought we had an issue for a 'uripath' library). (Aside: I suppose it's not fair to say "this has come up in the past", because it feels like everything has come up in the past!).

It's a good idea. I just never had the patience to convince myself of the algorithmic correctness.

@adonovan
Copy link
Member Author

A few little helpers (like filepath.Dir and Base and Join) for URLs would also go a long way to avoiding the need for conversions.

Yes, this has come up in the past (I thought we had an issue for a 'uripath' library). (Aside: I suppose it's not fair to say "this has come up in the past", because it feels like everything has come up in the past!).

It's a good idea. I just never had the patience to convince myself of the algorithmic correctness.

The implementations can convert to and from strings and call filepath, just like we do today. At least it'll centralize any existing bugs in one place.

@bcmills
Copy link
Contributor

bcmills commented Nov 15, 2023

See also:

@gopherbot
Copy link

Change https://go.dev/cl/542616 mentions this issue: gopls/internal/span: identify span.URI and protocol.DocumentURI

@seankhliao seankhliao changed the title gopls: need a theory of URls x/tools/gopls: need a theory of URls Nov 15, 2023
@gopherbot gopherbot added the Tools This label describes issues relating to any tools in the x/tools repository. label Nov 15, 2023
@gopherbot gopherbot added this to the Unreleased milestone Nov 15, 2023
@suzmue suzmue modified the milestones: Unreleased, gopls/backlog Nov 16, 2023
@adonovan adonovan self-assigned this Nov 16, 2023
@findleyr findleyr modified the milestones: gopls/backlog, gopls/v0.15.0 Nov 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
gopls Issues related to the Go language server, gopls. Tools This label describes issues relating to any tools in the x/tools repository.
Projects
None yet
Development

No branches or pull requests

5 participants