Skip to content
This repository has been archived by the owner on Jun 7, 2022. It is now read-only.

Support cross-repository go-to-definition #8

Closed
felixfbecker opened this issue Nov 9, 2018 · 13 comments
Closed

Support cross-repository go-to-definition #8

felixfbecker opened this issue Nov 9, 2018 · 13 comments
Labels

Comments

@felixfbecker
Copy link
Contributor

felixfbecker commented Nov 9, 2018

  • Extension makes go-to-definition request
  • Server detects definition URI is in node_modules (location will point to a .d.ts file)
  • Server reads and parses package.json
  • Server maps package to clone URL and rev from the package.json repository and gitHead fields (gitHead requires an npm registry request because of support gitHead property in package.json yarnpkg/yarn#2978)
  • Server resolves cloneURL and rev to a Sourcegraph raw target root URI (needs new API, e.g. https://sourcegraph.com/.api/[email protected]/sourcegraph/sourcegraph&rev=abc123)
  • Server determines repo-root-relative package.json location from directory field proposed in npm RFC Add Monorepo Subdirectory Declaration RFC npm/rfcs#19)
    • If not present
      • MVP: assume root of repo (works for most libraries)
      • search all package.jsons in repo for the one with the correct name field (TODO: how? maybe raw API like /raw/**/package.json)
  • Server reads and parses .d.ts.map file for the definition URI
  • Server maps definition location in .d.ts to position in .ts file (using source-map library)
  • Server resolves package-root-relative .ts. location from package root in node_modules
  • Server resolves target URI from target root URI, repo-root-relative package.json location and package-root-relative .ts location
  • Server returns target URI to extension
  • Extension returns target URI to Sourcegraph
@nicksnyder
Copy link
Contributor

Server resolves cloneURL and rev to a Sourcegraph raw target root URI (needs new API, e.g. https://sourcegraph.com/.api/[email protected]/sourcegraph/sourcegraph&rev=abc123) ⚠️

I assume this is primarily necessary to support private repos dependencies? Public repositories could just be cloned directly from the code host, right?

If this is the constraint, then instead of having a dependency back to Sourcegraph could the requirement be that the user configured the server container to have the appropriate git credentials for cloning private things (e.g. .netrc, SSH keys, etc.)?

@felixfbecker
Copy link
Contributor Author

Please look at the bold terms - the target root URI that is the result of that step is used to form the final definition URI in the last step, so Sourcegraph can navigate to it. It is not used anywhere to fetch contents from that repo, so additional credentials for cloning would not help. We just need to tell the client "the definition is at this location". The only alternative I see to this is return a made-up URI (not a URL) that only has meaning to Sourcegraph, like our legacy repo URI git://github.com/some/repo?rev#file. I prefer the way in this proposal, because the URI returned is a retrievable URL that can be used by any client. That the public Sourcegraph resolve API is used to resolve that URL is just an implementation detail, it could be any other public service that provides the same functionality. We wouldn't even have to strictly call it and just return the resolve call URL which would redirect, but resolving it first and returning the target is cleaner.

@nicksnyder
Copy link
Contributor

Can you give an example of what a target root URI looks like?

@felixfbecker
Copy link
Contributor Author

It's a URL pointing to the Sourcegraph raw API for the repo root. E.g.
https://[email protected]/github.com/sourcegraph/javascript-typescript-langserver/-/raw/

@nicksnyder
Copy link
Contributor

Given the flow above (if I understand correctly), I think I would favor the URI approach to an extra round trip. In other words, delete this step:

  • Server resolves cloneURL and rev to a Sourcegraph raw target root URI (needs new API, e.g. https://sourcegraph.com/.api/[email protected]/sourcegraph/sourcegraph&rev=abc123) ⚠️

Modify these steps:

  • Server resolves temp target URI from cloneURL, rev, repo-root-relative package.json location and package-root-relative .ts location
  • Server returns temp target URI to extension
  • Extension returns temp target URI to Sourcegraph
  • Sourcegraph translates temp target URI to the target URI that would have been returned by the proposed endpoint.

@felixfbecker
Copy link
Contributor Author

So you would return https://sourcegraph.com/.api/[email protected]/sourcegraph/sourcegraph&rev=abc123&path=some/file.ts? Or what is temp target URI?
Can you elaborate why you favour this?
It would be more complex because now the client needs to know that some URLs need to be called first to be resolved

@nicksnyder
Copy link
Contributor

I must be missing some understanding, but just reading the issue description it seems like the proposed flow is:

  • network_call(cloneURL, rev) returns a target root URI
  • server_local_resolution(target root uri, repo-root-relative package.json location, package-root-relative .ts location) returns a target URI
  • network_call(target URI) to send to server

If we treat these like variables in algebra, then it seems like we could eliminate the first network call.

  • server_local_resolution(cloneURL, rev, repo-root-relative package.json location, package-root-relative .ts location) returns a target URI 2
  • network_call(target URI 2) to send to Sourcegraph

The only reason I call it target URI 2 is to clarify that this is a potentially different URI from the original proposal.

@felixfbecker
Copy link
Contributor Author

felixfbecker commented Nov 10, 2018

You can eliminate the first network call by returning all context needed for the computation of the final URI and shifting the responsibility to do that computation to the client. You would have to put clone URL, rev and repo-root relative .ts location all into the target URI (2).

That means either return the HTTP resolve API URL, e.g.
a) https://sourcegraph.com/.api/[email protected]/sourcegraph/sourcegraph&rev=abc123&path=some/file.ts
or b) a special URI like repo:[email protected]/sourcegraph/sourcegraph:rev:some/file.
Which option did you have in mind?

@nicksnyder
Copy link
Contributor

(b) makes the most sense to be between the two because it seems strictly simpler.

I wouldn't hesitate to use regular query params so you don't have to do special parsing with the :
repo:[email protected]/sourcegraph/sourcegraph?rev=abc&path=some/file.ts

@felixfbecker
Copy link
Contributor Author

That is basically what we are currently doing with git://repo?rev#file URIs (e.g. the TypeScript langserver currently returns such a URL for xrepo go-to-def).

The goal is to remove Sourcegraph-specific things from public API of language servers. That's the reason we are moving from git: URIs to HTTP raw API URLs, because HTTP is not Sourcegraph-specific, and every client knows HTTP and can make a GET request to the URL without having to parse the URL in any way or know anything about Sourcegraph / that it is a sourcegraph.com URL.
But with returning a Sourcegraph-specific custom URI like repo:[email protected]/sourcegraph/sourcegraph?rev=abc&path=some/file.ts, the client needs to understand this kind of URI for it to be useful. Basically my thinking is "if the URI is not useful to all clients there is no point in returning it at all". I am also thinking of this not just for TypeScript but as a blue print for all other languages.

@nicksnyder
Copy link
Contributor

Sorry, I am still catching up in understanding here. I now understand the desire to return a URL so I would change my answer to (a), and I actually like your idea to have it redirect instead of resolving it a priori (I am curious to hear why you think the redirect isn't clean).

For example:
https://sourcegraph.com/.api/[email protected]/sourcegraph/sourcegraph&rev=abc123&path=some/file.ts
redirects to a URL like
https://[email protected]/github.com/sourcegraph/sourcegraph/-/raw/some/file.ts?rev=abc123 (I don't know what this API actually looks like)

I guess my question about your proposal is "how does the language server know to contact https://sourcegraph.com/.api/resolve. I assume this base URL would need to be advertised by the LSP client (e.g. Sourcegraph extension) somehow and the language server would need to know what query params it is allowed to add (e.g. clone_url, rev, path)? Can you explain the details here?

This sounds like it would be a "Sourcegraph proposed extension to LSP" as a general way to resolve cross repo definitions. The fact that the language server ends up calling a Sourcegraph endpoint isn't really the problem. The design problem is can we spec this out in a way that feels like a natural extension to LSP that anyone could use and isn't Sourcegraph specific (I think we can).

@felixfbecker
Copy link
Contributor Author

Thanks for staying with me!
My thinking is that if we return a URL that redirects to the canonical URL, then why should the server not resolve it first to the canonical URL? I don't see a reason why it shouldn't. The Sourcegraph extension eventually needs the raw-API URL, because that is the canonical form which includes the repo name to jump to. So it will have to do a HEAD on the URL to resolve it. But since the majority of URLs already point to the raw API, the extension would have to parse and inspect every single URL to determine whether it needs to resolve it (i.e. check whether it points to the /resolve endpoint) and then do a HEAD request to resolve the real location. Alternatively, it could just HEAD all URLs it receives, but that would be a huge perf overhead and be a noop in 99% of the cases.
The server on the other hand already knows that this is a /resolve URL that is not canonical, so it can easily resolve it without the parsing/inspecting that the extension would have to do.
For other clients, it is also hard to consult a cache for the URI since there may already be an item in the cache with the canonical version but the client doesn't necessarily know that the URI returned is not canonical.

I guess my question about your proposal is "how does the language server know to contact https://sourcegraph.com/.api/resolve. I assume this base URL would need to be advertised by the LSP client (e.g. Sourcegraph extension) somehow and the language server would need to know what query params it is allowed to add (e.g. clone_url, rev, path)? Can you explain the details here?

It would know that Sourcegraph exposes a public API to resolve clone URLs and uses it internally as an implementation detail.

This sounds like it would be a "Sourcegraph proposed extension to LSP" as a general way to resolve cross repo definitions. The fact that the language server ends up calling a Sourcegraph endpoint isn't really the problem. The design problem is can we spec this out in a way that feels like a natural extension to LSP that anyone could use and isn't Sourcegraph specific (I think we can).

No, Sourcegraph extensions to LSP are forbidden in the new world. We tried speccing these out to feel like "natural extensions to LSP" in the past and they never got adoption. We need to work with what LSP already offers, one of which is arbitrary URIs.

@nicksnyder
Copy link
Contributor

Ok, I see the value in returning a canonical url, so not using a redirect seems like a good solution.

It would know that Sourcegraph exposes a public API to resolve clone URLs and uses it internally as an implementation detail.

Do you think a non-Sourcegraph language server author would merge a PR that embeds a Sourcegraph URL? This seems strictly more Sourcegraph specific than my proposal to define a (hopefully) simple capability for doing this generally (i.e. this would support anyone providing a cross reference definition service).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

2 participants