Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always load Rascal modules and Java classes of std from the current rascal (attempt 2) #2112

Open
wants to merge 4 commits into
base: replace-lib-by-mvn-and-others
Choose a base branch
from

Conversation

sungshik
Copy link
Contributor

@sungshik sungshik commented Dec 23, 2024

Overview

This PR implements the following approach for loading Rascal modules of rascal:

/**
* Adds the source folders of the `rascal` project to `srcsWriter`.
*
* For each source folder, there are at most two version to choose from:
* - The version in the "current" `rascal` (as per
* `resolveCurrentRascalRuntimeJar()`). Alway available.
* - The version in the "other" `rascal` (as per the
* `URIResolverRegistry`). Available when `rascal` itself is the main
* project, or when `rascal` is open in the workspace.
*
* The version of each source folder of the `rascal` project to add, is
* chosen based on the kind of that source folder:
* - Kind #1 (e.g., `.../library`, i.e., `|std:///|`): Add the version of
* the source folder in the current `rascal`
* - Kind #2 (e.g., `.../typepal`): Add the version of the source folder
* in the other `rascal` (if available) or the current `rascal` (else)
* - Kind #3 (e.g., `test/...`): Add the version of the source folder in
* the other `rascal` (if available)
*
* Thus, source folders of kinds #1 and #2 are added always, while source
* folders of kind #3 are added only when the `rascal` project itself is
* being developed.
*
* @throws IOException When the current `rascal` can't be resolved
*/
private static void addRascalToSourcePath(IListWriter srcsWriter) throws IOException {

Java classes of rascal are always loaded from the "current" rascal (as per resolveCurrentRascalRuntimeJar()).

Thus, after this PR:

  • ✔️ Rascal modules and Java classes of rascal inside std must be loaded from the same version. This partly fixes Rascal modules and Java classes of rascal are loaded from different versions rascal-language-servers#538.

  • ⚠️ Rascal modules and Java classes of rascal outside std may be loaded from different versions. However, currently, there are no Rascal modules of rascal outside std that rely on Java classes. Moreover, after the merge of typepal and rascal-core into rascal: (1) there are no Rascal modules of typepal that rely on Java classes; (2) there are a few Rascal modules of rascal-core that rely on Java classes, but the impact seems negligible in the context of this PR.

  • ⚠️ REPLs created in VS Code always load the version of std of the deployed rascal.jar (part of the VS Code extension), which is put on the class path of the REPL's JVM. Even when rascal itself is open in the workspace, changes made to std will not be loaded in the REPL. To test such changes, a standalone Rascal shell needs to be created (but it doesn't have debugging support).

Examples (VS Code)

Create REPL in rascal

Module paths:
    |std:///| at |jar+file:///C:/Users/sung-/Desktop/rascal3/rascal/target/rascal-0.40.8-SNAPSHOT.jar!/org/rascalmpl/library/|
    |project://rascal/src/org/rascalmpl/typepal| at |file:///c:/Users/sung-/Desktop/rascal-which-version/projects/rascal/src/org/rascalmpl/typepal|
    |project://rascal/test/org/rascalmpl/benchmark| at |file:///c:/Users/sung-/Desktop/rascal-which-version/projects/rascal/test/org/rascalmpl/benchmark|
    |project://rascal/test/org/rascalmpl/test/data| at |file:///c:/Users/sung-/Desktop/rascal-which-version/projects/rascal/test/org/rascalmpl/test/data|
    |jar+file:///C:/Users/sung-/Desktop/rascal-language-servers3/rascal-vscode-extension/assets/jars/rascal-lsp.jar!/|

Notes:

  • .../library is loaded from the deployed jar (rascal-0.40.8-SNAPSHOT.jar)
  • The remaining source folders (including .../typepal) of rascal are loaded from the workspace

Create REPL in some project a (with rascal open in the workspace)

Module paths:
    |std:///| at |jar+file:///C:/Users/sung-/Desktop/rascal3/rascal/target/rascal-0.40.8-SNAPSHOT.jar!/org/rascalmpl/library/|
    |project://rascal/src/org/rascalmpl/typepal| at |file:///c:/Users/sung-/Desktop/rascal-which-version/projects/rascal/src/org/rascalmpl/typepal|
    |project://rascal/test/org/rascalmpl/benchmark| at |file:///c:/Users/sung-/Desktop/rascal-which-version/projects/rascal/test/org/rascalmpl/benchmark|
    |project://rascal/test/org/rascalmpl/test/data| at |file:///c:/Users/sung-/Desktop/rascal-which-version/projects/rascal/test/org/rascalmpl/test/data|
    |file:///C:/Users/sung-/Desktop/rascal-which-version/projects/a-0.0.1/a/src/main/rascal|
    |jar+file:///C:/Users/sung-/Desktop/rascal-language-servers3/rascal-vscode-extension/assets/jars/rascal-lsp.jar!/|

Notes:

  • .../library is loaded from the deployed jar (rascal-0.40.8-SNAPSHOT.jar)
  • The remaining source folders (including .../typepal) of rascal are loaded from the workspace

Create REPL in some project a (without rascal open in the workspace)

Module paths:
    |std:///| at |jar+file:///C:/Users/sung-/Desktop/rascal3/rascal/target/rascal-0.40.8-SNAPSHOT.jar!/org/rascalmpl/library/|
    |jar+file:///C:/Users/sung-/Desktop/rascal3/rascal/target/rascal-0.40.8-SNAPSHOT.jar!/org/rascalmpl/typepal|
    |file:///C:/Users/sung-/Desktop/rascal-which-version/projects/a-0.0.1/a/src/main/rascal|
    |jar+file:///C:/Users/sung-/Desktop/rascal-language-servers3/rascal-vscode-extension/assets/jars/rascal-lsp.jar!/|

Notes::

  • .../library and .../typepal are loaded from the deployed jar (rascal-0.40.8-SNAPSHOT.jar)
  • The remaining source folders of rascal aren't loaded

Examples (RascalShell)

Create REPL in rascal (using a RascalShell in the same rascal)

Module paths:
    |std:///| at |file:///C:/Users/sung-/Desktop/rascal3/rascal/target/classes/org/rascalmpl/library/|
    |project://rascal/src/org/rascalmpl/typepal| at |file:///C:/Users/sung-/Desktop/rascal3/rascal/src/org/rascalmpl/typepal|
    |project://rascal/test/org/rascalmpl/benchmark| at |file:///C:/Users/sung-/Desktop/rascal3/rascal/test/org/rascalmpl/benchmark|
    |project://rascal/test/org/rascalmpl/test/data| at |file:///C:/Users/sung-/Desktop/rascal3/rascal/test/org/rascalmpl/test/data|

Notes:

  • .../library is loaded from the current rascal (~/Desktop/rascal3/rascal)
  • The remaining source folders (including .../typepal) of rascal are loaded from the main project (~/Desktop/rascal3/rascal), which happens to be the same rascal

Create REPL in rascal (using a RascalShell in a different rascal)

Module paths:
    |std:///| at |file:///C:/Users/sung-/Desktop/rascal3/rascal/target/classes/org/rascalmpl/library/|
    |project://rascal/src/org/rascalmpl/typepal| at |file:///C:/Users/sung-/Desktop/rascal-which-version/projects/rascal/src/org/rascalmpl/typepal|
    |project://rascal/test/org/rascalmpl/benchmark| at |file:///C:/Users/sung-/Desktop/rascal-which-version/projects/rascal/test/org/rascalmpl/benchmark|
    |project://rascal/test/org/rascalmpl/test/data| at |file:///C:/Users/sung-/Desktop/rascal-which-version/projects/rascal/test/org/rascalmpl/test/data|

Notes:

  • .../library is loaded from the current rascal (~/Desktop/rascal3/rascal)
  • The remaining source folders (including .../typepal) of rascal are loaded from the main project (~/Desktop/rascal-which-version/projects/rascal), which happens to be a different rascal

Create REPL in some project a

Module paths:
    |std:///| at |file:///C:/Users/sung-/Desktop/rascal3/rascal/target/classes/org/rascalmpl/library/|
    |file:///C:/Users/sung-/Desktop/rascal3/rascal/target/classes/org/rascalmpl/typepal|
    |file:///C:/Users/sung-/Desktop/rascal-which-version/projects/a-0.0.1/a/src/main/rascal|

Notes:

  • .../library and .../typepal are loaded from the current rascal (~/Desktop/rascal3/rascal)
  • The remaining source folders of rascal aren't loaded

@sungshik
Copy link
Contributor Author

This PR is intended to replace #2103. The main change relative to that PR is, per @jurgenvinju's suggestion, non-std source folders of rascal will never be loaded in any REPL, except when the main project of the REPL is rascal itself.

Instead of giving this special treatment only to std, the implementation of this PR is a bit more general and currently gives the same special treatment to (a temporary placeholder for) typepal. The idea is that, beside std, there may be other source folders in rascal that are useful to "normal" users.

@DavyLandman
Copy link
Member

I think I like this approach, but I don't think typepal should be treated in the future like std lib. Specifically, you want to be able to work on typepal itself while inside VS Code. No need to treat it in a special way like we do the std lib.

@sungshik
Copy link
Contributor Author

Ah yes, good point. I'll do that.

@jurgenvinju
Copy link
Member

jurgenvinju commented Dec 24, 2024

Great work!

Rascal modules of rascal inside special source folders are always loaded from the current rascal (as per resolveCurrentRascalRuntimeJar).

This one I don't understand. I'm expecting we will read Rascal modules from std:/// for as long as we have the interpreter in play, and by extension we'd use typepal:///, etc. to refer to libraries inside the rascal.jar. This |std:///| stuff is necessary because resolveCurrentRascalRuntimeJar + path produces a jar+file URI that is specific for the local machine, breaking source debugging and references in the LSP to library functions and globals and modules.

The references to other modules in library jars are going to be via the mvn:/// scheme which was made to abstract from the local machine location of the maven repository, for the same reason.

but I don't think typepal should be treated in the future like std lib.

I think we do need a special scheme for the typepal sources, for the same reason as we have std and mvn: source level debugging and LSP references.

If we look at this example:

Module paths:
    |std:///| at |file:///C:/Users/sung-/Desktop/rascal3/rascal/target/classes/org/rascalmpl/library/|
    |file:///C:/Users/sung-/Desktop/rascal3/rascal/target/classes/org/rascalmpl/typepal| // here the debugger breaks and LSP references as well.
    |file:///C:/Users/sung-/Desktop/rascal-which-version/projects/a-0.0.1/a/src/main/rascal|

The LSP breaks because the .tpl files for typepal will have different hard locations for every declaration; those locations will be either mvn:/// or std:/// or typepal:/// but never file:///C:/Users/sung-/

@sungshik
Copy link
Contributor Author

I've implemented Davy's suggestion and updated the issue description accordingly. I think Jurgen's main remaining point is to introduce a typepal scheme. The same thought also crossed my mind yesterday, but after talking with Davy about it briefly, the idea was abandoned. Let's first discuss this a bit more before I start implementing it 😎.

Comment on lines +647 to +655
for (String srcName : manifest.getSourceRoots(manifestRoot)) {
var srcFolder = URIUtil.getChildLocation(manifestRoot, srcName);

if (!reg.exists(srcFolder) || !reg.isDirectory(srcFolder)) {
messages.append(Messages.error("Source folder " + srcFolder + " does not exist.", getRascalMfLocation(rascalProject)));
}

srcsWriter.append(srcFolder);
srcsWriter.append(srcFolder);
}
Copy link
Contributor Author

@sungshik sungshik Dec 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Exactly the same as the base branch, but indented because of the new outer conditional)

@DavyLandman
Copy link
Member

Let me explain my stance on this a bit more verbosely (although I've tried to keep it short).

  1. I dislike |std:///| and |lib://typepal| as it doesn't tell you which version/where it came from. For that reason I much prefer the mvn://... scheme, and also the jar+file:// or |project:// ones (even though project can be a bit confusing). As in, std:/// and lib://typepal are a bit magic for end users to figure out where files came from.
  2. If there is a problem with typecheckers that requires us to introduce typepal:/// (and others), I think that it would also break for external libraries (such as salix & rascal-lsp).
  3. It looks a bit like we're going from lib://typepal to typepal://, which defeats the purpose of getting rid of the lib scheme a bit?

The LSP breaks because the .tpl files for typepal will have different hard locations for every declaration; those locations will be either mvn:/// or std:/// or typepal:/// but never file:///C:/Users/sung-/

To me this looks like more a typepal/rascal-core feature that should be able to relatize it's locations (as it already does, since it stores the locations in a logical form, and maps it to a physical at the last moment). But not something that we should try and solve by introducing a new kind of special location.

@jurgenvinju
Copy link
Member

I dislike |std:///| and |lib://typepal| as it doesn't tell you which version/where it came from. For that reason I much prefer the mvn://... scheme, and also the jar+file:// or |project:// ones (even though project can be a bit confusing). As in, std:/// and lib://typepal are a bit magic for end users to figure out where files came from.

I don't like it that much either, but we need a installation-location-independent URI for the libraries inside rascal.jar. Since they end up in the extension folder and not in the mvn repo, the std:/// location was invented. The same location is also workable for rascal.jar in its independent mode ("java -jar rascal.jar"), etc.

Features like the debugger and browsing through the standard libraries with M3 and .tpl summaries depend on this.

@jurgenvinju
Copy link
Member

jurgenvinju commented Dec 26, 2024

It is (since we have PathConfig.getCurrentRascalRuntime()) possible to turn std:/// and typepal:/// and stuff into logical URI resolvers. That way they become transparant and can be queried for their actual location. Since authorities for logical resolvers are also keys, we could have one scheme for the rascal jar and then an authority per nested library:

  • |boot://std/| for the standard library
  • |boot://typepal| for the typepal sources
  • |boot://compiler| for the compiler sources

When we resolve these logical locations you will get jar+file:///path/to/rascal.jar!/src/org/rascalmpl/library, etc. So we can explain to the user which exact versions are used at the moment.

The difference in efficiency for loading modules should be neglibable as compared to the original std scheme, but that is something to investigate.

I like "boot" since there is a bootstrap dependency on these libraries for making rascal work. It is very similar to the bootstrap bytecode/dll classpath of the JVM.

The packager can simply rewrite all source URIs to boot:/// URIs during mvn package, and after that it is smooth sailing. The rascal.jar will remain relocatable in any deployment location, and still the location of the standard library is transparantly visible.

@jurgenvinju
Copy link
Member

By the way if we change std:// to boot://std we have to do a big bang release of everything, i just realized. So better stick with std:// for the next few releases.

@DavyLandman
Copy link
Member

I feel like we're moving towards a solution that has the same problems as the lib schema. While there is a case for the rascal.jar, I think we don't need the special case (as it hides a current limitation of rascal-core). But I do think a few issues are getting mixed here.

Different issues that we're trying to squash (using lib://clair as an example):

  1. Inside of VS Code, lib://clair is not resolvable. As in, which version of clair does it point to? especially since the VFS resolver is part of the JVM that runs the typechecker & outliner, it doesn't even have clair on it's class/source path. So, our fix was (in this PR series): let's replace that with mvn://.../clair-0.1.0.jar (or project://clair).
  2. When starting a REPL, you have to very carefully read all the output to know where clair actually came from. Our fix (in this PR series): let's get rid of lib://clair and the whole resolved lib://clair... message and use only the path config data.
  3. While clair was typechecked, the sources were located in file:///.../path/on/ci/clair/src. That path is not accurate outside of the build environment. Historically we fixed this by rewriting those path (in the package phase) to the lib://clair scheme. However in VS Code, this doesn't work, as lib://clair was not something that VS Code could navigate to (see point 1), meaning we cannot navigate to clair or any other third party libraries.

I think point 3 is not an issue we should try solve in rascal itself with logical source locations. I think the work @PaulKlint has done to store logical locations (rascal:///util/Monitor instead of file:///..../util/Monitor.rsc) can be adapted to generate a proper source paths for any tpl based on the context of where the tpl is found.

sketch on how I imagine this would work in rascal-core
  1. we use rascal://util//Monitor like we currently do
  2. the mapping to physical locations doesn't contain file:///..../util/Monitor.rsc locations, but relative:///util/Monitor.rsc locations. (optionally this is only done during packaging)
  3. When we ask for use defs/errors, we translate the relative locations based on the current location of the tpls/sources (most likely a source path)

I'll leave the details of this to @PaulKlint but based on our discussions, this should be possible.

How does this relate to boot://rascal or std:/// or boot://typepal? I think for 1 and 2 they confuse the end user of where stuff is coming from and are not needed. For 3 it's a workaround that only works for things prepackaged with the extension.

I'm sorry I couldn't phrase this as a reply to previous messages. But I'll now try to link it to some specific quotes by @jurgenvinju

but we need a installation-location-independent URI for the libraries inside rascal.jar
Features like the debugger and browsing through the standard libraries with M3 and .tpl summaries depend on this.

Why specifically? Can you give a concrete example of why it needs to be installation-location-independent? I know we currently depend on the std:/// but I think we do not have to.

It is (since we have PathConfig.getCurrentRascalRuntime()) possible to turn std:/// and typepal:/// and stuff into logical URI resolvers.

This only holds as long as everyone has to use the exact same version of rascal (and typepal) as the rascal extension is using. I know that's a current requirement, but I would like to move to a place where we don't need this. I think if we don't depend on std:/// being the same for all REPLs it would make life easier down the road. (Same is true for typechecking code with a slightly older/newer version of rascal than the typechecker is running under).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants