fix #4357: spoon doesn't handle paths containing spaces correctly #4358

Merged (8 commits) on Jan 6, 2022

xzel23 (Contributor) commented Dec 16, 2021

Fix #4357

Please review and apply.

slarse (Collaborator) commented Dec 16, 2021

Hi @xzel23,

Could you add a test case to demonstrate that this fix solves the problem? As per the contributing guidelines, bugfix PRs must contain a test case that reproduces the bug.

@@ -418,7 +420,7 @@ public void setInputClassLoader(ClassLoader aClassLoader) {
      if (onlyFileURLs) {
          List<String> classpath = new ArrayList<>();
          for (URL url : urls) {
-             classpath.add(url.getPath());
+             classpath.add(URLDecoder.decode(url.getPath(), StandardCharsets.UTF_8));

Collaborator:

I am not sure if this is better or something like

classpath.add(Path.of(url.toURI()).toAbsolutePath().toString());
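
To make the two alternatives concrete, here is a minimal sketch comparing the raw `url.getPath()`, the `URLDecoder.decode` approach from the diff, and the `Path.of(url.toURI())` suggestion. The class name, helper names, and example paths are hypothetical and chosen only for illustration. One caveat worth keeping in mind: `URLDecoder` implements form decoding, so it also turns a literal `+` into a space, which `Path.of(URI)` does not do.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

public class ClasspathDecodeSketch {

    /** Hypothetical helper: builds a URL and rethrows the checked exception unchecked. */
    static URL fileUrl(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Raw URL path: percent-escapes such as %20 are left untouched. */
    static String rawPath(URL url) {
        return url.getPath();
    }

    /** Same call as the changed line in the diff: percent-decode the path as UTF-8. */
    static String decodedPath(URL url) {
        return URLDecoder.decode(url.getPath(), StandardCharsets.UTF_8);
    }

    /** The alternative from the review comment; URL.toURI() throws a checked exception. */
    static String viaPath(URL url) {
        try {
            return Path.of(url.toURI()).toAbsolutePath().toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URL spaced = fileUrl("file:/tmp/my%20libs/foo.jar");
        System.out.println(rawPath(spaced));      // still contains %20
        System.out.println(decodedPath(spaced));  // %20 replaced by a space
        System.out.println(viaPath(spaced));      // platform-specific path string

        // '+' is legal in a file URL path, but URLDecoder treats it as a space.
        URL plus = fileUrl("file:/tmp/c++/foo.jar");
        System.out.println(decodedPath(plus));
        System.out.println(viaPath(plus));
    }
}
```

Both approaches handle the space case from the issue; they differ in how they treat `+` and in how strictly the URL is validated.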

xzel23 (Contributor Author), Dec 17, 2021:

I am unconvinced. Now that you remind me: Path.of(URI) will throw an exception if the URI does not point to something that lies on a filesystem, or if no FileSystemProvider supporting the URI's scheme is present. For example, to create a Path instance that points to a file located inside a Jar file (not the Jar itself), there has to be a FileSystem instance for the Jar file. In the case of the "jar:" protocol, Path.of(URI) will make a disk access and try to read the Jar content and fail if the file is not present or cannot be read. You can try this with jshell:

jshell> Path.of(new URL("jar:file:/hello.jar!/hello.txt").toURI())
|  Exception java.nio.file.FileSystemNotFoundException
|        at ZipFileSystemProvider.getFileSystem (ZipFileSystemProvider.java:156)
|        at ZipFileSystemProvider.getPath (ZipFileSystemProvider.java:142)
|        at Path.of (Path.java:208)
|        at (#3515 

I-Al-Istannen (Collaborator), Dec 22, 2021:

I doubt the existing code can handle files inside JARs either? URL-Decoding the getPath() would return file:/hello.jar!/hello.txt which likely isn't handled correctly. I'd need to have a look at it, but failing early instead of maybe failing later on sounds like a plus to me?
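
A small sketch of the two behaviours being weighed here, reusing the `jar:` URL from the jshell session above. The class and helper names are hypothetical. It only shows what each call returns or throws; it makes no claim about how Spoon or JDT treat the resulting string.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

public class JarUrlSketch {

    /** Hypothetical helper: builds a URL and rethrows the checked exception unchecked. */
    static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Decoding does no validation: the nested "file:...!/..." part is returned as-is. */
    static String decodedPath(URL url) {
        return URLDecoder.decode(url.getPath(), StandardCharsets.UTF_8);
    }

    /** Returns the simple name of the exception thrown by Path.of, or "none". */
    static String pathOfFailure(URI uri) {
        try {
            Path.of(uri);
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        String spec = "jar:file:/hello.jar!/hello.txt";
        // Silently yields a string that is not a plain file system path.
        System.out.println(decodedPath(url(spec)));
        // Fails early because no zip file system has been opened for the jar.
        System.out.println(pathOfFailure(URI.create(spec)));
    }
}
```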

Collaborator:

Hmm, this assumes utf8 encoding. How does this work on Windows when there's a non-ASCII character in the filepath?

xzel23 (Contributor Author):

@slarse Maybe someone with access to a Windows box could test. In any case, URLs should use UTF-8. Quote taken from W3.org:

Note. Some older user agents trivially process URIs in HTML using the bytes of the character encoding in which the document was received. Some older HTML documents rely on this practice and break when transcoded. User agents that want to handle these older documents should, on receiving a URI containing characters outside the legal set, first use the conversion based on UTF-8. Only if the resulting URI does not resolve should they try constructing a URI based on the bytes of the character encoding in which the document was received.

Collaborator:

I am not fully convinced that JDT correctly accepts file paths referring to files inside a jar, or other weird things, so I am not sure raw URL decoding really is a benefit here over stricter Path.of validation/conversion.

Either way, I tried it out @slarse, and the URLDecoder Javadoc states that the charset is only used for percent-encoded data. new URL("öäüß") is not stored percent-encoded, so the string is returned unchanged in the default JVM Windows charset. The UTF-8 setting has no relevance for it, but if the URL does contain other percent-encoded data, it could maybe end up with a String using two encodings.
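
The charset behaviour described above can be checked with a short sketch. The class and helper names are hypothetical, and the non-ASCII characters are written as Unicode escapes so the example does not depend on the source file encoding. It suggests that characters which are not percent-encoded pass through unchanged whatever charset is given, while the charset does change how percent-encoded bytes are interpreted.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class DecoderCharsetSketch {

    /** Decodes percent-escapes, interpreting the escaped bytes as UTF-8. */
    static String decodeUtf8(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    /** Decodes percent-escapes, interpreting the escaped bytes as ISO-8859-1. */
    static String decodeLatin1(String s) {
        return URLDecoder.decode(s, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Unescaped non-ASCII characters: only %20 is touched, by either charset.
        String unescaped = "/tmp/\u00f6\u00e4\u00fc\u00df/my%20dir";
        System.out.println(decodeUtf8(unescaped).equals(decodeLatin1(unescaped)));

        // Percent-encoded UTF-8 bytes of U+00F6: here the charset matters.
        String escaped = "/tmp/%C3%B6";
        System.out.println(decodeUtf8(escaped).length());   // one character after /tmp/
        System.out.println(decodeLatin1(escaped).length()); // two characters after /tmp/
    }
}
```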

slarse (Collaborator), Dec 25, 2021:

@xzel23 I did not know that, good to know what the spec says (although I doubt all file systems follow that, a filepath is not necessarily a URI). I just don't trust Windows not to be the odd one out on.. everything.

@I-Al-Istannen Thanks for checking it out, that's 100% confusing :D

xzel23 (Contributor Author) commented Dec 20, 2021

Could you add a test case to demonstrate that this fix solves the problem? As per the contributing guidelines, bugfix PRs must contain a test case that reproduces the bug.

Test case added.

ClassLoader classLoader = new URLClassLoader(classpath);
launcher.getEnvironment().setInputClassLoader(classLoader);
launcher.getEnvironment().setNoClasspath(false);
assertDoesNotThrow(() -> launcher.buildModel());

Collaborator:

A stronger assertion would be to check for the presence of CtType Foo in the model. WDYT?

xzel23 (Contributor Author):

OK, that's next on my list.

xzel23 (Contributor Author):

Ah, this was a bit more complicated than expected because Spoon doesn't load Java files via class loaders. So I added a jar file that is used during the test. After some trial and error (I accidentally compiled the jar using JDK 17, among other oversights), the test finally passed in all tested configurations.

slarse (Collaborator) left a comment

Production code looks fine to me. As URL.toURI() throws a checked exception, URLDecoder.decode() seems more convenient to use.

The test should be moved to an existing test class (required change), and I strongly suggest making the assertion more specific to the scenario being tested (optional change).

Ah, this was a bit more complicated than expected because Spoon doesn't load Java files via class loaders. So I added a jar file that is used during the test. After some trial and error (I accidentally compiled the jar using JDK 17, among other oversights), the test finally passed in all tested configurations.

Class loaders are only for binary code (e.g. jars and class files), not for source code. You could have added a single .class file as well. I don't think it matters if it's a .jar or a .class, both are easy to inspect, so you can keep it as-is.
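
As a rough illustration of why a binary classpath entry with a space triggers the bug, the sketch below builds a `URLClassLoader` over a temporary directory whose name contains spaces and checks which form of the path actually exists on disk. The class name, helper names, and directory names are hypothetical, and the sketch uses only the JDK, not Spoon.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SpacedClasspathSketch {

    /** Creates a temp directory with spaces in its name and returns its URL as reported by a URLClassLoader. */
    static URL urlOfSpacedDir() {
        try {
            Path base = Files.createTempDirectory("cp");
            Path spaced = Files.createDirectory(base.resolve("dir with spaces"));
            try (URLClassLoader loader = new URLClassLoader(new URL[] {spaced.toUri().toURL()})) {
                return loader.getURLs()[0];
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** The pre-fix behaviour: treat the raw, percent-encoded URL path as a file path. */
    static boolean rawPathExists(URL url) {
        return new File(url.getPath()).exists();
    }

    /** The post-fix behaviour: decode the URL path before using it as a file path. */
    static boolean decodedPathExists(URL url) {
        return new File(URLDecoder.decode(url.getPath(), StandardCharsets.UTF_8)).exists();
    }

    public static void main(String[] args) {
        URL url = urlOfSpacedDir();
        System.out.println(url.getPath());          // spaces appear as %20
        System.out.println(rawPathExists(url));     // the encoded name is not on disk
        System.out.println(decodedPathExists(url)); // the decoded name is
    }
}
```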

Comment on lines 28 to 31
CtModel model = launcher.buildModel();

assertTrue(model.getAllTypes().stream().anyMatch(ct -> ct.getQualifiedName().equals("Foo")),
        "CtType 'Foo' not present in model");

Collaborator:

Optional change: This only indirectly verifies that we correctly handle spaces in classpath URLs. I'd suggest actually verifying that the reference to bar.Bar is correctly resolved.

Suggested change
- CtModel model = launcher.buildModel();
- assertTrue(model.getAllTypes().stream().anyMatch(ct -> ct.getQualifiedName().equals("Foo")),
-         "CtType 'Foo' not present in model");
+ launcher.buildModel();
+ CtType<?> foo = launcher.getFactory().Type().get("Foo");
+ CtMethod<?> methodWithBarReturnType = foo.getMethod("foo");
+ CtTypeReference<?> barRef = methodWithBarReturnType.getType();
+ assertThat(barRef.getQualifiedName(), equalTo("bar.Bar"));
+ assertThat(barRef.getTypeDeclaration(), notNullValue());
+ assertTrue(barRef.getTypeDeclaration().isShadow(), "expected bar.Bar to be a shadow class");

Another alternative would be to test this with a unit test of StandardEnvironment instead, which is where the problem actually lies. Since this is tested so "far away" from the problem, the assertion should be very specific, IMO.

public class TestIssue4357 {

@Test
public void testClasspathURLWithSpaces() throws MalformedURLException {

Collaborator:

Required change: Put this test in an existing test class instead of creating a new one. Due to the phrasing of the contract, I'd put this in LauncherTest.

xzel23 (Contributor Author):

@slarse done.

slarse (Collaborator) left a comment

LGTM, thanks @xzel23

@slarse slarse merged commit fa6b014 into INRIA:master Jan 6, 2022
@xzel23 xzel23 deleted the bug4357 branch October 4, 2023 10:36
Successfully merging this pull request may close these issues.

[Bug] StandardEnvironment does not work if path contains spaces
4 participants