Messages bigger than 65535 utf code points crash the server #155

Closed
michael-simons opened this issue Oct 26, 2020 · 5 comments

@michael-simons

common/Message uses DataOutputStream#writeUTF, which throws an exception if the string to write is longer than 65535 bytes in its modified UTF-8 encoding.

https://github.com/mvndaemon/mvnd/blob/master/common/src/main/java/org/jboss/fuse/mvnd/common/Message.java#L302
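The limit can be seen in isolation, without mvnd, with a minimal sketch like the one below. The class and method names are hypothetical and are not part of mvnd; the sketch only assumes the documented behaviour of DataOutputStream#writeUTF, which counts encoded bytes rather than characters.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.UTFDataFormatException;

// Hypothetical demo class, not part of mvnd.
class WriteUtfLimitDemo {

    /** Returns true if writeUTF accepts a string of {@code length} ASCII characters. */
    static boolean canWrite(int length) {
        String s = "a".repeat(length); // String#repeat needs Java 11+
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(s); // writes a 2-byte length prefix followed by the encoded bytes
            return true;
        } catch (UTFDataFormatException e) {
            // Thrown when the encoded form is longer than 65535 bytes.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canWrite(65535)); // true: exactly at the limit
        System.out.println(canWrite(65536)); // false: one byte over the limit
    }
}
```

Each ASCII character encodes to one byte, so a string of 65535 times "a" still fits on its own; anything else written into the same writeUTF call lowers the usable length accordingly.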

I stumbled upon this while trying mvnd with Neo4j-OGM. It happens when our build triggers the Maven Javadoc plugin.

I was able to reproduce it with the following test:

package com.example.mvndtest;

import java.io.IOException;

import org.junit.jupiter.api.Test;

class MvndTestApplicationTests {

	@Test
	void contextLoads() throws IOException {

		var stringToWrite = new StringBuilder();
		for (int i = 0; i < 65535 /* 65509 works on my machine */; ++i) {
			stringToWrite.append("a");
		}

		System.out.println(stringToWrite.toString());
	}
}

Having such output anywhere in the build, whether it comes from a plugin or from code, will crash the server.
I wasn't able to reproduce it with the Exec Maven or AntRun plugins, and the reason is simple: those plugins redirect System.out line by line. The Javadoc plugin, however, collects the entire log from the javadoc binary and passes it directly to the Maven log.

That's basically the same as what I do in the example above.

I have attached a reproducer.

mvn clean package works fine, with a lot of useless output; mvnd clean package ends with:

[INFO] Running com.example.mvndtest.MvndTestApplicationTests
Exception in thread "main" org.jboss.fuse.mvnd.common.DaemonException$StaleAddressException: Could not receive a message from the daemon.
	at org.jboss.fuse.mvnd.client.DaemonClientConnection.receive(DaemonClientConnection.java:107)
	at org.jboss.fuse.mvnd.client.DefaultClient.execute(DefaultClient.java:198)
	at org.jboss.fuse.mvnd.client.DefaultClient.main(DefaultClient.java:72)
Caused by: java.io.IOException: No message received within 3000ms, daemon may have crashed
	at org.jboss.fuse.mvnd.client.DaemonClientConnection.receive(DaemonClientConnection.java:100)
	... 2 more

and attached logs.

Reproducer:
mvnd-test.zip

Logs:
logs.zip

@gnodet
Contributor

gnodet commented Oct 26, 2020

@ppalaga I think the removal of the utf8 encode/decode methods is the cause, so maybe we should revert this commit a71033c#diff-07780342186ffa696b0e9d532cb9e9cf11f62cae11d2980e44c7be22c0619892L332-L419.

This is because DataInputStream uses a 2-byte length prefix, while the methods in Message used a 4-byte one.
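To illustrate the difference, here is a minimal sketch of a string codec with a 4-byte length prefix. It is not the code removed in the commit mentioned above; the class and method names are hypothetical, and it uses standard UTF-8 rather than the modified UTF-8 that writeUTF produces.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the actual mvnd implementation.
class LongStringCodec {

    /** Writes a 4-byte length (int) followed by the UTF-8 bytes of the string. */
    static void writeLongUtf(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    /** Reads a string written by writeLongUtf. */
    static String readLongUtf(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("negative string length: " + length);
        }
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Encodes and decodes in memory; convenient for trying the codec out. */
    static String roundTrip(String s) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            writeLongUtf(new DataOutputStream(buffer), s);
            return readLongUtf(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String big = "a".repeat(100_000); // well above the 65535-byte writeUTF limit
        System.out.println(roundTrip(big).equals(big));
    }
}
```

With an int prefix the length limit moves from 64k to the maximum array size, at the cost of two extra bytes per string.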

@ppalaga
Contributor

ppalaga commented Oct 26, 2020

Interesting! Thanks for the report, @michael-simons!

@ppalaga
Contributor

ppalaga commented Oct 26, 2020

Yeah, let me have a look what we can do about this.

@michael-simons
Author

michael-simons commented Oct 26, 2020

UTF-8 characters vary in length between 1 and 4 bytes. The 💩 emoji, for example, is a 4-byte UTF-8 character that I have used on various occasions to break things:
https://info.michael-simons.eu/2013/01/21/java-mysql-and-multi-byte-utf-8-support/

The serializer actually looks sensible.
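For reference, the variable width of standard UTF-8 can be checked with a small sketch like this one. The class name is hypothetical, and the characters are written as Unicode escapes so the source compiles regardless of file encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration only.
class Utf8LengthDemo {

    /** Number of bytes the string occupies in standard UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("a"));            // 1 (U+0061)
        System.out.println(utf8Length("\u00E9"));       // 2 (U+00E9, e with acute accent)
        System.out.println(utf8Length("\u20AC"));       // 3 (U+20AC, euro sign)
        System.out.println(utf8Length("\uD83D\uDCA9")); // 4 (U+1F4A9, pile of poo)
    }
}
```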

@gnodet
Contributor

gnodet commented Oct 26, 2020

UTF-8 characters vary in length between 1 and 4 bytes. The 💩 emoji, for example, is a 4-byte UTF-8 character that I have used on various occasions to break things:
https://info.michael-simons.eu/2013/01/21/java-mysql-and-multi-byte-utf-8-support/

The serializer actually looks sensible.

No, I was talking about the length of the string. DataOutputStream uses an unsigned short (2 bytes) for the length, so it is limited to 64k. The DataOutputStream code has the following check, see https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/share/classes/java/io/DataOutputStream.java#L363-L365:

        if (utflen > 65535)
            throw new UTFDataFormatException(
                "encoded string too long: " + utflen + " bytes");
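The utflen in that check is the length of the string in the modified UTF-8 encoding described in the DataOutputStream documentation, not the number of characters. A minimal sketch of how a caller could compute it up front follows; the class and method names are hypothetical and are not JDK or mvnd API.

```java
// Hypothetical helper, written from the encoding rules documented for
// DataOutputStream#writeUTF; it is not a JDK or mvnd class.
class ModifiedUtf8Length {

    /** Length in bytes of the modified UTF-8 form that writeUTF would produce. */
    static long encodedLength(String s) {
        long length = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                length += 1; // plain ASCII except NUL
            } else if (c <= 0x07FF) {
                length += 2; // NUL (\u0000) and \u0080..\u07FF
            } else {
                length += 3; // everything else, including each half of a surrogate pair
            }
        }
        return length;
    }

    /** True if the string is short enough for a single writeUTF call. */
    static boolean fitsWriteUtf(String s) {
        return encodedLength(s) <= 65535;
    }

    public static void main(String[] args) {
        System.out.println(encodedLength("a"));            // 1
        System.out.println(encodedLength("\u20AC"));       // 3
        System.out.println(encodedLength("\uD83D\uDCA9")); // 6: two surrogates, 3 bytes each
        System.out.println(fitsWriteUtf("a".repeat(65536))); // false
    }
}
```

Because supplementary characters are encoded per surrogate, a string of emoji reaches the limit after about 10922 of them, far fewer than 65535 characters.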
