8321053: Use ByteArrayInputStream.buf directly when parameter of transferTo() is trusted #16893
@@ -207,10 +207,20 @@ public int readNBytes(byte[] b, int off, int len) {
     public synchronized long transferTo(OutputStream out) throws IOException {
         int len = count - pos;
         if (len > 0) {
             byte[] tmp;
             if ("java.io".equals(out.getClass().getPackageName()))
Isn't this protection defeated with:
ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
UntrustedOutputStream uos = new UntrustedOutputStream();
bais.transferTo(new java.io.DataOutputStream(uos));
Or am I missing something?
Good catch: that in fact defeats the protection.
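A minimal sketch of why the wrapper gets through a package-name check. Everything here is illustrative: transfer() is a hypothetical stand-in for the patched transferTo (not the JDK code), and MutatingStream plays the role of UntrustedOutputStream above. It relies on DataOutputStream.write(byte[], int, int) forwarding the caller's array to the wrapped stream.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class WrapperBypassDemo {
    // Hypothetical untrusted stream: overwrites whatever array it is handed.
    static class MutatingStream extends OutputStream {
        @Override public void write(int b) { }
        @Override public void write(byte[] b, int off, int len) {
            for (int i = off; i < off + len; i++) b[i] = (byte) '0';
        }
    }

    // The package-name test from the first revision of the patch.
    static boolean trustedByPackage(OutputStream out) {
        return "java.io".equals(out.getClass().getPackageName());
    }

    // Stand-in for the patched transferTo: hands 'buf' over directly when
    // the stream looks trusted, otherwise writes a defensive copy.
    static void transfer(byte[] buf, OutputStream out) {
        try {
            if (trustedByPackage(out))
                out.write(buf, 0, buf.length);
            else
                out.write(buf.clone(), 0, buf.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] direct = {1, 2, 3};
        transfer(direct, new MutatingStream());        // only the copy is touched
        byte[] wrapped = {1, 2, 3};
        transfer(wrapped, new DataOutputStream(new MutatingStream())); // 'wrapped' is touched
        System.out.println(direct[0] + " " + wrapped[0]);
    }
}
```

The outer DataOutputStream passes the java.io test, so the original array reaches the inner stream unprotected.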
Changed in 176d516 not to trust FilterOutputStreams.
The only other alternative would be to walk ((FilterOutputStream)out).out and if everything in the out chain is in the "java." package then the out can be trusted.
byte[] tmp = null;
for (OutputStream os = out; os != null;) {
    if (os.getClass().getPackageName().startsWith("java.")) {
        if (os instanceof FilterOutputStream fos) {
            // A loop in this chain would cause this code to never end:
            // a self reference A -> A or a transitive reference A -> B -> C -> A.
            os = fos.out;
            continue;
        }
        break;
    }
    tmp = new byte[Integer.min(len, MAX_TRANSFER_SIZE)];
    break;
}
I don't like the deny-list approach of walking the chain, as (subjectively) it seems too fragile.
Also I think I can break this version of the code with ChannelOutputStream. I didn't run this through a compiler nor test it but the idea is that ChannelOutputStream calls ByteBuffer.wrap(bs) and doesn't call ByteBuffer.asReadOnlyBuffer. So a malicious WritableByteChannel should be able to gain access to the original array:
WritableByteChannel wolf = new WritableByteChannel() {
    public int write(ByteBuffer src) throws IOException {
        src.array()[0] = '0'; // oh no!
        return 0;
    }
};
ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
OutputStream wolfInSheepSuitAndTie = Channels.newOutputStream(wolf);
bais.transferTo(wolfInSheepSuitAndTie);
However, the ChannelOutputStream is in sun.nio.ch so on second thought it shouldn't break. The pattern is repeated in Channels.newOutputStream(AsynchronousByteChannel ch) so that should fail as it is in the "java." namespace.
I think an allow list would be safer but that brings all the drawbacks that Alan was talking about before.
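To make the ByteBuffer.wrap point concrete, here is a small sketch (the class and method names are made up for illustration). It only shows the buffer-level behavior: a buffer from ByteBuffer.wrap exposes the caller's array through array(), while a read-only view does not. It does not claim anything about what a particular JDK stream class does internally.

```java
import java.nio.ByteBuffer;

class BufferExposureDemo {
    // True if code holding 'view' can reach 'original' through array().
    static boolean exposes(ByteBuffer view, byte[] original) {
        return view.hasArray() && view.array() == original;
    }

    public static void main(String[] args) {
        byte[] bytes = {1, 2, 3};
        ByteBuffer writable = ByteBuffer.wrap(bytes);
        ByteBuffer readOnly = ByteBuffer.wrap(bytes).asReadOnlyBuffer();
        System.out.println(exposes(writable, bytes));
        System.out.println(exposes(readOnly, bytes));
    }
}
```

A channel that receives the writable buffer can therefore modify the backing array, which is the leak being described.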
I might have done this incorrectly, but with this version of the above wolf I do not see any corruption:
java.nio.channels.WritableByteChannel wolf =
    new java.nio.channels.WritableByteChannel() {
        private boolean closed = false;
        public int write(java.nio.ByteBuffer src) throws IOException {
            int rem = src.remaining();
            Arrays.fill(src.array(), src.arrayOffset() + src.position(),
                        src.arrayOffset() + src.limit(),
                        (byte)'0');
            src.position(src.limit());
            return rem;
        }
        public boolean isOpen() {
            return !closed;
        }
        public void close() throws IOException {
            closed = true;
        }
    };
I see the problem that unless we have an explicit whitelist, we run the risk of accidentally adding another wrapper stream to the JDK somewhere in the future and forgetting to add it to the blacklist. So for safety, I would plead for not using .startsWith() but explicitly naming only the classes actively proven safe. That way, the code might be slower (sad but true) but inherently future-proof.
The case of Channels.newOutputStream(AsynchronousByteChannel) could be handled by changing the return value of that method. For example, sun.nio.ch.Streams could have a method OutputStream of(AsynchronousByteChannel) added to it which returned something like an AsynChannelOutputStream and we could use that.
That said, it is true that a deny list is not inherently future-proof like an allow list, as stated.
I think that a sufficiently future-proof deny list could be had by changing
if (out.getClass().getPackageName().startsWith("java.") &&
back to
if ("java.io".equals(out.getClass().getPackageName()) &&
That would, for example, dispense with the problematic Channels.newOutputStream(AsynchronousByteChannel) case:
jshell> AsynchronousSocketChannel asc = AsynchronousSocketChannel.open()
asc ==> sun.nio.ch.UnixAsynchronousSocketChannelImpl[unconnected]
jshell> OutputStream out = Channels.newOutputStream(asc)
out ==> java.nio.channels.Channels$2@58c1670b
jshell> Class outClass = out.getClass()
outClass ==> class java.nio.channels.Channels$2
jshell> outClass.getPackageName()
$5 ==> "java.nio.channels"
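The difference between the two checks can also be sketched outside jshell. The helper names below are hypothetical, and DeflaterOutputStream is my own choice of example (it is not mentioned in this thread): it is a FilterOutputStream subclass that lives in java.util.zip, so it passes the broad "java." test but not the narrow java.io one.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.zip.DeflaterOutputStream;

class PackageCheckDemo {
    // Broad check: any class whose package starts with "java."
    static boolean anyJavaPackage(OutputStream out) {
        return out.getClass().getPackageName().startsWith("java.");
    }

    // Narrow check: only classes declared directly in java.io
    static boolean javaIoOnly(OutputStream out) {
        return "java.io".equals(out.getClass().getPackageName());
    }

    public static void main(String[] args) {
        OutputStream zip = new DeflaterOutputStream(new ByteArrayOutputStream());
        System.out.println(anyJavaPackage(zip)); // wrapper outside java.io still passes
        System.out.println(javaIoOnly(zip));     // narrow check rejects it
    }
}
```

Note that the narrow check still accepts wrappers declared in java.io itself, so it reduces the surface rather than closing it.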
Even if the scope is limited to java.io you have to deal with FilterOutputStream and ObjectOutputStream. I still haven't done a complete search so there could be other adapters I've yet to review.
Thinking of a different approach, what if ByteArrayInputStream actually recorded and used the readlimit parameter of the mark method? That would allow us to safely leak or poison 'this.buf', because once transferTo is called we can safely change the owner of the byte array if we know this stream is allowed to forget it existed. Effectively you could do optimizations like this (didn't test or compile this):
public synchronized long transferTo(OutputStream out) throws IOException {
    int len = count - pos;
    if (len > 0) {
        byte[] data = this.buf;
        byte[] tmp = null;
        if (this.readLimit == 0) { // <- recorded by mark method, initial value on construction of this would be zero.
            this.buf = new byte[0]; // swap owner of bytes
            Arrays.fill(data, 0, pos, (byte) 0); // hide out of bounds data.
            Arrays.fill(data, count, data.length, (byte) 0);
        } else {
            tmp = new byte[Integer.min(len, MAX_TRANSFER_SIZE)];
        }
        int nwritten = 0;
        while (nwritten < len) {
            int nbyte = Integer.min(len - nwritten, MAX_TRANSFER_SIZE);
            if (tmp != null) {
                System.arraycopy(data, pos, tmp, 0, nbyte);
                out.write(tmp, 0, nbyte);
            } else
                out.write(data, pos, nbyte);
            pos += nbyte;
            nwritten += nbyte;
        }
        assert pos == count;
        if (buf.length == 0) { // uphold rules of class.
            pos = count = mark = 0;
        }
    }
    return len;
}
This approach avoids having to maintain an allow or deny list. The downside is that the constructor of ByteArrayInputStream doesn't copy the byte[] parameter. The caller is warned about this in the JavaDocs, but it might be shocking to have data escape ByteArrayInputStream. Maybe that is a deal breaker? Obviously there is a compatibility issue with recording readLimit in the mark method, as it states it does nothing.
Thoughts?
I think that this is getting too complicated. For the time being, I think it would be better simply to have a conservative allow-list and trust only the classes in it. The approach can always be broadened at a later date, but at least for now there would be protection against untrustworthy OutputStreams.
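A conservative allow-list along these lines might look like the sketch below. This is not the committed code: isTrusted is a hypothetical helper, and the three classes are an assumption based on the stream types named elsewhere in this review. The comparison uses exact class identity, so subclasses and wrappers are not trusted.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.PipedOutputStream;

class AllowListDemo {
    // Trust only these exact classes; a subclass could override write()
    // and retain or modify the array, so instanceof is deliberately avoided.
    static boolean isTrusted(OutputStream out) {
        Class<?> c = out.getClass();
        return c == ByteArrayOutputStream.class
            || c == FileOutputStream.class
            || c == PipedOutputStream.class;
    }

    public static void main(String[] args) {
        System.out.println(isTrusted(new ByteArrayOutputStream()));
        System.out.println(isTrusted(new ByteArrayOutputStream() { }));
    }
}
```

Anything not on the list simply takes the copying path, so a forgotten class costs performance rather than safety.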
@@ -207,10 +207,20 @@ public int readNBytes(byte[] b, int off, int len) {
     public synchronized long transferTo(OutputStream out) throws IOException {
         int len = count - pos;
         if (len > 0) {
             byte[] tmp;
             if ("java.io".equals(out.getClass().getPackageName()))
I think we should trust all classes in java.* packages, i.e. the check should be out.getClass().getPackageName().startsWith("java.")
Changed in 176d516 to use "java." instead of "java.io".
             int nwritten = 0;
             while (nwritten < len) {
                 int nbyte = Integer.min(len - nwritten, MAX_TRANSFER_SIZE);
                 out.write(buf, pos, nbyte);
                 if (tmp != null) {
                     System.arraycopy(buf, pos, tmp, 0, nbyte);
I assume the overall performance of transferTo will be faster if we use System.arraycopy only once in line 215 to create a safe copy of the complete buf instead of calling it multiple times in a loop to create copies per slice. In that case we can omit the tmp == null case but simply use tmp = buf, making the code in the loop if-free.
There is a tradeoff here between the number of invocations of arraycopy and the amount of memory allocated for tmp. (We have seen this before in #14981 which I have allowed to languish.) The allocation limit is MAX_TRANSFER_SIZE which is presently 128 kB, so any transfer of size less than this will invoke arraycopy only once already.
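The tradeoff can be illustrated with a small counting sketch. The names are hypothetical and this is not the JDK method; MAX_TRANSFER_SIZE is assumed to be 128 * 1024 bytes, reading "128 kB" above as binary kilobytes. The temporary buffer never exceeds that bound, and the number of arraycopy calls grows with the transfer size.

```java
import java.io.ByteArrayOutputStream;

class ChunkCountDemo {
    // Assumed value: "128 kB" interpreted as 128 * 1024 bytes.
    static final int MAX_TRANSFER_SIZE = 128 * 1024;

    // Copies 'src' to 'sink' through a bounded temporary buffer and
    // returns how many times System.arraycopy was invoked.
    static int copyInChunks(byte[] src, ByteArrayOutputStream sink) {
        int len = src.length;
        if (len == 0) return 0;
        byte[] tmp = new byte[Integer.min(len, MAX_TRANSFER_SIZE)];
        int calls = 0;
        int pos = 0;
        while (pos < len) {
            int nbyte = Integer.min(len - pos, MAX_TRANSFER_SIZE);
            System.arraycopy(src, pos, tmp, 0, nbyte);
            calls++;
            sink.write(tmp, 0, nbyte);
            pos += nbyte;
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(copyInChunks(new byte[4096], new ByteArrayOutputStream()));
        System.out.println(copyInChunks(new byte[MAX_TRANSFER_SIZE + 1], new ByteArrayOutputStream()));
    }
}
```

Copying the whole buffer up front would always be one call, at the cost of allocating a second array as large as the remaining data.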
@@ -207,10 +207,21 @@ public int readNBytes(byte[] b, int off, int len) {
     public synchronized long transferTo(OutputStream out) throws IOException {
         int len = count - pos;
         if (len > 0) {
             byte[] tmp;
             if (out.getClass().getPackageName().startsWith("java.") &&
Has anybody actually estimated or measured whether such an exception is useful or needed, given that System.arraycopy is fast native code and most buffers used by streams in java.io are just a few KB? Just asking, as it could be that interpreting this Java bytecode is slower than executing some ASM ops to create a few-KB copy, and we might be doing a "premature optimization" here.
             // 'tmp' is null if and only if 'out' is trusted
             byte[] tmp;
             Class<?> outClass = out.getClass();
             if (outClass.getPackageName().equals("java.io") &&
For what do we need this string-based check here?
I suspect it's left over from a previous iteration. In any case, limiting it to a small number of output streams makes this easier to look at. BAOS and FOS seem okay, POP seems okay too but legacy and not interesting.
I suspect it's left over from a previous iteration. In any case, limiting it to a small number of output streams makes this easier to look at. BAOS and FOS seem okay, POP seems okay too but legacy and not interesting.
Agreed for a rather short list of explicitly whitelisted implementations. We should get rid of the package check.
I checked all the OutputStreams in the list for trustworthiness. The package check is vestigial; will remove. It could be useful if multiple packages were involved with multiple trusted classes in each.
                     outClass == PipedOutputStream.class)
                 tmp = null;
             else
                 tmp = new byte[Integer.min(len, MAX_TRANSFER_SIZE)];
Looks okay, I'd probably rename tmp to something better, maybe tmpbuf.
/integrate |
Going to push as commit b0d1450.
Your commit was automatically rebased without conflicts.
Pass ByteArrayInputStream.buf directly to the OutputStream parameter of BAIS.transferTo only if the target stream is in the java.io package.
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16893/head:pull/16893
$ git checkout pull/16893
Update a local copy of the PR:
$ git checkout pull/16893
$ git pull https://git.openjdk.org/jdk.git pull/16893/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 16893
View PR using the GUI difftool:
$ git pr show -t 16893
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16893.diff