Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove dangerous ByteBufStreamInput methods #27076

Merged
merged 3 commits into from
Oct 24, 2017

Conversation

Tim-Brooks
Copy link
Contributor

This commit removes the ByteBufStreamInput readBytesReference and
readBytesRef methods. These methods are zero-copy which means that
they retain a reference to the underlying netty buffer. The problem is
that our TcpTransport is not designed to handle zero-copy. The netty
implementation sets the read index past the current message once it has
been deserialized, handled, and mostly likely dispatched to another
thread. This means that netty is free to release this buffer. So it is
unsafe to retain a reference to it without calling retain. And we
cannot call retain because we are not currently designed to handle
reference counting past the transport level.

This should not currently impact us as we wrap the ByteBufStreamInput
in NamedWriteableAwareStreamInput in the TcpTransport. This stream
essentially delegates to the underling stream. However, in the case of
readBytesReference and readBytesRef it leaves thw implementations
to the standard StreamInput methods. These methods call the read byte
array method which delegates to ByteBufStreamInput. The read byte
array method on ByteBufStreamInput copies so it is safe. The only
impact of this commit should be removing methods that could be dangerous
if they were eventually called due to some refactoring.

This commit removes the `ByteBufStreamInput` `readBytesReference` and
`readBytesRef` methods. These methods are zero-copy which means that
they retain a reference to the underlying netty buffer. The problem is
that our `TcpTransport` is not designed to handle zero-copy. The netty
implementation sets the read index past the current message once it has
been deserialized, handled, and mostly likely dispatched to another
thread. This means that netty is free to release this buffer. So it is
unsafe to retain a reference to it without calling `retain`. And we
cannot call `retain` because we are not currently designed to handle
reference counting past the transport level.

This should not currently impact us as we wrap the `ByteBufStreamInput`
in `NamedWriteableAwareStreamInput` in the `TcpTransport`. This stream
essentially delegates to the underling stream. However, in the case of
`readBytesReference` and `readBytesRef` it leaves thw implementations
to the standard `StreamInput` methods. These methods call the read byte
array method which delegates to `ByteBufStreamInput`. The read byte
array method on `ByteBufStreamInput` copies so it is safe. The only
impact of this commit should be removing methods that could be dangerous
if they were eventually called due to some refactoring.
Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I left a suggestion about making it more future-proof.

BytesRef bytesRef = new BytesRef(buffer.array(), buffer.arrayOffset() + buffer.readerIndex(), length);
buffer.skipBytes(length);
return bytesRef;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we keep them, delegating to the parent impl with a comment, eg.

     @Override
    public BytesReference readBytesReference(int length) throws IOException {
        // NOTE: It is unsafe to share a reference of the internal structure, so we
        // need to return a copy, like the default implementation does.
        return super.readBytesReference(length);
    }

My motivation is that it won't be possible to optimize it unsafely in the future without seeing that comment.

@Tim-Brooks Tim-Brooks merged commit a7fa5d3 into elastic:master Oct 24, 2017
Tim-Brooks added a commit that referenced this pull request Oct 24, 2017
This commit removes the `ByteBufStreamInput` `readBytesReference` and
`readBytesRef` methods. These methods are zero-copy which means that
they retain a reference to the underlying netty buffer. The problem is
that our `TcpTransport` is not designed to handle zero-copy. The netty
implementation sets the read index past the current message once it has
been deserialized, handled, and mostly likely dispatched to another
thread. This means that netty is free to release this buffer. So it is
unsafe to retain a reference to it without calling `retain`. And we
cannot call `retain` because we are not currently designed to handle
reference counting past the transport level.

This should not currently impact us as we wrap the `ByteBufStreamInput`
in `NamedWriteableAwareStreamInput` in the `TcpTransport`. This stream
essentially delegates to the underling stream. However, in the case of
`readBytesReference` and `readBytesRef` it leaves thw implementations
to the standard `StreamInput` methods. These methods call the read byte
array method which delegates to `ByteBufStreamInput`. The read byte
array method on `ByteBufStreamInput` copies so it is safe. The only
impact of this commit should be removing methods that could be dangerous
if they were eventually called due to some refactoring.
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Oct 24, 2017
* master:
  Timed runnable should delegate to abstract runnable
  Expose adaptive replica selection stats in /_nodes/stats API
  Remove dangerous `ByteBufStreamInput` methods (elastic#27076)
  Blacklist Gradle 4.2 and above
  Remove duplicated test (elastic#27091)
  Update numbers to reflect 4-byte UTF-8-encoded characters (elastic#27083)
  test: avoid generating duplicate multiple fields (elastic#27080)
  Reduce the default number of cached queries. (elastic#26949)
  removed unused import
  ingest: date processor should not fail if timestamp is specified as json number
@Tim-Brooks Tim-Brooks deleted the fix_netty_stream branch November 14, 2018 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Network Http and internode communication implementations >non-issue v6.1.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants