forked from dustin/java-memcached-client
-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NPE in fillWriteBuffer() with binary protocol #17
Open
lmikkelsen
wants to merge
1
commit into
couchbase:master
Choose a base branch
from
lmikkelsen:binary-npe
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
I apologize for using a PR to report an issue, but I'm unable to find a public issue tracker for spymemcached. Under certain circumstances `TCPMemcachedNodeImpl.fillWriteBuffer()` will throw a NullPointerException because the current write operation has transitioned to the `COMPLETE` state and thus doesn't have a write buffer. This issue has previously been reported in couchbase#16 and https://code.google.com/archive/p/spymemcached/issues/304. I'm able to consistently reproduce the issue using spymemcached 1.12.1 and memcached 1.4.34 installed using Homebrew on macOS 10.12.2. The daemon is running with the default options (`memcached -l localhost`). I'm running the included `Main.java` with the 1.12.1 release JAR: ``` $ javac -classpath spymemcached-2.12.1.jar Main.java $ java -cp spymemcached-2.12.1.jar:. Main ``` It consistently produces the following output: ``` 2017-01-23 20:40:24.678 INFO net.spy.memcached.MemcachedConnection: Added {QA sa=localhost/127.0.0.1:11211, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue 2017-01-23 20:40:25.170 INFO net.spy.memcached.MemcachedConnection: Reconnecting due to exception on {QA sa=localhost/127.0.0.1:11211, #Rops=0, #Wops=155, #iq=6, topRop=null, topWop=Cmd: 1 Opaque: 81 Key: test Cas: 0 Exp: 60 Flags: 2048 Data Length: 2097152, toWrite=0, interested=5} java.lang.NullPointerException at net.spy.memcached.protocol.TCPMemcachedNodeImpl.fillWriteBuffer(TCPMemcachedNodeImpl.java:206) at net.spy.memcached.MemcachedConnection.handleWrites(MemcachedConnection.java:800) at net.spy.memcached.MemcachedConnection.handleReadsAndWrites(MemcachedConnection.java:723) at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:683) at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:436) at net.spy.memcached.MemcachedConnection.run(MemcachedConnection.java:1446) 2017-01-23 20:40:25.172 WARN net.spy.memcached.MemcachedConnection: Closing, and reopening {QA sa=localhost/127.0.0.1:11211, #Rops=0, #Wops=155, #iq=9, topRop=null, topWop=Cmd: 1 Opaque: 81 Key: test Cas: 0 Exp: 60 Flags: 2048 Data Length: 2097152, toWrite=0, interested=5}, attempt 0. 2017-01-23 20:40:25.173 INFO net.spy.memcached.protocol.binary.BinaryMemcachedNodeImpl: No buffer for current write op, removing 2017-01-23 20:40:25.173 WARN net.spy.memcached.MemcachedConnection: Could not redistribute to another node, retrying primary node for test. 2017-01-23 20:40:25.174 WARN net.spy.memcached.MemcachedConnection: Could not redistribute to another node, retrying primary node for test. 2017-01-23 20:40:25.175 WARN net.spy.memcached.MemcachedConnection: Could not redistribute to another node, retrying primary node for test. ``` Occasionally the IO thread dies: ``` 2017-01-23 20:40:27.116 INFO net.spy.memcached.MemcachedConnection: Added {QA sa=localhost/127.0.0.1:11211, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue Exception in thread "Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11211}" java.lang.NullPointerException at net.spy.memcached.protocol.TCPMemcachedNodeImpl.fillWriteBuffer(TCPMemcachedNodeImpl.java:206) at net.spy.memcached.MemcachedConnection.handleWrites(MemcachedConnection.java:800) at net.spy.memcached.MemcachedConnection.handleInputQueue(MemcachedConnection.java:582) at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:409) at net.spy.memcached.MemcachedConnection.run(MemcachedConnection.java:1446) ``` As far as I can tell the condition occurs when `MemcachedConnection.handleWrites()` exits before having written all of the current operation because the buffer is full. If the operation causes the daemon to return an error `OperationImpl.finishedPayload()` will transition the operation to the `COMPLETE` state (which sets the buffer to null). Since the operation isn't removed from the write queue a NullPointerException is thrown on the next `handleWrites()`. I'm using a large (2MB) data object to trigger an `ERR_2BIG` error, but I believe the condition could happen with any of the `ERR_BUSY`, `ERR_TEMP_FAIL`, etc. errors as well: https://github.com/couchbase/spymemcached/blob/2.12.1/src/main/java/net/spy/memcached/protocol/binary/OperationImpl.java#L244-L252
Hi, thanks for opening the PR here - its fine! Btw the issue tracker used is here: https://issues.couchbase.com/browse/SPY I'll look into it and report back |
pbruneton
pushed a commit
to criteo-forks/spymemcached
that referenced
this pull request
Aug 21, 2018
Related to BES-16032, this patch helps to gather more information about the issue encountered that kill the Memcached IO thread. Please note that this patch will not kill the Memcached IO thread which will continue to run. Previous known bug reports related to this issue are: couchbase#7 couchbase#17 Change-Id: If1d9887963e96ac6ee4a8ec0dd0c19626e521346 JIRA:WBSC-3570
A fix for this problem can be found in PR #38 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
I apologize for using a PR to report an issue, but I'm unable to find a public
issue tracker for spymemcached.
Under certain circumstances
TCPMemcachedNodeImpl.fillWriteBuffer()
will throwa NullPointerException because the current write operation has transitioned to
the
COMPLETE
state and thus doesn't have a write buffer. This issue haspreviously been reported in #16 and
https://code.google.com/archive/p/spymemcached/issues/304.
I'm able to consistently reproduce the issue using spymemcached 1.12.1 and
memcached 1.4.34 installed using Homebrew on macOS 10.12.2. The daemon is
running with the default options (
memcached -l localhost
).I'm running the included
Main.java
with the 1.12.1 release JAR:It consistently produces the following output:
Occasionally the IO thread dies:
As far as I can tell the condition occurs when
MemcachedConnection.handleWrites()
exits before having written all of thecurrent operation because the buffer is full. If the operation causes the
daemon to return an error
OperationImpl.finishedPayload()
will transition theoperation to the
COMPLETE
state (which sets the buffer to null). Since theoperation isn't removed from the write queue a NullPointerException is thrown
on the next
handleWrites()
.I'm using a large (2MB) data object to trigger an
ERR_2BIG
error, but Ibelieve the condition could happen with any of the
ERR_BUSY
,ERR_TEMP_FAIL
,etc. errors as well:
https://github.com/couchbase/spymemcached/blob/2.12.1/src/main/java/net/spy/memcached/protocol/binary/OperationImpl.java#L244-L252