
Fix: NPE in fillWriteBuffer #38

Open · wants to merge 1 commit into master

Conversation

mako-fdc

mako-fdc commented Oct 2, 2019

This changeset fixes a NullPointerException (NPE) in the fillWriteBuffer method.

The NPE can occur if an operation is transitioned prematurely into the state OperationState.COMPLETED due to the payload error code ERR_2BIG.

The situation can be reproduced with a while(true) loop that tries to store too-large objects in memcached.

The original reproduction method and the issue description were reported in:
#17

Also, successive duplicate calls to getters in the fillWriteBuffer and toString methods have been replaced with a single getter call, to avoid race conditions.

Signed-off-by: Marco Kortkamp <[email protected]>
@theonajim

ERR_2BIG, I believe, occurs due to a mismatch in client/server configuration. The default SerializingTranscoder allows a 20 MB value, while open-source memcached allows only a 1 MB value by default. To avoid ERR_2BIG, the SerializingTranscoder can be constructed with a size that matches the server (1 MB = 1 * 1024 * 1024), so that large values are rejected on the client side: https://github.com/couchbase/spymemcached/blob/master/src/main/java/net/spy/memcached/transcoders/SerializingTranscoder.java#L63-L65
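As a self-contained sketch of that client-side rejection (the class, constant, and method names here are illustrative, not spymemcached API):

```java
// Illustrative size guard mirroring the transcoder's max-size check.
// SizeGuard, MAX_VALUE_BYTES, and fitsServerLimit are hypothetical names,
// not part of spymemcached.
public class SizeGuard {
    // Match the open-source memcached default item size limit of 1 MB.
    static final int MAX_VALUE_BYTES = 1 * 1024 * 1024;

    static boolean fitsServerLimit(byte[] serializedValue) {
        return serializedValue.length <= MAX_VALUE_BYTES;
    }
}
```

With the library itself, the equivalent is passing the matching cap to the transcoder constructor, e.g. `new SerializingTranscoder(1024 * 1024)`, and handing that transcoder to the client's connection factory.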

@favila

favila commented Oct 15, 2019

We've had a years-long problem in production (only in production!) where, under load, we would get the following exception:

Uncaught Exception on thread  Memcached IO over {MemcachedConnection to ....} :
NullPointerException
        net.spy.memcached.protocol.TCPMemcachedNodeImpl.fillWriteBuffer (TCPMemcachedNodeImpl.java:206)
        net.spy.memcached.MemcachedConnection.handleWrites (MemcachedConnection.java:800)
        net.spy.memcached.MemcachedConnection.handleInputQueue (MemcachedConnection.java:582)
        net.spy.memcached.MemcachedConnection.handleIO (MemcachedConnection.java:409)
        net.spy.memcached.MemcachedConnection.run (MemcachedConnection.java:1445)

The memcached connection on these clients would become unusable but would neither die nor reconnect, so the client process would continue in a performance-degraded state. To mitigate this, we added a default uncaught exception handler that looked for this exception and terminated the process when it was found (triggering a restart and a fresh process in the cluster). However, these exceptions often came in storms, which would bring down half the cluster within a second.
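A minimal sketch of such a watchdog (NpeWatchdog, isFillWriteBufferNpe, and install are illustrative names; the actual handler in our deployment may differ):

```java
// Illustrative mitigation, not part of spymemcached: terminate the process
// when the known NPE escapes the memcached IO thread, so an external
// supervisor can restart it with a fresh connection.
public class NpeWatchdog {
    static boolean isFillWriteBufferNpe(Throwable t) {
        if (!(t instanceof NullPointerException)) {
            return false;
        }
        // Match the stack trace reported above.
        for (StackTraceElement frame : t.getStackTrace()) {
            if ("net.spy.memcached.protocol.TCPMemcachedNodeImpl".equals(frame.getClassName())
                    && "fillWriteBuffer".equals(frame.getMethodName())) {
                return true;
            }
        }
        return false;
    }

    static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, throwable) -> {
            if (isFillWriteBufferNpe(throwable)) {
                // Crash fast; the cluster supervisor brings up a fresh process.
                System.exit(1);
            }
        });
    }
}
```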

We've had this problem with spymemcached client versions from v2.11.4 through v2.12.3.

It is very unlikely these were caused by ERR_2BIG for us. They were probably caused by ERR_BUSY or maybe even abrupt connection closes from memcached. They correlated with times when there was likely a write stampede from many clients attempting to write the same value to the same key.

Anyway, we ran this patch in production and it completely resolved our issues. This is the fork we are running in production.

Please do not get distracted by the ERR_2BIG part of this: it just happens to be the easiest error condition to produce that triggers the problem. Other error codes and error states cause this as well. And thank you @mako-fdc for this high-quality patch and test.

@timothypratley

Great solution! Thank you, this will fix an issue for us.

@mako-fdc
Author

mako-fdc commented Oct 17, 2019

@favila Thank you for testing this so quickly in production.

Yes, for us ERR_2BIG was a plausible and the most reproducible cause. But ERR_BUSY could also be a likely trigger.

@favila

favila commented Oct 17, 2019

@mako-fdc Do you mind getting in touch with me via email? We at Clubhouse would like to mail you some token of our appreciation (t-shirts) if you are interested. I can't find any other way to contact you other than here.

@favila

favila commented Nov 11, 2019

I have linked to this issue from the couchbase issue tracker to hopefully give it more visibility.

@favila

favila commented Dec 19, 2019

@mako-fdc I identified an issue with this PR.

When a memcached server responds to an operation before the client has finished sending all bytes related to that operation, the operation's buffer is destroyed, so the pending bytes are lost. This used to cause IO thread death, but with your PR it causes the operation to be dropped.

Unfortunately, I think merely dropping the operation is not aggressive enough. If, for example, the write just sent "10 bytes to follow" and has only sent 5 bytes, the remaining 5 bytes will never be sent (they were released by the read handler for that op), and the client will start sending the next packet in what the server may still think is the payload for the first packet. We have observed corrupted values stored in memcached server 1.4.34 after running your patch: the value was truncated and followed by raw binary memcached packets in the payload. (We don't understand why memcached stored those instead of discarding them, or what the response was; we are still investigating.)

Perhaps the best answer is not to null the write buffer at read time and to only allow it to be cleared at write time. However, this is a more complex change, and I am worried that it is not the right answer for every possible response; e.g., it makes no sense to keep sending if the server immediately says ERR_2BIG, as that just wastes bandwidth.

So absent that, I think the safest thing to do is reconnect so the client and server can sync up again. This is what I do in this commit. We are currently running this set of changes in production and gathering more information.
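The framing hazard described above boils down to a simple invariant check. This is a hedged sketch of the idea, not the actual code in the linked commit; WriteSync and needsReconnect are illustrative names:

```java
import java.nio.ByteBuffer;

// Illustrative invariant behind the reconnect strategy: if an operation is
// already complete (the server answered) while its write buffer still holds
// unsent bytes, the server's view of the byte stream is out of sync with the
// client's, so the safest recovery is to drop and rebuild the connection.
public class WriteSync {
    static boolean needsReconnect(boolean opCompleted, ByteBuffer writeBuffer) {
        // hasRemaining() == true means part of the request body was never
        // flushed, e.g. "10 bytes to follow" announced but only 5 bytes sent.
        return opCompleted && writeBuffer != null && writeBuffer.hasRemaining();
    }
}
```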

@mako-fdc
Author

I'm sorry to hear that. We haven't encountered that kind of problem with our setup yet.

Sadly, reproducing such problems locally is usually quite hard. It seems to require some network latency, etc., to occur.

Your solution with the reconnection looks good to me. Indeed, the other solution mentioned might be cleaner, but it will be more difficult to implement due to unit-testing and reproduction issues.

Happy holidays!

@favila

favila commented Dec 20, 2019

Followup: the response we were getting was "out of memory" (response code 0x82), meaning out of slab memory, i.e. too many SETs came in simultaneously destined for the same chunk (as in a stampede; all our clients were trying to write the same value to the same key). In other words, it is this issue.

This still doesn't explain memcached storing those bytes from the rejected operation (that seems like a memcached server bug; we are using 1.4.34), but this is the scenario that can trigger it. Our reconnect strategy seems to work for preventing corruption.

@nishantagarwal1990

Is there a timeline for fixing this issue and releasing a new version?
