java.io.EOFException: EOF while reading packet #653
Comments
I have the same issue, works with 0.10.0 but not with 0.11.1 - trying to connect to Samba server running in docker on Raspberry Pi - fresh Samba/Docker install from this repo: https://github.com/alexandreroman/rpi-samba Using this Samba config: [global] |
I am seeing the same issue with a NAS device. Is there any resolution to this issue, or any known workaround? I am using SMBJ 0.11.1, and I am seeing the same regression from 0.10.0, where this was working. I have also tried to force the dialect to SMB 2.1 and that does not seem to help. |
Recently tried 0.11.5, same error. We still have to stick to 0.10.0. |
We (knime.com) have recently upgraded to 0.11.5 and are seeing this issue with our customers as well. Is there any news on how this could be fixed or worked around (other than downgrading to 0.10)? |
I've seen similar problems in one customer's environment. I don't know for sure yet what environment he is using, but I think that in this specific case the SMB server is some kind of IBM solution. I'm still trying to get the necessary data from the customer and need to anonymize it. I hope I'm able to provide useful logs later on, although I'm afraid I won't manage to get a TCP dump. I can already tell that in this specific case, the SMB_2_1 dialect is negotiated between server and client. The "EOF while reading packet" occurs when the DirectTcpPacketReader reads the header of the second SMB2_SESSION_SETUP packet (the first packet has status STATUS_MORE_PROCESSING_REQUIRED):
Browsing old commits, I found this change. I'm not sure if I understand the code correctly, but it seems to me like this could be a regression because the SMBSessionBuilder (which since 0.11.x replaces the old code linked above) only respects a so-called "preauthSession" when the SMB_3_1_1 dialect is used. |
Good morning, I just got all the logs together and stripped the customer's data from them. While I think my finding from above may still be a problem, it seems not to be the cause of the troubles my customer is experiencing. I've run a test using the SMB_3_1_1 dialect, so the questionable if-conditions should be fulfilled, but the error remains the same. I've written a simple test application which creates a folder on a share. I've built this test application once with smbj 0.10.0 and once with 0.11.5; the latter variant I've run both with the SMB_2_1 dialect set in SmbConfig and with the dialects left at their defaults. The log output (level TRACE) is below. 0.10.0 - working example
0.11.5, using SMB_2_1 - not working
0.11.5, dialect not configured (using SMB_3_1_1) - not working
Would be great if there was a solution, as other customers are moving on to SMB3 shares requiring encryption and unfortunately the problems in this specific customer's environment block an update. |
@hierynomus Thank you for looking into this. I've just built from the branch and deployed a new version of my test application using smbj in the customer's environment. Unfortunately, the behavior is the same: it still fails processing the response to the second SMB2_SESSION_SETUP request, throwing an EOFException during readTcpHeader. Please let me know if there is some way I can help find the cause. I guess a TCP dump would be helpful, but unfortunately, I will not get clearance from the customer for this. Maybe one could build in some well-placed additional logging? |
Yes, correct, I realized last night that it might not be correctly fixed. |
@ZwoCa The branch is updated. If all's correct it should now calculate the correct MIC 🙈 ... The other option I can see is that we don't send a MIC, but I'd rather fix it correctly. |
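For readers unfamiliar with where an EOFException can come from at this point: SMB2 over Direct TCP prefixes each message with a 4-byte header (one zero byte, then a 3-byte big-endian length). The following is a minimal sketch of reading that header, not smbj's actual DirectTcpPacketReader; the class and method names are made up for illustration. It shows that the exception appears whenever the peer closes the connection before a full header arrives.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Illustrative only; hypothetical names, not smbj code. */
final class DirectTcpHeaderSketch {

    /**
     * Reads the 4-byte Direct TCP header: one zero byte followed by the
     * 3-byte big-endian length of the SMB2 message that follows.
     */
    static int readPacketLength(InputStream in) throws IOException {
        byte[] header = new byte[4];
        // readFully throws EOFException if the stream ends before 4 bytes arrive,
        // for example when the server drops the connection instead of replying.
        new DataInputStream(in).readFully(header);
        if (header[0] != 0) {
            throw new IOException("Unexpected first header byte: " + header[0]);
        }
        return ((header[1] & 0xFF) << 16) | ((header[2] & 0xFF) << 8) | (header[3] & 0xFF);
    }

    public static void main(String[] args) throws IOException {
        byte[] complete = {0x00, 0x00, 0x01, 0x02};
        System.out.println("length = " + readPacketLength(new ByteArrayInputStream(complete)));
        try {
            // An empty stream stands in for a connection the server has closed.
            readPacketLength(new ByteArrayInputStream(new byte[0]));
        } catch (EOFException e) {
            System.out.println("EOF while reading header");
        }
    }
}
```

In other words, an EOF at this stage usually says more about the server rejecting the previous request than about the header parsing itself.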
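As background on the MIC mentioned here: MS-NLMP defines it as HMAC-MD5, keyed with the exported session key, over the concatenation of the NEGOTIATE, CHALLENGE and AUTHENTICATE messages, with the MIC field of the AUTHENTICATE message zeroed while computing. A minimal sketch using only the JDK follows; the class, method and parameter names are hypothetical and this is not the code on the branch.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative only; hypothetical names, not smbj code. */
final class NtlmMicSketch {

    /**
     * Computes HMAC-MD5 over negotiate || challenge || authenticate.
     * The caller is assumed to pass the AUTHENTICATE message with its
     * 16-byte MIC field still set to zero.
     */
    static byte[] computeMic(byte[] exportedSessionKey,
                             byte[] negotiate,
                             byte[] challenge,
                             byte[] authenticateWithZeroMic) throws GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacMD5");
        mac.init(new SecretKeySpec(exportedSessionKey, "HmacMD5"));
        // Feeding the messages one after another is equivalent to hashing their concatenation.
        mac.update(negotiate);
        mac.update(challenge);
        mac.update(authenticateWithZeroMic);
        return mac.doFinal();
    }

    public static void main(String[] args) throws GeneralSecurityException {
        // Dummy inputs, only to show the call shape; real messages come from the handshake.
        byte[] mic = computeMic(new byte[16], new byte[]{1}, new byte[]{2}, new byte[]{3});
        System.out.println("MIC length: " + mic.length);
    }
}
```

Because the MIC covers the exact bytes of all three messages, any difference between what was hashed and what was actually sent makes the server reject the AUTHENTICATE message.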
If you could test it out that would be great |
Thanks again @hierynomus - I'm happy to test it. I will have to wait until Monday or Tuesday though, before I'm able to reach out to the customer again. I've seen that there is a new option in SmbConfig regarding the Windows version. I guess this is something I should "break out" in my test application, so I can play around with this setting in the customer's environment? |
I don't think it's needed to change. Not sure yet whether I'll keep it in SmbConfig or will hardcode it below the surface. It's mainly just passed around during Ntlm authentication for debugging purposes on client/server side. As far as the spec's concerned there should be no version dependent behaviour, but ymmv... |
I've tested the new build. It seems like there is a bit of progress; however, SMB2_SESSION_SETUP still fails. Instead of throwing an EOFException when trying to read the TCP header of the second SMB2_SESSION_SETUP request, it now fails because the server sends a STATUS_LOGON_FAILURE (0xc000006d) response: 0.11.6-SNAPSHOT
I've double-checked the login credentials, they are correct, switching back to SMBJ 0.10.0 with the same credentials it works as expected. The password does not contain any weird characters, just lower case characters and a number. The username is 7 characters long and alpha-numeric only. |
Well that's good and bad news, this means that the new Session Setup is accepted by the remote side. Meaning that at least indeed part of the work is correct ;) Question: Is it possible for you/your client to obtain the server log to see why it complains that there is a LOGON failure? Feel free to forward it to my email if they don't want to drop it here. |
Thank you. It will be a bit of a challenge to find someone in charge who is willing to help, but I'm trying my best. It may take some days, though. |
No problem... Just a thought that crossed my mind, which SecurityProvider are you using? I've noticed an inconsistency whilst testing the NtlmV2Functions... The JceSecurityProvider gives the expected results (according to the MS-NLMP samples provided). However, the BCSecurityProvider, and the JceSecurityProvider backed by BouncyCastle give different/wrong results. |
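A quick way to compare providers on a given machine is a known-answer test. The sketch below checks every installed JCA provider that offers HmacMD5 against test case 1 of RFC 2202. Choosing HmacMD5 is an assumption made for illustration; the comment above does not say which primitive produced the differing results, and the class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.Provider;
import java.security.Security;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative only; checks providers against RFC 2202, HMAC-MD5 test case 1. */
final class HmacMd5ProviderCheck {

    private static final String EXPECTED = "9294727a3638bb1c13f48ef8158bfc9d";

    static boolean matchesRfcVector(Provider provider) throws GeneralSecurityException {
        byte[] key = new byte[16];
        Arrays.fill(key, (byte) 0x0b);
        Mac mac = Mac.getInstance("HmacMD5", provider);
        mac.init(new SecretKeySpec(key, "HmacMD5"));
        byte[] digest = mac.doFinal("Hi There".getBytes(StandardCharsets.US_ASCII));
        return toHex(digest).equals(EXPECTED);
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws GeneralSecurityException {
        // getProviders returns null when no installed provider offers the algorithm.
        Provider[] providers = Security.getProviders("Mac.HmacMD5");
        if (providers == null) {
            System.out.println("No installed provider offers HmacMD5");
            return;
        }
        for (Provider provider : providers) {
            System.out.println(provider.getName() + ": " + matchesRfcVector(provider));
        }
    }
}
```

If BouncyCastle is registered as a JCA provider it shows up in the same loop, so both implementations can be compared side by side.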
Up to now I didn't modify the SecurityProvider via SmbConfig, so BCSecurityProvider was used. However, I just changed it to JceSecurityProvider, but the error is still the same. |
Ahh too bad, that would've been too easy... |
I've got an update. The bad news is that getting an excerpt from the logs or a TCP dump will require a lot of effort. Administration of the client's SMB servers is handled by another company. There are a lot of parties involved and this would probably need NDAs and someone from management to push it. However, someone from SMB ops took a look at the TCP dumps himself and while I'm not allowed to get or share them, I've been given at least one small hint on what might be going wrong: When using SMBJ 0.10.0, for the second Session Setup Request (NTLMSSP_AUTH) the TCP dump will show: "User: europe.domain\qualifieduser" (actual data anonymized). However, when doing the same with SMBJ 0.11.5, the TCP dump will show "User: e\q". So only the first letter of domain and username is sent, which of course would explain the STATUS_LOGON_FAILURE response. I'm sorry I can't provide any further details, but I hope this makes sense and gives a hint in the right direction? |
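The thread does not establish why the capture showed "e\q". One mechanism that produces exactly this kind of display, offered here purely as an assumption, is a string encoded as UTF-16LE being read by a party that expects a single-byte, NUL-terminated string: every second byte of ASCII text in UTF-16LE is zero, so such a reader stops after the first character. The sketch below demonstrates only that effect; the class name is hypothetical and nothing here is taken from smbj.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only; demonstrates an encoding mismatch, not smbj behavior. */
final class NtlmStringEncodingSketch {

    /** Decodes bytes the way a reader expecting a NUL-terminated single-byte string would. */
    static String readAsSingleByteString(byte[] raw) {
        int end = 0;
        while (end < raw.length && raw[end] != 0) {
            end++;
        }
        return new String(raw, 0, end, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The anonymized values quoted in the comment above.
        byte[] domain = "europe.domain".getBytes(StandardCharsets.UTF_16LE);
        byte[] user = "qualifieduser".getBytes(StandardCharsets.UTF_16LE);

        // Read with the matching charset, the full names come back.
        System.out.println(new String(domain, StandardCharsets.UTF_16LE)
                + "\\" + new String(user, StandardCharsets.UTF_16LE));

        // Read as single-byte strings, only the first character of each survives.
        System.out.println(readAsSingleByteString(domain) + "\\" + readAsSingleByteString(user));
    }
}
```

Whether the sender and receiver actually disagreed about the string encoding in this case would have to be confirmed from the negotiated NTLM flags in a capture.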
Let me quickly dive into the old code :) Not sure I can explain the difference. Mainly also because I did a walkthrough of the code just last few days |
I've looked and looked and cannot explain that currently. I've added trace logging of NTLM messages to this branch now. It should log a nice toString for each NTLM message (received and sending) to see whether there's something in there that we can deduce. Could you run a build of this branch using your sample program? |
@ZwoCa Another question, you've mentioned that 0.10.0 works correctly, and 0.11.5 does not. Not sure whether you've tried narrowing down the version range. Could you try the following 2 versions if it's not too much to ask:
Thanks! |
Hi Jeroen, of course I'm happy to try this. Thank you very much for your support. :)
Regarding the added trace logging, here is a fresh log: 0.11.6-SNAPSHOT, commit 449d5d7
Again, it's anonymized, but I tried to keep the different entries consistent. Please let me know if you need the ntResponse; I could share it via email but need to get clearance first. What I found to be kind of interesting: The trace log for NtlmAuthenticate shows the complete domain and user name (europe.domain\qualifieduser). In yesterday's TCP dump, domain and user name were also fully readable with 0.10.0, but truncated to one character (e\q) when using 0.11.6-SNAPSHOT: yesterday's excerpt from TCP dump, 0.11.6-SNAPSHOT
|
I just realized something a bit strange: Please have a look at NtlmNegotiate log entry:
← The content of domain is a single quotation mark, while the content of workstation is what I'd expect to be the domain. However, NtlmAuthenticate looks better:
|
@ZwoCa You are correct, I mixed up the two fields when constructing the NtlmNegotiate. Just pushed the fix for that to the branch, you can try it out, but I'm not sure that it will make a huge difference. |
@hierynomus Thanks for the fix. The output regarding NtlmNegotiate looks reasonable now, but unfortunately and as expected it didn't make a difference regarding the STATUS_LOGON_FAILURE. I'm afraid air is getting thin, is there anything else to try? I guess that given this error, normally one would take a look at the server side for further debugging… But the provider of the Samba service has already indicated that they don't see it as their problem and have no intention of taking action here. |
Well there are a few more options we can try luckily, in order:
For me the hard part is that I'm not sure how to reproduce this on my samba docker image. That would considerably speed up testing efforts. |
Good morning @ZwoCa, do you have any update on the first two points? |
Well at least the semi-positive news is that the |
Unfortunately, no news so far. The customer did not simply move the share to a new host, but they switched the entire storage solution. Formerly, it was "on DELL-System [sic]", now they switched to NetApp. So it's a completely new infrastructure, sub-contracted to a new company. There may still be a chance that some users at the customer site are still using the old infrastructure, that might behave like the test host that was shut down. I'm in touch with them, but it will take me some time to sort that out. Hopefully, I can use one of these (production) hosts for testing purposes. |
Just a quick update: I'm in touch with a user who appears to still be using one of the old, problematic shares. I'll have the chance to test it tomorrow, so hopefully I'll be able to reproduce the original issue with 0.11.5 and test if commit 72fbbe5 is working. |
Great to hear! In the meantime I've pushed an additional commit: 900621d. And I've got one more lined up. It would be great if you could (after reproducing the EOF with 0.11.5) iteratively try the commits from 72fbbe5 onwards. Then I'll also push the commit that I have locally, and we can make fast steps (before they also decommission their env 🙈 😉 ). Do you agree to this? |
May I ask you to already push this commit? If you don't want to push it to the current branch, maybe to another (temporary) branch? This way, I could prepare the necessary test application(s). The user does not really have anything to do with the matter and only makes himself and his environment available out of pure kindness, so I don't want to stretch his patience, and want to keep him from doing his job as little as possible. 🙈 |
Done! 5aba7cf is also present... 🤞 Let me know if one of these commits starts breaking... |
Good news, the user's share still uses the old environment, so I can recreate the error again. Commits up to and including 900621d work fine. However, with 5aba7cf, the error is back:
I hope this helps narrow down the problem? |
Hi, well that's (semi) good news... |
Hi @hierynomus - we ran the test, but e16fb93 is also failing: Same error message as witnessed before. So the culprit appears to be in this piece of code? |
Ok, now this is interesting indeed. What's the |
It's pretty default: The only change on the default config from SmbConfig.build() is that Also timeouts are increased, but I guess they are not of relevance here. |
Ok! Let me check the code... Will be back tomorrow probably ;) |
Hi @hierynomus, I just wanted to let you know that from tomorrow on I'm on vacation (and also business trips) until the end of the month. Therefore, I will not have a chance to test any new commits in the customer's environment until I'm back in July. However, a colleague of mine has agreed to follow up on the topic and to try out any new commits at the customer's site. So if you have something new to test, just post an update - my colleague can probably give you feedback shortly. Also, I want to take this opportunity to thank you for being so committed to finding the cause of the issue. |
Hi Sebastian,
Enjoy the holiday! What's the github handle of your colleague? I'll tag him
next week when I've had time to make a new commit.
Talk to you soon!
|
Hi @hierynomus - I'm not the named colleague of @ZwoCa but we also experienced the error with a NAS server on the customer side using v0.11.5. I have been following your conversation and thought I could support you both until @ZwoCa is back from vacation. I don't want to add confusion, but just want to let you know that with a build of the current working-auth branch (e648ab6) we could authenticate and push a file. I then reverted the commit and merged e16fb93, which leads to failing authentication again, just to be sure I wasn't doing anything dumb. In addition, I tried current master, which is also failing with the customer's NAS server. |
Hi @JanDornseifer, wow, thanks a lot for jumping in and testing this. It seems that we're on the right track again with the new commit, which is great to know. There's a few things that need to be added now to the Authenticator to bring it back up to speed with what's in the spec. I'll try to add a next increment tomorrow or Friday. Would be great if you could test that when it's there. |
Hi @hierynomus, I just built an updated version of our test application using your latest commit. However, I guess feedback will have to wait until next week, as our contact at the customer's site will hardly be available this week. |
I was lucky and just found a time slot to try it out at our customer's site. I can confirm @JanDornseifer's results, it's still working with commit e648ab6. 👍 |
Hi @hierynomus - sure, I will try to support you both as best I can. I will test new commits in parallel with @ZwoCa. |
@ZwoCa Great to hear that it also works for your customers! Let's bring this to the finish line ;) |
Good morning @hierynomus. I'm happy to be able to tell you that it's still working fine with your latest commit. 🎉 Thank you so much for your efforts. So finish line is ultimately in reach, I guess? Are there any more things to try or is it only a question of releasing a new (non-snapshot) version now? I see that there is still a WORKING_AUTH_TODO file, but it seems to be out of date. |
@ZwoCa Goodmorning! Well the main thing I would still need to do now is to make the default for But other than that, the current working-auth branch is back on par with master and has solved the direct problems. My course of action will be:
|
@ZwoCa I've just merged master into the working-auth branch, can you do a sanity check that nothing broke before I continue |
Maybe I'm missing something, but merging master into the working-auth branch made only one actual change: In the NtlmAuthenticate class, two flags for integrityEnabled and omitVersion were added (but they are not used). Apart from that, a comment in NtlmAuthenticator changed as well as a test. Was this really your intention? |
Yes, that indeed should be what happened. Those 2 flags still need to be removed, but other than that, I overrode the few conflicting changes there were with the branch's contents. |
OK. Sanity check was successful, we just ran the test in the customer's environment. 👍 |
@hierynomus @ZwoCa I can also confirm. Works with NAS server of our customer as well. |
Given that the branch has been merged, I'm going to close this large thread ;) |
We are currently evaluating smbj and tested it using 3 shares (cannot add the real names here):
Share 1:
\\samba_server\smb_with_optional_encryption
Share 2:
\\windows_server\smb_with_enforced_encryption
Share 3:
\\nas_server\smb_without_encryption
Using smbj 0.10.0 we saw the following behaviour:
Share 1 is OK: accessible (SMB_2_1)
Share 2 is OK: access denied (SMB_2_1)
Share 3 is OK: accessible (SMB_2_1)
Using smbj 0.11.1 we got an error with share 3:
Share 1 is OK: accessible (SMB_3_1_1)
Share 2 is OK: accessible (SMB_3_0_2)
Share 3 is not OK: throws the following error, no matter which configuration parameters we use or if we use no configuration at all:
We used the following code to test the library (only changed the SmbConfig using different parameters, e.g. withEncryptData(true) etc.):
Any idea what's going wrong here?