lpstat hangs with stopped cupsd on Solaris #156
Comments
@l1gi This code implements an asynchronous connect so that we can try all of the server's potential addresses at the same time. Clearly Solaris doesn't support this behaviour, so my inclination is to provide a non-async code path just for Solaris.
Why do you think Solaris doesn't support this? What exactly does Solaris do differently? As far as I can see, poll() returns 1 with errno 150 (EINPROGRESS) for ::1, and CUPS then starts to believe it is already connected. The Unixv7 standard says:
When poll() returns a positive number, that does not mean the socket is connected. It only says that something has happened on the socket. We must check the socket status to ensure it is connected.
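To make the "check the socket status" step concrete, here is a minimal sketch of a non-blocking connect that asks the socket for its pending error after poll() returns. It is not the CUPS code: `pending_connect_error`, `connect_checked`, and `reserve_refusing_port` are hypothetical names used only for illustration, and the sketch only talks to the IPv4 loopback address. The getsockopt() call is treated as a failure both when it returns an error itself and when it stores a non-zero SO_ERROR value, since platforms differ in how they report a failed connect.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: 0 if the pending connect on fd succeeded,
 * otherwise an errno-style value describing why it failed. */
int pending_connect_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    /* Some systems fail the call itself, others fill in err. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return errno;
    return err;
}

/* Hypothetical helper: non-blocking connect to 127.0.0.1:port.
 * Returns 0 when connected, otherwise an errno-style value. */
int connect_checked(uint16_t port, int timeout_ms)
{
    struct sockaddr_in addr;
    struct pollfd pfd;
    int fd, flags, err;

    if ((fd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return errno;

    flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
        err = 0;                       /* connected immediately */
    } else if (errno != EINPROGRESS) {
        err = errno;                   /* failed immediately */
    } else {
        int rc;

        pfd.fd = fd;
        pfd.events = POLLOUT;
        pfd.revents = 0;
        rc = poll(&pfd, 1, timeout_ms);

        if (rc < 0)
            err = errno;
        else if (rc == 0)
            err = ETIMEDOUT;
        else
            err = pending_connect_error(fd); /* poll() > 0 is not proof */
    }

    close(fd);
    return err;
}

/* Hypothetical demo helper: bind, but do not listen on, a loopback port so
 * that connecting to it is refused. Returns the holding fd, or -1. */
int reserve_refusing_port(uint16_t *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}
```

With a port obtained from `reserve_refusing_port()`, `connect_checked()` is expected to return a non-zero error while nothing listens, and 0 once `listen()` has been called on the holding socket.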
@l1gi But the pollfd events we are looking for are POLLIN and POLLOUT, not POLLERR. poll() should only return >0 when the socket is connected (OK to write or OK to read) and not on an error. So that is a Solaris bug, and given that Solaris isn't exactly a current, well-supported OS anymore I am going to opt for a fix that has the fewest possible side-effects. I am not interested in adding a bunch of code that should never be executed.
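For reference, POSIX describes POLLERR, POLLHUP, and POLLNVAL as flags that poll() may set in revents whether or not they were requested in events, so a positive return value can be told apart only by looking at revents. A minimal sketch of that classification follows; `enum poll_outcome` and `classify_revents` are hypothetical names, not CUPS identifiers. Note that a later comment in this thread reports that poll() on Solaris did not flag any error for the refused connection, so a check like this may not be sufficient there on its own.

```c
#include <poll.h>

/* Hypothetical names for illustration only; not part of CUPS. */
enum poll_outcome { OUTCOME_NONE, OUTCOME_READY, OUTCOME_FAILED };

/* Classify the revents field of one pollfd entry after poll() returns. */
enum poll_outcome classify_revents(short revents)
{
    /* Error conditions win, even if POLLIN/POLLOUT are also set. */
    if (revents & (POLLERR | POLLHUP | POLLNVAL))
        return OUTCOME_FAILED;

    /* Readable or writable; the connect result may still need checking. */
    if (revents & (POLLIN | POLLOUT))
        return OUTCOME_READY;

    return OUTCOME_NONE;
}
```

Error flags are tested first because a failed connect is commonly reported as writable and in error at the same time.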
@l1gi Try the following patch (or grab the latest Github master code...):
[master e50230a] Add a workaround for Solaris in httpAddrConnect2 (Issue #156)
Hi Michael, the patch works fine, just
should be
Thank you!
Oh sorry, yes, vice versa.
should be
Thanks again.
Hi, I am sorry, but it seems I did not finish testing your change before it went into the CUPS codebase, as mentioned here: The fix doesn't work. Before this change, poll() didn't signal any error even though the socket's connection was refused, and the socket was later treated as connected. My change (patch) was not nice, but it worked: it performed an additional check of the socket status using getsockopt() before CUPS decided whether the socket is connected or not. Michael offered a better-looking change, but it did not work because it is placed in the 'socket error' code path and is thus never run. I approved Michael's patch by mistake, so blame me for that, please. This new change looks similar, but works this time.
Could I kindly ask you to consider it for integration, please? Thank you, Martin.
Just a comment that on Linux lpstat without cupsd running says:
With my change above I am getting:
on Solaris.
Solaris behaves differently regarding connecting to a socket, so our current code caused hanging. The patch is from @l1gi's comment - OpenPrinting#156 (comment) Fixes OpenPrinting#156
There is an issue with the CUPS client timeout on Solaris. If I stop cupsd or run lpstat with -h localhost:632, it hangs instead of failing, even though no daemon is listening.
$ lpstat -h localhost:632
I have rebuilt CUPS with debugging enabled and tried again to generate more debug info. This is the interesting part of it:
cups.log
Even though nothing is listening on [::1]:632, it thinks it is connected and keeps trying to reconnect.
This is an interesting part of the truss output:
First, I don't understand why httpConnect2 is called with blocking=1 when the argument is not used by the underlying functions, which do an unconditional non-blocking connect.
If this is intentional, then a mechanism is missing to check the state of the socket before it is polled. Maybe getsockopt() would work around the issue, though I doubt that is the right fix here.
I assume it works accidentally on Linux because the non-blocking connect() operation finishes before poll() is called. Still, the check should be there as far as I understand.
Here is a patch which considers a socket on which getsockopt() fails as not connected on Solaris.
04-nonblock_connect.txt
Though I still don't understand some details of httpAddrConnect2() behavior, this change makes Solaris lpstat -t fail in the same manner as on Linux:
Could I ask you to rework the patch and integrate it into current CUPS, please?