multiplexing multiple xpra instances through one port #426

Closed
totaam opened this issue Sep 12, 2013 · 23 comments

Comments

@totaam
Collaborator

totaam commented Sep 12, 2013

The problem is that in some situations, the servers may well be sitting behind firewalls that only allow outgoing connections to selected ports (usually 80 and 443 for web browsing).
One solution to this problem is to use SSH or a VPN server running on one of those ports, but these options have their own problems (key exchange, shell account access, etc)

r4326 implements a proof-of-concept proxy server (accessible via "xpra proxy"): the user connects to this server and, after authentication (if required, which it usually should be), the packets are forwarded to the real server. Encryption and compression are only enabled between client and proxy, not between proxy and server (though we could quite easily add options to change this).
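
To make the forwarding idea concrete, here is a minimal sketch of a bidirectional relay between two already-connected sockets. It is not the code from r4326: it copies raw bytes and ignores authentication, encryption, compression and the xpra packet format entirely. The names `pump` and `forward` are made up for this illustration.

```python
import socket
import threading


def pump(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src reaches end-of-stream."""
    try:
        while True:
            data = src.recv(65536)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass  # either side went away: treat it as a disconnect
    finally:
        # Propagate the disconnect so the reader on the far side terminates too.
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def forward(client: socket.socket, server: socket.socket) -> list:
    """Relay traffic in both directions, one thread per direction."""
    threads = [
        threading.Thread(target=pump, args=(client, server), daemon=True),
        threading.Thread(target=pump, args=(server, client), daemon=True),
    ]
    for thread in threads:
        thread.start()
    return threads
```

Two threads per connection is the simplest arrangement, and it already hints at the scaling question raised in the list below.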


What is still needed to make this functional:

  • threading/processes: how many concurrent connections can we handle before this becomes the bottleneck? (at the moment, the POC server can only handle a single connection)
  • handle disconnection from either end gracefully
  • how do we look up the real server to connect to?
    • a shell script could be a little expensive but is more flexible
    • a lookup file?

Other things this may be useful for:

  • this could be used for load balancing
  • security? (it is easier to isolate the servers from the clients)
  • we could make the proxy server stateful and let it deal with hardware video encoding: in VM hosted environments the guests do not have easy access to hardware, but the host does. The proxy could tell the server to send all frames as plain rgb+lz4, and then it would just replace the frames with video encoded data. (later, this could be taken one step further and the guests could use a shared memory mechanism to avoid using the virtual network for sending frames to the host)
@totaam
Collaborator Author

totaam commented Sep 22, 2013

2013-09-22 15:10:35: antoine commented


We probably want to make this the default proxy for all sessions on a local system (in TCP mode, since SSH already has user auth), and allow system-level authentication (PAM on Linux, the per-platform equivalent elsewhere). It should support "xpra list" too.
So we need a way to specify the username: probably a "--username=NNNN" switch (and/or support for tcp:username@host:port).
Then there are threading issues: at the moment we start many threads per connection (2 for reading and 2 for writing). That is fine when there is typically just one active connection at a time, but it becomes a problem if we want to proxy for dozens of users, even more so if the proxy handles picture encoding (one more thread...).
We also need to deal with session discoverability, which is a good reason for moving the server sockets to /tmp/. Backwards compatibility can be achieved by symlinking to the old location, checking both locations (and maybe even adding some code to the run-xpra script?).
Ideally, we would also want some sort of privilege separation between the code that needs root (socket binding, connecting to xpra sockets in /tmp/) and the code that runs once authenticated (IO and encoding).
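
As an illustration of the tcp:username@host:port form mentioned above, here is a minimal parsing sketch. The `Target` type and `parse_connection_string` function are hypothetical names, not xpra's parser, and only the "tcp" mode is handled.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    mode: str
    username: Optional[str]
    host: str
    port: int


def parse_connection_string(value: str) -> Target:
    """Parse 'tcp:[username@]host:port' into its parts."""
    mode, sep, rest = value.partition(":")
    if not sep or mode != "tcp":
        raise ValueError(f"unsupported connection string: {value!r}")
    username = None
    if "@" in rest:
        username, rest = rest.split("@", 1)
    # Split on the last colon so the port is always the final component.
    host, sep, port = rest.rpartition(":")
    if not sep or not host or not port.isdecimal():
        raise ValueError(f"expected host:port in {value!r}")
    return Target(mode, username or None, host, int(port))
```

A string without the username part simply yields `username=None`, which leaves room for a separate "--username" switch to fill it in.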

@totaam
Collaborator Author

totaam commented Sep 29, 2013

2013-09-29 16:13:11: antoine commented


Here's how I think this is going to work using the python multiprocessing module:

  • proxy runs as root (generally - not strictly required)
  • add an authentication option: --auth=pam|win32security|ldap|file|script...
    (and maybe this can be used for regular servers too?)
  • when we receive a new connection, we process authentication via one of the auth modules
  • this auth module checks username/password/[display]?
    (maybe think about modules that can use a challenge rather than using a plain password)
    and returns: real uid, real server URI(s), [xpra env options], [session options]
  • the server can then launch a sub-process, passing it the socket connection (as per this example) and let it deal with changing uid, etc

Notes:

  • maybe disallow system auth if we're connecting from a non-encrypted TCP socket?
  • support "xpra list"
  • don't want to use sendmsg and completely separate processes
  • connecting to the real server: can't think of any disadvantages of doing it in the subprocess
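
The handoff described above could look roughly like the sketch below. It assumes a POSIX system, and the function names are invented for the illustration. The child only echoes one message where the real code would connect to the server and relay packets.

```python
import multiprocessing
import os
import socket


def drop_privileges(uid: int, gid: int) -> None:
    """Switch to the authenticated user; only root is able to do this."""
    if os.getuid() != 0:
        return  # not root: keep running as the current user
    os.setgroups([])  # drop supplementary groups first
    os.setgid(gid)    # group before user: setgid fails once the uid is non-root
    os.setuid(uid)


def proxy_instance(client: socket.socket, uid: int, gid: int) -> None:
    """Runs in the child: drop privileges, then serve the connection."""
    drop_privileges(uid, gid)
    with client:
        # Placeholder for "connect to the real server and relay packets":
        # echo a single message back to show that the handoff worked.
        data = client.recv(4096)
        client.sendall(data)


def handoff(client: socket.socket, uid: int, gid: int) -> multiprocessing.Process:
    """Called by the parent once authentication has succeeded."""
    proc = multiprocessing.Process(target=proxy_instance, args=(client, uid, gid))
    proc.start()
    client.close()  # the child holds its own copy of the socket now
    return proc
```

Connecting to the real server from inside `proxy_instance`, after the privilege drop, means that connection is made with the user's own permissions rather than root's.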

@totaam
Collaborator Author

totaam commented Sep 30, 2013

2013-09-30 12:38:58: antoine uploaded file auth-v3.patch (45.5 KiB)

splits authentication from server core, adds auth modules and keyfile so password file and encryption keyfile can be different

@totaam
Collaborator Author

totaam commented Oct 2, 2013

2013-10-02 11:29:40: antoine uploaded file auth-v5.patch (61.5 KiB)

updated patch (broken multiprocessing support..)

@totaam
Collaborator Author

totaam commented Oct 2, 2013

Sigh. As explained here: Caution: python-multiprocessing, threads and glib don't mix

So the v5 patch does not run... as the idle_add calls never fire.

@totaam
Collaborator Author

totaam commented Oct 3, 2013

2013-10-03 03:24:31: antoine uploaded file auth-v6.patch (83.0 KiB)

updated patch using timers and custom code instead of gobject from the subprocesses

@totaam
Collaborator Author

totaam commented Oct 3, 2013

2013-10-03 04:31:13: antoine commented


The auth-v6.patch (/attachment/ticket/426/auth-v6.patch) works around this by using custom code instead of gobject.
Lots of new improvements too:

  • username works
  • pam auth works, and setuid/setgid too
  • both threaded and multiprocessing modes work (controlled via env var)

What does not work yet / not done yet:

  • client connection fails most of the time (race with encryption setup, causes invalid packet)
  • hello filtering needs improvement (rencode / compression should use better values + use file overrides if we have any)
  • signal handling (not done) - see Python: Using KeyboardInterrupt with a Multiprocessing Pool
  • handle connection strings (and URIs) like: tcp:username@host:port
  • force kill subprocesses on exit?
  • restrict the subprocesses more: should not need file access (load passwords and keys beforehand?) or any new sockets, limit resource usage, prevent forking, prevent new imports, etc
  • invalid usernames should still trigger challenge (and avoid user enumeration)
    etc..
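
On the user enumeration point, one generic way to make unknown usernames indistinguishable from wrong passwords is sketched below. This ticket does not describe the challenge scheme xpra uses, so the HMAC-SHA256 construction, the `PASSWORDS` table and the function names here are all assumptions for illustration.

```python
import hashlib
import hmac
import os

# Hypothetical in-memory password table; a real module would load it elsewhere.
PASSWORDS = {"antoine": b"thepassword"}


def new_challenge() -> bytes:
    """Every connection gets a fresh random challenge, known user or not."""
    return os.urandom(32)


def response_for(password: bytes, challenge: bytes) -> bytes:
    """What the client sends back: HMAC-SHA256 over the challenge, keyed with the password."""
    return hmac.new(password, challenge, hashlib.sha256).digest()


def verify(username: str, challenge: bytes, response: bytes) -> bool:
    # Unknown users are checked against a random throwaway secret, so they
    # follow the same code path as known users and simply never match.
    password = PASSWORDS.get(username) or os.urandom(32)
    expected = response_for(password, challenge)
    return hmac.compare_digest(expected, response)
```

Because the challenge is issued before the username is looked up, the client sees the same exchange whether or not the account exists.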

@totaam
Collaborator Author

totaam commented Oct 3, 2013

2013-10-03 04:33:43: antoine uploaded file auth-v8.patch (85.1 KiB)

adds attempts at signal handling and process cleanup + 1 important server fix

@totaam
Collaborator Author

totaam commented Oct 3, 2013

2013-10-03 12:05:23: antoine uploaded file auth-v10.patch (92.3 KiB)

many fixes (except encryption drop outs)

@totaam
Collaborator Author

totaam commented Oct 4, 2013

2013-10-04 15:05:57: totaam commented


Works ok as of r4399. We have a number of auth modules we can choose from:

  • allow (/browser/xpra/trunk/src/xpra/server/auth/allow_auth.py): always allows the user to log in - dangerous / only for testing
  • fail (/browser/xpra/trunk/src/xpra/server/auth/fail_auth.py): always fails authentication - useful for testing
  • file (/browser/xpra/trunk/src/xpra/server/auth/file_auth.py): looks up usernames and passwords in the password file (format changed)
  • pam (/browser/xpra/trunk/src/xpra/server/auth/pam.py): Linux PAM authentication
  • win32 (/browser/xpra/trunk/src/xpra/server/auth/win32_auth.py): win32security authentication
  • sys: a virtual module which will choose win32 or pam

Once authenticated, the proxy server starts a new process as the user that successfully authenticated (with the uid and gid taken from the password database) and connects to the real server.
We choose the real display to connect to using the "display" capability (TODO: let client specify it) or choose the only session we find (if only one exists), or we fail.
The special case is with the file auth module, which allows us to specify authentication values which may not be valid system users (though a valid uid/gid pair is still required in that case) and a target display which may be a remote one (ie: "tcp:host:port")
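
The display selection rule described above can be sketched as follows. `choose_display` is a hypothetical helper, and rejecting a requested display that is not among the sessions found is an assumption: the text only says that we fail when no choice can be made.

```python
from typing import Optional, Sequence


def choose_display(requested: Optional[str], sessions: Sequence[str]) -> str:
    """Pick the session to proxy to, or raise if no choice can be made."""
    if requested:
        # The client (or the auth module) named a display explicitly.
        if requested not in sessions:
            raise LookupError(f"display {requested} not found")
        return requested
    if len(sessions) == 1:
        return sessions[0]  # unambiguous: use the only session
    if not sessions:
        raise LookupError("no sessions found for this user")
    raise LookupError("more than one session found, a display must be specified")
```

With the file auth module the target comes from the auth file instead, so a rule like this would only apply to the modules that rely on the user's local sessions.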

@totaam
Collaborator Author

totaam commented Oct 4, 2013

Here's how you can use it with the file auth module (sys auth needs encryption to work as we refuse to send unencrypted system passwords over the sockets):

  • start the server
xpra proxy :100 --bind-tcp=0.0.0.0:20000 --auth=file --password-file=./xpra-auth
  • add your user entries in the auth file, ie:
echo "antoine|thepassword|1000|1000|tcp:testhost:10|ENV=VALUE|compression=0" >> ./xpra-auth
  • connect from the client:
echo "thepassword" > password.txt
xpra attach --username=antoine --password-file=./password.txt $PROXYHOST:20000

This should cause the proxy to forward the connection to the display specified in the auth file (in the example above: tcp:testhost:10)
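
For reference, an entry in the format shown above splits into seven '|'-separated fields. The sketch below is not the file auth module's parser: the field names are only a reading of the example line, and the last two fields are kept as raw strings.

```python
from typing import NamedTuple


class AuthEntry(NamedTuple):
    username: str
    password: str
    uid: int
    gid: int
    display: str
    env: str              # e.g. "ENV=VALUE", left unparsed here
    session_options: str  # e.g. "compression=0", left unparsed here


def parse_auth_line(line: str) -> AuthEntry:
    """Split one auth file line into typed fields."""
    fields = line.rstrip("\n").split("|")
    if len(fields) != 7:
        raise ValueError(f"expected 7 '|'-separated fields, got {len(fields)}")
    username, password, uid, gid, display, env, options = fields
    # uid and gid must be numeric: int() raises ValueError otherwise.
    return AuthEntry(username, password, int(uid), int(gid), display, env, options)
```

Note that the display field contains colons of its own (tcp:testhost:10), which is presumably why '|' rather than ':' is the separator.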

@totaam
Collaborator Author

totaam commented Oct 17, 2013

Many important fixes in r4541 and r4537 should make this a lot more usable now.

If things don't work as expected, check that you haven't got an old daemon/zombie running.
Note: as of r4557, one can add session options to the auth file (only two are supported so far, as a proof of concept: compression_level and lz4), ie:

username|password|1000|1000|tcp:localhost:10000|ENV=VALUE|compression_level=1;lz4=0
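
The last field of that line holds the session options, separated by ';'. A minimal sketch of splitting it, using the hypothetical helper name `parse_session_options` and leaving the values as strings:

```python
def parse_session_options(value: str) -> dict:
    """Split 'key=value;key=value' into a dict of strings."""
    options = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate an empty field or a trailing ';'
        key, sep, val = item.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in session option {item!r}")
        options[key.strip()] = val.strip()
    return options
```

Converting the values to integers or booleans is left out on purpose, since only compression_level and lz4 are known to be supported at this point.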

Feedback welcome!

@totaam
Collaborator Author

totaam commented Oct 23, 2013

Some important fixes in r4605, r4606, r4608, etc

I have identified the problem with the encryption: it isn't a problem with the encryption per se, the encryption just makes it more obvious.
When using the proxy server, we always end up dropping the first packet that the client sends after the hello. Normally, that's a "set_deflate" or one of two "server-settings" (if applicable) or the first of the three "clipboard-token"s...
So, when not using encryption, it's still wrong but we just don't notice because those packets aren't essential!
The AES decryption relies on the strict presence and order of the data, and the missing packet causes a corrupted stream and disconnection.

That's because when we close the proxy-side connection, we may still have a read blocked in IO wait in socket.recv. When the next packet comes in, that stale read consumes it before closing down, so the subprocess never sees it.
We could force the read loop to exit early (not sure how), or take the data that was read and inject it into the subprocess (intrusive/ugly). Alternatively, we could get the client to send a socket flush() (probably not enough to trigger a proxy read?) or a dummy unencrypted packet so we can close the connection (also ugly, but somewhat cleaner: everything exits via normal codepaths).

@totaam
Collaborator Author

totaam commented Oct 24, 2013

The socket race is fixed in r4614 and encryption now works in proxy mode too (still only between client and proxy - between proxy and proxied server would require more configuration options, and is not a priority at the moment)

Note: we use a socket timeout (defaults to 0.1s) to guarantee that the sockets are always in a consistent state when handing them over to the new subprocess.
This does slow down the initial connection (on average by half that delay, so about 50ms). The current value seems like a good compromise between polling too frequently (wasting CPU) and waiting too long.

r4615 allows this timeout to be configured via the XPRA_PROXY_SOCKET_TIMEOUT env var. (setting this value too high makes it much more noticeable and one can even set it so high that the connection will often timeout)
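
The socket timeout approach can be sketched as a read loop that wakes up periodically to check whether it should stop. This is not the code from r4614: `read_until_stopped` and its arguments are made up for the illustration, and only the XPRA_PROXY_SOCKET_TIMEOUT variable name and the 0.1s default come from the description above.

```python
import os
import socket
import threading

# Same default as described above; the env var name is the one from r4615.
SOCKET_TIMEOUT = float(os.environ.get("XPRA_PROXY_SOCKET_TIMEOUT", "0.1"))


def read_until_stopped(sock: socket.socket, stop: threading.Event, on_data) -> None:
    """Read from sock until stop is set, never blocking longer than the timeout."""
    sock.settimeout(SOCKET_TIMEOUT)
    while not stop.is_set():
        try:
            data = sock.recv(65536)
        except socket.timeout:
            continue  # nothing arrived: go round again and re-check the stop flag
        if not data:
            break  # the peer closed the connection
        on_data(data)
```

Once the stop flag is set and the reader thread has been joined, no recv is left pending on the socket, so the next packet stays in the socket buffer for whoever takes the socket over. Data that arrives before the loop notices the flag still goes to `on_data`.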


What is left for this release (the rest can go in an enhancement ticket for another release):

  • signal handling and subprocess exit
  • performance/testing

@totaam
Collaborator Author

totaam commented Nov 7, 2013

Most of the documentation found in this ticket has been added to the proxy server documentation page

@totaam
Collaborator Author

totaam commented Nov 11, 2013

As of r4735, the proxy server should be able to exit cleanly.
"xpra stop" now works against the main proxy process (one must be authenticated as the same user that runs that process)

I think that's enough for this ticket, please test and close if it all works as expected.
Please verify that the connection from the proxy to the real xpra server uses rencode and not bencode.


What we may want to add (in a new ticket):

  • proxy video encoding (delegated encoding mode #504), for taking advantage of nvenc hardware accelerated encoding #370 on the host since a VM will not have direct access to the hardware
  • optimize packet handling (avoid decoding then re-encoding things)
  • password authentication and encryption between the proxy server and the real servers
  • better support for "xpra detach" so we can force kill connections (since the proxy will be long lived)

@totaam
Collaborator Author

totaam commented Jan 16, 2014

2014-01-16 18:57:34: smo commented


This has been tested but not extensively. We are going to be testing this with 10+ clients and making sure there is nothing broken.

@totaam
Collaborator Author

totaam commented Feb 6, 2014

With r5375 one can see a new socket for each proxy instance (this broke older versions which will need r5373 backported):

$ xpra list
Found the following xpra sessions:
	LIVE session at :proxy-28752
	LIVE session at :10
	LIVE session at :20

This gives us an easier way of interacting with, and collecting information from, proxy instances. It supports: "info", "version" and "stop".
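
For scripting against this, the proxy instance sockets can be told apart from regular sessions by their ":proxy-" prefix. The sketch below relies only on the output format shown above; `split_sessions` is a hypothetical helper, and lines that do not start with "LIVE session at" are ignored.

```python
def split_sessions(listing: str):
    """Return (proxy_instances, sessions) from the text printed by 'xpra list'."""
    prefix = "LIVE session at "
    proxies, sessions = [], []
    for line in listing.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue  # header line, blank line, or a session in another state
        display = line[len(prefix):]
        if display.startswith(":proxy-"):
            proxies.append(display)
        else:
            sessions.append(display)
    return proxies, sessions
```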

@totaam
Collaborator Author

totaam commented Mar 21, 2014

2014-03-21 00:19:48: smo commented


Is this normal when using the proxy?

$ xpra list
Found the following xpra sessions:
        LIVE session at :100
        LIVE session at :17
        LIVE session at :proxy-20954
$ xpra --username=username --password-file=./password.txt info :proxy-20954
server requested disconnect: this socket only handles 'hello', 'version' and 'stop' requests

@totaam
Collaborator Author

totaam commented Mar 21, 2014

Hmmm, the warning message was wrong (fixed in r5878): "info" is handled, and that's one of the main purposes of the proxy socket.

It works fine here... (as usual)

Is there anything in the proxy log? All I see (since it works):

New proxy instance control connection received: SocketConnection(/home/antoine/.xpra/desktop-proxy-25522)
Connection lost

@totaam
Collaborator Author

totaam commented Mar 21, 2014

Got it: don't use --username or --password-file. The proxy instance does not support any authentication at present (and I hope it never needs it). It is on a unix domain socket only, so regular unix permissions should be sufficient, unless someone uses the proxy server and shared group sockets with --socket-dir...

  • r5880 gives a more helpful error message if you try to use authentication
  • r5881 adds information to the man page

@totaam
Collaborator Author

totaam commented Mar 25, 2014

2014-03-25 18:02:46: smo commented


Tested this with 8 connections through the proxy with no issues.

@totaam totaam closed this as completed Mar 25, 2014
@totaam
Collaborator Author

totaam commented Apr 27, 2015

Some improvements worth mentioning here:

  • r9164: using blocking sockets after the connection is established (fewer timer wakeups)
  • r9163: re-compress window icon (avoids warning)

Both could be backported, but no rush. See also: #838#comment:12
