Add socket sharing #94
Conversation
The socket is not being pickled; we need to find some other way to share it across processes. (See https://gist.github.com/josiahcarlson/3723597 for background.)
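For context, the underlying failure is easy to reproduce in plain Python, independent of Robyn: OS-level sockets refuse to be pickled, and pickling is how `multiprocessing` moves objects between processes. A minimal sketch:

```python
# Minimal demonstration of the core problem: socket objects cannot
# be pickled, and pickle is the transport multiprocessing uses to
# hand objects to other processes.
import pickle
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    pickle.dumps(s)
except TypeError as exc:
    print(exc)  # -> cannot pickle 'socket' object
finally:
    s.close()
```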
@messense @JackThomson2 @awestlake87, I know it has been a long time since you folks worked on this, but I am facing some major issues passing the socket around between the processes. Do you know of a solution? https://stackoverflow.com/questions/69788270/a-copied-socket-is-not-being-pickled
Hey, sorry, I've been very busy starting a new job. Have you had a look at my fork? I had a working example of this under one of the branches, though I can't remember which.
@JackThomson2, first of all, congratulations on your new job! 🥳 I did take a look here: https://github.com/JackThomson2/robyn/tree/multiprocess. I couldn't find the socket sharing there.
Thank you! If you look at line 85 of the `__init__.py` file inside the robyn folder, the cloning of the socket takes place there, and the clone is sent to each new process.
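Based on that description (and the `try_clone()` call visible in the traceback below), the approach presumably looks something like this sketch; `serve` and `start_workers` are hypothetical names, and the cloned handle still has to survive pickling to reach the child, which is exactly where the next comment's traceback fails:

```python
# Sketch of the clone-per-process approach described above.
# `socket.try_clone()` matches the call in the traceback below;
# `serve(sock)` is a hypothetical per-process entry point.
from multiprocessing import Process

def start_workers(socket, serve, num_processes):
    workers = []
    for _ in range(num_processes):
        cloned = socket.try_clone()  # duplicate the listener handle
        p = Process(target=serve, args=(cloned,))
        p.start()
        workers.append(p)
    return workers
```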
@JackThomson2, I tried that. I am getting the following error:

```
  File "integration_tests/base_routes.py", line 75, in <module>
    app.start(port=5000, url='0.0.0.0')
  File "/Users/bruhh/.pyenv/versions/maturin/lib/python3.8/site-packages/robyn/__init__.py", line 100, in start
    ns.x = socket.try_clone()
  File "/Users/bruhh/.pyenv/versions/maturin/lib/python3.8/site-packages/multiprocess/managers.py", line 1143, in __setattr__
    return callmethod('__setattr__', (key, value))
  File "/Users/bruhh/.pyenv/versions/maturin/lib/python3.8/site-packages/multiprocess/managers.py", line 834, in _callmethod
    conn.send((self._id, methodname, args, kwds))
  File "/Users/bruhh/.pyenv/versions/maturin/lib/python3.8/site-packages/multiprocess/connection.py", line 209, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/Users/bruhh/.pyenv/versions/maturin/lib/python3.8/site-packages/multiprocess/reduction.py", line 54, in dumps
    cls(buf, protocol, *args, **kwds).dump(obj)
  File "/Users/bruhh/.pyenv/versions/maturin/lib/python3.8/site-packages/dill/_dill.py", line 498, in dump
    StockPickler.dump(self, obj)
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 485, in dump
    self.save(obj)
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 558, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 899, in save_tuple
    save(element)
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 558, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 884, in save_tuple
    save(element)
  File "/Users/bruhh/.pyenv/versions/3.8.5/lib/python3.8/pickle.py", line 576, in save
    rv = reduce(self.proto)
TypeError: cannot pickle 'builtins.SocketHeld' object
```
See PyO3/pyo3#100
I'm definitely not an expert on this, but I think it's impossible to pass sockets to another process via pickling. Sockets are process-local, so you can't just pass the file descriptor of an open socket to another process with normal methods. You can inherit open file descriptors from the parent process when the process is forked and then do the processing for a connection in the child process. But a forked process will only inherit the file descriptors the parent had open at the time of the fork; the child won't have access to any file descriptors the parent opens after the fork. In order to do what you want, which I assume is to delegate socket connections to a pool of worker processes that were forked before the connection was made, you'll likely need some sort of connection that can transfer file descriptors between processes (on Unix, a domain socket using SCM_RIGHTS).
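To make the fork-inheritance idea concrete, here is a minimal, Unix-only sketch using only the standard library (not Robyn's code): the parent binds and listens first, then forks workers that each inherit the listening descriptor and accept connections on it.

```python
# Minimal "fork after listen" sketch (Unix only): children forked
# after listen() inherit the listening socket's file descriptor,
# so every worker can call accept() on the same port.
import os
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("0.0.0.0", 5000))
listener.listen(128)

NUM_WORKERS = 4
for _ in range(NUM_WORKERS):
    if os.fork() == 0:
        # Child process: the inherited fd is valid here because the
        # fork happened after the socket was opened.
        while True:
            conn, _addr = listener.accept()
            conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
            conn.close()

# Parent: wait on the children (a real server would supervise them).
for _ in range(NUM_WORKERS):
    os.wait()
```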
Just a heads up though: maybe your best option is some sort of reverse proxy in front of several worker processes.
@JackThomson2 @messense @awestlake87, thank you for your help. 😄 I was able to fix this. All that effort spent learning low-level code, and just one line (https://github.com/sansyrox/robyn/pull/94/files#diff-8bbb55cf9793eb46fce842bf2abe71425ddb24af349e3114de42ef25f07b02fcR15) fixed it. Thanks again for the help! 🥳
@awestlake87, can't tokio create new threads after the old ones die?
It could create new worker threads, but the issue is that when the original worker threads die, they most likely left the tasks that were running prior to the fork in a bad state in the child process. Restarting the worker threads would allow new tasks to run, but the old tasks would never complete, and the child process would most likely be left in a problematic state. Tokio chooses to do nothing in this case, since there's no real way it can guarantee that the program will function properly after the fork.

This is true of a lot of multi-threaded libraries, not just tokio, so in general the advice for forking your application is to fork it before any threads are created. This is possible in your case as long as pyo3-asyncio is not involved in the parent process at all (pyo3-asyncio's tokio initialization is lazy, so it won't start until you try to use it).

So basically: do not use pyo3-asyncio until the process is forked; then tokio will be started in each child process as it handles connections, whenever pyo3-asyncio is used for the first time.
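A pure-Python sketch of that ordering constraint (standard library only, not Robyn's actual code): the fork happens first, and each child creates its own event loop afterwards, just as pyo3-asyncio's lazy tokio initialization would start a fresh runtime per child.

```python
# Sketch of "fork before any runtime threads exist": each worker
# process builds its own event loop after the fork, mirroring the
# lazy tokio startup described above.
import asyncio
import multiprocessing

async def handle(reader, writer):
    writer.write(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    await writer.drain()
    writer.close()

def worker():
    async def main():
        # reuse_port lets every worker bind the same address (Linux).
        server = await asyncio.start_server(
            handle, "0.0.0.0", 5000, reuse_port=True
        )
        async with server:
            await server.serve_forever()

    # The event loop (and any runtime threads) is created only here,
    # strictly after the fork.
    asyncio.run(main())

if __name__ == "__main__":
    # Fork first; no event loop or thread pool exists yet.
    for _ in range(2):
        multiprocessing.Process(target=worker).start()
```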
Thank you for this information! 😄 I think pyo3-asyncio is not being used much in the parent process; it is just vanilla PyO3. And if it is getting started there, I can make it start a little later.
@awestlake87, would it make sense if I started a process for pyo3-asyncio, killed it, and then spawned the two separate processes for the runtime?
I'm not sure I follow on this. I was thinking that the parent process would handle server setup, then listening and accepting connections. Each connection forks the process and the child process runs the pyo3-asyncio handler. As long as pyo3-asyncio and tokio are not involved in the server setup / listening, I think you should be ok. |
@awestlake87, ah, got it. I believe the same is happening right now. The setup is handled before the processes are spawned, and the benchmarks are showing improved throughput and reduced time to complete 50k requests.
For some reason, headers are not being shown at the moment. But I guess that is not a tokio problem.
Latest update: directory serving is not working; everything else works. Need to fix it and refactor the code.
Hey Sans, one thing worth checking is that all your processes are handling their share of requests. The perf improvements you're seeing may just be from setting workers to 1, which I found to be more efficient.
Hey @JackThomson2, I did the test just now. The maximum performance for me was at 5 processes and 5 workers. I feel it is somehow dependent on the number of CPU cores (virtual or real). I was thinking of shipping with defaults of 1 and adding config flags for now. If we figure out an optimized algorithm for this, we can make that the default, while still keeping the configurability in the future.
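For illustration only, a default-selection heuristic along those lines might look like the sketch below; the names are hypothetical and not Robyn's actual API:

```python
# Hypothetical config sketch: ship conservative 1/1 defaults, let
# users override them, and derive a suggested process count from
# the machine's CPU count (5/5 was the peak in the test above).
import multiprocessing

DEFAULT_PROCESSES = 1
DEFAULT_WORKERS = 1

def suggested_process_count():
    # One process per (virtual) core as a starting point for tuning.
    return max(DEFAULT_PROCESSES, multiprocessing.cpu_count())
```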
@JackThomson2 @messense @awestlake87, this feature has now landed. 🥳 Thank you for the help! ✨