Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix: infinite loop when entity.channel is replaced by revive() o… #779

Merged
merged 1 commit into from
Jan 8, 2019
Merged

Bugfix: infinite loop when entity.channel is replaced by revive() o… #779

merged 1 commit into from
Jan 8, 2019

Conversation

tyarimi
Copy link
Contributor

@tyarimi tyarimi commented Aug 4, 2017

…n an connection drop

@tyarimi tyarimi changed the title fixing an infinite loop when entity.channel is replaced by revive() o… Bugfix: infinite loop when entity.channel is replaced by revive() o… Aug 7, 2017
@tyarimi
Copy link
Contributor Author

tyarimi commented Aug 8, 2017

@thedrow what is needed for this to be merged?

@tyarimi
Copy link
Contributor Author

tyarimi commented Aug 13, 2017

@thedrow I'm not really sure how to test it, as this bug is a combination of code in common.py (maybe_declare) and connection.py (ensure_connection). Can you please help?

@georgepsarakis
Copy link
Contributor

@tyarimi let me know if you need more help with the test cases.

@tyarimi
Copy link
Contributor Author

tyarimi commented Aug 28, 2017

@georgepsarakis thanks for verifying this. Here are some stacktraces we got using faulthandler after killing the stuck processes (last call first):

File "/usr/local/lib/python2.7/dist-packages/amqp/transport.py", line 415 in _read
  File "/usr/local/lib/python2.7/dist-packages/amqp/transport.py", line 240 in read_frame
  File "/usr/local/lib/python2.7/dist-packages/amqp/connection.py", line 490 in blocking_read
  File "/usr/local/lib/python2.7/dist-packages/amqp/connection.py", line 485 in drain_events
  File "/usr/local/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 93 in wait
  File "/usr/local/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 73 in send_method
  File "/usr/local/lib/python2.7/dist-packages/amqp/channel.py", line 448 in open
  File "/usr/local/lib/python2.7/dist-packages/amqp/connection.py", line 476 in channel
  File "/usr/local/lib/python2.7/dist-packages/kombu/transport/pyamqp.py", line 100 in create_channel
  File "/usr/local/lib/python2.7/dist-packages/kombu/connection.py", line 266 in channel
  File "/usr/local/lib/python2.7/dist-packages/kombu/connection.py", line 821 in default_channel
  File "/usr/local/lib/python2.7/dist-packages/kombu/connection.py", line 517 in _ensured
  File "/usr/local/lib/python2.7/dist-packages/kombu/common.py", line 147 in _imaybe_declare
  File "/usr/local/lib/python2.7/dist-packages/kombu/common.py", line 128 in maybe_declare
  File "/usr/local/lib/python2.7/dist-packages/celery/backends/rpc.py", line 168 in on_task_call
  File "/usr/local/lib/python2.7/dist-packages/celery/app/base.py", line 736 in send_task
  File "/usr/local/lib/python2.7/dist-packages/celery/app/task.py", line 536 in apply_async
 File "/usr/local/lib/python2.7/dist-packages/amqp/transport.py", line 377 in _read
  File "/usr/local/lib/python2.7/dist-packages/amqp/transport.py", line 237 in read_frame
  File "/usr/local/lib/python2.7/dist-packages/amqp/connection.py", line 468 in blocking_read
  File "/usr/local/lib/python2.7/dist-packages/amqp/connection.py", line 464 in drain_events
  File "/usr/local/lib/python2.7/dist-packages/amqp/connection.py", line 300 in connect
  File "/usr/local/lib/python2.7/dist-packages/kombu/transport/pyamqp.py", line 130 in establish_connection
  File "/usr/local/lib/python2.7/dist-packages/kombu/connection.py", line 757 in _establish_connection
  File "/usr/local/lib/python2.7/dist-packages/kombu/connection.py", line 802 in connection
  File "/usr/local/lib/python2.7/dist-packages/kombu/connection.py", line 261 in connect
  File "/usr/local/lib/python2.7/dist-packages/kombu/utils/functional.py", line 333 in retry_over_time
  File "/usr/local/lib/python2.7/dist-packages/kombu/connection.py", line 405 in ensure_connection
  File "/usr/local/lib/python2.7/dist-packages/kombu/connection.py", line 515 in _ensured
  File "/usr/local/lib/python2.7/dist-packages/kombu/common.py", line 143 in _imaybe_declare
  File "/usr/local/lib/python2.7/dist-packages/kombu/common.py", line 124 in maybe_declare
  File "/usr/local/lib/python2.7/dist-packages/celery/backends/rpc.py", line 168 in on_task_call
  File "/usr/local/lib/python2.7/dist-packages/celery/app/base.py", line 736 in send_task
  File "/usr/local/lib/python2.7/dist-packages/celery/app/task.py", line 535 in apply_async

@dvaldivia
Copy link

+1

@tyarimi
Copy link
Contributor Author

tyarimi commented Feb 11, 2018

@thedrow it's been a long time since I opened this PR, and unfortunately I can't find the time to write tests, as I'm not familiar with the project's testing framework.
We have been using this in production for 6 months, and can confirm the issue was completely gone, with no visible side effects.
Can someone help with writing the tests, or can we just merge this as-is?
Thanks!

@dralley
Copy link

dralley commented Mar 1, 2018

We've experienced issues like this sporadically in the past (I can't guarantee it was this one, precisely).

I would really love to see this in the 4.2 release.

@auvipy
Copy link
Member

auvipy commented Dec 12, 2018

@tyarimi any chance you could make this pr work on latest master?

@tyarimi
Copy link
Contributor Author

tyarimi commented Dec 12, 2018

@auvipy eventually I will, just don't have the time right now.

@auvipy
Copy link
Member

auvipy commented Dec 13, 2018

@tyarimi did you face any regression in production with this patch?

@tyarimi
Copy link
Contributor Author

tyarimi commented Dec 16, 2018

@auvipy not at all. It just solved the issue completely.

@auvipy
Copy link
Member

auvipy commented Dec 16, 2018

could u rebase this?

@auvipy auvipy merged commit d78a8fc into celery:master Jan 8, 2019
@auvipy
Copy link
Member

auvipy commented Jan 8, 2019

@tyarimi you could open an issue regarding the testcase for this pr

@auvipy auvipy added this to the 4.3 milestone Jan 8, 2019
danikmil pushed a commit to Scalr/kombu that referenced this pull request Jan 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants