Redis transport - Redelivered messages should respect the original priority #2026
Conversation
We will need unit tests to verify this change.
8475bdf to 7f95ead
@auvipy I've updated the existing unit tests.
Can we do something more than the unit tests to verify this? An integration test or manual testing proof? Both would be great, but if an integration test is not possible, manual verification is also fine.
@auvipy I've added integration tests, based on the existing ones. I've also added the integration test for the other transports to show that they already have the correct behavior; it's only the Redis transport that needs to be corrected (e.g. pyamqp and mongodb behave correctly). Let me know if this is sufficient, thanks!
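For readers following along, the shape of such an integration check might look roughly like the sketch below. This is a hedged illustration only, not the actual test added in the PR: the queue name, priorities, and message bodies are made up, and it assumes a local Redis at the default port. On the Redis transport, lower priority numbers are delivered first (0 is the highest priority).

```python
# Illustrative sketch only, not the PR's test. Assumes a local Redis instance.
from kombu import Connection, Exchange, Queue

exchange = Exchange('priority_demo', type='direct')
queue = Queue('priority_demo', exchange, routing_key='priority_demo',
              max_priority=10)

with Connection('redis://localhost:6379') as conn:
    producer = conn.Producer()
    for pri, body in ((0, 'high'), (3, 'medium'), (6, 'low')):
        producer.publish({'msg': body}, exchange=exchange,
                         routing_key='priority_demo',
                         declare=[queue], priority=pri)

    buf = conn.SimpleQueue(queue)

    high = buf.get(timeout=3)      # priority 0 is delivered first on Redis
    high.ack()

    medium = buf.get(timeout=3)    # priority 3 comes next
    medium.requeue()               # reject + requeue without acking

    # Publish a fourth, higher-priority message after the requeue.
    producer.publish({'msg': 'urgent'}, exchange=exchange,
                     routing_key='priority_demo',
                     declare=[queue], priority=0)

    # With the fix, the requeued priority-3 message stays at priority 3,
    # so the new priority-0 message is delivered before it. Without the
    # fix, the restored message lands on the highest-priority list.
    assert buf.get(timeout=3).payload == {'msg': 'urgent'}
    buf.close()
```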
Is something wrong with the 3.9 unit test? I noticed it's also failing in another PR https://github.com/celery/kombu/actions/runs/9644718778/job/26613373788?pr=2036, which is just a change to the README. @Nusnus @auvipy
Yeah, it appears so. Indeed, I also saw the README PR failing and was surprised. A few days ago I released v5.4.0rc1 and everything was totally fine, so it's something very recent. I'll give it a look tomorrow 🙏
It seems like nothing is actually failing; it's just running slow for some reason and timing out... 🤔 Oh wait, it's consistently failing in the exact same spot, so probably not a timeout issue.
Locally 3.8, 3.12 and 3.9 are passing.
Results (34.69s):
1177 passed
168 skipped
Restoring 2 unacknowledged message(s)
3.8-unit: OK (66.25=setup[29.75]+cmd[36.50] seconds)
congratulations :) (66.32 seconds)
Results (33.83s):
1177 passed
168 skipped
Restoring 2 unacknowledged message(s)
3.12-unit: OK (58.78=setup[22.93]+cmd[35.85] seconds)
congratulations :) (58.85 seconds)
Results (29.86s):
1177 passed
168 skipped
Restoring 2 unacknowledged message(s)
3.9-unit: OK (52.85=setup[21.15]+cmd[31.70] seconds)
congratulations :) (52.95 seconds)
Also, the last merged PR has 100% passing CI: https://github.com/celery/kombu/actions/runs/9626076136
So it's hard to quickly judge what's wrong.
@zemek just for the sake of unblocking you, you can comment out 3.9 in the CI and commit it to your branch, mark this PR as a draft (for the sake of marking it not-ready-for-merge), and then we'll revert it before merge after the issue is resolved independently.
Sounds good, will do that! FWIW: everything looks pretty much the same between the two runs though... :/
Just to be on the safe side, I ran 3.8-3.12 unit tests on your branch locally right now, and they passed successfully.
Results (31.24s):
1177 passed
168 skipped
Restoring 2 unacknowledged message(s)
3.10-unit: OK ✔ in 34.65 seconds
3.11-unit: OK ✔ in 35.06 seconds
3.9-unit: OK ✔ in 35.71 seconds
3.12-unit: OK ✔ in 35.79 seconds
3.8-unit: OK (35.94=setup[3.06]+cmd[32.88] seconds)
3.9-unit: OK (35.71=setup[2.89]+cmd[32.82] seconds)
3.10-unit: OK (34.65=setup[2.70]+cmd[31.95] seconds)
3.11-unit: OK (35.06=setup[2.86]+cmd[32.20] seconds)
3.12-unit: OK (35.79=setup[4.12]+cmd[31.68] seconds)
congratulations :) (36.05 seconds)
That being said, the first time I tried I got this error - funny, but I think it's due to running it in parallel:
Results (41.23s):
1177 passed
168 skipped
Restoring 2 unacknowledged message(s)
3.12-unit: OK ✔ in 1 minute 6.47 seconds
3.8-unit: OK ✔ in 1 minute 6.64 seconds
3.8-unit: OK (66.64=setup[23.87]+cmd[42.77] seconds)
3.9-unit: FAIL code 3 (65.10=setup[23.68]+cmd[41.42] seconds)
3.10-unit: OK (65.26=setup[23.51]+cmd[41.74] seconds)
3.11-unit: OK (67.00=setup[23.57]+cmd[43.43] seconds)
3.12-unit: OK (66.47=setup[25.13]+cmd[41.34] seconds)
evaluation failed :( (67.42 seconds)
Because the error was with code coverage (I/O, probably due to parallel runs):
t/unit/utils/test_scheduling.py::test_round_robin_cycle ✓ 97% █████████▊
t/unit/utils/test_scheduling.py::test_priority_cycle ✓ 97% █████████▊
t/unit/utils/test_scheduling.py::test_sorted_cycle ✓ 97% █████████▊
/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/_pytest/main.py:339: PluggyTeardownRaisedWarning: A plugin raised an exception during an old-style hookwrapper teardown.
Plugin: _cov, Hook: pytest_runtestloop
DataError: Couldn't end data file '/Users/nusnus/dev/GitHub/kombu/.coverage.Tomers-MacBook-Pro.local.90690.XapqHrBx': disk I/O error
For more information see https://pluggy.readthedocs.io/en/stable/api_reference.html#pluggy.PluggyTeardownRaisedWarning
config.hook.pytest_runtestloop(session=session)
t/unit/utils/test_time.py::test_maybe_s_to_ms[3-3000] ✓ 97% █████████▊
t/unit/utils/test_time.py::test_maybe_s_to_ms[3.0-3000] ✓ 98% █████████▊
t/unit/utils/test_uuid.py::test_UUID.test_uuid ✓ 100% ██████████INTERNALERROR> Traceback (most recent call last):
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/coverage/sqlitedb.py", line 99, in __exit__
INTERNALERROR> self.con.__exit__(exc_type, exc_value, traceback)
INTERNALERROR> sqlite3.OperationalError: disk I/O error
INTERNALERROR>
INTERNALERROR> The above exception was the direct cause of the following exception:
INTERNALERROR>
INTERNALERROR> Traceback (most recent call last):
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/_pytest/main.py", line 285, in wrap_session
INTERNALERROR> session.exitstatus = doit(config, session) or 0
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/_pytest/main.py", line 339, in _main
INTERNALERROR> config.hook.pytest_runtestloop(session=session)
t/unit/utils/test_time.py::test_maybe_s_to_ms[303-303000] ✓ 98% █████████▊
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/pluggy/_hooks.py", line 513, in __call__
INTERNALERROR> return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/pluggy/_manager.py", line 120, in _hookexec
INTERNALERROR> return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/pluggy/_callers.py", line 156, in _multicall
INTERNALERROR> teardown[0].send(outcome)
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/pytest_cov/plugin.py", line 339, in pytest_runtestloop
INTERNALERROR> self.cov_controller.finish()
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/pytest_cov/engine.py", line 46, in ensure_topdir_wrapper
INTERNALERROR> return meth(self, *args, **kwargs)
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/pytest_cov/engine.py", line 255, in finish
INTERNALERROR> self.cov.stop()
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/coverage/control.py", line 785, in save
INTERNALERROR> data = self.get_data()
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/coverage/control.py", line 865, in get_data
INTERNALERROR> if self._collector.flush_data():
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/coverage/collector.py", line 548, in flush_data
INTERNALERROR> self.covdata.add_arcs(self.mapped_file_dict(arc_data))
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/coverage/sqldata.py", line 123, in _wrapped
INTERNALERROR> return method(self, *args, **kwargs)
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/coverage/sqldata.py", line 542, in add_arcs
INTERNALERROR> con.executemany_void(
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/coverage/sqlitedb.py", line 104, in __exit__
INTERNALERROR> raise DataError(f"Couldn't end data file {self.filename!r}: {exc}") from exc
INTERNALERROR> coverage.exceptions.DataError: Couldn't end data file '/Users/nusnus/dev/GitHub/kombu/.coverage.Tomers-MacBook-Pro.local.90690.XapqHrBx': disk I/O error
t/unit/utils/test_time.py::test_maybe_s_to_ms[303.33-303330] ✓ 98% █████████▊
t/unit/utils/test_time.py::test_maybe_s_to_ms[303.333-303333] ✓ 98% █████████▊
Hmm… just for fun I tried running it again.
t/unit/utils/test_uuid.py::test_UUID.test_uuid4 ✓ 99% ██████████
/Users/nusnus/dev/GitHub/kombu/.tox/3.12-unit/lib/python3.12/site-packages/coverage/data.py:180: CoverageWarning: Data file '/Users/nusnus/dev/GitHub/kombu/.coverage.Tomers-MacBook-Pro.local.97662.XwPjIjZx' doesn't seem to be a coverage data file:
data._warn(str(exc))
/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/_pytest/main.py:339: PluggyTeardownRaisedWarning: A plugin raised an exception during an old-style hookwrapper teardown.
Plugin: _cov, Hook: pytest_runtestloop
FileNotFoundError: [Errno 2] No such file or directory: '/Users/nusnus/dev/GitHub/kombu/.coverage.Tomers-MacBook-Pro.local.97684.XSRXGqSx'
For more information see https://pluggy.readthedocs.io/en/stable/api_reference.html#pluggy.PluggyTeardownRaisedWarning
config.hook.pytest_runtestloop(session=session)
t/unit/utils/test_uuid.py::test_UUID.test_uuid ✓ 100% ██████████INTERNALERROR> Traceback (most recent call last):
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/_pytest/main.py", line 285, in wrap_session
INTERNALERROR> session.exitstatus = doit(config, session) or 0
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/_pytest/main.py", line 339, in _main
INTERNALERROR> config.hook.pytest_runtestloop(session=session)
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/pluggy/_hooks.py", line 513, in __call__
INTERNALERROR> return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/pluggy/_manager.py", line 120, in _hookexec
INTERNALERROR> return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/pluggy/_callers.py", line 156, in _multicall
INTERNALERROR> teardown[0].send(outcome)
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/pytest_cov/plugin.py", line 339, in pytest_runtestloop
INTERNALERROR> self.cov_controller.finish()
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/pytest_cov/engine.py", line 46, in ensure_topdir_wrapper
INTERNALERROR> return meth(self, *args, **kwargs)
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/pytest_cov/engine.py", line 255, in finish
INTERNALERROR> self.cov.stop()
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/coverage/control.py", line 837, in combine
INTERNALERROR> combine_parallel_data(
INTERNALERROR> File "/Users/nusnus/dev/GitHub/kombu/.tox/3.9-unit/lib/python3.9/site-packages/coverage/data.py", line 162, in combine_parallel_data
INTERNALERROR> with open(f, "rb") as fobj:
INTERNALERROR> FileNotFoundError: [Errno 2] No such file or directory: '/Users/nusnus/dev/GitHub/kombu/.coverage.Tomers-MacBook-Pro.local.97684.XSRXGqSx'
3.11-unit: OK ✔ in 35.37 seconds
3.12-unit: OK ✔ in 36.16 seconds
3.8-unit: OK (36.26=setup[2.94]+cmd[33.32] seconds)
3.9-unit: FAIL code 3 (34.91=setup[2.76]+cmd[32.15] seconds)
3.10-unit: OK (34.90=setup[2.64]+cmd[32.26] seconds)
3.11-unit: OK (35.37=setup[2.71]+cmd[32.66] seconds)
3.12-unit: OK (36.16=setup[4.08]+cmd[32.08] seconds)
evaluation failed :( (36.37 seconds)
@Nusnus looks like all the integration tests passed 😄 Should I add back py3.9 to CI or leave it out for now?
Haha nice! The merge will probably be done after it is solved, so we'll get 100% passing CI without skipping 3.9, but let's get your side cleared first so we know the work on the PR is at least done & approved.
@auvipy Sorry if I'm not supposed to re-request review; I just noticed that the requested changes were on the ci.yaml, which you already updated.
Yes, I committed them on your behalf.
Got this failure after merge on the 3.10 integration test:
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
I suspected the 3.9 issue is something that would be "self-resolved". These things tend to come and go by themselves sometimes. @auvipy probably noticed it was fine now and undid the CI patch we discussed yesterday. This is normal, don't worry @zemek.
The rest of the integration tests are passing - could it just be a flaky issue?
@Nusnus @auvipy Hmm, I wonder if it's the sleep at lines 331 to 332 in d620132; I didn't include this sleep after requeuing and publishing the fourth message. Could that potentially be making it flaky?
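If the restore is applied slightly after the requeue call returns, a fixed sleep can indeed be a source of flakiness either way. A more robust pattern (a generic sketch with illustrative names, not code from the PR) is to poll for the expected state with a timeout:

```python
# Generic polling helper (a sketch, not code from the PR).
import time

def wait_until(predicate, timeout=5.0, interval=0.1):
    """Poll `predicate` until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return False

# Example (illustrative names): wait until the requeued message is visible
# again before asserting on delivery order.
# assert wait_until(lambda: redis_client.llen('priority_demo') == expected_len)
```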
Hmm… TBH I'm not sure - worth a try. That being said, it looks like 3.9 is back to causing trouble: https://github.com/celery/kombu/pull/2043/checks
It might be a flaky one, but let's try...
Yeah that was the reason :)
When messages are restored on warm shutdown in Celery, they get placed into the highest-priority queue by default. The appropriate behavior is to place the task back on the same priority queue it originated from.
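For context on the fix itself: the Redis transport keeps one list per priority level, so a restored message should be pushed back onto the list that matches the priority recorded in its properties rather than always onto the default (highest-priority) list. Below is a minimal sketch of that idea using redis-py directly; the key-naming scheme and separator are assumptions modeled on kombu's behavior, not the PR's actual diff.

```python
# Illustrative sketch only, not kombu's real implementation or the PR's diff.
import json
import redis

PRIORITY_SEP = '\x06\x16'  # separator assumed for per-priority list names

def key_for_priority(queue: str, pri: int) -> str:
    """Return the Redis list key for a given priority level (0 = default list)."""
    return queue if pri == 0 else f'{queue}{PRIORITY_SEP}{pri}'

def restore_message(client: redis.Redis, queue: str, payload: dict) -> None:
    """Push an unacked message back onto the list matching its original priority."""
    # The reported bug: restored messages were always pushed to the default
    # (highest-priority) list. The fix: honor the priority stored in the
    # message properties when choosing the target list.
    pri = payload.get('properties', {}).get('priority', 0)
    client.rpush(key_for_priority(queue, pri), json.dumps(payload))
```

With that in place, a message originally published at, say, priority 3 is redelivered from the priority-3 list instead of jumping ahead of higher-priority work.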