[M] - Bodhi could use a task working system #2851

Closed
bowlofeggs opened this issue Dec 14, 2018 · 9 comments
Labels
Backwards incompatible The proposed change is backwards incompatible and should wait for the next major release High priority These issues are higher priority than normal reliability Issues pertaining to Bodhi's reliability RFE Requests for Enhancement

Comments

@bowlofeggs
Contributor

Bodhi currently uses some fedmsg consumers to perform background tasks. One problem with this is that fedmsg-hub doesn't have a retry mechanism like Celery does. It also doesn't have a queuing mechanism to allow fanning work out to a pool of workers. I think Celery is likely a decent choice for a system, but it might be worth doing a survey of competing technologies as well.
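
To make the "fanning work out to a pool of workers" idea concrete, here is a minimal standard-library sketch. It is not Celery, RQ, or fedmsg-hub code; `run_pool`, `handler`, and `num_workers` are hypothetical names used only for illustration, and a real task system would put a broker between the producer and the workers.

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker to exit


def run_pool(jobs, handler, num_workers=4):
    """Fan jobs out to a pool of worker threads and collect (job, outcome) pairs."""
    work = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            job = work.get()
            try:
                if job is _STOP:
                    return
                try:
                    outcome = handler(job)
                except Exception as exc:  # keep the worker alive on failure
                    outcome = exc
                with lock:
                    results.append((job, outcome))
            finally:
                work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for job in jobs:
        work.put(job)
    for _ in threads:
        work.put(_STOP)  # one sentinel per worker
    work.join()
    for thread in threads:
        thread.join()
    return results
```

Results arrive in completion order rather than submission order, so callers that care about ordering should sort or key them by job.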

This should help increase Bodhi's reliability in certain areas, such as managing Bugzilla tickets. This week we've had a lot of trouble with Bodhi not altering the states on Bugzilla tickets because Bugzilla's API has been flapping and Bodhi does not have a mechanism to retry later if a BZ update fails.
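
The missing "retry later" behaviour can be sketched as follows, again with the standard library only. `TransientError` and `call_with_retry` are hypothetical names, and the exponential backoff schedule is an assumption rather than anything Bodhi or Bugzilla prescribes; a task queue such as Celery would provide an equivalent mechanism through its own retry support.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying, e.g. a flapping API."""


def call_with_retry(func, *args, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call func(*args), retrying on TransientError with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return func(*args)
        except TransientError:
            if attempt == max_retries:
                raise  # give up after the final attempt
            # Wait base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; in production the default `time.sleep` would apply.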

@bowlofeggs bowlofeggs added RFE Requests for Enhancement High priority These issues are higher priority than normal reliability Issues pertaining to Bodhi's reliability Backwards incompatible The proposed change is backwards incompatible and should wait for the next major release labels Dec 14, 2018
@bowlofeggs
Contributor Author

Because this requires a significant deployment change, I'm marking it as backwards incompatible.

@abompard
Member

Celery is very much field-tested and now there's a RabbitMQ instance in prod to use it with. I don't think there's anything close in terms of adoption, and it's always good for onboarding to use something well known.

@Conan-Kudo
Contributor

@abompard The pulp team has had a number of issues with Celery with RabbitMQ. They've switched to RQ for this reason: https://pulpproject.org/2018/05/08/pulp3-moving-to-rq/

@abompard
Member

Interesting read, thanks. Any opinion on that @bowlofeggs ?

@daviddavis

daviddavis commented Mar 12, 2019

Speaking as a @pulp engineer (hi @bowlofeggs!): we've had a lot of problems with Celery, from deadlocks to excessive memory consumption. I feel like the project has gone downhill since ask left. We've been unable to upgrade due to regressions (e.g. celery/celery#3802). We've tried fixing some of the issues ourselves, but the codebase is very hard to read and understand.

We've started using @rq in Pulp 3 since it seems more lightweight. Granted, we were using Celery in ways that you probably won't (e.g. with qpid), so your experience may differ. Also, we haven't really tested RQ in a production setting yet.

@dralley
Contributor

dralley commented Mar 12, 2019

@daviddavis described it well. Celery is very powerful but also incredibly complex and difficult to debug. It also feels under-maintained for the size, complexity, and scope of the project: issues and pull requests remain open for months, and there are too few bugfix releases even when those issues do get resolved.

RQ is a dramatically simpler library with more straightforward code. That simplicity has some downsides, but ones that are acceptable to us.

Of course, depending on what features of Celery you do or don't use you may have a better experience than we did, and if you've already got a RabbitMQ instance and not a Redis instance perhaps the operational requirements would be lesser with Celery. It's difficult to make broad judgements especially when your use case may differ from ours.

@bowlofeggs
Contributor Author

Yeah I actually used to work on Pulp, and was the lead engineer on the project to switch Pulp to use Celery all those years ago. Here are a few of my thoughts:

  • I agree that Celery is pretty messy. I have been using it for a very long time, but I never really looked at its code until I worked on Pulp and I was surprised at the code. I think it's a lot more complicated than it needs to be for what it does.
  • As noted, Pulp actually uses qpidd, not RabbitMQ, and so they aren't using the same kombu backend that we would be using. When I led the project to switch us to Celery, we actually used RabbitMQ initially and later switched to qpidd (and @bmbouter led that initiative). I am not familiar with the nature of the particular issues that Pulp has been facing, so I can't comment on whether any of them might be due to the use of qpidd vs. RabbitMQ.
  • Pulp doesn't (or at least, didn't when I was there…) use celery in the documented manner. There's a lot of history behind this, but Pulp has/had been using Celery to work around not having a transactional database (they are now working on a release to switch to a transactional database). Almost all the troubles that I experienced when I was on that team were due to us using Celery in these unusual ways. I've not seen Celery cause many problems when used in its documented manner, and Bodhi would not be doing anything unusual with it (we have a transactional database).
  • I am open to options other than Celery, though I would say Celery is my default choice. I started using it about 10 years ago, and it has generally worked well for me. As long as you don't stray from its simple documented use cases, it does the right thing in my experience. We have no need to stray from its documented use cases in Bodhi. However, I fully support exploring alternatives.
  • Obviously, my information about Pulp is out of date. Corrections/omissions welcomed ☺

@jeremycline
Member

For what it's worth, we already use celery on at least one other infra product (fmn, which has tons of problems, but Celery isn't one of them).

Yeah, Celery code isn't what I'd write. Sure, it occasionally has regressions. There aren't any projects under active development that don't. As for issue 3802 specifically, I don't see what the big deal is: tasks should be idempotent anyway, since exactly-once delivery isn't something AMQP (or Redis AFAIK) offers.
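
To illustrate what "idempotent" means here, a minimal sketch of a handler that tolerates the same message being delivered more than once. `BugStateUpdater`, the message fields, and the in-memory `states` dict are hypothetical stand-ins, not Bodhi's or Bugzilla's actual API.

```python
class BugStateUpdater:
    """Applies bug state changes so that redelivered messages are harmless."""

    def __init__(self):
        self.states = {}  # bug_id -> state; stand-in for the remote tracker
        self.writes = 0   # counts how many real updates were performed

    def handle(self, message):
        """Set the bug to the requested state; return True if a write happened."""
        bug_id, state = message["bug_id"], message["state"]
        if self.states.get(bug_id) == state:
            return False  # already applied, so a duplicate delivery is a no-op
        self.states[bug_id] = state
        self.writes += 1
        return True
```

The handler describes the desired end state ("bug 42 is MODIFIED") rather than a relative change, which is what makes repeated delivery safe.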

@cverna
Contributor

cverna commented Apr 17, 2019

Scope

Allow Bodhi to be used with Celery and replace the existing tasks with Celery tasks:
1 - Update message handler
2 - Composer

We can reuse the RabbitMQ broker available for fedora-messaging.

@cverna cverna changed the title Bodhi could use a task working system [M] - Bodhi could use a task working system Apr 17, 2019
abompard added a commit to abompard/bodhi that referenced this issue May 24, 2019
abompard added a commit to abompard/bodhi that referenced this issue May 27, 2019
abompard added a commit to abompard/bodhi that referenced this issue May 27, 2019
abompard added a commit to abompard/bodhi that referenced this issue May 27, 2019
abompard added a commit to abompard/bodhi that referenced this issue May 27, 2019
abompard added a commit to abompard/bodhi that referenced this issue May 28, 2019
abompard added a commit to abompard/bodhi that referenced this issue May 29, 2019
abompard added a commit to abompard/bodhi that referenced this issue May 29, 2019
abompard added a commit to abompard/bodhi that referenced this issue Jun 4, 2019
abompard added a commit to abompard/bodhi that referenced this issue Jun 11, 2019
abompard added a commit to abompard/bodhi that referenced this issue Jun 12, 2019
abompard added a commit to abompard/bodhi that referenced this issue Jun 25, 2019
nphilipp pushed a commit to abompard/bodhi that referenced this issue Jun 25, 2019
nphilipp pushed a commit to abompard/bodhi that referenced this issue Jun 25, 2019
abompard added a commit to abompard/bodhi that referenced this issue Jun 25, 2019
abompard added a commit to abompard/bodhi that referenced this issue Jun 27, 2019
abompard added a commit to abompard/bodhi that referenced this issue Jun 27, 2019
abompard added a commit to abompard/bodhi that referenced this issue Jul 1, 2019
abompard added a commit to abompard/bodhi that referenced this issue Jul 2, 2019
abompard added a commit to abompard/bodhi that referenced this issue Jul 2, 2019
Zlopez pushed a commit to abompard/bodhi that referenced this issue Jul 10, 2019
Zlopez pushed a commit to abompard/bodhi that referenced this issue Jul 12, 2019
nphilipp pushed a commit to abompard/bodhi that referenced this issue Jul 18, 2019
Fixes: fedora-infra#2851

Signed-off-by: Aurélien Bompard <[email protected]>
Signed-off-by: Michal Konečný <[email protected]>
nphilipp pushed a commit to abompard/bodhi that referenced this issue Jul 18, 2019
Fixes: fedora-infra#2851

Signed-off-by: Aurélien Bompard <[email protected]>
Signed-off-by: Michal Konečný <[email protected]>
@mergify mergify bot closed this as completed in 42b3682 Jul 22, 2019
7 participants