Delegate reimbursement import to Celery #240
Force-pushed from 3095034 to 12b569d
Testing with sample data seems OK. Ready to merge.
Force-pushed from 7943c93 to 57703b8
In PVT @decko told me that it actually wasn't working with the Celery worker running in a container. So we still need to fix that before considering merging it. Is that right?
Force-pushed from 57703b8 to 402553d
Ok… I'm back to this PR. I merged the novelties, then moved the serialize part to the async task (42e512a). The importing command took 15s; we're saving more than 50% of the time now, yay! From that I think the Celery part is indeed working and this PR is ready for code review and, if that's the case, merging… cc @anaschwendler. Many thanks @decko for the tests and the ideas ; )
UPDATE: How to test this PR:
If at the end of the process a) the importing was quicker in this branch when compared to
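The change described above (moving the serialize step into the async task so the command only schedules work) can be sketched with stdlib stand-ins. A `ThreadPoolExecutor` plays the role of the Celery worker here; `serialize()`, `create_or_update()` and `DB` are hypothetical stand-ins, not the PR's actual code:

```python
# Sketch of the pattern: the import command schedules one job per row,
# and a background worker both serializes and persists it.
from concurrent.futures import ThreadPoolExecutor

DB = {}  # stand-in for the Reimbursement table

def serialize(row):
    # Heavy step: cast raw CSV strings into typed model fields.
    return {"document_id": int(row["document_id"]),
            "total_value": float(row["total_value"])}

def create_or_update(row):
    data = serialize(row)           # now runs in the worker, not the command
    DB[data["document_id"]] = data  # update_or_create stand-in

rows = [{"document_id": "1", "total_value": "10.50"},
        {"document_id": "2", "total_value": "7.25"}]

with ThreadPoolExecutor(max_workers=2) as pool:
    # With Celery this loop would be create_or_update.delay(row) per row.
    list(pool.map(create_or_update, rows))

print(len(DB), "reimbursements imported")  # -> 2 reimbursements imported
```

The point of the move is that the command returns as soon as everything is enqueued; the expensive casting work no longer blocks the main process.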
Fixes the database host used in the dev environment.
I'll test this PR little by little and document how (so this comment will be edited along the way):
$ git clone [email protected]:datasciencebr/jarbas.git
$ cd jarbas
I'll try to run it with Docker, even though the steps are shown without it :)
$ cp contrib/.env.sample .env
$ docker-compose up -d
$ docker-compose run --rm django python manage.py migrate
I'll start testing as @cuducos suggested in the last comment:
$ time docker-compose run --rm django python manage.py reimbursements /mnt/data/reimbursements_sample.xz
And the result:
Ok, now let's test with @cuducos' PR:
$ git checkout -b cuducos-celery origin/cuducos-celery
$ git merge master
$ docker-compose run --rm django python manage.py shell_plus
>>> Reimbursement.objects.all().delete()
The result:
(1000, {'core.Tweet': 0, 'core.Reimbursement': 1000})
$ time docker-compose run --rm django python manage.py reimbursements /mnt/data/reimbursements_sample.xz
The result:
Starting jarbas_queue_1 ... done
Starting jarbas_postgres_1 ... done
Starting jarbas_tasks_1 ... done
1000 reimbursements scheduled to creation/update
docker-compose run --rm django python manage.py reimbursements  1.11s user 0.26s system 13% cpu 9.829 total
It gave me a difference of 2 seconds by testing 3 times and taking the average time 🎉 Is it OK @cuducos? Can I merge?
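The averaging described above is a one-liner; the run times below are illustrative placeholders, not the actual measurements:

```python
# Average wall-clock time over three runs (hypothetical values, in seconds).
runs = [9.8, 10.1, 9.6]
average = sum(runs) / len(runs)
print(f"average over {len(runs)} runs: {average:.2f}s")
```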
Nice! Good work, people, congrats!
This is a work in progress, I just created it early to get feedback. The aim, when complete, is to address part of #217.
What is the purpose of this Pull Request?
The purpose of this PR is to make the process of importing reimbursements quicker and free of manual steps, as described in #217.
What was done to achieve this purpose?
I introduced Celery to perform some tasks asynchronously; that way the heavy work of the import is handled in the background.
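A rough stdlib sketch of the broker/worker split that Celery provides (RabbitMQ plays the broker, the Celery worker consumes); the names and payloads here are illustrative only, not the PR's code:

```python
# Stdlib stand-in for the broker/worker pattern Celery implements.
import queue
import threading

broker = queue.Queue()  # plays the role of RabbitMQ
processed = []

def worker():
    # Plays the role of the Celery worker process.
    while True:
        payload = broker.get()
        if payload is None:  # sentinel: no more work
            break
        processed.append(payload)  # real worker: create/update a Reimbursement

t = threading.Thread(target=worker)
t.start()

# The management command only enqueues and returns quickly.
for document_id in range(1000):
    broker.put({"document_id": document_id})
broker.put(None)
t.join()

print(f"{len(processed)} reimbursements scheduled to creation/update")
```

This is why the command finishes fast: it only publishes messages, while the slow database work drains from the queue in the background.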
How to test if it really works?
brew install rabbitmq
or
apt-get install rabbitmq-server
should do the job for macOS and any Debian-based Linux such as Ubuntu (the Docker environment is ready, no need to worry about it).
$ python -m pip install -U -r requirements.txt
$ python manage.py test
$ celery worker --app jarbas
$ python manage.py reimbursements <path to reimbursements.xz>
Who can help reviewing it?
@jtemporal @lipemorais @davinirjr @augustogoulart @adrianomargarin
What is out of scope
In #217 I mentioned creating an API endpoint so Rosie can send POSTs with reimbursement data. I think this is an issue for a new PR once this one is finished, reviewed, stable and merged.
Next steps in this PR
`.xz`) line (either I screwed up when wiring up Celery, or the serialize step is another bottleneck; if so, a callback in Celery should be able to handle serializing and then the create/update step)
Next steps
`docker-compose.yml` after this PR is merged
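The callback idea mentioned above (serialize first, then hand the result to a create/update step) maps to chaining tasks. Below is a minimal stand-in for `celery.chain`; the step names and the `DB` dict are hypothetical, for illustration only:

```python
DB = {}  # stand-in for the Reimbursement table

def serialize(row):
    # Step 1: turn a raw row into typed data.
    return {"document_id": int(row["document_id"])}

def create_or_update(data):
    # Step 2 (the callback): persist the serialized data.
    DB[data["document_id"]] = data
    return data["document_id"]

def chain(*steps):
    # Minimal stand-in for celery.chain: each step's result feeds the next.
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run

pipeline = chain(serialize, create_or_update)
print(pipeline({"document_id": "7"}))  # -> 7
```

With real Celery this would be something like `chain(serialize.s(row), create_or_update.s())()`, which keeps both heavy steps out of the management command.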