Merge pull request galaxyproject#17918 from davelopez/update_celery_docs
Update config docs about Celery
mvdbeek authored Apr 6, 2024
2 parents 9ff846a + 2f4bc1a commit 1abf45f
Showing 4 changed files with 55 additions and 26 deletions.
29 changes: 16 additions & 13 deletions doc/source/admin/galaxy_options.rst
@@ -5145,6 +5145,21 @@
:Type: str


~~~~~~~~~~~~~~~~~~~~~~~
``enable_celery_tasks``
~~~~~~~~~~~~~~~~~~~~~~~

:Description:
Offload long-running tasks to a Celery task queue. Activate this
only if you have set up a Celery worker for Galaxy and you have
configured the `celery_conf` option below. Specifically, you need
to set the `result_backend` option in the `celery_conf` option to
a valid Celery result backend URL. For details, see
https://docs.galaxyproject.org/en/master/admin/production.html#use-celery-for-asynchronous-tasks
:Default: ``false``
:Type: bool


~~~~~~~~~~~~~~~
``celery_conf``
~~~~~~~~~~~~~~~
@@ -5162,22 +5177,10 @@
disabled on a per-task basis at this time.)
For details, see Celery documentation at
https://docs.celeryq.dev/en/stable/userguide/configuration.html.
:Default: ``{'task_routes': {'galaxy.fetch_data': 'galaxy.external', 'galaxy.set_job_metadata': 'galaxy.external'}}``
:Default: ``{'result_backend': 'redis://127.0.0.1:6379/0', 'task_routes': {'galaxy.fetch_data': 'galaxy.external', 'galaxy.set_job_metadata': 'galaxy.external'}}``
:Type: any


~~~~~~~~~~~~~~~~~~~~~~~
``enable_celery_tasks``
~~~~~~~~~~~~~~~~~~~~~~~

:Description:
Offload long-running tasks to a Celery task queue. Activate this
only if you have set up a Celery worker for Galaxy. For details,
see https://docs.galaxyproject.org/en/master/admin/production.html
:Default: ``false``
:Type: bool


~~~~~~~~~~~~~~~~~~~~~~~~~~
``celery_user_rate_limit``
~~~~~~~~~~~~~~~~~~~~~~~~~~
17 changes: 17 additions & 0 deletions doc/source/admin/production.md
@@ -179,3 +179,20 @@ Finally, if you are using Galaxy <= release_2014.06.02, we recommend that you in
### Make the proxy handle uploads and downloads

By default, Galaxy receives file uploads as a stream from the proxy server and then writes this file to disk. Likewise, it sends files as a stream to the proxy server. This occupies the GIL in that Galaxy process and will decrease responsiveness for other operations in that process. To solve this problem, you can configure your proxy server to serve downloads directly, involving Galaxy only for the task of authorizing that the user has permission to read the dataset. If using nginx as the proxy, you can configure it to receive uploaded files and write them to disk itself, only notifying Galaxy of the upload once it's completed. All the details on how to configure these can be found on the [Apache](apache.md) and [nginx](nginx.md) proxy instruction pages.

### Use Celery for asynchronous tasks

Galaxy can use [Celery](https://docs.celeryq.dev/en/stable/index.html) to handle asynchronous tasks. This is useful for offloading time-consuming tasks that would otherwise block the Galaxy process. Some use cases include:

- Setting metadata on datasets
- Purging datasets
- Exporting histories or other data
- Running periodic tasks

The list of tasks that are currently handled by `Celery` can be found in `lib/galaxy/celery/tasks.py`.

To enable Celery in your instance, follow these additional steps:

- Set `enable_celery_tasks: true` in the Galaxy config.
- Configure the `result_backend` option under `celery_conf` to store the results of the tasks. For example, you can use [`redis` as the backend](https://docs.celeryq.dev/en/stable/getting-started/backends-and-brokers/redis.html#broker-redis). If you are using `redis`, make sure to install the `redis` dependency in your Galaxy environment with `pip install redis`. You can find more information on how to configure other backends in the [Celery documentation](https://docs.celeryq.dev/en/stable/userguide/tasks.html#task-result-backends).
- Configure one or more workers to handle the tasks. You can find more information on how to configure workers in the [Celery documentation](https://docs.celeryq.dev/en/stable/userguide/workers.html). If you are using [Gravity](https://github.com/galaxyproject/gravity), it simplifies the process of setting up Celery workers.
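
As a minimal sketch, the first two steps could look like this in the Galaxy config. The option names and the `task_routes` entries are taken from the sample config; the Redis URL is the sample default and assumes a Redis server listening on localhost, so adjust it to your deployment. Worker setup (the third step) is not shown here.

```yaml
galaxy:
  # Step 1: offload long-running tasks to the Celery task queue.
  enable_celery_tasks: true

  # Options in this mapping are passed to Celery.
  celery_conf:
    # Step 2: where task results are stored.
    # Assumption: a local Redis instance on the default port, database 0.
    result_backend: redis://127.0.0.1:6379/0
    # Routes copied from the sample config defaults.
    task_routes:
      galaxy.fetch_data: galaxy.external
      galaxy.set_job_metadata: galaxy.external
```

With this in place, at least one Celery worker still has to be running before Galaxy can hand tasks to it.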
14 changes: 9 additions & 5 deletions lib/galaxy/config/sample/galaxy.yml.sample
@@ -2757,6 +2757,14 @@ galaxy:
# commented out line below).
#amqp_internal_connection: sqlalchemy+sqlite:///./database/control.sqlite?isolation_level=IMMEDIATE

# Offload long-running tasks to a Celery task queue. Activate this
# only if you have set up a Celery worker for Galaxy and you have
# configured the `celery_conf` option below. Specifically, you need to
# set the `result_backend` option in the `celery_conf` option to a
# valid Celery result backend URL. For details, see
# https://docs.galaxyproject.org/en/master/admin/production.html#use-celery-for-asynchronous-tasks
#enable_celery_tasks: false

# Configuration options passed to Celery.
# To refer to a task by name, use the template `galaxy.foo` where
# `foo` is the function name of the task defined in the
@@ -2770,15 +2778,11 @@ galaxy:
# For details, see Celery documentation at
# https://docs.celeryq.dev/en/stable/userguide/configuration.html.
#celery_conf:
# result_backend: redis://127.0.0.1:6379/0
# task_routes:
# galaxy.fetch_data: galaxy.external
# galaxy.set_job_metadata: galaxy.external

# Offload long-running tasks to a Celery task queue. Activate this
# only if you have set up a Celery worker for Galaxy. For details, see
# https://docs.galaxyproject.org/en/master/admin/production.html
#enable_celery_tasks: false

# If set to a non-0 value, upper limit on number of tasks that can be
# executed per user per second.
#celery_user_rate_limit: 0.0
21 changes: 13 additions & 8 deletions lib/galaxy/config/schemas/config_schema.yml
@@ -3755,10 +3755,23 @@ mapping:
will automatically create and use a separate sqlite database located in your
<galaxy>/database folder (indicated in the commented out line below).
enable_celery_tasks:
type: bool
default: false
required: false
desc: |
Offload long-running tasks to a Celery task queue.
Activate this only if you have set up a Celery worker for Galaxy and you have
configured the `celery_conf` option below. Specifically, you need to set the
`result_backend` option in the `celery_conf` option to a valid Celery result
backend URL.
For details, see https://docs.galaxyproject.org/en/master/admin/production.html#use-celery-for-asynchronous-tasks
celery_conf:
type: any
required: false
default:
result_backend: redis://127.0.0.1:6379/0
task_routes:
'galaxy.fetch_data': 'galaxy.external'
'galaxy.set_job_metadata': 'galaxy.external'
@@ -3776,14 +3789,6 @@ mapping:
For details, see Celery documentation at https://docs.celeryq.dev/en/stable/userguide/configuration.html.
enable_celery_tasks:
type: bool
default: false
required: false
desc: |
Offload long-running tasks to a Celery task queue.
Activate this only if you have set up a Celery worker for Galaxy.
For details, see https://docs.galaxyproject.org/en/master/admin/production.html
celery_user_rate_limit:
type: float