This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Implements a task scheduler for resumable potentially long running tasks #15891
Merged
Merged
Changes from 28 commits
Commits
Show all changes
38 commits
Select commit
Hold shift + click to select a range
470f385
Implements a task scheduler for resumable potentially long running tasks
0961f52
Add filters to task retrieval + clean less often
862425f
Merge remote-tracking branch 'origin/develop' into mv/task-scheduler
MatMaul e6bf9ce
Move sql file
MatMaul abbbddf
Update schema version
MatMaul a4ed9fe
Add ts to get_tasks and narrow down the query in the loops
MatMaul 3490a91
Renames
e03c12d
Comments
8e29f3d
Update synapse/util/task_scheduler.py
b2dff65
Update synapse/util/task_scheduler.py
ce981c5
Add underscore to private fields
9e25f5e
lint
8c713f9
Apply suggestions from code review
bc92b72
Comments
46cbde8
Fix error handling
1025432
More comments
fd2c3dd
Fix
9e9a5e8
Add doc
861992f
Merge remote-tracking branch 'origin/develop' into mv/task-scheduler
4c1c833
Move to new DB delta folder
6762c07
Limit concurrent running tasks to 20
dfa3983
Remove launch now feature, max 10
7730d74
Fix tests
5878384
Fix olddeps tests
2383d1e
Merge loops
6122291
Merge remote-tracking branch 'origin/develop' into mv/task-scheduler
16be76b
Address comments
ad25237
Address comments
7b96379
Add delete_task
3fd518c
Add index on status
1a410ee
Add big docstring
8ed3e37
Add running tasks metric and log running tasks for more than 24hrs wi…
60ea494
Merge remote-tracking branch 'origin/develop' into mv/task-scheduler
c089f67
Remove unused sql file
d158b67
Address reviews
223f87e
Address comment
e5a5344
Fix
56ab35e
Address comment
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Implements a task scheduler for resumable potentially long running tasks. | ||
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,209 @@ | ||
# Copyright 2023 The Matrix.org Foundation C.I.C. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
import json | ||
from typing import TYPE_CHECKING, Any, Dict, List, Optional | ||
|
||
from synapse.storage._base import SQLBaseStore, db_to_json | ||
from synapse.storage.database import ( | ||
DatabasePool, | ||
LoggingDatabaseConnection, | ||
LoggingTransaction, | ||
make_in_list_sql_clause, | ||
) | ||
from synapse.types import JsonDict, JsonMapping, ScheduledTask, TaskStatus | ||
from synapse.util import json_encoder | ||
|
||
if TYPE_CHECKING: | ||
from synapse.server import HomeServer | ||
|
||
|
||
class TaskSchedulerWorkerStore(SQLBaseStore): | ||
def __init__( | ||
self, | ||
database: DatabasePool, | ||
db_conn: LoggingDatabaseConnection, | ||
hs: "HomeServer", | ||
): | ||
super().__init__(database, db_conn, hs) | ||
|
||
@staticmethod | ||
def _convert_row_to_task(row: Dict[str, Any]) -> ScheduledTask: | ||
row["status"] = TaskStatus(row["status"]) | ||
if row["params"] is not None: | ||
row["params"] = db_to_json(row["params"]) | ||
if row["result"] is not None: | ||
row["result"] = db_to_json(row["result"]) | ||
return ScheduledTask(**row) | ||
|
||
async def get_scheduled_tasks( | ||
self, | ||
*, | ||
actions: Optional[List[str]] = None, | ||
resource_ids: Optional[List[str]] = None, | ||
statuses: Optional[List[TaskStatus]] = None, | ||
max_timestamp: Optional[int] = None, | ||
) -> List[ScheduledTask]: | ||
"""Get a list of scheduled tasks from the DB. | ||
|
||
If an arg is `None` all tasks matching the other args will be selected. | ||
If an arg is an empty list, the value needs to be NULL in DB to be selected. | ||
MatMaul marked this conversation as resolved.
Show resolved
Hide resolved
MatMaul marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
Args: | ||
actions: Limit the returned tasks to those specific action names | ||
resource_ids: Limit the returned tasks to the specific resource ids | ||
statuses: Limit the returned tasks to the specific statuses | ||
max_timestamp: Limit the returned tasks to the ones that have | ||
a timestamp inferior to the specified one | ||
|
||
Returns: a list of `ScheduledTask`, ordered by increasing timestamps | ||
""" | ||
|
||
def get_scheduled_tasks_txn(txn: LoggingTransaction) -> List[Dict[str, Any]]: | ||
clauses = [] | ||
args = [] | ||
if actions is not None: | ||
clause, temp_args = make_in_list_sql_clause( | ||
txn.database_engine, "action", actions | ||
) | ||
clauses.append(clause) | ||
args.extend(temp_args) | ||
if resource_ids is not None: | ||
clause, temp_args = make_in_list_sql_clause( | ||
txn.database_engine, "resource_id", resource_ids | ||
) | ||
clauses.append(clause) | ||
args.extend(temp_args) | ||
if statuses is not None: | ||
clause, temp_args = make_in_list_sql_clause( | ||
txn.database_engine, "status", statuses | ||
) | ||
clauses.append(clause) | ||
args.extend(temp_args) | ||
if max_timestamp is not None: | ||
clauses.append("timestamp <= ?") | ||
args.append(max_timestamp) | ||
|
||
sql = "SELECT * FROM scheduled_tasks" | ||
if clauses: | ||
sql = sql + " WHERE " + " AND ".join(clauses) | ||
|
||
sql = sql + "ORDER BY timestamp" | ||
|
||
txn.execute(sql, args) | ||
return self.db_pool.cursor_to_dict(txn) | ||
|
||
rows = await self.db_pool.runInteraction( | ||
"get_scheduled_tasks", get_scheduled_tasks_txn | ||
) | ||
return [TaskSchedulerWorkerStore._convert_row_to_task(row) for row in rows] | ||
|
||
async def insert_scheduled_task(self, task: ScheduledTask) -> None: | ||
"""Insert a specified `ScheduledTask` in the DB. | ||
|
||
Args: | ||
task: the `ScheduledTask` to insert | ||
""" | ||
await self.db_pool.simple_insert( | ||
"scheduled_tasks", | ||
{ | ||
"id": task.id, | ||
"action": task.action, | ||
"status": task.status, | ||
"timestamp": task.timestamp, | ||
"resource_id": task.resource_id, | ||
"params": None | ||
if task.params is None | ||
else json_encoder.encode(task.params), | ||
"result": None | ||
if task.result is None | ||
else json_encoder.encode(task.result), | ||
"error": task.error, | ||
}, | ||
desc="insert_scheduled_task", | ||
) | ||
|
||
async def update_scheduled_task( | ||
self, | ||
id: str, | ||
timestamp: int, | ||
*, | ||
status: Optional[TaskStatus] = None, | ||
result: Optional[JsonMapping] = None, | ||
error: Optional[str] = None, | ||
) -> bool: | ||
MatMaul marked this conversation as resolved.
Show resolved
Hide resolved
|
||
"""Update a scheduled task in the DB with some new value(s). | ||
|
||
Args: | ||
id: id of the `ScheduledTask` to update | ||
timestamp: new timestamp of the task | ||
status: new status of the task | ||
result: new result of the task | ||
error: new error of the task | ||
|
||
Returns: `False` if no matching row was found, `True` otherwise | ||
""" | ||
updatevalues: JsonDict = {"timestamp": timestamp} | ||
if status is not None: | ||
updatevalues["status"] = status | ||
if result is not None: | ||
updatevalues["result"] = json.dumps(result) | ||
MatMaul marked this conversation as resolved.
Show resolved
Hide resolved
|
||
if error is not None: | ||
updatevalues["error"] = error | ||
nb_rows = await self.db_pool.simple_update( | ||
"scheduled_tasks", | ||
{"id": id}, | ||
updatevalues, | ||
desc="update_scheduled_task", | ||
) | ||
return nb_rows > 0 | ||
|
||
async def get_scheduled_task(self, id: str) -> Optional[ScheduledTask]: | ||
"""Get a specific `ScheduledTask` from its id. | ||
|
||
Args: | ||
id: the id of the task to retrieve | ||
|
||
Returns: the task if available, `None` otherwise | ||
""" | ||
row = await self.db_pool.simple_select_one( | ||
table="scheduled_tasks", | ||
keyvalues={"id": id}, | ||
retcols=( | ||
"id", | ||
"action", | ||
"status", | ||
"timestamp", | ||
"resource_id", | ||
"params", | ||
"result", | ||
"error", | ||
), | ||
allow_none=True, | ||
desc="get_scheduled_task", | ||
) | ||
|
||
return TaskSchedulerWorkerStore._convert_row_to_task(row) if row else None | ||
|
||
async def delete_scheduled_task(self, id: str) -> None: | ||
"""Delete a specific task from its id. | ||
|
||
Args: | ||
id: the id of the task to delete | ||
""" | ||
await self.db_pool.simple_delete( | ||
"scheduled_tasks", | ||
keyvalues={"id": id}, | ||
desc="delete_scheduled_task", | ||
) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
26 changes: 26 additions & 0 deletions
26
synapse/storage/schema/main/delta/80/01_scheduled_tasks.sql
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
/* Copyright 2023 The Matrix.org Foundation C.I.C | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
-- cf ScheduledTask docstring for the meaning of the fields. | ||
CREATE TABLE IF NOT EXISTS scheduled_tasks( | ||
id TEXT PRIMARY KEY, | ||
action TEXT NOT NULL, | ||
status TEXT NOT NULL, | ||
timestamp BIGINT NOT NULL, | ||
resource_id TEXT, | ||
params TEXT, | ||
result TEXT, | ||
error TEXT | ||
); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we want to add some indices for the various fields here? Or is the assumption that this is going to be low volume? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It should be low volume. But then an indice doesn't cost much in space so perhaps we should still do it ?
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this needs to be a feature -- it is something internally used by Synapse; admins and end-users don't care.
We could probably just combine this with the other PR?