Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Task Manager] old idle task not run when there's a large backlog of idle tasks #80371

Closed
pmuellr opened this issue Oct 13, 2020 · 24 comments · Fixed by #80692
Closed

[Task Manager] old idle task not run when there's a large backlog of idle tasks #80371

pmuellr opened this issue Oct 13, 2020 · 24 comments · Fixed by #80692
Assignees
Labels
Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@pmuellr
Copy link
Member

pmuellr commented Oct 13, 2020

7.10.0-snapshot on staging cloud (post 7.10 FF)

I've been running some load testing on staging cloud with 7.10, and managed to get to a point where I have 96K tasks queued (for action execution), which appear to not being picked up to run. While this is going on, alerts are running, so tasks are running, it's just that these old tasks are not running.

Here's an example task which is not being run, but should, as it's almost 12 hours old at this point:
{
    "task": {
      "taskType": "actions:.server-log",
      "params": "{\"spaceId\":\"default\",\"actionTaskParamsId\":\"531c8827-191e-4fc9-bc09-3b25692e1ad2\"}",
      "state": "{}",
      "scope": [
        "actions"
      ],
      "attempts": 0,
      "scheduledAt": "2020-10-13T06:36:41.316Z",
      "startedAt": null,
      "retryAt": null,
      "runAt": "2020-10-13T06:36:41.316Z",
      "status": "idle"
    },
    "type": "task",
    "references": [],
    "migrationVersion": {
      "task": "7.6.0"
    },
    "updated_at": "2020-10-13T06:36:41.316Z"
  }
Here's a task doc for a currently running alert
{
    "migrationVersion": {
      "task": "7.6.0"
    },
    "task": {
      "taskType": "alerting:.index-threshold",
      "retryAt": "2020-10-13T15:40:45.196Z",
      "runAt": "2020-10-13T15:30:15.791Z",
      "scope": [
        "alerting"
      ],
      "startedAt": "2020-10-13T15:30:45.196Z",
      "state": "{\"alertInstances\":{},\"previousStartedAt\":\"2020-10-13T15:30:14.791Z\"}",
      "params": "{\"alertId\":\"4e8bbf2d-c7e7-4112-b1a8-d0bad05d9057\",\"spaceId\":\"default\"}",
      "ownerId": "kibana:a3dc9451-e65d-4245-89a2-442bab284249",
      "scheduledAt": "2020-10-13T05:22:58.374Z",
      "attempts": 1,
      "status": "running"
    },
    "references": [],
    "updated_at": "2020-10-13T15:30:45.246Z",
    "type": "task"
  }

Not exactly sure what I did to get in this state, exactly, but here are some of the things I was playing with.

  1. create a cloud deployment, set env var KBN_URLBASE and ES_URL as appropriate for the following scripts
  2. install https://github.com/pmuellr/es-apm-sys-sim , set ES_URL as appropriate
  3. run ALERTS=100 whole-lotta-alerts.sh, where that script is from this gist
  4. run es-apm-sys-sim 1 4 -k - that allows you to change the cpu threshold with keystrokes; bump them all to 100% by holding down keys 1, then 2, etc
  5. You'll now have 100 alerts firing every second (in theory) scheduling 4 actions every time the alert runs

Things will back up fairly quickly. You can get the number of tasks queued (basically actions queued to run) via

curl $ES_URL/_cat/indices/.kibana_task_manager?v

You can kill es-apm-sys-sim so that the alerts will stop firing, and see if the actions will eventually get run, and then deleted from the tm index. You can also easily "upgrade" Kibana, from say 1 1GB RAM instance to 4 8GB RAM instances, and that has worked at times to get the queued actions resolved.

To get to this state, I let ~100K tasks get queued, then killed es-apm-sys-sim, so that actions should STOP being queued, and hopefully resolved. No more tasks got queued (as expected), but these old ones also never got run and so the tm index was left with ~100K action tasks. Left it running overnight with 1 1GB Kibana instance, and they never got run.

@pmuellr pmuellr added Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Oct 13, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@pmuellr
Copy link
Member Author

pmuellr commented Oct 13, 2020

Gidi suggested capturing the task manager update-by-query params we're using, so that's here:

query but with some task-type match clauses elided
callCluster('updateByQuery' {
  "index": ".kibana_task_manager",
  "ignoreUnavailable": true,
  "refresh": true,
  "max_docs": 10,
  "conflicts": "proceed",
  "body": {
    "query": {
      "bool": {
        "must": [
          {
            "term": {
              "type": "task"
            }
          },
          {
            "bool": {
              "must": [
                {
                  "bool": {
                    "must": [
                      {
                        "bool": {
                          "should": [
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.status": "idle"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.runAt": {
                                        "lte": "now"
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "bool": {
                                      "should": [
                                        {
                                          "term": {
                                            "task.status": "running"
                                          }
                                        },
                                        {
                                          "term": {
                                            "task.status": "claiming"
                                          }
                                        }
                                      ]
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.retryAt": {
                                        "lte": "now"
                                      }
                                    }
                                  }
                                ]
                              }
                            }
                          ]
                        }
                      },
                      {
                        "bool": {
                          "should": [
                            {
                              "exists": {
                                "field": "task.schedule"
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "session_cleanup"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            // block below repeated for every task type
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "actions:.server-log"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 1
                                      }
                                    }
                                  }
                                ]
                              }
                            }
                            // block above repeated for every task type
                          ]
                        }
                      }
                    ]
                  }
                }
              ],
              "filter": [
                {
                  "bool": {
                    "must_not": [
                      {
                        "bool": {
                          "should": [
                            {
                              "term": {
                                "task.status": "running"
                              }
                            },
                            {
                              "term": {
                                "task.status": "claiming"
                              }
                            }
                          ],
                          "must": {
                            "range": {
                              "task.retryAt": {
                                "gt": "now"
                              }
                            }
                          }
                        }
                      }
                    ]
                  }
                }
              ]
            }
          }
        ]
      }
    },
    "sort": [
      "_score",
      {
        "_script": {
          "type": "number",
          "order": "asc",
          "script": {
            "lang": "painless",
            "source": "\nif (doc['task.retryAt'].size()!=0) {\n  return doc['task.retryAt'].value.toInstant().toEpochMilli();\n}\nif (doc['task.runAt'].size()!=0) {\n  return doc['task.runAt'].value.toInstant().toEpochMilli();\n}\n    "
          }
        }
      }
    ],
    "seq_no_primary_term": true,
    "script": {
      "source": "ctx._source.task.ownerId=params.ownerId; ctx._source.task.status=params.status; ctx._source.task.retryAt=params.retryAt;",
      "lang": "painless",
      "params": {
        "ownerId": "kibana:5b2de169-2785-441b-ae8c-186a1936b17d",
        "status": "claiming",
        "retryAt": "2020-10-13T17:11:21.892Z"
      }
    }
  }
}

The sort script, near the bottom, reformatted, is:

if (doc['task.retryAt'].size()!=0) {
    return doc['task.retryAt'].value.toInstant().toEpochMilli();
}
if (doc['task.runAt'].size()!=0) {
    return doc['task.runAt'].value.toInstant().toEpochMilli();
}
the elided task-type match clauses removed from above
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "lens_telemetry"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "actions:.email"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 1
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "actions:.index"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 1
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "actions:.pagerduty"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 1
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "actions:.server-log"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 1
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "actions:.slack"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 1
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "actions:.webhook"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 1
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "actions:.servicenow"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 1
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "actions:.jira"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 1
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "actions:.resilient"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 1
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "actions_telemetry"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting_telemetry"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:.index-threshold"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:.geo-threshold"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:example.always-firing"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:example.people-in-space"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:siem.signals"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:siem.notifications"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "endpoint:user-artifact-packager"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:metrics.alert.threshold"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:metrics.alert.inventory.threshold"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:logs.alert.document.count"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:monitoring_alert_cluster_health"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:monitoring_alert_license_expiration"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:monitoring_alert_cpu_usage"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:monitoring_alert_missing_monitoring_data"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:monitoring_alert_disk_usage"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:monitoring_alert_jvm_memory_usage"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:monitoring_alert_nodes_changed"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:monitoring_alert_logstash_version_mismatch"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:monitoring_alert_kibana_version_mismatch"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:monitoring_alert_elasticsearch_version_mismatch"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:apm.transaction_duration"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:apm.transaction_duration_anomaly"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:apm.error_rate"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:apm.transaction_error_rate"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "apm-telemetry-task"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:xpack.uptime.alerts.monitorStatus"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:xpack.uptime.alerts.tls"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "must": [
        {
          "term": {
            "task.taskType": "alerting:xpack.uptime.alerts.durationAnomaly"
          }
        },
        {
          "range": {
            "task.attempts": {
              "lt": 3
            }
          }
        }
      ]
    }
  }

@pmuellr
Copy link
Member Author

pmuellr commented Oct 13, 2020

I see a max_doc of 10 here. Is there some chance that ES would be happy to find 10 docs matching the conditions, and then sort those, instead of - what we want - having them sorted then pick the 10 oldest.

Looking at the action and alert tasks above, and seeing how that sort runs, looks like it's coded the way you want.

The retryAt is a bit funny - it gets set in the "update" script at the bottom, so is presumably not an indication of an error. This particular doc I just happened to catch in that running state. In any case, the actions runAt is still earlier than the alerts retryAtAND runAt, so it seems like it should have "sorted first"

I should note that I checked several times during my testing to see if any errors were being logged, never found anything.

@pmuellr
Copy link
Member Author

pmuellr commented Oct 13, 2020

under Description from the update-by-query docs

While processing an update by query request, Elasticsearch performs multiple search requests sequentially to find all of the matching documents. A bulk update request is performed for each batch of matching documents. Any query or update failures cause the update by query request to fail and the failures are shown in the response. Any update requests that completed successfully still stick, they are not rolled back.

And docs for sort and maxDocs:

max_docs - (Optional, integer) Maximum number of documents to process. Defaults to all documents.
sort - (Optional, string) A comma-separated list of : pairs.

Given that, seems like quite possibly UBQ find 10 updateable docs, and then says "done!", missing older ones

@pmuellr pmuellr self-assigned this Oct 13, 2020
@pmuellr
Copy link
Member Author

pmuellr commented Oct 13, 2020

Some more interesting settings:

  • scroll_size - Optional, integer) Size of the scroll request that powers the operation. Defaults to 100.
  • terminate_after - (Optional, integer) The maximum number of documents to collect for each shard, upon reaching which the query execution will terminate early.
  • wait_for_active_shards - (Optional, string) The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). Default: 1, the primary shard.

scroll_size is interesting, especially that default of 100. The testing I was doing had exactly 100 alerts! And I was initially thrown off as the doc later gives an example of setting this, saying:

By default update by query uses scroll batches of 1000. You can change the batch size with the scroll_size parameter:

Opened an issue to get the docs fixed: elastic/elasticsearch#63637

terminate_after is also interesting, if there's some calculated default like the value of max_docs

@pmuellr
Copy link
Member Author

pmuellr commented Oct 13, 2020

If the missing docs are in fact missing because of scroll_size or terminate_after or similar, it's still not quite clear to me what kind of a number we could provide that would be reasonable. We've seen task manager indices with millions of failed action executions, which are ignored by the query and have not caused us any known problems (but we haven't looked real hard either, I think).

At this point, I think I've exhausted by ES foo on this, will need to pull in some other experts ...

@mikecote
Copy link
Contributor

Task manager also uses conflicts: 'proceed' (here) which should indicate Elasticsearch to keep trying to update other docs until max_docs is reached and skip 409 errors. I wonder if this isn't working as I hoped.

This is there for Kibana instances to each claim 10 tasks at the same time and skip through updates that return 409 errors.

@pmuellr
Copy link
Member Author

pmuellr commented Oct 13, 2020

hmmm ... I was running multiple Kibana instances sometimes, and I did happen to look at some ES stats that indicated some conflicts had occurred ...

@pmuellr
Copy link
Member Author

pmuellr commented Oct 13, 2020

So, I was dying to try deleting some alerts to see if TM would then "see" the action docs. But don't want to touch the environment. So created a new deployment with a snapshot of the one I've been playing with. Came right up, looks exactly the same, noice!

First tried deleting one alert - nothing; well, it did delete one task doc (the alert one). Then tried 50, then tried the final 49. Nothing. Actions still not running. Also tried restarting Kibana, same. So it seems unlikely it has anything to do with scroll_size, max_docs, etc.

@pmuellr
Copy link
Member Author

pmuellr commented Oct 14, 2020

Playing around some more yesterday, I found the smoothest way to "wake up" some of the old tasks was to disable a significant # of the 100 enabled alerts. Started with 50. After a few minutes, started getting a few of them to run.

@pmuellr
Copy link
Member Author

pmuellr commented Oct 14, 2020

Here's a version of the query used, that you can paste into the dev tools console. It's not exactly the same; I used size: 10 instead of max_docs: 10; conflicts doesn't seem relevant here :-)

dev tools query
POST /.kibana_task_manager/_search
{
    "size": 10,
    "query": {
      "bool": {
        "must": [
          {
            "term": {
              "type": "task"
            }
          },
          {
            "bool": {
              "must": [
                {
                  "bool": {
                    "must": [
                      {
                        "bool": {
                          "should": [
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.status": "idle"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.runAt": {
                                        "lte": "now"
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "bool": {
                                      "should": [
                                        {
                                          "term": {
                                            "task.status": "running"
                                          }
                                        },
                                        {
                                          "term": {
                                            "task.status": "claiming"
                                          }
                                        }
                                      ]
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.retryAt": {
                                        "lte": "now"
                                      }
                                    }
                                  }
                                ]
                              }
                            }
                          ]
                        }
                      },
                      {
                        "bool": {
                          "should": [
                            {
                              "exists": {
                                "field": "task.schedule"
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "session_cleanup"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "lens_telemetry"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "actions:.email"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 1
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "actions:.index"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 1
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "actions:.pagerduty"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 1
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "actions:.server-log"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 1
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "actions:.slack"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 1
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "actions:.webhook"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 1
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "actions:.servicenow"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 1
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "actions:.jira"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 1
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "actions:.resilient"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 1
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "actions_telemetry"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting_telemetry"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:.index-threshold"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:.geo-threshold"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:example.always-firing"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:example.people-in-space"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:siem.signals"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:siem.notifications"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "endpoint:user-artifact-packager"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:metrics.alert.threshold"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:metrics.alert.inventory.threshold"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:logs.alert.document.count"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:monitoring_alert_cluster_health"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:monitoring_alert_license_expiration"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:monitoring_alert_cpu_usage"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:monitoring_alert_missing_monitoring_data"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:monitoring_alert_disk_usage"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:monitoring_alert_jvm_memory_usage"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:monitoring_alert_nodes_changed"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:monitoring_alert_logstash_version_mismatch"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:monitoring_alert_kibana_version_mismatch"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:monitoring_alert_elasticsearch_version_mismatch"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:apm.transaction_duration"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:apm.transaction_duration_anomaly"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:apm.error_rate"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:apm.transaction_error_rate"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "apm-telemetry-task"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:xpack.uptime.alerts.monitorStatus"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:xpack.uptime.alerts.tls"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            },
                            {
                              "bool": {
                                "must": [
                                  {
                                    "term": {
                                      "task.taskType": "alerting:xpack.uptime.alerts.durationAnomaly"
                                    }
                                  },
                                  {
                                    "range": {
                                      "task.attempts": {
                                        "lt": 3
                                      }
                                    }
                                  }
                                ]
                              }
                            }
                          ]
                        }
                      }
                    ]
                  }
                }
              ],
              "filter": [
                {
                  "bool": {
                    "must_not": [
                      {
                        "bool": {
                          "should": [
                            {
                              "term": {
                                "task.status": "running"
                              }
                            },
                            {
                              "term": {
                                "task.status": "claiming"
                              }
                            }
                          ],
                          "must": {
                            "range": {
                              "task.retryAt": {
                                "gt": "now"
                              }
                            }
                          }
                        }
                      }
                    ]
                  }
                }
              ]
            }
          }
        ]
      }
    },
    "sort": [
      "_score",
      {
        "_script": {
          "type": "number",
          "order": "asc",
          "script": {
            "lang": "painless",
            "source": "\nif (doc['task.retryAt'].size()!=0) {\n  return doc['task.retryAt'].value.toInstant().toEpochMilli();\n}\nif (doc['task.runAt'].size()!=0) {\n  return doc['task.runAt'].value.toInstant().toEpochMilli();\n}\n    "
          }
        }
      }
    ],
    "seq_no_primary_term": true
  }

Here's the response with hits elided, and then the first hit:

{
  "took" : 81,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : []
  }
}
      {
        "_index" : ".kibana_task_manager_1",
        "_type" : "_doc",
        "_id" : "task:2831c731-0d14-11eb-ae66-f1f66f186dec",
        "_seq_no" : 1507447,
        "_primary_term" : 1,
        "_score" : 3.6364787,
        "_source" : {
          "migrationVersion" : {
            "task" : "7.6.0"
          },
          "task" : {
            "taskType" : "alerting:.index-threshold",
            "retryAt" : null,
            "runAt" : "2020-10-14T15:56:59.636Z",
            "scope" : [
              "alerting"
            ],
            "startedAt" : null,
            "state" : """{"alertInstances":{},"previousStartedAt":"2020-10-14T15:56:58.636Z"}""",
            "params" : """{"alertId":"793b2a84-2399-4917-8670-95a7b25b29fc","spaceId":"default"}""",
            "ownerId" : null,
            "scheduledAt" : "2020-10-13T05:22:58.979Z",
            "attempts" : 0,
            "status" : "idle"
          },
          "references" : [ ],
          "updated_at" : "2020-10-14T15:56:59.860Z",
          "type" : "task"
        },
        "sort" : [
          3.6364787,
          1.602691019636E12
        ]
      }

The sort value there looks ... interesting.

Looking at the 10 results, they all had retryAt: null and the runAt was the same for 9 of them, with 1 being different. However, they all had that same _score value of 3.6364787.

Which makes that score even more suspicious. Smelling like the sort script to me ...

@pmuellr
Copy link
Member Author

pmuellr commented Oct 14, 2020

doc on scripting a sort: https://www.elastic.co/guide/en/elasticsearch/painless/current/painless-sort-context.html

  "sort": {
    "_script": {
      "type": "number",
      "script": {
        "lang": "painless",
        "source": "doc['theatre'].value.length() * params.factor",
        "params": {
          "factor": 1.1
        }
      },
      "order": "asc"
    }

So, yeah, it looks like we're sorting on _score then the scripted value, but I think we just want to sort on the scripted value. We don't really need some kind of score from the query, do we? I think the intention is that the dates determines the order, nothing else in the docs.

our current sort:

"sort": [
  "_score",
  {
    "_script": {
      "type": "number",
      "order": "asc",
      "script": {
        "lang": "painless",
        "source": "..."
      }
    }
  }
]

maybe it should be more like this?

"sort": {
  "_script": {
    "type": "number",
    "order": "asc",
    "script": {
      "lang": "painless",
      "source": "..."
    }
  }
}

I'll see if I can repro the problem locally (I've just been doing it on cloud), and then see if changing the sort gets things moving again ...

@pmuellr
Copy link
Member Author

pmuellr commented Oct 14, 2020

Here's how I'm trying to repro.

  • d/l the whole-lotta-alerts.sh gist https://gist.github.com/pmuellr/f30331660ae032a0a7ccf2767aea3900
  • that gist mentions other tools to install, jq, kbn-action, es-apm-sys-sim
  • run the script as ALERTS=100 whole-lotta-alerts.sh; 100 is the number of alerts to create
  • run es-apm-sys-sim 1 4 -k; this will create 4 alert instances and runs in "keypress mode"; you can set the CPU level to 100% (c:1.00 in the terminal) for each instance by pressing 1 till you see it hit 1.00, then 2, then 3, then 4 the same. Get them all up to 100%. You can ctrl-c to exit out to stop having the alerts fire (or use q, w, e, r to reduce the CPU level)

This should start producing a lot of action requests. You can check by doing a cat of the task manager index - there's only a handful of non-alerting tasks, and then there should be 100 alert tasks and the rest are action execution tasks.

At some point, the actions will start running, but you can get them to start again by killing es-apm-sys-sim. So, not an exact repro, since my cloud deploy has no active alerts, and tons of actions that aren't running. But I think it's probably good enough - there should be SOME actions running while the alerts are active.

That's the basic gist - we should come up with a script we can use that just touches task manager. Maybe add an example task in examples, that we can queue up via http requests, then figure out how to tell if "old" ones get stuck.

@pmuellr
Copy link
Member Author

pmuellr commented Oct 14, 2020

well, dang, here's the commentary on why _score is in the sort:

sort: [
// sort by score first, so the "pinned" Tasks are first
'_score',
// the nsort by other fields
SortByRunAtAndRetryAt,
],

The "pinned" tasks came in with PR #54765 . I vaguely remember this, it was about having to replace using _id fields in documents, but I don't remember what we were using those for. Perhaps "run now" tasks?

@pmuellr
Copy link
Member Author

pmuellr commented Oct 15, 2020

I happened to think to try the devtools query, but remove the score sort criteria, just leaving the date criteria. When I run that, the old action tasks get returned. Are we including the score in the sort for cases we shouldn't be, like there are no "pinned tasks" to query on?

@gmmorris
Copy link
Contributor

gmmorris commented Oct 15, 2020

Yup, you're right - score came in to allow runNow.
It was the only way to ensure the runNow task is prioritised over the overdue tasks.

We can change the ubq to only include _score when we have a taskId in the queue for runNow.

@pmuellr
Copy link
Member Author

pmuellr commented Oct 15, 2020

We can change the ubq to only include _score when we have a taskId in the queue for runNow.

Ya, that seems right. I thought I noticed a path where we didn't add the score, so thought maybe we were already optionally using it - but I wasn't sure what those paths were.

In this case, there shouldn't be any runNow()'s happening. No new alerts being created, and I only saw one other case of runNow() being used by apm_telemetry, in it's startup code, and only on a kibana version change.

@gmmorris
Copy link
Contributor

gmmorris commented Oct 15, 2020

Ya, that seems right. I thought I noticed a path where we didn't add the score, so thought maybe we were already optionally using it - but I wasn't sure what those paths were.

Not in Task Store - the markAvailableTasksAsClaimed always uses the _score.

@pmuellr
Copy link
Member Author

pmuellr commented Oct 15, 2020

My attempted repro ^^^ wasn't perfect, because it didn't result in any "zombie" tasks, but I did notice it was running a 60/40 mix of (old actions)/ (alerts), when I think it should have been 100/0, given the actions were all older than the runAt's of the alerts. So, the change should be noticeable with a fix.

BTW, I've developed some viz's and dashboards to try to visualize some of this stuff, have it in a gist here which you should should be able to import into Kibana as saved objects: https://gist.github.com/pmuellr/b19420bd52124c50fc3cdb116f56644e

@gmmorris
Copy link
Contributor

It might be worth looking into whether we can nudge the _score somehow so that older tasks have a higher score... I'm guessing this is out of our control, but worth investigating in case this comes back at some point.

@pmuellr
Copy link
Member Author

pmuellr commented Oct 15, 2020

Ya, you could imagine somehow changing the score to reflect the "age" of the task, but then it's not clear how you then boost pinned tasks so they always "beat" those. I'm hesitant to add even more complexity here! :-)

@pmuellr
Copy link
Member Author

pmuellr commented Oct 15, 2020

I have a draft PR here #80692 that only adds the score sort when there are pinned tasks.

When I re-ran my non-perfect repro ^^^ with this, I was unable to get more than ~500 tasks in the tm index. I think that's a good sign - it means that some of the "older" actions that were queueing up previously are now being run, keeping things a bit more "stable" in terms of running "old" things.

OTOH, this will be a probable systemic change - I'd guess that alerts will be running less often than they did before, because these older actions will now be running where they weren't before, consuming the task slots that alerts were using.

I'd guess this is only problematic for cases where there are A LOT of actions queued up, and I hope that would be an uncommon case. Makes me think this may not be critical to ship as in 7.10, we can probably wait till 7.11. A workaround if we notice this in the field would be to disable a number of alerts for a short while, then re-enable them. This seemed to be enough to get older actions running again in cases where I noticed 10's of 1000's of action task zombies.

@gmmorris
Copy link
Contributor

I tend to agree regarding 7.10.
We've had runNow since 7.9, so this isn't introduced by 7.10, and as you said - this is mostly an issue for a large backed up system.
We might see a spike of these when 7.11 is released but only in large deployments with many backed up actions.
We might be able to rely on the health endpoint to express this to the customers after the upgrade.

pmuellr added a commit to pmuellr/kibana that referenced this issue Oct 20, 2020
…tasks

resolves: elastic#80371

Previously, when claiming tasks, we were always sorting the tasks to claim by
the score and then by the time they should be run.  We sort by score to
capture `runNow()` tasks, also referred to internally as "pinned" tasks
in the update by query.

The change in this PR is to only sort by score if there are pinned tasks, and
to not sort by score at all if there aren't any.
@nik9000
Copy link
Member

nik9000 commented Oct 22, 2020

Just kind of a drive by question by someone totally unfamiliar with the process: do y'all have randomized exponential backoff for the task grabbing failures? I feel like tcp style backoff could make the process tolerant of pretty wonky issues. It could change the problem from "we can't run tasks" to "running tasks is slower than I'd like".

pmuellr added a commit that referenced this issue Oct 22, 2020
…tasks (#80692)

resolves: #80371

Previously, when claiming tasks, we were always sorting the tasks to claim by
the score and then by the time they should be run.  We sort by score to
capture `runNow()` tasks, also referred to internally as "pinned" tasks
in the update by query.

The change in this PR is to only sort by score if there are pinned tasks, and
to not sort by score at all if there aren't any.
pmuellr added a commit to pmuellr/kibana that referenced this issue Oct 22, 2020
…tasks (elastic#80692)

resolves: elastic#80371

Previously, when claiming tasks, we were always sorting the tasks to claim by
the score and then by the time they should be run.  We sort by score to
capture `runNow()` tasks, also referred to internally as "pinned" tasks
in the update by query.

The change in this PR is to only sort by score if there are pinned tasks, and
to not sort by score at all if there aren't any.
pmuellr added a commit to pmuellr/kibana that referenced this issue Oct 22, 2020
…tasks (elastic#80692)

resolves: elastic#80371

Previously, when claiming tasks, we were always sorting the tasks to claim by
the score and then by the time they should be run.  We sort by score to
capture `runNow()` tasks, also referred to internally as "pinned" tasks
in the update by query.

The change in this PR is to only sort by score if there are pinned tasks, and
to not sort by score at all if there aren't any.
# Conflicts:
#	x-pack/plugins/task_manager/server/task_store.test.ts
pmuellr added a commit that referenced this issue Oct 22, 2020
…tasks (#80692) (#81495)

resolves: #80371

Previously, when claiming tasks, we were always sorting the tasks to claim by
the score and then by the time they should be run.  We sort by score to
capture `runNow()` tasks, also referred to internally as "pinned" tasks
in the update by query.

The change in this PR is to only sort by score if there are pinned tasks, and
to not sort by score at all if there aren't any.
pmuellr added a commit that referenced this issue Oct 22, 2020
…pinned tasks (#80692) (#81497)

resolves: #80371

Previously, when claiming tasks, we were always sorting the tasks to claim by
the score and then by the time they should be run.  We sort by score to
capture `runNow()` tasks, also referred to internally as "pinned" tasks
in the update by query.

The change in this PR is to only sort by score if there are pinned tasks, and
to not sort by score at all if there aren't any.

Also had to fix type check error after fixing a. merge conflict during the backport:
@kobelb kobelb added the needs-team Issues missing a team label label Jan 31, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants