Merge pull request #157 from pauldg/update_state_stats_query
Update state stats query
hexylena authored May 22, 2024
2 parents f1049d8 + 148c29c commit 74f0be7
Showing 3 changed files with 47 additions and 13 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -17,6 +17,7 @@
- query monthly-users-registered to add YYYY-MM parameter @afgane
- query monthly-jobs to add YYYY-MM and --by_state parameters @afgane
- query total-jobs to add date and --total parameters @afgane
- query job-state-stats: added a --older-than param from @pauldg
- Fixed:
- Replaced hardcoded metric_name with the variable in query_tool-metrics function @sanjaysrikakulam
- improved man pages a tad
33 changes: 23 additions & 10 deletions docs/README.query.md
@@ -9,7 +9,7 @@ Command | Description
[`query data-origin-distribution-summary`](#query-data-origin-distribution-summary) | breakdown of data sources (uploaded vs derived)
[`query datasets-created-daily`](#query-datasets-created-daily) | The min/max/average/p95/p99 of total size of datasets created in a single day.
[`query dataset-usage-and-imports`](#query-dataset-usage-and-imports) | Fetch limited information about which users and histories are using a specific dataset from disk.
[`query destination-queue-run-time The average/median/95%/99% tool spends in queue/run state grouped by tool_id and destination_id.`](#query-destination-queue-run-time-The-average/median/95%/99%-tool-spends-in-queue/run-state-grouped-by-tool_id-and-destination_id.) | query destination-queue-run-time The average/median/95%/99% tool spends in queue/run state grouped by tool_id and destination_id.
[`query destination-queue-run-time`](#query-destination-queue-run-time) | The average/median/95%/99% tool spends in queue/run state grouped by tool and destination.
[`query disk-usage`](#query-disk-usage) | Disk usage per object store.
[`query disk-usage-library`](#query-disk-usage-library) | Retrieve an approximation of the disk usage for a data library
[`query dump-users`](#query-dump-users) | Dump the list of users and their emails
@@ -38,7 +38,7 @@ Command | Description
[`query jobs-queued-internal-by-handler`](#query-jobs-queued-internal-by-handler) | How many queued jobs do not have external IDs, by handler
[`query jobs-ready-to-run`](#query-jobs-ready-to-run) | Find jobs ready to run (Mostly a performance test)
[`query job-state`](#query-job-state) | Get current job state given a job ID
[`query job-state-stats`](#query-job-state-stats) | Shows all jobs states for the last 30 days in a table counted by state
[`query job-state-stats`](#query-job-state-stats) | Shows all jobs states within a time interval (default: 30 days) in a table counted by state
[`query jobs`](#query-jobs) | List jobs ordered by most recently updated. = is required.
[`query large-old-histories`](#query-large-old-histories) | Find large, old histories that probably should be deleted.
[`query largest-collection`](#query-largest-collection) | Returns the size of the single largest collection
@@ -273,18 +273,21 @@ This has built in support for "cleaning up" paths like /data/galaxy/.../dataset_
(1 row)


## query destination-queue-run-time The average/median/95%/99% tool spends in queue/run state grouped by tool_id and destination_id.
## query destination-queue-run-time

([*source*](https://github.com/galaxyproject/gxadmin/search?q=query_destination-queue-run-time_The_average/median/95%/99%_tool_spends_in_queue/run_state_grouped_by_tool_id_and_destination_id.&type=Code))
query destination-queue-run-time - gxadmin query destination-queue-run-time The average/median/95%/99% tool spends in queue/run state grouped by tool_id and destination_id.
([*source*](https://github.com/galaxyproject/gxadmin/search?q=query_destination-queue-run-time&type=Code))
query destination-queue-run-time - The average/median/95%/99% tool spends in queue/run state grouped by tool and destination.

**SYNOPSIS**

gxadmin query destination-queue-run-time The average/median/95%/99% tool spends in queue/run state grouped by tool_id and destination_id.
gxadmin query destination-queue-run-time [--older-than=30] [--seconds]

**NOTES**

$ gxadmin query destination-queue-run-time
Lists queue and run time statistics grouped by tool and destination within a time window (number of days).
Set the window with --older-than, given as a number of days.

$ gxadmin query destination-queue-run-time --older-than='90'
destination_id | tool_id | count | avg | min | median_queue | perc_95_queue | perc_99_queue | max | avg | min
| median_run | perc_95_run | perc_99_run | max
----------------+-----------------+-------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+--------------
@@ -874,15 +877,15 @@ query job-state - Get current job state given a job ID
## query job-state-stats

([*source*](https://github.com/galaxyproject/gxadmin/search?q=query_job-state-stats&type=Code))
query job-state-stats - Shows all jobs states for the last 30 days in a table counted by state
query job-state-stats - Shows all jobs states within a time interval (default: 30 days) in a table counted by state

**SYNOPSIS**

gxadmin query job-state-stats
gxadmin query job-state-stats [--older-than=<interval>]

**NOTES**

Shows all job states for the last 30 days in a table counted by state
Shows all job states within a time interval (default: 30 days) in a table counted by state

Example:
$ gxadmin query job-state-stats
@@ -895,6 +898,16 @@ $ gxadmin query job-state-stats
...
-26 days

The '--older-than=' option takes a value in the PostgreSQL date/time interval
format, see documentation: https://www.postgresql.org/docs/current/functions-datetime.html
Be sure to quote intervals containing spaces:

$ gxadmin query job-state-stats --older-than='2 days'
date | new | running | queued | upload | ok | error | paused | stopped | deleted
------------+-------+---------+--------+--------+-------+-------+--------+---------+---------
2022-04-26 | 921 | 564 | 799 | 0 | 581 | 21 | 1 | 0 | 2
2022-04-25 | 1412 | 1230 | 1642 | 0 | 1132 | 122 | 14 | 0 | 15
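
The quoting advice above can be illustrated with a small sketch. `count_args` is a hypothetical helper, not part of gxadmin; it only reports how many arguments the shell hands over, which is enough to show why an unquoted interval loses its unit.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of gxadmin): prints the number of arguments received.
count_args() {
    echo "$#"
}

# Unquoted: the shell splits on the space, so 'days' arrives as a separate argument.
unquoted=$(count_args --older-than=2 days)

# Quoted: the whole interval stays attached to the option as one argument.
quoted=$(count_args --older-than='2 days')

echo "unquoted: $unquoted argument(s)"
echo "quoted: $quoted argument(s)"
```

With the unquoted form the option value is only `2`, and the trailing `days` is passed on as a separate word.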


## query jobs

26 changes: 23 additions & 3 deletions parts/22-query.sh
@@ -818,12 +818,13 @@ query_recent-jobs() { ##? <hours>: Jobs run in the past <hours> (in any state)
EOF
}

query_job-state-stats() { ## : Shows all jobs states for the last 30 days in a table counted by state
query_job-state-stats() { ##? [--older-than=<interval>]: Shows all jobs states within a time interval (default: 30 days) in a table counted by state
meta <<-EOF
ADDED: 19
UPDATED: 22
EOF
handle_help "$@" <<-EOFhelp
Shows all job states for the last 30 days in a table counted by state
Shows all job states within a time interval (default: 30 days) in a table counted by state
Example:
$ gxadmin query job-state-stats
@@ -836,8 +837,27 @@ query_job-state-stats() { ## : Shows all jobs states for the last 30 days in a t
...
-26 days
The '--older-than=' option takes a value in the PostgreSQL date/time interval
format, see documentation: https://www.postgresql.org/docs/current/functions-datetime.html
Be sure to quote intervals containing spaces:
$ gxadmin query job-state-stats --older-than='2 days'
date | new | running | queued | upload | ok | error | paused | stopped | deleted
------------+-------+---------+--------+--------+-------+-------+--------+---------+---------
2022-04-26 | 921 | 564 | 799 | 0 | 581 | 21 | 1 | 0 | 2
2022-04-25 | 1412 | 1230 | 1642 | 0 | 1132 | 122 | 14 | 0 | 15
EOFhelp

fields="new=1;running=2;queued=3;upload=4;ok=5;error=6;paused=7;stopped=8;deleted=9"
tags="date=0"

interval="AND job.create_time > (timezone('UTC', now()) - '30 days'::INTERVAL)"
if [[ -n "$arg_older_than" ]]; then
interval="AND job.create_time > (timezone('UTC', now()) - '$arg_older_than'::INTERVAL)"
fi


read -r -d '' QUERY <<-EOF
SELECT
date_trunc ('day', job.create_time)::date as date,
@@ -855,7 +875,7 @@ EOFhelp
job_state_history
WHERE
job_state_history.job_id = job.id
and job.create_time >= now() - INTERVAL '30 DAYS'
$interval
GROUP BY
date
ORDER BY

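The interval handling in `query_job-state-stats` can be sketched in isolation as follows. This is a minimal illustration, not the code from the commit: `build_interval_clause` is a hypothetical function, and in gxadmin the value arrives in `arg_older_than` from its own option parser rather than as a positional argument.

```shell
#!/usr/bin/env bash

# Hypothetical stand-alone function (not in gxadmin): builds the WHERE-clause
# fragment, falling back to the 30-day default when no interval is given.
build_interval_clause() {
    local older_than="${1:-30 days}"
    echo "AND job.create_time > (timezone('UTC', now()) - '${older_than}'::INTERVAL)"
}

# No argument: the default window applies.
default_clause=$(build_interval_clause)

# Explicit PostgreSQL interval, quoted because it contains a space.
custom_clause=$(build_interval_clause '2 days')

echo "$default_clause"
echo "$custom_clause"
```

As in the commit, the value is interpolated into the SQL text without escaping, so it should only ever come from a trusted administrator.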