5.0.0

@rra rra released this 23 Mar 04:43
· 370 commits to main since this release
9d98985

Backwards-incompatible changes

  • Settings are now handled with Pydantic and undergo much stricter validation. In particular, the Slack webhook URL must now be a valid URL if provided.
  • To enable stricter and more useful Pydantic validation of flock specifications, the syntax for creating a flock has changed. business is now a dictionary: the restart option has moved under it, the type of business is specified with its type key, and the business configuration options have moved under its options key. Options that are not applicable to a given business type are now rejected.
  • The jupyter.url_prefix option is now just url_prefix, and jupyter.image is now just image. The names of the settings under image have also changed.
  • The TAPQueryRunner options tap_sync and tap_query_set are now just sync and query_set.
  • lab_settle_time is no longer supported as a configuration option for the businesses that spawn a Nublado lab. It defaulted to 0 and we never set it.
  • JupyterJitterLoginLoop has been retired. Instead, set the jitter option on JupyterPythonLoop.
  • JupyterLoginLoop has been merged with JupyterPythonLoop. The only difference was that the former created no lab session and ran no code, which seems pointless and not worth the distinction. JupyterPythonLoop runs a simple addition by default, which should be an improvement over JupyterLoginLoop in every likely situation.
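
The changes above can be sketched as a small migration helper. This is only an illustration, not part of mobu: it assumes the pre-5.0.0 layout kept business (a string), restart, and options as top-level keys of the flock specification, and the function name migrate_flock is hypothetical. It does not rename the settings under image (the new names are not listed here) and does not add a jitter value when converting JupyterJitterLoginLoop, so both still need manual attention.

```python
from copy import deepcopy
from typing import Any, Dict

# Business types retired in 5.0.0; both are replaced by JupyterPythonLoop.
RETIRED_TYPES = {"JupyterLoginLoop", "JupyterJitterLoginLoop"}

# TAPQueryRunner option renames named in the release notes.
RENAMED_OPTIONS = {"tap_sync": "sync", "tap_query_set": "query_set"}


def migrate_flock(old: Dict[str, Any]) -> Dict[str, Any]:
    """Convert an (assumed) pre-5.0.0 flock spec to the 5.0.0 layout."""
    new = deepcopy(old)  # never modify the caller's dictionary
    business_type = new.pop("business")
    options = new.pop("options", {})

    # business is now a dictionary keyed by type, restart, and options.
    if business_type in RETIRED_TYPES:
        # For JupyterJitterLoginLoop, set the jitter option by hand afterwards.
        business_type = "JupyterPythonLoop"
    business: Dict[str, Any] = {"type": business_type}
    if "restart" in new:
        business["restart"] = new.pop("restart")

    # lab_settle_time is no longer supported.
    options.pop("lab_settle_time", None)

    # jupyter.url_prefix and jupyter.image lose their jupyter prefix.
    jupyter = options.pop("jupyter", None)
    if isinstance(jupyter, dict):
        for key in ("url_prefix", "image"):
            if key in jupyter:
                options[key] = jupyter.pop(key)
        if jupyter:
            # Anything else is left in place for manual review.
            options["jupyter"] = jupyter
    elif jupyter is not None:
        options["jupyter"] = jupyter

    for old_name, new_name in RENAMED_OPTIONS.items():
        if old_name in options:
            options[new_name] = options.pop(old_name)

    business["options"] = options
    new["business"] = business
    return new
```

For example, a specification with business set to "TAPQueryRunner", restart set to true, and a tap_sync option comes back with a business dictionary holding type, restart, and an options dictionary that contains sync.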

New features

  • When the production logging profile is used, the messages from monkeys are no longer reported to the main mobu log, only to the individual monkey logs. This should produce considerably less noise in external log aggregators.
  • The notebook being run is now included in all Slack error reports, not just for code execution failures.
  • The API documentation now shows only the relevant options for the type of business when showing how to create a flock.
  • Add support for running a business once and returning its results, via a POST to the new /run endpoint.
  • Add support for the new Nublado lab controller (see SQR-066).
  • The time a business pauses after a failure before it is restarted is now configurable with the error_idle_time option and defaults to 10 minutes (instead of 1 minute) for Nublado businesses, since this is how long JupyterHub will wait for a lab to spawn before giving up.
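
As a rough illustration of the new /run endpoint, the sketch below builds (but does not send) a POST request with the Python standard library. Only the /run path and the type and options keys under business come from these notes. The base URL, the bearer-token Authorization header, the user and scopes fields, and the function name build_run_request are all assumptions for the example, so check the API documentation for the real request schema.

```python
import json
from typing import Any, Dict, List, Optional
from urllib.request import Request


def build_run_request(
    base_url: str,
    token: str,
    business_type: str,
    options: Optional[Dict[str, Any]] = None,
    username: str = "bot-mobu-example",  # assumed field, hypothetical value
    scopes: Optional[List[str]] = None,  # assumed field
) -> Request:
    """Build a POST to <base_url>/run that runs one business once."""
    payload = {
        # user and scopes are assumptions about the request body.
        "user": {"username": username},
        "scopes": scopes or [],
        # type and options under business follow the new flock syntax.
        "business": {"type": business_type, "options": options or {}},
    }
    return Request(
        base_url.rstrip("/") + "/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Bearer-token authentication is an assumption.
            "Authorization": f"bearer {token}",
        },
        method="POST",
    )


# Hypothetical deployment URL and token, for illustration only.
request = build_run_request(
    "https://example.org/mobu", "example-token", "JupyterPythonLoop"
)
```

Passing the resulting object to urllib.request.urlopen would send it. The response format is not described in these notes.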

Bug fixes

  • The dp0.2 TAPQueryRunner query set is now lighter-weight and will consume less memory and CPU to execute, hopefully reducing timeout errors.
  • Cell numbering in error reports is now across all cells, not just code cells.
  • TAPQueryRunner no longer creates a TAP client in its __init__ method, since creating a TAP client makes HTTP requests to the TAP server that can fail, and such a failure could crash mobu. Instead, it creates the TAP client during startup and handles exceptions properly so that they're reported to Slack.
  • Business failures during startup are now counted as a failed execution so that a business that fails repeatedly in startup doesn't report 100% success in the flock summary.
  • The code run by JupyterPythonLoop and NotebookRunner to get the Kubernetes node on which the lab is running now uses lsst.rsp.get_node instead of the deprecated rubin_jupyter_utils.lab.notebook.utils.get_node.
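
The TAPQueryRunner and startup-failure fixes above follow a common pattern: keep the constructor free of network I/O, create the client in a startup step, and count any startup exception as a failed execution. The sketch below shows that pattern in isolation. It is not mobu's code: the class SketchBusiness, its attributes, and the make_client callable are hypothetical, and a real implementation would report the error to Slack rather than store it in a list.

```python
import asyncio
from typing import Callable, List, Optional


class SketchBusiness:
    """Hypothetical stand-in for a business such as TAPQueryRunner."""

    def __init__(self, make_client: Callable[[], object]) -> None:
        # The constructor only stores configuration and makes no requests,
        # so constructing the business cannot fail on a network error.
        self._make_client = make_client
        self.client: Optional[object] = None
        self.success_count = 0
        self.failure_count = 0
        self.errors: List[str] = []

    async def startup(self) -> None:
        # Client creation may contact the server, so it happens here.
        self.client = self._make_client()

    async def execute(self) -> None:
        # Placeholder for the real work (queries, notebook cells, ...).
        self.success_count += 1

    async def run_once(self) -> bool:
        """Run startup if needed, then one execution; return success."""
        try:
            if self.client is None:
                await self.startup()
            await self.execute()
        except Exception as exc:
            # A failure in startup counts as a failed execution too.
            self.failure_count += 1
            self.errors.append(f"{type(exc).__name__}: {exc}")
            return False
        return True

    @property
    def success_rate(self) -> Optional[float]:
        total = self.success_count + self.failure_count
        return self.success_count / total if total else None


def failing_client() -> object:
    raise ConnectionError("TAP server unreachable")


business = SketchBusiness(failing_client)
succeeded = asyncio.run(business.run_once())
```

Because the failed startup increments failure_count, a business that never gets past startup has a success rate of 0 rather than an undefined or perfect one.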

Other changes

  • Slightly improve logging when monkeys are shut down due to errors.
  • mobu's internals have been extensively refactored following the design in SQR-072 to hopefully make future maintenance easier.

What's Changed

  • [neophile] Update dependencies by @sqrbot in #171
  • [neophile] Update dependencies by @sqrbot in #172
  • [neophile] Update dependencies by @sqrbot in #173
  • [neophile] Update dependencies by @sqrbot in #174
  • Bump python from 3.10.6-slim-bullseye to 3.10.7-slim-bullseye by @dependabot in #175
  • [neophile] Update dependencies by @sqrbot in #176
  • [neophile] Update dependencies by @sqrbot in #177
  • [neophile] Update dependencies by @sqrbot in #178
  • [neophile] Update dependencies by @sqrbot in #179
  • [neophile] Update dependencies by @sqrbot in #180
  • [neophile] Update dependencies by @sqrbot in #182
  • [neophile] Update dependencies by @sqrbot in #183
  • [neophile] Update dependencies by @sqrbot in #185
  • [neophile] Update dependencies by @sqrbot in #186
  • [neophile] Update dependencies by @sqrbot in #187
  • [neophile] Update dependencies by @sqrbot in #188
  • [neophile] Update dependencies by @sqrbot in #189
  • [neophile] Update dependencies by @sqrbot in #190
  • [neophile] Update dependencies by @sqrbot in #192
  • [neophile] Update dependencies by @sqrbot in #193
  • [neophile] Update dependencies by @sqrbot in #194
  • [neophile] Update dependencies by @sqrbot in #195
  • [neophile] Update dependencies by @sqrbot in #196
  • [neophile] Update dependencies by @sqrbot in #197
  • Bump python from 3.10.7-slim-bullseye to 3.11.1-slim-bullseye by @dependabot in #191
  • [neophile] Update dependencies by @sqrbot in #198
  • [neophile] Update dependencies by @sqrbot in #200
  • Bump docker/build-push-action from 3 to 4 by @dependabot in #201
  • Bump python from 3.11.1-slim-bullseye to 3.11.2-slim-bullseye by @dependabot in #203
  • [neophile] Update dependencies by @sqrbot in #202
  • [neophile] Update dependencies by @sqrbot in #204
  • [neophile] Update dependencies by @sqrbot in #206
  • Downscale DP0.2 querys to approx. DP0.1 size by @fritzm in #205
  • [neophile] Update dependencies by @sqrbot in #207
  • DM-38339: Use the new Safir Slack webhook support by @rra in #208
  • DM-38339: Convert to pyproject.toml by @rra in #209
  • DM-38339: Update GitHub Actions configuration by @rra in #210
  • DM-38339: Update mypy configuration and type annotations by @rra in #211
  • DM-38339: Switch to backtracking resolver by @rra in #212
  • DM-38339: Update to latest Safir, use Settings for config by @rra in #213
  • DM-38339: Use relative imports for test modules by @rra in #214
  • DM-38339: Remove types from docstrings by @rra in #215
  • DM-38339: Redo monkey logging and state machine by @rra in #216
  • DM-38339: Fix some coding style issues in TAP code by @rra in #217
  • DM-38339: Add notebook to Slack error reports by @rra in #218
  • [neophile] Update dependencies by @sqrbot in #219
  • DM-38339: Reorganize source and clean up business type structure by @rra in #220
  • DM-38339: Remove lab_settle_time configuration by @rra in #221
  • DM-38339: Eliminate JupyterJitterLoginLoop by @rra in #222
  • DM-38339: Merge JupyterLoginLoop and JupyterPythonLoop by @rra in #223
  • DM-38339: Handle failures during TAPQueryRunner setup by @rra in #224
  • DM-38339: Refactor mobu state management by @rra in #225
  • DM-38339: Add support for running a business once by @rra in #226
  • DM-38408: Add support for the new Nublado lab controller by @rra in #227

Full Changelog: 4.5.0...5.0.0