Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remote: ensure all remote commands use a platform config #5152

Merged
merged 3 commits into from
Sep 28, 2022

Conversation

oliver-sanders
Copy link
Member

@oliver-sanders oliver-sanders commented Sep 21, 2022

Closes #5153

There were a couple of _private interfaces left behind in cylc.flow.remote. These interfaces did not use platform configuration, which is why they were private.

I've replaced these with convenience methods which use the localhost platform. Now all remote commands should be run with either a job platform configuration or the localhost platform configuration.

This closes #5153 by using the localhost platform (allowing things like the ssh command to be configured) for the two remaining uses:

  • Evaluation of host selection rankings (via cylc psutil).
  • The detect old contact file check, which tests whether a workflow is still running on the server recored in the contact file.

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.cfg and conda-environment.yml.
  • Functionalities individually tested (but not by CI, please test locally).
  • CHANGES.md entry included if this is a change that can affect users
  • Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/542.
  • If this is a bug fix, PRs raised to both master and the relevant maintenance branch.

@oliver-sanders oliver-sanders added this to the cylc-8.0.3 milestone Sep 21, 2022
@oliver-sanders oliver-sanders self-assigned this Sep 21, 2022
@oliver-sanders oliver-sanders force-pushed the remote-command-platform-config branch from 554ea02 to 1eaf427 Compare September 26, 2022 10:27
@oliver-sanders oliver-sanders changed the base branch from master to 8.0.x September 26, 2022 10:27
@oliver-sanders oliver-sanders added the bug Something is wrong :( label Sep 26, 2022
* Remote interfaces (SSH & rsync) now all require a platform
  object for configuration purposes (e.g. "ssh command").
* Convenience interfaces added for remote calls to Cylc servers
  which use the localhost platform configuration. These
  are now used for:
  * `cylc play` command re-invocation on a Cylc server.
  * Evaluation of host selection rankings (via `cylc psutil`).
  * The detect old contact file check, which tests whether
    a workflow is still running on the server recored in the
    contact file.
@oliver-sanders oliver-sanders force-pushed the remote-command-platform-config branch from 1eaf427 to 1b15b2f Compare September 26, 2022 14:05
Copy link
Member Author

@oliver-sanders oliver-sanders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are the intended behaviour changes. Please test for yourself to confirm they have been correctly implemented.

Example testing strategy:

[scheduler]
    [[run hosts]]
        # configure a couple of hosts and a ranking to get host-select to kick in
        available = my-host-one, my-host-two
        ranking = """
            virtual_memory().available
        """

[platforms]
    [[localhost]]
        # try running remote jobs with SSH comms to make sure it still works
        communication method = ssh
        # modify the SSH command from default / site settings, look for this in debug messages
        # to ensure this config is being picked up where appropriate.
        ssh command = ssh -oBatchMode=yes -oConnectTimeout=42

@@ -533,7 +533,7 @@ def _get_metrics(hosts, metrics, data=None):
}
for host in hosts:
if is_remote_host(host):
proc_map[host] = _remote_cylc_cmd(cmd, host=host, **kwargs)
proc_map[host] = cylc_server_cmd(cmd, host=host, **kwargs)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Behaviour change: this now uses the localhost platform config.

It now also logs the command to debug level.

'cylc path': cylc_path,
'use login shell': login_sh,
}
proc = remote_cylc_cmd(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No behaviour change, the same arguments are passed through in a different way.

@@ -393,7 +393,7 @@ def _distribute(host, workflow_id_raw, workflow_id):
cmd.append("--host=localhost")

# Re-invoke the command
_remote_cylc_cmd(cmd, host=host)
cylc_server_cmd(cmd, host=host)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Behaviour change: This now uses the localhost platform config.

It now also logs the command to debug level.

@@ -406,7 +406,7 @@ def _is_process_running(
metric = f'[["Process", {pid}]]'
if is_remote_host(host):
cmd = ['psutil']
cmd = _construct_ssh_cmd(cmd, host)
cmd = construct_cylc_server_ssh_cmd(cmd, host)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No behaviour change: this interface defaulted to localhost before.

@oliver-sanders oliver-sanders marked this pull request as ready for review September 26, 2022 14:13
Copy link
Member

@MetRonnie MetRonnie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With your example I do indeed get (printed to stderr i.e. before the scheduler log file has been created)

DEBUG - running command:
    $ ssh -oBatchMode=yes -oConnectTimeout=42 \
        my-host-one.example.com env CYLC_VERSION=8.0.3.dev bash \
        --login -c 'exec "$0" "$@"' cylc psutil
DEBUG - running command:
    $ ssh -oBatchMode=yes -oConnectTimeout=42 \
        my-host-two.example.com env CYLC_VERSION=8.0.3.dev bash \
        --login -c 'exec "$0" "$@"' cylc psutil
DEBUG - running command:
    $ ssh -oBatchMode=yes -oConnectTimeout=42 -n my-host-one env \
        CYLC_VERSION=8.0.3.dev bash --login -c 'exec "$0" "$@"' \
        cylc play temp4 -vv --host=localhost

cylc/flow/remote.py Outdated Show resolved Hide resolved
Co-authored-by: Ronnie Dutta <[email protected]>
Copy link
Contributor

@datamel datamel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested and behaving as expected (used the review example, thanks for that).
A couple of very minor points. Happy for this to go in.

cylc/flow/remote.py Outdated Show resolved Hide resolved
cylc/flow/remote.py Outdated Show resolved Hide resolved
Copy link
Contributor

@datamel datamel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, good to go once tests pass.

@MetRonnie MetRonnie merged commit ece609d into cylc:8.0.x Sep 28, 2022
@oliver-sanders oliver-sanders deleted the remote-command-platform-config branch September 28, 2022 10:11
datamel pushed a commit that referenced this pull request Sep 29, 2022
* fix reversed data-store edge source-target (#5156)

Co-authored-by: David Sutherland <[email protected]>

* `log_vc_info`: Redirect diff straight to file to avoid blocking pipe (#5139)

* log_vc_info: redirect diff straight to file to avoid blocking pipe
* Update changelog

* A no-flow task should not merge and retrigger incomplete children (#5146)

Prevent no-flow merge.

* remote-install: add "ana/" to the default install list (#5137)

* The `ana/` directory is used by `rose_ana` to load comparison modules.
* These are typically run where the data is generated which is often
  remote.
* These directories typically contain a small number of Python files.

* Auto bump dev version on release

* Run GH Actions tests on push to `8.*.x` branches

* Db store force triggered (#5023)

Store force-triggered flag in the run DB.

* Add a new functional test.
* Remove some redundant DB updates.
* Update change log.

* remote: ensure all remote commands use a platform config (#5152)

Remote interfaces (SSH & rsync) now all require a platform object for configuration purposes (e.g. "ssh command").
Convenience interfaces added for remote calls to Cylc servers which use the localhost platform configuration. These are now used for:
* `cylc play` command re-invocation on a Cylc server.
* Evaluation of host selection rankings (via `cylc psutil`).
* The detect old contact file check, which tests whether a workflow is still running on the server recored in the contact file.

Co-authored-by: David Sutherland <[email protected]>
Co-authored-by: Ronnie Dutta <[email protected]>
Co-authored-by: Hilary James Oliver <[email protected]>
wxtim added a commit to wxtim/cylc that referenced this pull request Oct 3, 2022
…l-failure.bk

* upstream/8.0.x:
  remote: ensure all remote commands use a platform config (cylc#5152)
  Db store force triggered (cylc#5023)
  Run GH Actions tests on push to `8.*.x` branches
  Auto bump dev version on release
  remote-install: add "ana/" to the default install list (cylc#5137)
  A no-flow task should not merge and retrigger incomplete children (cylc#5146)
  `log_vc_info`: Redirect diff straight to file to avoid blocking pipe (cylc#5139)
  fix reversed data-store edge source-target (cylc#5156)
  Fix Jinja2 support if HOME undefined.
  Assume Jinja2 might be used in global-tests.cylc.
  Fix post-reload trigger. (cylc#5104)
  Bump dev version
  Update changelog
  workflow_state xtrigger: infer run num
  Type annotations
  Prepare release 8.0.2
  Remove HOME Env Variable from get_remote_workflow_run_dir (cylc#5115)
wxtim added a commit that referenced this pull request Oct 6, 2022
* fix reversed data-store edge source-target (#5156)

Co-authored-by: David Sutherland <[email protected]>

* `log_vc_info`: Redirect diff straight to file to avoid blocking pipe (#5139)

* log_vc_info: redirect diff straight to file to avoid blocking pipe
* Update changelog

* A no-flow task should not merge and retrigger incomplete children (#5146)

Prevent no-flow merge.

* remote-install: add "ana/" to the default install list (#5137)

* The `ana/` directory is used by `rose_ana` to load comparison modules.
* These are typically run where the data is generated which is often
  remote.
* These directories typically contain a small number of Python files.

* Auto bump dev version on release

* Run GH Actions tests on push to `8.*.x` branches

* Db store force triggered (#5023)

Store force-triggered flag in the run DB.

* Add a new functional test.
* Remove some redundant DB updates.
* Update change log.

* remote: ensure all remote commands use a platform config (#5152)

Remote interfaces (SSH & rsync) now all require a platform object for configuration purposes (e.g. "ssh command").
Convenience interfaces added for remote calls to Cylc servers which use the localhost platform configuration. These are now used for:
* `cylc play` command re-invocation on a Cylc server.
* Evaluation of host selection rankings (via `cylc psutil`).
* The detect old contact file check, which tests whether a workflow is still running on the server recored in the contact file.

* host select: allow unary operators in ranking expressions (#5151)

* host select: allow unary operators in ranking expressions

* Whitelist unary operators (required for expressions like `-1 * x`)
  in `[scheduler][run hosts]ranking` expressions.
* Improve error formatting.

* Update tests/unit/test_exceptions.py

Co-authored-by: Tim Pillinger <[email protected]>

Co-authored-by: Tim Pillinger <[email protected]>

* reinstall changes to `rose-suite.conf` (#5125)

reinstall: ensure rose-suite.conf changes trigger reinstallation

Co-authored-by: David Sutherland <[email protected]>
Co-authored-by: Ronnie Dutta <[email protected]>
Co-authored-by: Hilary James Oliver <[email protected]>
Co-authored-by: Tim Pillinger <[email protected]>
datamel pushed a commit to datamel/cylc-flow that referenced this pull request Oct 19, 2022
* fix reversed data-store edge source-target (cylc#5156)

Co-authored-by: David Sutherland <[email protected]>

* `log_vc_info`: Redirect diff straight to file to avoid blocking pipe (cylc#5139)

* log_vc_info: redirect diff straight to file to avoid blocking pipe
* Update changelog

* A no-flow task should not merge and retrigger incomplete children (cylc#5146)

Prevent no-flow merge.

* remote-install: add "ana/" to the default install list (cylc#5137)

* The `ana/` directory is used by `rose_ana` to load comparison modules.
* These are typically run where the data is generated which is often
  remote.
* These directories typically contain a small number of Python files.

* Auto bump dev version on release

* Run GH Actions tests on push to `8.*.x` branches

* Db store force triggered (cylc#5023)

Store force-triggered flag in the run DB.

* Add a new functional test.
* Remove some redundant DB updates.
* Update change log.

* remote: ensure all remote commands use a platform config (cylc#5152)

Remote interfaces (SSH & rsync) now all require a platform object for configuration purposes (e.g. "ssh command").
Convenience interfaces added for remote calls to Cylc servers which use the localhost platform configuration. These are now used for:
* `cylc play` command re-invocation on a Cylc server.
* Evaluation of host selection rankings (via `cylc psutil`).
* The detect old contact file check, which tests whether a workflow is still running on the server recored in the contact file.

* host select: allow unary operators in ranking expressions (cylc#5151)

* host select: allow unary operators in ranking expressions

* Whitelist unary operators (required for expressions like `-1 * x`)
  in `[scheduler][run hosts]ranking` expressions.
* Improve error formatting.

* Update tests/unit/test_exceptions.py

Co-authored-by: Tim Pillinger <[email protected]>

Co-authored-by: Tim Pillinger <[email protected]>

* reinstall changes to `rose-suite.conf` (cylc#5125)

reinstall: ensure rose-suite.conf changes trigger reinstallation

Co-authored-by: David Sutherland <[email protected]>
Co-authored-by: Ronnie Dutta <[email protected]>
Co-authored-by: Hilary James Oliver <[email protected]>
Co-authored-by: Tim Pillinger <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is wrong :(
Projects
None yet
Development

Successfully merging this pull request may close these issues.

remote: interfaces with no platform configuration
3 participants