Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove channels #3515

Open
benclifford opened this issue Jul 7, 2024 · 3 comments
Open

Remove channels #3515

benclifford opened this issue Jul 7, 2024 · 3 comments

Comments

@benclifford
Copy link
Collaborator

benclifford commented Jul 7, 2024

Overview

Parsl has a facility called Channels intended to support use cases such as running a workflow on your laptop but with task execution on a supercomputer. This facility never evolved beyond prototype stage, and effort from University of Chicago and University of Illinois was rapidly distracted into the funcX project which aims (with substantially more code and developer time) to provide roughly the same facility - now named Globus Compute and developed by professional programmers rather than an academic research team.

The presence of channels as quasi-abandonware inside the modern Parsl codebase is a painful drain on both user and developer time: Users continue to be fooled into believing that this abandoned prototype facility, rather than the results of the funcX project, should be used for remote execution use cases. Developer time continues to be taken up dealing with the intricate lacing of channel handling through the core codebase.

Channels should be removed from Parsl, with the default behaviour of LocalChannel becoming the only behaviour.

Proposed timeline

I propose this timeline as a default, unless a consensus forms for some other timeline. In the absence of any other consensus forming in the comments of this issue, I'll make this timeline happen.

now .. 7th August 2024: people paying attention to issues get to comment on this issue, including offering entirely different alternative paths forward
7th August 2024: channel code is marked as going away in the codebase, in some way that is hard to avoid for users (for example, renaming all the user-facing channel classes). Users should be encouraged to visit this issue and comment, and to seek alternatives such as Globus Compute.
7th November 2024: channel code is removed from Parsl

Why can't this become an "abandoned component sitting in a directory"

The channel facility is not componentised - it contains prototype quality code inside the core of Parsl

What about the AdHocProvider which attempts to use multiple channels to run on a cluster which has no resource manager?

This should go away too - without channels it is of little use. Users should be encouraged to find other ways to run a cluster without any resource manager. It should not be parsl's job to manage this aspect of a user's cluster.

What if someone wants to take on Channel tech sponsorship?

This could happen, but I think it's unlikely.

The tech sponsor would need to be responsible for tidying up the architecture and implementing those fixes, with noise such as "why are channels initialised or not based on the presence of script_dir attribute on a provider?", test implementations and fixups such as testing parsl without shared file systems in a channel-like environment, and substantially more fundamental questions such as "how is htex supposed to work when channels do not provide a channel for htex network connections?".

For the more fundamental problems, this starts to sound like making a re-run of Globus Compute, which the funcX project has shown can occupy several full time developers for several years. I think probably that tech sponsors time would be better places working to get Parsl playing nicely with other remote execution technology - for example Globus Compute.

I just tagged a load of issues with I just tagged a bunch of issues with channels to give an overview of the sort of stuff a Channel tech sponsor would need to address.

An indicator for a successful channel tech sponsor would be substantial progress on these issues by the removal date in the timetable above.

@benclifford benclifford added enhancement outreachy Good initial contributions for Outreachy applicants channels and removed outreachy Good initial contributions for Outreachy applicants labels Jul 7, 2024
@yadudoc
Copy link
Member

yadudoc commented Jul 9, 2024

@benclifford I just wanted to put my vote here to remove channels entirely.

From the user perspective the only folks I think we need to make sure know of the change would be:

  1. folks using SSHChannels ought to gently pushed to make a switch over to globus-compute.
  2. AdHocChannels will have no alternative really beyond your recommendation to use a cluster resource manager (slurm..)

@astro-friedel
Copy link

@benclifford I agree that channels are no longer needed and can be very finnicky to use.

benclifford added a commit that referenced this issue Aug 7, 2024
This is described in issue #3515.

Issue #3515 has not received any comments arguing in favour of retaining
channels and the AdHocprovider, and several in support of removing them,
 and so this PR takes a heavy handed approach that is well on the way to
the end goal of #3515 of deleting channels and the AdHocProvider entirely:

Channels except LocalChannel are renamed, so that any users of
other channels will have to make a change to their code and
actively observe the word "Deprecated" in the name.

The AdHocProvider is renamed in the same way.

Most documentation (but not docstrings) about channels and the
ad-hoc provider is removed or replaced with a link to issue

Tests which much be manually run, and so in effect are never run and
shouldn't be expected to work now - parsl/tests/manual_tests and
parsl/tests/integration/ - are deleted rather than fixed to follow
the above naming change.

The tests for SSH channels and the AdHocProvider that run in CI are modified
to continue passing.

Exposure of the deprecated components via top level parsl.providers and
parsl.channels re-export is removed. To use this components, the deprecated
modules must be imported directly.
benclifford added a commit that referenced this issue Aug 8, 2024
This is described in issue #3515.

Issue #3515 has not received any comments arguing in favour of retaining
channels and the AdHocprovider, and several in support of removing them,
 and so this PR takes a heavy handed approach that is well on the way to
the end goal of #3515 of deleting channels and the AdHocProvider entirely:

Channels except LocalChannel are renamed, so that any users of
other channels will have to make a change to their code and
actively observe the word "Deprecated" in the name.

The AdHocProvider is renamed in the same way.

Most documentation (but not docstrings) about channels and the
ad-hoc provider is removed or replaced with a link to issue

Tests which much be manually run, and so in effect are never run and
shouldn't be expected to work now - parsl/tests/manual_tests and
parsl/tests/integration/ - are deleted rather than fixed to follow
the above naming change.

The tests for SSH channels and the AdHocProvider that run in CI are modified
to continue passing.

Exposure of the deprecated components via top level parsl.providers and
parsl.channels re-export is removed. To use this components, the deprecated
modules must be imported directly.
benclifford pushed a commit that referenced this issue Aug 16, 2024
Removed paramiko from requirements.txt and added it as an optional module in setup.py. Added OptionalModuleMissing errors for the ssh channel files for when usage is attempted without the required paramiko module being installed.

Changed Behaviour:
If users have code that depends on the ssh channels, they may need to opt in to that module.

Prepares for #3515
benclifford added a commit that referenced this issue Oct 23, 2024
This is part of staged removal of channels - see issue #3515 and PR #3650.
benclifford added a commit that referenced this issue Oct 24, 2024
This is part of staged removal of channels - see issue #3515 and PR #3650.

Co-authored-by: Kevin Hunter Kesling <[email protected]>
benclifford added a commit that referenced this issue Oct 24, 2024
These were introduced to help make channels easier to typecheck in the
DFK, but the DFK code continued to use hasattr tests for 'channel' and
'channels' attributes rather than these marker types.

This is part of channel removal, issue #3515
github-merge-queue bot pushed a commit that referenced this issue Nov 7, 2024
# Changed Behaviour

Functionality removal for anyone using the AdHocProvider

## Type of change

- New feature
- Code maintenance/cleanup
github-merge-queue bot pushed a commit that referenced this issue Nov 7, 2024
This is part of staged dissolution of LocalChannel as part of #3515
channel removal.

# Changed Behaviour

Removed (but I think generally unused) functionality and configuration
option

## Type of change

- New feature
- Code maintenance/cleanup
benclifford added a commit that referenced this issue Nov 7, 2024
See PR #3677 for removal of SSH channel code, and issue #3515
for an overall channel removal.
benclifford added a commit that referenced this issue Nov 7, 2024
Prior to this PR, Channel.execute_wait took a parameter to modify the
unix environment of the newly executed process. This is unused in current
Parsl. As part of the implementation of that, LocalChannel cached the
unix environment at object initialization, and used that as the base to
override with any supplied execute_wait environment parameter.

Post this PR:

The `env` parameter of execute_wait is removed.

LocalChannel does not cache the environment any more. The executed process
will inherit the parent process environment as of the point of execution
not object initialization.

This is a behaviour change: if the workflow process changes its unix
environment after creating the configuration objects, then prior to this
PR, executed processes would not observe that change. Post this PR,
executed processes will observe that change. I hope this is not a big
deal.
github-merge-queue bot pushed a commit that referenced this issue Nov 7, 2024
See PR #3677 for removal of SSH channel code, and issue #3515 for an
overall channel removal.

## Type of change

- Code maintenance/cleanup
github-merge-queue bot pushed a commit that referenced this issue Nov 7, 2024
…3682)

Prior to this PR, Channel.execute_wait took a parameter to modify the
unix environment of the newly executed process. This is unused in
current Parsl. As part of the implementation of that, LocalChannel
cached the unix environment at object initialization, and used that as
the base to override with any supplied execute_wait environment
parameter.

Post this PR:

The `env` parameter of execute_wait is removed.

LocalChannel does not cache the environment any more. The executed
process will inherit the parent process environment as of the point of
execution not object initialization.

# Changed Behaviour

This is a behaviour change: if the workflow process changes its unix
environment after creating the configuration objects, then prior to this
PR, executed processes would not observe that change. Post this PR,
executed processes will observe that change. I hope this is not a big
deal.

## Type of change

- Code maintenance/cleanup
benclifford added a commit that referenced this issue Nov 7, 2024
Prior to this PR, if the argument defaulted to None, then the DFK would
set a script dir based on the workflow rundir.

Post this PR, this default behaviour is the only behaviour and the user
cannot specify an overriding value.

Future PRs will remove channel level script directories entirely.
This is part of issues #3515 removing channels,
benclifford added a commit that referenced this issue Nov 7, 2024
LocalChannel, the only extant channel, does not have any close()
behaviour.

This is part of #3515 channel removal.
benclifford added a commit that referenced this issue Nov 7, 2024
Prior to this PR, if the argument defaulted to None, then the DFK would
set a script dir based on the workflow rundir.

Post this PR, this default behaviour is the only behaviour and the user
cannot specify an overriding value.

Future PRs will remove channel level script directories entirely.
This is part of issues #3515 removing channels,
github-merge-queue bot pushed a commit that referenced this issue Nov 7, 2024
Prior to this PR, if the argument defaulted to None, then the DFK would
set a script dir based on the workflow rundir.

Post this PR, this default behaviour is the only behaviour and the user
cannot specify an overriding value.

Future PRs will remove channel level script directories entirely. This
is part of issues #3515 removing channels,

# Changed Behaviour

This PR removes a little-used user configuration option.

## Type of change

- Code maintenance/cleanup
github-merge-queue bot pushed a commit that referenced this issue Nov 7, 2024
LocalChannel, the only extant channel, does not have any close()
behaviour.

This is part of #3515 channel removal.

## Type of change

- Code maintenance/cleanup
benclifford added a commit that referenced this issue Nov 7, 2024
This is part of moving LocalChannel to have no state, part of #3515

This hostname and exception will be removed when more #3515 work makes
channel push/pull of files go away.
benclifford added a commit that referenced this issue Nov 7, 2024
This is only needed for channels executing on a remote system without
a shared run directory with the submit side. This is no longer supported.
See #3515.
benclifford added a commit that referenced this issue Nov 7, 2024
This is only needed for channels executing on a remote system without
a shared run directory with the submit side. This is no longer supported.
See #3515.
github-merge-queue bot pushed a commit that referenced this issue Nov 7, 2024
This is part of moving LocalChannel to have no state, part of #3515

This hostname and exception will be removed when more #3515 work makes
channel push/pull of files go away.

## Type of change

- Code maintenance/cleanup
github-merge-queue bot pushed a commit that referenced this issue Nov 7, 2024
This is only needed for channels executing on a remote system without a
shared run directory with the submit side. This is no longer supported.
See #3515.

## Type of change

- Code maintenance/cleanup
benclifford added a commit that referenced this issue Nov 8, 2024
This is part of implementing #3515 channel removal code.
These methods became unused in PR #3688.
github-merge-queue bot pushed a commit that referenced this issue Nov 8, 2024
This is part of implementing #3515 channel removal code. These methods
became unused in PR #3688.

## Type of change

- Code maintenance/cleanup
github-merge-queue bot pushed a commit that referenced this issue Dec 3, 2024
# Description

This moves the execute_wait functionality previously provided by
LocalChannel into parsl.utils, as part of #3515.
LocalChannel.execute_wait did not reference `self` so it already
basically behaved as a function rather than a method.

This leaves LocalChannel as solely a place for script_directory, which
will be untangled in a subsequent PR.

# Changed Behaviour

none

## Type of change

- Code maintenance/cleanup
github-merge-queue bot pushed a commit that referenced this issue Dec 3, 2024
The final use of channel errors was removed in PR #3690

This is part of #3515 channel removal project

# Changed Behaviour

Any user catching these errors explicitly will now get an error about
the exceptions being undefined. Because these don't exist any more, it
should be safe to remove those catches.

## Type of change

- Code maintenance/cleanup
benclifford added a commit to benclifford/gen3_workflow that referenced this issue Dec 5, 2024
Channels have been removed from Parsl, and the removed
configuration is now the default behaviour.

See Parsl/parsl#3515
for more context.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants