Protect HTEX communication with CurveZMQ #3030

Merged on Feb 1, 2024 (4 commits)

rjmello (Member) commented Jan 19, 2024

Description

The implementation centers on two new classes: ServerContext and ClientContext. These replace the standard zmq.Context and share many of its commonly used methods, including term, destroy and, most importantly, socket. The socket method applies the necessary certificates and options to each socket object it creates.

A connection requires a ServerContext on one end (CurveZMQ server), which validates clients, and a ClientContext on the other (CurveZMQ client), which validates the server.

E.g.,

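import zmq
from parsl import curvezmq

# Generate the certificates; the returned directory is shared by both ends.
cert_dir = curvezmq.create_certificates(base_dir=".")

# CurveZMQ client: sockets from this context validate the server.
client_ctx = curvezmq.ClientContext(cert_dir)
client_socket = client_ctx.socket(zmq.PUSH)

# CurveZMQ server: sockets from this context validate connecting clients.
server_ctx = curvezmq.ServerContext(cert_dir)
server_socket = server_ctx.socket(zmq.PULL)

port = server_socket.bind_to_random_port("tcp://127.0.0.1")
client_socket.connect(f"tcp://127.0.0.1:{port}")

client_socket.send(b"hola")
msg = server_socket.recv()
print(msg)  # b'hola'

# Close the sockets and terminate both contexts.
client_ctx.destroy()
server_ctx.destroy()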
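
For readers unfamiliar with CurveZMQ, here is a rough sketch of the kind of per-socket setup that the socket method presumably automates, written against plain pyzmq only. It is an illustration under assumptions, not the code in parsl/curvezmq.py: the function name curve_roundtrip, the "server"/"client" certificate names, and the use of a temporary key directory are all made up for this example.

```python
import tempfile

import zmq
import zmq.auth
from zmq.auth.thread import ThreadAuthenticator


def curve_roundtrip(payload: bytes, timeout_ms: int = 5000):
    """Send payload over a CURVE-protected PUSH/PULL pair; return what arrives."""
    with tempfile.TemporaryDirectory() as key_dir:
        # Each call writes <name>.key (public key) and <name>.key_secret
        # (public and secret key) and returns both file paths.
        _, server_secret_file = zmq.auth.create_certificates(key_dir, "server")
        _, client_secret_file = zmq.auth.create_certificates(key_dir, "client")
        server_public, server_secret = zmq.auth.load_certificate(server_secret_file)
        client_public, client_secret = zmq.auth.load_certificate(client_secret_file)

        ctx = zmq.Context()
        # The authenticator checks each connecting client's public key
        # against the *.key files found in key_dir.
        auth = ThreadAuthenticator(ctx)
        auth.start()
        auth.configure_curve(domain="*", location=key_dir)

        server = ctx.socket(zmq.PULL)
        server.curve_secretkey = server_secret
        server.curve_publickey = server_public
        server.curve_server = True  # this end acts as the CurveZMQ server
        port = server.bind_to_random_port("tcp://127.0.0.1")

        client = ctx.socket(zmq.PUSH)
        client.curve_secretkey = client_secret
        client.curve_publickey = client_public
        client.curve_serverkey = server_public  # the server key the client expects
        client.connect(f"tcp://127.0.0.1:{port}")

        try:
            client.send(payload)
            # Poll so that a failed handshake returns None instead of hanging.
            return server.recv() if server.poll(timeout_ms) else None
        finally:
            client.close(linger=0)
            server.close(linger=0)
            auth.stop()
            ctx.term()


print(curve_roundtrip(b"hola"))
```

Wrapping these steps in a context class means callers only ever ask for a socket and cannot forget one of the key options, which is presumably the motivation for ServerContext and ClientContext.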

The interchange serves as a CurveZMQ server, while the executor and various managers serve as CurveZMQ clients. Thus, when enabled, all communication channels between these entities are encrypted.

The HTEX start method creates new certificates for each run in a private certificates/ directory. We generate a single shared client certificate because all clients will have access to this directory.
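
The description does not spell out how the directory is made private. One common approach on POSIX systems, shown here purely as an assumption, is to restrict the directory to its owner. The helper names make_private_dir and is_owner_only are hypothetical and are not Parsl functions.

```python
import os
import stat
import tempfile


def make_private_dir(base_dir: str, name: str = "certificates") -> str:
    """Create base_dir/name so that only the owning user can access it."""
    path = os.path.join(base_dir, name)
    os.makedirs(path, mode=0o700, exist_ok=True)
    # makedirs applies the umask and leaves an existing directory's mode
    # untouched, so set the permissions explicitly as well.
    os.chmod(path, 0o700)
    return path


def is_owner_only(path: str) -> bool:
    """Return True if neither group nor others have any permission bits set."""
    return (stat.S_IMODE(os.stat(path).st_mode) & 0o077) == 0


with tempfile.TemporaryDirectory() as run_dir:
    cert_dir = make_private_dir(run_dir)
    print(cert_dir, is_owner_only(cert_dir))
```

With the directory locked down this way, any process able to read the client certificate already runs as the same user, which is consistent with sharing a single client certificate among all clients.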

Users can enable encryption by setting the encrypted initialization argument of the HighThroughputExecutor to True. Encryption is disabled by default because, depending on how pyzmq was installed, it can significantly reduce throughput. I've added recommended installation methods to the docs to address this (e.g., install via conda).
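
Before turning encryption on, it can help to see what the local pyzmq installation provides. The sketch below uses only standard pyzmq introspection calls; the function name describe_zmq_install is made up for this example. It reports whether libzmq was built with CURVE support at all, but it does not measure throughput or identify the installation method.

```python
import zmq


def describe_zmq_install() -> dict:
    """Collect the pyzmq and libzmq versions and whether CURVE is available."""
    return {
        "pyzmq": zmq.pyzmq_version(),
        "libzmq": zmq.zmq_version(),
        "curve": zmq.has("curve"),
    }


info = describe_zmq_install()
print(info)
```

If "curve" is reported as unavailable, encrypted sockets cannot be created with that installation, so the encrypted argument should stay at its default.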

Fixes #2199
Partially addresses #952

Type of change

  • New feature

rjmello added the executor:htex and globus-compute labels Jan 19, 2024
rjmello force-pushed the implement-curvezmq-htex-952 branch 3 times, most recently from 6b017b1 to e565065, January 24, 2024 00:04

khk-globus (Collaborator) left a comment

Some minor comments from a static analysis so far, but I'm still poking at the functionality locally. I'll tentatively say "looking good" at the moment.

Review threads (all resolved):
  • parsl/curvezmq.py (7 threads)
  • parsl/executors/high_throughput/executor.py
  • parsl/executors/high_throughput/interchange.py
  • parsl/executors/high_throughput/zmq_pipes.py

rjmello force-pushed the implement-curvezmq-htex-952 branch from e565065 to d81ceb4, January 26, 2024 17:07
rjmello force-pushed the implement-curvezmq-htex-952 branch 2 times, most recently from d09fa89 to 6c6765f, January 29, 2024 15:32
rjmello force-pushed the implement-curvezmq-htex-952 branch from 6c6765f to b14ec56, January 30, 2024 16:37

`ServerContext` and `ClientContext` replace the standard `zmq.Context`.
They share many commonly used methods, including `term`, `destroy` and,
most importantly, `socket`. The latter applies the necessary certs and
options to each socket object.

A connection requires a `ServerContext` on one end, which validates
clients, and a `ClientContext` on the other, which validates the server.
Certificates are generated via the `create_certificates` function.

The interchange serves as a CurveZMQ server, while the executor and
various managers serve as CurveZMQ clients. Thus, when encryption is
enabled, all communication between these entities is encrypted.

The HTEX `start` method generates new certs for each run in a private
`certificates/` directory. We generate a single shared client cert
because all clients will have access to this dir.

We disable encryption by default, but users can enable it by setting the
`encrypted` initialization argument for the HTEX to `True`.

rjmello force-pushed the implement-curvezmq-htex-952 branch from b14ec56 to 431daef, January 30, 2024 16:38

rjmello (Member, Author) commented Jan 30, 2024

We've decided to disable encryption by default because, depending on how pyzmq was installed, it can significantly reduce throughput. I've added recommended installation methods to the docs to address this (e.g., install via conda).

Development

Successfully merging this pull request may close these issues.

HighThroughputExecutor workers can connect to somebody else's interchange