Optionally dropping DW_JOB_ prefix from environment variables #199

Open
jameshcorbett opened this issue Aug 22, 2024 · 9 comments

Comments

@jameshcorbett
Collaborator

I was talking with @behlendorf and @mcfadden8 and the consensus was that LC users would prefer to have their environment variables be just NAME instead of DW_JOB_{NAME}. Presumably the same goes for DW_PERSISTENT_{NAME}. Could this be enabled with some additional configuration?

@bdevcich
Contributor

Couldn't flux drop those prefixes when it pulls the env vars from the workflow?

@jameshcorbett
Collaborator Author

u rite u rite

@jameshcorbett
Collaborator Author

(as usual)

@jameshcorbett
Collaborator Author

Closing this for an issue in flux-coral2. flux-framework/flux-coral2#201

@github-project-automation github-project-automation bot moved this from 📋 Open to ✅ Closed in Issues Dashboard Aug 23, 2024
@matthew-richerson
Contributor

You might not want to drop those prefixes though. We guarantee that "name" is unique within "job" #DWs and within "persistent" #DWs, but we don't guarantee that they're unique across those boundaries. So we might return these two environment variables in the same workflow:

DW_JOB_myfs
DW_PERSISTENT_myfs

If you drop those prefixes then you'll get a conflict.
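To illustrate the conflict, here is a minimal Python sketch of a consumer that strips the prefixes and reports any short name claimed by more than one variable. The function name `strip_dw_prefixes` and the mountpoint paths are hypothetical and exist only for this example; this is not how flux or the workflow code handles the variables.

```python
from collections import defaultdict

# Prefixes mentioned in this thread.
PREFIXES = ("DW_JOB_", "DW_PERSISTENT_")


def strip_dw_prefixes(env):
    """Return (stripped, conflicts) for a mapping of environment variables.

    stripped  maps each unambiguous short name to its value.
    conflicts maps each ambiguous short name to the sorted original keys.
    """
    sources = defaultdict(list)
    for key in env:
        for prefix in PREFIXES:
            if key.startswith(prefix):
                sources[key[len(prefix):]].append(key)
                break

    stripped, conflicts = {}, {}
    for short, originals in sources.items():
        if len(originals) == 1:
            stripped[short] = env[originals[0]]
        else:
            conflicts[short] = sorted(originals)
    return stripped, conflicts


# Hypothetical mountpoints, for illustration only.
env = {
    "DW_JOB_myfs": "/mnt/job/myfs",
    "DW_PERSISTENT_myfs": "/mnt/persistent/myfs",
    "DW_JOB_scratch": "/mnt/job/scratch",
}
stripped, conflicts = strip_dw_prefixes(env)
```

Here `scratch` survives the stripping, while `myfs` ends up in `conflicts` because both a job and a persistent variable map to it.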

@jameshcorbett
Collaborator Author

Yeah I agree, and I don't mind the prefixes but maybe some users will. I'll ask Brian and Marty for input.

@behlendorf
Collaborator

behlendorf commented Aug 23, 2024

Actually, my concern isn't so much the DW_ prefix, it's the unique _myfs suffix. The prefix is handy since it prevents namespace collisions, but the suffix is awkward because it makes scripting difficult. You need to already know the "fsname" to determine what the environment variable is called, and if you don't, it's hard to find out. Adding a couple of additional environment variables would, I think, give people a consistent place to look and make scripting much easier. Here's what I'd suggest (or something similar).

  1. Add an environment variable for the first file system of each type as requested. We need to support multiple filesystems of the same type, as we do currently, but that's going to be rare and we want to make the common case trivial for users.
DW_XFS="<mountpoint>"
DW_LUSTRE="<mountpoint>"
DW_GFS2="<mountpoint>"

This allows for things like the following in scripts: cd $DW_XFS; git clone myrepo; cd myrepo; make.

  2. Add DW_JOBS and DW_PERSISTENTS environment variables which contain a comma-separated list of the filesystem names and mountpoints. This provides a toehold for advanced users to get the filesystem names from the environment. Without it you need to already know the name, which you may not if something in the application stack requested it on your behalf. For example, SCR will want to request and manage this space on behalf of the job. Either of the following would really help. I think I prefer option b) myself.
a) DW_JOBS="myfs1,myfs2,myfs3" or
b) DW_JOBS="myfs1:<mountpoint>,myfs2:<mountpoint>,myfs3:<mountpoint>"
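As a rough illustration of how a script could consume the name:mountpoint form of the list, here is a minimal Python sketch. It assumes filesystem names contain no colons or commas and mountpoints contain no commas; the helper name `parse_dw_list` is hypothetical, and DW_JOBS is only a proposal in this thread, not an existing variable.

```python
import os


def parse_dw_list(value):
    """Parse 'name:mountpoint,name:mountpoint' into a {name: mountpoint} dict.

    Splits each entry on the first colon only, so a mountpoint may itself
    contain colons. Empty entries are skipped.
    """
    result = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, sep, mountpoint = entry.partition(":")
        if not sep:
            raise ValueError(f"entry {entry!r} has no ':<mountpoint>' part")
        result[name] = mountpoint
    return result


# DW_JOBS is the proposed variable; it yields an empty dict when unset.
jobs = parse_dw_list(os.environ.get("DW_JOBS", ""))
for name, mountpoint in jobs.items():
    print(f"{name} is mounted at {mountpoint}")
```

With this form a script can discover every filesystem name without knowing any of them in advance, which is the gap the list variables are meant to close.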

Marty's more in touch with the users than I am and may have a different take on this. But we really want to make the most common single filesystem case very easy, yet still provide enough for advanced users.

@jameshcorbett jameshcorbett reopened this Aug 23, 2024
@github-project-automation github-project-automation bot moved this from ✅ Closed to 📋 Open in Issues Dashboard Aug 23, 2024
@matthew-richerson
Contributor

I think both of the ideas you listed above (1 and 2b) are fairly straightforward for us to do.

Would you want those new environment variables to be in addition to, or instead of, the current DW_JOB_[name] and DW_PERSISTENT_[name] variables?

For DW_XFS, DW_LUSTRE, etc., I assume those are only for job DWs? Would you want a DW_PERSISTENT variable as well, pointing to the first instance of a persistent file system in the workflow?
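To make the "first file system of each type" rule concrete, here is a minimal Python sketch. It assumes the job filesystems are available as (name, type, mountpoint) tuples in request order; that input shape, the function name `first_of_each_type`, and the paths are all hypothetical. It covers job filesystems only and leaves the persistent question open.

```python
def first_of_each_type(filesystems):
    """Build DW_<TYPE> variables from (name, fstype, mountpoint) tuples.

    Only the first filesystem of each type, in the order given, gets a
    variable; later ones of the same type are ignored here.
    """
    env = {}
    for _name, fstype, mountpoint in filesystems:
        # setdefault keeps the first mountpoint seen for each type.
        env.setdefault("DW_" + fstype.upper(), mountpoint)
    return env


# Hypothetical request order and mountpoints, for illustration only.
requested = [
    ("myfs1", "xfs", "/mnt/job/myfs1"),
    ("myfs2", "lustre", "/mnt/job/myfs2"),
    ("myfs3", "xfs", "/mnt/job/myfs3"),
]
type_env = first_of_each_type(requested)
```

In this sketch `myfs3` gets no DW_XFS entry of its own, so it would still need a per-name variable or a list variable to be reachable.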

@mcfadden8

Instead of (or in addition to) using environment variables, would it be possible to create the mountpoint name with friendly values or create a symbolic link somewhere to the actual mountpoint? I think that users are already aware of mountpoints like /p/lustreN.
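The symbolic link idea could look roughly like the following Python sketch. It runs against a temporary directory so that it is safe to try; the helper name `link_friendly_name` and the friendly name `xfs0` are hypothetical, and where such links would actually live is not decided in this thread.

```python
import os
import tempfile


def link_friendly_name(mountpoint, link_dir, friendly_name):
    """Create link_dir/friendly_name as a symlink to mountpoint."""
    link_path = os.path.join(link_dir, friendly_name)
    os.symlink(mountpoint, link_path)
    return link_path


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the real, hard-to-guess mountpoint.
    real = os.path.join(tmp, "actual-mountpoint")
    os.mkdir(real)

    link = link_friendly_name(real, tmp, "xfs0")

    # Both paths resolve to the same directory.
    resolved = os.path.realpath(link)
    expected = os.path.realpath(real)
```

A stable link name gives users a predictable path in the same way /p/lustreN does, without requiring them to read any environment variable.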
