-
Notifications
You must be signed in to change notification settings - Fork 539
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API/YAML: Support using env vars to define file_mounts. #2146
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this @concretevitamin! This should make our YAML interface more flexible. Left several questions.
sky/task.py
Outdated
def _substitute_env_vars(target: Union[str, List[str]], | ||
envs: Dict[str, str]) -> Union[str, List[str]]: | ||
for key, value in envs.items(): | ||
pattern = r'\$\{?\b' + key + r'\b\}?' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may need to add a check for the key of the task.envs
somewhere in our code, i.e. we need to make sure the key ~ '[a-zA-Z][a-zA-Z0-9]*'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will get checked by storage.py's validate_name().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, I was trying to say that if the user provides a envvar like the following:
envs:
hello world: some_value
run: |
echo ${hello world}
Here the run section will fail, as it is not a valid syntax in bash. However, if this env var is used in the file_mounts
section, the envvar will be replaced correctly.
file_mounts:
/dst:
source: ${hello world}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call! Done.
sky/task.py
Outdated
storage_config = _fill_in_env_vars_in_storage_config( | ||
storage[1], task.envs) | ||
storage_obj = storage_lib.Storage.from_yaml_config( | ||
storage_config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of overriding the storage_config
, should we just replace all the env var in the file_mounts
section? For example, in L285 we do:
if config.get('file_mounts') is not None:
file_mounts_str = yaml.dumps(config['file_mounts'])
new_file_mounts_str = _replace_env_var(file_mounts_str, config.get('envs', {}))
config['file_mounts'] = yaml.loads(new_file_mounts_str)
With that, the user can do something like the following:
file_mounts:
/model_path/llama-${SIZE}b: s3://llama-weights/llama-${SIZE}b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One worry with string replacement is that it'll make some weird things "work":
file_mounts:
/path:
mo${empty_envvar}de: xxxx
I think we should make replacement more structured.
That said, supporting env vars in src/dst seems a useful case. How about we support it when users ask?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, though I still think the replacement for dst
and src
might be quite important, as that was one thing we would like to have for the Vicuna for downloading the correct llama weight from the bucket based on the envvar.
For the problem you mentioned, we can do the schema validation before the replacement happens to avoid that issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call! Updated as such.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix @concretevitamin! The code looks good to me with a small corner case that we may need to be careful.
sky/task.py
Outdated
# TODO(zongheng): support ${ENV:-default}? | ||
file_mounts_str = json.dumps(file_mounts) | ||
for key, value in task_envs.items(): | ||
pattern = r'\$\{?\b' + key + r'\b\}?' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it seems bash replaces the envvar by looking for the envvar in the string first and match it with the existing envvar, while here we do it in a reversed way. That will cause a slight inconsistency in the behavior:
HELLO_WORLD=1
echo $HELLO_WORLD_1
The block above will output an empty string, but in our implementation, we will output 1_1
.
Probably, an alternative way to do this is:
# Create a replacement function
def replace_var(match):
var_name = match.group(1) # Now directly accessing the variable name
return task_envs.get(var_name, match.group(0)) # If the variable isn't in the dictionary, return it unchanged
pattern = r'\$\{?\b([a-zA-Z_][a-zA-Z0-9_]*)\b\}?'
# Use re.sub with the replacement function
result = re.sub(pattern, replace_var, file_mounts_str)
This is a very corner case, we can also leave it as is to see how people feel about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch! Used this & reran the smoke test.
name = _get_cluster_name() | ||
test_commands = [ | ||
*storage_setup_commands, | ||
(f'sky launch -y -c {name} --cpus 2+ --cloud {generic_cloud} ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, we should have all our tests to be --cpus 2+
in the future to save cost. ; )
|
Overriding with
sky launch --env
works too.Also fixes
sky launch
parsing errors:Fixes #2093.
Tested (run the relevant ones):
bash format.sh
pytest tests/test_smoke.py::test_using_file_mounts_with_env_vars
pytest tests/test_optimizer_dryruns.py
pytest tests/test_smoke.py
pytest tests/test_smoke.py::test_fill_in_the_name
bash tests/backward_comaptibility_tests.sh