Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

machine status: Make name argument optional #6855

Merged
merged 1 commit into from
Nov 17, 2021

Conversation

karajan1001
Copy link
Contributor

fix #6838

  1. make machine status name optional.
  2. show all machine status if no name provide.
  3. add a new test for it.

Thank you for the contribution - we'll try to review it as soon as possible. πŸ™

An example of it.
image

@karajan1001 karajan1001 requested a review from a team as a code owner October 24, 2021 10:08
@karajan1001 karajan1001 requested a review from efiop October 24, 2021 10:08
@karajan1001 karajan1001 changed the title Make machine status name optional machine status: Make name argument optional Oct 25, 2021
@pmrowla
Copy link
Contributor

pmrowla commented Oct 25, 2021

One other thing I've noticed, we should be including instance_type (instance CPU count/size) and instance_gpu in the default status output (doesn't have to be in this PR though)

@dberenbaum
Copy link
Collaborator

Looks good! Just need to update help message similar to #6804.

@karajan1001
Copy link
Contributor Author

@dberenbaum Any suggestions on UI side?
@efiop How about this commit message style?

@dberenbaum
Copy link
Collaborator

Similar to what was mentioned in #6263 (comment), I think a table structure makes sense, with a row for each instance.

@karajan1001
Copy link
Contributor Author

#6263 (comment)

Oh, that would be in a new PR.

For this PR, what we need to do is to add documents to show different usage with or without name arg in dvc machine status

@dberenbaum
Copy link
Collaborator

Got it. A couple thoughts (@jorgeorpinel might also have ideas):

  • Terminology is inconsistent across commands. We should be clear about what is a machine, what is an instance, etc. I'd propose we have two concepts: machine config (the spec) and machine instance (actual running hardware) and use these consistently. We need a separate issue to review this for all commands.
  • Are we planning to support multiple instances per machine with different IDs? It would probably be ideal for the positional argument to be flexible to accept either a machine config (myaws) or a machine instance (myaws:instance_num_1).

@jorgeorpinel
Copy link
Contributor

Your suggestions make sense @dberenbaum. I'd need to see a sample UI interaction of this command for specifics on argument naming, etc. (browsed GH and Notion and couldn't easily find the latest info.)

BTW @karajan1001 this doesn't need formal documentation yet but please try to keep a basic doc updated guys, is https://github.com/iterative/dvc/wiki/Remote-executors the main one you've been writing? Thanks

We need a separate issue to review this for all commands.

There's this epic #5392
Other older related ones: #4172 (may apply here directly), #3642

@karajan1001
Copy link
Contributor Author

karajan1001 commented Nov 5, 2021

@jorgeorpinel , @dberenbaum
Any updates on this PR, is the current UI OK to merge? The tabulation part had already been in another one.

Copy link
Collaborator

@dberenbaum dberenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the --global/system/project/local flags used in this command? They are shown in the help message but not sure they make sense or do anything for this command.

dvc/command/machine.py Outdated Show resolved Hide resolved
dvc/command/machine.py Outdated Show resolved Hide resolved
@karajan1001
Copy link
Contributor Author

Are the --global/system/project/local flags used in this command? They are shown in the help message but not sure they make sense or do anything for this command.

They are useful in --show-origin mode for now. But in a normal mode, because I melt all levels of config together, it gives no difference in results. Btw, there are still some discussions on how we deal with the machine config on multi-levels, one machine config might be spread in all of them. And makes it hard to maintain. Should we restrict that one machine config can only be in one level (cross level deduplication) and maybe make it only a local config ( can't be in system/global level)

@dberenbaum
Copy link
Collaborator

They are useful in --show-origin mode for now.

Is this an option for dvc machine status? And how would --global/system/project/local be useful with that option? Wouldn't that be more useful without those options so you could see where each machine config came from?

Btw, there are still some discussions on how we deal with the machine config on multi-levels, one machine config might be spread in all of them. And makes it hard to maintain. Should we restrict that one machine config can only be in one level (cross level deduplication) and maybe make it only a local config ( can't be in system/global level)

Do you have an example? My assumption would be that it would follow the same priority as other config options: local > project > global > system. How does this work for remotes? Can someone define some options in local and some in project?

@karajan1001
Copy link
Contributor Author

Is this an option for dvc machine status? And how would --global/system/project/local be useful with that option? Wouldn't that be more useful without those options so you could see where each machine config came from?

machine status shows the machine running status which is decoupled with the original machine config. So no --global/system/project/local for it. But for now, it shows is in the help UI, we need to remove them. Btw, this UI issue also exists in some other machine commands, create, ssh etc. We can create an issue for it.

Do you have an example?

I first notice this problem when I was working on the rename of remote config. Here is the example, imagine that we have a remote name myremote in global/repo/local. What should we do if we rename it on repo level, should it affect the local and global config. If we don't do that then an original same remote will split, causing some strange behavior some of the config stored in global levels might be missed in the new remote config. But if we do this, it might affect some other repo that is using this remote. Its remote setup might be missing, because we can't rename all of the repos in the local machine. Same as the machine config here.

My assumption would be that it would follow the same priority as other config options: local > project > global > system.
Yes it is.

How does this work for remotes?
Machine and remote are the same for now.

Can someone define some options in local and some in the project?
Yes, we can define a config in all of the four levels and they will merge into one with the priority defined above.

For now I have two solutions

  1. Restrict that one machine config can only be in one level (cross level deduplication)
  2. Make it only a local config ( can't be in system/global level)

Both of these solutions can block the cross repo affection.

@dberenbaum
Copy link
Collaborator

For now I have two solutions

1. Restrict that one machine config can only be in one level (cross level deduplication)

2. Make it only a local config ( can't be in system/global level)

Both of these solutions can block the cross repo affection.

My concern is cases where you might want to define the machine/remote at a repo level to share, but the credentials are stored in the local config.

I first notice this problem when I was working on the rename of remote config. Here is the example, imagine that we have a remote name myremote in global/repo/local. What should we do if we rename it on repo level, should it affect the local and global config. If we don't do that then an original same remote will split, causing some strange behavior some of the config stored in global levels might be missed in the new remote config. But if we do this, it might affect some other repo that is using this remote. Its remote setup might be missing, because we can't rename all of the repos in the local machine. Same as the machine config here.

Are there issues other than renaming? Looking at usage stats, the rename commands are very infrequently used, and I'm not convinced they are necessary. Similar to how we ask people to edit stages in dvc.yaml now, I think it's reasonable to rename through manual editing. If we support a rename command, I think it's fine if it's imperfect, and it's up to the user to make sure they are not breaking their config across multiple repos. We can warn about it, but I don't think it's worth trying to prevent it.

@karajan1001
Copy link
Contributor Author

For now I have two solutions

1. Restrict that one machine config can only be in one level (cross level deduplication)

2. Make it only a local config ( can't be in system/global level)

Both of these solutions can block the cross repo affection.

My concern is cases where you might want to define the machine/remote at a repo level to share, but the credentials are stored in the local config.

There are some differences between machine and remote.It might cause a machine missing if we rename a global / system machine, especially for multi-users.
#6633 (comment)

I first notice this problem when I was working on the rename of remote config. Here is the example, imagine that we have a remote name myremote in global/repo/local. What should we do if we rename it on repo level, should it affect the local and global config. If we don't do that then an original same remote will split, causing some strange behavior some of the config stored in global levels might be missed in the new remote config. But if we do this, it might affect some other repo that is using this remote. Its remote setup might be missing, because we can't rename all of the repos in the local machine. Same as the machine config here.

Are there issues other than renaming? Looking at usage stats, the rename commands are very infrequently used, and I'm not convinced they are necessary. Similar to how we ask people to edit stages in dvc.yaml now, I think it's reasonable to rename through manual editing. If we support a rename command, I think it's fine if it's imperfect, and it's up to the user to make sure they are not breaking their config across multiple repos. We can warn about it, but I don't think it's worth trying to prevent it.

Only rename for now

@karajan1001
Copy link
Contributor Author

@Mergifyio rebase

fix: iterative#6838
1. make machine status name optional.
2. show all machine status if no name provide.
3. UI changes for this cmd.
4. add a new test for it.

Co-authored-by: Dave Berenbaum <[email protected]>
@mergify
Copy link
Contributor

mergify bot commented Nov 17, 2021

rebase

βœ… Branch has been successfully rebased

@karajan1001 karajan1001 merged commit f43c04f into iterative:master Nov 17, 2021
@efiop efiop added the enhancement Enhances DVC label Nov 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhances DVC
Projects
None yet
Development

Successfully merging this pull request may close these issues.

machine: show all machines in status
5 participants