Report storage usage in admin-get-clusters command #2035

Open
YevheniiSemendiak opened this issue Mar 12, 2021 · 8 comments
@YevheniiSemendiak
Contributor

Is your feature request related to a problem? Please describe.

Currently, there is no way to check how much storage is used in a specific cluster.
The only workaround is to go to your cluster provider's (AWS/GCP/whatever) console and check the underlying backend NFS usage. I don't even know what we could do for on-premise installations.

Describe the solution you'd like

It would be nice to see the storage utilization in neuro admin get-clusters command output.

Describe alternatives you've considered

Querying each cluster (provider?) might take some time (especially if the user has access to multiple clusters).
Therefore, as an alternative, we could introduce a different command like neuro admin get-cluster-details <cluster> that shows overall info about the cluster, such as:

  • storage utilization
  • registry utilization
  • services versions
  • number of users
  • avg job number
  • etc.

Additional context

The request came from the Synthesis team.

YevheniiSemendiak added the feature request label Mar 12, 2021
@romasku
Contributor

romasku commented Apr 14, 2021

Thanks for the issue, it is a good direction for improvement!

I think we should first add some separate commands, because not all reports can be collected easily. It will also allow us to display more detailed info.

Storage utilization

My idea here is to add neuro storage stats (or neuro storage du) that will print total storage usage plus per-user stats:

User      Usage
romasku   2G
admin     50M

Total: 2G Used, 3G Free, 5G Total

This can be easily (and efficiently) implemented on the server-side.
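A minimal sketch of how such a report could be rendered once the server returns per-user byte counts. The function names (`format_size`, `render_storage_stats`) and the input shape (a mapping of user name to bytes plus a total capacity) are assumptions for illustration, not an existing neuro API:

```python
from typing import Dict


def format_size(num_bytes: int) -> str:
    """Render a byte count with a binary unit suffix (B, K, M, G, T)."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G"):
        if size < 1024:
            return f"{size:.0f}{unit}"
        size /= 1024
    return f"{size:.0f}T"


def render_storage_stats(usage_by_user: Dict[str, int], capacity: int) -> str:
    """Build the per-user table and the totals line (hypothetical layout)."""
    used = sum(usage_by_user.values())
    free = capacity - used
    # Pad the first column to the longest user name plus a small gap.
    width = max([len("User")] + [len(user) for user in usage_by_user]) + 2
    lines = [f"{'User':<{width}}Usage"]
    # Largest consumers first.
    for user, num_bytes in sorted(usage_by_user.items(), key=lambda kv: -kv[1]):
        lines.append(f"{user:<{width}}{format_size(num_bytes)}")
    lines.append("")
    lines.append(
        f"Total: {format_size(used)} Used, "
        f"{format_size(free)} Free, {format_size(capacity)} Total"
    )
    return "\n".join(lines)
```

Sizes are rounded to whole units, so the "Used" and "Free" figures may not add up exactly to "Total" in the printed line.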

Registry utilization

The official registry API has no support for retrieving total disk utilization, so we would have to implement it for each cloud provider separately. This would require a lot of effort and probably would not work for on-prem.
As a second option, we can scan all images periodically (on the server) and generate a cached stats view. This will work for most installations and will allow us to report usage by user, but the info will lag behind by a large delay (~1 hour, maybe even more).
I personally prefer the second option, as it is more robust and can generate more user-friendly data, but we need to think about it.
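A rough sketch of the second option, assuming a scanner callable that yields `(owner, image, size_in_bytes)` records. The class name `RegistryStatsCache`, the record shape, and the one-hour default are hypothetical and only illustrate the "scan periodically, serve from cache" idea:

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Optional, Tuple

# Assumed record shape: (owner, image name, size in bytes).
ImageRecord = Tuple[str, str, int]


@dataclass
class RegistryStatsCache:
    scan: Callable[[], Iterable[ImageRecord]]  # expensive full registry scan
    max_age_s: float = 3600.0                  # tolerated staleness (~1 hour)
    clock: Callable[[], float] = time.monotonic
    _usage: Dict[str, int] = field(default_factory=dict)
    _scanned_at: Optional[float] = None

    def refresh(self) -> None:
        """Rescan all images and rebuild the per-user totals."""
        usage: Dict[str, int] = {}
        for owner, _image, size in self.scan():
            usage[owner] = usage.get(owner, 0) + size
        self._usage = usage
        self._scanned_at = self.clock()

    def usage_by_user(self) -> Dict[str, int]:
        """Return cached totals, rescanning only when the cache is too old."""
        stale = (
            self._scanned_at is None
            or self.clock() - self._scanned_at > self.max_age_s
        )
        if stale:
            self.refresh()
        return dict(self._usage)
```

Summing image sizes per owner can overcount when images share layers, so a real implementation would likely need to deduplicate at the layer level.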

Service versions

I see this as a command like neuro admin get-cluster-service-versions <cluster-name> that prints a table with versions (similar to how the Slack bot prints it), plus another command that shows info about all clusters the user is an admin of (something like neuro admin get-service-versions).

Summary for cluster command

After we implement all of the above commands, it will be easy to add an additional command that shows a summary, though I'm not sure that we will really need it.

@YevheniiSemendiak
Contributor Author

Storage

In my opinion, the command neuro storage du should take a storage path as an argument and compute disk usage for that subpath (even storage://, the storage root of the current cluster). But what about RBAC here, is it OK?
neuro storage stats seems to be more general: it just shows the utilization of all users' storage that you have access to, right?
What I'm trying to say is that the du command seems to be much more flexible and useful, but might be tricky to implement, while stats is much easier and could be a good place to start (and we can move towards du later if needed).
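To illustrate what a du-style computation for a subpath involves, here is a sketch that walks a local directory tree. It assumes the storage backend is reachable as an ordinary mounted filesystem on the server side and ignores RBAC entirely; `disk_usage` is a hypothetical helper name:

```python
import os


def disk_usage(root: str) -> int:
    """Return the total apparent size in bytes of regular files under root."""
    total = 0
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Skip symlinks so their targets are not counted here.
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; leave it out of the total.
                continue
    return total
```

The walk touches every file under the path, which is why du is likely to be slower and trickier than a precomputed per-user stats view.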

Registry

I also do like 2nd option more 👍
In any case, we could later expose a call to that "calculator" to refresh the cached stats, so it would simply give us the needed result.

Services versions

Cool idea!

Summary for cluster command

Indeed, right now we don't need a summary command for all mentioned aspects, but if we have each of them separately - implementing some sort of alias to call all of them in a row is straightforward.

Minor remark:
I would like it if we could group all those stats commands under the same group.
Something like
neuro admin get-stats <aspect>, where aspect is one of storage, registry, service-versions, whatever.
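A sketch of how such grouping could look, using the standard library's argparse purely to stay self-contained. The handler functions, the extra cluster argument, and the `all` pseudo-aspect (standing in for the summary alias) are assumptions, not existing neuro commands:

```python
import argparse
from typing import Callable, Dict, List, Optional


# Placeholder handlers; real ones would query the corresponding service.
def storage_stats(cluster: str) -> str:
    return f"storage stats for {cluster}: not implemented"


def registry_stats(cluster: str) -> str:
    return f"registry stats for {cluster}: not implemented"


def service_versions(cluster: str) -> str:
    return f"service versions for {cluster}: not implemented"


ASPECTS: Dict[str, Callable[[str], str]] = {
    "storage": storage_stats,
    "registry": registry_stats,
    "service-versions": service_versions,
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="get-stats")
    # "all" runs every aspect in a row, like the proposed summary alias.
    parser.add_argument("aspect", choices=sorted(ASPECTS) + ["all"])
    parser.add_argument("cluster")
    return parser


def run(argv: Optional[List[str]] = None) -> List[str]:
    """Parse the arguments and return one report per requested aspect."""
    args = build_parser().parse_args(argv)
    names = sorted(ASPECTS) if args.aspect == "all" else [args.aspect]
    return [ASPECTS[name](args.cluster) for name in names]
```

With this layout, adding a new aspect only requires registering one more handler in `ASPECTS`.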

@github-actions

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days

@YevheniiSemendiak
Contributor Author

Should I add some APs here?

@github-actions

github-actions bot commented Aug 9, 2022

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days

github-actions bot added the Stale label Aug 9, 2022