Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plugin enumerates all compartments and hits 429 for single-compartment users #311

Open
igaevd opened this issue Jan 24, 2025 · 8 comments
Open

Comments

@igaevd
Copy link

igaevd commented Jan 24, 2025

Description of the issue:
I’ve set up the OCI Metrics plugin in a Grafana environment where my user/group only has permission to read metrics in one compartment, not in the root tenancy. During the “Save & Test” connectivity check, the plugin tries a ListMetrics call at the tenancy (root) level, which fails (HTTP 404 or 401). Then it enumerates all compartments in my tenancy anyway, calling ListMetrics for each. Because I only have permissions in a single compartment, every other call fails with 404/401. After ~100 consecutive failures, I start seeing “HTTP 429 TooManyRequests,” and the plugin effectively gets stuck.

Why this is problematic:
I do not have tenancy-wide metric-read permissions, nor do I need them. I only need to query metrics from a single compartment.
The plugin’s TestConnectivity logic tries to validate by listing metrics in the root tenancy first, then falls back to enumerating every subcompartment if that first call fails.
This leads to a flood of auth errors and eventually hitting rate limits (429), preventing the plugin from working properly.

Expected behavior:
If the plugin is configured for a specific compartment (and I only have permission on that compartment), it should skip trying to read from the tenancy root or enumerating every compartment. It should just test connectivity against the configured compartment.

Proposed solution or ideas:
Provide a configuration option or toggle to disable enumerating all compartments, especially during the connectivity test.
Fall back to enumerating only if the user explicitly allows the plugin to do so.
If a compartment ID is configured, the plugin should only test that compartment.

Thank you for considering this change.

I appreciate any guidance or fixes to avoid having to grant root-level permissions just to pass the plugin’s connectivity test.

oci-metrics-datasource-6.0.2
Grafana v9.2.10

Image
@mamorett
Copy link
Member

Can you try to upgrade to latest plugin 6.5.0 ?

@skydontdribble
Copy link

skydontdribble commented Feb 11, 2025

@mamorett I am running the latest plugin 6.5.0 and I'm seeing the same result.
Image
Similarly to @igaevd, the user credentials I have are only authorized to list metrics against one compartment, not an entire tenancy (or root compartment).

I also see that the HTTP request for TestConnectivity accepts either a compartment or tenancy OCID. Given a tenancy OCID, then the request will attempt to list metrics at the root compartment. Since the auth configuration for the datasource requests a tenancy OCID, TestConnectivity sends a request to list metrics at the root compartment. I've tried to work around this limitation by providing a compartment OCID rather than a tenancy OCID so that the HTTP request will be scoped down to the specific compartment in which I am authorized to list metrics. Unexpectedly, that still doesn't work.
Image

I agree with the proposed solution above. Please consider allowing compartment-specific configurations as tenancy-wide configurations can be too permissive in many cases. This may include adapting TestConnectivity to only query metrics within a single compartment rather than a tenancy given a compartment OCID.

[EDIT]
Error when providing a tenancy OCID (of which I am not authorized to list metrics in the entire tenancy):
Image

Error when providing a, more-granular, compartment OCID (of which I am authorized to list metrics in):
Image

@mamorett
Copy link
Member

mamorett commented Feb 12, 2025

Hi. Unfortunately a compartment ocid specific configuration will not work . The fact is that the plugin, when in user principal, is configured using the SDK call common.NewRawConfigurationProvider, which strictly requires a tenancy ocid. You cannot use a compartment ocid when configuring the provider. The listmetrics is then trying to list metrics in the tenancy first, and if this does not work it is keeping retrying till it will find a compartment where it is allowed to perform metrics operations. The issue is that this retry is subject to timeouts, in case tenancy has many compartments and yours is at the bottom of this list, or because OCI will stop you because of the too many requests. The only solution would be to add an additional field in the configuration where you can specify the compartment. I am honestly quite against this solution because of the following reasons:

  • it will introduce a field which will be used for test function only and never more in the plugin lifecycle and operations
  • could drive to confusion to the majority of the customers which has a tenancy wide policy
  • it can also be misleading, since you may have more than one tenancy where your user has the policy which allow metrics operation.

I will investigate other options here, but for now I do not have a nice solution.

@skydontdribble
Copy link

Thanks for the response and context @mamorett!

... The only solution would be to add an additional field in the configuration where you can specify the compartment. I am honestly quite against this solution because of the following reasons:

  • it will introduce a field which will be used for test function only and never more in the plugin lifecycle and operations

Given a new field is introduced that accepts (one or many) compartment OCID(s), couldn't the new field be useful within the plugin's operations by scoping all of its ListMetrics HTTP requests to a specific compartment?

  • could drive to confusion to the majority of the customers which has a tenancy wide policy

Existing (tenancy-wide) policies would continue to work as expected since they provide more permissions than the proposed solution. The solution you propose around adding a new field wouldn't be a breaking change given the field is optional. I am curious how it may create additional confusion if the documentation for the versions of the plugin that implement the field support its users by explaining the updated permission requirements.

  • it can also be misleading, since you may have more than one tenancy where your user has the policy which allow metrics operation.

Given a user has read access to metrics across more than one tenancy and wishes to make them available via the plugin, then it is currently required that the user must supply credentials to the auth configuration blocks for each tenancy, if i understand correctly. Suppose the compartment OCID(s) are another field in the auth configuration block, the compartments would naturally be associated with a tenancy. Can you elaborate on how it might be misleading to users?

@mamorett
Copy link
Member

hi @skydontdribble, thanks for your inputs. So here why I do not think introducing a compartment ocid field in datasource configuration to me is not a good idea:

Given a new field is introduced that accepts (one or many) compartment OCID(s), couldn't the new field be useful within the plugin's operations by scoping all of its ListMetrics HTTP requests to a specific compartment?

No. This additional compartment ocid field will be used only for the test function. For all the other operations the compartment list will be populated using SDK dynamically. This for two reason: compartments can be added or removed, and they will be immediately catch using the SDK, while reading from a static field in the datasource configuration will not allow to reload this list dynamically. Therefore the static value will never be used in the plugin after the test of the datasource. Second, because we read the datasource field, for security reasons, only once when the plugin bootstrap. The configuration values of datasource are functional to establish connectivity with OCI and to set the context. After that point, these values will never be read during the plugin operation lifecycle until the next boot or until you switch datasource.
More than one compartment ocid, will not be in any case possible, because we cannot have dynamic lists in datasource configuration (and will not make sense either).

Existing (tenancy-wide) policies would continue to work as expected since they provide more permissions than the proposed solution. The solution you propose around adding a new field wouldn't be a breaking change given the field is optional. I am curious how it may create additional confusion if the documentation for the versions of the plugin that implement the field support its users by explaining the updated permission requirements.

This is because the new field will introduce a deviation from the standard oci configuration parameters required for a user principal configuration. We want to maintain consistency and coherence between various software, and the datasource configuration here is identical to the schema used in the oci cli. The datasource configuration should maintain the same schema to avoid confusion, to allow user to programmatically generate datasource configuration from ther oci cli config in case, and should not introduce elements which are not relevant for the configuration itself. Here the compartment ocid will not be of any use in the datasource functionality, would be only a way to speedup, and in some case to avoid timeouts in the test function.

Given a user has read access to metrics across more than one tenancy and wishes to make them available via the plugin, then it is currently required that the user must supply credentials to the auth configuration blocks for each tenancy, if i understand correctly. Suppose the compartment OCID(s) are another field in the auth configuration block, the compartments would naturally be associated with a tenancy. Can you elaborate on how it might be misleading to users?

It will generate the following questions for the user with wide tenancy policy: Which compartment should I put here ? and why is a compartment requested if in OCI CLI it is not and it is not used to establish the identity provider with OCI ?

This is why I am not favourable of introducing the compartment ucid in datasource configuration. I will talk with my colleagues btw to have a second opinion. Best regards :)

@mamorett
Copy link
Member

I am also investigating another option, need to check the feasibility. I will keep you posted

@skydontdribble
Copy link

Understood @mamorett! Thank you for the additional clarity and investigation. I look forward to hearing from you regarding the other option.

@mamorett
Copy link
Member

mamorett commented Mar 3, 2025

Hi @skydontdribble : please update your plugin. In version 6.5.1 we introduced a restriction on the compartments listing when using test function to address your issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants