Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loki: Add support for Azure Workload Identity authentication #7540

Merged
merged 3 commits into from
Jan 13, 2023

Conversation

Tolsto
Copy link
Contributor

@Tolsto Tolsto commented Oct 28, 2022

What this PR does / why we need it:

Adds support for Azure Workload Identity authentication.

@Tolsto Tolsto requested a review from a team as a code owner October 28, 2022 00:22
@CLAassistant
Copy link

CLAassistant commented Oct 28, 2022

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot added the type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories label Oct 28, 2022
@Tolsto Tolsto force-pushed the azure-workload-identity branch 2 times, most recently from 0ecbde2 to 4134182 Compare October 28, 2022 00:36
@Tolsto Tolsto force-pushed the azure-workload-identity branch from 4134182 to 93e30f9 Compare October 28, 2022 01:21
@jkroepke
Copy link
Contributor

Please validate, if it re-reads the token after a hour, since the token file is rotating each hour.

Copy link
Contributor

@kavirajk kavirajk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for PR @Tolsto. Added one minor feedback. Can you please add small test to check if adal.NewServicePrincipalTokenFromFederatedToken works correctly with this additional flag and configs?

Comment on lines 386 to 391
var environmentName string
if b.cfg.Environment == azureGlobal {
environmentName = "AzurePublicCloud"
} else {
environmentName = b.cfg.Environment
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add AzurePublicCloud as part of the list of const Environments top of this file.

Also you could avoid else here.

		environmentName := azurePublicCloud

		if b.cfg.Environment != azureGlobal {
			environmentName = b.cfg.Environment
		}

looks more readable IMHO.

Q: Also curious what's the difference between AzureGlobal vs AzurePublicCloud environments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know where AzureGlobal is coming from but it doesn't exist in the Azure Go SDKs. It's AzurePublicCloud there, i.e. you cannot do an environment lookup via EnvironmentFromName with AzureGlobal

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

The workload identity webhook is exposing a authority host AZURE_AUTHORITY_HOST which is the env.ActiveDirectoryEndpoint. Consider to prefer the environment variable instead a own cloud env lookup

Copy link
Contributor Author

@Tolsto Tolsto Nov 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that but decided to go with the reference implementation outlined by the person who added the code to adal. I also looked at the new azidentity module where AZURE_AUTHORITY_HOST gets only used as a fallback option. It will only honor it if no cloud configuration has been set. However, the current implementation of Azure storage in Loki requires the cloud environment name to be set, i.e. adding fallback logic is imho futile.

@Tolsto Tolsto force-pushed the azure-workload-identity branch from 93e30f9 to 3f40965 Compare November 2, 2022 01:09
@pull-request-size pull-request-size bot added size/L and removed size/M labels Nov 2, 2022
@Tolsto Tolsto force-pushed the azure-workload-identity branch from 3f40965 to 6120ef4 Compare November 2, 2022 01:53
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@Tolsto Tolsto requested a review from kavirajk November 2, 2022 02:04
@jkroepke
Copy link
Contributor

jkroepke commented Nov 3, 2022

FYI: Workload Identity will be official supported jan 2023 - Azure/azure-sdk-for-go#15615

Edit:

ref: #5413 - why discontinued azure-storage-blob is used.

@Tolsto
Copy link
Contributor Author

Tolsto commented Nov 3, 2022

FYI: Workload Identity will be official supported jan 2023 - Azure/azure-sdk-for-go#15615

Doesn't help us much as Loki doesn't use azure-sdk-for-go for the Azure storage integration but the discontinued azure-storage-blob SDK.

@Tolsto Tolsto force-pushed the azure-workload-identity branch 2 times, most recently from 30a571a to 32d4d5c Compare November 4, 2022 05:35
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@Tolsto Tolsto force-pushed the azure-workload-identity branch from 32d4d5c to 853e577 Compare November 8, 2022 14:08
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@Tolsto Tolsto requested review from weisdd and removed request for kavirajk November 9, 2022 06:31
@Tolsto Tolsto requested a review from kavirajk November 15, 2022 08:50
@Tolsto Tolsto force-pushed the azure-workload-identity branch from 853e577 to 230d0b0 Compare December 1, 2022 06:20
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@Tolsto
Copy link
Contributor Author

Tolsto commented Dec 13, 2022

@kavirajk Anything missing for this PR to move forward?

@keisari-ch
Copy link

Would be a nice addition, because actual workaround using sidecar requires enabling privilege escalation.

@Tolsto
Copy link
Contributor Author

Tolsto commented Jan 10, 2023

ping @kavirajk

Copy link
Contributor

@jeschkies jeschkies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I don't know Azure well enough to accept this change. I does look good to me though.

mockedServicePrincipalToken *adal.ServicePrincipalToken
}

func (suite *FederatedTokenTestSuite) SetupTest() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Are you using the suite to get a test setup code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use the suite to set up an isolated environment with mocks for testing the Azure federated token functionality. There might be other ways to do it, but I come from the Ruby/Rspec world, and using a suite was the most familiar approach for me.

@@ -829,6 +829,13 @@ The `azure_storage_config` configures Azure as a general storage for different d
# CLI flag: -<prefix>.azure.use-managed-identity
[use_managed_identity: <boolean> | default = false]

# Use a Federated Token to authenticate to the Azure storage account.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would Azure users understand what a federated token is? Do you have a link to the documentation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add the second link in the documentation? I think that would be helpful.

Also, I know this is a big ask but could you provide documentation similar to https://grafana.com/docs/loki/next/storage/#aws-deployment-s3-single-store

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that the second link is very helpful as it describes the federated token architecture in general. >90% of Loki users will use it with the reference implementation (Azure Workload Identity) in a Kubernetes cluster.
I'm willing to provide documentation if this PR can then get merged in the very near future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me 😊

@jeschkies jeschkies enabled auto-merge (squash) January 12, 2023 08:25
@jeschkies
Copy link
Contributor

@Tolsto once you've fixed the merge conflicts we can land the change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/helm size/L type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants