SPOTenantSettings: Failing to deploy or test the resource's state with latest Graph version #4746
Comments
It looks like this fix doesn't work when you deploy other SPO resources before deploying SPOTenantSettings. In that case a PnP connection has already been made and connecting via Graph still fails. Running Disconnect-PnPOnline first unfortunately doesn't help.
I had, and still have, a lot of things to take care of and forgot to open a new issue for it. Yes, it doesn't solve the issue in all cases, and I don't know how else to solve it as long as we have to keep using PnP 1.x instead of 2.x because of PS5; I'm pretty sure that updating PnP would solve this. For my use case I solved it by processing the deployments separately per workload and, when it reaches SPO, sorting the resources so that SPOTenantSettings is processed first. It's a big band-aid, but it's the only way I have right now to keep this working.
Oh of course in between each deployment you also need to stop the |
I can reproduce the issue very easily by first running Connect-PnPOnline and then Connect-MgGraph. So far I haven't been able to fix a broken session 😢 But I'm not giving up yet!
Yes, that's pretty much what the resource was doing. If you reverse the connections then it works, but as you've noticed, if other SPO resources are processed first, or even OD for that matter since it also uses PnP, then you get the same problem again. I don't think there's an easy fix for this with PnP 1.x; alternatively we could just stop retrieving the timezone, and then the connection to Graph wouldn't be required. EDIT: I forgot to mention, though it should be obvious: even if we remove that connection to Graph and it "works", if you try to process the OD and SPO workloads before AAD or Intune you'll get the same problem. We were just lucky so far because the workloads are processed alphabetically and no one noticed the problem before.
A former teammate is part of the PnP team. I asked him and it sounded familiar to him: it has to do with the order in which DLLs are loaded. Graph probably requires newer ones, so if you load PnP first, the older DLLs are loaded and Graph cannot handle them. The solution is to load the code in a separate process, as described here. Example:

```powershell
$ApplicationId = '<id>'
$CertificateThumbprint = '<thumbprint>'
$TenantId = '<tenantname>.onmicrosoft.com'

# Connect to PnP in a separate process so its DLLs are not loaded into this session
Start-Job {
    $params = $args[0]
    New-M365DSCConnection -Workload 'PnP' -InboundParameters @{
        ApplicationID         = $params.ApplicationId
        CertificateThumbprint = $params.CertificateThumbprint
        TenantId              = $params.TenantId
    }
    Get-PnPContext
} -ArgumentList @{ApplicationID = $ApplicationId; CertificateThumbprint = $CertificateThumbprint; TenantId = $TenantId} | Receive-Job -Wait

# Connect to Graph in the current session
New-M365DSCConnection -Workload 'MicrosoftGraph' -InboundParameters @{ApplicationID = $ApplicationId; CertificateThumbprint = $CertificateThumbprint; TenantId = $TenantId}
Get-MgContext
```

Since this happens with PnP PowerShell in combination with Graph, I would suggest moving all PnP cmdlets into a Start-Job call. That way, you can use resources in any order.
Uh! That's actually quite clever. It's still a big band-aid if you ask me, but much better than having to shuffle resources in a weird way to cope with the problem. I'd say go for it. Right now I'm really busy and won't be able to do that, but I also spoke with Nik just about an hour ago, and once I'm available I'll start sending several patches for EXO resources that are not quite working. The reason I eventually found this problem with SPO was that EXO also has problems if deployments are done next to Graph, although there the problem is directly with WMI, not with importing the DLLs in a certain order.
@ricmestre Why do EXO resources need updating? This has something to do with PnP and we are not using PnP for EXO, right? When going down the Start-Job route, we need to think about batching requests. Each job is a new process and requires a new authentication towards M365, so it would be wise to run multiple commands in one job.
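The batching idea could look roughly like the sketch below: several script blocks are sent to a single background job, so the separate process (and, in real use, its sign-in) is paid for only once. `Invoke-InIsolatedProcess` and its parameters are hypothetical names made up for this illustration and are not part of Microsoft365DSC; in practice the first block would be the `New-M365DSCConnection -Workload 'PnP'` call and the following blocks the PnP cmdlets.

```powershell
# Hypothetical helper: runs all script blocks in ONE background job, in order.
function Invoke-InIsolatedProcess {
    param(
        [Parameter(Mandatory)]
        [scriptblock[]] $ScriptBlock,

        # Values handed to every script block as its first argument
        [hashtable] $Parameters = @{}
    )

    # Script blocks do not survive serialization into a job as executable
    # objects, so pass their text and rebuild them inside the job.
    $blockText = [string[]]($ScriptBlock | ForEach-Object { $_.ToString() })

    $job = Start-Job -ScriptBlock {
        param($blocks, $params)
        foreach ($text in $blocks) {
            & ([scriptblock]::Create($text)) $params
        }
    } -ArgumentList $blockText, $Parameters

    # Wait for the job, return its output and clean it up
    $job | Receive-Job -Wait -AutoRemoveJob
}

# Usage: two commands, one job. Placeholder commands stand in for PnP cmdlets.
$result = @(Invoke-InIsolatedProcess -Parameters @{ TenantId = 'contoso.onmicrosoft.com' } -ScriptBlock @(
    { param($p) "Tenant: $($p.TenantId)" },
    { param($p) $PID }
))
```

Note that everything coming back from the job is deserialized, so live objects such as a PnP connection cannot be returned to the calling session this way.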
I'm reviewing the EXO workload and a lot of its resources have several different issues that need patches because they're not working. Another thing is that if I only use EXO resources (the ones that currently work) in a deployment they work fine, but if I mix them with Intune then WMI starts spitting out errors at random. Take the following example from my integration tests, which deploy the EXO, Intune, O365, OD, SPO and Teams workloads: I found the problem with PnP and Graph and removed SPOTenantSettings from the deployment, and when I tried again the deployment worked. But in my tests I call Test-DscConfiguration right after the deployment, and this always fails in EXO, in random resources, not always the same ones, and with different WMI errors. Removing Intune from the equation made the problems go away, so my fix was to run the deployments, and the tests after them, separately per workload. See #4632, which is one of the errors I got, but I had others, sometimes even without error codes, just something like
Would you say that the here-and-now solution would be to use PowerShell 7?
Unfortunately, that isn't an option. The LCM is still based on PSv5.1 😢 |
@ykuijs As I've mentioned, the problems I've seen with EXO + Graph were at the deployment stage, on WMI events, not while importing the modules. Also, this happens with ExchangeOnlineManagement 3.4.0; we had to roll back from 3.5.0 since it currently doesn't work with service principals. Nevertheless, I've just seen someone reporting problems importing Graph modules if EXO is imported first (although that's with 3.5.0, which we are not using, as explained above), so most likely your workaround for PnP might also need to be adapted for EXO and/or Graph, see microsoftgraph/msgraph-sdk-powershell#2803
Has anyone looked into https://github.com/microsoft/CloudConfigurationManager as an alternative to DSC?
I can answer that myself, seeing that @NikCharlebois and @andikrueger are on the list of contributors...
CCM doesn't in itself solve the basic problem, but it looks much easier to modify Test- and Start-CCMConfiguration to batch resources in separate PowerShell jobs and in this way avoid (most) DLL conflicts.
Well, 'much easier' is a stretch since PSJobs don't inherit global vars meaning that each resource is unaware of all others and will have to do its own authentication. This could be alleviated by using runspaces and synchronized hashtables like this article indicates: https://learn-powershell.net/2013/04/19/sharing-variables-and-live-objects-between-powershell-runspaces/ CCM could be modified to manage runspaces and 'shared variables' based on parameters in order to allow it to still be used with other DSC-resources. |
CCM follows a different approach to remove the dependency on the LCM. It handles the configuration itself and makes the necessary calls into the test and set functions of the resources. From a DLL loading perspective, and given the description of the issues above, it will make no difference as long as everything runs in the same runspace. Using jobs might resolve this, but will cause endless sign-ins as long as the sessions cannot be shared. Having one runspace per workload might solve this issue altogether. This can also be achieved by a configuration that is split into workloads and executed on different hosts. This kind of approach allows parallelism and reduces runtime. Furthermore, it allows for more flexibility in configuration and separation of credentials.
@andikrueger That's exactly my current workaround for this, I split my build and deployment scripts into separate workloads, whatever ones might be configured, but in my case I run them all in sequence always in the same host (using MSHosted pipeline containers so they are actually assigned random hosts per each run). |
@ricmestre Really interested in how you have implemented this. I am in the final stages of updating the new version of the M365DSC whitepaper. Your ideas might be a good addition for vNext. Could you please share more details? |
I have a build script that iterates per tenant, and then per workload, and converts individual policies written in markdown to DSC format in a single blueprint, which it then compiles to MOF. That blueprint only contains policies that were modified in a pull request, so that during deployment I don't have to re-deploy all of the customer's policies. The PR itself is also validated (with branch policy enforced) by ensuring that the change made to the markdown works: first by running a linter I've created that checks, for instance, whether the property being changed is an integer and you used a string, then whether it can be converted to DSC format, and finally whether that converted blueprint can be compiled into MOF. I even go the extra mile and call

In the deploy script I then have individual MOF files separated by tenant, each tenant with its own folder. The MOFs to be deployed are called localhost-name_of_workload.mof, which I copy to localhost.mof to be processed, and between each workload I sleep for 30s, which I've seen help in some scenarios where the deployment might fail if the workloads are processed too fast. I also have integration tests for New/Update/Remove, so while running the tests the MOFs are actually called localhost-name_of_workload-test_mode.mof and additionally call
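For readers who want to picture the per-workload deployment loop described above, here is a minimal sketch. It is not the author's actual script: the function name `Invoke-WorkloadDeployment`, the `staging` subfolder and the parameter names are assumptions made for this illustration. It copies each `localhost-<workload>.mof` to `localhost.mof` in a separate folder (so that `Start-DscConfiguration -Path` only sees one MOF), applies it, and pauses before the next workload.

```powershell
# Hypothetical sketch of a per-workload deployment loop; names and folder layout are assumptions.
function Invoke-WorkloadDeployment {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        # Folder holding this tenant's localhost-<workload>.mof files
        [Parameter(Mandatory)]
        [string] $TenantFolder,

        # Pause between workloads (30s is the value mentioned in the thread)
        [int] $PauseSeconds = 30
    )

    $staging  = Join-Path -Path $TenantFolder -ChildPath 'staging'
    $mofFiles = Get-ChildItem -Path $TenantFolder -Filter 'localhost-*.mof' | Sort-Object Name

    foreach ($mof in $mofFiles) {
        $workload = $mof.BaseName -replace '^localhost-', ''

        if ($PSCmdlet.ShouldProcess($workload, 'Deploy workload MOF')) {
            # Stage exactly one MOF, named after the target node
            New-Item -ItemType Directory -Path $staging -Force | Out-Null
            Copy-Item -Path $mof.FullName -Destination (Join-Path $staging 'localhost.mof') -Force

            Start-DscConfiguration -Path $staging -Wait -Force
            Start-Sleep -Seconds $PauseSeconds
        }

        # Emit the workload name so callers can log the processing order
        $workload
    }
}
```

Running it with `-WhatIf` lists the workloads in processing order without copying or deploying anything, which is a cheap way to check the ordering before a real run.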
Based on the PR of @ricmestre, @hvdbrink discovered that connecting to Graph in the first resource that uses PnP also solves this issue. Since any of the resources that use PnP can be the first resource, this means we have to add a connect to Graph to every PnP resource, even when that resource doesn't use Graph. Not an ideal situation, but I would say definitely a quick and easy workaround for now. Really interested in your feedback on this workaround. A more permanent solution could be the use of runspaces. Below is some code for using shared variables across runspaces. I haven't tested this yet with connections to PnP; that would be a good next test:

```powershell
# Synchronized hashtable shared between this session and the background runspace
$hash = [hashtable]::Synchronized(@{})
$hash.One = 1
Write-Host ('Value of $Hash.One before background runspace is {0}' -f $hash.One) -ForegroundColor Green -BackgroundColor Black

# Create the runspace and expose the hashtable inside it as $Hash
$runspace = [runspacefactory]::CreateRunspace()
$runspace.Open()
$runspace.SessionStateProxy.SetVariable('Hash', $hash)
$powershell = [powershell]::Create()
$powershell.Runspace = $runspace

# First execution: increment the shared value
$powershell.AddScript({
    $hash.One++
}) | Out-Null
$handle = $powershell.BeginInvoke()
while (-not $handle.IsCompleted) {
    Start-Sleep -Milliseconds 100
}
$powershell.EndInvoke($handle)
Write-Host ('Value of $Hash.One after first execution in background runspace is {0}' -f $hash.One) -ForegroundColor Green -BackgroundColor Black

# Second execution in the same runspace: add 20 to the shared value
$powershell.Commands.Clear()
$powershell.AddScript({
    $hash.One += 20
}) | Out-Null
$handle = $powershell.BeginInvoke()
while (-not $handle.IsCompleted) {
    Start-Sleep -Milliseconds 100
}
$powershell.EndInvoke($handle)

# Clean up and show the final value
$runspace.Close()
$powershell.Dispose()
Write-Host ('Value of $Hash.One after second execution in background runspace is {0}' -f $hash.One) -ForegroundColor Green -BackgroundColor Black
```
Given that the issue is the load-order of modules, here's a simple solution in MSCloudLoginAssistant for the PnP-workload:
The last ErrorAction is for the fringe-case that the config doesn't contain any Graph-resources |
That sounds like a more elegant, but still temporary solution. @salbeck-sit Have you tested that importing the module is the resolution instead of connecting to Graph? |
I haven't tested it as such but the common issue is that each module contains a list of DLLs and the first version of a DLL loaded 'wins' - meaning that a subsequent load of a DLL with the same name is ignored, regardless of version. |
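One way to see which version of a DLL "won" in a given session is to list the assemblies already loaded into the process. The helper below is only a diagnostic sketch; the function name `Get-LoadedAssemblyVersion` is made up for this illustration, and which assembly names are worth filtering on for the PnP/Graph conflict is not established in this thread.

```powershell
# Hypothetical diagnostic helper: lists assemblies loaded in the current session.
function Get-LoadedAssemblyVersion {
    param(
        # Wildcard matched against the simple assembly name
        [string] $NamePattern = '*'
    )

    [System.AppDomain]::CurrentDomain.GetAssemblies() |
        Where-Object { $_.GetName().Name -like $NamePattern } |
        ForEach-Object {
            [pscustomobject]@{
                Name     = $_.GetName().Name
                Version  = $_.GetName().Version
                # Dynamic assemblies have no file location
                Location = $(if ($_.IsDynamic) { '' } else { $_.Location })
            }
        } |
        Sort-Object Name
}

# Usage: the PowerShell engine assembly is always loaded, so this returns at least one row
$loaded = @(Get-LoadedAssemblyVersion -NamePattern 'System.Management.Automation')
```

Running it with a broader pattern right after importing one module, and again after importing the other, shows from which module folder each shared DLL was actually loaded.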
Yes, the actual connect, or calling any other cmdlet for that matter, is not important in this case. What we need to ensure is that the modules are loaded in the correct order, because the first one wins, as @salbeck-sit mentioned.
I have tested whether importing is sufficient and you are absolutely right 😉 I will submit a PR for the MSCloudLoginAssistant to implement this workaround.
Created PR and requested @NikCharlebois to review and merge: |
@ykuijs do we expect to see the new whitepaper anytime soon? |
Description of the issue
When @salbeck-sit added TenantDefaultTimezone to SPOTenantSettings, this was working at first, but it looks like with Graph 2.19 it isn't working properly anymore: the resource can be exported but not deployed or tested afterwards, giving this error message
It looks like our old PnP version has DLLs that conflict with the latest Graph version, but the fix is simple enough: we just need to connect to Graph first, before connecting to SPO. I already tried this and it worked, so I'll raise a PR with a fix for this.
Microsoft 365 DSC Version
1.24.605.1
Which workloads are affected
SharePoint Online
The DSC configuration
Verbose logs showing the problem
Environment Information + PowerShell Version
N/A