Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The process cannot access the file NuCache.Content.db because it is being used by another process #5035

Closed
robertjf opened this issue Mar 21, 2019 · 182 comments

Comments

@robertjf
Copy link
Contributor

robertjf commented Mar 21, 2019

PR: #5924

Something is happening in Azure WebApps where the NuCache.Content.db file is locked causing the site to hang. I've attached the log file for reference.

The exception is as follows:

Umbraco.Core.Exceptions.BootFailedException: Boot failed: Umbraco cannot run. See Umbraco's log file for more details.

-> Umbraco.Core.Exceptions.BootFailedException: Boot failed.

-> System.IO.IOException: The process cannot access the file 'D:\home\site\wwwroot\App_Data\TEMP\NuCache\NuCache.Content.db' because it is being used by another process.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)
   at CSharpTest.Net.IO.TransactedCompoundFile..ctor(Options options)
   at CSharpTest.Net.Storage.BTreeFileStoreV2..ctor(Options options)
   at CSharpTest.Net.Collections.BPlusTree`2.OptionsV2.CreateStorage()
   at CSharpTest.Net.Collections.BPlusTree`2.NodeCacheBase..ctor(BPlusTreeOptions`2 options)
   at CSharpTest.Net.Collections.BPlusTree`2.NodeCacheNormal..ctor(BPlusTreeOptions`2 options)
   at CSharpTest.Net.Collections.BPlusTree`2..ctor(BPlusTreeOptions`2 ioptions)
   at Umbraco.Web.PublishedCache.NuCache.DataSource.BTree.GetTree(String filepath, Boolean exists)
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService..ctor(Options options, IMainDom mainDom, IRuntimeState runtime, ServiceContext serviceContext, IPublishedContentTypeFactory publishedContentTypeFactory, IdkMap idkMap, IPublishedSnapshotAccessor publishedSnapshotAccessor, IVariationContextAccessor variationContextAccessor, IUmbracoContextAccessor umbracoContextAccessor, ILogger logger, IScopeProvider scopeProvider, IDocumentRepository documentRepository, IMediaRepository mediaRepository, IMemberRepository memberRepository, IDefaultCultureAccessor defaultCultureAccessor, IDataSource dataSource, IGlobalSettings globalSettings, ISiteDomainHelper siteDomainHelper, IEntityXmlSerializer entitySerializer, IPublishedModelFactory publishedModelFactory, UrlSegmentProviderCollection urlSegmentProviders)

...

This is on Umbraco 8.0.1 which was upgraded from Umbraco 8.0.0. (attachment will need to be changed from .txt to.json)

UmbracoTraceLog.RD2818786B7D96.20190319.txt

@zpqrtbnk
Copy link
Contributor

Is this on Umbraco Cloud, or on pure Azure? Where is D:\home\site\wwwroot\App_Data\TEMP - is it on a shared drive, or on a local drive? Is it possible that another instance of the site, running on another server, is trying to access these files? These files can only be accessed by 1 instance at a time - we have mechanism in place to ensure this on one server but not accross servers.

@robertjf
Copy link
Contributor Author

This is in standard Azure on a basic plan with single instance only, there’s no special configuration at all, pretty vanilla Umbraco 8 with hacked up starter kit

@Shazwazza
Copy link
Contributor

Shazwazza commented Mar 26, 2019

The same config principles apply to v8 as v7 for running on Azure. We defo need to get the docs updated (PRs are welcome!) The current azure docs are here https://our.umbraco.com/Documentation/Getting-Started/Setup/Server-Setup/azure-web-apps

for v8 you will need these in appSettings:

  • <add key="Umbraco.Core.LocalTempStorage" value="EnvironmentTemp" />
  • <add key="Umbraco.Examine.LuceneDirectoryFactory" value="Examine.LuceneEngine.Directories.SyncTempEnvDirectoryFactory, Examine" />
    • Alternatively depending on your requirements there is also Examine.LuceneEngine.Directories.TempEnvDirectoryFactory

@robertjf
Copy link
Contributor Author

interestingly enough, there is no examine config files in this website - I hadn't noticed that before... is that normal? Out of the box shouldn't this just work? Do I need to add the files?

@Shazwazza
Copy link
Contributor

There is no examine config files in v8. OOTB, just like v7, you need to adjust some config to make umbraco work with azure. So the above 2 config values are needed. These are equivalent to the v7 appSetting umbracoLocalTempStorage and the examine config directoryFactory switch.

@Shazwazza
Copy link
Contributor

The 2 config values are just in your web.config, these are just appSetting key values in v8

@robertjf
Copy link
Contributor Author

ah, right - good to know :)

@robertjf
Copy link
Contributor Author

so is it right to assume these are going to be automatically added in future, or are they only in certain situations?

@Shazwazza
Copy link
Contributor

No you cannot assume that, Just like deploying v7 to Azure you need to setup specific config for that, it's no different in v8.

If however, you create an Umbraco website from the Azure portal, then yes these should be pre-configured for you but I don't think we have a v8 build on the Azure portal yet.

@JoseMarcenaro
Copy link
Contributor

Thanks for explaining the required config settings.

But in v8, even with the above settings, the exact error happens when you enable Slot Swap in the Azure App Service: before the swap both slots are working ok, but after the Swap the new "Production" slot throws the error every time. A full App Service restart is needed to make things working again.

If you work with a single slot everything is Ok,

@Shazwazza
Copy link
Contributor

That's interesting, will assume the same problem exists in v7 too since it's the same paradigm.

I think the only way around such behavior is to have an option to not have a persisted cache file, or name the cache file based on the AppDomainAppId + MachineName (should be unique among processes).

@zpqrtbnk what do you think here?

@JoseMarcenaro
Copy link
Contributor

Note: We have a site on Umbraco 7.13.2 running on Azure with slot swapping (and the "LocalTempStorage" ="EnvironmentTemp" app setting) and we never run into this problem with the XML cache file.
Maybe because it's not open all the time?

@zpqrtbnk
Copy link
Contributor

zpqrtbnk commented Apr 4, 2019

NuCache stores its file in Path.Combine(_globalSettings.LocalTempPath, "NuCache") and therefore respects the <add key="Umbraco.Core.LocalTempStorage" value="EnvironmentTemp" /> setting. The temp path combines the application id and the site name, but not the machine name, as the local temp is supposed to be local, ie per-server.

I don't know much about "slot swaps" in Azure. But, if they are running on the same machine, same application id, same site name, same temp dir then... there might be a collision? That would require some troubleshooting to get it right. But then, as you mention, it's not only NuCache but other things too.

Now indeed, the difference is that v8 locks the NuCache files for as long as it's running where v7 was "using" the Xml file from time to time = the two sites may cohabit (but that was a bad idea).

So... to make it short, I'd like to hear more about Slot Swaps and, on all slots, get the value of HostingEnvironment.SiteName, HostingEnvironment.ApplicationID, and %temp%.

@JoseMarcenaro
Copy link
Contributor

@zpqrtbnk thanks for looking into this. I will post today the requested information about the Slot Swaps and those values.

@JoseMarcenaro
Copy link
Contributor

This is what I got on each slot, before and after the swap:

<!-- STAGE slot, before swap

    HostingEnvironment.SiteName = 'terniumcomdev__e951'
    HostingEnvironment.ApplicationID = '/LM/W3SVC/1464357220/ROOT'
    %temp% = 'D:\local\Temp'
    -->
<!-- PRODUCTION slot, before swap
    HostingEnvironment.SiteName = 'terniumcomdev__dd91'
    HostingEnvironment.ApplicationID = '/LM/W3SVC/1742369212/ROOT'
    %temp% = 'D:\local\Temp'
    -->

<!-- STAGE slot, after swap
    HostingEnvironment.SiteName = 'terniumcomdev__dd91'
    HostingEnvironment.ApplicationID = '/LM/W3SVC/1742369212/ROOT'
    %temp% = 'D:\local\Temp'
    -->

<!-- PRODUCTION slot, after swap 
    HostingEnvironment.SiteName = 'terniumcomdev__e951'
    HostingEnvironment.ApplicationID = '/LM/W3SVC/1464357220/ROOT'
    %temp% = 'D:\local\Temp'
    -->

And this time the error was not thrown.
In a previous execution (with different slots) I got the following:

System.IO.IOException: The process cannot access the file 'D:\local\Temp\UmbracoData\854d45b396372eb551323cc722f33136\NuCache\NuCache.Content.db' because it is being used by another process.

I am positively sure that this setting was in effect when the error was thrown:

<add key="Umbraco.Core.LocalTempStorage" value="EnvironmentTemp" />

but this one was not (may this be the issue?)

<add key="Umbraco.Examine.LuceneDirectoryFactory" value="Examine.LuceneEngine.Directories.SyncTempEnvDirectoryFactory, Examine" />

In my latest test (no error) both settings were in effect.

@zpqrtbnk
Copy link
Contributor

zpqrtbnk commented Apr 5, 2019

The Lucene setting has no impact on NuCache. The fact that you are seeing an error such as The process cannot access the file 'D:\local\Temp\UmbracoData\854...136\NuCache\NuCache.Content.db' indicates that you are indeed using the proper local temp storage setting.

NuCache uses the built-in MainDom mechanism to ensure that only one app domain at a time can own the cache (and the associated files). MainDom uses a machine-wide named lock; the name is built by combining the application id and the application physical path.

(so obviously I should also have asked you the HostingEnvironment.ApplicationPhysicalPath values...)

A site cannot, in theory, even try to access the NuCache file until it has aquired the MainDom lock. Therefore, for the error to happen... the site must own the machine-wide lock on (app.id, app.path) and yet someone else must lock the files at (app.id, site.name), meaning

  • either there is something weird with app.path, or
  • "machine-wide" is the key here, the two slots don't run on the same physical server yet share the same diskspace

Any chance you can get the ApplicationPhysicalPath values?

And... I don't know enough about "slots" to figure out the physical server thing. Ideas?

@JoseMarcenaro
Copy link
Contributor

Just tested it. The ApplicationPhysicalPath value is always D:\home\site\wwwroot\ in both slots, before and after the swap.

Regarding the physical server... it is not fully documented, but many posts assume it is the same because all slots share the same App Service Plan = the same set of resources (i.e. 3.75 Gb RAM).

@zpqrtbnk
Copy link
Contributor

zpqrtbnk commented Apr 7, 2019

So... assuming that all sites run on the same physical server (for now), that leaves us with app.path.

D:\home\site\wwwroot\ is a kinda virtual path that Azure uses, so basically all apps are always under that path, and that can be annoying in some cases... but in our case, it should indeed make things even more safer.

Did not realize you posted a log file with the original issue - now looking at that file.

@JoseMarcenaro
Copy link
Contributor

JoseMarcenaro commented Apr 7, 2019

@zpqrtbnk - the log file is not mine, so it does not apply to the slot issue. Thanks.

@JoseMarcenaro
Copy link
Contributor

In any case, for some mysterious reason the "process cannot access the file NuCache.Content.db" error has not appeared lately - in all slots swaps during the last three days.
I will post more information - including a log - if it does happen again. Thanks for your help!

@zpqrtbnk
Copy link
Contributor

zpqrtbnk commented Apr 7, 2019

Thanks for the update.

Even though it's not your log... the significant lines are:

2019-03-19 03:16:49,695 [P9488/D3/T1] INFO   Umbraco.Core.Runtime.CoreRuntime - Booting Umbraco 8.0.0 on RD2818786B7D96. [Timing e8bc0a7]
2019-03-19 03:16:50,759 [P9488/D3/T1] INFO   Umbraco.Core.MainDom - Acquired.
2019-03-19 03:17:39,322 [P9488/D3/T17] INFO   Umbraco.Core.MainDom - Released (environment)
2019-03-19 03:25:35,501 [P9488/D4/T1] INFO   Umbraco.Core.Runtime.CoreRuntime - Booting Umbraco 8.0.0 on RD2818786B7D96. [Timing 761ae5c]
2019-03-19 03:25:36,074 [P9488/D4/T1] INFO   Umbraco.Core.MainDom - Acquired.
2019-03-19 03:27:45,813 [P9488/D4/T16] INFO   Umbraco.Core.MainDom - Released (environment)
2019-03-19 03:27:49,388 [P9488/D5/T1] INFO   Umbraco.Core.Runtime.CoreRuntime - Booting Umbraco 8.0.0 on RD2818786B7D96. [Timing 9ce9254]
2019-03-19 03:27:50,970 [P9488/D5/T1] INFO   Umbraco.Core.MainDom - Acquired.
2019-03-19 03:28:24,123 [P9488/D5/T1] INFO   Umbraco.Core.Runtime.CoreRuntime - Booted. (34755ms) [Timing 9ce9254]
2019-03-19 03:28:59,434 [P9472/D2/T1] INFO   Umbraco.Core.Runtime.CoreRuntime - Booting Umbraco 8.0.0 on RD2818786B7D96. [Timing 56405ae]
2019-03-19 03:29:06,148 [P9472/D2/T1] INFO   Umbraco.Core.MainDom - Acquired.
2019-03-19 03:29:31,320 [P9472/D2/T1] ERROR  Umbraco.Core.Runtime.CoreRuntime - Boot failed. (32637ms) [Timing 56405ae]
Umbraco.Core.Exceptions.BootFailedException: Boot failed. ---> System.IO.IOException: ...
2019-03-19 03:29:46,191 [P9488/D5/T13] INFO   Umbraco.Core.MainDom - Released (environment)

Where we see that a new process (9472) starts while the previous one (9488) is running, and acquires the MainDom lock before the previous process releases it, thus hitting an exception when trying to read the cache. That should not be possible...

If anything happens again, thanks for reporting. Meanwhile, I've made a few changes so that in 8.0.2 we log all the important infos (app.path, app.id, etc) when the site boots.

@JoseMarcenaro
Copy link
Contributor

JoseMarcenaro commented Apr 9, 2019

Hi @zpqrtbnk
Some more info (the exception is being thrown again in my Azure environment).
A log is attached, the same behavior you described happens in the last lines of the log.

Note that in this case:

  • This is not some kind of "race condition": twelve minutes passed between the first lock acquire and the following attempt.
  • There is no lock release nor application shutdown between the first acquisition (at 18:45) and the following attempt (at 18:59)

UmbracoTraceLog.RD2818784FF824.20190409.zip

I'm running v8.0.1 installed via the nuget.org package.

Hope it helps.

@JoseMarcenaro
Copy link
Contributor

JoseMarcenaro commented Apr 9, 2019

@zpqrtbnk - I could further isolate the problem.

I performed the following steps:

  • at 19:33 GMT stopped the "stage" slot and deleted the log file
  • then I started the app (on the stage slot) and got into Umbraco - that is the start of the attached log file, at 19:35
  • then I "went to play outside" and did nothing with the app for twenty minutes (I am the only user, for now). I checked the log, and nothing had happen during that time.
  • at 19:55 I performed a slot swap (line 16 of the log)
    ... and voilà! the app successfully acquired a new lock, with a different PID, without shutting down the previous instance.

So it looks like this is related to the Azure Slot swap - although it is the same machine, for some reason the process starts over without having shut down the previous one.

UmbracoTraceLog.RD2818784FF824.20190409.zip

@JoseMarcenaro
Copy link
Contributor

One last comment about the slot swap process:
the "swapped" Web app remains the same, in the same machine .. but the URLs that hit it change (i.e. from contoso-stage.azurewebsites.net to contoso.azurewebsites.net). I don't know if immediately after the switch the Azure infrastructure sends some kind of wakeup call.

@zpqrtbnk
Copy link
Contributor

Hey - thanks for the details - just FYI I am away at the Barcelona meetup, with little time for this issue - will resume work on it at the end of the week (so don't feel bad if I don't reply).

@zpqrtbnk
Copy link
Contributor

zpqrtbnk commented Apr 12, 2019

Have looked at your logfile and... here is the thing: the new process (P564) is starting for a different application (since you swapped). So P8108 was for the other application. It does not have to release its lock for P564 to acquire it... OTOH the lock has to be released... but that should happen on the other slot. And so... I would love to see the other log. Is this possible?

Note that it still does not explain what is going on with the cache. I am currently running some experiments with Azure AppServices to try to get a better understanding of it all.

Or maybe not - further experimenting.

@zpqrtbnk
Copy link
Contributor

I have created an Azure App Service hosting a simple MVC web app, with two slots. Had to have one per-slot appSetting in order to force slot-swaps to trigger a restart. Then, have our MainDom system and some temp file lock to the app... and I can swap slots, and see the MainDom lock being properly released, and swaps happening without issues.

I have to accept that something is wrong, considering your log, but for now I am running out of idea.

During the swap, on my test, the old process terminates and then a new process starts. In your log, the new process starts before the old process terminates, but that should not be a problem: this is precisely why we have the MainDom lock.

Is this happening every time you swap, or only from time to time? Is this on a production / live site? To which extend would it be possible for you to run some custom DLLs which would include more tracing / logging?

@JoseMarcenaro
Copy link
Contributor

@zpqrtbnk ,
The error is no longer happening.
By the time I reported this and send you the log, it was happening on every swap.

I did not change anything in Umbraco since then - the same package 8.0.1 is used.

I'm almost sure the difference is that, while trying to workaround the error, I completely disabled Application Insights on Azure - it was enabled by default when I first created the App Service, and I did not opt out of it as I usually do.

To verify if this was the cause I enabled again Application Insights in both slots and something strange happened:

  • On both slots, immediately after installing Application Insights, I went into the site and the same NuCache locking error was thrown - no swap involved, just installing Application Insights which restarts the app. I'm attaching this log, just in case. UmbracoTraceLog.RD2818784FF824.20190413.zip. Error is at the end of the log.
  • I temporarily fixed the error by stopping and starting the App Service, the site started normally.
  • Further swaps between slots worked fine with no errors

So looking in retrospect, now my thoughts are:

  • the error is not related to the swap itself, but to the restart after Application Insights is installed or removed.
  • the reason why this happened to me on every swap is that when creating a stage slot for an app service that has Application Insights installed, some App Settings are duplicated but the feature is not correctly installed in the new slot - so each swap means activating / deactivating Application Insights

Current situation:

  • I can run the site normally (no errors) with or without Application Insights. So it's not a priority for me that you spend more time on the issue.
  • On the other hand, if you want to get to the bone of what happens, I can gladly help by installing any DLL that collects additional information. This is not a production environment yet. And I found a way to reproduce the issue: Enable Application Insights on an App Service that doesn't have it.

Thanks for your help!

@zpqrtbnk
Copy link
Contributor

Thanks for keeping up with the detective work ;-) I have tried to experiment with enabling/disabling AppliCation Insights on my test app, but still cannot reproduce the out-of-order lock management that you see.

Happy that it works for you, but it annoys me.

Will lower the priority of this issue, but still, will try to send you a DLL to test, that would log way more details about what is going on. Stay tuned!

@zpqrtbnk
Copy link
Contributor

This: Debug.zip contains a patched DLL that just logs infos when acquiring the main lock (process id, but also user name, app id and maindom hash) - if you have a moment to try it...

@Shazwazza
Copy link
Contributor

There will always be more than one app domain in netframework (since very early days) when an appdomain 'restarts' because when you bump the web.config or restart, another appdomain starts while the current one is unwinding. An appdomain cannot be 'restarted', it can just be started or shutdown, netframework makes us think it's 'restarting' but it's just another appdomain starting while another is ending. You can in fact have several appdomains running at once if you have a constant shutdown loop. IIS also aids in this too and depends on how it's configured. This behavior is baked into netframework. This is why it's difficult to synchronize access to files in netframework that maintain locks (i.e. lucene, nucache, etc...). This is why MainDom exists. The default MainDom uses a system-wide semaphore named lock. This type of lock doesn't work in Azure or linux which is why we use the DB as the distributed lock with SqlMainDomLock.

MainDom will depend on your setup but sounds like perhaps SqlMainDomLock should be used in your scenario.

Keep in mind, you cannot 'share' (i.e. network share) files between 2 IIS instance without some very particular configurations.

If you switch the underlying physical path of a site while it's running (not sure how that can be done without a restart), then it won't have access to the original files that are locked in order to unlock them. File locks are always done by the OS. Once they are locked, the OS will keep them locked until the original thing that locked them unlocks them.

If you use the

    value="SqlMainDomLock" />
<add
    key="Umbraco.Core.LocalTempStorage"
    value="EnvironmentTemp" />

like you would on azure, this will store these types of files in the current processes %TEMP% directory. This is set using environment variables so it's possible you can redirect this anyways. I don't really understand why making each individual folder for logs, lucene and nucache would help in any way.

This locking issue to me sounds like an hosting configuration issue due to how deployments are being done.

@enkelmedia
Copy link
Contributor

enkelmedia commented May 27, 2021

I do see the scenario that one app domain is in the process of shutting down while another starts but are you saying that its normal for the startup to run in two different app domains at the same time like my logs is showing? Or is this the shutdown loop that you are referring to? The logs that you see is from one website on one machine, nothing shared over network etc. But the "closing" app domain (which is the same website) of course logs to the same folder.

The change of the "Physical Folder"-path of the website in IIS will trigger a restart, I can see in the logs that locks are released for the MainDom-lock (log says something like "MainDom lock released". But during the startup the boot process throws because the NuCache files is still in use by something. But looking at the implementation in the Umbraco source I can see that the release-code does dispose the BTree-classes so given that - files should be "free/released" when the site start. The thing that make me wonder is that it seems from the logs as if two different app domains is starting at the same time, this could potentially be the issue, that one of these keeps the lock on the files while the other fails. But I don't understand why two app domains would start at the same time.

I mean one option for us is to rebuild the NuCache from db after each release - this would work fine for smaller sites but the current site we're migrating has lots of content (nucache-files around 25mb) so it would be very nice to find a way to avoid having to rebuild this during startup. But given that this issue is hard to nail, if we rebuild the NuCache its useful to be able to put ie. the Examine indexes in another folder to avoid having to rebuild this as well (I know that the LuceneDirectoryFactory can solve this). An interesting thing here is that in the setup that we have problem with here both share NuCache and ExamineFiles between the released - but it's always the NuCache files that are used by something else and that make the newly started website throw.

But at the end of the day, this does not look like a MainDom-issue more a issue with the NuCache files and locks on them, either because they are not released or because something in the startup locks them.

Edit:
@Shazwazza I would be happy to share some of the logs files, maybe your trained eye can spot something that I miss. Let me know if you would like to have a look and where to send them.

@enkelmedia
Copy link
Contributor

Looking more at this now.

I actually created a deploy-script that will copy the NuCache-folder and the DistCache-folder from TEMP in from the old release-folder to the new release-folder. This to avoid the lock-issues on the files. Still, the startup will run on 3-4 different AppDomains simultaneously, some of the App domains will say that the local db does not exist (but I'm 100% sure that the file is there) and the last AppDomain start will find the cache, still the first 3 AppDomains will try to build the cache again. This happens both when deploying via Octopus and when just recycling the App Pool.

This only happens when the site is under load, the stage-environment works as I would expect, using the copied temp-files. I'm actually not sure if this is a Umbraco issue, it feels strange that the AppDomains Init-method is executed multiple times simultaneously as this should not be the case according to the docs it should only run for the first request to the app after restart.

Another thing that is also interesting is that we get about 300-400 log files outputed in the log directory during the 60-70 seconds it takes for the site to start again efter the deploy

image

The problem with multiple log files is something that I have not seen in prev. version ie. we run a similar site on 8.8.1 with quite some load and this site only outputs one log file. I will admit that this is on a different server, different IIS etc.

@John-Blair
Copy link
Contributor

Could you maybe take a step back and simplify your release? - as your approach is definitely non-standard - doing anything non-standard is likely to get you into a death spiral situation. My approach is just - stop the website, copy new release into previous release folder overwriting existing files (so you automatically get re-use of the existing TEMP files for nucache and examine - which are not included in the new release), and start the website back up. That's it. No messing about with recycling app pools or copying folders of temp files or re-pointing IIS to a new folder. Is there a good reason you cannot take my approach?

@enkelmedia
Copy link
Contributor

enkelmedia commented May 28, 2021

@John-Blair The approach is the default behavior in Octopus Deploy, it's basically creating a new folder with the new release and changed the Websites "Phisycal Folder" property to point to the new folder. The thing is that we would like to avoid stopping the server before to avoid ugly errors for visitors, our goal is to make the folder-switch and have the site restart again - a visitor would just interpret this as a slow load of the page. It's not a manual recycle but something that happens as a side-effect of the folder-switch - as you might know a change to settings on AppPools or websites can result in restarts of the webapp.

@Shazwazza So any way, I continued my research around this, from what I understand, the deployment-process with Octopus will make multiple changes to different config files after the "Physical Folder"-change. This will result in multiple very rapid restarts of the Website and since the site is under load this means that after each config-change a small number of request might be start to be processed. Since IIS will continue to serve these request in the AppDomain that received the requests it starts to boot Umbraco in multiple AppDomains. The booting of Umbraco does not lock between AppDomains so each AppDomain (that is a result of numerus config changes by Octopus) will continue to boot Umbraco.

I guess this is why I see a lot of log files during startup, I've tried to create a custom build with a lock around the this code

        internal void HandleApplicationStart(object sender, EventArgs evargs)
        {
            // ******** THIS IS WHERE EVERYTHING BEGINS ********
            
            lock (_startupLock)
            {
                // create the register for the application, and boot
                // the boot manager is responsible for registrations
                var register = GetRegister();
                _runtime = GetRuntime();
                _runtime.Boot(register);
            }
        }

But this did not work netither withe a MarshalByRefObject or a Mutex - i really don't get why its booting multiple times.

At the end of the day, the way that we are deploying is kind of special and the fact that the site starts many times in a short period of time is probably something that is very hard to handle in the Umbraco core code.

As of now, I'll stick with my copy of the Cache and let the boot rebuild this ad random times - the lock issue is gone after we copy the cache before each release.

@John-Blair
Copy link
Contributor

ok....fyi we used octopus deploy - 5 years ago not now - and we did not create a new folder for each release we re-used the same folder - so maybe checkout the config options on the version of octopus deploy you are using.

@John-Blair
Copy link
Contributor

wrt not stopping the server...the way we approached that was have the site running on 2 servers and we changed the in-house dns config to switch to the standby server (which ran in read only mode) during an upgrade and then switch back once the upgrade was done and tested.

@enkelmedia
Copy link
Contributor

@John-Blair Each release gets it own folder so that there is always a easy way to rollback to a working state if something goes wrong. Octopus would basically switch the "Phycial Path" of the website back to the old release if we rollback. This is standard Octopus setup with IIS and we will not deploy to the same folder - that is not an option for us.

We could stop the website before the release but this would mean that the site starts returning "Service Unavailable" to visitors and the total downtime will be longer using this approach. It would be an option but a bad option, the root problem is that the locks on the files does not get released when the app pool recycle, we have plenty of ways to work around that but I guess that the main idea with this issue here on GitHub is to figure out and solve the root cause and not provide work arounds.

Sounds like a nice setup with the DNS-switch but in our case we have around 15-20 sites and I would like to keep the deploy process as simple as possible.

@Oxygen-cl
Copy link

Hi,
We are still experiencing issues on Azure solutions.
Examine is having troubles opening and closing all the time.
We have made the azure config changes, and upgraded to v8.15.1

The errors occures on app restarts, deploys and on reindexing.

We are seeing a couple of errors in the logs.
1: Cannot access the NuCache file:
System.IO.IOException: The process cannot access the file 'D:\local\Temp\UmbracoData\147617b457ab74697cdb627fc1c4d9bb\NuCache\NuCache.Content.db' because it is being used by another process. at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath) at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost) at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options) at CSharpTest.Net.IO.TransactedCompoundFile..ctor(Options options) at CSharpTest.Net.Storage.BTreeFileStoreV2..ctor(Options options) at CSharpTest.Net.Collections.BPlusTree2.OptionsV2.CreateStorage() at CSharpTest.Net.Collections.BPlusTree2.NodeCacheBase..ctor(BPlusTreeOptions2 options) at CSharpTest.Net.Collections.BPlusTree2.NodeCacheNone..ctor(BPlusTreeOptions2 options) at CSharpTest.Net.Collections.BPlusTree2..ctor(BPlusTreeOptions2 ioptions) at Umbraco.Web.PublishedCache.NuCache.DataSource.BTree.GetTree(String filepath, Boolean exists, ContentDataSerializer contentDataSerializer) at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.MainDomRegister() at Umbraco.Core.Runtime.MainDom.Register(Action install, Action release, Int32 weight) at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.b__36_0() at System.Threading.LazyInitializer.EnsureInitializedCore[T](T& target, Boolean& initialized, Object& syncLock, Func1 valueFactory) at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.EnsureCaches() at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.CreatePublishedSnapshot(String previewToken) at System.Lazy1.CreateValue() at System.Lazy1.LazyInitValue() at DIF.UmbracO2.Services.DIFContentService.GetGlobalFolder(UmbracoContext umbracoContext) at DIF.UmbracO2.CustomIndexes.EmployeeIndex.EmployeeIndexPopulator.PopulateEmployees() at DIF.UmbracO2.CustomIndexes.EmployeeIndex.EmployeeIndexPopulator.PopulateIndexes(IReadOnlyList1 indexes) at Umbraco.Examine.IndexRebuilder.RebuildIndex(String indexName) at DIF.UmbracO2.Services.ViaRitzauImportService.StartPressImport(UmbracoContext umbracoContext)`

2: Reader is closed:
2021-08-31_8-30-00

  1. Azure error:
    High number of active threads processing ASP.NET requests
    Azure App Services detected that 166 threads are processing an ASP.NET request. High number of threads in a process can lead to thread pool starvation and context switching which lead to high CPU.
    GCFrame GCFrame HelperMethodFrame Examine.LuceneEngine.Providers.LuceneSearcher+ReaderReopener.ScheduleReopen Examine.LuceneEngine.Providers.LuceneSearcher.ValidateSearcher Examine.LuceneEngine.Providers.LuceneSearcher.GetAllIndexedFields Examine.LuceneEngine.Providers.BaseLuceneSearcher.CreateQuery Examine.LuceneEngine.Providers.BaseLuceneSearcher.CreateQuery DIF.UmbracO2.Services.NewsService.GetNewsList DIF.UmbracO2.Controllers.Api.NewsSearchController.GetNewsList DynamicClass.lambda_method System.Web.Http.Controllers.ReflectedHttpActionDescriptor+ActionExecutor+<>c__DisplayClass6_2.b__2 System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ExecuteAsync System.Web.Http.Controllers.ApiControllerActionInvoker+d__1.MoveNext System.Runtime.CompilerServices.AsyncTaskMethodBuilder1[[System.__Canon mscorlib]].Start[[System.Web.Http.Controllers.ApiControllerActionInvoker+d__1 System.Web.Http]]
    System.Web.Http.Controllers.ApiControllerActionInvoker.InvokeActionAsyncCore
    System.Web.Http.Controllers.ApiControllerActionInvoker.InvokeActionAsync
    System.Web.Http.Controllers.ActionFilterResult+d__5.MoveNext
    System.Runtime.CompilerServices.AsyncTaskMethodBuilder1[[System.__Canon mscorlib]].Start[[System.Web.Http.Controllers.ActionFilterResult+d__5 System.Web.Http]] System.Web.Http.Controllers.ActionFilterResult.ExecuteAsync System.Web.Http.Filters.AuthorizationFilterAttribute+d__3.MoveNext System.Runtime.CompilerServices.AsyncTaskMethodBuilder1[[System.__Canon mscorlib]].Start[[System.Web.Http.Filters.AuthorizationFilterAttribute+d__3 System.Web.Http]]
    System.Web.Http.Filters.AuthorizationFilterAttribute.ExecuteAuthorizationFilterAsyncCore
    System.Web.Http.Filters.AuthorizationFilterAttribute.System.Web.Http.Filters.IAuthorizationFilter.ExecuteAuthorizationFilterAsync
    System.Web.Http.Controllers.AuthorizationFilterResult.ExecuteAsync
    System.Web.Http.ApiController.ExecuteAsync
    System.Web.Http.Dispatcher.HttpControllerDispatcher+d__15.MoveNext
    System.Runtime.CompilerServices.AsyncTaskMethodBuilder1[[System.__Canon mscorlib]].Start[[System.Web.Http.Dispatcher.HttpControllerDispatcher+d__15 System.Web.Http]] System.Web.Http.Dispatcher.HttpControllerDispatcher.SendAsync System.Net.Http.HttpMessageInvoker.SendAsync System.Web.Http.Dispatcher.HttpRoutingDispatcher.SendAsync System.Net.Http.DelegatingHandler.SendAsync System.Web.Http.HttpServer+d__24.MoveNext System.Runtime.CompilerServices.AsyncTaskMethodBuilder1[[System.__Canon mscorlib]].Start[[System.Web.Http.HttpServer+d__24 System.Web.Http]]
    System.Web.Http.HttpServer.SendAsync
    System.Net.Http.HttpMessageInvoker.SendAsync
    System.Web.Http.WebHost.HttpControllerHandler+d__12.MoveNext
    System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[System.Web.Http.WebHost.HttpControllerHandler+d__12 System.Web.Http.WebHost]]
    System.Web.Http.WebHost.HttpControllerHandler.ProcessRequestAsyncCore
    System.Web.TaskAsyncHelper.BeginTask
    System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute
    System.Web.HttpApplication.ExecuteStepImpl
    System.Web.HttpApplication.ExecuteStep
    System.Web.HttpApplication+PipelineStepManager.ResumeSteps
    System.Web.HttpApplication.BeginProcessRequestNotification
    System.Web.HttpRuntime.ProcessRequestNotificationPrivate
    System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper
    System.Web.Hosting.PipelineRuntime.ProcessRequestNotification
    DomainNeutralILStubClass.IL_STUB_ReversePInvoke
    InlinedCallFrame
    InlinedCallFrame
    DomainNeutralILStubClass.IL_STUB_PInvoke
    System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper
    System.Web.Hosting.PipelineRuntime.ProcessRequestNotification
    DomainNeutralILStubClass.IL_STUB_ReversePInvoke
    ContextTransitionFrame
    `

@jaandrews
Copy link
Contributor

I'm also running into this nucache error with a load balanced azure site. In my case, I suspect it's a latency issue, as the database in based in the US, but the only front end server that has had this issue is in France. If that is the cause, one potential solution would to add support for azure cosmos. Not sure that is feasible to implement, since it is a nosql database, but it should improve database read speeds for multi region sites.

@p-m-j
Copy link
Contributor

p-m-j commented Feb 15, 2022

@jaandrews which maindomlock implemenation are you using?

For azure this should be configured to SqlMainDomLock

@jaandrews
Copy link
Contributor

@p-m-j
I just have

<add key="Umbraco.Core.MainDom.Lock" value="SqlMainDomLock" />

in the web config. Site is running umbraco 8.17.1 with cloudflare acting as the load balancer in front of azure.

@p-m-j
Copy link
Contributor

p-m-j commented Feb 17, 2022

@jaandrews presumably you also have EnvironmentTemp for LocalTempStorageLocation

Any other details you can provide about the issue, is this on startup or just happens with no obvious trigger?
Would you be able to send stack traces / logs?

@jaandrews
Copy link
Contributor

@p-m-j Yeah, LocalTempStorage is already set to EnvironmentTemp. As for the error I got, it's the same one as at the beginning of this thread. After taking a second look at that log, I confirmed the error I encounter stems from a timeout error. The error message is "The thread has been aborted, because the request has timed out." which is then followed by the nucache error (see
here for the error and the log messages leading up to it). From the logs it seems like it's timing out attempting to read a file (assuming it's the nucache file). Not sure what, if anything I can tweak to avoid that.

@chriskarkowsky
Copy link

chriskarkowsky commented Feb 18, 2022

We received this error yesterday as well with our Azure App Service. App was running, then failed to boot. A complete restart of the App Service was necessary to fix this.

"Umbraco.Core.Exceptions.BootFailedException: Boot failed: Umbraco cannot run. See Umbraco's log file for more details.\n\n-> Umbraco.Core.Exceptions.BootFailedException: Boot failed.\n\n-> System.IO.IOException: The process cannot access the file 'C:\\local\\Temp\\UmbracoData\\be080f813588b61e6bd0c5b32b217f8b\\NuCache\\NuCache.Content.db' because it is being used by another process.

<add key="Umbraco.Examine.LuceneDirectoryFactory" value="Examine.LuceneEngine.Directories.SyncTempEnvDirectoryFactory, Examine" /> <add key="Umbraco.Core.LocalTempStorage" value="EnvironmentTemp" />

Are both within the web.config, and

Umbraco.Core.MainDom.Lock = SqlMainDomLock
is set within the Application Settings.

@jaandrews
Copy link
Contributor

@chriskarkowsky Is there a timeout error in front of that error in your logs (would look something like the logs I linked to in my previous comment). If so increasing the executionTimout of the httpRuntime in the web config might help. I've been in touch with umbraco support and that's one of the solutions they've suggested. Makes sense, but I still need to try it myself. So it would be something like

<httpRuntime executionTimeout="1800"/>

which would have it timeout after 30 minutes. Not ideal, but would avoid the site crashing if it works as expected.

@p-m-j p-m-j reopened this Feb 18, 2022
@chriskarkowsky
Copy link

@jaandrews I have this timeout exception:
{"@t":"2022-02-17T20:48:44.8174278Z","@mt":"Unhandled controller exception occurred for request '{RequestUrl}'","@l":"Error","@x":"System.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out\r\n at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction)\r\n at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)\r\n `

However the timestamp was about an hour before the Boot Failure messages. It seems your case was right before the Boot Failure.

What I find strange is that has a Boot Failure log on the same ProcessID as it was currently running on. This means Umbraco is trying to boot itself again on the same Process? Restarting the App Service would spin up a new process and boot, which is why i'm guessing this what fixed my case.

@robertjf
Copy link
Contributor Author

Just wanted to note that we have been experiencing this issue consistently over the past week with v9.3.0 not only on an Azure AppService, but also in local development (VS2022). The only thing that made any difference was disabling the Disk Cache databases as per the documentation on Umbraco. (https://our.umbraco.com/Documentation/Reference/V9-Config/NuCacheSettings/)

@zBestData
Copy link

Just wanted to note that we have been experiencing this issue consistently over the past week with v9.3.0 not only on an Azure AppService, but also in local development (VS2022). The only thing that made any difference was disabling the Disk Cache databases as per the documentation on Umbraco. (https://our.umbraco.com/Documentation/Reference/V9-Config/NuCacheSettings/)

Thank you! I have been having the same problem with Version 9.3.0!

@robertjf
Copy link
Contributor Author

I can't help feeling that something changed in 9.3.0 that has exacerbated the issue - we only started seeing this behaviour when we upgraded from the previous version...

@robertjf
Copy link
Contributor Author

robertjf commented Feb 24, 2022

I just had this re-occur in Umbraco Cloud on Umbraco 8.17.2:

Message:
An unhandled exception occurred

									
Exception:
System.IO.IOException: The process cannot access the file 'C:\local\Temp\UmbracoData\748c65bf441f4d693b51602dbfa2e300\NuCache\NuCache.Content.db' because it is being used by another process.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)
   at CSharpTest.Net.IO.TransactedCompoundFile..ctor(Options options)
   at CSharpTest.Net.Storage.BTreeFileStoreV2..ctor(Options options)
   at CSharpTest.Net.Collections.BPlusTree`2.OptionsV2.CreateStorage()
   at CSharpTest.Net.Collections.BPlusTree`2.NodeCacheBase..ctor(BPlusTreeOptions`2 options)
   at CSharpTest.Net.Collections.BPlusTree`2.NodeCacheNone..ctor(BPlusTreeOptions`2 options)
   at CSharpTest.Net.Collections.BPlusTree`2..ctor(BPlusTreeOptions`2 ioptions)
   at Umbraco.Web.PublishedCache.NuCache.DataSource.BTree.GetTree(String filepath, Boolean exists, ContentDataSerializer contentDataSerializer)
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.MainDomRegister()
   at Umbraco.Core.Runtime.MainDom.Register(Action install, Action release, Int32 weight)
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.<EnsureCaches>b__36_0()
   at System.Threading.LazyInitializer.EnsureInitializedCore[T](T& target, Boolean& initialized, Object& syncLock, Func`1 valueFactory)
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.EnsureCaches()
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.CreatePublishedSnapshot(String previewToken)
   at System.Lazy`1.CreateValue()
   at System.Lazy`1.LazyInitValue()
   at Umbraco.Web.UmbracoInjectedModule.EnsureHasContent(UmbracoContext context, HttpContextBase httpContext)
   at Umbraco.Web.UmbracoInjectedModule.EnsureUmbracoRoutablePage(UmbracoContext context, HttpContextBase httpContext)
   at Umbraco.Web.UmbracoInjectedModule.ProcessRequest(HttpContextBase httpContext)
   at System.Web.HttpApplication.SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
   at System.Web.HttpApplication.ExecuteStepImpl(IExecutionStep step)
   at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)

reading through the logs, I see this a little further down right before I get the NuCache errors above (~ every 4 seconds):

									
Message:
Exception ("4aa0a49a").
The thread has been aborted, because the request has timed out.

									
Exception:
System.Threading.ThreadAbortException: Thread was being aborted.
   at SNIReadSyncOverAsync(SNI_ConnWrapper* , SNI_Packet** , Int32 )
   at SNINativeMethodWrapper.SNIReadSyncOverAsync(SafeHandle pConn, IntPtr& packet, Int32 timeout)
   at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()
   at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()
   at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()
   at System.Data.SqlClient.TdsParserStateObject.TryReadByteArray(Byte[] buff, Int32 offset, Int32 len, Int32& totalRead)
   at System.Data.SqlClient.TdsParserStateObject.TryReadString(Int32 length, String& value)
   at System.Data.SqlClient.TdsParser.TryReadSqlStringValue(SqlBuffer value, Byte type, Int32 length, Encoding encoding, Boolean isPlp, TdsParserStateObject stateObj)
   at System.Data.SqlClient.TdsParser.TryReadSqlValue(SqlBuffer value, SqlMetaDataPriv md, Int32 length, TdsParserStateObject stateObj, SqlCommandColumnEncryptionSetting columnEncryptionOverride, String columnName)
   at System.Data.SqlClient.SqlDataReader.TryReadColumnInternal(Int32 i, Boolean readHeaderOnly)
   at System.Data.SqlClient.SqlDataReader.TryReadColumn(Int32 i, Boolean setTimeout, Boolean allowPartiallyReadColumn)
   at System.Data.SqlClient.SqlDataReader.GetValues(Object[] values)
   at NPoco.RowMappers.PropertyMapper.Map(DbDataReader dataReader, RowMapperContext context)
   at NPoco.MappingFactory.Map(DbDataReader dataReader, Object instance)
   at NPoco.Database.<Read>d__156`1.MoveNext()
   at NPoco.Database.<QueryImp>d__164`1.MoveNext()
   at Umbraco.Core.Persistence.NPocoDatabaseExtensions.<QueryPaged>d__8`1.MoveNext()
   at Umbraco.Web.PublishedCache.NuCache.DataSource.DatabaseDataSource.<GetAllContentSources>d__13.MoveNext()
   at Umbraco.Web.PublishedCache.NuCache.ContentStore.SetAllFastSortedLocked(IEnumerable`1 kits, Boolean fromDb)
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.LoadContentFromDatabaseLocked(IScope scope, Boolean onStartup)
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.LockAndLoadContent(Func`2 action)
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.<EnsureCaches>b__36_0()

@robertjf
Copy link
Contributor Author

@p-m-j I'm because this issue is re-opened, does that mean that Umbraco are actively investigating again?

@p-m-j
Copy link
Contributor

p-m-j commented Feb 24, 2022

Apologies, re-opened on a whim because it's clear from the activity there are still problems in this area, I shouldn't have re-opened as this issue already has an associated fix with a version number.

If you have steps to reproduce please open a new issue.

In this case, we fixed #5924 for version 8.1.1 - which wasn't enough, but that's why this issue is closed and included in the release notes.

Sorry again for any confusion.

@p-m-j p-m-j closed this as completed Feb 24, 2022
@iqb-dawn
Copy link

@p-m-j we are using Umbraco version 8.12.2. The site was running since 8 months until this morning we got this issue,

{
    "@t": "2022-03-17T08:17:25.2359719Z",
    "@mt": "{StartMessage} [Timing {TimingId}]",
    "StartMessage": "Resolving composer types.",
    "TimingId": "9a80bb3",
    "SourceContext": "Umbraco.Core.Runtime.CoreRuntime",
    "ProcessId": 12564,
    "ProcessName": "w3wp",
    "ThreadId": 1,
    "AppDomainId": 2,
    "AppDomainAppId": "LMW3SVC1154430189ROOT",
    "MachineName": "DW0SDWK0003EO",
    "Log4NetLevel": "INFO ",
    "HttpRequestNumber": 1,
    "HttpRequestId": "5104ab84-400a-4e30-8512-bd4ae8199a7c"
}
{
    "@t": "2022-03-17T08:17:33.4274703Z",
    "@mt": "{EndMessage} ({Duration}ms) [Timing {TimingId}]",
    "EndMessage": "Resolved.",
    "Duration": 8191,
    "TimingId": "9a80bb3",
    "SourceContext": "Umbraco.Core.Runtime.CoreRuntime",
    "ProcessId": 12564,
    "ProcessName": "w3wp",
    "ThreadId": 1,
    "AppDomainId": 2,
    "AppDomainAppId": "LMW3SVC1154430189ROOT",
    "MachineName": "DW0SDWK0003EO",
    "Log4NetLevel": "INFO ",
    "HttpRequestNumber": 1,
    "HttpRequestId": "5104ab84-400a-4e30-8512-bd4ae8199a7c"
}
{
    "@t": "2022-03-17T08:17:49.3132993Z",
    "@mt": "{FailMessage} ({Duration}ms) [Timing {TimingId}]",
    "@l": "Error",
    "@x": "Umbraco.Core.Exceptions.BootFailedException: Boot failed. ---> System.IO.IOException: The process cannot access the file 'C:\\local\\Temp\\UmbracoData\\a41b629950826bb5a8cb20844bdcd974\\NuCache\\NuCache.Content.db' because it is being used by another process.\r\n   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)\r\n   at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)\r\n   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)\r\n   at CSharpTest.Net.IO.TransactedCompoundFile..ctor(Options options)\r\n   at CSharpTest.Net.Storage.BTreeFileStoreV2..ctor(Options options)\r\n   at CSharpTest.Net.Collections.BPlusTree`2.OptionsV2.CreateStorage()\r\n   at CSharpTest.Net.Collections.BPlusTree`2.NodeCacheBase..ctor(BPlusTreeOptions`2 options)\r\n   at CSharpTest.Net.Collections.BPlusTree`2.NodeCacheNone..ctor(BPlusTreeOptions`2 options)\r\n   at CSharpTest.Net.Collections.BPlusTree`2..ctor(BPlusTreeOptions`2 ioptions)\r\n   at Umbraco.Web.PublishedCache.NuCache.DataSource.BTree.GetTree(String filepath, Boolean exists) in D:\\a\\1\\s\\src\\Umbraco.Web\\PublishedCache\\NuCache\\DataSource\\BTree.cs:line 27\r\n   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService.MainDomRegister() in D:\\a\\1\\s\\src\\Umbraco.Web\\PublishedCache\\NuCache\\PublishedSnapshotService.cs:line 174\r\n   at Umbraco.Core.Runtime.MainDom.Register(Action install, Action release, Int32 weight) in D:\\a\\1\\s\\src\\Umbraco.Core\\Runtime\\MainDom.cs:line 89\r\n   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService..ctor(PublishedSnapshotServiceOptions options, IMainDom mainDom, IRuntimeState runtime, ServiceContext serviceContext, IPublishedContentTypeFactory publishedContentTypeFactory, IdkMap idkMap, IPublishedSnapshotAccessor publishedSnapshotAccessor, IVariationContextAccessor variationContextAccessor, IProfilingLogger logger, IScopeProvider scopeProvider, IDocumentRepository documentRepository, IMediaRepository mediaRepository, IMemberRepository memberRepository, IDefaultCultureAccessor defaultCultureAccessor, IDataSource dataSource, IGlobalSettings globalSettings, IEntityXmlSerializer entitySerializer, IPublishedModelFactory publishedModelFactory, UrlSegmentProviderCollection urlSegmentProviders) in D:\\a\\1\\s\\src\\Umbraco.Web\\PublishedCache\\NuCache\\PublishedSnapshotService.cs:line 128\r\n   at DynamicMethod(Object[] )\r\n   at LightInject.ServiceContainer.<>c__DisplayClass150_0.<WrapAsFuncDelegate>b__0() in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3798\r\n   at LightInject.ServiceContainer.<>c__DisplayClass198_0.<EmitLifetime>b__1() in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4657\r\n   at LightInject.PerContainerLifetime.GetInstance(Func`1 createInstance, Scope scope) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 6169\r\n   at LightInject.ServiceContainer.EmitLifetime(ServiceRegistration serviceRegistration, Action`1 emitMethod, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4656\r\n   at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__1(IEmitter methodSkeleton) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649\r\n   at LightInject.ServiceContainer.<>c__DisplayClass153_0.<CreateEmitMethodWrapper>b__0(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3856\r\n   at LightInject.ServiceContainer.EmitConstructorDependency(IEmitter emitter, Dependency dependency) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4158\r\n   at LightInject.ServiceContainer.EmitConstructorDependencies(ConstructionInfo constructionInfo, IEmitter emitter, Action`1 decoratorTargetEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4120\r\n   at LightInject.ServiceContainer.EmitNewInstanceUsingImplementingType(IEmitter emitter, ConstructionInfo constructionInfo, Action`1 decoratorTargetEmitMethod) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4080\r\n   at LightInject.ServiceContainer.EmitNewInstance(ServiceRegistration serviceRegistration, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4034\r\n   at LightInject.ServiceContainer.EmitNewInstanceWithDecorators(ServiceRegistration serviceRegistration, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3929\r\n   at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__2(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649\r\n   at LightInject.ServiceContainer.CreateDynamicMethodDelegate(Action`1 serviceEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3776\r\n   at LightInject.ServiceContainer.<>c__DisplayClass198_0.<EmitLifetime>b__1() in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4657\r\n   at LightInject.PerContainerLifetime.GetInstance(Func`1 createInstance, Scope scope) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 6169\r\n   at LightInject.ServiceContainer.EmitLifetime(ServiceRegistration serviceRegistration, Action`1 emitMethod, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4656\r\n   at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__1(IEmitter methodSkeleton) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649\r\n   at LightInject.ServiceContainer.<>c__DisplayClass153_0.<CreateEmitMethodWrapper>b__0(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3856\r\n   at LightInject.ServiceContainer.EmitConstructorDependency(IEmitter emitter, Dependency dependency) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4158\r\n   at LightInject.ServiceContainer.EmitConstructorDependencies(ConstructionInfo constructionInfo, IEmitter emitter, Action`1 decoratorTargetEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4120\r\n   at LightInject.ServiceContainer.EmitNewInstanceUsingImplementingType(IEmitter emitter, ConstructionInfo constructionInfo, Action`1 decoratorTargetEmitMethod) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4080\r\n   at LightInject.ServiceContainer.EmitNewInstance(ServiceRegistration serviceRegistration, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4034\r\n   at LightInject.ServiceContainer.EmitNewInstanceWithDecorators(ServiceRegistration serviceRegistration, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3929\r\n   at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__2(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649\r\n   at LightInject.ServiceContainer.CreateDynamicMethodDelegate(Action`1 serviceEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3776\r\n   at LightInject.ServiceContainer.<>c__DisplayClass198_0.<EmitLifetime>b__1() in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4657\r\n   at LightInject.PerContainerLifetime.GetInstance(Func`1 createInstance, Scope scope) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 6169\r\n   at LightInject.ServiceContainer.EmitLifetime(ServiceRegistration serviceRegistration, Action`1 emitMethod, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4656\r\n   at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__1(IEmitter methodSkeleton) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649\r\n   at LightInject.ServiceContainer.<>c__DisplayClass153_0.<CreateEmitMethodWrapper>b__0(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3856\r\n   at LightInject.ServiceContainer.EmitConstructorDependency(IEmitter emitter, Dependency dependency) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4158\r\n   at LightInject.ServiceContainer.EmitConstructorDependencies(ConstructionInfo constructionInfo, IEmitter emitter, Action`1 decoratorTargetEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4120\r\n   at LightInject.ServiceContainer.EmitNewInstanceUsingImplementingType(IEmitter emitter, ConstructionInfo constructionInfo, Action`1 decoratorTargetEmitMethod) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4080\r\n   at LightInject.ServiceContainer.EmitNewInstance(ServiceRegistration serviceRegistration, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4034\r\n   at LightInject.ServiceContainer.EmitNewInstanceWithDecorators(ServiceRegistration serviceRegistration, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3929\r\n   at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__2(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649\r\n   at LightInject.ServiceContainer.CreateDynamicMethodDelegate(Action`1 serviceEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3776\r\n   at LightInject.ServiceContainer.<>c__DisplayClass198_0.<EmitLifetime>b__1() in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4657\r\n   at LightInject.PerContainerLifetime.GetInstance(Func`1 createInstance, Scope scope) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 6169\r\n   at LightInject.ServiceContainer.EmitLifetime(ServiceRegistration serviceRegistration, Action`1 emitMethod, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4656\r\n   at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__1(IEmitter methodSkeleton) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649\r\n   at LightInject.ServiceContainer.<>c__DisplayClass153_0.<CreateEmitMethodWrapper>b__0(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3856\r\n   at LightInject.ServiceContainer.CreateDynamicMethodDelegate(Action`1 serviceEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3776\r\n   at LightInject.ServiceContainer.CreateDelegate(Type serviceType, String serviceName, Boolean throwError) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4743\r\n   at LightInject.ServiceContainer.CreateDefaultDelegate(Type serviceType, Boolean throwError) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4705\r\n   at LightInject.ServiceContainer.GetInstance(Type serviceType) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3437\r\n   at Umbraco.Core.Composing.LightInject.LightInjectContainer.GetInstance(Type type) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\LightInject\\LightInjectContainer.cs:line 111\r\n   at Umbraco.Core.Composing.ComponentCollectionBuilder.CreateItem(IFactory factory, Type itemType) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\ComponentCollectionBuilder.cs:line 33\r\n   at Umbraco.Core.Composing.CollectionBuilderBase`3.<>c__DisplayClass10_0.<CreateItems>b__0(Type x) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\CollectionBuilderBase.cs:line 103\r\n   at System.Linq.Enumerable.WhereSelectArrayIterator`2.MoveNext()\r\n   at System.Linq.Buffer`1..ctor(IEnumerable`1 source)\r\n   at System.Linq.Enumerable.ToArray[TSource](IEnumerable`1 source)\r\n   at Umbraco.Core.Composing.CollectionBuilderBase`3.CreateItems(IFactory factory) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\CollectionBuilderBase.cs:line 102\r\n   at Umbraco.Core.Composing.ComponentCollectionBuilder.CreateItems(IFactory factory) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\ComponentCollectionBuilder.cs:line 25\r\n   at Umbraco.Core.Composing.CollectionBuilderBase`3.CreateCollection(IFactory factory) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\CollectionBuilderBase.cs:line 120\r\n   at Umbraco.Core.Composing.LightInject.LightInjectContainer.<>c__DisplayClass20_0`1.<Register>b__0(IServiceFactory f) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\LightInject\\LightInjectContainer.cs:line 172\r\n   at DynamicMethod(Object[] )\r\n   at LightInject.ServiceContainer.<>c__DisplayClass150_0.<WrapAsFuncDelegate>b__0() in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3798\r\n   at LightInject.ServiceContainer.<>c__DisplayClass198_0.<EmitLifetime>b__1() in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4657\r\n   at LightInject.PerContainerLifetime.GetInstance(Func`1 createInstance, Scope scope) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 6169\r\n   at LightInject.ServiceContainer.EmitLifetime(ServiceRegistration serviceRegistration, Action`1 emitMethod, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4656\r\n   at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__1(IEmitter methodSkeleton) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649\r\n   at LightInject.ServiceContainer.<>c__DisplayClass153_0.<CreateEmitMethodWrapper>b__0(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3856\r\n   at LightInject.ServiceContainer.CreateDynamicMethodDelegate(Action`1 serviceEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3776\r\n   at LightInject.ServiceContainer.CreateDelegate(Type serviceType, String serviceName, Boolean throwError) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4743\r\n   at LightInject.ServiceContainer.CreateDefaultDelegate(Type serviceType, Boolean throwError) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4705\r\n   at LightInject.ServiceContainer.GetInstance(Type serviceType) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3437\r\n   at Umbraco.Core.Composing.LightInject.LightInjectContainer.GetInstance(Type type) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\LightInject\\LightInjectContainer.cs:line 111\r\n   at Umbraco.Core.FactoryExtensions.GetInstance[T](IFactory factory) in D:\\a\\1\\s\\src\\Umbraco.Core\\FactoryExtensions.cs:line 23\r\n   at Umbraco.Core.Runtime.CoreRuntime.Boot(IRegister register, DisposableTimer timer) in D:\\a\\1\\s\\src\\Umbraco.Core\\Runtime\\CoreRuntime.cs:line 204\r\n   --- End of inner exception stack trace ---",
    "FailMessage": "Boot failed.",
    "Duration": 99252,
    "TimingId": "3b6648c",
    "SourceContext": "Umbraco.Core.Runtime.CoreRuntime",
    "ProcessId": 12564,
    "ProcessName": "w3wp",
    "ThreadId": 1,
    "AppDomainId": 2,
    "AppDomainAppId": "LMW3SVC1154430189ROOT",
    "MachineName": "DW0SDWK0003EO",
    "Log4NetLevel": "ERROR",
    "HttpRequestNumber": 1,
    "HttpRequestId": "5104ab84-400a-4e30-8512-bd4ae8199a7c"
}

We have these setings in Azure,

  {
    "name": "Umbraco.Core.LocalTempStorage",
    "value": "EnvironmentTemp",
    "slotSetting": false
  },
  {
    "name": "Umbraco.Core.MainDom.Lock",
    "value": "SqlMainDomLock",
    "slotSetting": false
  },
  {
    "name": "Umbraco.Examine.LuceneDirectoryFactory",
    "value": "Examine.LuceneEngine.Directories.SyncTempEnvDirectoryFactory, Examine",
    "slotSetting": false
  }

I have restarted the Azure web app to make it work. There is only one server. We are fear that this will happen again, What can we do?

@p-m-j
Copy link
Contributor

p-m-j commented Mar 17, 2022

@iqb-dawn if it’s a small site the easiest thing that guarantees you will never get exceptions thrown relating to the nucache physical files is to just turn off the nucache physical files see example here https://our.umbraco.com/documentation/Reference/V9-Config/NuCacheSettings/#additional-settings, the only down side being for a large site it may take longer to boot.

@soreng
Copy link
Contributor

soreng commented Apr 27, 2022

@p-m-j as per above

we are using Umbraco version 8.12.2

Switching of the file creation is a v9 feature AFAIK.

@p-m-j
Copy link
Contributor

p-m-j commented Apr 27, 2022

Switching off the file creation is a v9 feature AFAIK.

The same setting was around in v8, the documentation for it is hidden away on this page https://our.umbraco.com/Documentation/Fundamentals/Setup/Server-Setup/Load-Balancing/azure-web-apps-v8 (search this page for PublishedSnapshotServiceOptions)


I'm going to lock conversation here, please open a new issue if you have steps to reproduce or detailed logs.

@umbraco umbraco locked as resolved and limited conversation to collaborators Apr 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.