Fix ci cache for windows #2507
Yep. I've seen |
Before workflows were configured: haskell-language-server/.github/workflows/test.yml Lines 83 to 87 in 07b9310
So the configuration of the cabal store directory has not changed. |
Ok i was wrong, So The So not sure if using @Anton-Latukha thanks for investigating it |
So maybe the change of |
You are right about the Windows build: Windows builds indeed ran 40m-1h longer than the other builds; I guessed that they just take that long to compile & run tests. (& I am biased to look at Linux builds & at macOS as soon they would have fast hardware). https://github.com/haskell/haskell-language-server/runs/4571121606?check_suite_focus=true shows that it both gets the relevant cache hit & then builds the store. So I agree. That means that
I would look into the setup action code.
I also seem to remember the setup action switching or considering
yeah there is an issue about but no action afaik
AFAIR it is still using chocolatey to install ghc and cabal. So maybe it is worth checking the code of the chocolatey install here: https://github.com/Mistuke/CabalChoco/blob/master/3.6.2.0/cabal/tools/chocolateyInstall.ps1
The code I linked in the previous comment is:
# If running on Github actions, configure the package to pick things up
if (($null -ne $Env:GITHUB_ACTIONS) -and ("" -ne $Env:GITHUB_ACTIONS)) {
  # Update the path on github actions as without so it won't be able to find
  # cabal.
  echo "$cabal_path" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
  # We probably don't need this since choco itself is already on the PATH
  # But it won't hurt to make sure.
  $choco_bin = Join-Path $env:ChocolateyInstall "bin"
  echo "$choco_bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
  # New GHC Packages will add themselves to the PATH, but older ones don't.
  # So let's find which one the user installed and add them to the pathh.
  $files = get-childitem $binRoot -include ghc.exe -recurse
  foreach ($file in $files) {
    $fileDir = Split-Path "$file"
    echo "$fileDir" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
  }
  # Also set a global SR.
  UpdateCabal-Config "store-dir" "$($env:SystemDrive)\SR"
}
So it tries to change the cabal store dir adding in the global cabal config
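For readers less used to PowerShell, the `GITHUB_PATH` mechanism that the quoted script relies on can be sketched in shell. This is only an illustration: the function name `add_to_github_path` is made up here, and it does not reproduce what the chocolatey package does beyond appending a directory to the file named by `$GITHUB_PATH`.

```shell
# Sketch: make a directory visible on PATH for later workflow steps.
# GitHub Actions reads the file named by $GITHUB_PATH after each step.
# add_to_github_path is a hypothetical helper, not part of any tool.
add_to_github_path() {
  dir="$1"
  # Outside GitHub Actions the variable is unset, so fail early.
  if [ -z "${GITHUB_PATH:-}" ]; then
    echo "GITHUB_PATH is not set" >&2
    return 1
  fi
  # One directory per line, appended to the file.
  printf '%s\n' "$dir" >> "$GITHUB_PATH"
}
```

Note that this only affects `PATH`; the store directory is changed separately, by the `UpdateCabal-Config "store-dir"` call in the quoted script.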
OK, sorry, it is 02:37 at my location. I will look at the situation anew tomorrow.
c854db2
to
c3c710c
Also note that the sources get redownloaded. This is despite the setting to cache them being:
- if: runner.os == 'Windows'
  name: (Windows) Platform config
  run: |
    echo "CABAL_PKGS_DIR=~\\AppData\\cabal\\packages" >> $GITHUB_ENV
(it is seen for example in https://github.com/haskell/haskell-language-server/runs/4574676443?check_suite_focus=true)
So we try to cache
Also, during this work, to restart the caching anew you can advance the Hackage index (cabal in any case rolls back to the last nearby fixed state). The old caches would still be loaded, but the workflows would make sure the new information is saved in the new cache. With Windows, for example, the case may be that an incomplete cache was saved somehow & gets reused (it can happen during initial cache runs in PR, for example,
So all which are mentioned above - are details. But they are refuted (the is enough implication to mostly refute them in my logic system).
It seems to be some Windows setup & GitHub CI-related thing. For example, does CI allow overriding directories in the root of a disk? Overriding root dirs is a security concern. Maybe it saved stack files, but the cabal files are elsewhere. Maybe the chocolatey package changed something & we keep supplying the old cache & its file structure does not match the new one.
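One way to narrow down hypotheses like these is to print, right after the cache restore step, how many entries each candidate directory actually holds. The following is a minimal sketch; the helper names `count_entries` and `report_cache_dirs` are hypothetical, and the directories to pass in are whichever ones are under suspicion.

```shell
# Print how many entries a directory holds; prints 0 when it is missing.
count_entries() {
  dir="$1"
  if [ -d "$dir" ]; then
    # -A includes dotfiles but skips "." and ".."
    ls -A "$dir" | wc -l | tr -d ' '
  else
    echo 0
  fi
}

# Report every candidate cache directory given as an argument.
report_cache_dirs() {
  for dir in "$@"; do
    echo "$dir: $(count_entries "$dir") entries"
  done
}
```

A directory that reports 0 entries after a cache hit would suggest the cache was saved from, or restored to, a different location than the one cabal reads.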
I reset the review status so I can be called again when it is suitable to review.
Please re-request my review when there is something in the code to discuss.
So far I have no idea why caching does not work for Windows.
(I also plan to try running choco in a VM, to understand for myself what it does; then I would be able to take a deeper look into the code of things.)
I am investigating the cache issue in windows in this pr in my fork: jneira#58, branch https://github.com/jneira/haskell-language-server/tree/fix-windows-cache2 Some things i discovered:
It seems to me that it is a cabal bug on Windows that is only triggered in CI. Locally it works fine and doesn't rebuild the deps.
I am going to trace the keys used to compute the hash.
Yes. It can be as simple as constant cache inheritance. During the merge I looked through the cache builds, that they started & were saving cache. In
Just inserting a space somewhere or advancing the Hackage index may fix it.
Yeah, that will be the next step: invalidate the cache keys, prefixing them with
OTOH triggering two successive builds in the same PR should trigger the cache hit with the good lib hashes. However, in the main repo successive builds of the same PR seem not to work. I will use a controlled PR in my fork to confirm or discard that.
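The invalidation idea can be sketched as a key built from a manual version prefix plus the usual components: bumping the prefix makes every previously saved key miss, so a fresh cache gets built. The function name, the prefix values and the component order below are all assumptions for illustration; the actual keys in the workflow may look different.

```shell
# Sketch: build a cache key with a manual version prefix.
# Changing the prefix (say from v1 to v2) invalidates all older caches.
make_cache_key() {
  prefix="$1"   # hypothetical manual version tag
  os="$2"       # runner OS
  ghc="$3"      # GHC version
  hash="$4"     # hash of the build plan or freeze file
  printf '%s-%s-%s-%s\n' "$prefix" "$os" "$ghc" "$hash"
}
```

For example, `make_cache_key v2 Windows 8.10.7 abc123` yields `v2-Windows-8.10.7-abc123`, which no cache saved under a `v1-` key can match.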
Nota bene: GitHub has a feature in the pipeline to manage the caching storage; once it is released, maintainers would be able to navigate the caching storage. So we would be able to go in, remove the problematic block & it would regenerate by itself or by just a rerun of
c3c710c
to
24273ca
OK, I have evidence of the direct cause of the issue. I've traced the package hash keys of the existing package hash and the new one for one dependency
I think the
(Observe the
OK, so invalidating the cache should fix it. What could we do? To consider:
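A comparison like the one described can be sketched as follows, assuming the hash inputs of the cached package and of the freshly computed one have each been dumped to a text file with one `key: value` pair per line. The dump format, the file arguments and the function name `diff_hash_inputs` are assumptions; the thread does not show how the keys were actually traced.

```shell
# Sketch: show which hash inputs differ between two dumps.
# Lines only in the old dump are prefixed "<", lines only in the new one ">".
diff_hash_inputs() {
  old_sorted="$(mktemp)"
  new_sorted="$(mktemp)"
  # Sort first so that a different ordering alone is not reported.
  sort "$1" > "$old_sorted"
  sort "$2" > "$new_sorted"
  # Keep only the changed lines; "|| true" keeps the exit status clean
  # when the two dumps are identical and grep finds nothing.
  diff "$old_sorted" "$new_sorted" | grep '^[<>]' || true
  rm -f "$old_sorted" "$new_sorted"
}
```

Any line this prints is an input that would change the package hash, and therefore force a rebuild even when the cache itself was restored.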
However, I don't see anything in the PR which could cause that change: should have the opposite effect, as the 5th parameter of the ps script which changes from false to true is
🤔
Well, in any case those little changes to adapt the source cache to the new $CABAL_DIR are good imo
This rebase check will run for a while; the caching update is still ongoing in
But be ready for the merge 🧑🔧
The build still took 38 min and the packages were downloaded 🤔 https://github.com/haskell/haskell-language-server/runs/4591267873?check_suite_focus=true
It is because the rebase was done before the cache was built in
So the PR used old caches, ran the build in parallel & saved the caches in parallel. But thankfully all of that stays localized to the PR scope. That is why I restarted builds in #2506 after caching was done in
Further, the caching rebuilds would be faster (the deps are already cached properly, so it would run at the speed of the project build, also ~50 minutes faster than the first run we had) & after the cache is proper - waits of
CABAL_DIR :\\AppData\\cabal in windows and ~/.cabal in the other os
CABAL_DIR to C:\cabal. See Expose ghcup binary to PATH on windows actions/runner-images#4264 (comment)
steps.HaskEnvSetup.outputs.cabal-store but as you can see in this run the effective cache action is: C:\sr ???? AFAIK that is the global cache dir for stack 🤦. The default cabal store dir in windows is ~.\AppData\cabal\store (well, before the CABAL_DIR change). I have to confirm it though.
This PR embraces the use of CABAL_DIR (setting it to a default value common to all OSes if it is not set) to set the cache paths.
We can restore the use of steps.HaskEnvSetup.outputs.cabal-store when the value is fixed upstream.
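The approach of defaulting `CABAL_DIR` and deriving the cache paths from it can be sketched as below. This is not the workflow code from the PR: the function name `export_cabal_dirs`, the `CABAL_STORE_DIR` variable and the `store` / `packages` subdirectory layout are assumptions, and the default directory is left to the caller.

```shell
# Sketch: export CABAL_DIR (defaulted when unset) and cache paths derived
# from it to later workflow steps via the file named by $GITHUB_ENV.
export_cabal_dirs() {
  default_dir="$1"                        # hypothetical default, chosen by the caller
  cabal_dir="${CABAL_DIR:-$default_dir}"  # keep an already set CABAL_DIR
  {
    echo "CABAL_DIR=$cabal_dir"
    echo "CABAL_STORE_DIR=$cabal_dir/store"     # assumed layout
    echo "CABAL_PKGS_DIR=$cabal_dir/packages"   # assumed layout
  } >> "$GITHUB_ENV"
}
```

The cache steps can then refer to the exported variables instead of an output of the setup action, so the paths being cached and the paths cabal uses come from the same value.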