[EPIC] Improve reliability of Windows CI #7617

Closed
1 of 5 tasks
m-allanson opened this issue Aug 24, 2018 · 12 comments

@m-allanson
Contributor

m-allanson commented Aug 24, 2018

Who will own this?

What Area of Responsibility does this fall into? Who will own the work, and who needs to be aware of the work?

Area of Responsibility:

Select the Area of Responsibility most impacted by this Epic

  • OSS

Summary

Gatsby uses a free AppVeyor account to run Windows CI tests. The tests are very slow, and sometimes their results are never reported at all. As a result, PRs are often merged without having been tested on Windows.

How will this impact Gatsby?

Domains

List the impacted domains here

Components

List the impacted Components here

Goals

What are the top 3 goals you want to accomplish with this epic? All goals should be specific, measurable, actionable, realistic, and timebound.

How will we know this epic is a success?

What changes must we see, or what must be created for us to know the project was a success. How will we know when the project is done? How will we measure success?

User Can Statement

  • User can...

Metrics to Measure Success

  • We will see an increase/decrease in...

Additional Description

In a few sentences, describe the current status of the epic, what we know, and what's already been done.

What are the risks to the epic?

In a few sentences, describe what high-level questions we still need to answer about the project. How could this go wrong? What are the trade-offs? Do we need to close a door to go through this one?

What questions do we still need to answer, or what resources do we need?

Is there research to be done? Are there things we don’t know? Are there documents we need access to? Is there contact info we need? Add those questions as bullet points here.

How will we complete the epic?

What are the steps involved in taking this from idea through to reality?

How else could we accomplish the same goal?

Are there other ways to accomplish the goals you listed above? How else could we do the same thing?

Next Steps

  • Under Pipeline select Proposed Epics (only if you are NOT the AoR owner)
  • Under Assignees select the AoR Owner you listed in the Epic
  • Under Labels select Epic
  • Select Create Epic

You're all done!

@vtenfys
Contributor

vtenfys commented Aug 26, 2018

cc @m-allanson @KyleAMathews I've created #7652, which describes the two problems currently causing all Windows tests to fail. However, fixing these issues won't affect the speed of Windows testing, so there might still be problems.

@m-allanson
Contributor Author

Great stuff, thanks @davidbailey00 👍

Here are some WIP notes on improving AppVeyor build times, which should also help with reliability.

Rolling builds

There's a "rolling builds" configuration setting for AppVeyor that will tell it to only test the newest commit from any given PR: https://www.appveyor.com/docs/build-configuration/#rolling-builds. This has to be enabled through the AppVeyor UI.

From the AppVeyor docs:

"rolling builds" are great for very active OSS projects with lengthy queue. Whenever you do a new commit to the same branch OR pull request all current queued/running builds for that branch or PR are cancelled and the new one is queued. Other words, rolling builds make sure that only the most recent commit is built.

I can't see this option in the AppVeyor UI. I assume @KyleAMathews needs to give @pieh and me additional permissions on the AppVeyor account?

Fail strategy

AppVeyor's default behaviour is to run all build jobs even if one of them fails. There is a fast_finish option which will cancel all other jobs as soon as one job fails.

https://www.appveyor.com/docs/build-configuration/#failing-strategy
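
As a rough sketch, this is what enabling it could look like in appveyor.yml. Only the fast_finish key comes from the docs linked above; the rest of the project's configuration is not shown and would stay as it is.

```yaml
# appveyor.yml (fragment)
matrix:
  # Stop the remaining jobs in the build as soon as one job fails.
  fast_finish: true
```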

Concurrent jobs

AppVeyor offers one concurrent job for OSS builds. Additional concurrency can be added by paying for a basic account, and then paying $25/month per additional concurrent job: https://www.appveyor.com/pricing/

Investigate caching

Cache node_modules between builds? Cache anything else?

https://www.appveyor.com/docs/build-cache/
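
A minimal sketch of a cache section follows. It assumes dependencies are installed into a top-level node_modules folder and that package.json is a reasonable file to key the cache on. Both are assumptions, and a lockfile may turn out to be a better key.

```yaml
# appveyor.yml (fragment)
cache:
  # Reuse node_modules between builds. The cache is invalidated
  # whenever the file named after "->" changes.
  - node_modules -> package.json
```

The docs linked above describe the cache size limits that apply, which would need checking before caching anything large.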

Job matrix configuration

There is an install script that cancels most jobs in the matrix, running them only for releases or forced builds. However, this script does not run until after the repo has been cloned for each job, meaning they can take a couple of minutes to be cancelled. See example.

Can this functionality be replicated via AppVeyor's config options? See config reference.

An alternative would be to temporarily drop these extra jobs, and look at adding them back in once everything else here has been investigated.

I assume these jobs don't run on every PR because they take a while, but it seems counterproductive to have tests that are only run under certain conditions. Maybe we should reduce the number of jobs in the matrix and always run them, instead of having many jobs that are only run under certain conditions.
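
One way to express a smaller, always-on matrix is sketched below, under the assumption that the jobs differ only by Node.js version. The version numbers and the yarn commands are placeholders, not the project's actual settings.

```yaml
# appveyor.yml (fragment)
environment:
  matrix:
    # Each entry becomes one job; every job runs on every build.
    - nodejs_version: "8"
    - nodejs_version: "10"

install:
  # Install-Product is AppVeyor's helper for selecting a Node.js version.
  - ps: Install-Product node $env:nodejs_version
  - yarn install

test_script:
  - yarn test

# Skip AppVeyor's default MSBuild step; it is not needed here.
build: off
```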

Other things to investigate

@KyleAMathews
Contributor

Nice investigation @m-allanson! Stopping builds + paying for more concurrency seem like easy wins.

@m-allanson
Contributor Author

@KyleAMathews has enabled the rolling builds feature

@m-allanson
Contributor Author

This could be worth looking at: https://azure.microsoft.com/en-us/blog/announcing-azure-pipelines-with-unlimited-ci-cd-minutes-for-open-source/

@jeremyepling

I'm a product manager on Azure Pipelines. Let me know if you have any questions or suggestions.

@pieh
Contributor

pieh commented Oct 8, 2018

@jeremyepling Sorry for getting back to you late. We only recently started experimenting with Azure Pipelines, and right now we are facing a git checkout CRLF/LF problem:
In #8836 there is an attempt to fix our unit tests for Windows (they currently pass on AppVeyor CI but fail in our rudimentary Azure Pipelines setup). The snapshots don't match, most likely because the saved snapshots are checked out with CRLF line endings, whereas the functions we test produce LF line endings. Is there an option to make the Azure Pipelines checkout use LF line endings?

I saw there is checkout configuration, but it doesn't seem to cover that part - https://docs.microsoft.com/en-us/azure/devops/pipelines/yaml-schema?view=vsts#checkout

@pieh
Contributor

pieh commented Oct 8, 2018

Seems like we can use .gitattributes to handle CRLF/LF issues - #8922 :)

@jeremyepling

I got a successful build after I created a .gitattributes file that sets all line endings to LF for your repository. This file will override Git user settings for CRLF.

I'd like to have a way to specify gitconfig values from the Azure Pipelines YAML, but we don't have that yet.
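
In the meantime, one possible workaround is to set the value from a script step that runs before an explicit checkout step. This is only a sketch: it assumes the agent's checkout respects the global Git config, and the displayName text is illustrative.

```yaml
# azure-pipelines.yml (fragment)
steps:
  # Runs before the repository is cloned.
  - script: git config --global core.autocrlf false
    displayName: 'Keep LF line endings on checkout'
  # Explicit checkout so that it happens after the step above.
  - checkout: self
```

A .gitattributes file has the advantage of applying to every clone of the repository, not just the CI agent.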

@jeremyepling

I should have refreshed the page earlier. I'm just now seeing your comment. :)

@gatsbot

gatsbot bot commented Jan 20, 2019

Old issues will be closed after 30 days of inactivity. This issue has been quiet for 20 days and is being marked as stale. Reply here or add the label "not stale" to keep this issue open!

@gatsbot gatsbot bot added the "stale?" label Jan 20, 2019
@gatsbot

gatsbot bot commented Feb 1, 2019

Hey again!

It’s been 30 days since anything happened on this issue, so our friendly neighborhood robot (that’s me!) is going to close it.

Please keep in mind that I’m only a robot, so if I’ve closed this issue in error, I’m HUMAN_EMOTION_SORRY. Please feel free to reopen this issue or create a new one if you need anything else.

Thanks again for being part of the Gatsby community!

@gatsbot gatsbot bot closed this as completed Feb 1, 2019