Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-4343] Show warning in UI if scheduler is not running #5127

Merged
merged 1 commit into from
May 29, 2019

Conversation

ashb
Copy link
Member

@ashb ashb commented Apr 17, 2019

First commit is from #5125, look at the second commit only for now.
Make sure you have checked all steps below.

Jira

Description

  • Now that the webserver is more stateless, if the scheduler is not
    running the list of dags won't populate, making it harder for new
    starters to work out what is going on.

    When scheduler has never run (I cleared Jobs table):
    Screen Shot 2019-04-17 at 19 12 49

    **When scheduler not running (either crashed, or has stopped successfully)
    Screen Shot 2019-04-17 at 18 47 56

    (the time has a tooltip to display the full ISO8601 heartbeat)

    Login screen/anything not subclassed from AirflowBaseView doesn't display the message.

    On Login (not changed, doesn't show warning):

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason: added tests for helper functions on BaseJob

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does
    • If you implement backwards incompatible changes, please leave a note in the Updating.md so we can assign it to a appropriate release

Code Quality

  • Passes flake8

/cc @Fokko as discussed.

@ashb ashb force-pushed the warning-if-scheduler-not-running branch from cb0bf50 to a00fe0d Compare April 17, 2019 21:08
@ashb
Copy link
Member Author

ashb commented Apr 17, 2019

@XD-DENG PTAL too - I've subtly changed the /health endpoint here to use functions from airflow.jobs.

@ashb ashb force-pushed the warning-if-scheduler-not-running branch from a00fe0d to 0af1963 Compare April 17, 2019 21:14
@XD-DENG
Copy link
Member

XD-DENG commented Apr 18, 2019

Awesome feature 👍Will have a deeper dive later.

@ashb ashb force-pushed the warning-if-scheduler-not-running branch 2 times, most recently from 6b31eae to 37f0f2e Compare April 18, 2019 10:14
@ashb ashb marked this pull request as ready for review April 18, 2019 16:14
@ashb ashb force-pushed the warning-if-scheduler-not-running branch 2 times, most recently from e4f7d50 to 17dbba0 Compare April 18, 2019 17:11
Copy link
Member

@feluelle feluelle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice @ashb I really like this feature 👍

only some comments..

@ashb ashb force-pushed the warning-if-scheduler-not-running branch from 17dbba0 to e3115b2 Compare April 23, 2019 08:23
@sdikby
Copy link

sdikby commented Apr 24, 2019

@ashb Nice feature. A suggestion here: adding a link (with the warning) to the scheduler logs would help debug quicker. It should show only a portion of the logs.

@ashb
Copy link
Member Author

ashb commented Apr 24, 2019

@sdikby That would be nice but is non-trival to do so I'm not going to do it as part of this PR.

@ashb ashb force-pushed the warning-if-scheduler-not-running branch 4 times, most recently from a53202e to 58f1b1f Compare April 29, 2019 08:29
@ashb ashb force-pushed the warning-if-scheduler-not-running branch from 58f1b1f to 47c1bc0 Compare April 29, 2019 21:49
@XD-DENG XD-DENG self-requested a review April 30, 2019 02:21
@ashb ashb force-pushed the warning-if-scheduler-not-running branch 2 times, most recently from f8f8682 to a5f4f22 Compare April 30, 2019 16:19
airflow/jobs.py Outdated

:param grace_multiplier: multiplier of heartrate to require heart beat
within
:type grace_multiplier: number
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe remove this from here and add type hint in L130

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type hint's aren't picked up by docs yet are they?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I don't know tbh :-D

Just for our future reference, we can use this:
https://github.com/agronholm/sphinx-autodoc-typehints

@ashb
Copy link
Member Author

ashb commented May 1, 2019

I think something odd/unexpected is up with the Kube tests as a result of this PR. They consistently fail but don't on master.

airflow/jobs.py Outdated
@@ -94,25 +94,54 @@ class BaseJob(Base, LoggingMixin):
Index('idx_job_state_heartbeat', state, latest_heartbeat),
)

heartrate = conf.getfloat('scheduler', 'JOB_HEARTBEAT_SEC'),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DEAR GOD. That took like 4 hours to get enough of a working set up to get the log for!

@ashb ashb force-pushed the warning-if-scheduler-not-running branch from a5f4f22 to 6add858 Compare May 3, 2019 13:03
@codecov-io
Copy link

codecov-io commented May 3, 2019

Codecov Report

Merging #5127 into master will increase coverage by <.01%.
The diff coverage is 90.32%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5127      +/-   ##
==========================================
+ Coverage   78.93%   78.93%   +<.01%     
==========================================
  Files         480      480              
  Lines       30129    30146      +17     
==========================================
+ Hits        23782    23797      +15     
- Misses       6347     6349       +2
Impacted Files Coverage Δ
airflow/jobs/base_job.py 86.42% <100%> (+1.46%) ⬆️
airflow/www/views.py 75.72% <100%> (+0.03%) ⬆️
airflow/macros/__init__.py 90.9% <33.33%> (-9.1%) ⬇️
airflow/jobs/scheduler_job.py 70.32% <83.33%> (+0.07%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a8a4d32...657a2ee. Read the comment docs.

@ashb ashb force-pushed the warning-if-scheduler-not-running branch 3 times, most recently from 17ff921 to a2b1370 Compare May 28, 2019 13:14
Now that the webserver is more stateless, if the scheduler is not
running the list of dags won't populate, making it harder for new
starters to work out what is going on.

New dep is BSD-2 which is Cat-A under ASF
@ashb ashb force-pushed the warning-if-scheduler-not-running branch from a2b1370 to 657a2ee Compare May 28, 2019 15:33
@ashb ashb merged commit 3dcfe28 into apache:master May 29, 2019
@ashb ashb deleted the warning-if-scheduler-not-running branch May 29, 2019 09:50
ashb added a commit to ashb/airflow that referenced this pull request Jun 6, 2019
…#5127)

Now that the webserver is more stateless, if the scheduler is not
running the list of dags won't populate, making it harder for new
starters to work out what is going on.

New dep is BSD-2 which is Cat-A under ASF

(cherry picked from commit 3dcfe28)
andriisoldatenko pushed a commit to andriisoldatenko/airflow that referenced this pull request Jul 26, 2019
…#5127)

Now that the webserver is more stateless, if the scheduler is not
running the list of dags won't populate, making it harder for new
starters to work out what is going on.

New dep is BSD-2 which is Cat-A under ASF
wmorris75 pushed a commit to modmed/incubator-airflow that referenced this pull request Jul 29, 2019
…#5127)

Now that the webserver is more stateless, if the scheduler is not
running the list of dags won't populate, making it harder for new
starters to work out what is going on.

New dep is BSD-2 which is Cat-A under ASF
dharamsk pushed a commit to postmates/airflow that referenced this pull request Aug 8, 2019
…#5127)

Now that the webserver is more stateless, if the scheduler is not
running the list of dags won't populate, making it harder for new
starters to work out what is going on.

New dep is BSD-2 which is Cat-A under ASF

(cherry picked from commit 3dcfe28)
ashb added a commit to ashb/airflow that referenced this pull request Oct 10, 2019
Somehow in the cherry pick these tests got missed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants