Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix reload gunicorn workers #32102

Merged
merged 7 commits into from
Aug 7, 2023
Merged

Conversation

AVMusorin
Copy link
Contributor

Since gunicorn can't reload a new code if starts with --preload setting, we need to check reload_on_plugin_change before set it up.

Gunicorn can't reload a new code because the code is preloaded into the master process and worker are launched with fork, they will still have the old code.

@boring-cyborg boring-cyborg bot added area:CLI area:webserver Webserver related Issues labels Jun 23, 2023
@AVMusorin AVMusorin force-pushed the fix-gunicorn-reload branch from 9edbb17 to 13f12b0 Compare June 23, 2023 17:03
@potiuk potiuk requested a review from ephraimbuddy June 23, 2023 17:11
potiuk
potiuk previously requested changes Jun 28, 2023
Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there was a good reason we added --preload, https://github.com/apache/airflow/pull/27297/files and it was to remove lots of integrity warnings generated by parallel gunicorns trying to update the database in parallell and generating race conditions.

This cahnge means that when "reload_on_plugin_change" is turned on, we will start getting those warnings, unless the root cause of the problem has been fixed (I think it hasn't been fixed).

So I think this change only makes sense to be implemented if we find another way to prevent the integrity errors. For example adding some locking strategy for the DB operations that generated integrity warnings - without it, it will be a regression.

@potiuk
Copy link
Member

potiuk commented Jun 28, 2023

cc: @ephraimbuddy ^^ would love your comment on that one

@ephraimbuddy
Copy link
Contributor

I think there was a good reason we added --preload, https://github.com/apache/airflow/pull/27297/files and it was to remove lots of integrity warnings generated by parallel gunicorns trying to update the database in parallell and generating race conditions.

This cahnge means that when "reload_on_plugin_change" is turned on, we will start getting those warnings, unless the root cause of the problem has been fixed (I think it hasn't been fixed).

So I think this change only makes sense to be implemented if we find another way to prevent the integrity errors. For example adding some locking strategy for the DB operations that generated integrity warnings - without it, it will be a regression.

Exactly my thoughts. Without fixing the source of the IntegrityErrors, this change will be a regression.

@AVMusorin How about adding a new configuration that when set to True(default), enables this preloading instead of using the plugin's config? That way it won't be a breaking change.

@AVMusorin
Copy link
Contributor Author

How about adding a new configuration that when set to True(default), enables this preloading instead of using the plugin's config? That way it won't be a breaking change.

Agree, no problem for me.
Thanks for review.

@potiuk
Copy link
Member

potiuk commented Jun 29, 2023

How about adding a new configuration that when set to True(default), enables this preloading instead of using the plugin's config? That way it won't be a breaking change.

Agree, no problem for me. Thanks for review.

OK. Ping us when added. Fine for me too.

@potiuk potiuk mentioned this pull request Jul 21, 2023
2 tasks
@potiuk potiuk linked an issue Jul 21, 2023 that may be closed by this pull request
2 tasks
@ephraimbuddy
Copy link
Contributor

After much debugging on this, I found that once --preload is used, you cannot reload plugins even if you set reload_on_plugin_change to True. The reason is that the flask app is loaded into gunicorn worker together with the plugins:

init_plugins(flask_app)
.

I'd suggest using the reload_on_plugin_change to toggle the --preload. What do you think @potiuk

@AVMusorin AVMusorin force-pushed the fix-gunicorn-reload branch from 13f12b0 to 3287fab Compare July 28, 2023 09:14
@potiuk
Copy link
Member

potiuk commented Jul 30, 2023

I'd suggest using the reload_on_plugin_change to toggle the --preload. What do you think @potiuk

Sure, if you think it's fine - but it will bring the previous behaviour if reload_on_plugin_change is true - not sure if the effect of it was devastating or just Logs with IntegrityError - this what we are opening to.

If you think It's fine for me - so I won't block it for 2.7.0 - reloading on plugin change is useful, and maybe having Integrity check errors is the price to pay. Maybe it would be a good idea to add a comment for that option that when it is enabled it will (or may? not sure if it happened always) produce those kind of errors and this is expected...

@potiuk potiuk dismissed their stale review July 30, 2023 14:53

Removing my "require changes" after Ephraim's comments.

@potiuk potiuk force-pushed the fix-gunicorn-reload branch from 3287fab to 1418f49 Compare July 30, 2023 14:53
@AVMusorin
Copy link
Contributor Author

Maybe it would be a good idea to add a comment for that option that when it is enabled it will (or may? not sure if it happened always) produce those kind of errors and this is expected...

There was a good description in comments, I moved it to warning message.

Since gunicorn can't reload a new code if starts with ``--preload``
setting, we need to check ``reload_on_plugin_change`` before set it up.

Gunicorn can't reload a new code because the code is preloaded into the
master process and worker are launched with ``fork``, they will still have
the old code.
@AVMusorin AVMusorin force-pushed the fix-gunicorn-reload branch from c5080af to c37f687 Compare August 1, 2023 06:22
@ephraimbuddy
Copy link
Contributor

I'd suggest using the reload_on_plugin_change to toggle the --preload. What do you think @potiuk

Sure, if you think it's fine - but it will bring the previous behaviour if reload_on_plugin_change is true - not sure if the effect of it was devastating or just Logs with IntegrityError - this what we are opening to.

If you think It's fine for me - so I won't block it for 2.7.0 - reloading on plugin change is useful, and maybe having Integrity check errors is the price to pay. Maybe it would be a good idea to add a comment for that option that when it is enabled it will (or may? not sure if it happened always) produce those kind of errors and this is expected...

Would like to hear what @jedcunningham @dstandish and @ashb think about this. I will request the changes back until we have agreed on this. In production, it doesn't make sense to reload plugin, so this gives me concern

Copy link
Contributor

@ephraimbuddy ephraimbuddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To prevent accidental merge

@ashb
Copy link
Member

ashb commented Aug 1, 2023

The whole point of preload is quite literally to pre-load all the code so that the workers start up quicker. The downside to that is they code changes anywhere won't be reflected.

If you want plugins to reload don't use preload.

The pr message as it stands looks good to me

airflow/cli/commands/webserver_command.py Outdated Show resolved Hide resolved
airflow/cli/commands/webserver_command.py Outdated Show resolved Hide resolved
airflow/config_templates/config.yml Outdated Show resolved Hide resolved
AVMusorin and others added 3 commits August 2, 2023 08:44
Co-authored-by: Jed Cunningham <[email protected]>
Co-authored-by: Jed Cunningham <[email protected]>
Co-authored-by: Jed Cunningham <[email protected]>
@AVMusorin AVMusorin requested a review from ephraimbuddy August 2, 2023 19:15
@ephraimbuddy ephraimbuddy added this to the Airflow 2.7.0 milestone Aug 6, 2023
Copy link
Member

@uranusjr uranusjr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a couple of commits to refine the message. The warning messag especially has some wording issues.

@ephraimbuddy ephraimbuddy merged commit f78a836 into apache:main Aug 7, 2023
@AVMusorin AVMusorin deleted the fix-gunicorn-reload branch August 7, 2023 08:17
ephraimbuddy pushed a commit that referenced this pull request Aug 8, 2023
* Toggle gunicorn --preload with reload_on_plugin_change

Since gunicorn can't reload a new code if starts with ``--preload``
setting, we need to check ``reload_on_plugin_change`` before set it up.

Gunicorn can't reload a new code because the code is preloaded into the
master process and worker are launched with ``fork``, they will still have
the old code.

* added warning message

* Improve warning message

Co-authored-by: Jed Cunningham <[email protected]>

* Improve warning message

Co-authored-by: Jed Cunningham <[email protected]>

* Improve warning message

Co-authored-by: Jed Cunningham <[email protected]>

* Minor tweak of warning message

* Mention prod issue in config description

---------

Co-authored-by: Jed Cunningham <[email protected]>
Co-authored-by: Tzu-ping Chung <[email protected]>
(cherry picked from commit f78a836)
@ephraimbuddy ephraimbuddy added the type:bug-fix Changelog: Bug Fixes label Aug 9, 2023
@mujiannan
Copy link

I think we should allow users to set auto-reload-when-plugins-change for scheduler also. The parsing_pre_import_modules doesn't take effect in some conditions, for example, a new-registered custom timetable in a new plugin.

@potiuk
Copy link
Member

potiuk commented Jan 6, 2024

I think we should allow users to set auto-reload-when-plugins-change for scheduler also. The parsing_pre_import_modules doesn't take effect in some conditions, for example, a new-registered custom timetable in a new plugin.

Feel free to open PRs for that Or create a feature proposal. Commenting on closed issue is a straight way to your comment being completely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:CLI area:webserver Webserver related Issues type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make "--preload" flag optional
7 participants