Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correctly set json_provider_class on Flask app so it uses our encoder #26554

Merged
merged 1 commit into from
Sep 21, 2022

Conversation

ashb
Copy link
Member

@ashb ashb commented Sep 21, 2022

Setting json_provider_class where we did had no effect, as it turns out Flask() sets self.json = self.json_provider_class(self), so we were setting it too late to take any affect.

Closes #26546, #26527

@boring-cyborg boring-cyborg bot added the area:webserver Webserver related Issues label Sep 21, 2022
@ashb ashb added kind:bug This is a clearly a bug priority:high High priority bug that should be patched quickly but does not require immediate new release labels Sep 21, 2022
@ashb ashb added this to the Airflow 2.4.1 milestone Sep 21, 2022
@ashb ashb added type:bug-fix Changelog: Bug Fixes and removed kind:bug This is a clearly a bug labels Sep 21, 2022
@ashb ashb requested review from eladkal and potiuk September 21, 2022 12:23
@ashb ashb marked this pull request as ready for review September 21, 2022 12:29
Copy link
Contributor

@tirkarthi tirkarthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was working on a PR for this too. The UI shows error with below traceback for this patch and I am not sure why tests didn't catch this. The places need to use app.json.dumps instead of AirflowJsonEncoder like below patch along with other places too.

Edit : Tests fail, sorry I was running tests probably without patch.

[2022-09-21 18:43:28,419] {_internal.py:224} INFO - 127.0.0.1 - - [21/Sep/2022 18:43:28] "POST /last_dagruns HTTP/1.1" 500 -
Traceback (most recent call last):
  File "/home/karthikeyan/stuff/python/airflow/.env/lib/python3.10/site-packages/flask/app.py", line 2548, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/karthikeyan/stuff/python/airflow/.env/lib/python3.10/site-packages/flask/app.py", line 2528, in wsgi_app
    response = self.handle_exception(e)
  File "/home/karthikeyan/stuff/python/airflow/.env/lib/python3.10/site-packages/flask/app.py", line 2525, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/karthikeyan/stuff/python/airflow/.env/lib/python3.10/site-packages/flask/app.py", line 1822, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/karthikeyan/stuff/python/airflow/.env/lib/python3.10/site-packages/flask/app.py", line 1820, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/karthikeyan/stuff/python/airflow/.env/lib/python3.10/site-packages/flask/app.py", line 1796, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/home/karthikeyan/stuff/python/airflow/airflow/www/auth.py", line 47, in decorated
    return func(*args, **kwargs)
  File "/home/karthikeyan/stuff/python/airflow/airflow/utils/session.py", line 75, in wrapper
    return func(*args, session=session, **kwargs)
  File "/home/karthikeyan/stuff/python/airflow/airflow/www/views.py", line 1049, in last_dagruns
    return wwwutils.json_response(resp)
  File "/home/karthikeyan/stuff/python/airflow/airflow/www/utils.py", line 328, in json_response
    response=json.dumps(obj, indent=4, cls=AirflowJsonEncoder), status=200, mimetype="application/json"
  File "/usr/lib/python3.10/json/__init__.py", line 234, in dumps
    return cls(
  File "/home/karthikeyan/stuff/python/airflow/airflow/utils/json.py", line 47, in __init__
    super().__init__(*args, **kwargs)
TypeError: JSONProvider.__init__() got an unexpected keyword argument 'skipkeys'
diff --git a/airflow/www/utils.py b/airflow/www/utils.py
index d0efa611d5..5e57ade1d8 100644
--- a/airflow/www/utils.py
+++ b/airflow/www/utils.py
@@ -45,6 +45,7 @@ from airflow.models import errors
 from airflow.models.dagwarning import DagWarning
 from airflow.models.taskinstance import TaskInstance
 from airflow.utils import timezone
+from airflow.utils.airflow_flask_app import get_airflow_app
 from airflow.utils.code_utils import get_python_source
 from airflow.utils.helpers import alchemy_to_dict
 from airflow.utils.json import AirflowJsonEncoder
@@ -324,8 +325,9 @@ def epoch(dttm):
 
 def json_response(obj):
     """Returns a json response from a json serializable python object"""
+    app = get_airflow_app()
     return Response(
-        response=json.dumps(obj, indent=4, cls=AirflowJsonEncoder), status=200, mimetype="application/json"
+        response=app.json.dumps(obj, indent=4), status=200, mimetype="application/json"
     )
 
 
diff --git a/airflow/www/views.py b/airflow/www/views.py
index 2c6701505c..ce334bfb65 100644
--- a/airflow/www/views.py
+++ b/airflow/www/views.py
@@ -3397,8 +3397,9 @@ class Airflow(AirflowBaseView):
     @action_logging
     def task_instances(self):
         """Shows task instances."""
+        app = get_airflow_app()
         dag_id = request.args.get('dag_id')
-        dag = get_airflow_app().dag_bag.get_dag(dag_id)
+        dag = app.dag_bag.get_dag(dag_id)
 
         dttm = request.args.get('execution_date')
         if dttm:
@@ -3412,7 +3413,7 @@ class Airflow(AirflowBaseView):
                 for ti in dag.get_task_instances(dttm, dttm)
             }
 
-        return json.dumps(task_instances, cls=utils_json.AirflowJsonEncoder)
+        return json.dumps(task_instances, dumps=app.json.dumps)
 
     @expose('/object/grid_data')
     @auth.has_access(
@@ -3423,8 +3424,9 @@ class Airflow(AirflowBaseView):
     )
     def grid_data(self):
         """Returns grid data"""
+        app = get_airflow_app()
         dag_id = request.args.get('dag_id')
-        dag = get_airflow_app().dag_bag.get_dag(dag_id)
+        dag = app.dag_bag.get_dag(dag_id)
 
         if not dag:
             return {'error': f"can't find dag {dag_id}"}, 404
@@ -3467,7 +3469,7 @@ class Airflow(AirflowBaseView):
             }
         # avoid spaces to reduce payload size
         return (
-            htmlsafe_json_dumps(data, separators=(',', ':'), cls=utils_json.AirflowJsonEncoder),
+            htmlsafe_json_dumps(data, separators=(',', ':'), dumps=app.json.dumps),
             {'Content-Type': 'application/json; charset=utf-8'},
         )
 
@@ -3475,7 +3477,8 @@ class Airflow(AirflowBaseView):
     @auth.has_access([(permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG)])
     def next_run_datasets(self, dag_id):
         """Returns datasets necessary, and their status, for the next dag run"""
-        dag = get_airflow_app().dag_bag.get_dag(dag_id)
+        app = get_airflow_app()
+        dag = app.dag_bag.get_dag(dag_id)
         if not dag:
             return {'error': f"can't find dag {dag_id}"}, 404
 
@@ -3510,7 +3513,7 @@ class Airflow(AirflowBaseView):
                 .all()
             ]
         return (
-            htmlsafe_json_dumps(data, separators=(',', ':'), cls=utils_json.AirflowJsonEncoder),
+            htmlsafe_json_dumps(data, separators=(',', ':'), dumps=app.json.dumps),
             {'Content-Type': 'application/json; charset=utf-8'},
         )
 
@@ -3524,6 +3527,7 @@ class Airflow(AirflowBaseView):
         """Returns dataset dependencies graph."""
         nodes_dict: dict[str, Any] = {}
         edge_tuples: set[dict[str, str]] = set()
+        app = get_airflow_app()
 
         for dag, dependencies in SerializedDagModel.get_dag_dependencies().items():
             dag_node_id = f"dag:{dag}"
@@ -3547,7 +3551,7 @@ class Airflow(AirflowBaseView):
         }
 
         return (
-            htmlsafe_json_dumps(data, separators=(',', ':'), cls=utils_json.AirflowJsonEncoder),
+            htmlsafe_json_dumps(data, separators=(',', ':'), dumps=app.json.dumps),
             {'Content-Type': 'application/json; charset=utf-8'},
         )

@ashb
Copy link
Member Author

ashb commented Sep 21, 2022

@tirkarthi Looks like your fix is better/bigger than mine, so do you want to add a separate PR for those cases? Edit: Oh I see, nm, I'm on it.

I'll remove the "close" marker from this PR as I don't think it fully fixes it? (I admit to not testing in the app, just using the minimal repro case from the closed PR)

Edit 2: a better fix then get_airflow_app() is to use https://flask.palletsprojects.com/en/2.2.x/api/?highlight=json#module-flask.json I think.

@tirkarthi
Copy link
Contributor

@ashb Feel free to use the above patch. I guess there is some change due to inheriting from DefaultJSONProvider where the cls argument used inside htmlsafe_json_dumps fails along with other places using json.dumps from stdlib. Since htmlsafe_json_dumps has dumps argument passing app.json.dumps will have similar effect as earlier where cls was passed. json.dumps can be replaced by app.json.dumps.

@tirkarthi
Copy link
Contributor

yes, flask.json seems better. There is note from 2.2 that current_app.json.dumps is used internally that will make the patch simpler. https://flask.palletsprojects.com/en/2.2.x/api/?highlight=json#flask.json.dumps

Setting `json_provider_class` where we did had no effect, as it turns
out `Flask()` sets `self.json = self.json_provider_class(self)`, so we
were setting it too late.
@ashb ashb force-pushed the json-encoder-fix-k8s branch from de53509 to 608f232 Compare September 21, 2022 16:16
@ashb
Copy link
Member Author

ashb commented Sep 21, 2022

Was a little bit more complex to have a provider work correctly, so asking for re-review.

@@ -17,11 +17,12 @@
# under the License.
from __future__ import annotations

import json
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point in the past, if user had simplejson installed, then webserver would blow up, that's why we imported from flask.json. It might be worth doing pip install simplejson just to make sure that it is compatible

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was this commit: ea3d42a

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested that, doesn't blow up (by which I mean tests still create the app and pass)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool -- i just tried it too, installed it and launched webserver and navigated to a view or two

@ashb ashb merged commit 378dfbe into apache:main Sep 21, 2022
@ashb ashb deleted the json-encoder-fix-k8s branch September 21, 2022 21:12
@ashb
Copy link
Member Author

ashb commented Sep 21, 2022

Failure was fixed on main already

jedcunningham pushed a commit that referenced this pull request Sep 23, 2022
…#26554)

Setting `json_provider_class` where we did had no effect, as it turns
out `Flask()` sets `self.json = self.json_provider_class(self)`, so we
were setting it too late.

(cherry picked from commit 378dfbe)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:webserver Webserver related Issues priority:high High priority bug that should be patched quickly but does not require immediate new release type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants