Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-2867] Refactor code to conform Python standards & guidelines #3714

Merged
merged 1 commit into from
Aug 7, 2018

Conversation

kaxil
Copy link
Member

@kaxil kaxil commented Aug 7, 2018

Make sure you have checked all steps below.

Jira

Description

  • Here are some details about my PR, including screenshots of any UI changes:
  • Dictionary creation should be written by dictionary literal
  • Python’s default arguments are evaluated once when the function is defined, not each time the function is called (like it is in say, Ruby). This means that if you use a mutable default argument and mutate it, you will and have mutated that object for all future calls to the function as well.
  • Functions calling sets which can be replaced by set literal are now replaced by set literal
  • Replace list literals
  • Some of the static methods haven't been set static
  • Remove redundant parentheses

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    N/a, Nothing new added

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.

Code Quality

  • Passes git diff upstream/master -u -- "*.py" | flake8 --diff

@@ -186,7 +186,8 @@ def _validate(self):

self.is_validated = True

def _get_env_var_option(self, section, key):
@staticmethod
def _get_env_var_option(key):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happened to section?

@@ -331,7 +331,7 @@ def execute(self, context):
self.py_file, self.py_options)


class GoogleCloudBucketHelper():
class GoogleCloudBucketHelper:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For py2 compat this should probablt be class GoogleCloudBucketHelper(object)

@@ -52,10 +52,12 @@ def __init__(
destination_table,
oracle_source_conn_id,
source_sql,
source_sql_params={},
source_sql_params=None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indent?

def _delete_top_row_and_compress(
self,
input_file_name,
input_file_name,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indent

@kaxil kaxil changed the title [AIRFLOW-2867] Refactor code to conform Python standards & guidelines [WIP][AIRFLOW-2867] Refactor code to conform Python standards & guidelines Aug 7, 2018
@kaxil
Copy link
Member Author

kaxil commented Aug 7, 2018

Hi @ashb, Sorry, I need to sort couple of things here. I have added a WIP flag. Will ping you as soon as it is ready for review.

@kaxil kaxil force-pushed the code-cleaning branch 2 times, most recently from 74871c3 to c69025e Compare August 7, 2018 20:37
@kaxil kaxil changed the title [WIP][AIRFLOW-2867] Refactor code to conform Python standards & guidelines [AIRFLOW-2867] Refactor code to conform Python standards & guidelines Aug 7, 2018
@kaxil
Copy link
Member Author

kaxil commented Aug 7, 2018

@ashb This is now ready for review. :)

@@ -2423,7 +2423,7 @@ def test_init_proxy_user(self):
class HDFSHookTest(unittest.TestCase):
def setUp(self):
configuration.load_test_config()
os.environ['AIRFLOW_CONN_HDFS_DEFAULT'] = ('hdfs://localhost:8020')
os.environ['AIRFLOW_CONN_HDFS_DEFAULT'] = 'hdfs://localhost:8020'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Not related to this Pr, but we have HDFS tests in core.py? o_O)

- Dictionary creation should be written by dictionary literal
- Python’s default arguments are evaluated once when the function is defined, not each time the function is called (like it is in say, Ruby). This means that if you use a mutable default argument and mutate it, you will and have mutated that object for all future calls to the function as well.
- Functions calling sets which can be replaced by set literal are now replaced by set literal
- Replace list literals
- Some of the static methods haven't been set static
- Remove redundant parentheses
import os


class OracleToAzureDataLakeTransfer(BaseOperator):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what has been changed to this file?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line endings... :'(

quoting=csv.QUOTE_MINIMAL,
*args, **kwargs):
super(OracleToAzureDataLakeTransfer, self).__init__(*args, **kwargs)
if sql_params is None:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@feng-tao This line is changed. Git shows entire file has changed due to termination of line. Check https://stackoverflow.com/questions/19593909/git-diff-sees-whole-file-as-changed-when-its-not for more info

@feng-tao
Copy link
Member

feng-tao commented Aug 7, 2018

lgtm

@codecov-io
Copy link

Codecov Report

Merging #3714 into master will increase coverage by <.01%.
The diff coverage is 76.19%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3714      +/-   ##
==========================================
+ Coverage   77.56%   77.56%   +<.01%     
==========================================
  Files         204      204              
  Lines       15768    15770       +2     
==========================================
+ Hits        12230    12232       +2     
  Misses       3538     3538
Impacted Files Coverage Δ
airflow/www_rbac/forms.py 100% <ø> (ø) ⬆️
airflow/jobs.py 82.76% <ø> (ø) ⬆️
airflow/www/views.py 68.88% <0%> (ø) ⬆️
airflow/www_rbac/views.py 72.85% <0%> (ø) ⬆️
airflow/utils/log/gcs_task_handler.py 0% <0%> (ø) ⬆️
airflow/operators/hive_stats_operator.py 0% <0%> (ø) ⬆️
airflow/executors/dask_executor.py 2% <0%> (ø) ⬆️
airflow/operators/check_operator.py 58.26% <100%> (ø) ⬆️
airflow/hooks/S3_hook.py 95% <100%> (+0.17%) ⬆️
airflow/utils/cli.py 100% <100%> (ø) ⬆️
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d9fecba...c3b5ea1. Read the comment docs.

@feng-tao feng-tao merged commit 120f485 into apache:master Aug 7, 2018
@@ -238,6 +238,8 @@ def create_empty_table(self,

:return:
"""
if time_partitioning is None:
time_partitioning = dict()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaxil seems it is missed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean? Setting the {} in the arguments is bad practice by the way Python creates these objects.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fokko No no, it is a bad practise to use {} or dict() as a default argument. Reference: https://docs.python-guide.org/writing/gotchas/#mutable-default-arguments

The above change is because we had the below code:
image

which I changed to the following:
image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaxil but why dict() instead of {}? just want to understand for my self
as I know {} more efficient https://stackoverflow.com/questions/664118/whats-the-difference-between-dict-and. In official docs cannot see any recommendations to use dict() instead {} https://docs.python.org/3.6/library/stdtypes.html#dict
and in standard library used {} not dict()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{} would probably be better in hindsight, but there's not much in it. On my laptop:

In [5]: %timeit {}
47.2 ns ± 2.3 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [6]: %timeit dict()
168 ns ± 1.77 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

so yes dict() is 3 times as "slow" as {}, but neither is particularly slow.

Copy link
Member Author

@kaxil kaxil Aug 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xnuinside Hi, there is no specific reason for me to use {} compared to dict(). The point for this change as I mentioned earlier was to remove an empty dictionary from default arguments.

And as Ash pointed out {} dict literal is faster that dict constructor dict() but there is not huge difference as the dict builds up. Check out the below links for more detailed read:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaxil , @ashb , thx!

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @kaxil

@@ -238,6 +238,8 @@ def create_empty_table(self,

:return:
"""
if time_partitioning is None:
time_partitioning = dict()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean? Setting the {} in the arguments is bad practice by the way Python creates these objects.

@@ -105,7 +106,8 @@ def _stringify(self, iterable, joinable='\n'):
[json.dumps(doc, default=json_util.default) for doc in iterable]
)

def transform(self, docs):
@staticmethod
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, I always put static function above the non-static one in the class order

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, I would also prefer static methods above instance methods

import os


class OracleToAzureDataLakeTransfer(BaseOperator):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line endings... :'(

@kaxil kaxil deleted the code-cleaning branch August 8, 2018 08:34
lxneng pushed a commit to lxneng/incubator-airflow that referenced this pull request Aug 10, 2018
- Dictionary creation should be written by dictionary literal
- Python’s default arguments are evaluated once when the function is defined, not each time the function is called (like it is in say, Ruby). This means that if you use a mutable default argument and mutate it, you will and have mutated that object for all future calls to the function as well.
- Functions calling sets which can be replaced by set literal are now replaced by set literal
- Replace list literals
- Some of the static methods haven't been set static
- Remove redundant parentheses
jeffkpayne pushed a commit to bomboradata/bombora-incubator-airflow that referenced this pull request Dec 20, 2018
- Dictionary creation should be written by dictionary literal
- Python’s default arguments are evaluated once when the function is defined, not each time the function is called (like it is in say, Ruby). This means that if you use a mutable default argument and mutate it, you will and have mutated that object for all future calls to the function as well.
- Functions calling sets which can be replaced by set literal are now replaced by set literal
- Replace list literals
- Some of the static methods haven't been set static
- Remove redundant parentheses
aliceabe pushed a commit to aliceabe/incubator-airflow that referenced this pull request Jan 3, 2019
- Dictionary creation should be written by dictionary literal
- Python’s default arguments are evaluated once when the function is defined, not each time the function is called (like it is in say, Ruby). This means that if you use a mutable default argument and mutate it, you will and have mutated that object for all future calls to the function as well.
- Functions calling sets which can be replaced by set literal are now replaced by set literal
- Replace list literals
- Some of the static methods haven't been set static
- Remove redundant parentheses
ashb pushed a commit to ashb/airflow that referenced this pull request Jan 10, 2019
cfei18 pushed a commit to cfei18/incubator-airflow that referenced this pull request Jan 23, 2019
stlava pushed a commit to Nextdoor/airflow that referenced this pull request Sep 4, 2019
- Dictionary creation should be written by dictionary literal
- Python’s default arguments are evaluated once when the function is defined, not each time the function is called (like it is in say, Ruby). This means that if you use a mutable default argument and mutate it, you will and have mutated that object for all future calls to the function as well.
- Functions calling sets which can be replaced by set literal are now replaced by set literal
- Replace list literals
- Some of the static methods haven't been set static
- Remove redundant parentheses
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants