[AIRFLOW-2867] Refactor code to conform Python standards & guidelines #3714
airflow/configuration.py
Outdated
@@ -186,7 +186,8 @@ def _validate(self):

         self.is_validated = True

-    def _get_env_var_option(self, section, key):
+    @staticmethod
+    def _get_env_var_option(key):

What happened to `section`?
@@ -331,7 +331,7 @@ def execute(self, context):
             self.py_file, self.py_options)


-class GoogleCloudBucketHelper():
+class GoogleCloudBucketHelper:

For py2 compat this should probably be `class GoogleCloudBucketHelper(object)`.
@@ -52,10 +52,12 @@ def __init__(
             destination_table,
             oracle_source_conn_id,
             source_sql,
-            source_sql_params={},
+            source_sql_params=None,

indent?
 def _delete_top_row_and_compress(
         self,
-        input_file_name,
+        input_file_name,

indent
Hi @ashb, sorry, I need to sort out a couple of things here. I have added a WIP flag and will ping you as soon as it is ready for review.
Force-pushed from 74871c3 to c69025e.
@ashb This is now ready for review. :)
@@ -2423,7 +2423,7 @@ def test_init_proxy_user(self):
 class HDFSHookTest(unittest.TestCase):
     def setUp(self):
         configuration.load_test_config()
-        os.environ['AIRFLOW_CONN_HDFS_DEFAULT'] = ('hdfs://localhost:8020')
+        os.environ['AIRFLOW_CONN_HDFS_DEFAULT'] = 'hdfs://localhost:8020'

(Not related to this PR, but we have HDFS tests in core.py? o_O)
- Dictionary creation should be written as a dictionary literal
- Python’s default arguments are evaluated once when the function is defined, not each time the function is called (like it is in, say, Ruby). This means that if you use a mutable default argument and mutate it, you will have mutated that object for all future calls to the function as well.
- Function calls that build sets are replaced by set literals where possible
- Replace list literals
- Some methods that can be static had not been marked static
- Remove redundant parentheses
import os


class OracleToAzureDataLakeTransfer(BaseOperator):

What has been changed in this file?

Line endings... :'(
            quoting=csv.QUOTE_MINIMAL,
            *args, **kwargs):
        super(OracleToAzureDataLakeTransfer, self).__init__(*args, **kwargs)
        if sql_params is None:

@feng-tao This line is changed. Git shows the entire file as changed because the line endings changed. See https://stackoverflow.com/questions/19593909/git-diff-sees-whole-file-as-changed-when-its-not for more info.
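
For anyone wondering why a small change can light up the whole file: the rough sketch below (not part of this PR; the file contents are made up for illustration) shows that when line terminators flip from CRLF to LF, every line differs once terminators are kept, even though the visible text is identical.

```python
# Hypothetical file contents: same text, different line terminators.
crlf_version = b"import os\r\n\r\nclass Transfer(object):\r\n    pass\r\n"
lf_version = crlf_version.replace(b"\r\n", b"\n")

# Keep the terminators so the comparison sees what a line-based diff sees.
old_lines = crlf_version.decode().splitlines(keepends=True)
new_lines = lf_version.decode().splitlines(keepends=True)

changed = [(old, new) for old, new in zip(old_lines, new_lines) if old != new]
print(len(changed), "of", len(old_lines), "lines differ")

# Once the terminators are stripped, nothing has changed.
same_text = (
    [line.rstrip("\r\n") for line in old_lines]
    == [line.rstrip("\r\n") for line in new_lines]
)
print("text identical apart from line endings:", same_text)
```

Normalising line endings in a separate commit is one way to keep a diff like this reviewable.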
lgtm
Codecov Report
@@ Coverage Diff @@
## master #3714 +/- ##
==========================================
+ Coverage 77.56% 77.56% +<.01%
==========================================
Files 204 204
Lines 15768 15770 +2
==========================================
+ Hits 12230 12232 +2
Misses 3538 3538
@@ -238,6 +238,8 @@ def create_empty_table(self,

         :return:
         """
+        if time_partitioning is None:
+            time_partitioning = dict()

@kaxil it seems this one was missed
What do you mean? Setting the `{}` in the arguments is bad practice because of the way Python creates these objects.
@Fokko No no, it is bad practice to use `{}` or `dict()` as a default argument. Reference: https://docs.python-guide.org/writing/gotchas/#mutable-default-arguments
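
To make the gotcha concrete, here is a minimal sketch (the function names `tag_bad` and `tag_good` are hypothetical, not from the Airflow code base). The first function uses a dict as its default and mutates it; the second uses the `None` sentinel pattern this PR applies.

```python
def tag_bad(key, params={}):
    # The default dict is created once, when the function is defined,
    # so every call that omits `params` mutates the same object.
    params[key] = True
    return params


def tag_good(key, params=None):
    # A fresh dict is created on each call that omits `params`.
    if params is None:
        params = {}
    params[key] = True
    return params


bad_first = tag_bad("a")
bad_second = tag_bad("b")
print(bad_first is bad_second, bad_second)

good_first = tag_good("a")
good_second = tag_good("b")
print(good_first is good_second, good_second)
```

With the shared default, the second call still sees the key written by the first; with the sentinel, each call starts from an empty dict.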
@kaxil but why `dict()` instead of `{}`? I just want to understand for myself. As far as I know, `{}` is more efficient: https://stackoverflow.com/questions/664118/whats-the-difference-between-dict-and. In the official docs I cannot see any recommendation to use `dict()` instead of `{}` (https://docs.python.org/3.6/library/stdtypes.html#dict), and the standard library uses `{}`, not `dict()`.
`{}` would probably be better in hindsight, but there's not much in it. On my laptop:
In [5]: %timeit {}
47.2 ns ± 2.3 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [6]: %timeit dict()
168 ns ± 1.77 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
so yes `dict()` is 3 times as "slow" as `{}`, but neither is particularly slow.
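
For a version of that measurement that runs outside IPython, the sketch below uses the standard `timeit` and `dis` modules. It assumes CPython, where the literal compiles to a `BUILD_MAP` instruction while `dict()` needs a name lookup plus a call; the helper name `opnames` is made up for this example, and absolute timings will vary by machine and Python version.

```python
import dis
import timeit


def opnames(expr):
    """Return the bytecode instruction names for a single expression."""
    code = compile(expr, "<example>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


literal_ops = opnames("{}")
call_ops = opnames("dict()")
print("{} compiles to:", literal_ops)
print("dict() compiles to:", call_ops)

# Only the ratio between the two numbers is interesting.
literal_time = timeit.timeit("{}", number=200_000)
call_time = timeit.timeit("dict()", number=200_000)
print(f"literal: {literal_time:.4f}s  dict(): {call_time:.4f}s")
```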
@xnuinside Hi, there is no specific reason I used `dict()` rather than `{}`. The point of this change, as I mentioned earlier, was to remove an empty dictionary from the default arguments.
And as Ash pointed out, the `{}` dict literal is faster than the dict constructor `dict()`, but the difference is not huge as the dict builds up. Check out the links below for a more detailed read:
Nice work @kaxil
@@ -105,7 +106,8 @@ def _stringify(self, iterable, joinable='\n'):
             [json.dumps(doc, default=json_util.default) for doc in iterable]
         )

-    def transform(self, docs):
+    @staticmethod

Nit, I always put static function above the non-static one in the class order

+1, I would also prefer static methods above instance methods
Make sure you have checked all steps below.
Jira
Description
Tests
N/A, nothing new added
Commits
Documentation
Code Quality
git diff upstream/master -u -- "*.py" | flake8 --diff