diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index bc96bdc6f5239..22666faf715da 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -264,7 +264,7 @@ meets these guidelines:
 1. The pull request should include tests, either as doctests, unit tests, or both. The airflow repo uses [Travis CI](https://travis-ci.org/apache/airflow) to run the tests and [codecov](https://codecov.io/gh/apache/airflow) to track coverage. You can set up both for free on your fork (see the "Testing on Travis CI" section below). It will help you making sure you do not break the build with your PR and that you help increase coverage.
 1. Please [rebase your fork](http://stackoverflow.com/a/7244456/1110993), squash commits, and resolve all conflicts.
 1. Every pull request should have an associated [JIRA](https://issues.apache.org/jira/browse/AIRFLOW/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). The JIRA link should also be contained in the PR description.
-1. Preface your commit's subject & PR's title with **[AIRFLOW-XXX]** where *XXX* is the JIRA number. We compose release notes (i.e. for Airflow releases) from all commit titles in a release. By placing the JIRA number in the commit title and hence in the release notes, Airflow users can look into JIRA and Github PRs for more details about a particular change.
+1. Preface your commit's subject & PR's title with **[AIRFLOW-XXX]** where *XXX* is the JIRA number. We compose release notes (i.e. for Airflow releases) from all commit titles in a release. By placing the JIRA number in the commit title and hence in the release notes, Airflow users can look into JIRA and GitHub PRs for more details about a particular change.
 1. Add an [Apache License](http://www.apache.org/legal/src-headers.html) header to all new files
 1. If the pull request adds functionality, the docs should be updated as part of the same PR. Doc string are often sufficient. Make sure to follow the Sphinx compatible standards.
 1. The pull request should work for Python 2.7 and 3.5. If you need help writing code that works in both Python 2 and 3, see the documentation at the [Python-Future project](http://python-future.org) (the future package is an Airflow requirement and should be used where possible).
diff --git a/UPDATING.md b/UPDATING.md
index ac3e38a35eb75..512b66bbd967e 100644
--- a/UPDATING.md
+++ b/UPDATING.md
@@ -38,7 +38,7 @@ Sensors are now accessible via `airflow.sensors` and no longer via `airflow.oper
 
 For example: `from airflow.operators.sensors import BaseSensorOperator` becomes `from airflow.sensors.base_sensor_operator import BaseSensorOperator`
 
-### Renamed "extra" requirments for cloud providers
+### Renamed "extra" requirements for cloud providers
 
 Subpackages for specific services have been combined into one variant for
 each cloud provider.
@@ -191,7 +191,7 @@ that he has permissions on. If a new role wants to access all the dags, the admi
 We also provide a new cli command(``sync_perm``) to allow admin to auto sync permissions.
 
 ### Modification to `ts_nodash` macro
-`ts_nodash` previously contained TimeZone information alongwith execution date. For Example: `20150101T000000+0000`. This is not user-friendly for file or folder names which was a popular use case for `ts_nodash`. Hence this behavior has been changed and using `ts_nodash` will no longer contain TimeZone information, restoring the pre-1.10 behavior of this macro. And a new macro `ts_nodash_with_tz` has been added which can be used to get a string with execution date and timezone info without dashes.
+`ts_nodash` previously contained TimeZone information along with execution date. For Example: `20150101T000000+0000`. This is not user-friendly for file or folder names which was a popular use case for `ts_nodash`. Hence this behavior has been changed and using `ts_nodash` will no longer contain TimeZone information, restoring the pre-1.10 behavior of this macro. And a new macro `ts_nodash_with_tz` has been added which can be used to get a string with execution date and timezone info without dashes.
 
 Examples:
 * `ts_nodash`: `20150101T000000`
diff --git a/airflow/contrib/example_dags/example_qubole_operator.py b/airflow/contrib/example_dags/example_qubole_operator.py
index 826a50af99cd9..5f77d09ba1442 100644
--- a/airflow/contrib/example_dags/example_qubole_operator.py
+++ b/airflow/contrib/example_dags/example_qubole_operator.py
@@ -65,7 +65,7 @@ def compare_result(ds, **kwargs):
     fetch_logs=True,
     # If `fetch_logs`=true, will fetch qubole command logs and concatenate
     # them into corresponding airflow task logs
-    tags='aiflow_example_run',
+    tags='airflow_example_run',
     # To attach tags to qubole command, auto attach 3 tags - dag_id, task_id, run_id
     qubole_conn_id='qubole_default',
     # Connection id to submit commands inside QDS, if not set "qubole_default" is used
@@ -220,7 +220,7 @@ def main(args: Array[String]) {
     program=prog,
     language='scala',
     arguments='--class SparkPi',
-    tags='aiflow_example_run',
+    tags='airflow_example_run',
     dag=dag)
 
 t11.set_upstream(branching)
diff --git a/airflow/contrib/hooks/azure_data_lake_hook.py b/airflow/contrib/hooks/azure_data_lake_hook.py
index 21787382209c6..9eb7af7f8a71e 100644
--- a/airflow/contrib/hooks/azure_data_lake_hook.py
+++ b/airflow/contrib/hooks/azure_data_lake_hook.py
@@ -77,7 +77,7 @@ def upload_file(self, local_path, remote_path, nthreads=64, overwrite=True,
             are not supported.
         :type local_path: str
         :param remote_path: Remote path to upload to; if multiple files, this is the
-            dircetory root to write within.
+            directory root to write within.
         :type remote_path: str
         :param nthreads: Number of threads to use. If None, uses the number of cores.
         :type nthreads: int
diff --git a/airflow/contrib/hooks/fs_hook.py b/airflow/contrib/hooks/fs_hook.py
index 6832f20c225c1..1aa528b6205dc 100644
--- a/airflow/contrib/hooks/fs_hook.py
+++ b/airflow/contrib/hooks/fs_hook.py
@@ -30,7 +30,7 @@ class FSHook(BaseHook):
     example:
     Conn Id: fs_test
     Conn Type: File (path)
-    Host, Shchema, Login, Password, Port: empty
+    Host, Schema, Login, Password, Port: empty
     Extra: {"path": "/tmp"}
 
     """
diff --git a/airflow/contrib/hooks/qubole_hook.py b/airflow/contrib/hooks/qubole_hook.py
index 1c98f26afcd00..df11a50d5d8d3 100755
--- a/airflow/contrib/hooks/qubole_hook.py
+++ b/airflow/contrib/hooks/qubole_hook.py
@@ -194,7 +194,7 @@ def get_jobs_id(self, ti):
        """
         Get jobs associated with a Qubole commands
         :param ti: Task Instance of the dag, used to determine the Quboles command id
-        :return: Job informations assoiciated with command
+        :return: Job information associated with command
         """
         if self.cmd is None:
             cmd_id = ti.xcom_pull(key="qbol_cmd_id", task_ids=self.task_id)
diff --git a/airflow/contrib/hooks/salesforce_hook.py b/airflow/contrib/hooks/salesforce_hook.py
index ba5c7e8d9a4d8..a1756b6530b6a 100644
--- a/airflow/contrib/hooks/salesforce_hook.py
+++ b/airflow/contrib/hooks/salesforce_hook.py
@@ -276,7 +276,7 @@ def write_object_to_file(
 
         schema = self.describe_object(object_name)
 
-        # possible columns that can be convereted to timestamps
+        # possible columns that can be converted to timestamps
         # are the ones that are either date or datetime types
         # strings are too general and we risk unintentional conversion
         possible_timestamp_cols = [
diff --git a/airflow/contrib/operators/awsbatch_operator.py b/airflow/contrib/operators/awsbatch_operator.py
index 3c778e6e685cc..baf54603ac157 100644
--- a/airflow/contrib/operators/awsbatch_operator.py
+++ b/airflow/contrib/operators/awsbatch_operator.py
@@ -33,7 +33,7 @@ class AWSBatchOperator(BaseOperator):
     """
     Execute a job on AWS Batch Service
 
-    .. warning: the queue parameter was renamed to job_queue to segreggate the
+    .. warning: the queue parameter was renamed to job_queue to segregate the
        internal CeleryExecutor queue from the AWS Batch internal queue.
 
     :param job_name: the name for the job that will run on AWS Batch (templated)
diff --git a/airflow/contrib/operators/bigquery_check_operator.py b/airflow/contrib/operators/bigquery_check_operator.py
index 247a1ae7fba1b..afb600a3d9120 100644
--- a/airflow/contrib/operators/bigquery_check_operator.py
+++ b/airflow/contrib/operators/bigquery_check_operator.py
@@ -48,7 +48,7 @@ class BigQueryCheckOperator(CheckOperator):
     This operator can be used as a data quality check in your pipeline, and
     depending on where you put it in your DAG, you have the choice to
     stop the critical path, preventing from
-    publishing dubious data, or on the side and receive email alterts
+    publishing dubious data, or on the side and receive email alerts
     without stopping the progress of the DAG.
 
     :param sql: the sql to be executed
diff --git a/airflow/contrib/operators/cassandra_to_gcs.py b/airflow/contrib/operators/cassandra_to_gcs.py
index 95107a497fd61..6819eca404ebb 100644
--- a/airflow/contrib/operators/cassandra_to_gcs.py
+++ b/airflow/contrib/operators/cassandra_to_gcs.py
@@ -266,7 +266,7 @@ def convert_tuple_type(cls, name, value):
         """
         Converts a tuple to RECORD that contains n fields, each will be converted
         to its corresponding data type in bq and will be named 'field_<index>', where
-        index is determined by the order of the tuple elments defined in cassandra.
+        index is determined by the order of the tuple elements defined in cassandra.
         """
         names = ['field_' + str(i) for i in range(len(value))]
         values = [cls.convert_value(name, value) for name, value in zip(names, value)]
@@ -276,7 +276,7 @@ def convert_tuple_type(cls, name, value):
     def convert_map_type(cls, name, value):
         """
         Converts a map to a repeated RECORD that contains two fields: 'key' and 'value',
-        each will be converted to its corresopnding data type in BQ.
+        each will be converted to its corresponding data type in BQ.
         """
         converted_map = []
         for k, v in zip(value.keys(), value.values()):
diff --git a/airflow/contrib/operators/dataflow_operator.py b/airflow/contrib/operators/dataflow_operator.py
index 0f7ead15d6293..e880642f6067c 100644
--- a/airflow/contrib/operators/dataflow_operator.py
+++ b/airflow/contrib/operators/dataflow_operator.py
@@ -92,7 +92,7 @@ class DataFlowJavaOperator(BaseOperator):
         Cloud Platform for the dataflow job status while the job is in the
         JOB_STATE_RUNNING state.
     :type poll_sleep: int
-    :param job_class: The name of the dataflow job class to be executued, it
+    :param job_class: The name of the dataflow job class to be executed, it
         is often not the main class configured in the dataflow jar file.
    :type job_class: str
 
diff --git a/airflow/contrib/operators/dataproc_operator.py b/airflow/contrib/operators/dataproc_operator.py
index 8ff26969e32b5..e64cd25ef534f 100644
--- a/airflow/contrib/operators/dataproc_operator.py
+++ b/airflow/contrib/operators/dataproc_operator.py
@@ -1376,7 +1376,7 @@ def execute(self, context):
         self.hook.wait(self.start())
 
     def start(self, context):
-        raise AirflowException('plese start a workflow operation')
+        raise AirflowException('Please start a workflow operation')
 
 
 class DataprocWorkflowTemplateInstantiateOperator(DataprocWorkflowTemplateBaseOperator):
diff --git a/airflow/contrib/operators/druid_operator.py b/airflow/contrib/operators/druid_operator.py
index 1436d99f28d6f..75d552fec5a5b 100644
--- a/airflow/contrib/operators/druid_operator.py
+++ b/airflow/contrib/operators/druid_operator.py
@@ -60,5 +60,5 @@ def execute(self, context):
             druid_ingest_conn_id=self.conn_id,
             max_ingestion_time=self.max_ingestion_time
         )
-        self.log.info("Sumitting %s", self.index_spec_str)
+        self.log.info("Submitting %s", self.index_spec_str)
         hook.submit_indexing_job(self.index_spec_str)
diff --git a/airflow/operators/druid_check_operator.py b/airflow/operators/druid_check_operator.py
index 39674fdd3983e..514f61fc88988 100644
--- a/airflow/operators/druid_check_operator.py
+++ b/airflow/operators/druid_check_operator.py
@@ -47,7 +47,7 @@ class DruidCheckOperator(CheckOperator):
     This operator can be used as a data quality check in your pipeline, and
     depending on where you put it in your DAG, you have the choice to
     stop the critical path, preventing from
-    publishing dubious data, or on the side and receive email alterts
+    publishing dubious data, or on the side and receive email alerts
     without stopping the progress of the DAG.
 
     :param sql: the sql to be executed
diff --git a/airflow/operators/presto_check_operator.py b/airflow/operators/presto_check_operator.py
index 16f5bc0212a15..d70dcaa7d25af 100644
--- a/airflow/operators/presto_check_operator.py
+++ b/airflow/operators/presto_check_operator.py
@@ -48,7 +48,7 @@ class PrestoCheckOperator(CheckOperator):
     This operator can be used as a data quality check in your pipeline, and
     depending on where you put it in your DAG, you have the choice to
     stop the critical path, preventing from
-    publishing dubious data, or on the side and receive email alterts
+    publishing dubious data, or on the side and receive email alerts
     without stopping the progress of the DAG.
 
     :param sql: the sql to be executed
diff --git a/airflow/www/app.py b/airflow/www/app.py
index fde99743c58fd..ca82175bc6aab 100644
--- a/airflow/www/app.py
+++ b/airflow/www/app.py
@@ -135,7 +135,7 @@ def init_views(appbuilder):
                             href='https://airflow.apache.org/',
                             category="Docs",
                             category_icon="fa-cube")
-        appbuilder.add_link("Github",
+        appbuilder.add_link("GitHub",
                             href='https://github.com/apache/airflow',
                             category="Docs")
         appbuilder.add_link('Version',
diff --git a/dev/airflow-pr b/dev/airflow-pr
index 42e01cc9e8ee8..641f0119e2023 100755
--- a/dev/airflow-pr
+++ b/dev/airflow-pr
@@ -63,7 +63,7 @@ AIRFLOW_GIT_LOCATION = os.environ.get(
     "AIRFLOW_GIT",
     os.path.dirname(os.path.dirname(os.path.realpath(__file__))))
 
-# Remote name which points to the Gihub site
+# Remote name which points to the GitHub site
 GITHUB_REMOTE_NAME = os.environ.get("GITHUB_REMOTE_NAME", "github")
 # OAuth key used for issuing requests against the GitHub API. If this is not
 # defined, then requests will be unauthenticated. You should only need to
diff --git a/docs/howto/operator.rst b/docs/howto/operator.rst
index 65a82310c3dc6..34752383a3774 100644
--- a/docs/howto/operator.rst
+++ b/docs/howto/operator.rst
@@ -534,7 +534,7 @@ it will be retrieved from the GCP connection used. Both variants are shown:
 
 Advanced
 """"""""
-When creating a table, you can specify the optional ``initial_split_keys`` and ``column_familes``.
+When creating a table, you can specify the optional ``initial_split_keys`` and ``column_families``.
 Please refer to the Python Client for Google Cloud Bigtable documentation
 `for Table `_ and `for Column
 Families `_.
diff --git a/docs/installation.rst b/docs/installation.rst
index 11523180314a8..5db7f15fae84e 100644
--- a/docs/installation.rst
+++ b/docs/installation.rst
@@ -75,7 +75,7 @@ Here's the list of the subpackages and what they enable:
 | gcp_api             | ``pip install apache-airflow[gcp_api]``           | Google Cloud Platform hooks and operators                            |
 |                     |                                                   | (using ``google-api-python-client``)                                 |
 +---------------------+---------------------------------------------------+----------------------------------------------------------------------+
-| github_enterprise   | ``pip install apache-airflow[github_enterprise]`` | Github Enterprise auth backend                                       |
+| github_enterprise   | ``pip install apache-airflow[github_enterprise]`` | GitHub Enterprise auth backend                                       |
 +---------------------+---------------------------------------------------+----------------------------------------------------------------------+
 | google_auth         | ``pip install apache-airflow[google_auth]``       | Google auth backend                                                  |
 +---------------------+---------------------------------------------------+----------------------------------------------------------------------+
diff --git a/docs/project.rst b/docs/project.rst
index 7d91077488c17..d480c79491f8c 100644
--- a/docs/project.rst
+++ b/docs/project.rst
@@ -23,7 +23,7 @@ History
 
 Airflow was started in October 2014 by Maxime Beauchemin at Airbnb.
 It was open source from the very first commit and officially brought under
-the Airbnb Github and announced in June 2015.
+the Airbnb GitHub and announced in June 2015.
 
 The project joined the Apache Software Foundation's incubation program in March 2016.
 
diff --git a/docs/security.rst b/docs/security.rst
index d23e84e43feef..5e1a2b2713382 100644
--- a/docs/security.rst
+++ b/docs/security.rst
@@ -502,7 +502,7 @@ on limited web views
     'LogModelView',
     'Docs',
     'Documentation',
-    'Github',
+    'GitHub',
     'About',
     'Version',
     'VersionView',
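
For reference, the `ts_nodash` / `ts_nodash_with_tz` behavior documented in the UPDATING.md hunk above can be exercised with a small templated task. The sketch below is illustrative only and is not part of the patch; it assumes Airflow 1.10.x, and the DAG id and task id are made-up names.

```python
# Minimal sketch (not part of the patch). Assumes Airflow 1.10.x, where
# `ts_nodash` no longer carries timezone info and `ts_nodash_with_tz` does.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG(
    dag_id='ts_nodash_example',      # hypothetical DAG id
    start_date=datetime(2015, 1, 1),
    schedule_interval='@daily',
)

# For the execution date 2015-01-01T00:00:00+00:00 these templates render as:
#   {{ ts_nodash }}         -> 20150101T000000
#   {{ ts_nodash_with_tz }} -> 20150101T000000+0000
print_dates = BashOperator(
    task_id='print_dates',           # hypothetical task id
    bash_command='echo "{{ ts_nodash }} {{ ts_nodash_with_tz }}"',
    dag=dag,
)
```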